FileNotFoundError: file /content/scidocs/data/recomm-tmp/model.tar.gz not found · Issue #24 · allenai/scidocs
Open
@JohnGiorgi

Hi! Trying to run SciDocs on my own model. I produce the three files of embeddings and then run the evaluation suite:

from scidocs import get_scidocs_metrics
from scidocs.paths import DataPaths

# point to the data, which should be in scidocs/data by default
data_paths = DataPaths()

# now run the evaluation
scidocs_metrics = get_scidocs_metrics(
    data_paths,
    str(classification_embeddings_path),
    str(user_activity_and_citations_embeddings_path),
    str(recomm_embeddings_path),
    val_or_test='test',  # set to 'val' if tuning hyperparams
    n_jobs=12,           # the classification tasks can be parallelized
    cuda_device=0        # the recomm task can use a GPU if this is set to 0, 1, etc
)

print(scidocs_metrics)

The first few tasks seem to work okay:

Loading MAG/MeSH embeddings...

reading embeddings from file...: 48473it [00:16, 2908.12it/s]

Running the MAG task...
Fitting 3 folds for each of 7 candidates, totalling 21 fits

[Parallel(n_jobs=12)]: Using backend LokyBackend with 12 concurrent workers.
[Parallel(n_jobs=12)]: Done  21 out of  21 | elapsed:  5.9min finished

Running the MeSH task...
Fitting 3 folds for each of 7 candidates, totalling 21 fits

[Parallel(n_jobs=12)]: Using backend LokyBackend with 12 concurrent workers.
[Parallel(n_jobs=12)]: Done  21 out of  21 | elapsed:  4.6min finished

Loading co-view, co-read, cite, and co-cite embeddings...

reading embeddings from file...: 142009it [00:50, 2803.86it/s]

But when it hits the recomm task it errors out:

Running the recomm task...

/content/scidocs/scidocs/__init__.py in get_scidocs_metrics(data_paths, classification_embeddings_path, user_activity_and_citations_embeddings_path, recomm_embeddings_path, val_or_test, n_jobs, cuda_device)
     39     scidocs_metrics.update(get_mag_mesh_metrics(data_paths, classification_embeddings_path, val_or_test=val_or_test, n_jobs=n_jobs))
     40     scidocs_metrics.update(get_view_cite_read_metrics(data_paths, user_activity_and_citations_embeddings_path, val_or_test=val_or_test))
---> 41     scidocs_metrics.update(get_recomm_metrics(data_paths, recomm_embeddings_path, val_or_test=val_or_test, cuda_device=cuda_device))
     42 
     43     return scidocs_metrics

/content/scidocs/scidocs/recomm_click_eval.py in get_recomm_metrics(data_paths, embeddings_path, val_or_test, cuda_device)
    166     subprocess.run(command)
    167     metrics = evaluate_ranking_performance(simpapers_model_path, data_paths.recomm_test if val_or_test=='test'
--> 168        else data_paths.recomm_val, int(cuda_device))
    169     return {'recomm': {
    170         'adj-NDCG': np.round(100 * float(metrics['Adj-ndcg']), 2),

/content/scidocs/scidocs/recomm_click_eval.py in evaluate_ranking_performance(archive_path, test_data_path, cuda_device)
     22 def evaluate_ranking_performance(archive_path, test_data_path, cuda_device):
     23 
---> 24     archive = archival.load_archive(archive_path, cuda_device=cuda_device)
     25     params = archive.config
     26     sr = archive.model

/usr/local/lib/python3.7/dist-packages/allennlp/models/archival.py in load_archive(archive_file, cuda_device, overrides, weights_file)
    168     """
    169     # redirect to the cache, if necessary
--> 170     resolved_archive_file = cached_path(archive_file)
    171 
    172     if resolved_archive_file == archive_file:

/usr/local/lib/python3.7/dist-packages/allennlp/common/file_utils.py in cached_path(url_or_filename, cache_dir)
    104     elif parsed.scheme == '':
    105         # File, but it doesn't exist.
--> 106         raise FileNotFoundError("file {} not found".format(url_or_filename))
    107     else:
    108         # Something unknown

FileNotFoundError: file /content/scidocs/data/recomm-tmp/model.tar.gz not found

Looks like it can't find /content/scidocs/data/recomm-tmp/model.tar.gz. Should this have been downloaded by the call to aws s3 sync?
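
One detail in the traceback may be relevant: get_recomm_metrics calls subprocess.run(command) on line 166 and then goes straight on to load the archive. subprocess.run does not raise on a non-zero exit code unless check=True is passed, so if that command failed, the missing file would be the first visible symptom. Below is a minimal sketch of a guard that would surface such a failure. It assumes the command is what is supposed to create model.tar.gz, which I have not confirmed. run_and_require is a hypothetical helper and not part of scidocs:

```python
import subprocess
import sys
from pathlib import Path


def run_and_require(command, expected_path):
    """Run `command`, fail loudly on a non-zero exit, then confirm
    that `expected_path` exists.

    Hypothetical helper for debugging; not part of scidocs.
    """
    # check=True raises CalledProcessError on a non-zero exit code.
    # A bare subprocess.run(command) returns silently in that case.
    result = subprocess.run(command, check=True, capture_output=True, text=True)

    expected = Path(expected_path)
    if not expected.is_file():
        # The command reported success but did not produce the file,
        # so include its output to help locate the cause.
        raise FileNotFoundError(
            f"command exited with 0 but {expected} is missing\n"
            f"stdout:\n{result.stdout}\nstderr:\n{result.stderr}"
        )
    return expected
```

Wrapping the command this way would turn a silent failure into either a CalledProcessError carrying the command's exit code or a FileNotFoundError that includes its stdout and stderr, which should help tell a failed step apart from a file that was never meant to be created there.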
