FileNotFoundError: file /content/scidocs/data/recomm-tmp/model.tar.gz not found

Hi! I'm trying to run SciDocs on my own model. I produce the three embedding files and then run the evaluation suite:

from scidocs import get_scidocs_metrics
from scidocs.paths import DataPaths

# point to the data, which should be in scidocs/data by default
data_paths = DataPaths()

# now run the evaluation
scidocs_metrics = get_scidocs_metrics(
    data_paths,
    str(classification_embeddings_path),
    str(user_activity_and_citations_embeddings_path),
    str(recomm_embeddings_path),
    val_or_test='test',  # set to 'val' if tuning hyperparams
    n_jobs=12,           # the classification tasks can be parallelized
    cuda_device=0        # the recomm task can use a GPU if this is set to 0, 1, etc
)

print(scidocs_metrics)

The first few tasks seem to work okay:

Loading MAG/MeSH embeddings...

reading embeddings from file...: 48473it [00:16, 2908.12it/s]

Running the MAG task...
Fitting 3 folds for each of 7 candidates, totalling 21 fits

[Parallel(n_jobs=12)]: Using backend LokyBackend with 12 concurrent workers.
[Parallel(n_jobs=12)]: Done  21 out of  21 | elapsed:  5.9min finished

Running the MeSH task...
Fitting 3 folds for each of 7 candidates, totalling 21 fits

[Parallel(n_jobs=12)]: Using backend LokyBackend with 12 concurrent workers.
[Parallel(n_jobs=12)]: Done  21 out of  21 | elapsed:  4.6min finished

Loading co-view, co-read, cite, and co-cite embeddings...

reading embeddings from file...: 142009it [00:50, 2803.86it/s]

But when it hits the recomm task it errors out:

Running the recomm task...

/content/scidocs/scidocs/__init__.py in get_scidocs_metrics(data_paths, classification_embeddings_path, user_activity_and_citations_embeddings_path, recomm_embeddings_path, val_or_test, n_jobs, cuda_device)
     39     scidocs_metrics.update(get_mag_mesh_metrics(data_paths, classification_embeddings_path, val_or_test=val_or_test, n_jobs=n_jobs))
     40     scidocs_metrics.update(get_view_cite_read_metrics(data_paths, user_activity_and_citations_embeddings_path, val_or_test=val_or_test))
---> 41     scidocs_metrics.update(get_recomm_metrics(data_paths, recomm_embeddings_path, val_or_test=val_or_test, cuda_device=cuda_device))
     42 
     43     return scidocs_metrics

/content/scidocs/scidocs/recomm_click_eval.py in get_recomm_metrics(data_paths, embeddings_path, val_or_test, cuda_device)
    166     subprocess.run(command)
    167     metrics = evaluate_ranking_performance(simpapers_model_path, data_paths.recomm_test if val_or_test=='test'
--> 168        else data_paths.recomm_val, int(cuda_device))
    169     return {'recomm': {
    170         'adj-NDCG': np.round(100 * float(metrics['Adj-ndcg']), 2),

/content/scidocs/scidocs/recomm_click_eval.py in evaluate_ranking_performance(archive_path, test_data_path, cuda_device)
     22 def evaluate_ranking_performance(archive_path, test_data_path, cuda_device):
     23 
---> 24     archive = archival.load_archive(archive_path, cuda_device=cuda_device)
     25     params = archive.config
     26     sr = archive.model

/usr/local/lib/python3.7/dist-packages/allennlp/models/archival.py in load_archive(archive_file, cuda_device, overrides, weights_file)
    168     """
    169     # redirect to the cache, if necessary
--> 170     resolved_archive_file = cached_path(archive_file)
    171 
    172     if resolved_archive_file == archive_file:

/usr/local/lib/python3.7/dist-packages/allennlp/common/file_utils.py in cached_path(url_or_filename, cache_dir)
    104     elif parsed.scheme == '':
    105         # File, but it doesn't exist.
--> 106         raise FileNotFoundError("file {} not found".format(url_or_filename))
    107     else:
    108         # Something unknown

FileNotFoundError: file /content/scidocs/data/recomm-tmp/model.tar.gz not found

Looks like it can't find /content/scidocs/data/recomm-tmp/model.tar.gz. Should this have been downloaded by the call to aws s3 sync?
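
One thing that may help narrow this down: the traceback shows subprocess.run(command) on line 166 of recomm_click_eval.py, right before the load that fails. By default subprocess.run does not raise when the command exits with a non-zero code, so if that step is what is supposed to produce model.tar.gz (an assumption on my part; the command itself is not visible in the traceback), a failure there would only surface later as this FileNotFoundError. Below is a minimal sketch of a stricter wrapper. run_and_require is a hypothetical helper of mine, not part of scidocs or allennlp.

```python
import subprocess
import sys
from pathlib import Path


def run_and_require(command, expected_file):
    """Run `command`; raise if it fails or if `expected_file` is missing afterwards.

    Hypothetical helper for debugging, not part of scidocs.
    """
    # check=True makes subprocess.run raise CalledProcessError on a non-zero exit code.
    result = subprocess.run(command, check=True, capture_output=True, text=True)

    expected = Path(expected_file)
    if not expected.is_file():
        parent = expected.parent
        # List what is actually in the directory, if the directory exists at all.
        present = sorted(p.name for p in parent.iterdir()) if parent.is_dir() else []
        raise FileNotFoundError(
            f"{expected} was not produced; {parent} contains: {present}\n"
            f"stderr tail:\n{result.stderr[-2000:]}"
        )
    return expected


if __name__ == "__main__":
    # A command that succeeds but does not create the file still gets reported.
    try:
        run_and_require(
            [sys.executable, "-c", "pass"],
            "/content/scidocs/data/recomm-tmp/model.tar.gz",
        )
    except FileNotFoundError as err:
        print(err)
```

Running the same command that line 166 builds through a wrapper like this should show whether the subprocess itself failed, or whether it finished without writing anything to recomm-tmp.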
