Can't save embeddings.pt file - validation error for TrainConfig pre_word_vecs_enc - Object has no attribute 'pre_word_vecs_enc' · Issue #94 · eole-nlp/eole · GitHub
Closed
HURIMOZ opened this issue Aug 31, 2024 · 2 comments
Labels
bug Something isn't working contribution welcome Feel free to PR

Comments

HURIMOZ commented Aug 31, 2024

Hi everyone, I get an error on saving embeddings:

(TY-EN) ubuntu@ip-172-31-2-199:~/TY-EN/eole/recipes/wmt17$ eole train --config wmt17_enty.yaml
[2024-08-31 00:23:51,304 INFO] Get special vocabs from Transforms: {'src': [], 'tgt': []}.
[2024-08-31 00:23:51,367 INFO] Reading encoder embeddings from data/en.wiki.bpe.vs25000.d300.w2v-256.txt
[2024-08-31 00:23:53,487 INFO]  Found 25000 total vectors in file.
[2024-08-31 00:23:53,487 INFO] After filtering to vectors in vocab:
[2024-08-31 00:23:53,493 INFO]  * enc: 16041 match, 7 missing, (99.96%)
[2024-08-31 00:23:53,493 INFO]
Saving encoder embeddings as:
        * enc: processed_data/.enc_embeddings.pt
Traceback (most recent call last):
  File "/home/ubuntu/TY-EN/TY-EN/bin/eole", line 33, in <module>
    sys.exit(load_entry_point('eole', 'console_scripts', 'eole')())
  File "/home/ubuntu/TY-EN/eole/eole/bin/main.py", line 39, in main
    bin_cls.run(args)
  File "/home/ubuntu/TY-EN/eole/eole/bin/run/train.py", line 69, in run
    train(config)
  File "/home/ubuntu/TY-EN/eole/eole/bin/run/train.py", line 56, in train
    train_process(config, device_id=0)
  File "/home/ubuntu/TY-EN/eole/eole/train_single.py", line 141, in main
    checkpoint, vocabs, transforms, config = _init_train(config)
  File "/home/ubuntu/TY-EN/eole/eole/train_single.py", line 96, in _init_train
    vocabs, transforms = prepare_transforms_vocabs(config, transforms_cls)
  File "/home/ubuntu/TY-EN/eole/eole/train_single.py", line 38, in prepare_transforms_vocabs
    prepare_pretrained_embeddings(config, vocabs)
  File "/home/ubuntu/TY-EN/eole/eole/modules/embeddings.py", line 331, in prepare_pretrained_embeddings
    config.pre_word_vecs_enc = enc_output_file
  File "/home/ubuntu/TY-EN/TY-EN/lib/python3.10/site-packages/pydantic/main.py", line 853, in __setattr__
    self.__pydantic_validator__.validate_assignment(self, name, value)
pydantic_core._pydantic_core.ValidationError: 1 validation error for TrainConfig
pre_word_vecs_enc
  Object has no attribute 'pre_word_vecs_enc' [type=no_such_attribute, input_value='processed_data/.enc_embeddings.pt', input_type=str]
    For further information visit https://errors.pydantic.dev/2.8/v/no_such_attribute

I'm not sure what causes this error.
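
For anyone landing here with the same traceback, here is a minimal sketch of how pydantic can produce a `no_such_attribute` error on assignment. It assumes the config model validates assignments (the traceback goes through `validate_assignment`) and that the assigned name is not a declared field. `ConfigSketch` and `assignment_error_type` are hypothetical names for illustration, not eole code.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class ConfigSketch(BaseModel):
    """Hypothetical stand-in for the eole config; not the real class."""

    # Assumption: assignments are validated, as the validate_assignment
    # call in the traceback suggests.
    model_config = ConfigDict(validate_assignment=True)

    src_embeddings: Optional[str] = Field(default=None)
    # Note: no `pre_word_vecs_enc` field is declared on this model.


def assignment_error_type(config: BaseModel, name: str, value: object) -> Optional[str]:
    """Try the assignment; return pydantic's error type, or None on success."""
    try:
        setattr(config, name, value)
    except ValidationError as exc:
        return exc.errors()[0]["type"]
    return None


config = ConfigSketch()
# Declared field: the assignment is expected to pass validation.
print(assignment_error_type(config, "src_embeddings", "vectors.txt"))
# Undeclared field: the assignment is expected to be rejected.
print(assignment_error_type(config, "pre_word_vecs_enc", "processed_data/.enc_embeddings.pt"))
```

If the sketch matches what happens in eole, the second call reports the same error type as the log above, which points at the field declaration (or where the value ends up in the config) rather than at the embeddings file itself.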

@francoishernandez francoishernandez added the bug Something isn't working label Sep 16, 2024
@francoishernandez
Member

You probably just need to explicitly add the missing field(s) around here:

eole/eole/config/data.py

Lines 39 to 54 in ff39275

# pre trained embeddings stuff, might be put elsewhere
both_embeddings: str | None = Field(
    default=None,
    description="Path to the embeddings file to use for both source and target tokens.",
)
src_embeddings: str | None = Field(
    default=None,
    description="Path to the embeddings file to use for source tokens.",
)
tgt_embeddings: str | None = Field(
    default=None,
    description="Path to the embeddings file to use for target tokens.",
)
embeddings_type: Literal["GloVe", "word2vec"] | None = Field(
    default=None, description="Type of embeddings file."
)

This path is not really tested, so feel free to add some cases in the workflow and test script, especially if you intend to rely on such features in the future.
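
A rough sketch of what "adding the missing field" could look like, assuming the field really is undeclared on the model that `prepare_pretrained_embeddings` assigns to. `PatchedConfigSketch` is a hypothetical stand-in, and the description string for `pre_word_vecs_enc` is made up for illustration; the actual class and wording in `eole/eole/config/data.py` may differ. If a decoder-side attribute is assigned in the same way, it would presumably need the same treatment.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class PatchedConfigSketch(BaseModel):
    """Hypothetical stand-in for the config class; not the real eole class."""

    # Assumption: the real model validates assignments, as in the traceback.
    model_config = ConfigDict(validate_assignment=True)

    src_embeddings: Optional[str] = Field(
        default=None,
        description="Path to the embeddings file to use for source tokens.",
    )
    # Assumed addition: declare the attribute that the embeddings
    # preparation step assigns to, so validation accepts it.
    pre_word_vecs_enc: Optional[str] = Field(
        default=None,
        description="Path to the processed encoder embeddings file.",
    )


config = PatchedConfigSketch()
# With the field declared, the same assignment should now validate.
config.pre_word_vecs_enc = "processed_data/.enc_embeddings.pt"
print(config.pre_word_vecs_enc)
```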

@francoishernandez francoishernandez added the contribution welcome Feel free to PR label Sep 16, 2024
Author
HURIMOZ commented Sep 19, 2024

Thank you François. I had an indentation wrong.
It works now.
