The memory footprint kept increasing during training and everything else was fine · Issue #1 · NJUNLP/njuqe · GitHub
The memory footprint kept increasing during training and everything else was fine #1


Open
Jack-Yang-S opened this issue Dec 5, 2023 · 1 comment

Comments

@Jack-Yang-S

Hello, your work is great, but during training my memory usage keeps increasing steadily, even though all the metrics update normally. I will soon run out of the 40 GB of memory and be forced to stop. What could be the problem?

@hy5468
Collaborator
hy5468 commented Dec 15, 2023

Hi! Thanks for your interest! You may not need to use --qe-meter during pre-training. To calculate the correct dataset-level metrics such as Pearson, MCC, and F1-MULT, we have to save the predictions in the "reduce_metrics" function of the QE loss. Fairseq may record all training states, including these predictions, which is why memory usage keeps increasing.
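
To make the mechanism concrete, here is a minimal, illustrative sketch of the accumulation pattern described above. It is not the actual njuqe or fairseq code; the class name, methods, and the use of NumPy are assumptions made purely for illustration. Dataset-level metrics such as Pearson need the predictions from every batch, so they have to be kept somewhere until the metric is computed, and if nothing ever clears that buffer during training, memory grows with every step.

```python
# Illustrative only: a toy "meter" that retains per-batch predictions so a
# dataset-level metric (Pearson) can be computed later. Names are hypothetical.
import numpy as np


class DatasetLevelMeter:
    """Collects predictions and targets across batches for corpus-level metrics."""

    def __init__(self):
        self.preds, self.targets = [], []

    def update(self, preds, targets):
        # Every call keeps a reference to the batch outputs. If the training
        # loop never calls reset(), these lists grow for the entire run,
        # which matches the steadily increasing memory seen in this issue.
        self.preds.append(np.asarray(preds, dtype=np.float64))
        self.targets.append(np.asarray(targets, dtype=np.float64))

    def pearson(self):
        p = np.concatenate(self.preds)
        t = np.concatenate(self.targets)
        return float(np.corrcoef(p, t)[0, 1])

    def reset(self):
        self.preds.clear()
        self.targets.clear()


if __name__ == "__main__":
    meter = DatasetLevelMeter()
    rng = np.random.default_rng(0)
    for _ in range(3):  # stands in for training steps
        preds = rng.normal(size=32)
        meter.update(preds, preds + rng.normal(scale=0.1, size=32))
    print(f"Pearson over retained predictions: {meter.pearson():.3f}")
    meter.reset()  # without a reset like this, the buffers keep growing
```

The point of the sketch is only the trade-off: corpus-level metrics cannot be computed from per-batch scalars alone, so the predictions must be retained somewhere, and during plain pre-training, where such metrics are not needed, it is cheaper to skip that bookkeeping entirely.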
