Hi!

After seeing some issues related to OOM errors caused by long prompts, I was wondering whether an option for generating sequences with Evo from longer prompts (>1 kb and beyond) would be to decrease the float precision, rather than shard the model across GPUs.

I believe the backbone is currently cast to bfloat16 (as in `model.backbone = model.backbone.to(torch.bfloat16)` in the `generation_to_folding.py` script), but would float8 be an option (is it compatible with Evo at all)? If so, do you expect a large drop in generation quality, or do you already have data comparing precision vs. performance?
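For context, here is a minimal sketch of the cast in question. The loading pattern and model name follow the Evo README and are assumptions on my part; the float8 line is illustrative only and left commented out, since standard kernels generally do not accept float8 activations:

```python
import torch

from evo import Evo  # assumed import, following the Evo README

# Load the model (model name is illustrative; adjust to your setup).
evo_model = Evo('evo-1-131k-base')
model, tokenizer = evo_model.model, evo_model.tokenizer

# What generation_to_folding.py does today: cast the backbone to bfloat16.
model.backbone = model.backbone.to(torch.bfloat16)

# Hypothetical float8 variant (PyTorch >= 2.1 exposes float8 dtypes).
# A plain cast like this would shrink parameter storage, but most ops
# do not accept float8 inputs directly, so the forward pass would
# likely fail without dedicated scaled-matmul support:
# model.backbone = model.backbone.to(torch.float8_e4m3fn)
```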
Thanks so much!