Hi!

After seeing some issues related to OOM errors caused by long prompts, I was wondering whether an option for generating sequences with Evo from longer prompts (>1 kb and beyond) would be to decrease the float precision, rather than shard the model across GPUs.

I believe the backbone is currently cast to bfloat16 (as in `model.backbone = model.backbone.to(torch.bfloat16)` in the `generation_to_folding.py` script), but would float8 be an option (is it compatible with Evo at all)? If so, do you expect a large drop in generation quality, or do you already have data comparing precision vs. performance?
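For context, here is a minimal sketch of the cast in question. The loading pattern and model name follow the Evo README and are assumptions on my part; the float8 line is illustrative only and left commented out, since standard kernels generally do not accept float8 activations:

```python
import torch

from evo import Evo  # assumed import, following the Evo README

# Load the model (model name is illustrative; adjust to your setup).
evo_model = Evo('evo-1-131k-base')
model, tokenizer = evo_model.model, evo_model.tokenizer

# What generation_to_folding.py does today: cast the backbone to bfloat16.
model.backbone = model.backbone.to(torch.bfloat16)

# Hypothetical float8 variant (PyTorch >= 2.1 exposes float8 dtypes).
# A plain cast like this would shrink parameter storage, but most ops
# do not accept float8 inputs directly, so the forward pass would
# likely fail without dedicated scaled-matmul support:
# model.backbone = model.backbone.to(torch.float8_e4m3fn)
```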
Thanks so much!