Added explicit memory management during the VAE decode process. #7587
base: master
Conversation
I can't merge this as is because it would break on non-CUDA devices, but does it actually improve things? h = h_new already "del"s the h variable. del doesn't actually delete anything; it only removes that reference to the object.
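For illustration only (not code from this PR), a minimal sketch of the point being made, assuming a CUDA device is available: rebinding a name drops the previous tensor's last reference exactly as del would, and the freed memory stays in PyTorch's caching allocator until empty_cache() is called.

```python
import torch

x = torch.empty(1024, 1024, device="cuda")
print(torch.cuda.memory_allocated())   # bytes held by live tensors

x = torch.empty(1024, 1024, device="cuda")  # rebinding drops the old tensor's last
                                            # reference, same effect as `del x`
print(torch.cuda.memory_allocated())   # roughly unchanged: old block freed, new one allocated
print(torch.cuda.memory_reserved())    # but the allocator may still cache the freed block

torch.cuda.empty_cache()               # return cached, unused blocks to the driver
print(torch.cuda.memory_reserved())
```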
@comfyanonymous Hi, thank you for your review.
You are correct that non-CUDA devices cannot use torch.cuda.empty_cache() to release VRAM, but this step matters a lot on CUDA devices. I can add a check so the cache is only emptied when CUDA is available, and apply this optimization to CUDA devices only (see the sketch below).
I can also show the difference in VRAM usage during the VAE decoding stage with and without explicit memory management.
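A minimal sketch of the CUDA-availability check described above; the helper name is illustrative, not the PR's actual code:

```python
import torch

def free_cached_vram():
    """Release cached, unused GPU memory back to the driver; no-op on non-CUDA devices."""
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```

The next comment points out that ComfyUI already ships a backend-aware helper for this, which the PR ultimately adopts instead of a manual guard.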
If empty_cache is needed, comfy.model_management.soft_empty_cache() might be better for compatibility.
Replaced torch.cuda.empty_cache() with model_management.soft_empty_cache().
@chaObserv Thank you for the valuable suggestion! I have replaced torch.cuda.empty_cache() with model_management.soft_empty_cache() to support non-CUDA devices.
Hello, I noticed that during the VAE decode process the feature maps used as intermediate variables were not explicitly deleted, which led to avoidable CUDA out-of-memory errors. I have fixed this by explicitly deleting the intermediate variables.
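As a rough illustration of the described fix combined with the reviewer's suggestion, here is a hedged sketch of a decoder-style loop that drops each intermediate feature map as soon as it is no longer needed. The function, its arguments, and the import path are illustrative, not ComfyUI's actual VAE code:

```python
from comfy import model_management  # assumes ComfyUI's comfy package is importable

def decode_latent(z, blocks):
    """Run latent z through a list of decoder blocks, freeing intermediates eagerly."""
    h = z
    for block in blocks:
        h_new = block(h)
        del h             # explicit deletion of the previous feature map (the PR's approach;
                          # the rebinding below would drop this reference anyway)
        h = h_new
    model_management.soft_empty_cache()  # backend-aware cache flush, per the review suggestion
    return h
```

Whether the explicit del actually lowers peak memory compared with plain rebinding is exactly the question raised in the review above.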