Initial IMM training loss by stockeh · Pull Request #4 · lumalabs/imm · GitHub

Initial IMM training loss #4


Open · stockeh wants to merge 3 commits into main

Conversation

@stockeh commented Apr 18, 2025

Initial attempt at the training loss defined in the paper (assumes a constant decrement in $\eta(t)$ for the mapping function).
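For reference, here is a minimal sketch of what a constant-decrement mapping function could look like. The schedule `eta`, its inverse, and the `decrement` value are illustrative assumptions for this example, not the exact implementation in this PR or the paper.

```python
import torch

def eta(t):
    # Illustrative noise-to-signal schedule, assuming alpha_t = 1 - t and sigma_t = t,
    # so eta(t) = sigma_t / alpha_t. t is a tensor of times in (0, 1).
    return t / (1.0 - t).clamp(min=1e-6)

def eta_inv(v):
    # Inverse of the illustrative eta above: v = t / (1 - t)  =>  t = v / (1 + v).
    return v / (1.0 + v)

def mapping_r(s, t, decrement=0.1):
    # Constant-decrement mapping r(s, t): step eta(t) down by a fixed amount,
    # clamped so that r never maps past s (i.e., eta(r) >= eta(s)).
    return eta_inv(torch.maximum(eta(s), eta(t) - decrement))
```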

@XinYu-Andy

Initial attempt at the training loss defined in the paper (assumes a constant decrement in $\eta(t)$ for the mapping function).

Hi, thanks for your effort! I am curious why you use EMA weights for y_r? Have you run some experiments and found that it works better?

@stockeh (Author) commented Apr 18, 2025

I am curious why you use EMA weights for y_r?

@XinYu-Andy thank you! Using EMA weights for y_r initially made more sense to me. But I just tested using the model weights instead, and the loss is decreasing considerably better so far, although it is still just as noisy (with a global batch size of 800).

I updated the PR with this change for now!
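To make the two options concrete, here is a hedged sketch of how the y_r target could be computed either from an EMA copy of the network or from the current model under stop-gradient. The function and argument names (`compute_y_r`, `x_r`, `r`, the `net(x, t)` call signature) are placeholders for illustration, not the PR's actual code.

```python
import torch

@torch.no_grad()
def compute_y_r(model, ema_model, x_r, r, use_ema: bool = False):
    # Target y_r evaluated at the earlier time r.
    # use_ema=True  -> evaluate the EMA copy of the network (the original attempt).
    # use_ema=False -> evaluate the current model weights; gradients are already
    #                  stopped by the no_grad decorator (what the PR switched to).
    net = ema_model if use_ema else model
    return net(x_r, r)
```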

@XinYu-Andy

I am curious why you use EMA weights for y_r?

@XinYu-Andy thank you! Using EMA weights for y_r initially made more sense to me. But I just tested using the model weights instead, and the loss is decreasing considerably better so far, although it is still just as noisy (with a global batch size of 800).

I updated the PR with this change for now!

Are you doing experiments on cifar10? I ran experiments for a few weeks but was still not able to reproduce the results reported in the paper...

@stockeh (Author) commented Apr 18, 2025

Are you doing experiments on cifar10?

Yes, with the DDPM++ UNet, using all the same reported hyperparameters. I just started today and haven't experimented extensively yet, but I'd like to see a stable loss before going any further.

@XinYu-Andy

Are you doing experiments on cifar10?

Yes, with the DDPM++ UNet, using all the same reported hyperparameters. I just started today and haven't experimented extensively yet, but I'd like to see a stable loss before going any further.

Sounds good! 👍

@stockeh (Author) commented Apr 22, 2025

The parameter ordering for ddim was notationally different from that in Algorithm 1. This has been fixed, but now the loss starts very, very small (log-log scale loss curve attached).
[Screenshot, 2025-04-21: training loss on a log-log scale]
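As an illustration of why the argument order matters, here is a minimal sketch of a DDIM-style interpolant written to mirror the DDIM(x_t, x, s, t) ordering of Algorithm 1. The schedule callables `alpha` and `sigma` are assumptions for the example, not the repository's API.

```python
def ddim(x_t, x, s, t, alpha, sigma):
    # DDIM-style interpolation from time t back to time s given a predicted clean
    # sample x:
    #   DDIM(x_t, x, s, t) = alpha_s * x + (sigma_s / sigma_t) * (x_t - alpha_t * x)
    # Swapping s and t (or x_t and x) silently changes the target, which is the
    # kind of ordering mismatch described above.
    return alpha(s) * x + (sigma(s) / sigma(t)) * (x_t - alpha(t) * x)
```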
