Initial IMM training loss by stockeh · Pull Request #4 · lumalabs/imm · GitHub

Initial IMM training loss #4


Open · stockeh wants to merge 3 commits into main

Conversation

@stockeh commented Apr 18, 2025

Initial attempt at the training loss defined in the paper (assumes a constant decrement in $\eta(t)$ for the mapping function).
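For reference, here is a minimal sketch of what a constant-decrement mapping function could look like. The schedule `eta`, its inverse, and the `decrement` value are illustrative assumptions for this example, not the exact implementation in this PR or the paper.

```python
import torch

def eta(t):
    # Illustrative noise-to-signal schedule, assuming alpha_t = 1 - t and sigma_t = t,
    # so eta(t) = sigma_t / alpha_t. t is a tensor of times in (0, 1).
    return t / (1.0 - t).clamp(min=1e-6)

def eta_inv(v):
    # Inverse of the illustrative eta above: v = t / (1 - t)  =>  t = v / (1 + v).
    return v / (1.0 + v)

def mapping_r(s, t, decrement=0.1):
    # Constant-decrement mapping r(s, t): step eta(t) down by a fixed amount,
    # clamped so that r never maps past s (i.e., eta(r) >= eta(s)).
    return eta_inv(torch.maximum(eta(s), eta(t) - decrement))
```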

@XinYu-Andy

Initial attempt at the training loss defined in the paper (assumes a constant decrement in $\eta(t)$ for the mapping function).

Hi, thanks for your effort! I am curious why you use EMA weights for y_r? Have you run some experiments and found that it works better?

@stockeh (Author) commented Apr 18, 2025

I am curious why you use EMA weights for y_r?

@XinYu-Andy thank you! Using EMA weights for y_r initially made more sense to me. But I just tested using the model weights instead, and the loss is decreasing considerably better so far, although it is still just as noisy (with a global batch size of 800).

I updated the PR with this change for now!
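To make the two options concrete, here is a hedged sketch of how the y_r target could be computed either from an EMA copy of the network or from the current model under stop-gradient. The function and argument names (`compute_y_r`, `x_r`, `r`, the `net(x, t)` call signature) are placeholders for illustration, not the PR's actual code.

```python
import torch

@torch.no_grad()
def compute_y_r(model, ema_model, x_r, r, use_ema: bool = False):
    # Target y_r evaluated at the earlier time r.
    # use_ema=True  -> evaluate the EMA copy of the network (the original attempt).
    # use_ema=False -> evaluate the current model weights; gradients are already
    #                  stopped by the no_grad decorator (what the PR switched to).
    net = ema_model if use_ema else model
    return net(x_r, r)
```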

@XinYu-Andy

I am curious why you use EMA weights for y_r?

@XinYu-Andy thank you! Using EMA weights for y_r initially made more sense to me. But I just tested using the model weights instead, and the loss is decreasing considerably better so far, although it is still just as noisy (with a global batch size of 800).

I updated the PR with this change for now!

Are you doing experiments on cifar10? I ran experiments for a few weeks but was still not able to reproduce the results reported in the paper...

@stockeh (Author) commented Apr 18, 2025

Are you doing experiments on cifar10?

Yes, with the DDPM++ UNet, using all the same reported hyperparameters. I just started today and haven't experimented extensively yet, but I'd like to see a stable loss before going any further.

@XinYu-Andy

Are you doing experiments on cifar10?

Yes, with the DDPM++ UNet, using all the same reported hyperparameters. I just started today and haven't experimented extensively yet, but I'd like to see a stable loss before going any further.

Sounds good! 👍

@stockeh (Author) commented Apr 22, 2025

The parameter ordering for ddim was notationally different from that in Algorithm 1. This has been fixed, but now the loss starts very, very small (log-log scale loss curve attached).
[Screenshot, 2025-04-21: training loss on a log-log scale]
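As an illustration of why the argument order matters, here is a minimal sketch of a DDIM-style interpolant written to mirror the DDIM(x_t, x, s, t) ordering of Algorithm 1. The schedule callables `alpha` and `sigma` are assumptions for the example, not the repository's API.

```python
def ddim(x_t, x, s, t, alpha, sigma):
    # DDIM-style interpolation from time t back to time s given a predicted clean
    # sample x:
    #   DDIM(x_t, x, s, t) = alpha_s * x + (sigma_s / sigma_t) * (x_t - alpha_t * x)
    # Swapping s and t (or x_t and x) silently changes the target, which is the
    # kind of ordering mismatch described above.
    return alpha(s) * x + (sigma(s) / sigma(t)) * (x_t - alpha(t) * x)
```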
