How to match quality of the video that is generated using playground vs. huggingface weights? · Issue #131 · genmoai/mochi · GitHub

How to match quality of the video that is generated using playground vs. huggingface weights? #131


Open
Annusha opened this issue Feb 18, 2025 · 3 comments

Comments

Annusha commented Feb 18, 2025

Hi,

I tried generating a video from the same prompt using both the playground and the downloaded weights. Note that I only tried the Hugging Face "genmo/mochi-1-preview" model, following the code below:

import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview")

# Enable memory savings - disabled, as I was using H100 and it was enough
# pipe.enable_model_cpu_offload()
# pipe.enable_vae_tiling()

prompt = "A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors."

with torch.autocast("cuda", torch.bfloat16, cache_enabled=False):
    frames = pipe(prompt, num_frames=84).frames[0]

export_to_video(frames, "mochi.mp4", fps=30)

The results are quite different (top is from the playground and the other one was generated on the local machine).
I played around with guidance_scale, which improved the results a bit.
What else should I change to match the results with the playground?

mochi-red-helmet.mp4
mochi-red-helmet.huggingface.weights2.mp4
970814 commented Feb 26, 2025

Looks funny. I have the same problem.

weathon commented Apr 27, 2025

I have the same problem

Annusha commented Apr 30, 2025

You can augment the text prompt, and then it works just fine. E.g., instead of

A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors

use

Produce a cinematic scene shot on 35mm film with vivid colors. Focus on a 30-year-old spaceman wearing a handmade red wool-knit motorcycle helmet, standing in a vast salt desert under a bright blue sky. Maintain a calm, contemplative atmosphere with gentle camera movement. Show a close-up on his face as he slowly turns his head, highlighting the texture of the helmet and the subtle shifts in his expression. Emphasize a sense of solitude and quiet wonder in this serene, introspective moment.

A guidance_scale of 6 worked well for me.
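
For anyone who wants to try this locally, here is a minimal sketch that combines the augmented prompt with guidance_scale=6, based on the snippet from the first post. `MochiSettings` and `generate` are hypothetical helper names, not part of diffusers. The fixed seed is my own assumption; it only makes local runs repeatable and will not reproduce the playground's output. The memory-saving calls are the ones the original post left commented out on an H100.

```python
from dataclasses import dataclass

# Augmented prompt quoted from the comment above.
AUGMENTED_PROMPT = (
    "Produce a cinematic scene shot on 35mm film with vivid colors. "
    "Focus on a 30-year-old spaceman wearing a handmade red wool-knit "
    "motorcycle helmet, standing in a vast salt desert under a bright blue sky. "
    "Maintain a calm, contemplative atmosphere with gentle camera movement. "
    "Show a close-up on his face as he slowly turns his head, highlighting the "
    "texture of the helmet and the subtle shifts in his expression. "
    "Emphasize a sense of solitude and quiet wonder in this serene, "
    "introspective moment."
)


@dataclass(frozen=True)
class MochiSettings:
    """Hypothetical helper (not a diffusers class) bundling the call arguments."""

    prompt: str = AUGMENTED_PROMPT
    guidance_scale: float = 6.0  # value reported to work well in this thread
    num_frames: int = 84         # same as the snippet in the first post
    seed: int = 0                # assumption: fixed seed for repeatable local runs

    def call_kwargs(self) -> dict:
        """Keyword arguments passed to the pipeline call."""
        return {
            "prompt": self.prompt,
            "guidance_scale": self.guidance_scale,
            "num_frames": self.num_frames,
        }


def generate(settings: MochiSettings, output_path: str = "mochi.mp4") -> str:
    """Generate a video with the given settings and return the output path."""
    # Heavy imports stay local so the settings can be inspected without a GPU.
    import torch
    from diffusers import MochiPipeline
    from diffusers.utils import export_to_video

    pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview")

    # Memory savings; the first post disabled these on an H100.
    pipe.enable_model_cpu_offload()
    pipe.enable_vae_tiling()

    # Seeded generator so repeated local runs start from the same noise.
    generator = torch.Generator("cpu").manual_seed(settings.seed)

    with torch.autocast("cuda", torch.bfloat16, cache_enabled=False):
        frames = pipe(**settings.call_kwargs(), generator=generator).frames[0]

    export_to_video(frames, output_path, fps=30)
    return output_path
```

Calling `generate(MochiSettings())` should write mochi.mp4 with the settings above. To compare against the short prompt, pass `MochiSettings(prompt=...)` with the same seed so that only the prompt differs between runs.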
