How to match quality of the video that is generated using playground vs. huggingface weights? · Issue #131 · genmoai/mochi · GitHub
I've tried generating a video from the same prompt using both the playground and the downloaded weights. Note that I only tried the Hugging Face "genmo/mochi-1-preview" model, following the code below:
    import torch
    from diffusers import MochiPipeline
    from diffusers.utils import export_to_video

    pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview")

    # Enable memory savings - disabled, as I was using H100 and it was enough
    # pipe.enable_model_cpu_offload()
    # pipe.enable_vae_tiling()

    prompt = "A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors."

    with torch.autocast("cuda", torch.bfloat16, cache_enabled=False):
        frames = pipe(prompt, num_frames=84).frames[0]

    export_to_video(frames, "mochi.mp4", fps=30)
The results are quite different (top is from the playground and the other one was generated on the local machine).
I played around with guidance_scale, which improved the results a bit.
What else should I change to match the results with the playground?
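For anyone comparing settings, here is a minimal sketch of a fixed-seed sweep over guidance_scale and num_inference_steps, so that only one knob changes between clips. The sweep values, the seed, and the helper names build_sweep and run_sweep are assumptions made for illustration; none of them are known playground settings. Moving the pipeline to the GPU with pipe.to("cuda") is also an addition that is not in the code above.

```python
from itertools import product

# Hypothetical sweep values; these are NOT known playground settings.
GUIDANCE_SCALES = [3.5, 4.5, 6.0]
STEP_COUNTS = [50, 64, 100]
SEED = 0  # arbitrary fixed seed so runs differ only in the swept parameters


def build_sweep(guidance_scales, step_counts, num_frames=84):
    """Return one kwargs dict per (guidance_scale, num_inference_steps) pair."""
    return [
        {
            "num_frames": num_frames,
            "guidance_scale": g,
            "num_inference_steps": s,
        }
        for g, s in product(guidance_scales, step_counts)
    ]


def run_sweep(prompt, configs, seed=SEED):
    """Generate one clip per config, re-seeding the generator each time."""
    # Imported lazily so the sweep can be built without a GPU environment.
    import torch
    from diffusers import MochiPipeline
    from diffusers.utils import export_to_video

    pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview")
    pipe.to("cuda")  # assumption: a GPU with enough memory, as in the post

    for cfg in configs:
        generator = torch.Generator(device="cuda").manual_seed(seed)
        with torch.autocast("cuda", torch.bfloat16, cache_enabled=False):
            frames = pipe(prompt, generator=generator, **cfg).frames[0]
        name = (
            f"mochi_gs{cfg['guidance_scale']}"
            f"_steps{cfg['num_inference_steps']}.mp4"
        )
        export_to_video(frames, name, fps=30)


configs = build_sweep(GUIDANCE_SCALES, STEP_COUNTS)
```

Calling run_sweep(prompt, configs) would then write one file per combination, which makes side-by-side comparison with the playground clip easier. Whether any combination matches the playground output is not something this sketch can guarantee.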
You can augment the text prompt, and then it works just fine. E.g., instead of
A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors
use
Produce a cinematic scene shot on 35mm film with vivid colors. Focus on a 30-year-old spaceman wearing a handmade red wool-knit motorcycle helmet, standing in a vast salt desert under a bright blue sky. Maintain a calm, contemplative atmosphere with gentle camera movement. Show a close-up on his face as he slowly turns his head, highlighting the texture of the helmet and the subtle shifts in his expression. Emphasize a sense of solitude and quiet wonder in this serene, introspective moment.
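To make this kind of rewrite repeatable, one option is to assemble the longer prompt from named parts. The sketch below is only an illustration: augment_prompt and its parameters are hypothetical, and nothing here is claimed to be how the playground processes prompts.

```python
def augment_prompt(subject, setting="", style="", camera="", action="", mood=""):
    """Join prompt components into one descriptive paragraph, skipping empty ones."""
    focus = f"Focus on {subject}, {setting}." if setting else f"Focus on {subject}."
    parts = [
        f"Produce a cinematic scene {style}." if style else "",
        focus,
        camera,
        action,
        mood,
    ]
    return " ".join(p.strip() for p in parts if p and p.strip())


# Components taken from the expanded prompt in this thread.
detailed_prompt = augment_prompt(
    subject="a 30-year-old spaceman wearing a handmade red wool-knit motorcycle helmet",
    setting="standing in a vast salt desert under a bright blue sky",
    style="shot on 35mm film with vivid colors",
    camera="Maintain a calm, contemplative atmosphere with gentle camera movement.",
    action="Show a close-up on his face as he slowly turns his head.",
    mood="Emphasize a sense of solitude and quiet wonder.",
)
```

The resulting string can be passed to the pipeline in place of the short prompt.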
mochi-red-helmet.mp4
mochi-red-helmet.huggingface.weights2.mp4