Stars
[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Official implementations for paper: VACE: All-in-One Video Creation and Editing
A VLM benchmark dataset built from the AIHUB visualization-material question-answering dataset
Efficient Part-level 3D Object Generation via Dual Volume Packing
Official PyTorch implementation for "Large Language Diffusion Models"
A collection of diffusion model papers categorized by their subareas
Official PyTorch implementation of paper “InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction”
Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding
Awesome diffusion Video-to-Video (V2V). A collection of papers on diffusion model-based video editing, a.k.a. video-to-video (V2V) translation, plus video editing benchmark code.
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
Official repository for BrickGPT, the first approach for generating physically stable toy brick models from text prompts.
Wan: Open and Advanced Large-Scale Video Generative Models
A curated list of recent diffusion models for video generation, editing, and various other applications.
[CVPR 2024] Official implementation of CVPR 2024 paper: "Inversion-Free Image Editing with Natural Language"
Coherent Video Inpainting Using Optical Flow-Guided Efficient Diffusion
A Modular Framework for 3D Generation and Beyond [WIP]
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models [CVPR 2024]
Roblox Foundation Model for 3D Intelligence
This repository aims to develop CoT Steering based on CoT without Prompting. It focuses on enhancing the model’s latent reasoning capability without additional training by leveraging Test-Time Scal…
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion (CVPR2025)
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
Korean Large MultiModal FFT Code
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
[NeurIPS'24 Spotlight] Text2CAD: Generating Sequential CAD Designs from Beginner-to-Expert Level Text Prompts
GUNETR_pplus: Gradient enhanced UNETR_pplus with LiTS liver segmentation
Papers and Datasets about Point Cloud.