wangkai930418 (Kai Wang) / Starred · GitHub

Kai Wang wangkai930418

PostDoc in CVC, UAB

94 followers · 117 following

CVC, UAB
Barcelona
wangkai930418.github.io
https://orcid.org/0000-0002-9605-8279

Achievements

Starred repositories

Vchitect / Latte

[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.

Python 1,836 187 Updated Apr 8, 2025

baaivision / Uni3D

[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI

Python 584 37 Updated Jan 17, 2024

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 4,163 334 Updated Jun 17, 2025

YuzheZhang-1999 / DiffTSR

[CVPR2024] Diffusion-based Blind Text Image Super-Resolution (Official)

Python 145 8 Updated May 8, 2025

wuyi2020 / StyleAR

StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation

Python 31 1 Updated Jun 6, 2025

LargeWorldModel / LWM

Large World Model -- Modeling Text and Video with Millions Context

Python 7,295 557 Updated Oct 19, 2024

AIDC-AI / Awesome-Unified-Multimodal-Models

Awesome Unified Multimodal Models

318 8 Updated May 22, 2025

deepffff / SADis

The code of the paper "Free-Lunch Color-Texture Disentanglement for Stylized Image Generation"

Python 8 1 Updated Jun 3, 2025

Paper2Poster / Paper2Poster

Open-source Multi-agent Poster Generation from Papers

Python 2,092 111 Updated Jun 17, 2025

YuyangSunshine / bioprotocolbench

Python 28 Updated Jun 12, 2025

djghosh13 / geneval

GenEval: An object-focused framework for evaluating text-to-image alignment

HTML 304 21 Updated Mar 3, 2025

lupantech / chameleon-llm

Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".

Jupyter Notebook 1,132 91 Updated Dec 23, 2023

Alpha-VLLM / Lumina-mGPT-2.0

Lumina-mGPT 2.0: Stand-alone Autoregressive Image Modeling

Python 709 41 Updated May 1, 2025

Qinyu-Allen-Zhao / DiSA

Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation

Jupyter Notebook 134 Updated May 27, 2025

VIPL-GENUN / Jodi

Codebase for "Jodi: Unification of Visual Generation and Understanding via Joint Modeling"

Python 76 2 Updated Jun 17, 2025

aigc3d / LHM

LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds

Python 2,207 171 Updated Apr 17, 2025

yuangpeng / dreambench_plus

[ICLR 2025] Official code implementation of DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Python 110 Updated Feb 23, 2025

Gen-Verse / MMaDA

MMaDA - Open-Sourced Multimodal Large Diffusion Language Models

Python 1,091 49 Updated Jun 13, 2025

bytedance / DreamO

DreamO: A Unified Framework for Image Customization

Python 1,532 113 Updated May 30, 2025

Zheng-Chong / CatVTON

[ICLR 2025] CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters totally), 2) Parameter-Efficient Training (49.57M parameters trainable) …

Python 1,419 165 Updated Feb 24, 2025

G-U-N / Diffusion-NPO

[ICLR 2025] official implementation of "Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models"

Jupyter Notebook 23 2 Updated May 17, 2025

stepfun-ai / Step1X-3D

Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets

Python 713 40 Updated Jun 9, 2025

LINs-lab / UCGM

[Preprint] UCGM: Unified Continuous Generative Models

Python 151 7 Updated May 27, 2025

ZhuiyiTechnology / roformer

Rotary Transformer

Python 970 55 Updated Mar 21, 2022

River-Zhang / ICEdit

Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Training released! Surpasses GPT-4o in ID persistence! Official ComfyUI workflow release! Only 4GB VRAM is enou…

Python 1,709 98 Updated May 16, 2025

lllyasviel / FramePack

Lets make video diffusion practical!

Python 14,504 1,295 Updated May 4, 2025

evahuman / EVA_Official

Expressive Gaussian Human Avatars from Monocular RGB Video (NeurIPS 2024)

Python 45 3 Updated May 28, 2025

alsdudrla10 / ARD

[CVPR 2025 Oral] PyTorch re-implementation for Autoregressive Distillation of Diffusion Transformers (ARD).

Python 107 3 Updated Apr 16, 2025

hutaiHang / ATM

25 1 Updated Apr 15, 2025

shashankvkt / DoRA_ICLR24

This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video"

Python 89 12 Updated May 17, 2024

Starred topics

zero-shot-learning

normalizing-flows
