- 🔭 I’m currently interested in multimodal large language models (MLLMs) and visual generation.
- ⚡ I graduated from Carnegie Mellon University and am currently an MLE at ByteDance.
CMU -> ByteDance | Multimodal Understanding & Generation
- Carnegie Mellon University
- Pittsburgh, PA
- https://scholar.google.com/citations?user=IDbqDdEAAAAJ&hl=zh-CN
Pinned
- ByteFlow-AI/TokenFlow (Public): [CVPR 2025] 🔥 Official implementation of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
- ByteFlow-AI/DetailFlow (Public): 🔥 Official implementation of "DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction".
- ruohaoguo/avis (Public): [CVPR 2025] 🔥 Official implementation of "Audio-Visual Instance Segmentation".
- ruohaoguo/ovavss (Public): [ACM MM 2024 Oral] Official implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation".
- bytedance/AvatarVerse (Public): [AAAI 2024] Code repository for the paper "AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose".