Stars
Generic automation framework for acceptance testing and RPA
Faster R-CNN implementation using detectron2 to detect UI elements
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering
Stable Video Diffusion Training Code and Extensions.
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
PyTorch Implementation of StyleSinger (AAAI 2024): Style Transfer for Out-of-Domain Singing Voice Synthesis
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Make any web page a desktop application
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
Segment Anything in High Quality [NeurIPS 2023]
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code
[NeurIPS 2024] Generalizable and Animatable Gaussian Head Avatar
[ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Official implementation of "DCT-Net: Domain-Calibrated Translation for Portrait Stylization", SIGGRAPH 2022 (TOG); Multi-style cartoonization
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
A generative speech model for daily dialogue.