Lists (32)
3D
Acoustic
Adversarial training
Autonomous
Avatar
Camera
Depth Estimation
Diffusion
Distributed System
Foundation Model
google-research
HPC
Image quality
Image Restoration
Invertible Neural Network
LLM
Mesh
MultiModality
NeRF
Neural Rendering
PointCloud
Robotic
SLAM
Steganography
Stereo Vision
Text-To-Video
Transformer
Vein Biometric
Visual Representation
VLM
VTON
Watermarking
Stars
Official implementation of "Watermarking Images in Self-Supervised Latent-Spaces"
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
TensorFlow implementation of 'Robust Image Watermarking based on Cross-Attention and Invariant Domain Learning'
[CVPR 2024] EditGuard: Versatile Image Watermarking for Tamper Localization and Copyright Protection
[CVPR 2025] OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
A high-throughput and memory-efficient inference and serving engine for LLMs
[CVPR 2025] GuardSplat: Efficient and Robust Watermarking for 3D Gaussian Splatting
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
PyTorch implementation of GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting
[CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
[ICLR 2025] VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking (Official Implementation)
deepbeepmeep / Wan2GP
Forked from Wan-Video/Wan2.1
Wan 2.1 for the GPU Poor
[CVPR 2025 Oral] VGGT: Visual Geometry Grounded Transformer
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Enjoy the magic of Diffusion models!
[NeurIPS 2024] Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer
[CVPR 2025] Learning Flow Fields in Attention for Controllable Person Image Generation
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Wan: Open and Advanced Large-Scale Video Generative Models
A lightweight data processing framework built on DuckDB and 3FS.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.