- Wuhan University
- Wuhan
- (UTC +08:00)
- https://qianmingduowan.github.io/
- in/ming-qian-b77b6b205
Highlights
- Pro
Stars
[CVPR 2025 Highlight] PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
[CVPR 2025 Best Paper Award Candidate] VGGT: Visual Geometry Grounded Transformer
A simple training-free approach adapting DUSt3R for dynamic scenes.
Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation
[CVPR'25] Official Implementations for Paper - AniDoc: Animation Creation Made Easier
CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning
Code for "Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation"
VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis
[CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text
[CVPR'25] Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System
ASCII generator (image to text, image to image, video to video)
[ICLR'25] Official PyTorch implementation of "Framer: Interactive Frame Interpolation".
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
[NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Download, model, analyze, and visualize street networks and other geospatial features from OpenStreetMap.
Official repository for VIGOR : Cross-View Image Geo-localization beyond One-to-one Retrieval
Code for "Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text" (NeurIPS 2024).
High-Resolution Image Synthesis with Latent Diffusion Models
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
A PyTorch implementation of "GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis"
Visualize Camera's Pose Using Extrinsic Parameter by Plotting Pyramid Model on 3D Space
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[ECCV 2024 - Oral] ACE0 is a learning-based structure-from-motion approach that estimates camera parameters of sets of images by learning a multi-view consistent, implicit scene representation.
Generative Models by Stability AI
A cross-platform, high performance renderer for Gaussian Splatting using Vulkan Compute. Supports Windows, Linux, macOS, iOS, and visionOS
🌊 [ECCV'24 Oral] MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images