- Nankai University
- Hangzhou, China
- UTC +08:00
- https://zhengli97.github.io/
Stars
JREion / DPC
[CVPR 2025] Official PyTorch Code for "DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models"
Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
Solve Visual Understanding with Reinforced VLMs
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Code release for "SegLLM: Multi-round Reasoning Segmentation"
Minimal reproduction of DeepSeek R1-Zero
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Training Large Language Model to Reason in a Continuous Latent Space
[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.
[NeurIPS 2024] OPUS: Occupancy Prediction Using a Sparse Set
Official PyTorch Code for "ATPrompt: Textual Prompt Learning with Embedded Attributes"
The official code for "TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning" | [AAAI2025]
[AAAI 2025] Pre-Training a Density-Aware Pose Transformer for Robust LiDAR-based 3D Human Pose Estimation
PyTorch implementation of SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
[PR 2024] Official PyTorch Code for "Dual Teachers for Self-Knowledge Distillation"