zhengli97 (Zheng Li) / Starred · GitHub

JREion / DPC

[CVPR 2025] Official PyTorch Code for "DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models"

Python 13 2 Updated Apr 11, 2025

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

Python 156 42 Updated Mar 12, 2025

[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

Python 338 28 Updated Aug 24, 2024

[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Python 424 14 Updated Jan 4, 2025

(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Python 98 1 Updated Mar 6, 2025

Solve Visual Understanding with Reinforced VLMs

Python 4,975 308 Updated May 11, 2025
Jupyter Notebook 774 71 Updated Aug 7, 2024

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 2,210 154 Updated Feb 16, 2025

Code release for "SegLLM: Multi-round Reasoning Segmentation"

Python 95 8 Updated Feb 20, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,788 1,487 Updated Apr 24, 2025

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,830 1,745 Updated Feb 26, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,341 155 Updated Mar 20, 2025

Simple RL training for reasoning

Python 3,576 266 Updated Apr 10, 2025

Training Large Language Model to Reason in a Continuous Latent Space

Python 1,113 101 Updated Jan 24, 2025

[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,403 60 Updated Apr 28, 2025
Python 111 6 Updated Jan 21, 2025

The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention

Python 2,680 203 Updated May 12, 2025

Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.

Python 231 13 Updated Feb 27, 2025

[NeurIPS 2024] OPUS: Occupancy Prediction Using a Sparse Set

Python 94 3 Updated Feb 16, 2025

Full codes of LeaDQ (AAAI 2025)

Python 6 1 Updated Dec 16, 2024

Official PyTorch Code for "ATPrompt: Textual Prompt Learning with Embedded Attributes"

Python 34 1 Updated Dec 23, 2024

The official code for "TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning" | [AAAI2025]

Python 37 3 Updated Mar 13, 2025

[AAAI 2025] Pre-Training a Density-Aware Pose Transformer for Robust LiDAR-based 3D Human Pose Estimation

Python 27 1 Updated Apr 15, 2025

PyTorch implementation of SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation

Jupyter Notebook 25 2 Updated Apr 5, 2025

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 7,954 493 Updated May 18, 2025

Convert Zhihu column articles to Markdown files and save them locally

Python 394 63 Updated Mar 21, 2025

[PR 2024] Official PyTorch Code for "Dual Teachers for Self-Knowledge Distillation"

Python 10 Updated Nov 28, 2024