haihuangcode / Starred · GitHub
  • Zhejiang University

Lists (9)

[TMLR 2025🔥] A survey for the autoregressive models in vision.

612 18 Updated May 23, 2025
Python 832 24 Updated May 23, 2025

DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support

Python 584 53 Updated Mar 22, 2024

An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 608 19 Updated May 20, 2025

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 3,239 262 Updated Jan 18, 2025

Official repository for VisionZip (CVPR 2025)

Python 284 12 Updated Feb 27, 2025

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

Python 199 1 Updated Mar 24, 2025

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 811 58 Updated May 19, 2025

Let's make video diffusion practical!

Python 13,505 1,164 Updated May 4, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 47,871 7,555 Updated May 23, 2025

Community maintained hardware plugin for vLLM on Ascend

Python 667 157 Updated May 23, 2025

GenEval: An object-focused framework for evaluating text-to-image alignment

HTML 279 17 Updated Mar 3, 2025

HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo

Python 1,434 129 Updated May 20, 2025

[CVPR 2025 Oral]Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 1,282 63 Updated Apr 24, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.

Python 8,153 619 Updated Apr 27, 2025

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

Python 415 11 Updated May 22, 2025

Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"

Jupyter Notebook 243 9 Updated Apr 30, 2025

[arxiv 2024] MLLMs for art

14 1 Updated Jan 16, 2025

[ECCV2024] This is an official inference code of the paper "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…

Jupyter Notebook 578 30 Updated Jul 13, 2024

OpenMMLab Computer Vision Foundation

Python 6,130 1,688 Updated Apr 25, 2025

Domain Generalization with MixStyle (ICLR'21)

Python 299 41 Updated Oct 6, 2022

Code release for VTW (AAAI 2025) Oral

Python 39 1 Updated Jan 18, 2025

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 4,164 270 Updated May 20, 2025

mixup: Beyond Empirical Risk Minimization

Python 1,179 225 Updated Oct 12, 2021

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 21,506 1,421 Updated May 22, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 10,569 758 Updated May 15, 2025

SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models

Python 220 13 Updated Sep 16, 2024
Python 98 8 Updated Jul 30, 2024

A method to increase the speed and lower the memory footprint of existing vision transformers.

Python 1,051 71 Updated Jun 17, 2024

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 10,083 880 Updated May 21, 2025