Luyitas (Mingda Jia) / Starred · GitHub
  • Peking University
  • ShenZhen, China

[CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text

Python 51 Updated Mar 16, 2025

CVPR and NeurIPS poster examples and templates. May we have in-person poster sessions soon!

1,671 150 Updated May 9, 2023

Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)"

Python 21 1 Updated Feb 2, 2025

[AAAI 25] Official Implementation for "E-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment"

Python 42 1 Updated Apr 22, 2025

[CVPR2022] PyTorch re-implementation of Prompt Distribution Learning

18 1 Updated May 6, 2023

[NeurIPS2023] Neural-Logic Human-Object Interaction Detection

Python 11 2 Updated Aug 24, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,911 130 Updated Oct 30, 2024

Disentangled Pre-training for Human-Object Interaction Detection

Python 21 1 Updated Nov 3, 2024

PyTorch implementation of Sinusoidal Representation Networks (SIREN)

Python 264 11 Updated Dec 8, 2022

Code repository of the paper "CKConv: Continuous Kernel Convolution For Sequential Data" published at ICLR 2022. https://arxiv.org/abs/2102.02611

Python 121 16 Updated Nov 29, 2022

Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Python 1,208 77 Updated Jul 14, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 2,011 114 Updated Jul 29, 2024

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

Python 761 54 Updated May 10, 2022

Open source implementation of "Vision Transformers Need Registers"

Python 179 15 Updated Apr 6, 2025

[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

Python 341 28 Updated Aug 24, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,178 80 Updated Jan 23, 2025

[NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Python 1,059 41 Updated Oct 9, 2024

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,774 78 Updated Aug 15, 2024

[ICLR 2025][arXiv:2406.07548] Image and Video Tokenization with Binary Spherical Quantization

Python 162 Updated Jun 12, 2024

Chinese companies that support remote work

2,711 93 Updated Dec 30, 2024

[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.

Python 296 7 Updated Jul 9, 2024

Extend BoxDiff to SDXL (SDXL-based layout-to-image generation)

Python 24 1 Updated May 23, 2024

Codes for ICLR 2025 Paper: Towards Semantic Equivalence of Tokenization in Multimodal LLM

Python 64 1 Updated Apr 19, 2025

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 6,618 561 Updated Apr 19, 2025

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

564 21 Updated May 8, 2025

A collection of resources on controllable generation with text-to-image diffusion models.

1,048 28 Updated Dec 31, 2024

Accepted as [NeurIPS 2024] Spotlight Presentation Paper

Jupyter Notebook 6,301 638 Updated Sep 26, 2024

[AAAI-25] Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference

Python 278 11 Updated Jan 8, 2025

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Python 940 46 Updated Oct 16, 2024