IMCCretrieval (Eggroll) / Starred · GitHub

Awesome Incremental Learning

4,091 598 Updated Apr 28, 2025

PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"

Python 622 39 Updated Jan 7, 2024

Awesome papers & datasets specifically focused on long-term videos.

277 12 Updated Nov 15, 2024

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

Python 279 12 Updated Jun 13, 2024

The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)

Python 29 3 Updated Mar 29, 2024

A family of lightweight multimodal models.

Python 1,020 74 Updated Nov 18, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

2,377 108 Updated Jun 4, 2025

VICReg official code base

Python 538 91 Updated Jul 6, 2023

Paper Reading of IMCC groups.

18 16 Updated Apr 23, 2025

A curated list of recent diffusion models for video generation, editing, and various other applications.

4,502 270 Updated Jun 6, 2025

[ICCV2023] - CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

Python 35 5 Updated Oct 8, 2024

[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

Python 293 17 Updated Dec 25, 2024

CVPR and NeurIPS poster examples and templates. May we have in-person poster session soon!

1,670 150 Updated May 9, 2023

✨✨Latest Advances on Multimodal Large Language Models

15,498 1,003 Updated Jun 6, 2025

[NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

Jupyter Notebook 278 17 Updated Mar 14, 2024
Python 32 6 Updated Mar 10, 2023

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 38,790 4,412 Updated Jun 10, 2025

My personal research experience

6,927 406 Updated Jun 4, 2025

[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | A Chinese-English bilingual multimodal large model series based on the CPM foundation models

Python 1,058 93 Updated Jun 13, 2024

Deep Fourier Ranking Quantization for Semi-Supervised Image Retrieval -- TIP22

Python 6 Updated Jul 1, 2023

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

Python 1,039 70 Updated Oct 6, 2024

MomentDiff: Generative Video Moment Retrieval from Random to Real--NeurIPS 2023

Python 79 Updated Nov 2, 2023

[Communications Chemistry 2023] Highly accurate and large-scale collision cross section prediction with graph neural network for compound identification

Jupyter Notebook 59 2 Updated Sep 22, 2021

OpenVINO model of YOLOv5, with asynchronous inference

Python 56 Updated Oct 11, 2020

[ICCV 2023] Official implementation of "Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement"

Python 71 1 Updated Feb 26, 2024

[TCSVT2022] Industrial Scene Text Detection

Python 80 6 Updated Mar 3, 2023

Official implementation of "Self-slimmed Vision Transformer" (ECCV2022)

Python 72 1 Updated Jul 20, 2022
Python 86 2 Updated Apr 12, 2023

[CVPR 2024] SimDA: Simple Diffusion Adapter for Efficient Video Generation

Python 128 4 Updated May 7, 2024