Acwy / Starred · GitHub

Gemini is a modern LaTeX beamerposter theme 🖼

TeX 1,097 253 Updated Jun 11, 2025

A concise but complete implementation of CLIP with various experimental improvements from recent papers

Python 713 47 Updated Oct 16, 2023

Let your Claude be able to think

TypeScript 15,250 1,771 Updated Mar 10, 2025

UniMD: Towards Unifying Moment retrieval and temporal action Detection

Python 48 1 Updated Jul 5, 2024

[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval

Python 58 4 Updated Jun 19, 2024

[CVPR 2023] Vote2Cap-DETR and [T-PAMI 2024] Vote2Cap-DETR++; A set-to-set perspective towards 3D Dense Captioning; State-of-the-Art 3D Dense Captioning methods

Python 95 8 Updated Aug 17, 2024

OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.

Python 264 22 Updated Apr 29, 2025

xlliu7 / TadTR

[TIP 2022] End-to-end Temporal Action Detection with Transformer

Python 153 12 Updated Feb 19, 2023

Large-scale text-video dataset. 10 million captioned short videos.

Python 641 40 Updated Aug 14, 2024

Segment Anything in High Quality [NeurIPS 2023]

Jupyter Notebook 3,987 243 Updated Dec 7, 2024

Accepted by CVPR 2024

Python 33 1 Updated May 16, 2024

Inference code for Llama models

Python 58,390 9,777 Updated Jan 26, 2025

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 3,019 275 Updated Jun 4, 2024

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 3,251 262 Updated Jan 18, 2025
Python 35 5 Updated Sep 16, 2024

❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119

Python 1,123 96 Updated Sep 2, 2023
Python 92 8 Updated Sep 23, 2023

An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks

Python 2,041 204 Updated Nov 16, 2023

Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)

Python 1,979 226 Updated May 20, 2024
Python 188 10 Updated Jul 12, 2024

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Python 314 28 Updated Jul 19, 2024

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

Python 631 41 Updated Jan 29, 2025

[CVPR 2024] Official implementation of the paper "DePT: Decoupled Prompt Tuning"

Jupyter Notebook 104 3 Updated May 28, 2025

Code release for "VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning"

Python 48 3 Updated Jul 2, 2024

Kolmogorov Arnold Networks

Jupyter Notebook 15,733 1,494 Updated Jan 19, 2025

Accepted as [NeurIPS 2024] Spotlight Presentation Paper

Jupyter Notebook 6,306 638 Updated Sep 26, 2024

Multimodal Prompting with Missing Modalities for Visual Recognition, CVPR'23

Python 207 14 Updated Dec 13, 2023

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 30,040 4,050 Updated Jul 17, 2024

Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.

Python 273 25 Updated Oct 13, 2023