vicchu / Starred · GitHub

Starred repositories

It is my belief that you, the postgraduate students and job-seekers for whom the book is primarily meant, will benefit from reading it; however, it is my hope that even the most experienced research…

4,619 303 Updated Jan 21, 2022

Bag of Tricks and A Strong Baseline for Deep Person Re-identification

Python 2,303 579 Updated Apr 23, 2020

Reading list for research topics in multimodal machine learning

6,467 882 Updated Aug 20, 2024

A treasure chest for visual classification and recognition powered by PaddlePaddle

Python 5,653 1,187 Updated May 12, 2025

SIGIR paper: Conversational Fashion Image Retrieval via Multiturn Natural Language Feedback

14 2 Updated Oct 17, 2022

This repo contains the QA dataset collected for performing person search with natural language.

4 Updated Apr 9, 2021

The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Ins…

423 48 Updated Dec 15, 2024

Official implementation of the Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT) | ICCV 2021 - Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models

Python 38 8 Updated Jun 26, 2024

Python 131 14 Updated Dec 10, 2022

Open-source toolbox for visual fashion analysis based on PyTorch

Python 1,309 293 Updated May 10, 2024

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

1,153 104 Updated Aug 19, 2022

Official repository of ICCV 2021 - Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models

112 3 Updated May 21, 2025

Jupyter Notebook 58 6 Updated Dec 20, 2023

[ACL 2021] Learning Relation Alignment for Calibrated Cross-modal Retrieval

Python 30 4 Updated May 16, 2023

Code for CVPR 2021 paper: Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

Python 84 24 Updated Mar 24, 2021

A curated list of Multimodal Related Research.

Python 1,348 150 Updated Aug 5, 2023

A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a., pretraining for IR).

666 49 Updated Jan 7, 2024

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsens…

Python 970 105 Updated Feb 27, 2023

Project page for VinVL

355 25 Updated Jul 26, 2023

A curated list of deep learning resources for video-text retrieval.

621 67 Updated Oct 20, 2023

Awesome Cross-modality Person Re-identification

148 32 Updated Jul 14, 2022

A Simple, High-efficiency, Strong framework for person re-Identification.

Python 72 19 Updated Apr 19, 2021

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

Python 10,531 2,089 Updated Nov 3, 2023

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Python 5,564 939 Updated Apr 24, 2025

A reading list of papers about Visual Question Answering.

32 6 Updated Aug 17, 2022

A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

662 95 Updated Jul 6, 2023

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps [AAAI 2021]

Python 57 5 Updated Apr 5, 2022

TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, CVPR 2021 (Oral)

Python 72 10 Updated May 22, 2023