Starred repositories
ECCV24 - Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance
Official Implementation for ICDAR2024 paper "Multi-Page Document Visual Question Answering using Self-Attention Scoring Mechanism"
Hugging Face RoBERTa with Flash Attention 2
An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents. ICDAR, 2021"
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
[ICML 2024] Official implementation of "LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions."
An MIT-licensed implementation of YOLOv9, YOLOv7, YOLO-RD
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
NVIDIA Math Libraries for the Python Ecosystem
Code for the LREC-Coling 2024 paper "VI-OOD: A Unified Representation Learning Framework for Textual Out-of-distribution Detection"
Schedule-Free Optimization in PyTorch
Reaching LLaMA2 Performance with 0.1M Dollars
OCR, layout analysis, reading order, table recognition in 90+ languages
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
This repository contains the official implementation of the research paper "FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization" (ICCV 2023)
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Table structure recognition dataset of the paper: Complicated Table Structure Recognition
(ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
Segment Anything in High Quality [NeurIPS 2023]
Repository for the ACL 2023 paper "Unbalanced Optimal Transport for Unbalanced Word Alignment"
Official code for the ICCV 2023 paper "FemtoDet: an object detection baseline for energy versus performance tradeoffs"
STRExp is a framework that provides Explainability (XAI) to Scene Text Recognition (STR) models.
[ICCV 2023] Code base for Revisiting Scene Text Recognition: A Data Perspective
[ECCV 2024] Code for DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior