Xiang An anxiangsir

🤩

216 followers · 145 following

Achievements

x3 x3

Lists (1)

🚀 My stack

5 repositories

Stars

emova-ollm / EMOVA

Official PyTorch implementation of EMOVA in CVPR 2025 (https://arxiv.org/abs/2409.18042)

Python 55 6 Updated Mar 16, 2025

Niujunbo2002 / NativeRes-LLaVA

Official code repo for our work "Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models"

Python 27 Updated Jun 17, 2025

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,628 262 Updated Jun 18, 2025

facebookresearch / vjepa2

PyTorch code and models for VJEPA2 self-supervised learning from video.

Python 1,626 129 Updated Jun 27, 2025

peterant330 / KUEA

[ICML'25] Kernel-based Unsupervised Embedding Alignment for Enhanced Visual Representation in Vision-language Models

Python 8 Updated Jun 9, 2025

facebookresearch / zero

PyTorch implementation of Zero-Shot Vision Encoder Grafting via LLM Surrogates [ICCV 2025]

Python 42 1 Updated Jun 25, 2025

roboflow / rf100-vl

Code from the paper "Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models"

Python 69 3 Updated Jun 2, 2025

TapXWorld / ChinaTextbook

PDF textbooks for all primary, middle, and high school grades, and for university.

Roff 41,872 9,328 Updated May 18, 2025

papercopilot / paperlists

Processed / Cleaned Data for Paper Copilot

Python 508 18 Updated Jun 24, 2025

pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 91,112 24,546 Updated Jun 29, 2025

yifan123 / flow_grpo

An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 797 29 Updated Jun 16, 2025

ByteDance-Seed / Seed1.5-VL

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,270 47 Updated Jun 14, 2025

xiaomoguhz / DeCLIP

[CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception

Python 62 3 Updated Jun 10, 2025

msed-Ebrahimi / GIF

GIF: Generative Inspiration for Face Recognition at Scale

12 Updated May 7, 2025

ZhangYuanhan-AI / CelebA-Spoof

[ECCV2020] A Large-Scale Face Anti-Spoofing Dataset

Python 569 94 Updated Feb 26, 2021

ZhangYuanhan-AI / NOAH

[TPAMI] Searching prompt modules for parameter-efficient transfer learning.

Python 232 11 Updated Dec 8, 2023

huggingface / nanoVLM

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 3,583 320 Updated Jun 27, 2025

XiaomiMiMo / MiMo

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,477 64 Updated Jun 5, 2025

EvolvingLMMs-Lab / Aero-1

Python 72 6 Updated May 4, 2025

xiaobai1217 / Awesome-Video-Datasets

Video datasets

1,432 103 Updated Mar 8, 2023

deepglint / UniME

The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"

Python 75 2 Updated May 19, 2025

xiaoxing2001 / DeGLA

Official PyTorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]

Python 13 1 Updated May 1, 2025

facebookresearch / perception_models

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,346 76 Updated May 28, 2025

fengshikun / UniGEM

Python 15 5 Updated Apr 3, 2025

thuml / depyf

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Python 691 27 Updated Apr 20, 2025

mk-minchul / sapiensid

Python 9 Updated Jun 21, 2025

yeezhu / UNIT

PyTorch implementation of "UNIT: Unifying Image and Text Recognition in One Vision Encoder", NeurIPS 2024.

Python 30 2 Updated Sep 26, 2024

grokability / snipe-it

A free open source IT asset/license management system

PHP 12,376 3,444 Updated Jun 27, 2025

breezedeus / Pix2Text

An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowe…

Jupyter Notebook 2,480 227 Updated May 7, 2025

tanhuajie / Reason-RFT

⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.

Python 168 10 Updated Jun 10, 2025