MhLiao

Minghui Liao MhLiao

644 followers · 3 following

HUST
Wuhan, China

Stars

HW-whistleblower / True-Story-of-Pangu

The true story of the hardship and darkness behind the development of the Noah Pangu large model.

10,770 1,326 Updated Jul 9, 2025

shuyansy / Visual-Text-Processing-survey

The official project of the paper "Visual Text Processing: A Comprehensive Review and Unified Evaluation"

Python 69 3 Updated Jun 5, 2025

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python 5,290 323 Updated Jun 26, 2025

StarsfieldAI / R1-V

Witness the aha moment of VLM with less than $3.

Python 3,835 289 Updated May 19, 2025

yuyq96 / R1-Vision

R1-Vision: Let's first take a look at the image

Python 47 1 Updated Feb 16, 2025

getomni-ai / zerox

OCR & Document Extraction using vision models

TypeScript 11,539 778 Updated May 20, 2025

yuyq96 / TextHawk

Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models

Python 62 3 Updated Nov 1, 2024

opendatalab / MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.

Python 38,833 3,194 Updated Jul 11, 2025

OpenGVLab / GUI-Odyssey

GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 20…

Python 119 6 Updated Nov 12, 2024

OpenGVLab / OmniCorpus

[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 374 6 Updated May 5, 2025

mlfoundations / MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

819 19 Updated Jul 31, 2024

magic-research / magic-animate

[CVPR 2024] Official repository for "MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model"

Python 10,792 1,104 Updated Jun 21, 2024

luban-agi / Awesome-AIGC-Tutorials

Curated tutorials and resources for Large Language Models, AI Painting, and more.

4,254 282 Updated Mar 31, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

15,790 1,026 Updated Jul 11, 2025

lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥

Python 3,897 452 Updated Apr 28, 2025

meta-llama / codellama

Inference code for CodeLlama models

Python 16,349 1,922 Updated Aug 12, 2024

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 12,840 2,923 Updated Jul 11, 2025

e2b-dev / awesome-ai-agents

A list of AI autonomous agents

19,511 1,526 Updated Feb 26, 2025

mlfoundations / open_flamingo

An open-source framework for training large multimodal models.

Python 3,976 306 Updated Aug 31, 2024

chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…

TypeScript 35,554 5,954 Updated Mar 25, 2025

langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 111,243 18,116 Updated Jul 11, 2025

mindspore-lab / mindocr

A toolbox of OCR models and algorithms based on MindSpore

Python 276 60 Updated Apr 3, 2025

OpenGVLab / InternGPT

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editin…

Python 3,215 231 Updated Aug 20, 2024

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 23,019 2,543 Updated Aug 12, 2024

Vision-CAIR / MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,699 2,936 Updated Sep 2, 2024

buptlihang / CDLA

CDLA: A Chinese document layout analysis (CDLA) dataset

Python 271 32 Updated Sep 13, 2021

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

Python 1,351 83 Updated Jan 23, 2024

facebookresearch / ConvNeXt-V2

Code release for ConvNeXt V2 model

Python 1,788 141 Updated Aug 14, 2024

CVCUDA / CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,533 234 Updated May 21, 2025
