MhLiao (Minghui Liao) / Starred · GitHub

The true story of the heartache and darkness behind the development of the Noah Pangu large model.

10,770 1,326 Updated Jul 9, 2025

The official project of the paper "Visual Text Processing: A Comprehensive Review and Unified Evaluation"

Python 69 3 Updated Jun 5, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,290 323 Updated Jun 26, 2025

Witness the aha moment of VLM with less than $3.

Python 3,835 289 Updated May 19, 2025

R1-Vision: Let's first take a look at the image

Python 47 1 Updated Feb 16, 2025

OCR & Document Extraction using vision models

TypeScript 11,539 778 Updated May 20, 2025

Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models

Python 62 3 Updated Nov 1, 2024

A high-quality, one-stop open-source data extraction tool for converting PDF to Markdown and JSON.

Python 38,833 3,194 Updated Jul 11, 2025

GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 20…

Python 119 6 Updated Nov 12, 2024

[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 374 6 Updated May 5, 2025

MINT-1T: A one trillion token multimodal interleaved dataset.

819 19 Updated Jul 31, 2024

[CVPR 2024] Official repository for "MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model"

Python 10,792 1,104 Updated Jun 21, 2024

Curated tutorials and resources for Large Language Models, AI Painting, and more.

4,254 282 Updated Mar 31, 2024

✨✨Latest Advances on Multimodal Large Language Models

15,790 1,026 Updated Jul 11, 2025

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥

Python 3,897 452 Updated Apr 28, 2025

Inference code for CodeLlama models

Python 16,349 1,922 Updated Aug 12, 2024

Ongoing research training transformer models at scale

Python 12,840 2,923 Updated Jul 11, 2025
Python 784 45 Updated Jul 8, 2024

A list of AI autonomous agents

19,511 1,526 Updated Feb 26, 2025

An open-source framework for training large multimodal models.

Python 3,976 306 Updated Aug 31, 2024

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…

TypeScript 35,554 5,954 Updated Mar 25, 2025

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 111,243 18,116 Updated Jul 11, 2025

A toolbox of ocr models and algorithms based on MindSpore

Python 276 60 Updated Apr 3, 2025

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editin…

Python 3,215 231 Updated Aug 20, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 23,019 2,543 Updated Aug 12, 2024

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,699 2,936 Updated Sep 2, 2024

CDLA: A Chinese document layout analysis (CDLA) dataset

Python 271 32 Updated Sep 13, 2021

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

Python 1,351 83 Updated Jan 23, 2024

Code release for ConvNeXt V2 model

Python 1,788 141 Updated Aug 14, 2024

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,533 234 Updated May 21, 2025