cryingjin (Yejin Lee) / Starred · GitHub
😤
  • LG AI Research
  • Seoul, Korea

Starred repositories

This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training data, instruction fine-tuning data, and In-Context learning …

49 4 Updated May 7, 2025

Korean MM Benchmarks Evaluation code

Python 2 1 Updated Mar 15, 2025

Get your documents ready for gen AI

Python 33,659 2,236 Updated Jul 4, 2025
Python 7 Updated Jun 19, 2025

Official repository for EXAONE Deep built by LG AI Research

391 22 Updated Jun 2, 2025
Jupyter Notebook 5 1 Updated Jan 6, 2023

Resources on Large Language Models for Table Processing

102 7 Updated Oct 24, 2024

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Python 1,997 177 Updated Jul 4, 2025

Datasets and Evaluation Scripts for CompHRDoc

Python 45 7 Updated Feb 25, 2025

Parse PDFs into markdown using Vision LLMs

Python 394 53 Updated Feb 8, 2025

Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step

Python 129 6 Updated Feb 17, 2025

[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation

Python 561 49 Updated Jul 4, 2025

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,705 674 Updated Feb 10, 2025

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Python 1,415 113 Updated Apr 14, 2025

Official implementation of the ANLS* metric

Python 19 Updated Jun 16, 2025

This is the official release of the datasets introduced in the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding.

5 Updated Nov 14, 2024

An open-source implementation for fine-tuning Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.

Python 908 124 Updated Jun 25, 2025

Document Artificial Intelligence

180 8 Updated Apr 23, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 51,466 8,499 Updated Jul 4, 2025

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …

HTML 11,807 975 Updated Jul 3, 2025

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

Python 6,598 530 Updated Jun 11, 2025

Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)

Python 53 5 Updated Jun 26, 2025
Python 13 4 Updated Mar 13, 2024

KoCLIP: Korean port of OpenAI CLIP, in Flax

Python 152 19 Updated Aug 22, 2023
Python 3,986 376 Updated Jun 13, 2025

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

Python 2,491 184 Updated Apr 2, 2025
Python 136 10 Updated Feb 13, 2024

Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).

1,206 59 Updated Jun 28, 2024

InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024)

Python 160 6 Updated May 31, 2024