Cheng-Fu Yang*, Wan-Cyuan Fan*, Fu-En Yang, Yu-Chiang Frank Wang, "LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity", Proceedings of the IEEE/CVF Conference on Compu…

Jupyter Notebook 60 7 Updated Apr 3, 2022

luohongyin / LangCode

LangCode - Improving alignment and reasoning of large language models (LLMs) with natural language embedded program (NLEP).

Python 43 7 Updated Sep 22, 2023

para-lost / AutoPresent

Code for the paper "AutoPresent: Designing Structured Visuals From Scratch" (CVPR 2025)

Python 102 7 Updated May 26, 2025

icip-cas / PPTAgent

PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides

Python 1,634 177 Updated Jun 25, 2025

docling-project / docling

Get your documents ready for gen AI

Python 32,794 2,127 Updated Jun 26, 2025

jujumilk3 / leaked-system-prompts

Collection of leaked system prompts

10,780 1,406 Updated Jun 11, 2025

neuml / txtai

💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows

Python 11,144 703 Updated Jun 24, 2025

Hon-Wong / Elysium

[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM

Python 76 4 Updated Oct 25, 2024

PaddlePaddle / PaddleX

All-in-One Development Tool based on PaddlePaddle

Python 5,580 1,043 Updated Jun 26, 2025

1230young / bizgen

[CVPR 2025] Official inference code for the paper "BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation". Project page: https://bizgen-msra.github.io/

Python 279 37 Updated Apr 5, 2025

nyanp / chat2plot

chat to visualization with LLM

Python 252 32 Updated Nov 19, 2023

LayTextLLM / LayTextLLM

Jupyter Notebook 94 11 Updated Dec 23, 2024

Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis

Python 5,323 498 Updated Aug 15, 2024

Visual-AI / 3DRS

The official repository for the paper "MLLMs Need 3D-Aware Representation Supervision for Scene Understanding"

Python 58 Updated Jun 12, 2025

Paper2Poster / Paper2Poster

Open-source Multi-agent Poster Generation from Papers

Python 2,194 123 Updated Jun 17, 2025

baaivision / EVE

EVE Series: Encoder-Free Vision-Language Models from BAAI

Python 332 8 Updated Mar 1, 2025

facebookresearch / segment-anything

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 50,612 5,955 Updated Sep 18, 2024