An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
Updated Jun 13, 2025 - Python
A Python-based REST API for PDF OCR using AI models with PyTorch and Transformers that runs in a Docker container.