GitHub - hparreao/doclingconverter: Quick way to convert files (PDF, DOCX, HTML, PPTX, Images) to (MD, JSON, YAML) using Docling and Streamlit


doclingconverter

A Streamlit web application for converting various document formats using the Docling library.

Streamlit Application: https://doclingconvert.streamlit.app/

Features

  • Convert multiple document formats (PDF, DOCX, HTML, PPTX, Images)
  • Multiple output formats (Markdown, JSON, YAML)
  • OCR support for scanned documents
  • Advanced image resolution settings
  • Clean and intuitive interface
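The conversion behind these features could look roughly like the sketch below. It is not taken from this repository's app.py: the names `EXTENSIONS`, `output_name`, `serialize`, and `convert_file` are hypothetical, and the sketch assumes Docling's `DocumentConverter` with the document's `export_to_markdown()` and `export_to_dict()` methods, plus PyYAML for the YAML output.

```python
import json
from pathlib import Path

# Hypothetical mapping from the output formats listed above to file extensions.
EXTENSIONS = {"Markdown": ".md", "JSON": ".json", "YAML": ".yaml"}


def output_name(source_name: str, output_format: str) -> str:
    """Build the name of the converted file, e.g. report.pdf -> report.md."""
    return Path(source_name).stem + EXTENSIONS[output_format]


def serialize(document, output_format: str) -> str:
    """Render a converted Docling document as text in the requested format."""
    if output_format == "Markdown":
        return document.export_to_markdown()
    data = document.export_to_dict()
    if output_format == "JSON":
        return json.dumps(data, indent=2, ensure_ascii=False)
    if output_format == "YAML":
        import yaml  # PyYAML; assumed to be installed alongside Docling

        return yaml.safe_dump(data, allow_unicode=True, sort_keys=False)
    raise ValueError(f"Unsupported output format: {output_format}")


def convert_file(path: str, output_format: str = "Markdown") -> str:
    """Convert one document with Docling's default settings."""
    # Imported lazily because loading Docling and its models is slow.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(path)
    return serialize(result.document, output_format)
```

For example, `convert_file("report.pdf", "JSON")` would return the document as a JSON string, and `output_name("report.pdf", "JSON")` gives a matching file name for saving it.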

Installation

  1. Clone the repository:
     git clone https://github.com/hparreao/doclingconverter.git
     cd doclingconverter
  2. Create a virtual environment and install dependencies:
     python -m venv venv
     source venv/bin/activate  # On Windows: venv\Scripts\activate
     pip install -r requirements.txt
  3. Run the app locally:
     streamlit run app.py

Usage

  1. Select the document type from the dropdown
  2. Upload your document
  3. Choose the desired output format
  4. Adjust advanced settings if needed
  5. Click "Start Conversion"
  6. Download the converted file
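The six steps above can be sketched as a small Streamlit script. This is an illustration under assumptions, not the repository's actual app.py: the widget labels, `UPLOAD_TYPES`, `save_upload`, and `main` are hypothetical, only Markdown output is shown to keep it short, and the OCR and image-scale settings assume Docling's `PdfPipelineOptions` fields `do_ocr` and `images_scale`, which here apply to PDF input only.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from document type to accepted upload extensions.
UPLOAD_TYPES = {
    "PDF": ["pdf"],
    "DOCX": ["docx"],
    "HTML": ["html", "htm"],
    "PPTX": ["pptx"],
    "Image": ["png", "jpg", "jpeg"],
}


def save_upload(name: str, data: bytes) -> Path:
    """Write uploaded bytes to a temp file, keeping the original extension."""
    suffix = Path(name).suffix.lower()
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
    return Path(tmp.name)


def main() -> None:
    # Imported here so the helper above works without these packages installed.
    import streamlit as st
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    st.title("doclingconverter")
    doc_type = st.selectbox("Document type", list(UPLOAD_TYPES))        # step 1
    uploaded = st.file_uploader("Upload", type=UPLOAD_TYPES[doc_type])  # step 2
    st.selectbox("Output format", ["Markdown"])                         # step 3
    with st.expander("Advanced settings"):                              # step 4
        use_ocr = st.checkbox("Use OCR", value=True)
        scale = st.slider("Image scale", 1.0, 4.0, 2.0)

    if uploaded is not None and st.button("Start Conversion"):          # step 5
        path = save_upload(uploaded.name, uploaded.getvalue())
        try:
            options = PdfPipelineOptions(do_ocr=use_ocr, images_scale=scale)
            converter = DocumentConverter(
                format_options={
                    InputFormat.PDF: PdfFormatOption(pipeline_options=options)
                }
            )
            markdown = converter.convert(path).document.export_to_markdown()
        finally:
            path.unlink()  # remove the temporary copy of the upload
        st.download_button(                                             # step 6
            "Download", markdown,
            file_name=Path(uploaded.name).stem + ".md",
            mime="text/markdown",
        )
```

To try it, add a `main()` call at the end of the file and launch it with `streamlit run`. The upload goes through a temporary file because the sketch passes Docling a file path instead of an in-memory stream.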
