shijinpjlab/llm-webkit-mirror

Changelog

  • 2024/11/25: Project Initialization

Table of Contents

  1. llm-web-kit
  2. TODO
  3. Known Issues
  4. FAQ
  5. All Thanks To Our Contributors
  6. License Information
  7. Acknowledgments
  8. Citation
  9. Star History
  10. Links

llm-web-kit

Project Introduction

llm-web-kit is a Python library that ..

Key Features

  • Remove headers, footers, footnotes, page numbers, etc., to ensure semantic coherence.
  • Output text in human-readable order, suitable for single-column, multi-column, and complex layouts.

Quick Start

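from typing import Optional

from loguru import logger
from llm_web_kit.simple import extract_html_to_md


def extract(url: str, html: str) -> Optional[str]:
    """Convert an HTML page to Markdown; return None if extraction fails."""
    try:
        # or: extract_html_to_mm_md(url, html)
        return extract_html_to_md(url, html)
    except Exception:
        # Log the full traceback, then signal failure to the caller.
        logger.exception("Failed to extract Markdown from {}", url)
        return None


if __name__ == "__main__":
    url = ""   # URL of the page the HTML came from
    html = ""  # raw HTML of that page
    markdown = extract(url, html)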
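
The Quick Start call converts one page at a time. As a rough sketch of how it might be applied to a folder of saved pages, the helper below runs an extractor over every `*.html` file in a directory. The name `convert_dir`, the `base_url` parameter, and the directory layout are assumptions made for illustration and are not part of llm-web-kit. The extractor is passed in as an argument, so the sketch runs without the library installed.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def convert_dir(
    src_dir: Path,
    extractor: Callable[[str, str], Optional[str]],
    base_url: str = "",
) -> Dict[str, str]:
    """Run extractor(url, html) on every *.html file directly under src_dir.

    Returns a mapping of file name to Markdown. Files for which the
    extractor raises or returns None are skipped.
    """
    results: Dict[str, str] = {}
    for path in sorted(src_dir.glob("*.html")):
        # Tolerate badly encoded pages instead of aborting the whole batch.
        html = path.read_text(encoding="utf-8", errors="replace")
        try:
            # Hypothetical URL scheme: base_url followed by the file name.
            markdown = extractor(base_url + path.name, html)
        except Exception:
            continue
        if markdown is not None:
            results[path.name] = markdown
    return results
```

With the library installed, the call would presumably look like `convert_dir(Path("pages"), extract_html_to_md)`, where `pages` is a hypothetical directory of saved HTML files.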

Usage

TODO

Known Issues

FAQ

All Thanks To Our Contributors

License Information

Acknowledgments

Citation

Star History

Links
