hunterheiden / Starred · GitHub

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Python 7,983 758 Updated Jun 22, 2025

Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models

768 50 Updated May 21, 2025

We collect papers on large language models (LLMs) for table-related tasks, e.g., using LLMs for table QA. A curated list of "table + LLM" papers.

511 37 Updated Jul 9, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 41,887 3,341 Updated Jul 11, 2025

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

Python 159 11 Updated Apr 3, 2024

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Python 2,696 216 Updated Jun 19, 2025
Jupyter Notebook 5 1 Updated Jan 6, 2023
Jupyter Notebook 6 2 Updated Feb 22, 2022

We release the UICaption dataset. The dataset consists of UI images (icons and screenshots) and associated text descriptions. This dataset was used to pre-train the Lexi model which provides a gene…

Python 41 6 Updated Nov 29, 2022

A curated list of awesome Web Font Icons

1,419 77 Updated Mar 27, 2025

The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”

Python 963 57 Updated Jan 30, 2024

The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format, and describe the UI elements present on the screen: their type, loca…

72 11 Updated Mar 7, 2024

A One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 2,736 332 Updated Jul 11, 2025

A collection of useful .gitignore templates

167,949 83,051 Updated Jul 7, 2025

ScreenQA dataset was introduced in the "ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots" paper. It contains ~86K question-answer pairs collected by human annotators for ~35K…

Python 120 9 Updated Feb 7, 2025

[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist web agents

Jupyter Notebook 842 112 Updated Apr 3, 2025

Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments

Jupyter Notebook 61 3 Updated Aug 19, 2024

MiniCPM4: Ultra-Efficient LLMs on End Devices, achieving a 5x+ speedup on typical end-side chips

Jupyter Notebook 8,079 503 Updated Jul 8, 2025

An open-source framework for training large multimodal models.

Python 3,976 306 Updated Aug 31, 2024

The model, data and code for the visual GUI Agent SeeClick

HTML 398 19 Updated Nov 22, 2024

Machine Learning Engineering Open Book

Python 14,282 862 Updated Jul 9, 2025

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 2,029 186 Updated Jun 30, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 39,306 4,460 Updated Jul 11, 2025
Jupyter Notebook 117 17 Updated Dec 4, 2023

The dataset includes widget captions that describe UI elements' functionalities. It is used for training and evaluating the widget captioning model (please see the EMNLP'20 paper: https://arxiv…

22 2 Updated Jun 24, 2021

It includes two datasets used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item Selection (VIS) data. Both datasets are written as TFRecords.

44 4 Updated Aug 2, 2021

Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"

Assembly 555 42 Updated Dec 28, 2024

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

Python 2,830 208 Updated Mar 8, 2024

Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)

Python 280 41 Updated Feb 13, 2023

OCR Annotations from Amazon Textract for Industry Documents Library

Python 102 7 Updated Aug 20, 2022