starsy / Starred · GitHub
  • Cisco Systems
  • Shanghai, China


Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Python 734 49 Updated Sep 27, 2024

Convert PDF to markdown + JSON quickly with high accuracy

Python 26,390 1,711 Updated Jul 7, 2025

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …

Python 8,178 704 Updated Jul 7, 2025

LLM inference in C/C++

C++ 82,720 12,293 Updated Jul 8, 2025

🍦 Speech-AI-Forge is a project built around TTS generation models, implementing an API server and a Gradio-based WebUI.

Python 1,289 172 Updated Jul 7, 2025

Officially recommended ChatTTS resource collection project, gathering related resources and frequently asked questions from across the web

1,717 101 Updated Jul 3, 2024

Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali

Python 2,301 154 Updated Jul 3, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 53,829 6,588 Updated Jul 8, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 51,756 8,567 Updated Jul 8, 2025

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 15,040 1,273 Updated May 23, 2024

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 17,763 1,190 Updated Jul 8, 2025

A Python library to extract tabular data from PDFs

Python 3,344 500 Updated Jul 8, 2025

Go ahead and axolotl questions

Python 9,842 1,064 Updated Jul 8, 2025

Production-ready platform for agentic workflow development.

TypeScript 106,145 16,053 Updated Jul 8, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 8,542 735 Updated Jul 8, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,133 261 Updated Jul 6, 2025

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,891 522 Updated Apr 11, 2025

Security and compliance proxy for LLM APIs

JavaScript 47 9 Updated Jul 21, 2023

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

Python 25,065 3,417 Updated Jul 8, 2025

A ChatGPT demo web page built with Express and Vue3

Vue 31,957 11,162 Updated Aug 16, 2024

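LLM API management and distribution system supporting mainstream models including OpenAI, Azure, Anthropic Claude, Google Gemini, DeepSeek, ByteDance Doubao, ChatGLM, ERNIE Bot, iFlytek Spark, Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. Provides unified API adaptation and can be used for key management and redistribution. Single executable, Docker image provided, one-click deployment, works out of the box. LLM API management & k…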

JavaScript 26,033 5,280 Updated Feb 21, 2025

Container plugin for Slurm Workload Manager

C 353 35 Updated Nov 6, 2024

🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transf…

Go 33,743 2,614 Updated Jul 8, 2025

💬 MaxKB is an open-source AI assistant for enterprise. It seamlessly integrates RAG pipelines, supports robust workflows, and provides MCP tool-use capabilities.

Python 16,998 2,203 Updated Jul 8, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,788 1,438 Updated Jun 30, 2025

This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.

JavaScript 130,291 17,275 Updated Jun 18, 2025

Minimal keyword extraction with BERT

Python 3,927 373 Updated Mar 25, 2025

structured outputs for llms

Python 10,907 815 Updated Jul 8, 2025

Joplin - the privacy-focused note taking app with sync capabilities for Windows, macOS, Linux, Android and iOS.

TypeScript 50,208 5,390 Updated Jul 7, 2025

Fork of turndown-plugin-gfm for Joplin

JavaScript 12 6 Updated Jun 27, 2021