- Beijing
Starred repositories
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
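Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments.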
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Upserts, Deletes And Incremental Processing on Big Data.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Master programming by recreating your favorite technologies from scratch.
Large World Model -- Modeling Text and Video with Context Lengths in the Millions
A high-throughput and memory-efficient inference and serving engine for LLMs
Apache Spark - A unified analytics engine for large-scale data processing
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Fast and simple stream processing of files in tar files, useful for deep learning, big data, and many other applications.
Code and documentation to train Stanford's Alpaca models, and generate the data.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
🔊 Text-Prompted Generative Audio Model