Stars
Code associated with NLPeer: A unified resource for the study of peer review
SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.
This repository provides details and links to the ACL Anthology corpus/collection, including .bib, .pdf, and GROBID extractions of the PDFs.
Python package to parse PDFs with different parsers
Agent Framework / shim to use Pydantic with LLMs
Simple, unified interface to multiple Generative AI providers
Split bib files for anthology bibliography for overleaf
LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models.
Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
Curated list of datasets and tools for post-training.
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
Convert PDF to HTML without losing text or format.
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
Minimal and clean examples of machine learning algorithm implementations
Awesome synthetic (text) datasets
A summary of existing representative LLM text datasets.
Simple, open source, lightweight and privacy-friendly web analytics alternative to Google Analytics.
Powerful topic model visualization in Python
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
[EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.
Class-specific Keyword Extraction code, for the KONVENS 2024 paper: "An Improved Method for Class-specific Keyword Extraction: A Case Study in the German Business Registry"
Build and share delightful machine learning apps, all in Python. Star to support our work!
Library supporting NLP and CV research on scientific papers
A collection of 1000+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML).