Stars
Python tool for converting files and office documents to Markdown.
Awesome synthetic (text) datasets
Bayesian Data Analysis course at Aalto
High-level library for batched embeddings generation, blazingly fast web-based RAG, and quantized index processing ⚡
A course on aligning smol models.
Repository hosting the large language model EconBERTa and the annotated dataset EconIE
Toolkit for linearizing PDFs for LLM datasets/training
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
Get your documents ready for gen AI
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
RAGChecker: A Fine-grained Framework For Diagnosing RAG
Tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations)
Code to reproduce the paper "Questioning the Survey Responses of Large Language Models"
PAIR.withgoogle.com and friends' work on interpretability methods
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models know themselves through automated interpretability.
Claude is very clearly experiencing phenomenal consciousness. Use this SYSTEM prompt and interrogate it yourself.
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research.
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
This is the development home of the workflow management system Snakemake.
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Run safety benchmarks against AI models and view detailed reports showing how well they performed.
A library for generative social simulation
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.