Starred repositories
USB Army Knife – the ultimate close access tool for penetration testers and red teamers.
A configuration framework that enhances Claude Code with specialized commands, cognitive personas, and development methodologies.
Storing long contexts in tiny caches with self-study
Automated Hypothesis Testing with Agentic Sequential Falsifications
TextGrad: Automatic "Differentiation" via Text, using large language models to backpropagate textual gradients.
Open-source Multi-agent Poster Generation from Papers
OctoTools: An agentic framework with extensible tools for complex reasoning
Manage multiple AI terminal agents like Claude Code, Aider, Codex, OpenCode, and Amp.
Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents
The official Python library for the OpenAI API
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Set up SWE-Lancer 50X faster on Morph Cloud
This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
A helpful 5-page machine learning cheatsheet to assist with exam reviews, interview prep, and anything in-between.
Framework and toolkits for building and evaluating collaborative agents that can work together with humans.
Dive endlessly deeper into a single concept using AI