- Bay Area, CA, USA
- chuckwooters.com
- in/chuck-wooters
- @chuckw.bsky.social
NLP
NeuSpell: A Neural Spelling Correction Toolkit
A library to synthesize text datasets using Large Language Models (LLMs)
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
An annotated implementation of the Transformer paper.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
Large-scale pretrained models for goal-directed dialog
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Neural information retrieval / Semantic search / Bi-encoders
Curated list of awesome tools, demos, docs for ChatGPT and GPT-3
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
MTEB: Massive Text Embedding Benchmark
Retrieval and Retrieval-augmented LLMs
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.