Stars
Official PyTorch Implementation of HIER: Metric Learning Beyond Class Labels via Hierarchical Regularization, CVPR 2023
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
Phylogenetic orthology inference for comparative genomics
Official code repository of "BERTology Meets Biology: Interpreting Attention in Protein Language Models."
GTDB-Tk: a toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A Unified Library for Parameter-Efficient and Modular Transfer Learning
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
Implementation of ICLR 21 paper: Probing BERT in Hyperbolic Spaces
Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning"
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Foundation Models for Genomics & Transcriptomics
DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome
[ICLR 2024] DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome