Stars
Rust library for indexing and quickly searching large pretraining corpora
Repository containing code for the paper on identification of source domains by contrastive learning
The official repo for the GlobalBias dataset and associated paper: 'Who is better at math, Jenny or Jingzhen? Exploring Intersectional Biases in Large Language Models'
Calculate perplexity on a text with pre-trained language models. Supports masked LMs (e.g. DeBERTa), autoregressive LMs (e.g. GPT-3), and encoder-decoder LMs (e.g. Flan-T5).
ChatArena (or Chat Arena) provides multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
Evaluating Text Representations on Lexical Composition
A browser extension that alerts you when you navigate to a website belonging to an organization whose employees are on strike.
Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition, EACL 2021"
A BERT-based Chinese Text Encoder Enhanced by N-gram Representations
A curated list of pretrained sentence and word embedding models
Zebra Crossing: an easy-to-use digital safety checklist
Super easy library for BERT-based NLP models
dalinvip / LatticeLSTM
Forked from jiesutd/LatticeLSTM. Chinese NER using Lattice LSTM. Code for ACL 2018 paper.
A fast LSTM language model for large-vocabulary languages like Japanese and Chinese
Code for the ACL 2018 paper "Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context"
Python Flask & jQuery AJAX sample app