Stars
Robyn is a Super Fast Async Python Web Framework with a Rust runtime.
A toolkit for evaluation of natural language generation (NLG), including BLEU, ROUGE, METEOR, and CIDEr.
LingoWhale-8B: Open Bilingual LLMs | Open-source bilingual pretrained large language models
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
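MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.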
Code for our paper "Mask-Align: Self-Supervised Neural Word Alignment" in ACL 2021
Matplotlib styles for scientific plotting
Chinese version of GPT2 training code, using BERT tokenizer.
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
Productive, portable, and performant GPU programming in Python.
Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization
Learning to Copy for Automatic Post-Editing (EMNLP 2019)
Offline documentation built from the official PyTorch release
ShadowsocksX-NG-R: Shadowsocks(R) Client for macOS
A TensorFlow implementation of CapsNet (Capsule Network) from the paper "Dynamic Routing Between Capsules"
📚 Solutions to Introduction to Algorithms Third Edition
All Algorithms implemented in Python
A list of machine translation open-source toolkits maintained by Tsinghua Natural Language Processing Group
A text generation reading list maintained by Tsinghua Natural Language Processing Group.
Improving the Transformer translation model with document-level context
A machine translation reading list maintained by Tsinghua Natural Language Processing Group
An open-source classical Chinese information processing toolkit developed by Tsinghua Natural Language Processing Group
An open-source neural machine translation toolkit developed by Tsinghua Natural Language Processing Group