NLP
[EMNLP 2022] Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning
A PyTorch-based model pruning toolkit for pre-trained language models
EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
Chinese text classification with TextCNN, TextRNN, FastText, TextRCNN, BiLSTM_Attention, DPCNN, and Transformer; based on PyTorch and ready to use out of the box.
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018)
Demo app for running a bot with Rasa Enterprise
Data augmentation for NLP, presented at EMNLP 2019
An implementation of the EDA paper for Chinese corpora: an EDA data augmentation tool for Chinese text. NLP data augmentation. Paper reading notes.
This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Large Language Model Text Generation Inference
Chinese LLaMA-2 & Alpaca-2 LLMs, phase two of the project, with 64K long-context models
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
MTEB: Massive Text Embedding Benchmark
Netease Youdao's open-source embedding and reranker models for RAG products.
FlashInfer: Kernel Library for LLM Serving
This is a continuously updated handbook for readers to easily track the latest Text-to-SQL techniques in the literature and provide practical guidance for researchers and practitioners. Official re…