Stars
Text similarity (matching) computation, providing baselines, training, inference, metric analysis... Code is available in both TensorFlow and PyTorch versions.
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
NLP sentence encoding, sentence embeddings, and semantic similarity: BERT_avg, BERT_whitening, SBERT, SimCSE
Facilitating the design, comparison and sharing of deep text matching models.
Chinese text classification using BERT and ERNIE
HDLTex: Hierarchical Deep Learning for Text Classification
Hierarchy-Aware Global Model for Hierarchical Text Classification
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Heterogeneous, Task- and Domain-Specific Benchmark for Unsupervised Sentence Embeddings used in the TSDAE paper: https://arxiv.org/abs/2104.06979.
Code for the CIKM'19 paper "Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network Approach"
Open and Knowledgeable NLP Toolkit including CWS, POS Tagging, NER, and Entity Typing
You call this piece of junk operating system source code: reading the Linux 0.11 core code like a novel
A list of contrastive learning papers
A curated list of papers dedicated to neural text (semantic) matching.
State-of-the-Art Text Embeddings
Must-read papers on prompt-based tuning for pre-trained language models.
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
A library for efficient similarity search and clustering of dense vectors.
Implementation of the AAAI'21 paper: Nested Named Entity Recognition with Partially Observed TreeCRFs
Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)
Chinese text classification with CNN and RNN, based on TensorFlow