Stars
Example models using DeepSpeed
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Chat with any PDF. Easily upload the PDF documents you'd like to chat with. Instant answers. Ask questions, extract information, and summarize documents with AI. Sources included.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
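A repository for pretraining from scratch and SFT of a small-parameter Chinese LLaMa2; a single 24G GPU is enough to obtain a chat-llama2 with basic Chinese question-answering ability.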
Making large AI models cheaper, faster and more accessible
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
A series of large language models developed by Baichuan Intelligent Technology
LAVIS - A One-stop Library for Language-Vision Intelligence
ChatGLM-6B: An Open Bilingual Dialogue Language Model
SGPT: GPT Sentence Embeddings for Semantic Search
Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving)
Home of StarCoder: fine-tuning & inference!
waterzxj / the-algorithm
Forked from twitter/the-algorithm. Source code for Twitter's Recommendation Algorithm
A simple and fast KD-tree for points in Python for kNN or nearest points. (Damn short at just ~60 lines.) No libraries needed.
Implementation of some baseline models based on BERT for text classification
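Collecting, organizing, and publishing Chinese natural language processing corpora and datasets, to advance Chinese natural language processing together with like-minded people.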
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Large Scale Chinese Corpus for NLP
A natural language modeling framework based on PyTorch
All kinds of text classification models and more with deep learning
Classic papers and resources on recommendation
An Open-source Neural Hierarchical Multi-label Text Classification Toolkit
A system for quickly generating training data with weak supervision