💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Updated Jul 4, 2025 - Rust
Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through language. In 1950, Alan Turing published an article, "Computing Machinery and Intelligence," that proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.
Rust-native, ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2, ...)
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Natural language detection library for Rust. Try demo online: https://whatlang.org/
The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
The Jieba Chinese Word Segmentation Implemented in Rust
A fast, low-resource Natural Language Processing and Text Correction library written in Rust.
🎤 vibrato: Viterbi-based accelerated tokenizer
Checks all your documentation for spelling and grammar mistakes, using hunspell for spelling and an nlprule-based checker for grammar
🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
🕷️ The pipeline for the OSCAR corpus
Simple NLP in Rust with Python bindings
Use multiple LLM backends in a single crate, simple builder-based configuration, and built-in prompt chaining & templating.
Official Rust Implementation of Model2Vec
Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers)
Created by Alan Turing