GitHub - WallGuan/elit-tokenizer: ELIT Tokenizer


ELIT Tokenizer

ELIT (Emory Information and Language Technology) features an English tokenizer that splits text into a sequence of tokens and segments them into sentences using lexicon-based heuristics. This project is led by the Emory NLP Research Laboratory and is released under the Apache 2.0 license.
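To illustrate what lexicon-based tokenization and sentence segmentation can look like, here is a minimal sketch in plain Python. It does not use or reproduce the ELIT Tokenizer API: the names `ABBREVIATIONS`, `TERMINALS`, `tokenize`, and `segment` are hypothetical, and the tiny abbreviation list stands in for a real lexicon. The actual tokenizer may split text differently.

```python
import re

# Hypothetical mini-lexicon for illustration only (not ELIT's lexicon).
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "prof.", "e.g.", "i.e.", "etc."}
# Tokens assumed to end a sentence in this simplified sketch.
TERMINALS = {".", "!", "?"}


def tokenize(text):
    """Split text into tokens, keeping abbreviations from the lexicon intact."""
    tokens = []
    for chunk in text.split():
        if chunk.lower() in ABBREVIATIONS:
            # Lexicon hit: do not separate the trailing period.
            tokens.append(chunk)
        else:
            # Words (with internal apostrophes or hyphens) or single punctuation marks.
            tokens.extend(re.findall(r"\w+(?:['-]\w+)*|[^\w\s]", chunk))
    return tokens


def segment(tokens):
    """Group tokens into sentences, closing a sentence at each terminal token."""
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        if token in TERMINALS:
            sentences.append(current)
            current = []
    if current:
        # Keep trailing tokens that lack a terminal punctuation mark.
        sentences.append(current)
    return sentences


tokens = tokenize("Dr. Smith arrived. He didn't stay long!")
for sentence in segment(tokens):
    print(sentence)
```

Because "Dr." is found in the lexicon, its period is not treated as a sentence boundary, which is the kind of decision a lexicon-based heuristic is meant to handle.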

  • Latest release: 1.0 (10/15/2021)

Installation

Python 3.7 or higher is recommended:

pip install elit_tokenizer
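After installing, a quick check like the following can confirm that the interpreter meets the recommended version and that the package is visible to it. This is a sketch using only the standard library; `check_environment` is a hypothetical helper, and it assumes the importable module name matches the pip package name `elit_tokenizer`.

```python
import importlib.util
import sys


def check_environment(module_name="elit_tokenizer", min_version=(3, 7)):
    """Return (python_ok, module_found) for the current interpreter."""
    # Compare the running interpreter against the recommended minimum version.
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level module cannot be located.
    module_found = importlib.util.find_spec(module_name) is not None
    return python_ok, module_found


python_ok, module_found = check_environment()
print("Python 3.7 or higher:", python_ok)
print("elit_tokenizer importable (assumed module name):", module_found)
```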

Documentation

Contact
