wiertz/tcorpus: High level tools and wrappers for text corpus preparation and analysis

tcorpus

tcorpus is a collection of high-level tools for text corpus preparation and discourse analysis. It is being developed and used for research at the Chair of Economic Geography and Sustainable Development, University of Freiburg. Things may change and break regularly, but you are welcome to see if any of it is useful.

The package relies on several dependencies for natural language processing tasks. Among others, it uses

  • flair for named entity recognition
  • syntok for segmentation and tokenization
  • NLTK for parsing grammatical structures

While tcorpus is free to use and distribute under an MIT License, this may not be the case for all dependencies. Please check whether the dependency licenses cover your use case.
