MODNLP: Modular Suite of NLP Tools for Corpus Analysis

https://sourceforge.net/projects/modnlp/

modnlp aims to provide a modular architecture and tools for natural language processing, written (mainly) in Java. It provides:

  • an API and tools for (inverted) indexing, storage and retrieval of large amounts of text, with XML-based handling of metadata;

  • tools for text categorisation, including functionality for XML parsing, term set reduction (and basic keyword extraction), probabilistic classifier induction, sample classification tools, and evaluation modules;

  • a suite of corpus management, curation and distributed access tools.
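
For readers unfamiliar with the term, the following is a minimal sketch of what an inverted index with positional postings does, and how frequency counts and concordance (keyword-in-context) lines can both be read from it. It is a generic illustration only and does not use or reproduce modnlp's API: the class InvertedIndexSketch and its methods add, frequency and concordance are hypothetical names chosen for this example, and the tokenisation rule (split on non-letters, lower-case) is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: NOT modnlp's API. All names here are hypothetical.
class InvertedIndexSketch {
    // term -> token positions at which the term occurs (the "postings")
    private final Map<String, List<Integer>> postings = new TreeMap<>();
    // the token stream, kept so that context can be rebuilt around a hit
    private final List<String> tokens = new ArrayList<>();

    // Assumed tokenisation: split on runs of non-letters, then lower-case.
    void add(String text) {
        for (String raw : text.split("[^\\p{L}]+")) {
            if (raw.isEmpty()) {
                continue; // split() yields an empty first token if text starts with a separator
            }
            String term = raw.toLowerCase(Locale.ROOT);
            postings.computeIfAbsent(term, k -> new ArrayList<>()).add(tokens.size());
            tokens.add(term);
        }
    }

    // Frequency of a term is the length of its postings list.
    int frequency(String term) {
        return lookup(term).size();
    }

    // One keyword-in-context line per occurrence, with `window` tokens on each side.
    List<String> concordance(String term, int window) {
        List<String> lines = new ArrayList<>();
        for (int pos : lookup(term)) {
            int from = Math.max(0, pos - window);
            int to = Math.min(tokens.size(), pos + window + 1);
            lines.add(String.join(" ", tokens.subList(from, to)));
        }
        return lines;
    }

    private List<Integer> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptyList());
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.add("The body politic and the body natural.");
        System.out.println(idx.frequency("body"));
        idx.concordance("body", 1).forEach(System.out::println);
    }
}
```

Because both the count and the context lines are derived from the same postings list, the two views cannot disagree in this sketch. A real indexer would additionally persist the postings to disk and attach document metadata to each hit.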

If you use the tool, please consider citing the following article:

Luz, S., & Sheehan, S. (2020). Methods and visualization tools for the analysis of medical, political and scientific concepts in Genealogies of Knowledge. Palgrave Communications, 6(1), 1-20.

The modnlp suite is distributed under the GNU General Public License. See COPYING (in the distribution tar archives) for details.

The development version is available via Git through SourceForge, e.g.:

git clone git://git.code.sf.net/p/modnlp/code modnlp-code

or

git clone http://git.code.sf.net/p/modnlp/code modnlp

(see http://sourceforge.net/p/modnlp/code/ci/master/tree/ )

See the top-level README for how to obtain the required external (third-party) libraries.

Packages:

  • modnlp-teccli-VERSION-bin-gok: executable binaries of the concordance browser and visualisation software, with configuration files set for access to the Genealogies of Knowledge (GoK) project.

    • 0.8.6-bin-gok: bug fixes in idx files.

  • modnlp-idx-VERSION-bin-gok: text and metadata indexer, pre-configured for the GoK corpus.

    • 0.8.5: bug fixes (corrected a discrepancy between frequency lists and concordances).

  • modnlp-tecser-VERSION-bin.tar.gz: corpus server software.

    • 0.1.4: bug fixes in idx files.

  • modnlp-VERSION.tar.gz: full source code (tc, idx, tecser, teccli).

    • 0.2.1: stable releases of teccli 0.8.6, idx 0.8.5 and tecser 0.1.4.
