Releases: clarinsi/classla

Modernise allowed range of Python versions

06 May 15:31

With this release, the allowed Python versions range from 3.7 to 3.13.

The core functionality remains the same.
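
As a rough illustration, the stated version range can be checked at runtime before installing or importing the package. This is a minimal sketch, not part of classla: the names `MIN_PYTHON`, `MAX_PYTHON`, and `is_supported` are hypothetical, and both bounds are assumed to be inclusive.

```python
import sys

# Bounds taken from the release note; treating both ends as inclusive
# is an assumption.
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 13)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the allowed range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


if __name__ == "__main__":
    print("Interpreter in supported range:", is_supported())
```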

Update SL models, add type="spoken"

13 Feb 14:16

Update Slovenian models:

  • Standard parser
  • Standard NER tagger

Add a new spoken type for SL:

  • Spoken morphosyntactic tagger
  • Spoken lemmatizer
  • Spoken UD dependency parser
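
A sketch of how the spoken models might be selected, assuming the `type` keyword of `classla.download` and `classla.Pipeline` accepts the value `"spoken"` as the release title suggests. The helper names `check_type` and `build_pipeline` are hypothetical, and `KNOWN_TYPES` lists only the processing types mentioned in these release notes; the library may support others.

```python
# Processing types named in these release notes (assumption: not exhaustive).
KNOWN_TYPES = {"standard", "web", "spoken"}


def check_type(processing_type):
    """Raise ValueError for a processing type not named in the release notes."""
    if processing_type not in KNOWN_TYPES:
        raise ValueError(f"unknown processing type: {processing_type!r}")
    return processing_type


def build_pipeline(lang="sl", processing_type="spoken"):
    """Download models and build a pipeline for the given language and type.

    The notes mention the spoken type for Slovenian ("sl") only.
    """
    import classla  # imported lazily so the helper above works without it

    check_type(processing_type)
    classla.download(lang, type=processing_type)
    return classla.Pipeline(lang, type=processing_type)
```

Calling `build_pipeline()` requires classla to be installed and triggers a model download on first use.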

v2.1.1

10 Apr 11:10

Added reldi-tokeniser 1.0.3 as a dependency; this version resolves a bug in abbreviation loading.
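
To confirm that an environment has picked up the fixed tokenizer, the installed version can be inspected with the standard library. This is a minimal sketch: `installed_version` and `meets_minimum` are hypothetical helpers, the distribution name is assumed to be `reldi-tokeniser` as written above, and the comparison handles only plain numeric versions such as `1.0.3`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(version, minimum=(1, 0, 3)):
    """Compare a dotted numeric version against a minimum (assumed 1.0.3)."""
    parts = tuple(int(p) for p in version.split(".")[:3] if p.isdigit())
    return parts >= minimum


if __name__ == "__main__":
    found = installed_version("reldi-tokeniser")
    if found is None:
        print("reldi-tokeniser is not installed")
    else:
        print(found, "ok" if meets_minimum(found) else "older than 1.0.3")
```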

v2.1

08 Aug 07:32
e3ace3b
  • Added new models for all languages
  • Added new "web" processing type
  • Fixed sentence splitting in the tokenizers

v2.0

16 Feb 18:41
  • Added new models for standard Slovenian
  • Added new inflectional lexicon for Slovenian
  • Adapted tests to new model outputs
  • Modified lexicon to store underscores instead of empty strings
  • Other changes
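
The lexicon change follows the common convention of marking an empty field with an underscore rather than an empty string. The sketch below only illustrates that convention; `normalise_field`, `normalise_entry`, and the field names in the example are hypothetical and do not reflect the actual lexicon format.

```python
def normalise_field(value):
    """Map an empty or missing field to '_' and leave other values unchanged."""
    return value if value else "_"


def normalise_entry(entry):
    """Apply normalise_field to every value of a dict-like lexicon entry."""
    return {key: normalise_field(value) for key, value in entry.items()}


if __name__ == "__main__":
    # Hypothetical entry with one empty field.
    print(normalise_entry({"form": "hiše", "lemma": "hiša", "feats": ""}))
```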

v1.2.0

29 Jun 11:32
  • Added SRL parsing for Slovenian
  • Fixed training for the lemmatizer and POS tagger
  • Added toy tests for all trainings
  • Other smaller fixes

v1.1.1

06 May 09:21
  • Updated external package version requirements, mainly due to updates in the Slovenian obeliks tokenizer

v1.1.0

12 Jan 09:36
  • Added tokenizer pretag option for both obeliks and reldi-tokeniser (via pos_lemma_pretag)
  • Updated the Slovene inflectional lexicon and moved it from the lemmatizer model to the morphosyntactic annotation model
  • Added upos and ufeats control to Slovene inflectional lexicon
  • Other smaller fixes

v1.0.2

07 Sep 08:21
  • Fixed an issue where the parser produced non-CoNLL-U-compliant extension labels with underscores (e.g. cc_preconj) instead of colon-separated labels (e.g. cc:preconj)
  • During lemmatization, if a token consists of a character that is not present in the seq2seq vocabulary, the lemma is now identical to the token
  • Added PUNCT control
  • Fixed a MISC column bug for NER
  • Renamed punct to Z in Bulgarian UPOS
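
The first two fixes in this release can be illustrated with a short sketch. This is not the library's code: `fix_relation_label` and `lemma_or_token` are hypothetical helpers. The label helper assumes a relation has at most one subtype separator, and the lemma helper assumes the fallback applies when any character of the token is missing from the vocabulary.

```python
def fix_relation_label(label):
    """Turn an underscore-separated relation subtype into the colon form."""
    if label == "_":
        # A bare underscore marks an empty field and is left untouched.
        return label
    # Assumption: only the first underscore separates relation and subtype.
    return label.replace("_", ":", 1)


def lemma_or_token(token, vocab, lemmatize):
    """Return the token itself when it has characters outside the vocabulary.

    `vocab` is a set of known characters and `lemmatize` is any callable
    standing in for the seq2seq lemmatizer.
    """
    if any(ch not in vocab for ch in token):
        return token
    return lemmatize(token)


if __name__ == "__main__":
    print(fix_relation_label("cc_preconj"))
    print(lemma_or_token("héllo", set("helo"), str.upper))
```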