Neural tools integration by cgr71ii · Pull Request #235 · bitextor/bitextor · GitHub

Neural tools integration #235


Merged
merged 54 commits into from
Jun 1, 2022

Collaborator
@cgr71ii cgr71ii commented May 31, 2022

Changes:

  • Two neural tools have been integrated: Vecalign and Neural Document Aligner (NDA).
    • Both tools have been adapted to work with features they did not previously support: headers, paragraph identification and deferred crawling.
  • Third-party modules have been moved to a dedicated directory instead of living in the root directory.
  • The upload steps have been removed from the CI tests.
  • Tests have been optimized and updated.
  • Documentation has been updated.

Fixes:

  • Hunalign did not work correctly with N × N alignments from the docalign step when the trg index had previously been matched with another src index.
  • Execution failed when Hunalign did not return any result.
  • Dictionary docalign did not process all possible matches, which could lead to non-deterministic results with an older version of the docalign utility script.
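
As an illustration of the first and third fixes, here is a minimal sketch of how N × N document matches could be reduced to 1-1 pairs deterministically. It is not taken from this PR's diff: the function name `one_to_one`, the `(src, trg, score)` tuple layout and the greedy strategy are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def one_to_one(matches: Iterable[Tuple[int, int, float]]) -> List[Tuple[int, int]]:
    """Greedily keep at most one pair per src index and per trg index.

    Hypothetical helper: each match is (src_index, trg_index, score).
    """
    used_src, used_trg = set(), set()
    kept = []
    # Sort by descending score, then by indices, so that ties are always
    # resolved the same way regardless of the input order.
    for src, trg, _score in sorted(matches, key=lambda m: (-m[2], m[0], m[1])):
        # Skip the pair if either side was already matched earlier.
        if src in used_src or trg in used_trg:
            continue
        used_src.add(src)
        used_trg.add(trg)
        kept.append((src, trg))
    return kept


candidates = [(1, 1, 0.9), (2, 1, 0.8), (2, 2, 0.7), (3, 2, 0.7)]
print(one_to_one(candidates))
```

Because every candidate is considered and the sort key is total, the same set of candidates yields the same pairs whatever order they arrive in.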

cgr71ii added 30 commits May 4, 2022 16:41
Documentation updated and installation support
Neural tests
Update neural dependencies installation
Add header, remove N-N alignments and only allow vecalign if NDA and vice versa
Vecalign optional installation removed due to new setup.py installation
@cgr71ii cgr71ii marked this pull request as ready for review June 1, 2022 07:38
@cgr71ii cgr71ii requested a review from lpla June 1, 2022 07:38
@cgr71ii cgr71ii merged commit fdc7e08 into master Jun 1, 2022
@lpla lpla deleted the neural branch January 25, 2023 15:39