Tags · benbrandt/text-splitter

v0.27.0

May 28, 2025
bada552

v0.26.0

Bump check in ci

May 9, 2025
5f04d63

v0.25.1

chore: attempt to lower requirement on memchr

Mar 25, 2025
0e8c166

v0.25.0

ci: remove caches

Mar 22, 2025
2fe70b5

v0.24.2

prep v0.24.2

Mar 19, 2025
11f12e4

v0.24.1

chore: update deps

Feb 24, 2025
a3ee673

v0.24.0

deps update

Feb 15, 2025
ae97576

v0.23.0

chore: prep v0.23

Feb 9, 2025
0a22ee0

v0.22.0

Jan 17, 2025
217fb50

v0.21.0

feat!: special tokens encoded by default (#512)

* feat!: special tokens encoded by default

Special tokens are now also encoded by both the Huggingface and Tiktoken tokenizers. This is closer to the default behavior on the Python side and ensures that any tokens a model adds at the beginning or end of a sequence are accounted for as well.

* test: fix python tests

* docs: clarify which tokenizers are affected
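
To illustrate why this change can alter chunk sizing, here is a minimal sketch using a toy whitespace tokenizer. It is not the Huggingface or Tiktoken API and not text-splitter's implementation; `count_tokens`, `fits`, and the `<bos>`/`<eos>` markers are hypothetical names chosen for illustration only.

```python
# Toy stand-in for a tokenizer. NOT the Huggingface or Tiktoken API;
# all names and the "<bos>"/"<eos>" markers are hypothetical.
def count_tokens(text: str, add_special_tokens: bool = True) -> int:
    """Count whitespace-separated tokens, optionally wrapping the
    sequence in begin/end markers the way some models do."""
    tokens = text.split()
    if add_special_tokens:
        tokens = ["<bos>", *tokens, "<eos>"]
    return len(tokens)


def fits(text: str, capacity: int, add_special_tokens: bool = True) -> bool:
    """Return True if the text's token count is within the chunk capacity."""
    return count_tokens(text, add_special_tokens) <= capacity


chunk = "one two three four"
without_special = count_tokens(chunk, add_special_tokens=False)
with_special = count_tokens(chunk, add_special_tokens=True)
print(without_special, with_special)
```

In this sketch, a chunk that fits a capacity of 5 when special tokens are ignored no longer fits once they are counted, which is why a change like this may produce slightly smaller chunks for the same capacity.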

Jan 16, 2025
9da8748
