sentencepiece
cran · v0.2.5
Text Tokenization using Byte Pair Encoding and Unigram Modelling
Unsupervised text tokenizer that performs byte pair encoding and unigram modelling. Wraps the 'sentencepiece' library <https://github.com/google/sentencepiece>, which provides a language-independent tokenizer to split text into words and smaller subword units. The techniques are explained in the
License: MPL-2.0 · weak copyleft · 0 versions · 1 maintainer · 2 deps · 84 weekly downloads
bnosac/sentencepiece
47 / 100
Health
sentencepiece@0.2.5 is safe to use (health: 47/100)
Health breakdown (0 – 100)
maintenance: 20/25
popularity: 0/20
security: 25/25
maturity: 0/15
community: 2/15
Vulnerabilities: 0 (none known)
API access
Get this data programmatically — free, no authentication.
curl https://depscope.dev/api/check/cran/sentencepiece
First published · 2026-02-16 01:49:10
Last updated · 2026-02-09T13:40:02+00:00
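
The curl call under "API access" can be wrapped in a small shell helper for reuse in scripts. This is a minimal sketch, not an official client. The function names `depscope_url` and `depscope_check` are hypothetical, and the `/api/check/<ecosystem>/<package>` pattern is an assumption inferred from the single URL shown on this page. The response format is not documented here, so the sketch returns the body unparsed.

```shell
#!/bin/sh
# Sketch only: the URL pattern is inferred from the one example URL on this
# page, and both function names are hypothetical.

# Build the check URL for an ecosystem (e.g. cran) and a package name.
depscope_url() {
  printf 'https://depscope.dev/api/check/%s/%s\n' "$1" "$2"
}

# Fetch the report for a package and write the response body to stdout.
# -f: exit non-zero on HTTP errors; -sS: no progress bar, but show errors.
depscope_check() {
  curl -fsS "$(depscope_url "$1" "$2")"
}

# Usage (needs network access):
#   depscope_check cran sentencepiece
```

Keeping URL construction separate from the network call lets the helper be checked offline, and `-f` makes a failed lookup visible through the exit status rather than through an error page on stdout.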