Tokenize text into morphemes. The morphemepiece algorithm uses a lookup table to determine the morpheme breakdown of words, and falls back on a modified wordpiece tokenization algorithm for words not found in the lookup table.
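The lookup-then-fallback idea described above can be sketched in a few lines. This is a minimal conceptual illustration in Python, not the package's R API or its actual implementation: the `LOOKUP` table, the `VOCAB` set, the `##` piece markers, and the function names `wordpiece` and `tokenize` are all hypothetical toy stand-ins, and the fallback shown is a plain greedy longest-match wordpiece rather than the modified variant the package uses.

```python
from typing import Dict, List, Set

# Hypothetical toy data for illustration only; the real package ships its own
# lookup table and vocabulary (via morphemepiece.data).
LOOKUP: Dict[str, List[str]] = {
    "unhappiness": ["un##", "happy", "##ness"],
    "walked": ["walk", "##ed"],
}
VOCAB: Set[str] = {"walk", "talk", "jump", "##ed", "##s", "##ing"}
UNK = "[UNK]"


def wordpiece(word: str, vocab: Set[str], unk: str = UNK) -> List[str]:
    """Greedy longest-match-first split of a single word (plain wordpiece)."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # mark word-internal pieces
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position, so the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def tokenize(text: str,
             lookup: Dict[str, List[str]] = LOOKUP,
             vocab: Set[str] = VOCAB) -> List[str]:
    """Use the lookup table when a word is present, else fall back to wordpiece."""
    tokens: List[str] = []
    for word in text.lower().split():
        if word in lookup:
            tokens.extend(lookup[word])
        else:
            tokens.extend(wordpiece(word, vocab))
    return tokens
```

In this sketch, a word such as "walked" is resolved directly from the table, while a word absent from the table, such as "jumping", is split by the fallback into pieces found in the vocabulary.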
Version: 1.2.3
Imports: dlr (≥ 1.0.0), fastmatch, magrittr, memoise (≥ 2.0.0), morphemepiece.data, piecemaker (≥ 1.0.0), purrr (≥ 0.3.4), readr, rlang, stringr (≥ 1.4.0)
Suggests: dplyr, fs, ggplot2, here, knitr, remotes, rmarkdown, testthat (≥ 3.0.0), utils
Published: 2022-04-16
DOI: 10.32614/CRAN.package.morphemepiece
Author: Jonathan Bratt [aut, cre], Jon Harmon [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jonathan Bratt <jonathan.bratt at macmillan.com>
BugReports: https://github.com/macmillancontentscience/morphemepiece/issues
License: Apache License (≥ 2)
URL: https://github.com/macmillancontentscience/morphemepiece
NeedsCompilation: no
Materials: README, NEWS
CRAN checks: morphemepiece results