Maps a sequence of terms to their term frequencies using the hashing trick.
New in version 1.2.0.
Parameters
numFeatures: number of features (default: 2^20)
Notes
The terms must be hashable (cannot be dict/set/list…).
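The hashability requirement in the note above can be checked in plain Python before a document is handed to the transformer. The following is a minimal sketch that does not use Spark; `is_hashable` is a hypothetical helper introduced only for illustration.

```python
def is_hashable(term):
    """Return True if Python's built-in hash() accepts the term."""
    try:
        hash(term)
    except TypeError:
        # Mutable containers such as list, set and dict raise TypeError.
        return False
    return True


print(is_hashable("spark"))        # True
print(is_hashable(("a", "b")))     # True
print(is_hashable(["a", "b"]))     # False
print(is_hashable({"a": 1}))       # False
```

Strings and tuples of hashable items pass the check, so a compound term such as a bigram is better represented as a tuple than as a list.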
Examples
>>> htf = HashingTF(100)
>>> doc = "a a b b c d".split(" ")
>>> htf.transform(doc)
SparseVector(100, {...})
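To make the hashing trick concrete, here is a minimal pure-Python sketch of the idea: each term is hashed, the hash is reduced modulo the number of features, and occurrences are counted per bucket. It runs without Spark. The function name `hashed_term_frequencies` is hypothetical, and `zlib.crc32` is an assumed stand-in hash chosen only because it is deterministic across runs; the bucket indices it produces should not be expected to match those of `HashingTF`.

```python
import zlib
from collections import Counter


def hashed_term_frequencies(terms, num_features=1 << 20):
    """Count term occurrences per hash bucket (illustrative only)."""
    counts = Counter()
    for term in terms:
        # Assumption: crc32 is a stand-in hash, not the one Spark uses.
        index = zlib.crc32(term.encode("utf-8")) % num_features
        counts[index] += 1.0
    return dict(counts)


doc = "a a b b c d".split(" ")
tf = hashed_term_frequencies(doc, num_features=100)
print(tf)  # a dict mapping bucket index -> count
```

Because distinct terms can land in the same bucket, the result may contain fewer entries than there are distinct terms; a larger feature count makes such collisions less likely.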
Methods
indexOf(term)
Returns the index of the input term.
setBinary(value)
If True, the term frequency vector will be binary, such that non-zero term counts are set to 1 (default: False).
transform(document)
Transforms the input document (a list of terms) into a term frequency vector, or transforms an RDD of documents into an RDD of term frequency vectors.
Methods Documentation
indexOf(term)
Returns the index of the input term.
New in version 1.2.0.
setBinary(value)
If True, the term frequency vector will be binary, such that non-zero term counts are set to 1 (default: False).
New in version 2.0.0.
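The effect of the binary setting can be illustrated with a small pure-Python sketch that does not use Spark. With `binary=False` each bucket holds a count; with `binary=True` every non-zero bucket is set to 1. The function `hashed_vector` and the use of `zlib.crc32` as the hash are assumptions made for illustration and do not reproduce the indices `HashingTF` would assign.

```python
import zlib


def hashed_vector(terms, num_features=1 << 20, binary=False):
    """Build a sparse {index: value} mapping, as counts or 0/1 flags."""
    vector = {}
    for term in terms:
        # Assumption: crc32 is a stand-in hash, not the one Spark uses.
        index = zlib.crc32(term.encode("utf-8")) % num_features
        if binary:
            vector[index] = 1.0  # presence only
        else:
            vector[index] = vector.get(index, 0.0) + 1.0  # running count
    return vector


doc = "a a b b c d".split(" ")
counts = hashed_vector(doc, num_features=100, binary=False)
flags = hashed_vector(doc, num_features=100, binary=True)
print(counts)
print(flags)
```

Both results cover the same set of indices; only the stored values differ. Binary vectors are a common choice for models that work with presence or absence of a term instead of its count.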
transform(document)
Transforms the input document (a list of terms) into a term frequency vector, or transforms an RDD of documents into an RDD of term frequency vectors.
New in version 1.2.0.