All Implemented Interfaces: Serializable, org.apache.spark.internal.Logging, Word2VecBase, Params, HasInputCol, HasMaxIter, HasOutputCol, HasSeed, HasStepSize, DefaultParamsWritable, Identifiable, MLWritable
Word2Vec trains a model of Map(String, Vector), i.e. it transforms each word into a code (vector) for further natural language processing or machine learning.
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging: org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
Constructors
Word2Vec()
Methods
copy - Creates a copy of this instance with the same UID and some extra params.
fit - Fits a model to the input data.
inputCol - Param for input column name.
maxIter - Param for maximum number of iterations (>= 0).
maxSentenceLength - Sets the maximum length (in words) of each sentence in the input data.
minCount - The minimum number of times a token must appear to be included in the word2vec model's vocabulary.
numPartitions - Number of partitions for sentences of words.
outputCol - Param for output column name.
stepSize - Param for step size to be used for each iteration of optimization (> 0).
transformSchema - Check transform validity and derive the output schema from the input schema.
uid - An immutable unique ID for the object and its derivatives.
vectorSize - The dimension of the code (vector) that each word is transformed into.
windowSize - The window size (context words from [-window, window]).
Methods inherited from interface org.apache.spark.internal.Logging: initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.util.MLWritable: save
Methods inherited from interface org.apache.spark.ml.param.Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
public Word2Vec()
The dimension of the code (vector) that each word is transformed into. Default: 100
Specified by: vectorSize in interface Word2VecBase
The window size (context words from [-window, window]). Default: 5
Specified by: windowSize in interface Word2VecBase
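To make the [-window, window] range concrete, here is a minimal plain-Java sketch that collects the context words around one position in a tokenized sentence. The class `ContextWindow` and method `contextOf` are hypothetical names used only for illustration; this is not Spark's internal code, and it ignores any sampling a real trainer may apply on top of the window.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: illustrates the [-window, window] range.
final class ContextWindow {

    /** Returns the words at offsets -window..window around center, excluding the center word. */
    static List<String> contextOf(List<String> sentence, int center, int window) {
        if (window < 0 || center < 0 || center >= sentence.size()) {
            throw new IllegalArgumentException("center must be in range and window >= 0");
        }
        List<String> context = new ArrayList<>();
        // Clamp the window to the sentence boundaries.
        int from = Math.max(0, center - window);
        int to = Math.min(sentence.size() - 1, center + window);
        for (int i = from; i <= to; i++) {
            if (i != center) {
                context.add(sentence.get(i));
            }
        }
        return context;
    }

    public static void main(String[] args) {
        List<String> sentence = List.of("the", "quick", "brown", "fox", "jumps");
        // Context of "brown" (index 2) with window = 1.
        System.out.println(contextOf(sentence, 2, 1)); // prints [quick, fox]
    }
}
```

With the default window of 5, each word can be paired with up to ten neighbours, fewer near the start or end of a sentence.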
Number of partitions for sentences of words. Default: 1
Specified by: numPartitions in interface Word2VecBase
The minimum number of times a token must appear to be included in the word2vec model's vocabulary. Default: 5
Specified by: minCount in interface Word2VecBase
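The effect of minCount can be sketched in plain Java: count every token, then drop those seen fewer than minCount times. `VocabularyFilter` and `buildVocabulary` are hypothetical names for illustration only, and the sketch assumes the input is already tokenized; it is not how Spark builds its vocabulary internally.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Spark: illustrates the minCount threshold.
final class VocabularyFilter {

    /** Counts tokens across all sentences and keeps those with count >= minCount. */
    static Map<String, Integer> buildVocabulary(List<List<String>> sentences, int minCount) {
        // TreeMap keeps the result in a stable, sorted order.
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> sentence : sentences) {
            for (String token : sentence) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        // Remove every token that appears fewer than minCount times.
        counts.values().removeIf(count -> count < minCount);
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = List.of(
                List.of("a", "b", "a"),
                List.of("a", "c", "b"));
        System.out.println(buildVocabulary(sentences, 2)); // prints {a=3, b=2}
    }
}
```

On small datasets the default of 5 can filter out most tokens, so a lower value may be needed when experimenting.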
Sets the maximum length (in words) of each sentence in the input data. Any sentence longer than this threshold will be divided into chunks of up to maxSentenceLength words. Default: 1000
Specified by: maxSentenceLength in interface Word2VecBase
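The chunking behaviour described for maxSentenceLength can be sketched in plain Java as follows. `SentenceChunker` and `chunk` are hypothetical names, and the sketch assumes consecutive, non-overlapping chunks; it illustrates the documented behaviour rather than Spark's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: illustrates splitting by maxSentenceLength.
final class SentenceChunker {

    /** Splits one tokenized sentence into consecutive chunks of at most maxSentenceLength words. */
    static List<List<String>> chunk(List<String> sentence, int maxSentenceLength) {
        if (maxSentenceLength <= 0) {
            throw new IllegalArgumentException("maxSentenceLength must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < sentence.size(); start += maxSentenceLength) {
            int end = Math.min(start + maxSentenceLength, sentence.size());
            // Copy the slice so each chunk is independent of the input list.
            chunks.add(new ArrayList<>(sentence.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> sentence = List.of("a", "b", "c", "d", "e");
        System.out.println(chunk(sentence, 2)); // prints [[a, b], [c, d], [e]]
    }
}
```

A sentence at or below the threshold comes back as a single chunk, so the setting only matters for unusually long inputs such as whole documents passed as one sentence.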
Param for random seed.
Specified by: seed in interface HasSeed
Param for step size to be used for each iteration of optimization (> 0).
Specified by: stepSize in interface HasStepSize
Param for maximum number of iterations (>= 0).
Specified by: maxIter in interface HasMaxIter
Param for output column name.
Specified by: outputCol in interface HasOutputCol
Param for input column name.
Specified by: inputCol in interface HasInputCol
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable
Fits a model to the input data.
Specified by: fit in class Estimator<Word2VecModel>
Parameters: dataset - (undocumented)
Check transform validity and derive the output schema from the input schema.
We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().
A typical implementation should first verify the schema change and parameter validity, including complex parameter interaction checks.
Specified by: transformSchema in class PipelineStage
Parameters: schema - (undocumented)
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
Specified by: copy in interface Params
Specified by: copy in class Estimator<Word2VecModel>
Parameters: extra - (undocumented)