Showing content from https://api-docs.databricks.com/python/pyspark/latest/api/pyspark.ml.Pipeline.html below:

Pipeline — PySpark master documentation

Pipeline

class pyspark.ml.Pipeline(*, stages: Optional[List[PipelineStage]] = None)

A simple pipeline, which acts as an estimator. A Pipeline consists of a sequence of stages, each of which is either an Estimator or a Transformer. When Pipeline.fit() is called, the stages are executed in order. If a stage is an Estimator, its Estimator.fit() method will be called on the input dataset to fit a model. Then the model, which is a transformer, will be used to transform the dataset as the input to the next stage. If a stage is a Transformer, its Transformer.transform() method will be called to produce the dataset for the next stage. The fitted model from a Pipeline is a PipelineModel, which consists of fitted models and transformers, corresponding to the pipeline stages. If stages is an empty list, the pipeline acts as an identity transformer.
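
The control flow described above can be sketched in plain Python. This is a toy illustration that needs no Spark installation, not PySpark's implementation: `AddOne`, `MeanCenter`, `MeanCenterModel`, `fit_pipeline`, and `transform_all` are hypothetical names, and a list of floats stands in for a DataFrame.

```python
from typing import List


class AddOne:
    """Toy transformer (hypothetical): adds 1 to every value."""

    def transform(self, data: List[float]) -> List[float]:
        return [x + 1 for x in data]


class MeanCenterModel:
    """Toy fitted model, itself a transformer: subtracts a learned mean."""

    def __init__(self, mean: float) -> None:
        self.mean = mean

    def transform(self, data: List[float]) -> List[float]:
        return [x - self.mean for x in data]


class MeanCenter:
    """Toy estimator (hypothetical): fit() learns the mean and returns a model."""

    def fit(self, data: List[float]) -> MeanCenterModel:
        return MeanCenterModel(sum(data) / len(data))


def fit_pipeline(stages: list, data: List[float]) -> list:
    """Run the stages in order and return the fitted transformers."""
    fitted = []
    for stage in stages:
        if hasattr(stage, "fit"):
            # Estimator: fit on the current data to obtain a model.
            model = stage.fit(data)
        else:
            # Transformer: used as-is.
            model = stage
        fitted.append(model)
        # The output of this stage is the input of the next one.
        # Simplification: this sketch transforms after every stage.
        data = model.transform(data)
    return fitted


def transform_all(fitted: list, data: List[float]) -> List[float]:
    """Apply the fitted transformers in order (the toy 'pipeline model')."""
    for transformer in fitted:
        data = transformer.transform(data)
    return data


fitted_stages = fit_pipeline([AddOne(), MeanCenter()], [1.0, 2.0, 3.0])
result = transform_all(fitted_stages, [1.0, 2.0, 3.0])
```

Here `MeanCenter` is fitted on the output of `AddOne`, so the learned mean is 3.0 rather than 2.0. With an empty stage list, `transform_all` returns its input unchanged, which mirrors the identity behavior described above.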

Methods

clear(param)

Clears a param from the param map if it has been explicitly set.

copy([extra])

Creates a copy of this instance.

explainParam(param)

Explains a single param and returns its name, doc, and optional default value and user-supplied value in a string.

explainParams()

Returns the documentation of all params with their optional default values and user-supplied values.

extractParamMap([extra])

Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values < user-supplied values < extra.

fit(dataset[, params])

Fits a model to the input dataset with optional parameters.

fitMultiple(dataset, paramMaps)

Fits a model to the input dataset for each param map in paramMaps.

getOrDefault(param)

Gets the value of a param in the user-supplied param map or its default value.

getParam(paramName)

Gets a param by its name.

getStages()

Get pipeline stages.

hasDefault(param)

Checks whether a param has a default value.

hasParam(paramName)

Tests whether this instance contains a param with a given (string) name.

isDefined(param)

Checks whether a param is explicitly set by user or has a default value.

isSet(param)

Checks whether a param is explicitly set by user.

load(path)

Reads an ML instance from the input path, a shortcut of read().load(path).

read()

Returns an MLReader instance for this class.

save(path)

Save this ML instance to the given path, a shortcut of write().save(path).

set(param, value)

Sets a parameter in the embedded param map.

setParams(self, *[, stages])

Sets params for Pipeline.

setStages(value)

Set pipeline stages.

write()

Returns an MLWriter instance for this ML instance.

Attributes

params

Returns all params ordered by name.

stages

A list of pipeline stages.

Methods Documentation

clear(param: pyspark.ml.param.Param) → None: Clears a param from the param map if it has been explicitly set.

copy(extra: Optional[ParamMap] = None) → Pipeline

Creates a copy of this instance.

Parameters

extra (dict, optional): extra parameters

Returns

Pipeline: new instance

explainParam(param: Union[str, pyspark.ml.param.Param]) → str: Explains a single param and returns its name, doc, and optional default value and user-supplied value in a string.

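explainParams() → str: Returns the documentation of all params with their optional default values and user-supplied values.

extractParamMap(extra: Optional[ParamMap] = None) → ParamMap

Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values < user-supplied values < extra.

Parameters

extra (dict, optional): extra param values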

Returns

dict: merged param map
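
The precedence rule for extractParamMap (default param values < user-supplied values < extra) behaves like successive dictionary updates. A minimal plain-Python sketch, assuming a hypothetical helper `merge_param_maps` and made-up param names; in PySpark the keys of a param map are Param objects rather than strings.

```python
from typing import Any, Dict, Optional


def merge_param_maps(
    defaults: Dict[str, Any],
    user_supplied: Dict[str, Any],
    extra: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: later sources win on conflicts."""
    merged = dict(defaults)        # lowest precedence
    merged.update(user_supplied)   # overrides defaults
    if extra is not None:
        merged.update(extra)       # highest precedence
    return merged


# Made-up param names, for illustration only.
defaults = {"maxIter": 100, "regParam": 0.0}
user_supplied = {"maxIter": 10}
merged = merge_param_maps(defaults, user_supplied, extra={"maxIter": 5})
```

In this sketch `maxIter` resolves to the value from `extra`, while `regParam` falls back to its default because nothing overrides it.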

fit(dataset: pyspark.sql.dataframe.DataFrame, params: Union[ParamMap, List[ParamMap], Tuple[ParamMap], None] = None) → Union[M, List[M]]

Fits a model to the input dataset with optional parameters.

Parameters

dataset (pyspark.sql.DataFrame): input dataset.
params (dict or list or tuple, optional): an optional param map that overrides embedded params. If a list/tuple of param maps is given, this calls fit on each param map and returns a list of models.

Returns

Transformer or a list of Transformer: fitted model(s)

fitMultiple(dataset: pyspark.sql.dataframe.DataFrame, paramMaps: Sequence[ParamMap]) → Iterator[Tuple[int, M]]

Fits a model to the input dataset for each param map in paramMaps.

Parameters

dataset (pyspark.sql.DataFrame): input dataset.
paramMaps (collections.abc.Sequence): A Sequence of param maps.

Returns

_FitMultipleIterator: A thread-safe iterable which contains one model for each param map. Each call to next(modelIterator) will return (index, model) where model was fit using paramMaps[index]. index values may not be sequential.
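
Because the index values may arrive out of order, callers usually place each model by its index rather than appending in arrival order. The sketch below shows that pattern in plain Python. `toy_fit_multiple` is a hypothetical stand-in for fitMultiple that deliberately yields results in reverse order, and the "models" are just strings.

```python
from typing import Any, Dict, Iterator, List, Optional, Sequence, Tuple


def toy_fit_multiple(
    param_maps: Sequence[Dict[str, Any]],
) -> Iterator[Tuple[int, str]]:
    """Hypothetical stand-in: yields (index, model) pairs out of order."""
    for i in reversed(range(len(param_maps))):
        yield i, f"model(regParam={param_maps[i]['regParam']})"


# Made-up param maps, for illustration only.
param_maps = [{"regParam": 0.0}, {"regParam": 0.1}, {"regParam": 1.0}]

# Pre-size the result list and fill each slot by index, so that
# models[i] always corresponds to param_maps[i].
models: List[Optional[str]] = [None] * len(param_maps)
for index, model in toy_fit_multiple(param_maps):
    models[index] = model
```

After the loop, `models` lines up with `param_maps` regardless of the order in which the iterator produced the pairs.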

getOrDefault(param: Union[str, pyspark.ml.param.Param[T]]) → Union[Any, T]: Gets the value of a param in the user-supplied param map or its default value. Raises an error if neither is set.

getParam(paramName: str) → pyspark.ml.param.Param: Gets a param by its name.

getStages() → List[PipelineStage]: Get pipeline stages.

hasDefault(param: Union[str, pyspark.ml.param.Param[Any]]) → bool: Checks whether a param has a default value.

hasParam(paramName: str) → bool: Tests whether this instance contains a param with a given (string) name.

isDefined(param: Union[str, pyspark.ml.param.Param[Any]]) → bool: Checks whether a param is explicitly set by user or has a default value.

isSet(param: Union[str, pyspark.ml.param.Param[Any]]) → bool: Checks whether a param is explicitly set by user.
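
The relationship between isSet, hasDefault, isDefined, getOrDefault, and clear can be illustrated with a small stand-in class. This is a plain-Python sketch, not PySpark's implementation: `ToyParams` and the param name `maxIter` are hypothetical, and raising `KeyError` is a choice made for this sketch only.

```python
from typing import Any, Dict


class ToyParams:
    """Hypothetical stand-in that keeps defaults and user-set values apart."""

    def __init__(self, defaults: Dict[str, Any]) -> None:
        self._defaults = dict(defaults)
        self._user: Dict[str, Any] = {}

    def set(self, name: str, value: Any) -> None:
        self._user[name] = value

    def clear(self, name: str) -> None:
        # Removes only the explicitly set value; the default stays.
        self._user.pop(name, None)

    def isSet(self, name: str) -> bool:
        return name in self._user

    def hasDefault(self, name: str) -> bool:
        return name in self._defaults

    def isDefined(self, name: str) -> bool:
        return self.isSet(name) or self.hasDefault(name)

    def getOrDefault(self, name: str) -> Any:
        if name in self._user:
            return self._user[name]
        if name in self._defaults:
            return self._defaults[name]
        raise KeyError(name)  # neither set nor defaulted


def snapshot(p: ToyParams, name: str) -> tuple:
    """Collect (isSet, hasDefault, isDefined, getOrDefault) for one param."""
    return (p.isSet(name), p.hasDefault(name), p.isDefined(name), p.getOrDefault(name))


params = ToyParams({"maxIter": 100})
before = snapshot(params, "maxIter")
params.set("maxIter", 10)
after_set = snapshot(params, "maxIter")
params.clear("maxIter")
after_clear = snapshot(params, "maxIter")
```

In this sketch, clearing a param removes only the user-supplied value, so the param remains defined through its default.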

classmethod load(path: str) → RL: Reads an ML instance from the input path, a shortcut of read().load(path).

classmethod read() → pyspark.ml.pipeline.PipelineReader: Returns an MLReader instance for this class.

save(path: str) → None: Save this ML instance to the given path, a shortcut of write().save(path).

set(param: pyspark.ml.param.Param, value: Any) → None: Sets a parameter in the embedded param map.

setParams(self, *, stages=None): Sets params for Pipeline.

setStages(value: List[PipelineStage]) → Pipeline

Set pipeline stages.

Parameters

value (list): list of pyspark.ml.Transformer or pyspark.ml.Estimator

Returns

Pipeline: the pipeline instance

write() → pyspark.ml.util.MLWriter: Returns an MLWriter instance for this ML instance.

Attributes Documentation

params: Returns all params ordered by name. The default implementation uses dir() to get all attributes of type Param.

stages: pyspark.ml.param.Param[List[PipelineStage]] = Param(parent='undefined', name='stages', doc='a list of pipeline stages')
