Showing content from https://spark.apache.org/docs/latest/api/R/reference/repartitionByRange.html below:

Repartition by range - repartitionByRange • SparkR

The following options for repartition by range are possible:

1. Return a new SparkDataFrame range partitioned by the given columns into numPartitions.
2. Return a new SparkDataFrame range partitioned by the given column(s), using spark.sql.shuffle.partitions as number of partitions.

At least one partition-by expression must be specified. When no explicit sort order is specified, "ascending nulls first" is assumed.

Usage

repartitionByRange(x, ...)

# S4 method for class 'SparkDataFrame'
repartitionByRange(x, numPartitions = NULL, col = NULL, ...)

Arguments

x: a SparkDataFrame.
...: additional column(s) to be used in the range partitioning.
numPartitions: the number of partitions to use.
col: the column by which the range partitioning will be performed.

Details

For performance reasons, this method uses sampling to estimate the ranges. Hence, the output may not be consistent across runs, since sampling can return different values. The sample size can be controlled by the config spark.sql.execution.rangeExchange.sampleSizePerPartition.
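The effect of sampling can be illustrated with a small base R sketch. This is only a conceptual model and not Spark's actual implementation: the helper names estimate_bounds and assign_partition, the toy key vector, and the sample size are all assumptions made for illustration. It estimates range boundaries from a random sample of the keys and then assigns every key to a partition, so two runs with different samples may place the same key in different partitions.

```r
# Conceptual sketch in base R; this is NOT how Spark implements range partitioning.
# estimate_bounds() and assign_partition() are hypothetical helper names.

# Estimate (num_partitions - 1) split points from a random sample of the keys.
estimate_bounds <- function(keys, num_partitions, sample_size) {
  sampled <- sample(keys, size = min(sample_size, length(keys)))
  probs <- seq_len(num_partitions - 1) / num_partitions
  unname(quantile(sampled, probs = probs, type = 1))
}

# Map each key to a partition index in 1..num_partitions.
# Keys less than or equal to the first bound go to partition 1, and so on.
assign_partition <- function(keys, bounds) {
  findInterval(keys, bounds, left.open = TRUE) + 1L
}

keys <- c(42, 7, 19, 88, 3, 61, 25, 94, 50, 12)

# Two different random samples of the same data.
set.seed(1)
bounds_a <- estimate_bounds(keys, num_partitions = 3, sample_size = 5)
set.seed(2)
bounds_b <- estimate_bounds(keys, num_partitions = 3, sample_size = 5)

# The estimated boundaries depend on which rows were sampled, so the
# resulting partition assignment can differ between runs.
print(bounds_a)
print(bounds_b)
print(assign_partition(keys, bounds_a))
print(assign_partition(keys, bounds_b))
```

In this sketch, a larger sample_size makes the estimated boundaries more stable at the cost of examining more rows, which is loosely the trade-off that the sample size config above exposes.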

Note

repartitionByRange since 2.4.0

See also

repartition, coalesce

Other SparkDataFrame functions: SparkDataFrame-class, agg(), alias(), arrange(), as.data.frame(), attach,SparkDataFrame-method, broadcast(), cache(), checkpoint(), coalesce(), collect(), colnames(), coltypes(), createOrReplaceTempView(), crossJoin(), cube(), dapplyCollect(), dapply(), describe(), dim(), distinct(), dropDuplicates(), dropna(), drop(), dtypes(), exceptAll(), except(), explain(), filter(), first(), gapplyCollect(), gapply(), getNumPartitions(), group_by(), head(), hint(), histogram(), insertInto(), intersectAll(), intersect(), isLocal(), isStreaming(), join(), limit(), localCheckpoint(), merge(), mutate(), ncol(), nrow(), persist(), printSchema(), randomSplit(), rbind(), rename(), repartition(), rollup(), sample(), saveAsTable(), schema(), selectExpr(), select(), showDF(), show(), storageLevel(), str(), subset(), summary(), take(), toJSON(), unionAll(), unionByName(), union(), unpersist(), unpivot(), withColumn(), withWatermark(), with(), write.df(), write.jdbc(), write.json(), write.orc(), write.parquet(), write.stream(), write.text()

Examples

if (FALSE) { # \dontrun{
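library(SparkR)
sparkR.session()
path <- "path/to/file.json"
df <- read.json(path)
# Range partition by col1 and col2; the number of partitions comes from spark.sql.shuffle.partitions
newDF <- repartitionByRange(df, col = df$col1, df$col2)
# Range partition by col1 and col2 into 3 partitions
newDF <- repartitionByRange(df, 3L, col = df$col1, df$col2)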
} # }
