Power Iteration Clustering (PIC) is a scalable graph clustering algorithm. Users can call spark.assignClusters to run the PIC algorithm on a graph given as a SparkDataFrame of edges and return a cluster assignment for each input vertex.
Usage
spark.assignClusters(data, ...)

# S4 method for class 'SparkDataFrame'
spark.assignClusters(
  data,
  k = 2L,
  initMode = c("random", "degree"),
  maxIter = 20L,
  sourceCol = "src",
  destinationCol = "dst",
  weightCol = NULL
)
Arguments
data
a SparkDataFrame.

...
additional argument(s) passed to the method.

k
the number of clusters to create.

initMode
the initialization algorithm; "random" or "degree".

maxIter
the maximum number of iterations.

sourceCol
the name of the input column for source vertex IDs.

destinationCol
the name of the input column for destination vertex IDs.

weightCol
weight column name. If this is not set or NULL, we treat all instance weights as 1.0.
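For reference, the defaults above make a plain spark.assignClusters(df) call equivalent to the fully spelled-out form below (a minimal sketch; df is assumed to be a SparkDataFrame with "src" and "dst" columns, and "random" is the first, hence default, initMode):

clusters <- spark.assignClusters(
  df,
  k = 2L,                    # number of clusters to create
  initMode = "random",       # or "degree"
  maxIter = 20L,             # maximum number of PIC iterations
  sourceCol = "src",         # source vertex ID column
  destinationCol = "dst",    # destination vertex ID column
  weightCol = NULL           # NULL: all edge weights treated as 1.0
)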
Value
A dataset that contains columns of vertex id and the corresponding cluster for the id. Its schema will be: id: integer, cluster: integer.
Note
spark.assignClusters(SparkDataFrame) since 3.0.0
Examples
df <- createDataFrame(list(list(0L, 1L, 1.0), list(0L, 2L, 1.0),
                           list(1L, 2L, 1.0), list(3L, 4L, 1.0),
                           list(4L, 0L, 0.1)),
                      schema = c("src", "dst", "weight"))
clusters <- spark.assignClusters(df, initMode = "degree", weightCol = "weight")
showDF(clusters)
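Since the result is itself a SparkDataFrame, the assignments can also be pulled into a local data.frame for further work (a brief sketch, reusing the clusters object from the example above):

assignments <- collect(clusters)   # local data.frame with columns id, cluster
table(assignments$cluster)         # number of vertices assigned to each cluster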