Additional JDBC database connection properties can be set (...). You can find the JDBC-specific option and parameter documentation for reading tables via JDBC in Data Source Option (https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html#data-source-option) for the version you use.
Usage

read.jdbc(
  url,
  tableName,
  partitionColumn = NULL,
  lowerBound = NULL,
  upperBound = NULL,
  numPartitions = 0L,
  predicates = list(),
  ...
)
Arguments

url
    JDBC database url of the form jdbc:subprotocol:subname.

tableName
    the name of the table in the external database.

partitionColumn
    the name of a column of numeric, date, or timestamp type that will be used for partitioning.

lowerBound
    the minimum value of partitionColumn, used to decide partition stride.

upperBound
    the maximum value of partitionColumn, used to decide partition stride.

numPartitions
    the number of partitions. This, along with lowerBound (inclusive) and upperBound (exclusive), forms partition strides for generated WHERE clause expressions used to split the column partitionColumn evenly. Defaults to SparkContext.defaultParallelism when unset.

predicates
    a list of conditions in the WHERE clause; each one defines one partition.

...
    additional JDBC database connection named properties.
Details

Only one of partitionColumn or predicates should be set. Partitions of the table will be retrieved in parallel, based either on numPartitions or on the predicates.

Don't create too many partitions in parallel on a large cluster; otherwise Spark might crash your external database systems.
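To make the stride behavior above concrete, the sketch below approximates how a (numPartitions, lowerBound, upperBound) triple could be turned into per-partition WHERE clauses on partitionColumn. This is an illustration only, not the SparkR API: the function name partition_clauses is hypothetical, and Spark's internal clause generation may differ in detail (e.g. how it assigns the remainder when the range does not divide evenly).

```r
# Hypothetical sketch (not part of SparkR): generate one WHERE clause per
# partition from a numeric partition column, a lower/upper bound, and a
# partition count. The first stride also catches NULLs and values below
# lowerBound; the last stride catches values at or above its lower edge.
partition_clauses <- function(col, lower, upper, n) {
  stride <- floor((upper - lower) / n)
  sapply(seq_len(n), function(i) {
    lo <- lower + (i - 1) * stride
    hi <- lower + i * stride
    if (i == 1) {
      sprintf("%s < %d OR %s IS NULL", col, hi, col)
    } else if (i == n) {
      sprintf("%s >= %d", col, lo)
    } else {
      sprintf("%s >= %d AND %s < %d", col, lo, hi, col)
    }
  })
}

partition_clauses("index", 0, 10000, 4)
# e.g. "index < 2500 OR index IS NULL", "index >= 2500 AND index < 5000", ...
```

Note that the clauses cover the whole column, not just [lowerBound, upperBound): the bounds only control stride width, so rows outside the range still land in the first or last partition.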
Note

read.jdbc since 2.0.0
Examples

if (FALSE) { # \dontrun{
sparkR.session()
jdbcUrl <- "jdbc:mysql://localhost:3306/databasename"
df <- read.jdbc(jdbcUrl, "table", predicates = list("field<=123"), user = "username")
df2 <- read.jdbc(jdbcUrl, "table2", partitionColumn = "index", lowerBound = 0,
                 upperBound = 10000, user = "username", password = "password")
} # }