Transform data facts from rows into additional columns using SQL and controlTable.
blocks_to_rowrecs_q( tallTable, keyColumns, controlTable, my_db, ..., columnsToCopy = NULL, tempNameGenerator = mk_tmp_name_source("mvtcq"), strict = FALSE, controlTableKeys = colnames(controlTable)[[1]], checkNames = TRUE, checkKeys = FALSE, showQuery = FALSE, defaultValue = NULL, dropDups = TRUE, temporary = FALSE, resultName = NULL, incoming_qualifiers = NULL, outgoing_qualifiers = NULL, executeQuery = TRUE )Arguments tallTable
name of table containing data to be mapped (db/Spark data)
keyColumnscharacter list of column defining row groups
controlTabletable specifying mapping (local data frame)
my_dbdb handle
...force later arguments to be by name.
columnsToCopycharacter list of column names to copy
tempNameGeneratora tempNameGenerator from cdata::mk_tmp_name_source()
strictlogical, if TRUE check control table name forms
controlTableKeyscharacter, which column names of the control table are considered to be keys.
checkNameslogical, if TRUE check names
checkKeyslogical, if TRUE check keying of tallTable
showQueryif TRUE print query
defaultValueif not NULL literal to use for non-match values.
dropDupslogical if TRUE suppress duplicate columns (duplicate determined by name, not content).
temporarylogical, if TRUE make result temporary.
resultNamecharacter, name for result table.
incoming_qualifiersoptional named ordered vector of strings carrying additional db hierarchy terms, such as schema.
outgoing_qualifiersoptional named ordered vector of strings carrying additional db hierarchy terms, such as schema.
executeQuerylogical, if TRUE execute the query and return result.
Valuewide table built by mapping key-grouped tallTable rows to one row per group
DetailsThis is using the theory of "fluid data"n (https://github.com/WinVector/cdata), which includes the principle that each data cell has coordinates independent of the storage details and storage detail dependent coordinates (usually row-id, column-id, and group-id) can be re-derived at will (the other principle is that there may not be "one true preferred data shape" and many re-shapings of data may be needed to match data to different algorithms and methods).
The controlTable defines the names of each data element in the two notations: the notation of the tall table (which is row oriented) and the notation of the wide table (which is column oriented). controlTable[ , 1] (the group label) cross colnames(controlTable) (the column labels) are names of data cells in the long form. controlTable[ , 2:ncol(controlTable)] (column labels) are names of data cells in the wide form. To get behavior similar to tidyr::gather/spread one builds the control table by running an appropriate query over the data.
Some discussion and examples can be found here: https://winvector.github.io/FluidData/FluidData.html and here https://github.com/WinVector/cdata.
See alsobuild_pivot_control_q
, blocks_to_rowrecs
if (requireNamespace("DBI", quietly
= TRUE) && requireNamespace("RSQLite", quietly
= TRUE)) { my_db <- DBI::dbConnect(RSQLite::SQLite(),
":memory:") # pivot example d <- data.frame(meas
= c('AUC',
'R2'), val
= c(0.6,
0.2)) rquery::rq_copy_to(my_db,
'd',
d, temporary
= TRUE) cT <- build_pivot_control_q('d', columnToTakeKeysFrom
= 'meas', columnToTakeValuesFrom
= 'val', my_db
= my_db) tab <- blocks_to_rowrecs_q('d', keyColumns
= NULL, controlTable
= cT, my_db
= my_db) qlook(my_db,
tab) DBI::dbDisconnect(my_db) }#> table `mvtcq_04050014603667178285_0000000000` SQLiteConnection #> nrow: 1 #> 'data.frame': 1 obs. of 2 variables: #> $ AUC: num 0.6 #> $ R2 : num 0.2
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4