Sorting of an rtables table is done at a path: a sort operation occurs at a particular location within the table, and the direct children of the element at that path are reordered. This happens whether those children are subtables themselves or individual rows. Sorting is done via the sort_at_path() function, which accepts both a (row) path and a scoring function. See the pathing vignette for details about paths.
A score function accepts a subtree or TableRow and returns a single orderable (typically numeric) value. Within the subtable currently being sorted, the children are then reordered by the value of the score function. Importantly, "content" (ContentRow) and "values" (DataRow) need to be treated differently in the scoring function because they are retrieved differently: the content of a subtable is retrieved via the content_table() accessor.
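The score-then-reorder idea can be sketched in plain R, without rtables. This is only an analogy under stated assumptions: the `children` list and the `score` function are hypothetical stand-ins for the children of a subtable and a score function, and the sketch does not reproduce how sort_at_path() is implemented.

```r
# Hypothetical stand-in for the children of a subtable:
# each child holds some per-column counts.
children <- list(
  A = c(15, 12, 14),
  B = c(16, 8, 13),
  C = c(13, 15, 10)
)

# A score function maps one child to a single orderable value.
score <- function(child) sum(child)

# Score every child, then reorder the children by score (largest first).
scores <- vapply(children, score, numeric(1))
reordered <- children[order(scores, decreasing = TRUE)]

names(reordered)
# [1] "A" "C" "B"
```

Only the order of the children changes; their contents are untouched.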
The cont_n_allcols() scoring function provided by rtables works by scoring subtables by the sum of the first elements in the first row of the subtable's content table. Note that this function fails if the child being scored does not have a content table (i.e., if summarize_row_groups() was not used at the corresponding point in the layout). We can see this in its definition, below:
cont_n_allcols
# function (tt)
# {
#     ctab <- content_table(tt)
#     if (NROW(ctab) == 0) {
#         stop("cont_n_allcols score function used at subtable [",
#             obj_name(tt), "] that has no content table.")
#     }
#     sum(sapply(row_values(tree_children(ctab)[[1]]), function(cv) cv[1]))
# }
# <bytecode: 0x561500f66540>
# <environment: namespace:rtables>
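To make the definition concrete, the sketch below recomputes the score by hand for one subtable and compares it with cont_n_allcols(). The layout is a hypothetical one built from the DM dataset for illustration; it is not the `pruned` table used in this section, and the use of tt_at_path() to extract the subtable is a choice made for this sketch.

```r
library(rtables)

# Hypothetical layout for illustration only.
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("RACE") %>%
  summarize_row_groups() %>%
  analyze("AGE", afun = mean, format = "xx.x")
tbl <- build_table(lyt, DM)

# Extract the ASIAN subtable, then the first row of its content table.
asian <- tt_at_path(tbl, c("RACE", "ASIAN"))
first_content_row <- tree_children(content_table(asian))[[1]]

# Each cell holds c(count, percentage); keep the count and sum over columns.
manual_score <- sum(sapply(row_values(first_content_row), function(cv) cv[1]))

manual_score
cont_n_allcols(asian)
```

Both calls should return the same value, which here is the total number of ASIAN observations across the three arms.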
A fundamental difference between pruning and sorting is that sorting occurs at particular places in the table, as defined by a path.
For example, we can sort the strata values (ContentRow) by observation counts within just the ASIAN subtable:
sort_at_path(pruned, path = c("RACE", "ASIAN", "STRATA1"), scorefun = cont_n_allcols)
# A: Drug X B: Placebo C: Combination
# F M F M F M
# -------------------------------------------------------------------------------------------------------
# ASIAN 44 (62.9%) 35 (68.6%) 37 (66.1%) 31 (62.0%) 40 (65.6%) 44 (64.7%)
# A 15 (21.4%) 12 (23.5%) 14 (25.0%) 6 (12.0%) 15 (24.6%) 16 (23.5%)
# Mean 30.40 34.42 35.43 30.33 37.40 36.25
# C 13 (18.6%) 15 (29.4%) 10 (17.9%) 9 (18.0%) 15 (24.6%) 16 (23.5%)
# Mean 36.92 35.60 34.00 31.89 33.47 31.38
# B 16 (22.9%) 8 (15.7%) 13 (23.2%) 16 (32.0%) 10 (16.4%) 12 (17.6%)
# Mean 33.75 34.88 32.46 30.94 33.30 35.92
# BLACK OR AFRICAN AMERICAN 18 (25.7%) 10 (19.6%) 12 (21.4%) 12 (24.0%) 13 (21.3%) 14 (20.6%)
# A 5 (7.1%) 1 (2.0%) 5 (8.9%) 2 (4.0%) 4 (6.6%) 4 (5.9%)
# Mean 31.20 33.00 28.00 30.00 30.75 36.50
# B 7 (10.0%) 3 (5.9%) 3 (5.4%) 3 (6.0%) 6 (9.8%) 6 (8.8%)
# Mean 36.14 34.33 29.67 32.00 36.33 31.00
# C 6 (8.6%) 6 (11.8%) 4 (7.1%) 7 (14.0%) 3 (4.9%) 4 (5.9%)
# Mean 31.33 39.67 34.50 34.00 33.00 36.50
# WHITE 8 (11.4%) 6 (11.8%) 7 (12.5%) 7 (14.0%) 8 (13.1%) 10 (14.7%)
# A 2 (2.9%) 1 (2.0%) 3 (5.4%) 3 (6.0%) 1 (1.6%) 5 (7.4%)
# Mean 34.00 45.00 29.33 33.33 35.00 32.80
# B 4 (5.7%) 3 (5.9%) 1 (1.8%) 4 (8.0%) 3 (4.9%) 1 (1.5%)
# Mean 37.00 43.67 48.00 36.75 34.33 36.00
# C 2 (2.9%) 2 (3.9%) 3 (5.4%) 0 (0.0%) 4 (6.6%) 4 (5.9%)
# Mean 35.50 44.00 44.67 NA 38.50 35.00
# B and C are swapped because the overall count (sum of all column counts) of stratum C is higher than that of stratum B
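The swap can be checked with a few lines of arithmetic. The counts below are copied by hand from the ASIAN rows of the table above; the object names are only illustrative.

```r
# Per-column counts for the ASIAN strata, copied from the printed table.
asian_counts <- list(
  A = c(15, 12, 14, 6, 15, 16),
  B = c(16, 8, 13, 16, 10, 12),
  C = c(13, 15, 10, 9, 15, 16)
)

# This is the quantity cont_n_allcols() scores on: the sum of counts over all columns.
totals <- vapply(asian_counts, sum, numeric(1))
totals
#  A  B  C
# 78 75 78
```

Stratum C (78) outranks stratum B (75), so C moves above B when larger scores come first.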
Wildcards in Sort Paths
A sorting path can contain one or more instances of the "*" wildcard. Each of these indicates that the children of each subtable matching this * element of the path should be sorted separately, as indicated by the remainder of the path after the * and the score function.
Thus we can extend our sorting of strata within the ASIAN subtable to all race-specific subtables by using the wildcard:
sort_at_path(pruned, path = c("RACE", "*", "STRATA1"), scorefun = cont_n_allcols)
# A: Drug X B: Placebo C: Combination
# F M F M F M
# -------------------------------------------------------------------------------------------------------
# ASIAN 44 (62.9%) 35 (68.6%) 37 (66.1%) 31 (62.0%) 40 (65.6%) 44 (64.7%)
# A 15 (21.4%) 12 (23.5%) 14 (25.0%) 6 (12.0%) 15 (24.6%) 16 (23.5%)
# Mean 30.40 34.42 35.43 30.33 37.40 36.25
# C 13 (18.6%) 15 (29.4%) 10 (17.9%) 9 (18.0%) 15 (24.6%) 16 (23.5%)
# Mean 36.92 35.60 34.00 31.89 33.47 31.38
# B 16 (22.9%) 8 (15.7%) 13 (23.2%) 16 (32.0%) 10 (16.4%) 12 (17.6%)
# Mean 33.75 34.88 32.46 30.94 33.30 35.92
# BLACK OR AFRICAN AMERICAN 18 (25.7%) 10 (19.6%) 12 (21.4%) 12 (24.0%) 13 (21.3%) 14 (20.6%)
# C 6 (8.6%) 6 (11.8%) 4 (7.1%) 7 (14.0%) 3 (4.9%) 4 (5.9%)
# Mean 31.33 39.67 34.50 34.00 33.00 36.50
# B 7 (10.0%) 3 (5.9%) 3 (5.4%) 3 (6.0%) 6 (9.8%) 6 (8.8%)
# Mean 36.14 34.33 29.67 32.00 36.33 31.00
# A 5 (7.1%) 1 (2.0%) 5 (8.9%) 2 (4.0%) 4 (6.6%) 4 (5.9%)
# Mean 31.20 33.00 28.00 30.00 30.75 36.50
# WHITE 8 (11.4%) 6 (11.8%) 7 (12.5%) 7 (14.0%) 8 (13.1%) 10 (14.7%)
# B 4 (5.7%) 3 (5.9%) 1 (1.8%) 4 (8.0%) 3 (4.9%) 1 (1.5%)
# Mean 37.00 43.67 48.00 36.75 34.33 36.00
# A 2 (2.9%) 1 (2.0%) 3 (5.4%) 3 (6.0%) 1 (1.6%) 5 (7.4%)
# Mean 34.00 45.00 29.33 33.33 35.00 32.80
# C 2 (2.9%) 2 (3.9%) 3 (5.4%) 0 (0.0%) 4 (6.6%) 4 (5.9%)
# Mean 35.50 44.00 44.67 NA 38.50 35.00
# All subtables, i.e. ASIAN, BLACK..., and WHITE, are reordered separately
The above is equivalent to separately calling the following:
tmptbl <- sort_at_path(pruned, path = c("RACE", "ASIAN", "STRATA1"), scorefun = cont_n_allcols)
tmptbl <- sort_at_path(tmptbl, path = c("RACE", "BLACK OR AFRICAN AMERICAN", "STRATA1"), scorefun = cont_n_allcols)
tmptbl <- sort_at_path(tmptbl, path = c("RACE", "WHITE", "STRATA1"), scorefun = cont_n_allcols)
tmptbl
# A: Drug X B: Placebo C: Combination
# F M F M F M
# -------------------------------------------------------------------------------------------------------
# ASIAN 44 (62.9%) 35 (68.6%) 37 (66.1%) 31 (62.0%) 40 (65.6%) 44 (64.7%)
# A 15 (21.4%) 12 (23.5%) 14 (25.0%) 6 (12.0%) 15 (24.6%) 16 (23.5%)
# Mean 30.40 34.42 35.43 30.33 37.40 36.25
# C 13 (18.6%) 15 (29.4%) 10 (17.9%) 9 (18.0%) 15 (24.6%) 16 (23.5%)
# Mean 36.92 35.60 34.00 31.89 33.47 31.38
# B 16 (22.9%) 8 (15.7%) 13 (23.2%) 16 (32.0%) 10 (16.4%) 12 (17.6%)
# Mean 33.75 34.88 32.46 30.94 33.30 35.92
# BLACK OR AFRICAN AMERICAN 18 (25.7%) 10 (19.6%) 12 (21.4%) 12 (24.0%) 13 (21.3%) 14 (20.6%)
# C 6 (8.6%) 6 (11.8%) 4 (7.1%) 7 (14.0%) 3 (4.9%) 4 (5.9%)
# Mean 31.33 39.67 34.50 34.00 33.00 36.50
# B 7 (10.0%) 3 (5.9%) 3 (5.4%) 3 (6.0%) 6 (9.8%) 6 (8.8%)
# Mean 36.14 34.33 29.67 32.00 36.33 31.00
# A 5 (7.1%) 1 (2.0%) 5 (8.9%) 2 (4.0%) 4 (6.6%) 4 (5.9%)
# Mean 31.20 33.00 28.00 30.00 30.75 36.50
# WHITE 8 (11.4%) 6 (11.8%) 7 (12.5%) 7 (14.0%) 8 (13.1%) 10 (14.7%)
# B 4 (5.7%) 3 (5.9%) 1 (1.8%) 4 (8.0%) 3 (4.9%) 1 (1.5%)
# Mean 37.00 43.67 48.00 36.75 34.33 36.00
# A 2 (2.9%) 1 (2.0%) 3 (5.4%) 3 (6.0%) 1 (1.6%) 5 (7.4%)
# Mean 34.00 45.00 29.33 33.33 35.00 32.80
# C 2 (2.9%) 2 (3.9%) 3 (5.4%) 0 (0.0%) 4 (6.6%) 4 (5.9%)
# Mean 35.50 44.00 44.67 NA 38.50 35.00
Pathing is easier to understand with table_structure(), which highlights the tree-like structure and the node names:
table_structure(pruned)
# [TableTree] RACE
#  [TableTree] ASIAN [cont: 1 x 6]
#   [TableTree] STRATA1
#    [TableTree] A [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
#    [TableTree] B [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
#    [TableTree] C [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
#  [TableTree] BLACK OR AFRICAN AMERICAN [cont: 1 x 6]
#   [TableTree] STRATA1
#    [TableTree] A [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
#    [TableTree] B [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
#    [TableTree] C [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
#  [TableTree] WHITE [cont: 1 x 6]
#   [TableTree] STRATA1
#    [TableTree] A [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
#    [TableTree] B [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
#    [TableTree] C [cont: 1 x 6]
#     [ElementaryTable] AGE (1 x 6)
or with row_paths_summary():
row_paths_summary(pruned)
# rowname node_class path
# ---------------------------------------------------------------------------------------------------------------
# ASIAN ContentRow RACE, ASIAN, @content, ASIAN
# A ContentRow RACE, ASIAN, STRATA1, A, @content, A
# Mean DataRow RACE, ASIAN, STRATA1, A, AGE, Mean
# B ContentRow RACE, ASIAN, STRATA1, B, @content, B
# Mean DataRow RACE, ASIAN, STRATA1, B, AGE, Mean
# C ContentRow RACE, ASIAN, STRATA1, C, @content, C
# Mean DataRow RACE, ASIAN, STRATA1, C, AGE, Mean
# BLACK OR AFRICAN AMERICAN ContentRow RACE, BLACK OR AFRICAN AMERICAN, @content, BLACK OR AFRICAN AMERICAN
# A ContentRow RACE, BLACK OR AFRICAN AMERICAN, STRATA1, A, @content, A
# Mean DataRow RACE, BLACK OR AFRICAN AMERICAN, STRATA1, A, AGE, Mean
# B ContentRow RACE, BLACK OR AFRICAN AMERICAN, STRATA1, B, @content, B
# Mean DataRow RACE, BLACK OR AFRICAN AMERICAN, STRATA1, B, AGE, Mean
# C ContentRow RACE, BLACK OR AFRICAN AMERICAN, STRATA1, C, @content, C
# Mean DataRow RACE, BLACK OR AFRICAN AMERICAN, STRATA1, C, AGE, Mean
# WHITE ContentRow RACE, WHITE, @content, WHITE
# A ContentRow RACE, WHITE, STRATA1, A, @content, A
# Mean DataRow RACE, WHITE, STRATA1, A, AGE, Mean
# B ContentRow RACE, WHITE, STRATA1, B, @content, B
# Mean DataRow RACE, WHITE, STRATA1, B, AGE, Mean
# C ContentRow RACE, WHITE, STRATA1, C, @content, C
# Mean DataRow RACE, WHITE, STRATA1, C, AGE, Mean
Note that in the latter we see content rows as those with paths following @content, e.g., ASIAN, @content, ASIAN. The first of these at a given path (i.e., <path>, @content, <>) are the rows which will be used by the scoring functions whose names begin with cont_.
We can directly sort the ethnicities by observation count in increasing order:
ethsort <- sort_at_path(pruned, path = c("RACE"), scorefun = cont_n_allcols, decreasing = FALSE)
ethsort
# A: Drug X B: Placebo C: Combination
# F M F M F M
# -------------------------------------------------------------------------------------------------------
# WHITE 8 (11.4%) 6 (11.8%) 7 (12.5%) 7 (14.0%) 8 (13.1%) 10 (14.7%)
# A 2 (2.9%) 1 (2.0%) 3 (5.4%) 3 (6.0%) 1 (1.6%) 5 (7.4%)
# Mean 34.00 45.00 29.33 33.33 35.00 32.80
# B 4 (5.7%) 3 (5.9%) 1 (1.8%) 4 (8.0%) 3 (4.9%) 1 (1.5%)
# Mean 37.00 43.67 48.00 36.75 34.33 36.00
# C 2 (2.9%) 2 (3.9%) 3 (5.4%) 0 (0.0%) 4 (6.6%) 4 (5.9%)
# Mean 35.50 44.00 44.67 NA 38.50 35.00
# BLACK OR AFRICAN AMERICAN 18 (25.7%) 10 (19.6%) 12 (21.4%) 12 (24.0%) 13 (21.3%) 14 (20.6%)
# A 5 (7.1%) 1 (2.0%) 5 (8.9%) 2 (4.0%) 4 (6.6%) 4 (5.9%)
# Mean 31.20 33.00 28.00 30.00 30.75 36.50
# B 7 (10.0%) 3 (5.9%) 3 (5.4%) 3 (6.0%) 6 (9.8%) 6 (8.8%)
# Mean 36.14 34.33 29.67 32.00 36.33 31.00
# C 6 (8.6%) 6 (11.8%) 4 (7.1%) 7 (14.0%) 3 (4.9%) 4 (5.9%)
# Mean 31.33 39.67 34.50 34.00 33.00 36.50
# ASIAN 44 (62.9%) 35 (68.6%) 37 (66.1%) 31 (62.0%) 40 (65.6%) 44 (64.7%)
# A 15 (21.4%) 12 (23.5%) 14 (25.0%) 6 (12.0%) 15 (24.6%) 16 (23.5%)
# Mean 30.40 34.42 35.43 30.33 37.40 36.25
# B 16 (22.9%) 8 (15.7%) 13 (23.2%) 16 (32.0%) 10 (16.4%) 12 (17.6%)
# Mean 33.75 34.88 32.46 30.94 33.30 35.92
# C 13 (18.6%) 15 (29.4%) 10 (17.9%) 9 (18.0%) 15 (24.6%) 16 (23.5%)
# Mean 36.92 35.60 34.00 31.89 33.47 31.38
Within each ethnicity separately, sort the strata by the number of females in arm C (i.e., column position 5):
sort_at_path(pruned, path = c("RACE", "*", "STRATA1"), cont_n_onecol(5))
# A: Drug X B: Placebo C: Combination
# F M F M F M
# -------------------------------------------------------------------------------------------------------
# ASIAN 44 (62.9%) 35 (68.6%) 37 (66.1%) 31 (62.0%) 40 (65.6%) 44 (64.7%)
# A 15 (21.4%) 12 (23.5%) 14 (25.0%) 6 (12.0%) 15 (24.6%) 16 (23.5%)
# Mean 30.40 34.42 35.43 30.33 37.40 36.25
# C 13 (18.6%) 15 (29.4%) 10 (17.9%) 9 (18.0%) 15 (24.6%) 16 (23.5%)
# Mean 36.92 35.60 34.00 31.89 33.47 31.38
# B 16 (22.9%) 8 (15.7%) 13 (23.2%) 16 (32.0%) 10 (16.4%) 12 (17.6%)
# Mean 33.75 34.88 32.46 30.94 33.30 35.92
# BLACK OR AFRICAN AMERICAN 18 (25.7%) 10 (19.6%) 12 (21.4%) 12 (24.0%) 13 (21.3%) 14 (20.6%)
# B 7 (10.0%) 3 (5.9%) 3 (5.4%) 3 (6.0%) 6 (9.8%) 6 (8.8%)
# Mean 36.14 34.33 29.67 32.00 36.33 31.00
# A 5 (7.1%) 1 (2.0%) 5 (8.9%) 2 (4.0%) 4 (6.6%) 4 (5.9%)
# Mean 31.20 33.00 28.00 30.00 30.75 36.50
# C 6 (8.6%) 6 (11.8%) 4 (7.1%) 7 (14.0%) 3 (4.9%) 4 (5.9%)
# Mean 31.33 39.67 34.50 34.00 33.00 36.50
# WHITE 8 (11.4%) 6 (11.8%) 7 (12.5%) 7 (14.0%) 8 (13.1%) 10 (14.7%)
# C 2 (2.9%) 2 (3.9%) 3 (5.4%) 0 (0.0%) 4 (6.6%) 4 (5.9%)
# Mean 35.50 44.00 44.67 NA 38.50 35.00
# B 4 (5.7%) 3 (5.9%) 1 (1.8%) 4 (8.0%) 3 (4.9%) 1 (1.5%)
# Mean 37.00 43.67 48.00 36.75 34.33 36.00
# A 2 (2.9%) 1 (2.0%) 3 (5.4%) 3 (6.0%) 1 (1.6%) 5 (7.4%)
# Mean 34.00 45.00 29.33 33.33 35.00 32.80
Sorting Within an Analysis Subtable
When sorting within an analysis subtable (e.g., the subtable generated when your analysis function generates more than one row per group of data), the name of that subtable (generally the name of the variable being analyzed) must appear in the path, even if the variable label is not displayed when the table is printed.
To show the difference between sorting an analysis subtable (DataRow) and a content subtable (ContentRow), we build and prune (as before) a table similar to the previous one:
more_analysis_fnc <- function(x) {
  in_rows(
    "median" = median(x),
    "mean" = mean(x),
    .formats = "xx.x"
  )
}
raw_lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by(
    "RACE",
    split_fun = drop_and_remove_levels("WHITE") # dropping WHITE levels
  ) %>%
  summarize_row_groups() %>%
  split_rows_by("STRATA1") %>%
  summarize_row_groups() %>%
  analyze("AGE", afun = more_analysis_fnc)
tbl <- build_table(raw_lyt, DM) %>%
  prune_table() %>%
  print()
# A: Drug X B: Placebo C: Combination
# --------------------------------------------------------------------
# ASIAN 79 (65.3%) 68 (64.2%) 84 (65.1%)
# A 27 (22.3%) 20 (18.9%) 31 (24.0%)
# median 30.0 33.0 36.0
# mean 32.2 33.9 36.8
# B 24 (19.8%) 29 (27.4%) 22 (17.1%)
# median 32.5 32.0 34.0
# mean 34.1 31.6 34.7
# C 28 (23.1%) 19 (17.9%) 31 (24.0%)
# median 36.5 34.0 33.0
# mean 36.2 33.0 32.4
# BLACK OR AFRICAN AMERICAN 28 (23.1%) 24 (22.6%) 27 (20.9%)
# A 6 (5.0%) 7 (6.6%) 8 (6.2%)
# median 32.0 29.0 32.5
# mean 31.5 28.6 33.6
# B 10 (8.3%) 6 (5.7%) 12 (9.3%)
# median 33.0 30.0 33.5
# mean 35.6 30.8 33.7
# C 12 (9.9%) 11 (10.4%) 7 (5.4%)
# median 33.0 36.0 32.0
# mean 35.5 34.2 35.0
What if we now want to sort the median and mean rows within each stratum? We need to write a custom score function, because the ready-made ones currently work only with content nodes: cont_n_allcols() and cont_n_onecol() (discussed further below) both rely on the content_table() accessor. Before that, we need to decide what we are ordering, i.e., specify the right path. We suggest looking at the structure first with table_structure() or row_paths_summary().
table_structure(tbl) # Direct inspection into the tree-like structure of rtables
# [TableTree] RACE
#  [TableTree] ASIAN [cont: 1 x 3]
#   [TableTree] STRATA1
#    [TableTree] A [cont: 1 x 3]
#     [ElementaryTable] AGE (2 x 3)
#    [TableTree] B [cont: 1 x 3]
#     [ElementaryTable] AGE (2 x 3)
#    [TableTree] C [cont: 1 x 3]
#     [ElementaryTable] AGE (2 x 3)
#  [TableTree] BLACK OR AFRICAN AMERICAN [cont: 1 x 3]
#   [TableTree] STRATA1
#    [TableTree] A [cont: 1 x 3]
#     [ElementaryTable] AGE (2 x 3)
#    [TableTree] B [cont: 1 x 3]
#     [ElementaryTable] AGE (2 x 3)
#    [TableTree] C [cont: 1 x 3]
#     [ElementaryTable] AGE (2 x 3)
We see that to order the rows of an AGE node we need a path such as RACE, ASIAN, STRATA1, A, AGE and no further, because the next level down holds the rows we want to sort. This path alone would sort only the first group, however, so we need wildcards: RACE, *, STRATA1, *, AGE.
Now that we have a path that selects the subtables we want to sort, we need a scoring function that works on the median and mean rows. A practical way to develop one is to call browser() inside the scoring function, inspect what is passed to it, and work out how to retrieve the single value that should be returned for sorting. We leave this experimentation to the reader; here we show one possible solution, which sums all the column values retrieved with row_values(tt) from the element passed to the function. Note that any score function should take a single argument, tt (the subtable or row being scored), and return a single numeric value.
scorefun <- function(tt) {
  # Here we could use browser()
  sum(unlist(row_values(tt)))
}
sort_at_path(tbl, c("RACE", "*", "STRATA1", "*", "AGE"), scorefun)
# A: Drug X B: Placebo C: Combination
# --------------------------------------------------------------------
# ASIAN 79 (65.3%) 68 (64.2%) 84 (65.1%)
# A 27 (22.3%) 20 (18.9%) 31 (24.0%)
# mean 32.2 33.9 36.8
# median 30.0 33.0 36.0
# B 24 (19.8%) 29 (27.4%) 22 (17.1%)
# mean 34.1 31.6 34.7
# median 32.5 32.0 34.0
# C 28 (23.1%) 19 (17.9%) 31 (24.0%)
# median 36.5 34.0 33.0
# mean 36.2 33.0 32.4
# BLACK OR AFRICAN AMERICAN 28 (23.1%) 24 (22.6%) 27 (20.9%)
# A 6 (5.0%) 7 (6.6%) 8 (6.2%)
# mean 31.5 28.6 33.6
# median 32.0 29.0 32.5
# B 10 (8.3%) 6 (5.7%) 12 (9.3%)
# mean 35.6 30.8 33.7
# median 33.0 30.0 33.5
# C 12 (9.9%) 11 (10.4%) 7 (5.4%)
# mean 35.5 34.2 35.0
# median 33.0 36.0 32.0
To help visualize what happens inside the score function, here is an example of exploring it in a browser() debugging session:
> sort_at_path(tbl, c("RACE", "*", "STRATA1", "*", "AGE"), scorefun)
Called from: scorefun(x)
Browse[1]> tt ### THIS IS THE LEAF LEVEL -> DataRow ###
[DataRow indent_mod 0]: median 30.0 33.0 36.0
Browse[1]> row_values(tt) ### Extraction of values -> It will be a named list! ###
$`A: Drug X`
[1] 30
$`B: Placebo`
[1] 33
$`C: Combination`
[1] 36
Browse[1]> sum(unlist(row_values(tt))) ### Final value we want to give back to sort_at_path ###
[1] 99
Changing the sorting criterion from within a custom scoring function is both powerful and practical. We show this by selecting a specific column to sort by. The pre-defined function cont_n_onecol() gives some insight into how to proceed.
cont_n_onecol
# function (j)
# {
#     function(tt) {
#         ctab <- content_table(tt)
#         if (NROW(ctab) == 0) {
#             stop("cont_n_allcols score function used at subtable [",
#                 obj_name(tt), "] that has no content table.")
#         }
#         row_values(tree_children(ctab)[[1]])[[j]][1]
#     }
# }
# <bytecode: 0x561500c87850>
# <environment: namespace:rtables>
We see that a function similar to cont_n_allcols() is wrapped by an outer function whose parameter j selects a specific column. We will do the same here to select the column we want to sort by.
scorefun_onecol <- function(colpath) {
  function(tt) {
    # Here we could use browser()
    unlist(cell_values(tt, colpath = colpath), use.names = FALSE)[1] # Modified to lose the list names
  }
}
sort_at_path(tbl, c("RACE", "*", "STRATA1", "*", "AGE"), scorefun_onecol(colpath = c("ARM", "A: Drug X")))
# A: Drug X B: Placebo C: Combination
# --------------------------------------------------------------------
# ASIAN 79 (65.3%) 68 (64.2%) 84 (65.1%)
# A 27 (22.3%) 20 (18.9%) 31 (24.0%)
# mean 32.2 33.9 36.8
# median 30.0 33.0 36.0
# B 24 (19.8%) 29 (27.4%) 22 (17.1%)
# mean 34.1 31.6 34.7
# median 32.5 32.0 34.0
# C 28 (23.1%) 19 (17.9%) 31 (24.0%)
# median 36.5 34.0 33.0
# mean 36.2 33.0 32.4
# BLACK OR AFRICAN AMERICAN 28 (23.1%) 24 (22.6%) 27 (20.9%)
# A 6 (5.0%) 7 (6.6%) 8 (6.2%)
# median 32.0 29.0 32.5
# mean 31.5 28.6 33.6
# B 10 (8.3%) 6 (5.7%) 12 (9.3%)
# mean 35.6 30.8 33.7
# median 33.0 30.0 33.5
# C 12 (9.9%) 11 (10.4%) 7 (5.4%)
# mean 35.5 34.2 35.0
# median 33.0 36.0 32.0
In the above table we see that the mean and median rows are reordered by their values in the first column, compared to the raw table, as desired.
With this function we can also do the same for columns that are nested within larger splits:
# Simpler table
tbl <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_cols_by("SEX",
    split_fun = drop_and_remove_levels(c("U", "UNDIFFERENTIATED"))
  ) %>%
  analyze("AGE", afun = more_analysis_fnc) %>%
  build_table(DM) %>%
  prune_table() %>%
  print()
# A: Drug X B: Placebo C: Combination
# F M F M F M
# ---------------------------------------------------------
# median 32.0 35.0 33.0 31.0 35.0 32.0
# mean 33.7 36.5 33.8 32.1 34.9 34.3
sort_at_path(tbl, c("AGE"), scorefun_onecol(colpath = c("ARM", "B: Placebo", "SEX", "F")))
# A: Drug X B: Placebo C: Combination
# F M F M F M
# ---------------------------------------------------------
# mean 33.7 36.5 33.8 32.1 34.9 34.3
# median 32.0 35.0 33.0 31.0 35.0 32.0
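As a variation, a column can also be selected by position rather than by column path, mirroring the j parameter of cont_n_onecol() but for data rows. The following is a minimal sketch, not part of rtables: the names two_row_afun, tbl_idx, and scorefun_colindex are hypothetical, and the table is rebuilt here so that the example is self-contained.

```r
library(rtables)

# Analysis function producing two rows per group (same shape as in this section).
two_row_afun <- function(x) {
  in_rows("median" = median(x), "mean" = mean(x), .formats = "xx.x")
}

tbl_idx <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_cols_by("SEX",
    split_fun = drop_and_remove_levels(c("U", "UNDIFFERENTIATED"))
  ) %>%
  analyze("AGE", afun = two_row_afun) %>%
  build_table(DM)

# Hypothetical helper: score a row by the first value in its j-th leaf column.
scorefun_colindex <- function(j) {
  function(tt) row_values(tt)[[j]][1]
}

# Leaf column 3 is the first column under the second arm.
sorted_idx <- sort_at_path(tbl_idx, c("AGE"), scorefun_colindex(3))
sorted_idx
```

A positional index is shorter to write, but a column path keeps working if the column layout changes, which may make scorefun_onecol() the safer choice for tables whose columns are not fixed.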