I'm wondering if add_count() and add_tally() should throw an error when the output column name (default "n") is already used by a pre-existing column. Otherwise they end up behaving oddly. For instance:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union

df <- tibble(a = rep(LETTERS, 2), n = 1:52)
df
#> # A tibble: 52 x 2
#>    a         n
#>    <chr> <int>
#>  1 A         1
#>  2 B         2
#>  3 C         3
#>  4 D         4
#>  5 E         5
#>  6 F         6
#>  7 G         7
#>  8 H         8
#>  9 I         9
#> 10 J        10
#> # ... with 42 more rows

add_tally(df)
#> Using `n` as weighting variable
#> # A tibble: 52 x 2
#>    a         n
#>    <chr> <int>
#>  1 A      1378
#>  2 B      1378
#>  3 C      1378
#>  4 D      1378
#>  5 E      1378
#>  6 F      1378
#>  7 G      1378
#>  8 H      1378
#>  9 I      1378
#> 10 J      1378
#> # ... with 42 more rows
Do we want n to be used as a weighting variable here? Every row ends up with 1378, the sum of 1:52, which seems unexpected to me.
add_count(df, a)
#> # A tibble: 52 x 2
#>    a         n
#>    <chr> <int>
#>  1 A         2
#>  2 B         2
#>  3 C         2
#>  4 D         2
#>  5 E         2
#>  6 F         2
#>  7 G         2
#>  8 H         2
#>  9 I         2
#> 10 J         2
#> # ... with 42 more rows
In this case, add_count() silently replaces the pre-existing column "n" with its output, which is likely not the user's intent.
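To make the suggestion concrete, here is a rough sketch of the kind of check I have in mind. assert_name_free() is a hypothetical helper written only for illustration; it is not part of dplyr, and the error wording is just a placeholder.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): stop if the default output
# column name is already taken, otherwise return the data unchanged.
assert_name_free <- function(data, name = "n") {
  if (name %in% names(data)) {
    stop("Column `", name, "` already exists in the data.", call. = FALSE)
  }
  data
}

df <- tibble(a = rep(LETTERS, 2), n = 1:52)

# With a pre-existing `n`, the check fails before add_count() runs;
# tryCatch() captures the error message so the script keeps going.
result <- tryCatch(
  df %>% assert_name_free() %>% add_count(a),
  error = function(e) conditionMessage(e)
)
result

# Without a pre-existing `n`, the data passes through and is counted as usual.
ok <- tibble(a = rep(LETTERS, 2)) %>% assert_name_free() %>% add_count(a)
```

Whether this should be a hard error or only a warning is open; the point is only that the collision gets surfaced rather than handled silently.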
EDIT: I see that the behaviour with add_count is already discussed here. Fair enough on that front.