pick() provides a way to easily select a subset of columns from your data using select() semantics while inside a "data-masking" function like mutate() or summarise(). pick() returns a data frame containing the selected columns for the current group.

pick() is complementary to across():

- With pick(), you typically apply a function to the full data frame.
- With across(), you typically apply a function to each column.
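
The contrast between the two helpers can be illustrated with a small sketch. The data frame scores and its columns g, x, and y are made up for illustration; nrow() and mean() simply stand in for "a function of a data frame" and "a function of a column".

```r
library(dplyr)

# Made-up data for illustration only
scores <- tibble(g = c("a", "a", "b"), x = c(1, 2, 3), y = c(4, 5, 6))

# pick(): nrow() receives one data frame (columns x and y) per group
whole <- scores %>%
  group_by(g) %>%
  summarise(n_rows = nrow(pick(x, y)))

# across(): mean() is applied to each selected column in turn, per group
each <- scores %>%
  group_by(g) %>%
  summarise(across(c(x, y), mean))
```

Here whole has one n_rows value per group, while each keeps one summary column per selected input column.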
Arguments
- ...: Columns to pick. You can't pick grouping columns because they are already automatically handled by the verb (i.e. summarise() or mutate()).
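
The grouping-column rule can be seen in a short sketch. The data frame scores is made up for illustration; the example only lists which column names pick(everything()) sees inside a grouped summarise().

```r
library(dplyr)

# Made-up data for illustration only
scores <- tibble(g = c("a", "a", "b"), x = c(1, 2, 3), y = c(4, 5, 6))

# Inside a grouped verb, the grouping column g is not among the columns
# that pick() can select, so everything() only reaches x and y.
picked <- scores %>%
  group_by(g) %>%
  summarise(cols = paste(names(pick(everything())), collapse = ", "))

picked$cols
```

The grouping column still appears in the result, because summarise() adds it back itself.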
Details
Theoretically, pick() is intended to be replaceable with an equivalent call to tibble(). For example, pick(a, c) could be replaced with tibble(a = a, c = c), and pick(everything()) on a data frame with columns a, b, and c could be replaced with tibble(a = a, b = b, c = c).
pick() specially handles the case of an empty selection by returning a 1 row, 0 column tibble, so an exact replacement has to account for that case rather than calling tibble() directly.
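
The empty-selection behavior described above can be sketched with a small helper. This is a minimal sketch, not dplyr's implementation: pick_like() is a hypothetical name, and it assumes the vctrs functions vec_size_common() and vec_recycle_common() together with tibble's new_tibble().

```r
library(tibble)
library(vctrs)

# Hypothetical helper (not a dplyr function): builds a tibble from
# already-evaluated columns, the way the text describes pick() doing.
pick_like <- function(...) {
  # Common size of the inputs; falls back to 1 when nothing is supplied
  size <- vec_size_common(..., .absent = 1L)
  # Recycle every input to that common size
  out <- vec_recycle_common(..., .size = size)
  # Assemble the tibble with an explicit row count
  new_tibble(out, nrow = size)
}

full <- pick_like(a = 1:3, c = c("u", "v", "w"))
empty <- pick_like()

dim(empty)     # 1 row, 0 columns
dim(tibble())  # 0 rows, 0 columns
```

The explicit nrow argument is what lets the empty case keep one row, which a bare tibble() call does not.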
Examples
df <- tibble(
  x = c(3, 2, 2, 2, 1),
  y = c(0, 2, 1, 1, 4),
  z1 = c("a", "a", "a", "b", "a"),
  z2 = c("c", "d", "d", "a", "c")
)
df
#> # A tibble: 5 × 4
#>       x     y z1    z2
#>   <dbl> <dbl> <chr> <chr>
#> 1     3     0 a     c
#> 2     2     2 a     d
#> 3     2     1 a     d
#> 4     2     1 b     a
#> 5     1     4 a     c
# `pick()` provides a way to select a subset of your columns using
# tidyselect. It returns a data frame.
df %>% mutate(cols = pick(x, y))
#> # A tibble: 5 × 5
#>       x     y z1    z2    cols$x    $y
#>   <dbl> <dbl> <chr> <chr>  <dbl> <dbl>
#> 1     3     0 a     c          3     0
#> 2     2     2 a     d          2     2
#> 3     2     1 a     d          2     1
#> 4     2     1 b     a          2     1
#> 5     1     4 a     c          1     4
# This is useful for functions that take data frames as inputs.
# For example, you can compute a joint rank between `x` and `y`.
df %>% mutate(rank = dense_rank(pick(x, y)))
#> # A tibble: 5 × 5
#>       x     y z1    z2     rank
#>   <dbl> <dbl> <chr> <chr> <int>
#> 1     3     0 a     c         4
#> 2     2     2 a     d         3
#> 3     2     1 a     d         2
#> 4     2     1 b     a         2
#> 5     1     4 a     c         1
# `pick()` is also useful as a bridge between data-masking functions (like
# `mutate()` or `group_by()`) and functions with tidy-select behavior (like
# `select()`). For example, you can use `pick()` to create a wrapper around
# `group_by()` that takes a tidy-selection of columns to group on. For more
# bridge patterns, see
# https://rlang.r-lib.org/reference/topic-data-mask-programming.html#bridge-patterns.
my_group_by <- function(data, cols) {
  group_by(data, pick({{ cols }}))
}
df %>% my_group_by(c(x, starts_with("z")))
#> # A tibble: 5 × 4
#> # Groups:   x, z1, z2 [4]
#>       x     y z1    z2
#>   <dbl> <dbl> <chr> <chr>
#> 1     3     0 a     c
#> 2     2     2 a     d
#> 3     2     1 a     d
#> 4     2     1 b     a
#> 5     1     4 a     c
# Or you can use it to dynamically select columns to `count()` by
df %>% count(pick(starts_with("z")))
#> # A tibble: 3 × 3
#>   z1    z2        n
#>   <chr> <chr> <int>
#> 1 a     c         2
#> 2 a     d         2
#> 3 b     a         1