This page describes the <tidy-select> argument modifier, which indicates that
the argument supports tidy selections. Tidy selection provides a concise
dialect of R for selecting variables based on their names or properties.
Tidy selection is a variant of tidy evaluation. This means that inside
functions, tidy-select arguments require special attention, as described in
the Indirection section below. If you've never heard of tidy evaluation
before, start with vignette("programming").
Overview of selection features
Tidyverse selections implement a dialect of R where operators make it easy to select variables:
: for selecting a range of consecutive variables.
! for taking the complement of a set of variables.
& and | for selecting the intersection or the union of two sets of variables.
c() for combining selections.
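The operators above can be sketched with select() on the built-in iris data frame. This is a minimal illustration, assuming dplyr is installed; the result variable names are made up for the example.

```r
library(dplyr)

# iris has columns: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species

# `:` selects a range of consecutive variables
range_sel <- select(iris, Sepal.Length:Petal.Length)

# `!` takes the complement of a selection
not_species <- select(iris, !Species)

# `&` keeps variables matched by both selections,
# `|` keeps variables matched by either one
both <- select(iris, starts_with("Petal") & ends_with("Width"))
either <- select(iris, starts_with("Petal") | ends_with("Width"))

# c() combines selections, in the order given
combined <- select(iris, c(Species, Sepal.Length))

names(both)      # "Petal.Width"
names(combined)  # "Species" "Sepal.Length"
```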
In addition, you can use selection helpers. Some helpers select specific columns:
everything(): Matches all variables.
last_col(): Selects the last variable, possibly with an offset.
group_cols(): Selects all grouping columns.
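As a rough sketch of these three helpers, again using iris and assuming dplyr is loaded (the variable names are illustrative only):

```r
library(dplyr)

# everything() matches all variables; placed after a named column,
# it has the effect of moving that column to the front
species_first <- select(iris, Species, everything())

# last_col() selects the last variable; offset counts back from the end
last <- select(iris, last_col())
second_to_last <- select(iris, last_col(offset = 1))

# group_cols() selects the grouping columns of a grouped data frame
by_species <- group_by(iris, Species)
groups_only <- select(by_species, group_cols())

names(last)            # "Species"
names(second_to_last)  # "Petal.Width"
names(groups_only)     # "Species"
```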
Other helpers select variables by matching patterns in their names:
starts_with(): Starts with a prefix.
ends_with(): Ends with a suffix.
contains(): Contains a literal string.
matches(): Matches a regular expression.
num_range(): Matches a numerical range like x01, x02, x03.
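The pattern helpers might be used as follows. This is a sketch assuming dplyr is loaded; the small data frame df_num is made up to demonstrate num_range().

```r
library(dplyr)

prefix <- select(iris, starts_with("Sepal"))
suffix <- select(iris, ends_with("Width"))

# contains() matches a literal substring, not a regular expression
literal <- select(iris, contains("etal"))

# matches() takes a regular expression
regex <- select(iris, matches("^(Sepal|Petal)\\.Length$"))

# num_range() builds names from a prefix and a numeric range;
# width = 2 zero-pads the numbers, giving x01 and x02 here
df_num <- data.frame(x01 = 1, x02 = 2, x03 = 3, y = 4)
numbered <- select(df_num, num_range("x", 1:2, width = 2))

names(prefix)    # "Sepal.Length" "Sepal.Width"
names(regex)     # "Sepal.Length" "Petal.Length"
names(numbered)  # "x01" "x02"
```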
Or from variables stored in a character vector:
all_of(): Matches variable names in a character vector. All names must be present, otherwise an out-of-bounds error is thrown.
any_of(): Same as all_of(), except that no error is thrown for names that don't exist.
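The difference between the two helpers shows up when a name is missing. The sketch below assumes dplyr is loaded; the vectors vars and wanted, and the name "not_a_column", are invented for the example.

```r
library(dplyr)

vars <- c("Sepal.Length", "Species")
strict <- select(iris, all_of(vars))

# "not_a_column" does not exist in iris
wanted <- c("Species", "not_a_column")

# any_of() silently skips names that are not present
lenient <- select(iris, any_of(wanted))

# all_of() signals an error for the same input; catch it to inspect it
outcome <- tryCatch(
  select(iris, all_of(wanted)),
  error = function(e) "error"
)

names(strict)   # "Sepal.Length" "Species"
names(lenient)  # "Species"
outcome         # "error"
```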
Or using a predicate function:
where(): Applies a function to all variables and selects those for which the function returns TRUE.
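A short sketch of where() with both a built-in predicate and a custom one, assuming dplyr is loaded. The threshold of 3.5 is arbitrary and chosen only for illustration.

```r
library(dplyr)

# Select by column type using an existing predicate
numeric_cols <- select(iris, where(is.numeric))
factor_cols <- select(iris, where(is.factor))

# A custom predicate must return a single TRUE or FALSE per column;
# checking is.numeric() first avoids calling mean() on the factor column
large_mean <- select(
  iris,
  where(function(x) is.numeric(x) && mean(x) > 3.5)
)

names(factor_cols)  # "Species"
names(large_mean)   # "Sepal.Length" "Petal.Length"
```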
Indirection
There are two main cases:
If you have a character vector of column names, use all_of() or any_of(), depending on whether or not you want unknown variable names to cause an error, e.g. select(df, all_of(vars)), select(df, !any_of(vars)).
If you want the user to be able to supply a tidyselect specification in a function argument, embrace the function argument, e.g. select(df, {{ vars }}).
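Both cases can be sketched as small wrapper functions. The function names keep_columns(), drop_columns(), and select_cols() are hypothetical and exist only for this example; dplyr is assumed to be loaded.

```r
library(dplyr)

# Case 1: the caller passes a character vector of column names
keep_columns <- function(df, vars) {
  select(df, all_of(vars))      # errors if any name is missing
}
drop_columns <- function(df, vars) {
  select(df, !any_of(vars))     # ignores names that are missing
}

# Case 2: the caller passes a tidyselect specification,
# which the function forwards by embracing the argument
select_cols <- function(df, vars) {
  select(df, {{ vars }})
}

kept <- keep_columns(iris, c("Species", "Sepal.Width"))
dropped <- drop_columns(iris, c("Species", "not_a_column"))
petals <- select_cols(iris, starts_with("Petal"))
mixed <- select_cols(iris, c(Species, Sepal.Length))

names(kept)    # "Species" "Sepal.Width"
names(petals)  # "Petal.Length" "Petal.Width"
```

With embracing, the caller can use any of the selection features described on this page, including bare column names, operators, and helpers.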