Scoped verbs (_if, _at, _all) have been superseded by the use of pick() or across() in an existing verb. See vignette("colwise") for details.
The scoped variants of mutate() and transmute() make it easy to apply the same transformation to multiple variables. There are three variants:
- _all affects every variable.
- _at affects variables selected with a character vector or vars().
- _if affects variables selected with a predicate function.
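The examples on this page only exercise the mutate_*() forms. As a minimal sketch of the transmute_*() forms, assuming a small made-up tibble `df` (hypothetical, not part of dplyr), transmute_if() keeps only the transformed columns, and the same result can be written with across() inside transmute():

```r
library(dplyr)

# Hypothetical data: one character column and two numeric columns.
df <- tibble(id = c("x", "y"), h = c(1.5, 2.5), w = c(10, 20))

# Scoped form: only the numeric columns are transformed and kept.
kept <- df %>% transmute_if(is.numeric, ~ . * 100)

# Equivalent form with across() in an existing verb.
same <- df %>% transmute(across(where(is.numeric), ~ .x * 100))

names(kept)
names(same)
```

Both calls drop `id`, because transmute() returns only the columns it creates or modifies.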
Usage
mutate_all(.tbl, .funs, ...)
mutate_if(.tbl, .predicate, .funs, ...)
mutate_at(.tbl, .vars, .funs, ..., .cols = NULL)
transmute_all(.tbl, .funs, ...)
transmute_if(.tbl, .predicate, .funs, ...)
transmute_at(.tbl, .vars, .funs, ..., .cols = NULL)
Arguments
- .tbl
A tbl object.
- .funs
A function fun, a quosure style lambda ~ fun(.) or a list of either form.
- ...
Additional arguments for the function calls in .funs. These are evaluated only once, with tidy dots support.
- .predicate
A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.
- .vars
A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.
- .cols
This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.
Value
A data frame. By default, the newly created columns have the shortest names needed to uniquely identify the output. To force inclusion of a name, even when not needed, name the input (see examples for details).
Grouping variables
If applied on a grouped tibble, these operations are not applied to the grouping variables. The behaviour depends on whether the selection is implicit (all and if selections) or explicit (at selections).
Grouping variables covered by explicit selections in mutate_at() and transmute_at() are always an error. Add -group_cols() to the vars() selection to avoid this:
data %>% mutate_at(vars(-group_cols(), ...), myoperation)
Or remove group_vars() from the character vector of column names:
nms <- setdiff(nms, group_vars(data))
data %>% mutate_at(nms, myoperation)
Grouping variables covered by implicit selections are ignored by mutate_all(), transmute_all(), mutate_if(), and transmute_if().
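The grouping rules above can be sketched on a small grouped tibble. The data frame `df`, its columns `g`, `x`, `y`, and the result names are made up for illustration; only the dplyr functions are real. dplyr may print a message when an implicit selection skips a grouping variable.

```r
library(dplyr)

# Hypothetical data; `g` is the grouping variable.
df <- tibble(g = c("a", "a", "b"), x = c(1, 2, 3), y = c(10, 20, 30))
by_g <- group_by(df, g)

# Implicit selection: `g` is ignored, only `x` and `y` are transformed.
doubled <- by_g %>% mutate_all(~ . * 2)

# Explicit selection: exclude the grouping columns with -group_cols().
halved <- by_g %>% mutate_at(vars(-group_cols()), ~ . / 2)

# Character-vector selection: drop group_vars() from the names first.
nms <- setdiff(names(by_g), group_vars(by_g))
shifted <- by_g %>% mutate_at(nms, ~ . + 1)
```

In all three results `g` is left untouched and the grouping is preserved.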
Naming
The names of the new columns are derived from the names of the input variables and the names of the functions.
- if there is only one unnamed function (i.e. if .funs is an unnamed list of length one), the names of the input variables are used to name the new columns;
- for _at functions, if there is only one unnamed variable (i.e., if .vars is of the form vars(a_single_column)) and .funs has length greater than one, the names of the functions are used to name the new columns;
- otherwise, the new names are created by concatenating the names of the input variables and the names of the functions, separated with an underscore "_".
The .funs argument can be a named or unnamed list. If a function is unnamed and the name cannot be derived automatically, a name of the form "fn#" is used. Similarly, vars() accepts named and unnamed arguments. If a variable in .vars is named, a new column by that name will be created.
Name collisions in the new columns are disambiguated using a unique suffix.
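The three naming rules can be checked on a small example. This is a sketch under stated assumptions: the tibble `df`, its columns `a` and `b`, and the function labels `root` and `lg` are hypothetical names chosen for illustration.

```r
library(dplyr)

# Hypothetical data with two numeric columns.
df <- tibble(a = c(1, 4, 9), b = c(16, 25, 36))

# Rule 1: one unnamed function, so the input names are reused
# and the columns are modified in place.
one_fun <- names(mutate_all(df, list(sqrt)))

# Rule 2: one unnamed variable and several functions, so the new
# columns are named after the functions only.
one_var <- names(mutate_at(df, vars(a), list(root = sqrt, lg = log)))

# Rule 3: several variables and several functions, so the new
# columns are named "<variable>_<function>".
many <- names(mutate_all(df, list(root = sqrt, lg = log)))

one_fun
one_var
many
```

Naming every function in .funs, as in the last two calls, avoids the automatic "fn#" fallback names.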
Examples
iris <- as_tibble(iris)
# All variants can be passed functions and additional arguments,
# purrr-style. The _at() variants directly support strings. Here
# we'll scale the variables `height` and `mass`:
scale2 <- function(x, na.rm = FALSE) (x - mean(x, na.rm = na.rm)) / sd(x, na.rm)
starwars %>% mutate_at(c("height", "mass"), scale2)
#> # A tibble: 87 × 14
#> name height mass hair_color skin_color eye_color birth_year sex
#> <chr> <dbl> <dbl> <chr> <chr> <chr> <dbl> <chr>
#> 1 Luke Sky… NA NA blond fair blue 19 male
#> 2 C-3PO NA NA NA gold yellow 112 none
#> 3 R2-D2 NA NA NA white, bl… red 33 none
#> 4 Darth Va… NA NA none white yellow 41.9 male
#> 5 Leia Org… NA NA brown light brown 19 fema…
#> 6 Owen Lars NA NA brown, gr… light blue 52 male
#> 7 Beru Whi… NA NA brown light blue 47 fema…
#> 8 R5-D4 NA NA NA white, red red NA none
#> 9 Biggs Da… NA NA black light brown 24 male
#> 10 Obi-Wan … NA NA auburn, w… fair blue-gray 57 male
#> # ℹ 77 more rows
#> # ℹ 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
#> # films <list>, vehicles <list>, starships <list>
# ->
starwars %>% mutate(across(c("height", "mass"), scale2))
#> # A tibble: 87 × 14
#> name height mass hair_color skin_color eye_color birth_year sex
#> <chr> <dbl> <dbl> <chr> <chr> <chr> <dbl> <chr>
#> 1 Luke Sky… NA NA blond fair blue 19 male
#> 2 C-3PO NA NA NA gold yellow 112 none
#> 3 R2-D2 NA NA NA white, bl… red 33 none
#> 4 Darth Va… NA NA none white yellow 41.9 male
#> 5 Leia Org… NA NA brown light brown 19 fema…
#> 6 Owen Lars NA NA brown, gr… light blue 52 male
#> 7 Beru Whi… NA NA brown light blue 47 fema…
#> 8 R5-D4 NA NA NA white, red red NA none
#> 9 Biggs Da… NA NA black light brown 24 male
#> 10 Obi-Wan … NA NA auburn, w… fair blue-gray 57 male
#> # ℹ 77 more rows
#> # ℹ 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
#> # films <list>, vehicles <list>, starships <list>
# You can pass additional arguments to the function:
starwars %>% mutate_at(c("height", "mass"), scale2, na.rm = TRUE)
#> # A tibble: 87 × 14
#> name height mass hair_color skin_color eye_color birth_year sex
#> <chr> <dbl> <dbl> <chr> <chr> <chr> <dbl> <chr>
#> 1 Luke … -0.0749 -0.120 blond fair blue 19 male
#> 2 C-3PO -0.219 -0.132 NA gold yellow 112 none
#> 3 R2-D2 -2.26 -0.385 NA white, bl… red 33 none
#> 4 Darth… 0.788 0.228 none white yellow 41.9 male
#> 5 Leia … -0.708 -0.285 brown light brown 19 fema…
#> 6 Owen … 0.0976 0.134 brown, gr… light blue 52 male
#> 7 Beru … -0.276 -0.132 brown light blue 47 fema…
#> 8 R5-D4 -2.23 -0.385 NA white, red red NA none
#> 9 Biggs… 0.241 -0.0786 black light brown 24 male
#> 10 Obi-W… 0.213 -0.120 auburn, w… fair blue-gray 57 male
#> # ℹ 77 more rows
#> # ℹ 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
#> # films <list>, vehicles <list>, starships <list>
starwars %>% mutate_at(c("height", "mass"), ~scale2(., na.rm = TRUE))
#> # A tibble: 87 × 14
#> name height mass hair_color skin_color eye_color birth_year sex
#> <chr> <dbl> <dbl> <chr> <chr> <chr> <dbl> <chr>
#> 1 Luke … -0.0749 -0.120 blond fair blue 19 male
#> 2 C-3PO -0.219 -0.132 NA gold yellow 112 none
#> 3 R2-D2 -2.26 -0.385 NA white, bl… red 33 none
#> 4 Darth… 0.788 0.228 none white yellow 41.9 male
#> 5 Leia … -0.708 -0.285 brown light brown 19 fema…
#> 6 Owen … 0.0976 0.134 brown, gr… light blue 52 male
#> 7 Beru … -0.276 -0.132 brown light blue 47 fema…
#> 8 R5-D4 -2.23 -0.385 NA white, red red NA none
#> 9 Biggs… 0.241 -0.0786 black light brown 24 male
#> 10 Obi-W… 0.213 -0.120 auburn, w… fair blue-gray 57 male
#> # ℹ 77 more rows
#> # ℹ 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
#> # films <list>, vehicles <list>, starships <list>
# ->
starwars %>% mutate(across(c("height", "mass"), ~ scale2(.x, na.rm = TRUE)))
#> # A tibble: 87 × 14
#> name height mass hair_color skin_color eye_color birth_year sex
#> <chr> <dbl> <dbl> <chr> <chr> <chr> <dbl> <chr>
#> 1 Luke … -0.0749 -0.120 blond fair blue 19 male
#> 2 C-3PO -0.219 -0.132 NA gold yellow 112 none
#> 3 R2-D2 -2.26 -0.385 NA white, bl… red 33 none
#> 4 Darth… 0.788 0.228 none white yellow 41.9 male
#> 5 Leia … -0.708 -0.285 brown light brown 19 fema…
#> 6 Owen … 0.0976 0.134 brown, gr… light blue 52 male
#> 7 Beru … -0.276 -0.132 brown light blue 47 fema…
#> 8 R5-D4 -2.23 -0.385 NA white, red red NA none
#> 9 Biggs… 0.241 -0.0786 black light brown 24 male
#> 10 Obi-W… 0.213 -0.120 auburn, w… fair blue-gray 57 male
#> # ℹ 77 more rows
#> # ℹ 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
#> # films <list>, vehicles <list>, starships <list>
# You can also supply selection helpers to _at() functions but you have
# to quote them with vars():
iris %>% mutate_at(vars(matches("Sepal")), log)
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 1.63 1.25 1.4 0.2 setosa
#> 2 1.59 1.10 1.4 0.2 setosa
#> 3 1.55 1.16 1.3 0.2 setosa
#> 4 1.53 1.13 1.5 0.2 setosa
#> 5 1.61 1.28 1.4 0.2 setosa
#> 6 1.69 1.36 1.7 0.4 setosa
#> 7 1.53 1.22 1.4 0.3 setosa
#> 8 1.61 1.22 1.5 0.2 setosa
#> 9 1.48 1.06 1.4 0.2 setosa
#> 10 1.59 1.13 1.5 0.1 setosa
#> # ℹ 140 more rows
iris %>% mutate(across(matches("Sepal"), log))
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 1.63 1.25 1.4 0.2 setosa
#> 2 1.59 1.10 1.4 0.2 setosa
#> 3 1.55 1.16 1.3 0.2 setosa
#> 4 1.53 1.13 1.5 0.2 setosa
#> 5 1.61 1.28 1.4 0.2 setosa
#> 6 1.69 1.36 1.7 0.4 setosa
#> 7 1.53 1.22 1.4 0.3 setosa
#> 8 1.61 1.22 1.5 0.2 setosa
#> 9 1.48 1.06 1.4 0.2 setosa
#> 10 1.59 1.13 1.5 0.1 setosa
#> # ℹ 140 more rows
# The _if() variants apply a predicate function (a function that
# returns TRUE or FALSE) to determine the relevant subset of
# columns. Here we scale all the numeric columns:
starwars %>% mutate_if(is.numeric, scale2, na.rm = TRUE)
#> # A tibble: 87 × 14
#> name height mass hair_color skin_color eye_color birth_year sex
#> <chr> <dbl> <dbl> <chr> <chr> <chr> <dbl> <chr>
#> 1 Luke … -0.0749 -0.120 blond fair blue -0.443 male
#> 2 C-3PO -0.219 -0.132 NA gold yellow 0.158 none
#> 3 R2-D2 -2.26 -0.385 NA white, bl… red -0.353 none
#> 4 Darth… 0.788 0.228 none white yellow -0.295 male
#> 5 Leia … -0.708 -0.285 brown light brown -0.443 fema…
#> 6 Owen … 0.0976 0.134 brown, gr… light blue -0.230 male
#> 7 Beru … -0.276 -0.132 brown light blue -0.262 fema…
#> 8 R5-D4 -2.23 -0.385 NA white, red red NA none
#> 9 Biggs… 0.241 -0.0786 black light brown -0.411 male
#> 10 Obi-W… 0.213 -0.120 auburn, w… fair blue-gray -0.198 male
#> # ℹ 77 more rows
#> # ℹ 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
#> # films <list>, vehicles <list>, starships <list>
starwars %>% mutate(across(where(is.numeric), ~ scale2(.x, na.rm = TRUE)))
#> # A tibble: 87 × 14
#> name height mass hair_color skin_color eye_color birth_year sex
#> <chr> <dbl> <dbl> <chr> <chr> <chr> <dbl> <chr>
#> 1 Luke … -0.0749 -0.120 blond fair blue -0.443 male
#> 2 C-3PO -0.219 -0.132 NA gold yellow 0.158 none
#> 3 R2-D2 -2.26 -0.385 NA white, bl… red -0.353 none
#> 4 Darth… 0.788 0.228 none white yellow -0.295 male
#> 5 Leia … -0.708 -0.285 brown light brown -0.443 fema…
#> 6 Owen … 0.0976 0.134 brown, gr… light blue -0.230 male
#> 7 Beru … -0.276 -0.132 brown light blue -0.262 fema…
#> 8 R5-D4 -2.23 -0.385 NA white, red red NA none
#> 9 Biggs… 0.241 -0.0786 black light brown -0.411 male
#> 10 Obi-W… 0.213 -0.120 auburn, w… fair blue-gray -0.198 male
#> # ℹ 77 more rows
#> # ℹ 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
#> # films <list>, vehicles <list>, starships <list>
# mutate_if() is particularly useful for transforming variables from
# one type to another
iris %>% mutate_if(is.factor, as.character)
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # ℹ 140 more rows
iris %>% mutate_if(is.double, as.integer)
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <int> <int> <int> <int> <fct>
#> 1 5 3 1 0 setosa
#> 2 4 3 1 0 setosa
#> 3 4 3 1 0 setosa
#> 4 4 3 1 0 setosa
#> 5 5 3 1 0 setosa
#> 6 5 3 1 0 setosa
#> 7 4 3 1 0 setosa
#> 8 5 3 1 0 setosa
#> 9 4 2 1 0 setosa
#> 10 4 3 1 0 setosa
#> # ℹ 140 more rows
# ->
iris %>% mutate(across(where(is.factor), as.character))
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # ℹ 140 more rows
iris %>% mutate(across(where(is.double), as.integer))
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <int> <int> <int> <int> <fct>
#> 1 5 3 1 0 setosa
#> 2 4 3 1 0 setosa
#> 3 4 3 1 0 setosa
#> 4 4 3 1 0 setosa
#> 5 5 3 1 0 setosa
#> 6 5 3 1 0 setosa
#> 7 4 3 1 0 setosa
#> 8 5 3 1 0 setosa
#> 9 4 2 1 0 setosa
#> 10 4 3 1 0 setosa
#> # ℹ 140 more rows
# Multiple transformations ----------------------------------------
# If you want to apply multiple transformations, pass a list of
# functions. When there are multiple functions, they create new
# variables instead of modifying the variables in place:
iris %>% mutate_if(is.numeric, list(scale2, log))
#> # A tibble: 150 × 13
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # ℹ 140 more rows
#> # ℹ 8 more variables: Sepal.Length_fn1 <dbl>, Sepal.Width_fn1 <dbl>,
#> # Petal.Length_fn1 <dbl>, Petal.Width_fn1 <dbl>,
#> # Sepal.Length_fn2 <dbl>, Sepal.Width_fn2 <dbl>,
#> # Petal.Length_fn2 <dbl>, Petal.Width_fn2 <dbl>
iris %>% mutate_if(is.numeric, list(~scale2(.), ~log(.)))
#> # A tibble: 150 × 13
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # ℹ 140 more rows
#> # ℹ 8 more variables: Sepal.Length_scale2 <dbl>,
#> # Sepal.Width_scale2 <dbl>, Petal.Length_scale2 <dbl>,
#> # Petal.Width_scale2 <dbl>, Sepal.Length_log <dbl>,
#> # Sepal.Width_log <dbl>, Petal.Length_log <dbl>, Petal.Width_log <dbl>
iris %>% mutate_if(is.numeric, list(scale = scale2, log = log))
#> # A tibble: 150 × 13
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # ℹ 140 more rows
#> # ℹ 8 more variables: Sepal.Length_scale <dbl>, Sepal.Width_scale <dbl>,
#> # Petal.Length_scale <dbl>, Petal.Width_scale <dbl>,
#> # Sepal.Length_log <dbl>, Sepal.Width_log <dbl>,
#> # Petal.Length_log <dbl>, Petal.Width_log <dbl>
# ->
iris %>%
  as_tibble() %>%
  mutate(across(where(is.numeric), list(scale = scale2, log = log)))
#> # A tibble: 150 × 13
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # ℹ 140 more rows
#> # ℹ 8 more variables: Sepal.Length_scale <dbl>, Sepal.Length_log <dbl>,
#> # Sepal.Width_scale <dbl>, Sepal.Width_log <dbl>,
#> # Petal.Length_scale <dbl>, Petal.Length_log <dbl>,
#> # Petal.Width_scale <dbl>, Petal.Width_log <dbl>
# When there's only one function in the list, it modifies existing
# variables in place. Give it a name to instead create new variables:
iris %>% mutate_if(is.numeric, list(scale2))
#> # A tibble: 150 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 -0.898 1.02 -1.34 -1.31 setosa
#> 2 -1.14 -0.132 -1.34 -1.31 setosa
#> 3 -1.38 0.327 -1.39 -1.31 setosa
#> 4 -1.50 0.0979 -1.28 -1.31 setosa
#> 5 -1.02 1.25 -1.34 -1.31 setosa
#> 6 -0.535 1.93 -1.17 -1.05 setosa
#> 7 -1.50 0.786 -1.34 -1.18 setosa
#> 8 -1.02 0.786 -1.28 -1.31 setosa
#> 9 -1.74 -0.361 -1.34 -1.31 setosa
#> 10 -1.14 0.0979 -1.28 -1.44 setosa
#> # ℹ 140 more rows
iris %>% mutate_if(is.numeric, list(scale = scale2))
#> # A tibble: 150 × 9
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # ℹ 140 more rows
#> # ℹ 4 more variables: Sepal.Length_scale <dbl>, Sepal.Width_scale <dbl>,
#> # Petal.Length_scale <dbl>, Petal.Width_scale <dbl>