These verbs are scoped variants of summarise(), mutate() and transmute(). They apply operations on a selection of variables.

  • summarise_all(), mutate_all() and transmute_all() apply the functions to all (non-grouping) columns.

  • summarise_at(), mutate_at() and transmute_at() allow you to select columns using the same name-based select_helpers as select().

  • summarise_if(), mutate_if() and transmute_if() operate on columns for which a predicate returns TRUE.

summarise_all(.tbl, .funs, ...)

summarise_if(.tbl, .predicate, .funs, ...)

summarise_at(.tbl, .vars, .funs, ..., .cols = NULL)

summarize_all(.tbl, .funs, ...)

summarize_if(.tbl, .predicate, .funs, ...)

summarize_at(.tbl, .vars, .funs, ..., .cols = NULL)

mutate_all(.tbl, .funs, ...)

mutate_if(.tbl, .predicate, .funs, ...)

mutate_at(.tbl, .vars, .funs, ..., .cols = NULL)

transmute_all(.tbl, .funs, ...)

transmute_if(.tbl, .predicate, .funs, ...)

transmute_at(.tbl, .vars, .funs, ..., .cols = NULL)

Arguments

.tbl

A tbl object.

.funs

List of function calls generated by funs(), or a character vector of function names, or simply a function.

Bare formulas are passed to rlang::as_function() to create purrr-style lambda functions. Note that these lambdas prevent hybrid evaluation from happening, so it is more efficient to supply functions like mean() directly rather than in a lambda-formula.
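The forms that .funs accepts can be compared side by side. The following is a minimal sketch, assuming dplyr is attached and a version in which the scoped verbs are available; the object names (measurements, by_fun, by_name, by_lambda) are illustrative only and not part of dplyr.

```r
library(dplyr)

# Keep two numeric columns so that mean() applies to every column.
measurements <- iris[, c("Sepal.Length", "Sepal.Width")]

# 1. A function passed directly.
by_fun <- summarise_all(measurements, mean)

# 2. A character vector of function names.
by_name <- summarise_all(measurements, "mean")

# 3. A bare formula, converted to a lambda in which `.` is the column.
by_lambda <- summarise_all(measurements, ~ mean(.))

# All three forms compute the same column means.
all.equal(unlist(by_fun, use.names = FALSE), unlist(by_lambda, use.names = FALSE))
```

Given the note on hybrid evaluation above, the first form is likely the better default when a plain function is enough; the formula form is mainly useful when the transformation needs an inline expression.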

...

Additional arguments for the function calls in .funs. These are evaluated only once, with tidy dots support.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.
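As a rough illustration of the kinds of .predicate described above, the sketch below selects columns of iris in three ways. It assumes dplyr is attached; the object names (num_means, first_two, long_cols) and the threshold of 5 are made up for the example.

```r
library(dplyr)

# A predicate function: columns for which is.numeric() returns TRUE.
num_means <- summarise_if(iris, is.numeric, mean)

# A logical vector with one entry per column: only the first two columns.
first_two <- summarise_if(iris, c(TRUE, TRUE, FALSE, FALSE, FALSE), mean)

# A formula lambda: numeric columns whose maximum exceeds 5.
# `&&` short-circuits, so max() is never called on the factor column.
long_cols <- summarise_if(iris, ~ is.numeric(.) && max(.) > 5, mean)

names(long_cols)
```

The logical vector must have one element per non-grouping column, so it is the most brittle of the three forms if the column order changes.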

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.
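The three non-NULL forms of .vars can select the same columns. A minimal sketch, assuming dplyr is attached; the object names (by_helper, by_name, by_position) are illustrative only.

```r
library(dplyr)

# vars() with a selection helper.
by_helper <- mutate_at(iris, vars(starts_with("Sepal")), log)

# A character vector of column names.
by_name <- mutate_at(iris, c("Sepal.Length", "Sepal.Width"), log)

# A numeric vector of column positions.
by_position <- mutate_at(iris, 1:2, log)

# Each call log-transforms the same two columns.
identical(by_helper, by_name) && identical(by_name, by_position)
```

Names or vars() are usually safer than positions, since positions silently point at different columns if the table layout changes.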

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Value

A data frame. By default, the newly created columns have the shortest names needed to uniquely identify the output. To force inclusion of a name, even when not needed, name the input (see examples for details).

Examples

# The scoped variants of summarise() and mutate() make it easy to
# apply the same transformation to multiple variables:
iris %>%
  group_by(Species) %>%
  summarise_all(mean)
#> # A tibble: 3 x 5
#>   Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>   <fct>             <dbl>       <dbl>        <dbl>       <dbl>
#> 1 setosa             5.01        3.43         1.46       0.246
#> 2 versicolor         5.94        2.77         4.26       1.33
#> 3 virginica          6.59        2.97         5.55       2.03
# There are three variants.
# * _all affects every variable
# * _at affects variables selected with a character vector or vars()
# * _if affects variables selected with a predicate function:

# The _at() variants directly support strings:
starwars %>% summarise_at(c("height", "mass"), mean, na.rm = TRUE)
#> # A tibble: 1 x 2
#>   height  mass
#>    <dbl> <dbl>
#> 1   174.  97.3
# You can also supply selection helpers to _at() functions but you have
# to quote them with vars():
iris %>% mutate_at(vars(matches("Sepal")), log)
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1      1.629241   1.2527630          1.4         0.2  setosa
#> 2      1.589235   1.0986123          1.4         0.2  setosa
#> 3      1.547563   1.1631508          1.3         0.2  setosa
#> 4      1.526056   1.1314021          1.5         0.2  setosa
#> 5      1.609438   1.2809338          1.4         0.2  setosa
#> 6      1.686399   1.3609766          1.7         0.4  setosa
#> 7      1.526056   1.2237754          1.4         0.3  setosa
#> 8      1.609438   1.2237754          1.5         0.2  setosa
#> 9      1.481605   1.0647107          1.4         0.2  setosa
#> 10     1.589235   1.1314021          1.5         0.1  setosa
#> 11     1.686399   1.3083328          1.5         0.2  setosa
#> 12     1.568616   1.2237754          1.6         0.2  setosa
#> 13     1.568616   1.0986123          1.4         0.1  setosa
#> 14     1.458615   1.0986123          1.1         0.1  setosa
#> 15     1.757858   1.3862944          1.2         0.2  setosa
#> 16     1.740466   1.4816045          1.5         0.4  setosa
#> 17     1.686399   1.3609766          1.3         0.4  setosa
#> 18     1.629241   1.2527630          1.4         0.3  setosa
#> 19     1.740466   1.3350011          1.7         0.3  setosa
#> 20     1.629241   1.3350011          1.5         0.3  setosa
#>  [ reached getOption("max.print") -- omitted 130 rows ]
starwars %>% summarise_at(vars(height:mass), mean, na.rm = TRUE)
#> # A tibble: 1 x 2
#>   height  mass
#>    <dbl> <dbl>
#> 1   174.  97.3
# The _if() variants apply a predicate function (a function that
# returns TRUE or FALSE) to determine the relevant subset of
# columns. Here we apply mean() to the numeric columns:
starwars %>% summarise_if(is.numeric, mean, na.rm = TRUE)
#> # A tibble: 1 x 3
#>   height  mass birth_year
#>    <dbl> <dbl>      <dbl>
#> 1   174.  97.3       87.6
# mutate_if() is particularly useful for transforming variables from
# one type to another
iris %>% as_tibble() %>% mutate_if(is.factor, as.character)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
#>  1          5.1         3.5          1.4         0.2 setosa
#>  2          4.9         3            1.4         0.2 setosa
#>  3          4.7         3.2          1.3         0.2 setosa
#>  4          4.6         3.1          1.5         0.2 setosa
#>  5          5           3.6          1.4         0.2 setosa
#>  6          5.4         3.9          1.7         0.4 setosa
#>  7          4.6         3.4          1.4         0.3 setosa
#>  8          5           3.4          1.5         0.2 setosa
#>  9          4.4         2.9          1.4         0.2 setosa
#> 10          4.9         3.1          1.5         0.1 setosa
#> # ... with 140 more rows
iris %>% as_tibble() %>% mutate_if(is.double, as.integer)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <int>       <int>        <int>       <int> <fct>
#>  1            5           3            1           0 setosa
#>  2            4           3            1           0 setosa
#>  3            4           3            1           0 setosa
#>  4            4           3            1           0 setosa
#>  5            5           3            1           0 setosa
#>  6            5           3            1           0 setosa
#>  7            4           3            1           0 setosa
#>  8            5           3            1           0 setosa
#>  9            4           2            1           0 setosa
#> 10            4           3            1           0 setosa
#> # ... with 140 more rows
# ---------------------------------------------------------------------------
# If you want to apply multiple transformations, use funs()
by_species <- iris %>% group_by(Species)
by_species %>% summarise_all(funs(min, max))
#> # A tibble: 3 x 9
#>   Species    Sepal.Length_min Sepal.Width_min Petal.Length_min Petal.Width_min
#>   <fct>                 <dbl>           <dbl>            <dbl>           <dbl>
#> 1 setosa                  4.3             2.3              1               0.1
#> 2 versicolor              4.9             2                3               1
#> 3 virginica               4.9             2.2              4.5             1.4
#> # ... with 4 more variables: Sepal.Length_max <dbl>, Sepal.Width_max <dbl>,
#> #   Petal.Length_max <dbl>, Petal.Width_max <dbl>
# Note that the output variable names now include the function name, in
# order to keep things distinct.

# You can express more complex inline transformations using .
by_species %>% mutate_all(funs(. / 2.54))
#> # A tibble: 150 x 5
#> # Groups:   Species [3]
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>
#>  1         2.01        1.38        0.551      0.0787 setosa
#>  2         1.93        1.18        0.551      0.0787 setosa
#>  3         1.85        1.26        0.512      0.0787 setosa
#>  4         1.81        1.22        0.591      0.0787 setosa
#>  5         1.97        1.42        0.551      0.0787 setosa
#>  6         2.13        1.54        0.669      0.157  setosa
#>  7         1.81        1.34        0.551      0.118  setosa
#>  8         1.97        1.34        0.591      0.0787 setosa
#>  9         1.73        1.14        0.551      0.0787 setosa
#> 10         1.93        1.22        0.591      0.0394 setosa
#> # ... with 140 more rows
# Function names will be included if .funs has names or multiple inputs
by_species %>% mutate_all(funs(inches = . / 2.54))
#> # A tibble: 150 x 9
#> # Groups:   Species [3]
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_inch…
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>                <dbl>
#>  1          5.1         3.5          1.4         0.2 setosa                2.01
#>  2          4.9         3            1.4         0.2 setosa                1.93
#>  3          4.7         3.2          1.3         0.2 setosa                1.85
#>  4          4.6         3.1          1.5         0.2 setosa                1.81
#>  5          5           3.6          1.4         0.2 setosa                1.97
#>  6          5.4         3.9          1.7         0.4 setosa                2.13
#>  7          4.6         3.4          1.4         0.3 setosa                1.81
#>  8          5           3.4          1.5         0.2 setosa                1.97
#>  9          4.4         2.9          1.4         0.2 setosa                1.73
#> 10          4.9         3.1          1.5         0.1 setosa                1.93
#> # ... with 140 more rows, and 3 more variables: Sepal.Width_inches <dbl>,
#> #   Petal.Length_inches <dbl>, Petal.Width_inches <dbl>
by_species %>% summarise_all(funs(med = median))
#> # A tibble: 3 x 5
#>   Species    Sepal.Length_med Sepal.Width_med Petal.Length_med Petal.Width_med
#>   <fct>                 <dbl>           <dbl>            <dbl>           <dbl>
#> 1 setosa                  5               3.4             1.5              0.2
#> 2 versicolor              5.9             2.8             4.35             1.3
#> 3 virginica               6.5             3               5.55             2
by_species %>% summarise_all(funs(Q3 = quantile), probs = 0.75)
#> # A tibble: 3 x 5
#>   Species    Sepal.Length_Q3 Sepal.Width_Q3 Petal.Length_Q3 Petal.Width_Q3
#>   <fct>                <dbl>          <dbl>           <dbl>          <dbl>
#> 1 setosa                 5.2           3.68            1.58            0.3
#> 2 versicolor             6.3           3               4.6             1.5
#> 3 virginica              6.9           3.18            5.88            2.3
by_species %>% summarise_all(c("min", "max"))
#> # A tibble: 3 x 9
#>   Species    Sepal.Length_min Sepal.Width_min Petal.Length_min Petal.Width_min
#>   <fct>                 <dbl>           <dbl>            <dbl>           <dbl>
#> 1 setosa                  4.3             2.3              1               0.1
#> 2 versicolor              4.9             2                3               1
#> 3 virginica               4.9             2.2              4.5             1.4
#> # ... with 4 more variables: Sepal.Length_max <dbl>, Sepal.Width_max <dbl>,
#> #   Petal.Length_max <dbl>, Petal.Width_max <dbl>