select() keeps only the variables you mention; rename() keeps all variables.

select(.data, ...)

rename(.data, ...)

Arguments

.data

A tbl. All main verbs are S3 generics and provide methods for tbl_df(), dtplyr::tbl_dt() and dbplyr::tbl_dbi().

...

One or more unquoted expressions separated by commas. You can treat variable names like they are positions.

Positive values select variables; negative values drop variables. If the first expression is negative, select() will automatically start with all variables.

Use named arguments to rename selected variables.

These arguments are automatically quoted and evaluated in a context where column names represent column positions. They support unquoting and splicing. See vignette("programming") for an introduction to these concepts.
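The rules above can be sketched on the built-in mtcars data set. This is a minimal illustration rather than part of the original examples; it assumes dplyr is attached and rlang is installed, and the names by_name, by_position, dropped, renamed, col and unquoted are hypothetical.

```r
library(dplyr)

# Column names behave like positions, so a range works by name or by number.
by_name     <- select(mtcars, mpg:disp)
by_position <- select(mtcars, 1:3)

# A leading negative expression starts from all columns and drops the named ones.
dropped <- select(mtcars, -(mpg:disp))

# A named argument renames the column it selects.
renamed <- select(mtcars, miles_per_gallon = mpg, cyl)

# Unquoting: `col` is a symbol built outside the call and injected with !!.
col <- rlang::sym("hp")
unquoted <- select(mtcars, !!col)

names(dropped)
```

Both range forms should pick the same three columns, since mpg, cyl and disp are the first three variables of mtcars.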

Value

An object of the same class as .data.

Useful functions

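As well as using existing functions like : and c(), there are a number of special functions that only work inside select(), including starts_with(), ends_with(), contains(), matches(), num_range(), one_of() and everything().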

To drop variables, use -.

Note that except for :, - and c(), all complex expressions are evaluated outside the data frame context. This is to prevent accidental matching of data frame variables when you refer to variables from the calling context.
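A short sketch of helper-based selection follows. It assumes dplyr is attached; contains(), matches() and one_of() are dplyr selection helpers, and `wanted` is a hypothetical character vector defined in the calling context to show how outside variables can be passed in without clashing with column names.

```r
library(dplyr)

# Helpers match on column names, not on column contents.
with_sepal <- select(iris, contains("Sepal"))      # names containing "Sepal"
petal_cols <- select(iris, matches("^Petal\\."))   # names matching a regular expression
no_species <- select(iris, -Species)               # every column except Species

# `wanted` lives outside the data frame; one_of() looks its values up
# among the column names.
wanted <- c("Species", "Petal.Width")
picked <- select(iris, one_of(wanted))
names(picked)
```

Wrapping the outside vector in a helper such as one_of() keeps the intent explicit: the values are treated as column names rather than as a column of the data.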

Scoped selection and renaming

The three scoped variants of select() (select_all(), select_if() and select_at()) and the three scoped variants of rename() (rename_all(), rename_if() and rename_at()) make it easy to apply a renaming function to a selection of variables.
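The scoped variants might be used as in the sketch below. It assumes dplyr is attached and that the installed version still provides these functions (later releases may mark them as superseded); the result names are hypothetical.

```r
library(dplyr)

# select_if(): keep the columns for which a predicate returns TRUE.
numeric_only <- select_if(iris, is.numeric)

# rename_all(): apply a function to every column name, keeping all columns.
upper <- rename_all(iris, toupper)

# rename_at(): rename only the columns picked with vars(); the rest are untouched.
petal_lower <- rename_at(iris, vars(starts_with("Petal")), tolower)

# select_at(): keep only the picked columns, renaming them on the way.
petal_only <- select_at(iris, vars(starts_with("Petal")), tolower)

names(petal_lower)
```

As with select() and rename() themselves, the select_*() variants drop unselected columns while the rename_*() variants keep every column.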

Tidy data

When applied to a data frame, row names are silently dropped. To preserve them, convert them to an explicit variable with tibble::rownames_to_column().
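One way to keep row names is sketched here, assuming dplyr is attached and tibble is installed. The column name "model" is an arbitrary choice for this illustration.

```r
library(dplyr)

# mtcars stores the car names as row names; copy them into a real column first.
cars <- tibble::rownames_to_column(mtcars, var = "model")

# The names now travel with the data like any other variable.
model_cyl <- select(cars, model, cyl)
head(model_cyl, 3)
```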

See also

Other single table verbs: arrange, filter, mutate, slice, summarise

Examples

iris <- as_tibble(iris) # so it prints a little nicer
select(iris, starts_with("Petal"))
#> # A tibble: 150 x 2
#>    Petal.Length Petal.Width
#>           <dbl>       <dbl>
#>  1          1.4         0.2
#>  2          1.4         0.2
#>  3          1.3         0.2
#>  4          1.5         0.2
#>  5          1.4         0.2
#>  6          1.7         0.4
#>  7          1.4         0.3
#>  8          1.5         0.2
#>  9          1.4         0.2
#> 10          1.5         0.1
#> # ... with 140 more rows
select(iris, ends_with("Width"))
#> # A tibble: 150 x 2
#>    Sepal.Width Petal.Width
#>          <dbl>       <dbl>
#>  1         3.5         0.2
#>  2         3.0         0.2
#>  3         3.2         0.2
#>  4         3.1         0.2
#>  5         3.6         0.2
#>  6         3.9         0.4
#>  7         3.4         0.3
#>  8         3.4         0.2
#>  9         2.9         0.2
#> 10         3.1         0.1
#> # ... with 140 more rows
# Move Species variable to the front
select(iris, Species, everything())
#> # A tibble: 150 x 5
#>    Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#>     <fctr>        <dbl>       <dbl>        <dbl>       <dbl>
#>  1  setosa          5.1         3.5          1.4         0.2
#>  2  setosa          4.9         3.0          1.4         0.2
#>  3  setosa          4.7         3.2          1.3         0.2
#>  4  setosa          4.6         3.1          1.5         0.2
#>  5  setosa          5.0         3.6          1.4         0.2
#>  6  setosa          5.4         3.9          1.7         0.4
#>  7  setosa          4.6         3.4          1.4         0.3
#>  8  setosa          5.0         3.4          1.5         0.2
#>  9  setosa          4.4         2.9          1.4         0.2
#> 10  setosa          4.9         3.1          1.5         0.1
#> # ... with 140 more rows
df <- as.data.frame(matrix(runif(100), nrow = 10))
df <- tbl_df(df[c(3, 4, 7, 1, 9, 8, 5, 2, 6, 10)])
select(df, V4:V6)
#> # A tibble: 10 x 8
#>            V4         V7        V1        V9          V8         V5         V2
#>         <dbl>      <dbl>     <dbl>     <dbl>       <dbl>      <dbl>      <dbl>
#>  1 0.87959719 0.00250323 0.6477019 0.6523589 0.789947051 0.26714756 0.79909120
#>  2 0.37491998 0.84770109 0.0647610 0.6675703 0.004801874 0.60840493 0.13193857
#>  3 0.43416063 0.66004509 0.3374612 0.1980866 0.252908190 0.77196657 0.54759937
#>  4 0.80684989 0.29450061 0.5698746 0.8681269 0.563169963 0.51975041 0.09701772
#>  5 0.63560582 0.57010194 0.3383233 0.1732694 0.952278358 0.86047988 0.98637755
#>  6 0.06871643 0.83903090 0.4992534 0.2049636 0.685113066 0.79302051 0.83954213
#>  7 0.14484133 0.63062556 0.3084174 0.2171490 0.470041074 0.78391104 0.39959635
#>  8 0.29457026 0.49672970 0.8404281 0.1508981 0.289851467 0.42372128 0.95222495
#>  9 0.04144781 0.38952535 0.0651790 0.7745374 0.510124689 0.01551256 0.05949630
#> 10 0.11654699 0.73144890 0.4399482 0.5330259 0.750690085 0.41620118 0.27691599
#> # ... with 1 more variables: V6 <dbl>
select(df, num_range("V", 4:6))
#> # A tibble: 10 x 3
#>            V4         V5          V6
#>         <dbl>      <dbl>       <dbl>
#>  1 0.87959719 0.26714756 0.006950771
#>  2 0.37491998 0.60840493 0.390183412
#>  3 0.43416063 0.77196657 0.238652540
#>  4 0.80684989 0.51975041 0.384401472
#>  5 0.63560582 0.86047988 0.073094374
#>  6 0.06871643 0.79302051 0.766595044
#>  7 0.14484133 0.78391104 0.687511492
#>  8 0.29457026 0.42372128 0.841570623
#>  9 0.04144781 0.01551256 0.105449455
#> 10 0.11654699 0.41620118 0.340091229
# Drop variables with -
select(iris, -starts_with("Petal"))
#> # A tibble: 150 x 3
#>    Sepal.Length Sepal.Width Species
#>           <dbl>       <dbl>  <fctr>
#>  1          5.1         3.5  setosa
#>  2          4.9         3.0  setosa
#>  3          4.7         3.2  setosa
#>  4          4.6         3.1  setosa
#>  5          5.0         3.6  setosa
#>  6          5.4         3.9  setosa
#>  7          4.6         3.4  setosa
#>  8          5.0         3.4  setosa
#>  9          4.4         2.9  setosa
#> 10          4.9         3.1  setosa
#> # ... with 140 more rows
# The .data pronoun is available:
select(mtcars, .data$cyl)
#>                     cyl
#> Mazda RX4             6
#> Mazda RX4 Wag         6
#> Datsun 710            4
#> Hornet 4 Drive        6
#> Hornet Sportabout     8
#> Valiant               6
#> Duster 360            8
#> Merc 240D             4
#> Merc 230              4
#> Merc 280              6
#> Merc 280C             6
#> Merc 450SE            8
#> Merc 450SL            8
#> Merc 450SLC           8
#> Cadillac Fleetwood    8
#> Lincoln Continental   8
#> Chrysler Imperial     8
#> Fiat 128              4
#> Honda Civic           4
#> Toyota Corolla        4
#> Toyota Corona         4
#> Dodge Challenger      8
#> AMC Javelin           8
#> Camaro Z28            8
#> Pontiac Firebird      8
#> Fiat X1-9             4
#> Porsche 914-2         4
#> Lotus Europa          4
#> Ford Pantera L        8
#> Ferrari Dino          6
#> Maserati Bora         8
#> Volvo 142E            4
select(mtcars, .data$mpg : .data$disp)
#>                      mpg cyl  disp
#> Mazda RX4           21.0   6 160.0
#> Mazda RX4 Wag       21.0   6 160.0
#> Datsun 710          22.8   4 108.0
#> Hornet 4 Drive      21.4   6 258.0
#> Hornet Sportabout   18.7   8 360.0
#> Valiant             18.1   6 225.0
#> Duster 360          14.3   8 360.0
#> Merc 240D           24.4   4 146.7
#> Merc 230            22.8   4 140.8
#> Merc 280            19.2   6 167.6
#> Merc 280C           17.8   6 167.6
#> Merc 450SE          16.4   8 275.8
#> Merc 450SL          17.3   8 275.8
#> Merc 450SLC         15.2   8 275.8
#> Cadillac Fleetwood  10.4   8 472.0
#> Lincoln Continental 10.4   8 460.0
#> Chrysler Imperial   14.7   8 440.0
#> Fiat 128            32.4   4  78.7
#> Honda Civic         30.4   4  75.7
#> Toyota Corolla      33.9   4  71.1
#> Toyota Corona       21.5   4 120.1
#> Dodge Challenger    15.5   8 318.0
#> AMC Javelin         15.2   8 304.0
#> Camaro Z28          13.3   8 350.0
#> Pontiac Firebird    19.2   8 400.0
#> Fiat X1-9           27.3   4  79.0
#> Porsche 914-2       26.0   4 120.3
#> Lotus Europa        30.4   4  95.1
#> Ford Pantera L      15.8   8 351.0
#> Ferrari Dino        19.7   6 145.0
#> Maserati Bora       15.0   8 301.0
#> Volvo 142E          21.4   4 121.0
# However it isn't available within calls since those are evaluated
# outside of the data context. This would fail if run:
# select(mtcars, identical(.data$cyl))

# Renaming -----------------------------------------
# * select() keeps only the variables you specify
select(iris, petal_length = Petal.Length)
#> # A tibble: 150 x 1
#>    petal_length
#>           <dbl>
#>  1          1.4
#>  2          1.4
#>  3          1.3
#>  4          1.5
#>  5          1.4
#>  6          1.7
#>  7          1.4
#>  8          1.5
#>  9          1.4
#> 10          1.5
#> # ... with 140 more rows
# * rename() keeps all variables
rename(iris, petal_length = Petal.Length)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width petal_length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
#>  1          5.1         3.5          1.4         0.2  setosa
#>  2          4.9         3.0          1.4         0.2  setosa
#>  3          4.7         3.2          1.3         0.2  setosa
#>  4          4.6         3.1          1.5         0.2  setosa
#>  5          5.0         3.6          1.4         0.2  setosa
#>  6          5.4         3.9          1.7         0.4  setosa
#>  7          4.6         3.4          1.4         0.3  setosa
#>  8          5.0         3.4          1.5         0.2  setosa
#>  9          4.4         2.9          1.4         0.2  setosa
#> 10          4.9         3.1          1.5         0.1  setosa
#> # ... with 140 more rows