slice()
lets you index rows by their (integer) locations. It allows you
to select, remove, and duplicate rows. It is accompanied by a number of
helpers for common use cases:
slice_head()
andslice_tail()
select the first or last rows.slice_sample()
randomly selects rows.slice_min()
andslice_max()
select rows with the smallest or largest values of a variable.
If .data
is a grouped_df, the operation will be performed on each group,
so that (e.g.) slice_head(df, n = 5)
will select the first five rows in
each group.
Usage
slice(.data, ..., .by = NULL, .preserve = FALSE)
slice_head(.data, ..., n, prop, by = NULL)
slice_tail(.data, ..., n, prop, by = NULL)
slice_min(
.data,
order_by,
...,
n,
prop,
by = NULL,
with_ties = TRUE,
na_rm = FALSE
)
slice_max(
.data,
order_by,
...,
n,
prop,
by = NULL,
with_ties = TRUE,
na_rm = FALSE
)
slice_sample(.data, ..., n, prop, by = NULL, weight_by = NULL, replace = FALSE)
Arguments
- .data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
- ...
For
slice()
: <data-masking
> Integer row values.Provide either positive values to keep, or negative values to drop. The values provided must be either all positive or all negative. Indices beyond the number of rows in the input are silently ignored.
For
slice_*()
, these arguments are passed on to methods.- .by, by
-
<
tidy-select
> Optionally, a selection of columns to group by for just this operation, functioning as an alternative togroup_by()
. For details and examples, see ?dplyr_by. - .preserve
Relevant when the
.data
input is grouped. If.preserve = FALSE
(the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.- n, prop
Provide either
n
, the number of rows, orprop
, the proportion of rows to select. If neither are supplied,n = 1
will be used. Ifn
is greater than the number of rows in the group (orprop > 1
), the result will be silently truncated to the group size.prop
will be rounded towards zero to generate an integer number of rows.A negative value of
n
orprop
will be subtracted from the group size. For example,n = -2
with a group of 5 rows will select 5 - 2 = 3 rows;prop = -0.25
with 8 rows will select 8 * (1 - 0.25) = 6 rows.- order_by
<
data-masking
> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.- with_ties
Should ties be kept together? The default,
TRUE
, may return more rows than you request. UseFALSE
to ignore ties, and return the firstn
rows.- na_rm
Should missing values in
order_by
be removed from the result? IfFALSE
,NA
values are sorted to the end (like inarrange()
), so they will only be included if there are insufficient non-missing values to reachn
/prop
.- weight_by
<
data-masking
> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. See theDetails
section for more technical details regarding these weights.- replace
Should sampling be performed with (
TRUE
) or without (FALSE
, the default) replacement.
Value
An object of the same type as .data
. The output has the following
properties:
Each row may appear 0, 1, or many times in the output.
Columns are not modified.
Groups are not modified.
Data frame attributes are preserved.
Details
Slice does not work with relational databases because they have no
intrinsic notion of row order. If you want to perform the equivalent
operation, use filter()
and row_number()
.
For slice_sample()
, note that the weights provided in weight_by
are
passed through to the prob
argument of base::sample.int()
. This means
they cannot be used to reconstruct summary statistics from the underlying
population. See this discussion
for more details.
Methods
These function are generics, which means that packages can provide implementations (methods) for other classes. See the documentation of individual methods for extra arguments and differences in behaviour.
Methods available in currently loaded packages:
slice()
: dbplyr (tbl_lazy
), dplyr (data.frame
) .slice_head()
: dbplyr (tbl_lazy
), dplyr (data.frame
) .slice_tail()
: dbplyr (tbl_lazy
), dplyr (data.frame
) .slice_min()
: dbplyr (tbl_lazy
), dplyr (data.frame
) .slice_max()
: dbplyr (tbl_lazy
), dplyr (data.frame
) .slice_sample()
: dbplyr (tbl_lazy
), dplyr (data.frame
) .
Examples
# Similar to head(mtcars, 1):
mtcars %>% slice(1L)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21 6 160 110 3.9 2.62 16.46 0 1 4 4
# Similar to tail(mtcars, 1):
mtcars %>% slice(n())
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Volvo 142E 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
mtcars %>% slice(5:n())
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
#> Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#> Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
#> Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
#> Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
#> Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
#> Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
#> Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
#> Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
#> Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
#> AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
#> Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
#> Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
#> Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
#> Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
# Rows can be dropped with negative indices:
slice(mtcars, -(1:4))
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
#> Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#> Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
#> Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
#> Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
#> Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
#> Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
#> Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
#> Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
#> Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
#> AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
#> Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
#> Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
#> Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
#> Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
# First and last rows based on existing order
mtcars %>% slice_head(n = 5)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
mtcars %>% slice_tail(n = 5)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.9 1 1 5 2
#> Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.5 0 1 5 4
#> Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.5 0 1 5 6
#> Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.6 0 1 5 8
#> Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.6 1 1 4 2
# Rows with minimum and maximum values of a variable
mtcars %>% slice_min(mpg, n = 5)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Cadillac Fleetwood 10.4 8 472 205 2.93 5.250 17.98 0 0 3 4
#> Lincoln Continental 10.4 8 460 215 3.00 5.424 17.82 0 0 3 4
#> Camaro Z28 13.3 8 350 245 3.73 3.840 15.41 0 0 3 4
#> Duster 360 14.3 8 360 245 3.21 3.570 15.84 0 0 3 4
#> Chrysler Imperial 14.7 8 440 230 3.23 5.345 17.42 0 0 3 4
mtcars %>% slice_max(mpg, n = 5)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
# slice_min() and slice_max() may return more rows than requested
# in the presence of ties.
mtcars %>% slice_min(cyl, n = 1)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#> Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
# Use with_ties = FALSE to return exactly n matches
mtcars %>% slice_min(cyl, n = 1, with_ties = FALSE)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Datsun 710 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
# Or use additional variables to break the tie:
mtcars %>% slice_min(tibble(cyl, mpg), n = 1)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Volvo 142E 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
# slice_sample() allows you to random select with or without replacement
mtcars %>% slice_sample(n = 5)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Maserati Bora 15.0 8 301 335 3.54 3.570 14.60 0 1 5 8
#> Duster 360 14.3 8 360 245 3.21 3.570 15.84 0 0 3 4
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
mtcars %>% slice_sample(n = 5, replace = TRUE)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#> Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
# You can optionally weight by a variable - this code weights by the
# physical weight of the cars, so heavy cars are more likely to get
# selected.
mtcars %>% slice_sample(weight_by = wt, n = 5)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Lincoln Continental 10.4 8 460 215 3.00 5.424 17.82 0 0 3 4
#> Dodge Challenger 15.5 8 318 150 2.76 3.520 16.87 0 0 3 2
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Chrysler Imperial 14.7 8 440 230 3.23 5.345 17.42 0 0 3 4
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# Group wise operation ----------------------------------------
df <- tibble(
group = rep(c("a", "b", "c"), c(1, 2, 4)),
x = runif(7)
)
# All slice helpers operate per group, silently truncating to the group
# size, so the following code works without error
df %>% group_by(group) %>% slice_head(n = 2)
#> # A tibble: 5 × 2
#> # Groups: group [3]
#> group x
#> <chr> <dbl>
#> 1 a 0.634
#> 2 b 0.771
#> 3 b 0.502
#> 4 c 0.711
#> 5 c 0.0919
# When specifying the proportion of rows to include non-integer sizes
# are rounded down, so group a gets 0 rows
df %>% group_by(group) %>% slice_head(prop = 0.5)
#> # A tibble: 3 × 2
#> # Groups: group [2]
#> group x
#> <chr> <dbl>
#> 1 b 0.771
#> 2 c 0.711
#> 3 c 0.0919
# Filter equivalents --------------------------------------------
# slice() expressions can often be written to use `filter()` and
# `row_number()`, which can also be translated to SQL. For many databases,
# you'll need to supply an explicit variable to use to compute the row number.
filter(mtcars, row_number() == 1L)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21 6 160 110 3.9 2.62 16.46 0 1 4 4
filter(mtcars, row_number() == n())
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Volvo 142E 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
filter(mtcars, between(row_number(), 5, n()))
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
#> Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
#> Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
#> Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
#> Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
#> Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
#> Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
#> Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
#> Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
#> Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
#> AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
#> Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
#> Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
#> Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
#> Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2