Select distinct rows by a selection of variables

Scoped verbs (_if, _at, _all) have been superseded by the use of pick() or across() in an existing verb. See vignette("colwise") for details.

These scoped variants of distinct() extract distinct rows by a selection of variables. Like distinct(), you can modify the variables before ordering with the .funs argument.

Usage

distinct_all(.tbl, .funs = list(), ..., .keep_all = FALSE)

distinct_at(.tbl, .vars, .funs = list(), ..., .keep_all = FALSE)

distinct_if(.tbl, .predicate, .funs = list(), ..., .keep_all = FALSE)

Arguments

.tbl: A tbl object.
.funs: A function fun, a quosure style lambda ~ fun(.) or a list of either form.
...: Additional arguments for the function calls in .funs. These are evaluated only once, with tidy dots support.
.keep_all: If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.
.vars: A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.
.predicate: A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

Grouping variables

The grouping variables that are part of the selection are taken into account to determine distinct rows.

Examples

df <- tibble(x = rep(2:5, each = 2) / 2, y = rep(2:3, each = 4) / 2)

distinct_all(df)
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1   1     1  
#> 2   1.5   1  
#> 3   2     1.5
#> 4   2.5   1.5
# ->
distinct(df, pick(everything()))
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1   1     1  
#> 2   1.5   1  
#> 3   2     1.5
#> 4   2.5   1.5

distinct_at(df, vars(x,y))
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1   1     1  
#> 2   1.5   1  
#> 3   2     1.5
#> 4   2.5   1.5
# ->
distinct(df, pick(x, y))
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1   1     1  
#> 2   1.5   1  
#> 3   2     1.5
#> 4   2.5   1.5

distinct_if(df, is.numeric)
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1   1     1  
#> 2   1.5   1  
#> 3   2     1.5
#> 4   2.5   1.5
# ->
distinct(df, pick(where(is.numeric)))
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1   1     1  
#> 2   1.5   1  
#> 3   2     1.5
#> 4   2.5   1.5

# You can supply a function that will be applied before extracting the distinct values
# The variables of the sorted tibble keep their original values.
distinct_all(df, round)
#> # A tibble: 3 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2     2     1
#> 3     2     2
# ->
distinct(df, across(everything(), round))
#> # A tibble: 3 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2     2     1
#> 3     2     2