Experimental lifecycle

These functions provide a framework for modifying rows in a table using a second table of data. The two tables are matched by a set of key variables whose values must uniquely identify each row. The functions are inspired by SQL's INSERT, UPDATE, and DELETE, and can optionally modify in_place for selected backends.

  • rows_insert() adds new rows (like INSERT); key values in y must not occur in x.

  • rows_update() modifies existing rows (like UPDATE); key values in y must occur in x.

  • rows_patch() works like rows_update() but only overwrites NA values.

  • rows_upsert() inserts or updates depending on whether or not the key value in y already exists in x.

  • rows_delete() deletes rows (like DELETE); key values in y must exist in x.

rows_insert(x, y, by = NULL, ..., copy = FALSE, in_place = FALSE)

rows_update(x, y, by = NULL, ..., copy = FALSE, in_place = FALSE)

rows_patch(x, y, by = NULL, ..., copy = FALSE, in_place = FALSE)

rows_upsert(x, y, by = NULL, ..., copy = FALSE, in_place = FALSE)

rows_delete(x, y, by = NULL, ..., copy = FALSE, in_place = FALSE)

Arguments

x, y

A pair of data frames or data frame extensions (e.g. a tibble). y must have the same columns of x or a subset.

by

An unnamed character vector giving the key columns. The key values must uniquely identify each row (i.e. each combination of key values occurs at most once), and the key columns must exist in both x and y.

By default, we use the first column in y, since the first column is a reasonable place to put an identifier variable.

...

Other parameters passed onto methods.

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

in_place

Should x be modified in place? This argument is only relevant for mutable backends (e.g. databases, data.tables).

When TRUE, a modified version of x is returned invisibly; when FALSE, a new object representing the resulting changes is returned.

Value

An object of the same type as x. The order of the rows and columns of x is preserved as much as possible. The output has the following properties:

  • rows_update() preserves rows as is; rows_insert() and rows_upsert() return all existing rows and potentially new rows; rows_delete() returns a subset of the rows.

  • Columns are not added, removed, or relocated, though the data may be updated.

  • Groups are taken from x.

  • Data frame attributes are taken from x.

If in_place = TRUE, the result will be returned invisibly.

Examples

data <- tibble(a = 1:3, b = letters[c(1:2, NA)], c = 0.5 + 0:2) data
#> # A tibble: 3 x 3 #> a b c #> <int> <chr> <dbl> #> 1 1 a 0.5 #> 2 2 b 1.5 #> 3 3 NA 2.5
# Insert rows_insert(data, tibble(a = 4, b = "z"))
#> Matching, by = "a"
#> # A tibble: 4 x 3 #> a b c #> <dbl> <chr> <dbl> #> 1 1 a 0.5 #> 2 2 b 1.5 #> 3 3 NA 2.5 #> 4 4 z NA
try(rows_insert(data, tibble(a = 3, b = "z")))
#> Matching, by = "a"
#> Error : Attempting to insert duplicate rows.
# Update rows_update(data, tibble(a = 2:3, b = "z"))
#> Matching, by = "a"
#> # A tibble: 3 x 3 #> a b c #> <int> <chr> <dbl> #> 1 1 a 0.5 #> 2 2 z 1.5 #> 3 3 z 2.5
rows_update(data, tibble(b = "z", a = 2:3), by = "a")
#> # A tibble: 3 x 3 #> a b c #> <int> <chr> <dbl> #> 1 1 a 0.5 #> 2 2 z 1.5 #> 3 3 z 2.5
# Variants: patch and upsert rows_patch(data, tibble(a = 2:3, b = "z"))
#> Matching, by = "a"
#> # A tibble: 3 x 3 #> a b c #> <int> <chr> <dbl> #> 1 1 a 0.5 #> 2 2 b 1.5 #> 3 3 z 2.5
rows_upsert(data, tibble(a = 2:4, b = "z"))
#> Matching, by = "a"
#> # A tibble: 4 x 3 #> a b c #> <int> <chr> <dbl> #> 1 1 a 0.5 #> 2 2 z 1.5 #> 3 3 z 2.5 #> 4 4 z NA
# Delete and truncate rows_delete(data, tibble(a = 2:3))
#> Matching, by = "a"
#> # A tibble: 1 x 3 #> a b c #> <int> <chr> <dbl> #> 1 1 a 0.5
rows_delete(data, tibble(a = 2:3, b = "b"))
#> Matching, by = "a"
#> Ignoring extra columns: b
#> # A tibble: 1 x 3 #> a b c #> <int> <chr> <dbl> #> 1 1 a 0.5
try(rows_delete(data, tibble(a = 2:3, b = "b"), by = c("a", "b")))
#> Error : Attempting to delete missing rows.