Cross joins match each row in x to every row in y, resulting in a data frame with nrow(x) * nrow(y) rows.
Since cross joins result in all possible matches between x and y, they technically serve as the basis for all mutating joins, which can generally be thought of as cross joins followed by a filter. In practice, a more specialized procedure is used for better performance.
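The "cross join followed by a filter" view can be illustrated with a small sketch. It assumes dplyr 1.1.0 or later (where cross_join() is available) and uses the band_members and band_instruments example data shipped with dplyr; the names via_cross and via_inner are hypothetical and chosen only for this illustration.

```r
library(dplyr)

# Build every pairing of rows, then keep only the pairs whose keys match.
# `name` appears in both inputs, so it is suffixed to name.x / name.y.
via_cross <- cross_join(band_members, band_instruments) %>%
  filter(name.x == name.y) %>%
  select(name = name.x, band, plays)

# The specialized mutating join that should give the same rows.
via_inner <- inner_join(band_members, band_instruments, by = "name")

# Compare the two results as plain data frames.
all.equal(as.data.frame(via_cross), as.data.frame(via_inner))
```

This is only meant to show the conceptual equivalence. The cross join materializes all nrow(x) * nrow(y) rows before filtering, so it is not a substitute for inner_join() on inputs of any real size.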
Usage
cross_join(x, y, ..., copy = FALSE, suffix = c(".x", ".y"))
Arguments
- x, y
A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
- ...
Other parameters passed on to methods.
- copy
If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.
- suffix
If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.
Value
An object of the same type as x (including the same groups). The output has the following properties:
- There are nrow(x) * nrow(y) rows returned.
- Output columns include all columns from both x and y. Column name collisions are resolved using suffix.
- The order of the rows and columns of x is preserved as much as possible.
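The row-count and column properties can be checked directly. The following is a minimal sketch, assuming dplyr 1.1.0 or later and the band_instruments and band_members example data shipped with dplyr; the name out is hypothetical.

```r
library(dplyr)

out <- cross_join(band_instruments, band_members)

# One output row per pairing of a row in x with a row in y.
nrow(out) == nrow(band_instruments) * nrow(band_members)

# All columns from both inputs are present; the shared `name` column
# is disambiguated with the default suffixes ".x" and ".y".
names(out)
```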
Methods
This function is a generic, which means that packages can provide implementations (methods) for other classes. See the documentation of individual methods for extra arguments and differences in behaviour.
The following methods are currently available in loaded packages:
dbplyr (tbl_lazy), dplyr (data.frame).
See also
Other joins: filter-joins, mutate-joins, nest_join()
Examples
# Cross joins match each row in `x` to every row in `y`.
# Data within the columns is not used in the matching process.
cross_join(band_instruments, band_members)
#> # A tibble: 9 × 4
#> name.x plays name.y band
#> <chr> <chr> <chr> <chr>
#> 1 John guitar Mick Stones
#> 2 John guitar John Beatles
#> 3 John guitar Paul Beatles
#> 4 Paul bass Mick Stones
#> 5 Paul bass John Beatles
#> 6 Paul bass Paul Beatles
#> 7 Keith guitar Mick Stones
#> 8 Keith guitar John Beatles
#> 9 Keith guitar Paul Beatles
# Control the suffix added to variables duplicated in
# `x` and `y` with `suffix`.
cross_join(band_instruments, band_members, suffix = c("", "_y"))
#> # A tibble: 9 × 4
#> name plays name_y band
#> <chr> <chr> <chr> <chr>
#> 1 John guitar Mick Stones
#> 2 John guitar John Beatles
#> 3 John guitar Paul Beatles
#> 4 Paul bass Mick Stones
#> 5 Paul bass John Beatles
#> 6 Paul bass Paul Beatles
#> 7 Keith guitar Mick Stones
#> 8 Keith guitar John Beatles
#> 9 Keith guitar Paul Beatles