dplyr provides a flexible grammar of data manipulation. It's the next iteration of plyr, focused on tools for working with data frames (hence the d in the name).
It has three main goals:
Identify the most important data manipulation verbs and make them easy to use from R.
Provide blazing fast performance for in-memory data by writing key pieces in C++ (using Rcpp).
Use the same interface to work with data no matter where it's stored, whether in a data frame, a data table, or a database.
To learn more about dplyr, start with the vignettes:
browseVignettes(package = "dplyr")
dplyr.show_progress: Should lengthy operations such as do() show a progress bar? Default: TRUE
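As a minimal sketch of toggling this setting, assuming the option name `dplyr.show_progress` as used in the dplyr documentation, the standard `options()` mechanism can be used. The variable names here are illustrative only.

```r
# Assumption: the progress-bar option is named "dplyr.show_progress".
# options() returns the previous values, so they can be restored later.
old_opts <- options(dplyr.show_progress = FALSE)

# Read the setting back; getOption() returns NULL for options that are unset.
progress_setting <- getOption("dplyr.show_progress")

# Restore whatever was in place before the change.
options(old_opts)
```

Saving and restoring the old value keeps the change local to the code that needs it.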
These can be set on a package-by-package basis, or for the global environment. See pkgconfig::set_config() for usage.
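A hedged sketch of how such a configuration might be set, assuming the pkgconfig package is installed and that the join-related key is named `dplyr::na_matches` (the key name is taken from the dplyr documentation; treat it as an assumption here):

```r
# Assumption: pkgconfig is installed and "dplyr::na_matches" is the key name.
# Called at top level, this sets the key for the global environment; called
# from a package's own code, it sets the key for that package only.
pkgconfig::set_config("dplyr::na_matches" = "never")

# Read the value back, supplying a fallback in case the key is not set
# for the calling context.
na_setting <- pkgconfig::get_config("dplyr::na_matches", fallback = "na")
```

Unlike `options()`, a pkgconfig key set inside one package is not meant to affect how other packages see the same setting.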
dplyr::na_matches: Should NA values be matched in data frame joins by default? Default: "na" (for compatibility with dplyr v0.5.0 and earlier, subject to change), alternative value: "never" (the default for database backends, see