Package 'tidylog'

Title: Logging for 'dplyr' and 'tidyr' Functions
Description: Provides feedback about 'dplyr' and 'tidyr' operations.
Authors: Benjamin Elbers [aut, cre] (ORCID: <https://orcid.org/0000-0001-5392-3448>), Damiano Oldoni [ctb] (ORCID: <https://orcid.org/0000-0003-3445-7562>)
Maintainer: Benjamin Elbers <[email protected]>
License: MIT + file LICENSE
Version: 1.1.0.9000
Built: 2026-06-23 16:42:02 UTC
Source: https://github.com/elbersb/tidylog

Help Index


Wrapper around dplyr::add_count that prints information about the operation

Description

Wrapper around dplyr::add_count() that prints information about the operation.

Usage

add_count(x, ...)

Arguments

x

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

...

Arguments passed on to dplyr::add_count

wt

<data-masking> Frequency weights. Can be NULL or a variable:

  • If NULL (the default), counts the number of rows in each group.

  • If a variable, computes sum(wt) for each group.

sort

If TRUE, will show the largest groups at the top.

name

The name of the new column in the output.

If omitted, it will default to n. If there's already a column called n, it will use nn. If there's a column called n and nn, it'll use nnn, and so on, adding ns until it gets a new name.

.drop

Handling of factor levels that don't appear in the data, passed on to group_by().

For count(): if FALSE will include counts for empty groups (i.e. for levels of factors that don't exist in the data).

[Defunct] For add_count(): defunct since it can't actually affect the output.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::add_count()

See Also

dplyr::add_count()
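
Examples

A minimal sketch of the wt and name arguments described above, assuming tidylog and dplyr are installed. The sales data frame and its columns are made up for illustration, and the message tidylog prints is not reproduced because its wording may differ between versions.

```r
# Hypothetical data: three sales records in two regions.
sales <- data.frame(
  region = c("north", "north", "south"),
  units  = c(2, 3, 10)
)

# wt = NULL (the default): the new column n holds the number of rows per region.
by_rows <- tidylog::add_count(sales, region)

# wt = units: the new column holds sum(units) per region; name sets its label.
by_units <- tidylog::add_count(sales, region, wt = units, name = "total_units")
```

Both calls keep every input row and only append a column, which is why the wrapper reports a new variable rather than a change in row count.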


Wrapper around dplyr::add_tally that prints information about the operation

Description

Wrapper around dplyr::add_tally() that prints information about the operation.

Usage

add_tally(x, ...)

Arguments

x

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

...

Arguments passed on to dplyr::add_tally

wt

<data-masking> Frequency weights. Can be NULL or a variable:

  • If NULL (the default), counts the number of rows in each group.

  • If a variable, computes sum(wt) for each group.

sort

If TRUE, will show the largest groups at the top.

name

The name of the new column in the output.

If omitted, it will default to n. If there's already a column called n, it will use nn. If there's a column called n and nn, it'll use nnn, and so on, adding ns until it gets a new name.

.drop

Handling of factor levels that don't appear in the data, passed on to group_by().

For count(): if FALSE will include counts for empty groups (i.e. for levels of factors that don't exist in the data).

[Defunct] For add_count(): defunct since it can't actually affect the output.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::add_tally()

See Also

dplyr::add_tally()


Wrapper around dplyr::anti_join that prints information about the operation

Description

Wrapper around dplyr::anti_join() that prints information about the operation.

Usage

anti_join(x, y, by = NULL, ...)

Arguments

x, y

A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See the Methods section of dplyr::anti_join() for more details.

by

A join specification created with join_by(), or a character vector of variables to join by.

If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join on different variables between x and y, use a join_by() specification. For example, join_by(a == b) will match x$a to y$b.

To join by multiple variables, use a join_by() specification with multiple expressions. For example, join_by(a == b, c == d) will match x$a to y$b and x$c to y$d. If the column names are the same between x and y, you can shorten this by listing only the variable names, like join_by(a, c).

join_by() can also be used to perform inequality, rolling, and overlap joins. See the documentation at ?join_by for details on these types of joins.

For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("a", "b") joins x$a to y$a and x$b to y$b. If variable names differ between x and y, use a named character vector like by = c("x_a" = "y_a", "x_b" = "y_b").

To perform a cross-join, generating all combinations of x and y, see cross_join().

...

Arguments passed on to dplyr::anti_join

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

na_matches

Should two NA or two NaN values match?

  • "na", the default, treats two NA or two NaN values as equal, like %in%, match(), and merge().

  • "never" treats two NA or two NaN values as different, and will never match them together or to any other values. This is similar to joins for database sources and to base::merge(incomparables = NA).

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::anti_join()

See Also

dplyr::anti_join()
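
Examples

A small sketch of how by and na_matches interact, assuming tidylog and dplyr are installed. The orders and shipped data frames are invented for illustration. A named character vector is used for by because the key columns have different names.

```r
# Hypothetical data: one order has a missing id.
orders  <- data.frame(id = c(1, 2, NA), item = c("a", "b", "c"))
shipped <- data.frame(order_id = c(1, NA))

# Default na_matches = "na": the NA in orders$id matches the NA in
# shipped$order_id, so only the order with id 2 is left unmatched.
pending <- tidylog::anti_join(orders, shipped, by = c("id" = "order_id"))

# na_matches = "never": NA keys never match, so the NA row is kept as well.
strict <- tidylog::anti_join(
  orders, shipped,
  by = c("id" = "order_id"),
  na_matches = "never"
)
```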


Wrapper around dplyr::count that prints information about the operation

Description

Wrapper around dplyr::count() that prints information about the operation.

Usage

count(x, ...)

Arguments

x

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

...

Arguments passed on to dplyr::count

wt

<data-masking> Frequency weights. Can be NULL or a variable:

  • If NULL (the default), counts the number of rows in each group.

  • If a variable, computes sum(wt) for each group.

sort

If TRUE, will show the largest groups at the top.

name

The name of the new column in the output.

If omitted, it will default to n. If there's already a column called n, it will use nn. If there's a column called n and nn, it'll use nnn, and so on, adding ns until it gets a new name.

.drop

Handling of factor levels that don't appear in the data, passed on to group_by().

For count(): if FALSE will include counts for empty groups (i.e. for levels of factors that don't exist in the data).

[Defunct] For add_count(): defunct since it can't actually affect the output.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::count()

See Also

dplyr::count()
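
Examples

A minimal sketch of the .drop argument, assuming tidylog and dplyr are installed. The pets data frame is made up; its factor has a level ("fish") that never occurs in the data.

```r
# Hypothetical data: the factor level "fish" has no rows.
pets <- data.frame(
  kind = factor(c("cat", "cat", "dog"), levels = c("cat", "dog", "fish"))
)

# Default .drop = TRUE: only levels present in the data are counted.
seen <- tidylog::count(pets, kind)

# .drop = FALSE: empty levels are included with a count of zero.
all_levels <- tidylog::count(pets, kind, .drop = FALSE)
```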


Display messages about changes in the number of rows

Description

Unlike the log_...() functions, this function does not perform the data manipulation itself. It assumes the transformation has been done elsewhere and only displays the messages.

Usage

display_changed_rows(.olddata, .newdata, .funname)

Arguments

.olddata

Data frame before transformation.

.newdata

Data frame after transformation.

.funname

String: name of function that should be used in messages.


Wrapper around dplyr::distinct that prints information about the operation

Description

Wrapper around dplyr::distinct() that prints information about the operation.

Usage

distinct(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See the Methods section of dplyr::distinct() for more details.

...

Arguments passed on to dplyr::distinct

.keep_all

If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::distinct()

See Also

dplyr::distinct()
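
Examples

A short sketch of the .keep_all argument, assuming tidylog and dplyr are installed. The visits data frame is invented for illustration.

```r
# Hypothetical data: user u1 appears twice.
visits <- data.frame(
  user = c("u1", "u1", "u2"),
  page = c("home", "cart", "home")
)

# Default .keep_all = FALSE: only the columns named in ... are returned.
users <- tidylog::distinct(visits, user)

# .keep_all = TRUE: all columns are kept, taking the first row for each user.
first_visit <- tidylog::distinct(visits, user, .keep_all = TRUE)
```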


Wrapper around dplyr::distinct_all that prints information about the operation

Description

Wrapper around dplyr::distinct_all() that prints information about the operation.

Usage

distinct_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::distinct_all

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.keep_all

If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::distinct_all()

See Also

dplyr::distinct_all()


Wrapper around dplyr::distinct_at that prints information about the operation

Description

Wrapper around dplyr::distinct_at() that prints information about the operation.

Usage

distinct_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::distinct_at

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.keep_all

If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::distinct_at()

See Also

dplyr::distinct_at()


Wrapper around dplyr::distinct_if that prints information about the operation

Description

Wrapper around dplyr::distinct_if() that prints information about the operation.

Usage

distinct_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::distinct_if

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.keep_all

If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::distinct_if()

See Also

dplyr::distinct_if()


Wrapper around tidyr::drop_na that prints information about the operation

Description

Wrapper around tidyr::drop_na() that prints information about the operation.

Usage

drop_na(data, ...)

Arguments

data

A data frame.

...

<tidy-select> Columns to inspect for missing values. If empty, all columns are used.

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::drop_na()

See Also

tidyr::drop_na()
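
Examples

A minimal sketch of the difference between inspecting all columns and inspecting a selection, assuming tidylog and tidyr are installed. The readings data frame is made up for illustration.

```r
# Hypothetical data with missing values in two different columns.
readings <- data.frame(
  sensor = c("a", "b", "c"),
  temp   = c(20.5, NA, 19),
  note   = c(NA, "ok", "ok")
)

# No columns given: a row is dropped if any column is NA.
complete <- tidylog::drop_na(readings)

# Only temp is inspected: the NA in note does not matter.
has_temp <- tidylog::drop_na(readings, temp)
```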


Wrapper around tidyr::fill that prints information about the operation

Description

Wrapper around tidyr::fill() that prints information about the operation.

Usage

fill(data, ...)

Arguments

data

A data frame.

...

Arguments passed on to tidyr::fill

.by

[Experimental]

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.direction

Direction in which to fill missing values. Currently either "down" (the default), "up", "downup" (i.e. first down and then up) or "updown" (first up and then down).

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::fill()

See Also

tidyr::fill()
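
Examples

A short sketch of the .direction argument, assuming tidylog and tidyr are installed. The prices data frame is invented; note the leading NA, which a plain downward fill cannot reach.

```r
# Hypothetical data: the first and third prices are missing.
prices <- data.frame(day = 1:4, price = c(NA, 10, NA, 12))

# Default .direction = "down": the leading NA has nothing above it and stays NA.
down <- tidylog::fill(prices, price)

# "downup": fill down first, then up, so the leading NA is filled as well.
downup <- tidylog::fill(prices, price, .direction = "downup")
```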


Wrapper around dplyr::filter that prints information about the operation

Description

Wrapper around dplyr::filter() that prints information about the operation.

Usage

filter(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See the Methods section of dplyr::filter() for more details.

...

Arguments passed on to dplyr::filter

.by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.preserve

Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::filter()

See Also

dplyr::filter()
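
Examples

A minimal sketch of per-operation grouping with .by, assuming tidylog is installed together with a dplyr version that supports .by (1.1.0 or later). The scores data frame is made up for illustration.

```r
# Hypothetical data: two teams with two scores each.
scores <- data.frame(
  team  = c("a", "a", "b", "b"),
  score = c(1, 5, 3, 4)
)

# Keep the best score within each team; the result is not grouped afterwards.
top <- tidylog::filter(scores, score == max(score), .by = team)
```

Because .by groups only for this one call, no ungroup() step is needed afterwards.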


Wrapper around dplyr::filter_all that prints information about the operation

Description

Wrapper around dplyr::filter_all() that prints information about the operation.

Usage

filter_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::filter_all

.vars_predicate

A quoted predicate expression as returned by all_vars() or any_vars().

Can also be a function or purrr-like formula. In this case, the intersection of the results is taken by default and there's currently no way to request the union.

.preserve

When FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise it is kept as is.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::filter_all()

See Also

dplyr::filter_all()


Wrapper around dplyr::filter_at that prints information about the operation

Description

Wrapper around dplyr::filter_at() that prints information about the operation.

Usage

filter_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::filter_at

.vars_predicate

A quoted predicate expression as returned by all_vars() or any_vars().

Can also be a function or purrr-like formula. In this case, the intersection of the results is taken by default and there's currently no way to request the union.

.preserve

When FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise it is kept as is.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::filter_at()

See Also

dplyr::filter_at()


Wrapper around dplyr::filter_if that prints information about the operation

Description

Wrapper around dplyr::filter_if() that prints information about the operation.

Usage

filter_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::filter_if

.vars_predicate

A quoted predicate expression as returned by all_vars() or any_vars().

Can also be a function or purrr-like formula. In this case, the intersection of the results is taken by default and there's currently no way to request the union.

.preserve

When FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise it is kept as is.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::filter_if()

See Also

dplyr::filter_if()


Wrapper around dplyr::filter_out that prints information about the operation

Description

Wrapper around dplyr::filter_out() that prints information about the operation.

Usage

filter_out(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See the Methods section of dplyr::filter_out() for more details.

...

Arguments passed on to dplyr::filter_out

.by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.preserve

Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::filter_out()

See Also

dplyr::filter_out()


Wrapper around dplyr::full_join that prints information about the operation

Description

Wrapper around dplyr::full_join() that prints information about the operation.

Usage

full_join(x, y, by = NULL, ...)

Arguments

x, y

A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See the Methods section of dplyr::full_join() for more details.

by

A join specification created with join_by(), or a character vector of variables to join by.

If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join on different variables between x and y, use a join_by() specification. For example, join_by(a == b) will match x$a to y$b.

To join by multiple variables, use a join_by() specification with multiple expressions. For example, join_by(a == b, c == d) will match x$a to y$b and x$c to y$d. If the column names are the same between x and y, you can shorten this by listing only the variable names, like join_by(a, c).

join_by() can also be used to perform inequality, rolling, and overlap joins. See the documentation at ?join_by for details on these types of joins.

For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("a", "b") joins x$a to y$a and x$b to y$b. If variable names differ between x and y, use a named character vector like by = c("x_a" = "y_a", "x_b" = "y_b").

To perform a cross-join, generating all combinations of x and y, see cross_join().

...

Arguments passed on to dplyr::full_join

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

keep

Should the join keys from both x and y be preserved in the output?

  • If NULL, the default, joins on equality retain only the keys from x, while joins on inequality retain the keys from both inputs.

  • If TRUE, all keys from both inputs are retained.

  • If FALSE, only keys from x are retained. For right and full joins, the data in key columns corresponding to rows that only exist in y are merged into the key columns from x. Can't be used when joining on inequality conditions.

na_matches

Should two NA or two NaN values match?

  • "na", the default, treats two NA or two NaN values as equal, like %in%, match(), and merge().

  • "never" treats two NA or two NaN values as different, and will never match them together or to any other values. This is similar to joins for database sources and to base::merge(incomparables = NA).

multiple

Handling of rows in x with multiple matches in y. For each row of x:

  • "all", the default, returns every match detected in y. This is the same behavior as SQL.

  • "any" returns one match detected in y, with no guarantees on which match will be returned. It is often faster than "first" and "last" if you just need to detect if there is at least one match.

  • "first" returns the first match detected in y.

  • "last" returns the last match detected in y.

unmatched

How should unmatched keys that would result in dropped rows be handled?

  • "drop" drops unmatched keys from the result.

  • "error" throws an error if unmatched keys are detected.

unmatched is intended to protect you from accidentally dropping rows during a join. It only checks for unmatched keys in the input that could potentially drop rows.

  • For left joins, it checks y.

  • For right joins, it checks x.

  • For inner joins, it checks both x and y. In this case, unmatched is also allowed to be a character vector of length 2 to specify the behavior for x and y independently.

relationship

Handling of the expected relationship between the keys of x and y. If the expectations chosen from the list below are invalidated, an error is thrown.

  • NULL, the default, doesn't expect there to be any relationship between x and y. However, for equality joins it will check for a many-to-many relationship (which is typically unexpected) and will warn if one occurs, encouraging you to either take a closer look at your inputs or make this relationship explicit by specifying "many-to-many".

    See the Many-to-many relationships section of dplyr::full_join() for more details.

  • "one-to-one" expects:

    • Each row in x matches at most 1 row in y.

    • Each row in y matches at most 1 row in x.

  • "one-to-many" expects:

    • Each row in y matches at most 1 row in x.

  • "many-to-one" expects:

    • Each row in x matches at most 1 row in y.

  • "many-to-many" doesn't perform any relationship checks, but is provided to allow you to be explicit about this relationship if you know it exists.

relationship doesn't handle cases where there are zero matches. For that, see unmatched.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::full_join()

See Also

dplyr::full_join()
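
Examples

A small sketch of the suffix argument in a full join, assuming tidylog and dplyr are installed. The plan and actual data frames are invented; both contain a non-key column called value.

```r
# Hypothetical data: id 1 exists only in plan, id 3 only in actual.
plan   <- data.frame(id = c(1, 2), value = c(10, 20))
actual <- data.frame(id = c(2, 3), value = c(21, 30))

# Both inputs have a non-joined column named value, so the suffixes
# are appended to tell the two apart in the output.
merged <- tidylog::full_join(
  plan, actual,
  by = "id",
  suffix = c("_plan", "_actual")
)
```

A full join keeps unmatched rows from both sides, so the columns coming from the other input are NA for ids 1 and 3.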


Wrapper around tidyr::gather that prints information about the operation

Description

Wrapper around tidyr::gather() that prints information about the operation.

Usage

gather(data, ...)

Arguments

data

A data frame.

...

Arguments passed on to tidyr::gather

key,value

Names of new key and value columns, as strings or symbols.

This argument is passed by expression and supports quasiquotation (you can unquote strings and symbols). The name is captured from the expression with rlang::ensym() (note that this kind of interface where symbols do not represent actual objects is now discouraged in the tidyverse; we support it here for backward compatibility).

na.rm

If TRUE, will remove rows from output where the value column is NA.

convert

If TRUE will automatically run type.convert() on the key column. This is useful if the column types are actually numeric, integer, or logical.

factor_key

If FALSE, the default, the key values will be stored as a character vector. If TRUE, will be stored as a factor, which preserves the original ordering of the columns.

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::gather()

See Also

tidyr::gather()
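
Examples

A minimal sketch of key, value, and na.rm, assuming tidylog and tidyr are installed. The wide data frame and the column names quarter and score are made up for illustration.

```r
# Hypothetical wide data: one column per quarter, one missing score.
wide <- data.frame(id = 1:2, q1 = c(5, NA), q2 = c(3, 4))

# Stack q1 and q2 into a key column (quarter) and a value column (score).
# na.rm = TRUE removes the row whose score is NA.
long <- tidylog::gather(
  wide,
  key = "quarter", value = "score",
  q1, q2,
  na.rm = TRUE
)
```

With the default factor_key = FALSE, the quarter column is stored as a character vector.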


Wrapper around dplyr::group_by that prints information about the operation

Description

Wrapper around dplyr::group_by() that prints information about the operation.

Usage

group_by(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See the Methods section of dplyr::group_by() for more details.

...

Arguments passed on to dplyr::group_by

.add

When FALSE, the default, group_by() will override existing groups. To add to the existing groups, use .add = TRUE.

.drop

Drop groups formed by factor levels that don't appear in the data? The default is TRUE except when .data has been previously grouped with .drop = FALSE. See group_by_drop_default() for details.

x

A tbl()

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::group_by()

See Also

dplyr::group_by()
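
Examples

A short sketch of the .add argument, assuming tidylog and dplyr are installed. The cars data frame is invented for illustration.

```r
# Hypothetical data, first grouped by cyl.
cars   <- data.frame(cyl = c(4, 4, 6), gear = c(4, 5, 4))
by_cyl <- tidylog::group_by(cars, cyl)

# Default .add = FALSE: the existing grouping by cyl is overridden.
replaced <- tidylog::group_by(by_cyl, gear)

# .add = TRUE: gear is added to the existing grouping.
added <- tidylog::group_by(by_cyl, gear, .add = TRUE)

dplyr::group_vars(replaced)
dplyr::group_vars(added)
```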


Wrapper around dplyr::group_by_all that prints information about the operation

Description

Wrapper around dplyr::group_by_all() that prints information about the operation.

Usage

group_by_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::group_by_all

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.add

See group_by()

.drop

Drop groups formed by factor levels that don't appear in the data? The default is TRUE except when .data has been previously grouped with .drop = FALSE. See group_by_drop_default() for details.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::group_by_all()

See Also

dplyr::group_by_all()


Wrapper around dplyr::group_by_at that prints information about the operation

Description

Wrapper around dplyr::group_by_at() that prints information about the operation.

Usage

group_by_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::group_by_at

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.add

See group_by()

.drop

Drop groups formed by factor levels that don't appear in the data? The default is TRUE except when .data has been previously grouped with .drop = FALSE. See group_by_drop_default() for details.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::group_by_at()

See Also

dplyr::group_by_at()


Wrapper around dplyr::group_by_if that prints information about the operation

Description

Wrapper around dplyr::group_by_if() that prints information about the operation.

Usage

group_by_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::group_by_if

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.add

See group_by()

.drop

Drop groups formed by factor levels that don't appear in the data? The default is TRUE except when .data has been previously grouped with .drop = FALSE. See group_by_drop_default() for details.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::group_by_if()

See Also

dplyr::group_by_if()


Wrapper around dplyr::inner_join that prints information about the operation

Description

Wrapper around dplyr::inner_join() that prints information about the operation.

Usage

inner_join(x, y, by = NULL, ...)

Arguments

x, y

A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See the Methods section of dplyr::inner_join() for more details.

by

A join specification created with join_by(), or a character vector of variables to join by.

If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join on different variables between x and y, use a join_by() specification. For example, join_by(a == b) will match x$a to y$b.

To join by multiple variables, use a join_by() specification with multiple expressions. For example, join_by(a == b, c == d) will match x$a to y$b and x$c to y$d. If the column names are the same between x and y, you can shorten this by listing only the variable names, like join_by(a, c).

join_by() can also be used to perform inequality, rolling, and overlap joins. See the documentation at ?join_by for details on these types of joins.

For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("a", "b") joins x$a to y$a and x$b to y$b. If variable names differ between x and y, use a named character vector like by = c("x_a" = "y_a", "x_b" = "y_b").

To perform a cross-join, generating all combinations of x and y, see cross_join().

...

Arguments passed on to dplyr::inner_join

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

keep

Should the join keys from both x and y be preserved in the output?

  • If NULL, the default, joins on equality retain only the keys from x, while joins on inequality retain the keys from both inputs.

  • If TRUE, all keys from both inputs are retained.

  • If FALSE, only keys from x are retained. For right and full joins, the data in key columns corresponding to rows that only exist in y are merged into the key columns from x. Can't be used when joining on inequality conditions.

na_matches

Should two NA or two NaN values match?

  • "na", the default, treats two NA or two NaN values as equal, like %in%, match(), and merge().

  • "never" treats two NA or two NaN values as different, and will never match them together or to any other values. This is similar to joins for database sources and to base::merge(incomparables = NA).

multiple

Handling of rows in x with multiple matches in y. For each row of x:

  • "all", the default, returns every match detected in y. This is the same behavior as SQL.

  • "any" returns one match detected in y, with no guarantees on which match will be returned. It is often faster than "first" and "last" if you just need to detect if there is at least one match.

  • "first" returns the first match detected in y.

  • "last" returns the last match detected in y.

unmatched

How should unmatched keys that would result in dropped rows be handled?

  • "drop" drops unmatched keys from the result.

  • "error" throws an error if unmatched keys are detected.

unmatched is intended to protect you from accidentally dropping rows during a join. It only checks for unmatched keys in the input that could potentially drop rows.

  • For left joins, it checks y.

  • For right joins, it checks x.

  • For inner joins, it checks both x and y. In this case, unmatched is also allowed to be a character vector of length 2 to specify the behavior for x and y independently.

relationship

Handling of the expected relationship between the keys of x and y. If the expectations chosen from the list below are invalidated, an error is thrown.

  • NULL, the default, doesn't expect there to be any relationship between x and y. However, for equality joins it will check for a many-to-many relationship (which is typically unexpected) and will warn if one occurs, encouraging you to either take a closer look at your inputs or make this relationship explicit by specifying "many-to-many".

    See the Many-to-many relationships section for more details.

  • "one-to-one" expects:

    • Each row in x matches at most 1 row in y.

    • Each row in y matches at most 1 row in x.

  • "one-to-many" expects:

    • Each row in y matches at most 1 row in x.

  • "many-to-one" expects:

    • Each row in x matches at most 1 row in y.

  • "many-to-many" doesn't perform any relationship checks, but is provided to allow you to be explicit about this relationship if you know it exists.

relationship doesn't handle cases where there are zero matches. For that, see unmatched.
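
As a rough illustration of unmatched and relationship for an inner join, the sketch below uses two small made-up tables (orders and customers are hypothetical names, not part of the package). It calls dplyr directly; the tidylog wrapper accepts the same arguments.

```r
library(dplyr)

# Hypothetical example data
orders <- tibble(customer_id = c(1, 2, 2, 4), amount = c(10, 20, 30, 40))
customers <- tibble(customer_id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))

# Default unmatched = "drop": the order for customer 4 and customer 3 are
# dropped. "many-to-one" asserts each order matches at most one customer.
res <- inner_join(orders, customers, by = "customer_id",
                  relationship = "many-to-one")

# Length-2 form: error on unmatched keys in x, tolerate unmatched keys in y
strict <- tryCatch(
  inner_join(orders, customers, by = "customer_id",
             unmatched = c("error", "drop")),
  error = function(e) "unmatched keys in x"
)
```

Here res keeps the three orders that have a customer, while strict ends up holding the fallback string because customer_id 4 has no match in customers.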

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::inner_join()

See Also

dplyr::inner_join()


Wrapper around dplyr::left_join that prints information about the operation

Description

Wrapper around dplyr::left_join() that prints information about the operation.

Usage

left_join(x, y, by = NULL, ...)

Arguments

x, y

A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

by

A join specification created with join_by(), or a character vector of variables to join by.

If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join on different variables between x and y, use a join_by() specification. For example, join_by(a == b) will match x$a to y$b.

To join by multiple variables, use a join_by() specification with multiple expressions. For example, join_by(a == b, c == d) will match x$a to y$b and x$c to y$d. If the column names are the same between x and y, you can shorten this by listing only the variable names, like join_by(a, c).

join_by() can also be used to perform inequality, rolling, and overlap joins. See the documentation at ?join_by for details on these types of joins.

For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("a", "b") joins x$a to y$a and x$b to y$b. If variable names differ between x and y, use a named character vector like by = c("x_a" = "y_a", "x_b" = "y_b").

To perform a cross-join, generating all combinations of x and y, see cross_join().
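
The two ways of specifying by described above can be compared side by side. This is a minimal sketch with made-up tables (flights and airports are hypothetical); it calls dplyr directly, and the tidylog wrapper takes the same by argument.

```r
library(dplyr)

# Hypothetical example data with differently named key columns
flights <- tibble(origin = c("JFK", "LGA", "JFK"),
                  carrier = c("AA", "DL", "UA"))
airports <- tibble(code = c("JFK", "LGA"),
                   city = c("New York", "New York"))

# join_by() specification: match x$origin to y$code
a <- left_join(flights, airports, by = join_by(origin == code))

# Named character vector expressing the same equality join
b <- left_join(flights, airports, by = c("origin" = "code"))
```

Both calls should give the same three-row result with columns origin, carrier, and city; supplying by explicitly also suppresses the natural-join message.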

...

Arguments passed on to dplyr::left_join

x,y

A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

keep

Should the join keys from both x and y be preserved in the output?

  • If NULL, the default, joins on equality retain only the keys from x, while joins on inequality retain the keys from both inputs.

  • If TRUE, all keys from both inputs are retained.

  • If FALSE, only keys from x are retained. For right and full joins, the data in key columns corresponding to rows that only exist in y are merged into the key columns from x. Can't be used when joining on inequality conditions.

na_matches

Should two NA or two NaN values match?

  • "na", the default, treats two NA or two NaN values as equal, like %in%, match(), and merge().

  • "never" treats two NA or two NaN values as different, and will never match them together or to any other values. This is similar to joins for database sources and to base::merge(incomparables = NA).

multiple

Handling of rows in x with multiple matches in y. For each row of x:

  • "all", the default, returns every match detected in y. This is the same behavior as SQL.

  • "any" returns one match detected in y, with no guarantees on which match will be returned. It is often faster than "first" and "last" if you just need to detect if there is at least one match.

  • "first" returns the first match detected in y.

  • "last" returns the last match detected in y.

unmatched

How should unmatched keys that would result in dropped rows be handled?

  • "drop" drops unmatched keys from the result.

  • "error" throws an error if unmatched keys are detected.

unmatched is intended to protect you from accidentally dropping rows during a join. It only checks for unmatched keys in the input that could potentially drop rows.

  • For left joins, it checks y.

  • For right joins, it checks x.

  • For inner joins, it checks both x and y. In this case, unmatched is also allowed to be a character vector of length 2 to specify the behavior for x and y independently.

relationship

Handling of the expected relationship between the keys of x and y. If the expectations chosen from the list below are invalidated, an error is thrown.

  • NULL, the default, doesn't expect there to be any relationship between x and y. However, for equality joins it will check for a many-to-many relationship (which is typically unexpected) and will warn if one occurs, encouraging you to either take a closer look at your inputs or make this relationship explicit by specifying "many-to-many".

    See the Many-to-many relationships section for more details.

  • "one-to-one" expects:

    • Each row in x matches at most 1 row in y.

    • Each row in y matches at most 1 row in x.

  • "one-to-many" expects:

    • Each row in y matches at most 1 row in x.

  • "many-to-one" expects:

    • Each row in x matches at most 1 row in y.

  • "many-to-many" doesn't perform any relationship checks, but is provided to allow you to be explicit about this relationship if you know it exists.

relationship doesn't handle cases where there are zero matches. For that, see unmatched.
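
To make the effect of multiple and na_matches more concrete, here is a small sketch on made-up data (x and y below are hypothetical). It calls dplyr directly; the tidylog wrapper forwards these arguments.

```r
library(dplyr)

# Hypothetical example data: id 1 appears twice in y, and both tables hold an NA key
x <- tibble(id = c(1, 2, NA))
y <- tibble(id = c(1, 1, NA), val = c("a", "b", "c"))

# Default multiple = "all": the row with id 1 is returned once per match
all_matches <- left_join(x, y, by = "id")

# Keep only the first match in y for each row of x
first_match <- left_join(x, y, by = "id", multiple = "first")

# Treat NA keys as non-matching, so the NA row of x gets val = NA
no_na_matches <- left_join(x, y, by = "id", na_matches = "never")
```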

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::left_join()

See Also

dplyr::left_join()


Wrapper around dplyr::mutate that prints information about the operation

Description

Wrapper around dplyr::mutate() that prints information about the operation.

Usage

mutate(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::mutate

.by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.keep

Control which columns from .data are retained in the output. Grouping columns and columns created by ... are always kept.

  • "all" retains all columns from .data. This is the default.

  • "used" retains only the columns used in ... to create new columns. This is useful for checking your work, as it displays inputs and outputs side-by-side.

  • "unused" retains only the columns not used in ... to create new columns. This is useful if you generate new columns, but no longer need the columns used to generate them.

  • "none" doesn't retain any extra columns from .data. Only the grouping variables and columns created by ... are kept.

.before,.after

<tidy-select> Optionally, control where new columns should appear (the default is to add to the right hand side). See relocate() for more details.
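
The .keep and .before options can be compared on a tiny made-up data frame (df is hypothetical). The sketch calls dplyr directly; the tidylog wrapper passes the same arguments through.

```r
library(dplyr)

df <- tibble(a = 1:3, b = 4:6, c = 7:9)

used   <- mutate(df, total = a + b, .keep = "used")    # a, b, total
unused <- mutate(df, total = a + b, .keep = "unused")  # c, total
none   <- mutate(df, total = a + b, .keep = "none")    # total
first  <- mutate(df, total = a + b, .before = 1)       # total, a, b, c
```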

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::mutate()

See Also

dplyr::mutate()


Wrapper around dplyr::mutate_all that prints information about the operation

Description

Wrapper around dplyr::mutate_all() that prints information about the operation.

Usage

mutate_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::mutate_all

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.
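
A minimal sketch of .funs given as a plain function, on a made-up data frame (df is hypothetical). It calls dplyr directly; the tidylog wrapper takes the same arguments.

```r
library(dplyr)

df <- tibble(a = 1:2, b = c(0.5, 1.5))

# Apply as.character() to every column
chr <- mutate_all(df, as.character)
```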

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::mutate_all()

See Also

dplyr::mutate_all()


Wrapper around dplyr::mutate_at that prints information about the operation

Description

Wrapper around dplyr::mutate_at() that prints information about the operation.

Usage

mutate_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::mutate_at

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.
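
As an illustration of .vars and .funs, the sketch below doubles two selected columns of a made-up data frame (df is hypothetical). It calls dplyr directly; the tidylog wrapper takes the same arguments.

```r
library(dplyr)

df <- tibble(id = c("a", "b"), height = c(1.5, 2.5), weight = c(10, 20))

# .vars as a vars() selection, .funs as a lambda
doubled <- mutate_at(df, vars(height, weight), ~ . * 2)
```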

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::mutate_at()

See Also

dplyr::mutate_at()


Wrapper around dplyr::mutate_if that prints information about the operation

Description

Wrapper around dplyr::mutate_if() that prints information about the operation.

Usage

mutate_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::mutate_if

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.
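
A short sketch of .predicate in use, on a made-up data frame (df is hypothetical): only columns for which the predicate returns TRUE are transformed. It calls dplyr directly; the tidylog wrapper takes the same arguments.

```r
library(dplyr)

df <- tibble(name = c("a", "b"), score = c(1.234, 5.678), n = c(1L, 2L))

# Round only the double columns; name (character) and n (integer) are untouched
rounded <- mutate_if(df, is.double, ~ round(., 1))
```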

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::mutate_if()

See Also

dplyr::mutate_if()


Wrapper around tidyr::pivot_longer that prints information about the operation

Description

Wrapper around tidyr::pivot_longer() that prints information about the operation.

Usage

pivot_longer(data, ...)

Arguments

data

A data frame to pivot.

...

Arguments passed on to tidyr::pivot_longer

cols

<tidy-select> Columns to pivot into longer format.

cols_vary

When pivoting cols into longer format, how should the output rows be arranged relative to their original row number?

  • "fastest", the default, keeps individual rows from cols close together in the output. This often produces intuitively ordered output when you have at least one key column from data that is not involved in the pivoting process.

  • "slowest" keeps individual columns from cols close together in the output. This often produces intuitively ordered output when you utilize all of the columns from data in the pivoting process.

names_to

A character vector specifying the new column or columns to create from the information stored in the column names of data specified by cols.

  • If length 0, or if NULL is supplied, no columns will be created.

  • If length 1, a single column will be created which will contain the column names specified by cols.

  • If length >1, multiple columns will be created. In this case, one of names_sep or names_pattern must be supplied to specify how the column names should be split. There are also two additional character values you can take advantage of:

    • NA will discard the corresponding component of the column name.

    • ".value" indicates that the corresponding component of the column name defines the name of the output column containing the cell values, overriding values_to entirely.

names_prefix

A regular expression used to remove matching text from the start of each variable name.

names_sep,names_pattern

If names_to contains multiple values, these arguments control how the column name is broken up.

names_sep takes the same specification as separate(), and can either be a numeric vector (specifying positions to break on), or a single string (specifying a regular expression to split on).

names_pattern takes the same specification as extract(), a regular expression containing matching groups (()).

If these arguments do not give you enough control, use pivot_longer_spec() to create a spec object and process manually as needed.

names_ptypes,values_ptypes

Optionally, a list of column name-prototype pairs. Alternatively, a single empty prototype can be supplied, which will be applied to all columns. A prototype (or ptype for short) is a zero-length vector (like integer() or numeric()) that defines the type, class, and attributes of a vector. Use these arguments if you want to confirm that the created columns are the types that you expect. Note that if you want to change (instead of confirm) the types of specific columns, you should use names_transform or values_transform instead.

names_transform,values_transform

Optionally, a list of column name-function pairs. Alternatively, a single function can be supplied, which will be applied to all columns. Use these arguments if you need to change the types of specific columns. For example, names_transform = list(week = as.integer) would convert a character variable called week to an integer.

If not specified, the type of the columns generated from names_to will be character, and the type of the variables generated from values_to will be the common type of the input columns used to generate them.

names_repair

What happens if the output has invalid column names? The default, "check_unique", is to error if the columns are duplicated. Use "minimal" to allow duplicates in the output, or "unique" to de-duplicate by adding numeric suffixes. See vctrs::vec_as_names() for more options.

values_to

A string specifying the name of the column to create from the data stored in cell values. If names_to is a character containing the special .value sentinel, this value will be ignored, and the name of the value column will be derived from part of the existing column names.

values_drop_na

If TRUE, will drop rows that contain only NAs in the values_to column. This effectively converts explicit missing values to implicit missing values, and should generally be used only when missing values in data were created by its structure.
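
The interaction of names_to, names_sep, the ".value" sentinel, and names_transform is easier to see on a concrete case. The sketch below uses a made-up wide table (wide and its column names are hypothetical) and calls tidyr directly; the tidylog wrapper takes the same arguments.

```r
library(tidyr)

# Hypothetical wide data: one column per measure and year
wide <- tibble::tibble(
  id = 1:2,
  height_2020 = c(150, 160),
  height_2021 = c(152, 161),
  weight_2020 = c(50, 60),
  weight_2021 = c(51, 62)
)

# Split each column name at "_": the first part names the output value
# column (".value"), the second part goes into a new year column
long <- pivot_longer(
  wide,
  cols = -id,
  names_to = c(".value", "year"),
  names_sep = "_",
  names_transform = list(year = as.integer)
)
```

The result should have one row per id and year, with height and weight as separate value columns; values_to is not needed because ".value" overrides it.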

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::pivot_longer()

See Also

tidyr::pivot_longer()


Wrapper around tidyr::pivot_wider that prints information about the operation

Description

Wrapper around tidyr::pivot_wider() that prints information about the operation.

Usage

pivot_wider(data, ...)

Arguments

data

A data frame to pivot.

...

Arguments passed on to tidyr::pivot_wider

id_cols

<tidy-select> A set of columns that uniquely identify each observation. Typically used when you have redundant variables, i.e. variables whose values are perfectly correlated with existing variables.

Defaults to all columns in data except for the columns specified through names_from and values_from. If a tidyselect expression is supplied, it will be evaluated on data after removing the columns specified through names_from and values_from.

id_expand

Should the values in the id_cols columns be expanded by expand() before pivoting? This results in more rows; the output will contain a complete expansion of all possible values in id_cols. Implicit factor levels that aren't represented in the data will become explicit. Additionally, the row values corresponding to the expanded id_cols will be sorted.

names_from,values_from

<tidy-select> A pair of arguments describing which column (or columns) to get the name of the output column (names_from), and which column (or columns) to get the cell values from (values_from).

If values_from contains multiple columns, the name of each value column will be added to the front of the output column name.

names_prefix

String added to the start of every variable name. This is particularly useful if names_from is a numeric vector and you want to create syntactic variable names.

names_sep

If names_from or values_from contains multiple variables, this will be used to join their values together into a single string to use as a column name.

names_glue

Instead of names_sep and names_prefix, you can supply a glue specification that uses the names_from columns (and special .value) to create custom column names.

names_sort

Should the column names be sorted? If FALSE, the default, column names are ordered by first appearance.

names_vary

When names_from identifies a column (or columns) with multiple unique values, and multiple values_from columns are provided, in what order should the resulting column names be combined?

  • "fastest" varies names_from values fastest, resulting in a column naming scheme of the form: value1_name1, value1_name2, value2_name1, value2_name2. This is the default.

  • "slowest" varies names_from values slowest, resulting in a column naming scheme of the form: value1_name1, value2_name1, value1_name2, value2_name2.

names_expand

Should the values in the names_from columns be expanded by expand() before pivoting? This results in more columns; the output will contain column names corresponding to a complete expansion of all possible values in names_from. Implicit factor levels that aren't represented in the data will become explicit. Additionally, the column names will be sorted, identical to what names_sort would produce.

names_repair

What happens if the output has invalid column names? The default, "check_unique", is to error if the columns are duplicated. Use "minimal" to allow duplicates in the output, or "unique" to de-duplicate by adding numeric suffixes. See vctrs::vec_as_names() for more options.

values_fill

Optionally, a (scalar) value that specifies what each value should be filled in with when missing.

This can be a named list if you want to apply different fill values to different value columns.

values_fn

Optionally, a function applied to the value in each cell in the output. You will typically use this when the combination of id_cols and names_from columns does not uniquely identify an observation.

This can be a named list if you want to apply different aggregations to different values_from columns.

unused_fn

Optionally, a function applied to summarize the values from the unused columns (i.e. columns not identified by id_cols, names_from, or values_from).

The default drops all unused columns from the result.

This can be a named list if you want to apply different aggregations to different unused columns.

id_cols must be supplied for unused_fn to be useful, since otherwise all unspecified columns will be considered id_cols.

This is similar to grouping by the id_cols then summarizing the unused columns using unused_fn.
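
A minimal sketch of names_from, values_from, names_prefix, and values_fill working together, on a made-up long table (long and its columns are hypothetical). It calls tidyr directly; the tidylog wrapper takes the same arguments.

```r
library(tidyr)

# Hypothetical long data: id 2 has no "y" observation
long <- tibble::tibble(
  id = c(1, 1, 2),
  key = c("x", "y", "x"),
  value = c(10, 20, 30)
)

wide <- pivot_wider(
  long,
  names_from = key,
  values_from = value,
  names_prefix = "k_",   # output columns become k_x and k_y
  values_fill = 0        # the missing id 2 / "y" cell is filled with 0
)
```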

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::pivot_wider()

See Also

tidyr::pivot_wider()


Wrapper around dplyr::relocate that prints information about the operation

Description

Wrapper around dplyr::relocate() that prints information about the operation.

Usage

relocate(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::relocate

.before,.after

<tidy-select> Destination of columns selected by .... Supplying neither will move columns to the left-hand side; specifying both is an error.
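
The placement rules for .before and .after can be sketched on a made-up data frame (df is hypothetical). The calls use dplyr directly; the tidylog wrapper takes the same arguments.

```r
library(dplyr)

df <- tibble(a = 1, b = "x", c = 2)

to_front  <- relocate(df, c)                                   # c, a, b
after_c   <- relocate(df, a, .after = c)                       # b, c, a
chr_first <- relocate(df, where(is.character), .before = a)    # b, a, c
```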

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::relocate()

See Also

dplyr::relocate()


Wrapper around dplyr::rename that prints information about the operation

Description

Wrapper around dplyr::rename() that prints information about the operation.

Usage

rename(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::rename

.fn

A function used to transform the selected .cols. Should return a character vector the same length as the input.

.cols

<tidy-select> Columns to rename; defaults to all columns.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::rename()

See Also

dplyr::rename()


Wrapper around dplyr::rename_all that prints information about the operation

Description

Wrapper around dplyr::rename_all() that prints information about the operation.

Usage

rename_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::rename_all

.funs

A function fun, a purrr style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::rename_all()

See Also

dplyr::rename_all()


Wrapper around dplyr::rename_at that prints information about the operation

Description

Wrapper around dplyr::rename_at() that prints information about the operation.

Usage

rename_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::rename_at

.funs

A function fun, a purrr style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.
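
As a small illustration of .vars and .funs for renaming, the sketch below upper-cases a subset of column names in a made-up data frame (df is hypothetical). It calls dplyr directly; the tidylog wrapper takes the same arguments.

```r
library(dplyr)

df <- tibble(x_one = 1, x_two = 2, label = "a")

# Rename only the columns selected by vars()
renamed <- rename_at(df, vars(starts_with("x_")), toupper)
```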

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::rename_at()

See Also

dplyr::rename_at()


Wrapper around dplyr::rename_if that prints information about the operation

Description

Wrapper around dplyr::rename_if() that prints information about the operation.

Usage

rename_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::rename_if

.funs

A function fun, a purrr style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::rename_if()

See Also

dplyr::rename_if()


Wrapper around dplyr::rename_with that prints information about the operation

Description

Wrapper around dplyr::rename_with() that prints information about the operation.

Usage

rename_with(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::rename_with

.fn

A function used to transform the selected .cols. Should return a character vector the same length as the input.

.cols

<tidy-select> Columns to rename; defaults to all columns.
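
A minimal sketch of .fn and .cols, using a made-up data frame (df and its column names are hypothetical). It calls dplyr directly; the tidylog wrapper takes the same arguments.

```r
library(dplyr)

df <- tibble(First.Name = "Ann", Last.Name = "Lee", age = 30)

# .fn applied to all columns (the default for .cols)
lower <- rename_with(df, tolower)

# .fn applied only to the columns selected by .cols
partial <- rename_with(
  df,
  ~ gsub(".", "_", .x, fixed = TRUE),
  .cols = ends_with("Name")
)
```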

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::rename_with()

See Also

dplyr::rename_with()


Wrapper around tidyr::replace_na that prints information about the operation

Description

Wrapper around tidyr::replace_na() that prints information about the operation.

Usage

replace_na(data, ...)

Arguments

data

A data frame or vector.

...

Arguments passed on to tidyr::replace_na

replace

If data is a data frame, replace takes a named list of values, with one value for each column that has missing values to be replaced. Each value in replace will be cast to the type of the column in data that it is being used as a replacement in.

If data is a vector, replace takes a single value. This single value replaces all of the missing values in the vector. replace will be cast to the type of data.
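
Both forms of replace can be sketched as follows, on made-up inputs (df is hypothetical). The calls use tidyr directly; the tidylog wrapper takes the same arguments.

```r
library(tidyr)

# Data frame input: a named list with one replacement per column
df <- tibble::tibble(x = c(1, NA), y = c(NA, "b"))
filled <- replace_na(df, list(x = 0, y = "unknown"))

# Vector input: a single replacement value
vec <- replace_na(c(1, NA, 3), 0)
```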

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::replace_na()

See Also

tidyr::replace_na()


Wrapper around dplyr::right_join that prints information about the operation

Description

Wrapper around dplyr::right_join() that prints information about the operation.

Usage

right_join(x, y, by = NULL, ...)

Arguments

x, y

A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

by

A join specification created with join_by(), or a character vector of variables to join by.

If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join on different variables between x and y, use a join_by() specification. For example, join_by(a == b) will match x$a to y$b.

To join by multiple variables, use a join_by() specification with multiple expressions. For example, join_by(a == b, c == d) will match x$a to y$b and x$c to y$d. If the column names are the same between x and y, you can shorten this by listing only the variable names, like join_by(a, c).

join_by() can also be used to perform inequality, rolling, and overlap joins. See the documentation at ?join_by for details on these types of joins.

For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("a", "b") joins x$a to y$a and x$b to y$b. If variable names differ between x and y, use a named character vector like by = c("x_a" = "y_a", "x_b" = "y_b").

To perform a cross-join, generating all combinations of x and y, see cross_join().

...

Arguments passed on to dplyr::right_join

x,y

A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

keep

Should the join keys from both x and y be preserved in the output?

  • If NULL, the default, joins on equality retain only the keys from x, while joins on inequality retain the keys from both inputs.

  • If TRUE, all keys from both inputs are retained.

  • If FALSE, only keys from x are retained. For right and full joins, the data in key columns corresponding to rows that only exist in y are merged into the key columns from x. Can't be used when joining on inequality conditions.

na_matches

Should two NA or two NaN values match?

  • "na", the default, treats two NA or two NaN values as equal, like %in%, match(), and merge().

  • "never" treats two NA or two NaN values as different, and will never match them together or to any other values. This is similar to joins for database sources and to base::merge(incomparables = NA).

multiple

Handling of rows in x with multiple matches in y. For each row of x:

  • "all", the default, returns every match detected in y. This is the same behavior as SQL.

  • "any" returns one match detected in y, with no guarantees on which match will be returned. It is often faster than "first" and "last" if you just need to detect if there is at least one match.

  • "first" returns the first match detected in y.

  • "last" returns the last match detected in y.

unmatched

How should unmatched keys that would result in dropped rows be handled?

  • "drop" drops unmatched keys from the result.

  • "error" throws an error if unmatched keys are detected.

unmatched is intended to protect you from accidentally dropping rows during a join. It only checks for unmatched keys in the input that could potentially drop rows.

  • For left joins, it checks y.

  • For right joins, it checks x.

  • For inner joins, it checks both x and y. In this case, unmatched is also allowed to be a character vector of length 2 to specify the behavior for x and y independently.

relationship

Handling of the expected relationship between the keys of x and y. If the expectations chosen from the list below are invalidated, an error is thrown.

  • NULL, the default, doesn't expect there to be any relationship between x and y. However, for equality joins it will check for a many-to-many relationship (which is typically unexpected) and will warn if one occurs, encouraging you to either take a closer look at your inputs or make this relationship explicit by specifying "many-to-many".

    See the Many-to-many relationships section for more details.

  • "one-to-one" expects:

    • Each row in x matches at most 1 row in y.

    • Each row in y matches at most 1 row in x.

  • "one-to-many" expects:

    • Each row in y matches at most 1 row in x.

  • "many-to-one" expects:

    • Each row in x matches at most 1 row in y.

  • "many-to-many" doesn't perform any relationship checks, but is provided to allow you to be explicit about this relationship if you know it exists.

relationship doesn't handle cases where there are zero matches. For that, see unmatched.
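
The behavior of keep in a right join, where keys that exist only in y are merged into the key column from x, can be sketched on made-up tables (x and y below are hypothetical). The calls use dplyr directly; the tidylog wrapper forwards keep.

```r
library(dplyr)

# Hypothetical example data: id 3 exists only in y, id 1 only in x
x <- tibble(id = c(1, 2), a = c("x1", "x2"))
y <- tibble(id = c(2, 3), b = c("y2", "y3"))

# keep = NULL (default): a single id column that also carries y's key 3
merged <- right_join(x, y, by = "id")

# keep = TRUE: both key columns are retained as id.x and id.y
both <- right_join(x, y, by = "id", keep = TRUE)
```

In merged, the id column holds 2 and 3 even though 3 never occurs in x; in both, id.x is NA for that row while id.y holds 3.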

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::right_join()

See Also

dplyr::right_join()


Wrapper around dplyr::sample_frac that prints information about the operation

Description

Wrapper around dplyr::sample_frac() that prints information about the operation.

Usage

sample_frac(tbl, ...)

Arguments

tbl

A data.frame.

...

Arguments passed on to dplyr::sample_frac

size

<tidy-select> For sample_n(), the number of rows to select. For sample_frac(), the fraction of rows to select. If tbl is grouped, size applies to each group.

replace

Sample with or without replacement?

weight

<tidy-select> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.

.env

DEPRECATED.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::sample_frac()

See Also

dplyr::sample_frac()
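
Examples

A small sketch of size on ungrouped and grouped input, assuming tidylog and dplyr are installed. The data frame df and its columns are hypothetical.

```r
set.seed(1)  # sampling is random; the seed only makes the run repeatable

# Hypothetical data: 10 rows in two groups of 5.
df <- data.frame(id = 1:10, g = rep(c("a", "b"), each = 5))

# 40% of all rows.
overall <- tidylog::sample_frac(df, size = 0.4)

# On grouped input, size applies to each group: 40% of 5 rows per group.
per_group <- tidylog::sample_frac(dplyr::group_by(df, g), size = 0.4)
```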


Wrapper around dplyr::sample_n that prints information about the operation

Description

Wrapper around dplyr::sample_n() that prints information about the operation.

Usage

sample_n(tbl, ...)

Arguments

tbl

A data.frame.

...

Arguments passed on to dplyr::sample_n

size

<tidy-select> For sample_n(), the number of rows to select. For sample_frac(), the fraction of rows to select. If tbl is grouped, size applies to each group.

replace

Sample with or without replacement?

weight

<tidy-select> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.

.env

DEPRECATED.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::sample_n()

See Also

dplyr::sample_n()
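
Examples

A brief sketch of weight and replace, assuming tidylog is installed. The data frame and the weight column w are invented for illustration.

```r
set.seed(42)  # only for repeatability

# Hypothetical data: rows 1 and 2 have zero weight.
df <- data.frame(id = 1:5, w = c(0, 0, 1, 1, 2))

# Weighted sampling without replacement: zero-weight rows cannot be drawn.
weighted <- tidylog::sample_n(df, size = 3, weight = w)

# Sampling more rows than exist requires replace = TRUE.
resampled <- tidylog::sample_n(df, size = 8, replace = TRUE)
```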


Wrapper around dplyr::select that prints information about the operation

Description

Wrapper around dplyr::select() that prints information about the operation.

Usage

select(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

<tidy-select> One or more unquoted expressions separated by commas. Variable names can be used as if they were positions in the data frame, so expressions like x:y can be used to select a range of variables.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::select()

See Also

dplyr::select()
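
Examples

A minimal sketch of range selection with x:y, assuming tidylog is installed. The column names are hypothetical.

```r
# Hypothetical data frame with four columns.
df <- data.frame(id = 1:3, a = 4:6, b = 7:9, c = 10:12)

# Column names act like positions, so b:c selects every column from b to c.
subset_cols <- tidylog::select(df, id, b:c)

# A leading minus drops a column instead of keeping it.
without_id <- tidylog::select(df, -id)
```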


Wrapper around dplyr::select_all that prints information about the operation

Description

Wrapper around dplyr::select_all() that prints information about the operation.

Usage

select_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::select_all

.funs

A function fun, a purrr style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::select_all()

See Also

dplyr::select_all()
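
Examples

A short sketch of .funs as a renaming function, assuming tidylog is installed. The data frame is made up.

```r
# Hypothetical data frame with lower-case column names.
df <- data.frame(id = 1:2, score = c(0.5, 0.9))

# Every column is kept; .funs is applied to the column names.
upper <- tidylog::select_all(df, .funs = toupper)
```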


Wrapper around dplyr::select_at that prints information about the operation

Description

Wrapper around dplyr::select_at() that prints information about the operation.

Usage

select_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::select_at

.funs

A function fun, a purrr style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::select_at()

See Also

dplyr::select_at()
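
Examples

A short sketch of the two common forms of .vars, assuming tidylog and dplyr are installed. The data frame and its columns are hypothetical.

```r
# Hypothetical data frame.
df <- data.frame(id = 1:2, a = c(1, 2), b = c(3, 4))

# .vars as a vars() specification, with .funs renaming the selected columns.
renamed <- tidylog::select_at(df, .vars = dplyr::vars(a, b), .funs = toupper)

# .vars as a character vector of column names.
by_name <- tidylog::select_at(df, .vars = c("id", "b"))
```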


Wrapper around dplyr::select_if that prints information about the operation

Description

Wrapper around dplyr::select_if() that prints information about the operation.

Usage

select_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::select_if

.funs

A function fun, a purrr style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::select_if()

See Also

dplyr::select_if()
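
Examples

A short sketch of .predicate, assuming tidylog is installed. The data frame is made up for illustration.

```r
# Hypothetical data frame mixing numeric and character columns.
df <- data.frame(id = 1:2, label = c("x", "y"), value = c(1.5, 2.5))

# Keep only the columns for which the predicate returns TRUE.
numeric_cols <- tidylog::select_if(df, .predicate = is.numeric)
```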


Wrapper around dplyr::semi_join that prints information about the operation

Description

Wrapper around dplyr::semi_join() that prints information about the operation.

Usage

semi_join(x, y, by = NULL, ...)

Arguments

x, y

A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

by

A join specification created with join_by(), or a character vector of variables to join by.

If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.

To join on different variables between x and y, use a join_by() specification. For example, join_by(a == b) will match x$a to y$b.

To join by multiple variables, use a join_by() specification with multiple expressions. For example, join_by(a == b, c == d) will match x$a to y$b and x$c to y$d. If the column names are the same between x and y, you can shorten this by listing only the variable names, like join_by(a, c).

join_by() can also be used to perform inequality, rolling, and overlap joins. See the documentation at ?join_by for details on these types of joins.

For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("a", "b") joins x$a to y$a and x$b to y$b. If variable names differ between x and y, use a named character vector like by = c("x_a" = "y_a", "x_b" = "y_b").

To perform a cross-join, generating all combinations of x and y, see cross_join().

...

Arguments passed on to dplyr::semi_join

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

na_matches

Should two NA or two NaN values match?

  • "na", the default, treats two NA or two NaN values as equal, like %in%, match(), and merge().

  • "never" treats two NA or two NaN values as different, and will never match them together or to any other values. This is similar to joins for database sources and to base::merge(incomparables = NA).

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::semi_join()

See Also

dplyr::semi_join()
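
Examples

A minimal sketch of a named by vector and of na_matches, assuming tidylog is installed. The data frames x and y and their columns are hypothetical.

```r
# Hypothetical data: the key is called a in x and b in y, and both contain NA.
x <- data.frame(a = c(1, 2, NA), v = c("p", "q", "r"))
y <- data.frame(b = c(2, NA))

# Default na_matches = "na": the NA key in x matches the NA key in y.
kept_default <- tidylog::semi_join(x, y, by = c("a" = "b"))

# na_matches = "never": NA keys never match, so only a == 2 is kept.
kept_never <- tidylog::semi_join(x, y, by = c("a" = "b"), na_matches = "never")
```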


Wrapper around tidyr::separate_wider_delim that prints information about the operation

Description

Wrapper around tidyr::separate_wider_delim() that prints information about the operation.

Usage

separate_wider_delim(data, ...)

Arguments

data

A data frame.

...

Arguments passed on to tidyr::separate_wider_delim

cols

<tidy-select> Columns to separate.

delim

For separate_wider_delim(), a string giving the delimiter between values. By default, it is interpreted as a fixed string; use stringr::regex() and friends to split in other ways.

names

For separate_wider_delim(), a character vector of output column names. Use NA if there are components that you don't want to appear in the output; the number of non-NA elements determines the number of new columns in the result.

names_sep

If supplied, output names will be composed of the input column name followed by the separator followed by the new column name. Required when cols selects multiple columns.

For separate_wider_delim() you can specify names_sep instead of names, in which case the names will be generated from the source column name, names_sep, and a numeric suffix.

names_repair

Used to check that the output data frame has valid names. Must be one of the following options:

  • "minimal": no name repair or checks, beyond basic existence,

  • "unique": make sure names are unique and not empty,

  • "check_unique": (the default), no name repair, but check they are unique,

  • "universal": make the names unique and syntactic

  • a function: apply custom name repair.

  • tidyr_legacy: use the name repair from tidyr 0.8.

  • a formula: a purrr-style anonymous function (see rlang::as_function())

See vctrs::vec_as_names() for more details on these terms and the strategies used to enforce them.

too_few

What should happen if a value separates into too few pieces?

  • "error", the default, will throw an error.

  • "debug" adds additional columns to the output to help you locate and resolve the underlying problem. This option is intended to help you debug and address the issue, and should not generally remain in your final code.

  • "align_start" aligns starts of short matches, adding NA on the end to pad to the correct length.

  • "align_end" (separate_wider_delim() only) aligns the ends of short matches, adding NA at the start to pad to the correct length.

too_many

What should happen if a value separates into too many pieces?

  • "error", the default, will throw an error.

  • "debug" will add additional columns to the output to help you locate and resolve the underlying problem.

  • "drop" will silently drop any extra pieces.

  • "merge" (separate_wider_delim() only) will merge together any additional pieces.

cols_remove

Should the input cols be removed from the output? Always FALSE if too_few or too_many are set to "debug".

widths

A named numeric vector where the names become column names, and the values specify the column width. Unnamed components will match, but not be included in the output.

patterns

A named character vector where the names become column names and the values are regular expressions that match the contents of the vector. Unnamed components will match, but not be included in the output.

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::separate_wider_delim()

See Also

tidyr::separate_wider_delim()
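
Examples

A minimal sketch of delim, names, and too_few, assuming tidylog and tidyr are installed. The data frame and the output column names are invented for illustration.

```r
# Hypothetical data: the second value has only two pieces.
df <- data.frame(id = 1:3, code = c("a-1-x", "b-2", "c-3-z"))

# Split on a fixed "-" delimiter; short values are padded with NA at the end.
wide <- tidylog::separate_wider_delim(
  df,
  cols = code,
  delim = "-",
  names = c("letter", "number", "suffix"),
  too_few = "align_start"
)
```

Without too_few, the default "error" would stop on the second row.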


Wrapper around tidyr::separate_wider_position that prints information about the operation

Description

Wrapper around tidyr::separate_wider_position() that prints information about the operation.

Usage

separate_wider_position(data, ...)

Arguments

data

A data frame.

...

Arguments passed on to tidyr::separate_wider_position

cols

<tidy-select> Columns to separate.

delim

For separate_wider_delim(), a string giving the delimiter between values. By default, it is interpreted as a fixed string; use stringr::regex() and friends to split in other ways.

names

For separate_wider_delim(), a character vector of output column names. Use NA if there are components that you don't want to appear in the output; the number of non-NA elements determines the number of new columns in the result.

names_sep

If supplied, output names will be composed of the input column name followed by the separator followed by the new column name. Required when cols selects multiple columns.

For separate_wider_delim() you can specify names_sep instead of names, in which case the names will be generated from the source column name, names_sep, and a numeric suffix.

names_repair

Used to check that the output data frame has valid names. Must be one of the following options:

  • "minimal": no name repair or checks, beyond basic existence,

  • "unique": make sure names are unique and not empty,

  • "check_unique": (the default), no name repair, but check they are unique,

  • "universal": make the names unique and syntactic

  • a function: apply custom name repair.

  • tidyr_legacy: use the name repair from tidyr 0.8.

  • a formula: a purrr-style anonymous function (see rlang::as_function())

See vctrs::vec_as_names() for more details on these terms and the strategies used to enforce them.

too_few

What should happen if a value separates into too few pieces?

  • "error", the default, will throw an error.

  • "debug" adds additional columns to the output to help you locate and resolve the underlying problem. This option is intended to help you debug and address the issue, and should not generally remain in your final code.

  • "align_start" aligns starts of short matches, adding NA on the end to pad to the correct length.

  • "align_end" (separate_wider_delim() only) aligns the ends of short matches, adding NA at the start to pad to the correct length.

too_many

What should happen if a value separates into too many pieces?

  • "error", the default, will throw an error.

  • "debug" will add additional columns to the output to help you locate and resolve the underlying problem.

  • "drop" will silently drop any extra pieces.

  • "merge" (separate_wider_delim() only) will merge together any additional pieces.

cols_remove

Should the input cols be removed from the output? Always FALSE if too_few or too_many are set to "debug".

widths

A named numeric vector where the names become column names, and the values specify the column width. Unnamed components will match, but not be included in the output.

patterns

A named character vector where the names become column names and the values are regular expressions that match the contents of the vector. Unnamed components will match, but not be included in the output.

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::separate_wider_position()

See Also

tidyr::separate_wider_position()
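
Examples

A minimal sketch of widths, including an unnamed component, assuming tidylog and tidyr are installed. The fixed-width codes and the column names are hypothetical.

```r
# Hypothetical fixed-width codes: 4-character year, 2-character country, 2-character sequence.
df <- data.frame(code = c("2024US01", "2023DE17"))

# The unnamed width of 2 consumes the country characters but produces no column.
parts <- tidylog::separate_wider_position(
  df,
  cols = code,
  widths = c(year = 4, 2, seq = 2)
)
```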


Wrapper around tidyr::separate_wider_regex that prints information about the operation

Description

Wrapper around tidyr::separate_wider_regex() that prints information about the operation.

Usage

separate_wider_regex(data, ...)

Arguments

data

A data frame.

...

Arguments passed on to tidyr::separate_wider_regex

cols

<tidy-select> Columns to separate.

delim

For separate_wider_delim(), a string giving the delimiter between values. By default, it is interpreted as a fixed string; use stringr::regex() and friends to split in other ways.

names

For separate_wider_delim(), a character vector of output column names. Use NA if there are components that you don't want to appear in the output; the number of non-NA elements determines the number of new columns in the result.

names_sep

If supplied, output names will be composed of the input column name followed by the separator followed by the new column name. Required when cols selects multiple columns.

For separate_wider_delim() you can specify names_sep instead of names, in which case the names will be generated from the source column name, names_sep, and a numeric suffix.

names_repair

Used to check that the output data frame has valid names. Must be one of the following options:

  • "minimal": no name repair or checks, beyond basic existence,

  • "unique": make sure names are unique and not empty,

  • "check_unique": (the default), no name repair, but check they are unique,

  • "universal": make the names unique and syntactic

  • a function: apply custom name repair.

  • tidyr_legacy: use the name repair from tidyr 0.8.

  • a formula: a purrr-style anonymous function (see rlang::as_function())

See vctrs::vec_as_names() for more details on these terms and the strategies used to enforce them.

too_few

What should happen if a value separates into too few pieces?

  • "error", the default, will throw an error.

  • "debug" adds additional columns to the output to help you locate and resolve the underlying problem. This option is intended to help you debug and address the issue, and should not generally remain in your final code.

  • "align_start" aligns starts of short matches, adding NA on the end to pad to the correct length.

  • "align_end" (separate_wider_delim() only) aligns the ends of short matches, adding NA at the start to pad to the correct length.

too_many

What should happen if a value separates into too many pieces?

  • "error", the default, will throw an error.

  • "debug" will add additional columns to the output to help you locate and resolve the underlying problem.

  • "drop" will silently drop any extra pieces.

  • "merge" (separate_wider_delim() only) will merge together any additional pieces.

cols_remove

Should the input cols be removed from the output? Always FALSE if too_few or too_many are set to "debug".

widths

A named numeric vector where the names become column names, and the values specify the column width. Unnamed components will match, but not be included in the output.

patterns

A named character vector where the names become column names and the values are regular expressions that match the contents of the vector. Unnamed components will match, but not be included in the output.

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::separate_wider_regex()

See Also

tidyr::separate_wider_regex()
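
Examples

A minimal sketch of patterns, assuming tidylog and tidyr are installed. The input values and the column names letter and number are made up.

```r
# Hypothetical data: a letter part and a numeric part joined by "-".
df <- data.frame(x = c("a-1", "b-22"))

# Named patterns become columns; the unnamed "-" must match but is dropped.
parts <- tidylog::separate_wider_regex(
  df,
  cols = x,
  patterns = c(letter = "[a-z]+", "-", number = "[0-9]+")
)
```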


Wrapper around dplyr::slice that prints information about the operation

Description

Wrapper around dplyr::slice() that prints information about the operation.

Usage

slice(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::slice

.by,by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.preserve

Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.

n,prop

Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.

A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.

order_by

<data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.

with_ties

Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.

na_rm

Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop.

weight_by

<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. See the Details section for more technical details regarding these weights.

replace

Should sampling be performed with (TRUE) or without (FALSE, the default) replacement.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::slice()

See Also

dplyr::slice()
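
Examples

A short sketch of row positions and of .by, assuming tidylog is installed. The data frame is hypothetical.

```r
# Hypothetical data: three rows in group "a", two in group "b".
df <- data.frame(g = c("a", "a", "a", "b", "b"), v = 1:5)

# Rows 1 and 2 of the whole data frame.
first_two <- tidylog::slice(df, 1:2)

# Row 1 within each group, grouping only for this call.
first_per_group <- tidylog::slice(df, 1, .by = g)
```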


Wrapper around dplyr::slice_head that prints information about the operation

Description

Wrapper around dplyr::slice_head() that prints information about the operation.

Usage

slice_head(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::slice_head

.by,by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.preserve

Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.

n,prop

Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.

A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.

order_by

<data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.

with_ties

Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.

na_rm

Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop.

weight_by

<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. See the Details section for more technical details regarding these weights.

replace

Should sampling be performed with (TRUE) or without (FALSE, the default) replacement.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::slice_head()

See Also

dplyr::slice_head()
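
Examples

A short sketch of positive and negative n, assuming tidylog is installed. The five-row data frame mirrors the arithmetic in the n,prop description and is otherwise made up.

```r
# Hypothetical data frame with 5 rows.
df <- data.frame(id = 1:5)

# Positive n keeps the first n rows.
first_two <- tidylog::slice_head(df, n = 2)

# Negative n is subtracted from the group size: 5 - 2 = 3 rows are kept.
all_but_last_two <- tidylog::slice_head(df, n = -2)
```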


Wrapper around dplyr::slice_max that prints information about the operation

Description

Wrapper around dplyr::slice_max() that prints information about the operation.

Usage

slice_max(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::slice_max

.by,by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.preserve

Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.

n,prop

Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.

A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.

order_by

<data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.

with_ties

Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.

na_rm

Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop.

weight_by

<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. See the Details section for more technical details regarding these weights.

replace

Should sampling be performed with (TRUE) or without (FALSE, the default) replacement.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::slice_max()

See Also

dplyr::slice_max()
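
Examples

A short sketch of with_ties, assuming tidylog is installed. The scores are invented so that two rows tie for second place.

```r
# Hypothetical scores with a tie at 9.
df <- data.frame(name = c("a", "b", "c", "d"), score = c(10, 9, 9, 7))

# Default with_ties = TRUE: both rows tied for second place are returned.
with_ties <- tidylog::slice_max(df, order_by = score, n = 2)

# with_ties = FALSE returns exactly n rows.
exact <- tidylog::slice_max(df, order_by = score, n = 2, with_ties = FALSE)
```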


Wrapper around dplyr::slice_min that prints information about the operation

Description

Wrapper around dplyr::slice_min() that prints information about the operation.

Usage

slice_min(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::slice_min

.by,by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.preserve

Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.

n,prop

Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.

A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.

order_by

<data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.

with_ties

Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.

na_rm

Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop.

weight_by

<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. See the Details section for more technical details regarding these weights.

replace

Should sampling be performed with (TRUE) or without (FALSE, the default) replacement.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::slice_min()

See Also

dplyr::slice_min()
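
Examples

A short sketch of na_rm, assuming tidylog is installed. The data frame is hypothetical; with_ties = FALSE is used so that exactly n rows are requested.

```r
# Hypothetical data: two of the four values are missing.
df <- data.frame(id = 1:4, v = c(3, NA, 1, NA))

# na_rm = FALSE (the default): NA sorts last and is used only to reach n.
padded <- tidylog::slice_min(df, order_by = v, n = 3, with_ties = FALSE)

# na_rm = TRUE: missing values are removed, so fewer than n rows come back.
complete <- tidylog::slice_min(df, order_by = v, n = 3, with_ties = FALSE,
                               na_rm = TRUE)
```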


Wrapper around dplyr::slice_sample that prints information about the operation

Description

Wrapper around dplyr::slice_sample() that prints information about the operation.

Usage

slice_sample(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::slice_sample

.by,by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.preserve

Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.

n,prop

Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.

A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.

order_by

<data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.

with_ties

Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.

na_rm

Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop.

weight_by

<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. See the Details section for more technical details regarding these weights.

replace

Should sampling be performed with (TRUE) or without (FALSE, the default) replacement.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::slice_sample()

See Also

dplyr::slice_sample()
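
Examples

A short sketch of weight_by and replace, assuming tidylog is installed. The data frame and the weight column w are made up.

```r
set.seed(123)  # only for repeatability

# Hypothetical data: rows 1 and 2 carry zero weight.
df <- data.frame(id = 1:4, w = c(0, 0, 1, 3))

# Weighted sampling without replacement: only rows 3 and 4 can be drawn.
weighted <- tidylog::slice_sample(df, n = 2, weight_by = w)

# Drawing more rows than exist requires replace = TRUE.
resampled <- tidylog::slice_sample(df, n = 10, replace = TRUE)
```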


Wrapper around dplyr::slice_tail that prints information about the operation

Description

Wrapper around dplyr::slice_tail() that prints information about the operation.

Usage

slice_tail(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::slice_tail

.by,by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.preserve

Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.

n,prop

Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.

A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.

order_by

<data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.

with_ties

Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.

na_rm

Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop.

weight_by

<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. See the Details section for more technical details regarding these weights.

replace

Should sampling be performed with (TRUE) or without (FALSE, the default) replacement.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::slice_tail()

See Also

dplyr::slice_tail()
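
Examples

A short sketch of prop combined with by, assuming tidylog is installed. The data frame is hypothetical.

```r
# Hypothetical data: four rows in group "a", two in group "b".
df <- data.frame(g = rep(c("a", "b"), times = c(4, 2)), v = 1:6)

# The last half of each group: 2 of 4 rows for "a", 1 of 2 rows for "b".
last_half <- tidylog::slice_tail(df, prop = 0.5, by = g)
```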


Wrapper around tidyr::spread that prints information about the operation

Description

Wrapper around tidyr::spread() that prints information about the operation.

Usage

spread(data, ...)

Arguments

data

A data frame.

...

Arguments passed on to tidyr::spread

key,value

<tidy-select> Columns to use for key and value.

fill

If set, missing values will be replaced with this value. Note that there are two types of missingness in the input: explicit missing values (i.e. NA), and implicit missings, rows that simply aren't present. Both types of missing value will be replaced by fill.

convert

If TRUE, type.convert() with asis = TRUE will be run on each of the new columns. This is useful if the value column was a mix of variables that was coerced to a string. If the class of the value column was factor or date, note that will not be true of the new columns that are produced, which are coerced to character before type conversion.

drop

If FALSE, will keep factor levels that don't appear in the data, filling in missing combinations with fill.

sep

If NULL, the column names will be taken from the values of the key variable. If non-NULL, the column names will be given by "<key_name><sep><key_value>".

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::spread()

See Also

tidyr::spread()
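
Examples

A minimal sketch of the fill and sep arguments, assuming tidyr and tidylog are installed. The data frame and its column names (id, metric, amount) are invented for illustration.

```r
library(tidyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical long-format data: id 2 has no "b" row (an implicit missing value)
long <- data.frame(
  id = c(1, 1, 2),
  metric = c("a", "b", "a"),
  amount = c(10, 20, 30)
)

# fill = 0 replaces the missing id 2 / "b" combination with 0
wide <- spread(long, metric, amount, fill = 0)

# A non-NULL sep builds names of the form <key_name><sep><key_value>
wide_sep <- spread(long, metric, amount, sep = "_")

names(wide)
names(wide_sep)
```

Without fill, the missing combination is left as NA, which is what the second call produces for id 2.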


Wrapper around dplyr::summarise that prints information about the operation

Description

Wrapper around dplyr::summarise() that prints information about the operation.

Usage

summarise(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::summarise

.by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.groups

[Experimental] Grouping structure of the result.

  • "drop_last": drops the last level of grouping. This was the only supported option before version 1.0.0.

  • "drop": All levels of grouping are dropped.

  • "keep": Same grouping structure as .data.

  • "rowwise": Each row is its own group.

When .groups is not specified, it is set to "drop_last" for a grouped data frame, and "keep" for a rowwise data frame. In addition, a message informs you of how the result will be grouped unless the result is ungrouped, the option "dplyr.summarise.inform" is set to FALSE, or when summarise() is called from a function in a package.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::summarise()

See Also

dplyr::summarise()
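
Examples

The effect of .groups and .by on the grouping of the result can be sketched as follows. The data frame is hypothetical, and the code assumes dplyr and tidylog are installed. group_vars() is dplyr's helper for inspecting the grouping variables.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data with two grouping columns
df <- data.frame(
  g1 = c("a", "a", "b", "b"),
  g2 = c("x", "y", "x", "x"),
  v = 1:4
)
by_both <- group_by(df, g1, g2)

# "drop_last" removes the last grouping level (g2), leaving g1
res_last <- summarise(by_both, total = sum(v), .groups = "drop_last")

# "drop" returns an ungrouped result
res_drop <- summarise(by_both, total = sum(v), .groups = "drop")

# "keep" retains both grouping levels
res_keep <- summarise(by_both, total = sum(v), .groups = "keep")

# .by groups for this one call only; the result is ungrouped
res_by <- summarise(df, total = sum(v), .by = g1)

group_vars(res_last)
group_vars(res_keep)
```

Setting .groups explicitly also avoids the informational message about the grouping of the result mentioned above.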


Wrapper around dplyr::summarise_all that prints information about the operation

Description

Wrapper around dplyr::summarise_all() that prints information about the operation.

Usage

summarise_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::summarise_all

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::summarise_all()

See Also

dplyr::summarise_all()
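
Examples

The three forms accepted by .funs (a function, a lambda, or a list of either) might look like this in practice. This is a sketch on an invented data frame, assuming dplyr and tidylog are installed. The list names lo and hi are arbitrary.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical all-numeric data frame
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# A bare function
means <- summarise_all(df, mean)

# A quosure-style lambda
maxes <- summarise_all(df, ~ max(.))

# A named list; the names are appended to the column names
both <- summarise_all(df, list(lo = min, hi = max))

names(both)
```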


Wrapper around dplyr::summarise_at that prints information about the operation

Description

Wrapper around dplyr::summarise_at() that prints information about the operation.

Usage

summarise_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::summarise_at

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::summarise_at()

See Also

dplyr::summarise_at()
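
Examples

The .vars argument accepts several equivalent forms. The sketch below selects the same two columns with vars(), with a character vector, and with column positions. The data frame is hypothetical, and the code assumes dplyr and tidylog are installed.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data with two numeric columns and one character column
df <- data.frame(
  a = c(1, 2, 3),
  b = c(10, 20, 30),
  label = c("x", "y", "z")
)

via_vars  <- summarise_at(df, vars(a, b), mean)   # columns generated by vars()
via_names <- summarise_at(df, c("a", "b"), mean)  # character vector of names
via_pos   <- summarise_at(df, 1:2, mean)          # numeric column positions

via_vars
```

All three calls summarise columns a and b and leave label out of the result.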


Wrapper around dplyr::summarise_if that prints information about the operation

Description

Wrapper around dplyr::summarise_if() that prints information about the operation.

Usage

summarise_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::summarise_if

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::summarise_if()

See Also

dplyr::summarise_if()
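
Examples

A short sketch of the forms .predicate can take: a predicate function, a lambda, or a logical vector with one entry per column. The data frame is invented for illustration, and the code assumes dplyr and tidylog are installed.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data with two numeric columns and one character column
df <- data.frame(
  a = c(1, 2, 3),
  b = c(10, 20, 30),
  label = c("x", "y", "z")
)

# Predicate function: keep columns for which is.numeric() returns TRUE
by_fun <- summarise_if(df, is.numeric, mean)

# The same predicate written as a quosure-style lambda
by_lambda <- summarise_if(df, ~ is.numeric(.), mean)

# A logical vector, one element per column of df
by_lgl <- summarise_if(df, c(TRUE, TRUE, FALSE), mean)

by_fun
```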


Wrapper around dplyr::summarize that prints information about the operation

Description

Wrapper around dplyr::summarize() that prints information about the operation.

Usage

summarize(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

Arguments passed on to dplyr::summarize

.by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.groups

[Experimental] Grouping structure of the result.

  • "drop_last": drops the last level of grouping. This was the only supported option before version 1.0.0.

  • "drop": All levels of grouping are dropped.

  • "keep": Same grouping structure as .data.

  • "rowwise": Each row is its own group.

When .groups is not specified, it is set to "drop_last" for a grouped data frame, and "keep" for a rowwise data frame. In addition, a message informs you of how the result will be grouped unless the result is ungrouped, the option "dplyr.summarise.inform" is set to FALSE, or when summarise() is called from a function in a package.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::summarize()

See Also

dplyr::summarize()


Wrapper around dplyr::summarize_all that prints information about the operation

Description

Wrapper around dplyr::summarize_all() that prints information about the operation.

Usage

summarize_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::summarize_all

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::summarize_all()

See Also

dplyr::summarize_all()


Wrapper around dplyr::summarize_at that prints information about the operation

Description

Wrapper around dplyr::summarize_at() that prints information about the operation.

Usage

summarize_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::summarize_at

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::summarize_at()

See Also

dplyr::summarize_at()


Wrapper around dplyr::summarize_if that prints information about the operation

Description

Wrapper around dplyr::summarize_if() that prints information about the operation.

Usage

summarize_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::summarize_if

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::summarize_if()

See Also

dplyr::summarize_if()


Wrapper around dplyr::tally that prints information about the operation

Description

Wrapper around dplyr::tally() that prints information about the operation.

Usage

tally(x, ...)

Arguments

x

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

...

Arguments passed on to dplyr::tally

wt

<data-masking> Frequency weights. Can be NULL or a variable:

  • If NULL (the default), counts the number of rows in each group.

  • If a variable, computes sum(wt) for each group.

sort

If TRUE, will show the largest groups at the top.

name

The name of the new column in the output.

If omitted, it will default to n. If there's already a column called n, it will use nn. If there's a column called n and nn, it'll use nnn, and so on, adding ns until it gets a new name.

.drop

Handling of factor levels that don't appear in the data, passed on to group_by().

For count(): if FALSE will include counts for empty groups (i.e. for levels of factors that don't exist in the data).

[Defunct] For add_count(): defunct since it can't actually affect the output.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::tally()

See Also

dplyr::tally()
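
Examples

The difference between counting rows and summing a weight variable can be sketched as below. The data frame and the column name rows are made up for illustration. The code assumes dplyr and tidylog are installed.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data: group "a" has two rows, group "b" has one
df <- data.frame(g = c("a", "a", "b"), w = c(2, 3, 5))
by_g <- group_by(df, g)

# wt = NULL (the default): number of rows per group
counts <- tally(by_g)

# wt = w: sum(w) per group
weighted <- tally(by_g, wt = w)

# name sets the name of the count column
renamed <- tally(df, name = "rows")

counts$n
weighted$n
```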


Outputs some information about the data frame/tbl

Description

Outputs some information about the data frame/tbl.

Usage

tidylog(.data)

Arguments

.data

A tbl or data frame.

Value

The same object as .data.

Examples

tidylog(mtcars)
#> tidylog: data.frame with 32 rows and 11 columns

Wrapper around dplyr::top_frac that prints information about the operation

Description

Wrapper around dplyr::top_frac() that prints information about the operation.

Usage

top_frac(x, ...)

Arguments

x

A data frame.

...

Arguments passed on to dplyr::top_frac

n

Number of rows to return for top_n(), fraction of rows to return for top_frac(). If n is positive, selects the top rows. If negative, selects the bottom rows. If x is grouped, this is the number (or fraction) of rows per group. Will include more rows if there are ties.

wt

(Optional) The variable to use for ordering. If not specified, defaults to the last variable in the tbl.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::top_frac()

See Also

dplyr::top_frac()
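
Examples

For top_frac(), n is a fraction of the rows rather than a count. The following sketch uses an invented four-row data frame and assumes dplyr and tidylog are installed.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data frame with four rows
df <- data.frame(x = c(5, 1, 3, 4))

# Positive fraction: 0.5 * 4 = 2 rows with the largest x
top_half <- top_frac(df, 0.5, x)

# Negative fraction: 0.25 * 4 = 1 row with the smallest x
bottom_quarter <- top_frac(df, -0.25, x)

top_half$x
bottom_quarter$x
```

Rows come back in their original order, since the selection works like a filter rather than a sort.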


Wrapper around dplyr::top_n that prints information about the operation

Description

Wrapper around dplyr::top_n() that prints information about the operation.

Usage

top_n(x, ...)

Arguments

x

A data frame.

...

Arguments passed on to dplyr::top_n

n

Number of rows to return for top_n(), fraction of rows to return for top_frac(). If n is positive, selects the top rows. If negative, selects the bottom rows. If x is grouped, this is the number (or fraction) of rows per group. Will include more rows if there are ties.

wt

(Optional) The variable to use for ordering. If not specified, defaults to the last variable in the tbl.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::top_n()

See Also

dplyr::top_n()
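
Examples

A minimal sketch of how the sign of n and the presence of ties affect the result. The data frame is hypothetical, and the code assumes dplyr and tidylog are installed.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data: the value 4 appears twice
df <- data.frame(x = c(5, 1, 3, 4, 4))

# Positive n selects the top rows; the tie on 4 means three rows are returned
top_two <- top_n(df, 2, x)

# Negative n selects the bottom rows
bottom_one <- top_n(df, -1, x)

top_two$x
bottom_one$x
```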


Wrapper around dplyr::transmute that prints information about the operation

Description

Wrapper around dplyr::transmute() that prints information about the operation.

Usage

transmute(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

...

<data-masking> Name-value pairs. The name gives the name of the column in the output.

The value can be:

  • A vector of length 1, which will be recycled to the correct length.

  • A vector the same length as the current group (or the whole data frame if ungrouped).

  • NULL, to remove the column.

  • A data frame or tibble, to create multiple columns in the output.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::transmute()

See Also

dplyr::transmute()
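
Examples

The name-value pairs described under Arguments can be illustrated with a small sketch. The data frame and the new column names (total, flag) are assumptions for the example. The code assumes dplyr and tidylog are installed.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical input data
df <- data.frame(x = 1:3, y = c(10, 20, 30))

out <- transmute(
  df,
  total = x + y,  # a vector the same length as the data
  flag = TRUE     # a length-1 vector, recycled to every row
)

names(out)
```

Only the columns named in the call are kept, so x and y do not appear in out.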


Wrapper around dplyr::transmute_all that prints information about the operation

Description

Wrapper around dplyr::transmute_all() that prints information about the operation.

Usage

transmute_all(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::transmute_all

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::transmute_all()

See Also

dplyr::transmute_all()


Wrapper around dplyr::transmute_at that prints information about the operation

Description

Wrapper around dplyr::transmute_at() that prints information about the operation.

Usage

transmute_at(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::transmute_at

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::transmute_at()

See Also

dplyr::transmute_at()
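
Examples

A brief sketch of transmute_at() with a lambda passed as .funs. The data frame is invented for illustration, and the code assumes dplyr and tidylog are installed.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data with two numeric columns and one character column
df <- data.frame(a = 1:3, b = 4:6, label = c("x", "y", "z"))

# Double the columns selected with vars(); unselected columns are dropped
doubled <- transmute_at(df, vars(a, b), ~ . * 2)

names(doubled)
```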


Wrapper around dplyr::transmute_if that prints information about the operation

Description

Wrapper around dplyr::transmute_if() that prints information about the operation.

Usage

transmute_if(.tbl, ...)

Arguments

.tbl

A tbl object.

...

Arguments passed on to dplyr::transmute_if

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::transmute_if()

See Also

dplyr::transmute_if()


Wrapper around tidyr::uncount that prints information about the operation

Description

Wrapper around tidyr::uncount() that prints information about the operation.

Usage

uncount(data, ...)

Arguments

data

A data frame, tibble, or grouped tibble.

...

Arguments passed on to tidyr::uncount

weights

A vector of weights. Evaluated in the context of data; supports quasiquotation.

.remove

If TRUE, and weights is the name of a column in data, then this column is removed.

.id

Supply a string to create a new variable which gives a unique identifier for each created row.

Details

Documentation generated from tidyr version 1.3.2.

Value

See tidyr::uncount()

See Also

tidyr::uncount()
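
Examples

The weights, .remove, and .id arguments can be sketched as follows. The data frame is hypothetical, and the code assumes tidyr and tidylog are installed.

```r
library(tidyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data: column n says how often each row should be repeated
df <- data.frame(x = c("a", "b"), n = c(1, 3))

# Each row is repeated n times; the weights column is removed by default
expanded <- uncount(df, n)

# .remove = FALSE keeps the weights column
kept <- uncount(df, n, .remove = FALSE)

# .id adds an identifier numbering the copies of each original row
with_id <- uncount(df, n, .id = "id")

expanded$x
with_id$id
```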


Wrapper around dplyr::ungroup that prints information about the operation

Description

Wrapper around dplyr::ungroup() that prints information about the operation.

Usage

ungroup(x, ...)

Arguments

x

A tbl()

...

Arguments passed on to dplyr::ungroup

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.

.add

When FALSE, the default, group_by() will override existing groups. To add to the existing groups, use .add = TRUE.

.drop

Drop groups formed by factor levels that don't appear in the data? The default is TRUE except when .data has been previously grouped with .drop = FALSE. See group_by_drop_default() for details.

Details

Documentation generated from dplyr version 1.2.1.

Value

See dplyr::ungroup()

See Also

dplyr::ungroup()
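
Examples

A minimal sketch of removing all grouping, or only one grouping variable. The data frame is made up for illustration, and the code assumes dplyr and tidylog are installed. group_vars() is dplyr's helper for inspecting the grouping variables.

```r
library(dplyr)
library(tidylog, warn.conflicts = FALSE)

# Hypothetical data grouped by two columns
df <- data.frame(g1 = c("a", "a", "b"), g2 = c("x", "y", "x"), v = 1:3)
grouped <- group_by(df, g1, g2)

# Without further arguments, all grouping is removed
flat <- ungroup(grouped)

# Naming a column removes only that grouping variable
partial <- ungroup(grouped, g2)

group_vars(flat)
group_vars(partial)
```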