Benchmarks

tidylog adds a small overhead to each function call. For instance, tidylog::filter has to work out how many rows were dropped, so it is a bit slower than calling dplyr::filter directly. The overhead is usually not noticeable, but it can become so on larger datasets, especially with joins. The benchmarks below give an impression of how large the overhead is.
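
To make the source of that overhead concrete, here is a minimal base-R sketch of a logging filter. The function `logged_filter` and its `keep` argument are hypothetical names used only for illustration; this is not tidylog's actual implementation, only the kind of extra bookkeeping a wrapper has to do on top of the filter itself.

```r
# Hypothetical wrapper (an assumption for illustration, not tidylog's code):
# run the filter, then compare row counts to report what was dropped.
logged_filter <- function(df, keep) {
    n_before <- nrow(df)
    out <- df[keep, , drop = FALSE]      # the actual filtering step
    n_dropped <- n_before - nrow(out)    # extra work done only for the log
    message(sprintf(
        "filter: removed %d rows (%.0f%%), %d rows remaining",
        n_dropped, 100 * n_dropped / n_before, nrow(out)
    ))
    out
}

small <- logged_filter(mtcars, mtcars$cyl == 4)
```

The counting and message formatting cost roughly the same regardless of the data, which is one plausible reason a fixed overhead matters more for very fast calls.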

library("dplyr")
library("tidylog", warn.conflicts = FALSE)
library("bench")
library("knitr")

filter

On a small dataset:

bench::mark(
    dplyr::filter(mtcars, cyl == 4),
    tidylog::filter(mtcars, cyl == 4), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::filter(mtcars, cyl == 4) | 617.17µs | 638.58µs | 98 |
| tidylog::filter(mtcars, cyl == 4) | 1.54ms | 1.58ms | 95 |

On a larger dataset:

df <- tibble(x = rnorm(100000))

bench::mark(
    dplyr::filter(df, x > 0),
    tidylog::filter(df, x > 0), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::filter(df, x > 0) | 1.09ms | 1.14ms | 96 |
| tidylog::filter(df, x > 0) | 2.21ms | 2.27ms | 95 |

mutate

On a small dataset:

bench::mark(
    dplyr::mutate(mtcars, cyl = as.factor(cyl)),
    tidylog::mutate(mtcars, cyl = as.factor(cyl)), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::mutate(mtcars, cyl = as.factor(cyl)) | 726.12µs | 760.75µs | 98 |
| tidylog::mutate(mtcars, cyl = as.factor(cyl)) | 1.83ms | 1.88ms | 94 |

On a larger dataset:

df <- tibble(x = round(runif(10000) * 10))

bench::mark(
    dplyr::mutate(df, x = as.factor(x)),
    tidylog::mutate(df, x = as.factor(x)), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::mutate(df, x = as.factor(x)) | 5.07ms | 5.14ms | 98 |
| tidylog::mutate(df, x = as.factor(x)) | 6.11ms | 6.19ms | 95 |

joins

Joins are the most expensive operation, as tidylog has to do two additional joins behind the scenes.
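
To report how many rows matched, a logging wrapper needs information that the join result alone does not carry, such as which rows of each input had no partner. The following is a minimal base-R sketch of that bookkeeping, using `merge` and `%in%` as stand-ins for the extra joins. The data frames `x` and `y` and the variable names are illustrative assumptions, and this is not tidylog's actual implementation.

```r
# Illustrative inputs (assumed for this sketch)
x <- data.frame(
    name = c("Mick", "John", "Paul"),
    band = c("Stones", "Beatles", "Beatles")
)
y <- data.frame(
    name = c("John", "Paul", "Keith"),
    plays = c("guitar", "bass", "guitar")
)

# The join the user asked for (merge() performs an inner join by default)
joined <- merge(x, y, by = "name")

# Extra passes over the keys, needed only for the log message
only_in_x <- sum(!x$name %in% y$name)   # rows of x without a match in y
only_in_y <- sum(!y$name %in% x$name)   # rows of y without a match in x

message(sprintf(
    "inner_join: rows only in x: %d, rows only in y: %d, matched rows: %d",
    only_in_x, only_in_y, nrow(joined)
))
```

Each of those extra passes touches the key columns again, so the added cost grows with the size of the inputs rather than staying fixed.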

On a small dataset:

bench::mark(
    dplyr::inner_join(band_members, band_instruments, by = "name"),
    tidylog::inner_join(band_members, band_instruments, by = "name"), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::inner_join(band_members, band_instruments, by = "name") | 935.66µs | 967.36µs | 97 |
| tidylog::inner_join(band_members, band_instruments, by = "name") | 6.64ms | 6.79ms | 82 |

On a larger dataset (with many row duplications):

N <- 1000
df1 <- tibble(x1 = rnorm(N), key = round(runif(N) * 10))
df2 <- tibble(x2 = rnorm(N), key = round(runif(N) * 10))

bench::mark(
    dplyr::inner_join(df1, df2, by = "key"),
    tidylog::inner_join(df1, df2, by = "key"), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
#> Warning in dplyr::inner_join(df1, df2, by = "key"): Detected an unexpected many-to-many relationship between `x` and `y`.
#> ℹ Row 1 of `x` matches multiple rows in `y`.
#> ℹ Row 3 of `y` matches multiple rows in `x`.
#> ℹ If a many-to-many relationship is expected, set `relationship =
#>   "many-to-many"` to silence this warning.

| expression | min | median | n_itr |
|:---|---:|---:|---:|
| dplyr::inner_join(df1, df2, by = "key") | 11.77ms | 12.12ms | 83 |
| tidylog::inner_join(df1, df2, by = "key") | 7.59ms | 7.74ms | 87 |
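
To compare the runs side by side, the median timings reported above can be collected and turned into an absolute difference and a ratio. This is only a convenience sketch: the data frame `medians` and its column names are made up for illustration, and the numbers are copied by hand from the tables above (converted to milliseconds), so they reflect one machine and one run.

```r
# Median timings copied from the benchmark tables above, in milliseconds
medians <- data.frame(
    benchmark = c(
        "filter, small", "filter, large",
        "mutate, small", "mutate, large",
        "inner_join, small", "inner_join, large"
    ),
    dplyr_ms   = c(0.63858, 1.14, 0.76075, 5.14, 0.96736, 12.12),
    tidylog_ms = c(1.58, 2.27, 1.88, 6.19, 6.79, 7.74)
)

# Absolute and relative difference between tidylog and dplyr
medians$overhead_ms <- medians$tidylog_ms - medians$dplyr_ms
medians$ratio <- round(medians$tidylog_ms / medians$dplyr_ms, 2)

print(medians)
```

In these particular numbers the last row has a ratio below 1, that is, the tidylog call was measured as faster than the dplyr call in the run that emitted the many-to-many warnings.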