Benchmarks

tidylog adds a small overhead to each function call. For instance, tidylog::filter has to work out how many rows were dropped, so it is a bit slower than calling dplyr::filter directly. The overhead is usually not noticeable, but it can become so on larger datasets, especially with joins. The benchmarks below give an impression of how large it is.
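To see where this kind of overhead comes from, here is a minimal sketch of a logging wrapper around dplyr::filter. The function name `logged_filter` and the message format are hypothetical and chosen for illustration only; this is not tidylog's actual implementation, just the general idea of running the verb and then comparing row counts.

```r
library(dplyr)

# Hypothetical wrapper, NOT tidylog's actual implementation:
# run the dplyr verb, then compare row counts before and after.
logged_filter <- function(.data, ...) {
    result <- dplyr::filter(.data, ...)
    n_before <- nrow(.data)
    n_dropped <- n_before - nrow(result)
    message(sprintf(
        "filter: removed %d rows (%.0f%%), %d rows remaining",
        n_dropped, 100 * n_dropped / n_before, nrow(result)
    ))
    result
}

# The extra bookkeeping and the message are the added cost per call.
small <- logged_filter(mtcars, cyl == 4)
```

Everything after the call to dplyr::filter in this sketch is work that the plain dplyr call does not do, which is why a wrapped call cannot be faster than the underlying verb on its own.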

library("dplyr")
library("tidylog", warn.conflicts = FALSE)
library("bench")
library("knitr")

filter

On a small dataset:

bench::mark(
    dplyr::filter(mtcars, cyl == 4),
    tidylog::filter(mtcars, cyl == 4), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
| expression | min | median | n_itr |
|---|---|---|---|
| dplyr::filter(mtcars, cyl == 4) | 564.29µs | 613.08µs | 98 |
| tidylog::filter(mtcars, cyl == 4) | 1.58ms | 1.66ms | 96 |

On a larger dataset:

df <- tibble(x = rnorm(100000))

bench::mark(
    dplyr::filter(df, x > 0),
    tidylog::filter(df, x > 0), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
| expression | min | median | n_itr |
|---|---|---|---|
| dplyr::filter(df, x > 0) | 1.12ms | 2.12ms | 96 |
| tidylog::filter(df, x > 0) | 2.33ms | 3ms | 95 |

mutate

On a small dataset:

bench::mark(
    dplyr::mutate(mtcars, cyl = as.factor(cyl)),
    tidylog::mutate(mtcars, cyl = as.factor(cyl)), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
| expression | min | median | n_itr |
|---|---|---|---|
| dplyr::mutate(mtcars, cyl = as.factor(cyl)) | 678.09µs | 762.64µs | 98 |
| tidylog::mutate(mtcars, cyl = as.factor(cyl)) | 1.99ms | 2.08ms | 94 |

On a larger dataset:

df <- tibble(x = round(runif(10000) * 10))

bench::mark(
    dplyr::mutate(df, x = as.factor(x)),
    tidylog::mutate(df, x = as.factor(x)), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
| expression | min | median | n_itr |
|---|---|---|---|
| dplyr::mutate(df, x = as.factor(x)) | 5.28ms | 5.54ms | 98 |
| tidylog::mutate(df, x = as.factor(x)) | 6.56ms | 6.89ms | 95 |

joins

Joins are the most expensive operation, as tidylog has to do two additional joins behind the scenes.
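As a rough illustration of why extra joins are needed, the sketch below counts the rows of each input that have no match in the other by running two anti-joins next to the real join. The function name `logged_inner_join` and the choice of anti_join are assumptions made for this example; tidylog's internals may work differently.

```r
library(dplyr)

# Hypothetical sketch, NOT tidylog's actual code: two extra anti-joins
# count the rows of each input that have no match in the other.
logged_inner_join <- function(x, y, by) {
    result <- dplyr::inner_join(x, y, by = by)
    only_in_x <- nrow(dplyr::anti_join(x, y, by = by))
    only_in_y <- nrow(dplyr::anti_join(y, x, by = by))
    message(sprintf(
        "inner_join: %d rows only in x, %d rows only in y, %d rows in result",
        only_in_x, only_in_y, nrow(result)
    ))
    result
}

# band_members and band_instruments ship with dplyr.
joined <- logged_inner_join(band_members, band_instruments, by = "name")
```

In this sketch each logged join touches both inputs three times instead of once, so the relative cost is highest when the inputs are tiny and the fixed cost per join dominates.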

On a small dataset:

bench::mark(
    dplyr::inner_join(band_members, band_instruments, by = "name"),
    tidylog::inner_join(band_members, band_instruments, by = "name"), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
| expression | min | median | n_itr |
|---|---|---|---|
| dplyr::inner_join(band_members, band_instruments, by = "name") | 975.68µs | 1.06ms | 96 |
| tidylog::inner_join(band_members, band_instruments, by = "name") | 7.54ms | 8.27ms | 77 |

On a larger dataset (with many row duplications):

N <- 1000
df1 <- tibble(x1 = rnorm(N), key = round(runif(N) * 10))
df2 <- tibble(x2 = rnorm(N), key = round(runif(N) * 10))

bench::mark(
    dplyr::inner_join(df1, df2, by = "key"),
    tidylog::inner_join(df1, df2, by = "key"), iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
#> Warning in dplyr::inner_join(df1, df2, by = "key"): Detected an unexpected many-to-many relationship between `x` and `y`.
#> ℹ Row 1 of `x` matches multiple rows in `y`.
#> ℹ Row 8 of `y` matches multiple rows in `x`.
#> ℹ If a many-to-many relationship is expected, set `relationship =
#>   "many-to-many"` to silence this warning.
| expression | min | median | n_itr |
|---|---|---|---|
| dplyr::inner_join(df1, df2, by = "key") | 13.71ms | 15.24ms | 74 |
| tidylog::inner_join(df1, df2, by = "key") | 8.18ms | 8.72ms | 86 |
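
The warning printed above suggests setting `relationship = "many-to-many"` when the row duplication is intended. A minimal sketch of that, using dplyr only and assuming dplyr 1.1.0 or later (where the `relationship` argument is available); the `set.seed()` call is an addition for reproducibility and is not part of the benchmark above:

```r
library(dplyr)

set.seed(1)  # added for reproducibility; not part of the original benchmark
N <- 1000
df1 <- tibble(x1 = rnorm(N), key = round(runif(N) * 10))
df2 <- tibble(x2 = rnorm(N), key = round(runif(N) * 10))

# Declaring the relationship tells dplyr the duplication is expected,
# so the many-to-many warning is not raised.
joined <- dplyr::inner_join(df1, df2, by = "key", relationship = "many-to-many")
nrow(joined)
```

Because every key value appears many times in both inputs, the result has far more rows than either input.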