Benchmarks

Using tidylog adds a small overhead to each function call. For instance, because tidylog needs to work out how many rows were dropped when you call tidylog::filter, that call will be a bit slower than calling dplyr::filter directly. The overhead is usually not noticeable, but it can become noticeable on larger datasets, especially with joins. The benchmarks below give an impression of how large the overhead is.
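To make the source of this overhead concrete, here is a minimal sketch of a logging wrapper around dplyr::filter. The name `logged_filter` is hypothetical and this is not tidylog's actual implementation; it only illustrates the kind of extra work (counting rows before and after the call, then formatting a message) that such a wrapper has to do every time it is called.

```r
library(dplyr)

# Hypothetical wrapper, for illustration only (not tidylog's implementation).
logged_filter <- function(.data, ...) {
    n_before <- nrow(.data)
    out <- dplyr::filter(.data, ...)
    # Extra work that a plain dplyr::filter call does not need to do.
    n_dropped <- n_before - nrow(out)
    message(sprintf(
        "filter: removed %d rows (%.0f%%), %d rows remaining",
        n_dropped, 100 * n_dropped / n_before, nrow(out)
    ))
    out
}

four_cyl <- logged_filter(mtcars, cyl == 4)
#> filter: removed 21 rows (66%), 11 rows remaining
```

The bookkeeping itself is cheap, which is consistent with the overhead being small in absolute terms for filter.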

library("dplyr")
library("tidylog", warn.conflicts = FALSE)
library("bench")
library("knitr")

filter

On a small dataset:

bench::mark(
    dplyr::filter(mtcars, cyl == 4),
    tidylog::filter(mtcars, cyl == 4),
    iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()

| expression                        |      min |   median | n_itr |
|:----------------------------------|---------:|---------:|------:|
| dplyr::filter(mtcars, cyl == 4)   | 524.98µs | 569.89µs |    98 |
| tidylog::filter(mtcars, cyl == 4) |   1.52ms |   1.57ms |    96 |

On a larger dataset:

df <- tibble(x = rnorm(100000))

bench::mark(
    dplyr::filter(df, x > 0),
    tidylog::filter(df, x > 0),
    iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()

| expression                 |    min | median | n_itr |
|:---------------------------|-------:|-------:|------:|
| dplyr::filter(df, x > 0)   | 1.12ms | 1.17ms |    96 |
| tidylog::filter(df, x > 0) | 2.29ms | 2.35ms |    96 |

mutate

On a small dataset:

bench::mark(
    dplyr::mutate(mtcars, cyl = as.factor(cyl)),
    tidylog::mutate(mtcars, cyl = as.factor(cyl)),
    iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
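
| expression                                    |      min | median | n_itr |
|:----------------------------------------------|---------:|-------:|------:|
| dplyr::mutate(mtcars, cyl = as.factor(cyl))   | 666.85µs |  725µs |    98 |
| tidylog::mutate(mtcars, cyl = as.factor(cyl)) |   1.91ms |    2ms |    94 |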

On a larger dataset:

df <- tibble(x = round(runif(10000) * 10))

bench::mark(
    dplyr::mutate(df, x = as.factor(x)),
    tidylog::mutate(df, x = as.factor(x)),
    iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()

| expression                            |    min | median | n_itr |
|:--------------------------------------|-------:|-------:|------:|
| dplyr::mutate(df, x = as.factor(x))   | 5.25ms |  5.3ms |    98 |
| tidylog::mutate(df, x = as.factor(x)) | 6.39ms | 6.49ms |    95 |

joins

Joins are the most expensive operation, as tidylog has to do two additional joins behind the scenes.
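As a rough illustration of where extra joins could come from: to report how many rows matched and how many appeared in only one of the inputs, a wrapper needs information that the join result alone does not contain. The sketch below obtains it with two anti-joins. `logged_inner_join` is a hypothetical name, and the approach is an assumption made for illustration, not necessarily how tidylog does it.

```r
library(dplyr)

# Hypothetical wrapper, for illustration only.
# Assumes the key columns have the same names in x and y.
logged_inner_join <- function(x, y, by) {
    out <- dplyr::inner_join(x, y, by = by)
    # Two additional joins, needed only to build the log message.
    only_in_x <- nrow(dplyr::anti_join(x, y, by = by))
    only_in_y <- nrow(dplyr::anti_join(y, x, by = by))
    message(sprintf(
        "inner_join: %d rows only in x, %d rows only in y, %d rows in result",
        only_in_x, only_in_y, nrow(out)
    ))
    out
}

joined <- logged_inner_join(band_members, band_instruments, by = "name")
#> inner_join: 1 rows only in x, 1 rows only in y, 2 rows in result
```

Each logged call here runs three joins instead of one, so the cost of the wrapper grows with the cost of the join itself rather than staying fixed.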

On a small dataset:

bench::mark(
    dplyr::inner_join(band_members, band_instruments, by = "name"),
    tidylog::inner_join(band_members, band_instruments, by = "name"),
    iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
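
| expression                                                       |      min | median | n_itr |
|:-----------------------------------------------------------------|---------:|-------:|------:|
| dplyr::inner_join(band_members, band_instruments, by = "name")   | 927.92µs |    1ms |    96 |
| tidylog::inner_join(band_members, band_instruments, by = "name") |   7.04ms | 7.42ms |    78 |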

On a larger dataset (where duplicated keys produce many repeated rows):

N <- 1000
df1 <- tibble(x1 = rnorm(N), key = round(runif(N) * 10))
df2 <- tibble(x2 = rnorm(N), key = round(runif(N) * 10))

bench::mark(
    dplyr::inner_join(df1, df2, by = "key"),
    tidylog::inner_join(df1, df2, by = "key"),
    iterations = 100
) %>%
    dplyr::select(expression, min, median, n_itr) %>%
    kable()
#> Warning in dplyr::inner_join(df1, df2, by = "key"): Detected an unexpected many-to-many relationship between `x` and `y`.
#> ℹ Row 1 of `x` matches multiple rows in `y`.
#> ℹ Row 72 of `y` matches multiple rows in `x`.
#> ℹ If a many-to-many relationship is expected, set `relationship =
#>   "many-to-many"` to silence this warning.
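
| expression                                |     min |  median | n_itr |
|:------------------------------------------|--------:|--------:|------:|
| dplyr::inner_join(df1, df2, by = "key")   | 12.63ms | 13.01ms |    76 |
| tidylog::inner_join(df1, df2, by = "key") |  7.94ms |  8.22ms |    90 |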
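
To compare the results more directly, the median timings reported in this section can be put side by side. The sketch below hardcodes those medians (converted to milliseconds) and computes the tidylog-to-dplyr ratio and the absolute difference for each benchmark. The data frame and column names are made up for this illustration, and the numbers apply only to these particular runs.

```r
# Median timings copied from the benchmark results in this section,
# converted to milliseconds.
medians <- data.frame(
    benchmark = c("filter, small", "filter, large",
                  "mutate, small", "mutate, large",
                  "inner_join, small", "inner_join, large"),
    dplyr_ms   = c(0.56989, 1.17, 0.725, 5.30, 1.00, 13.01),
    tidylog_ms = c(1.57, 2.35, 2.00, 6.49, 7.42, 8.22)
)

# Ratio above 1 means the tidylog call had the higher median.
medians$ratio <- round(medians$tidylog_ms / medians$dplyr_ms, 2)
# Absolute overhead in milliseconds.
medians$difference_ms <- round(medians$tidylog_ms - medians$dplyr_ms, 2)

print(medians)
```

In these runs the absolute difference for filter and mutate stays between roughly 1 ms and 1.3 ms regardless of dataset size, so the ratio shrinks as the underlying operation gets slower. The small join shows the largest ratio (about 7.4), while the larger join comes out below 1 in this run.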