Using tidylog adds a small overhead to each function call. For instance, because tidylog needs to figure out how many rows were dropped when you use tidylog::filter, this call will be a bit slower than using dplyr::filter directly. The overhead is usually not noticeable, but it can be for larger datasets, especially when using joins.
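To make the source of this overhead concrete, here is a minimal sketch of a logging wrapper around dplyr::filter. The function `logged_filter` is hypothetical and is not how tidylog is implemented; it only illustrates that reporting dropped rows means doing extra work around the underlying call.

```r
# Hypothetical wrapper, not tidylog's actual implementation:
# run the real filter, then compare row counts to report what was dropped.
logged_filter <- function(.data, ...) {
  n_before <- nrow(.data)
  result <- dplyr::filter(.data, ...)
  n_dropped <- n_before - nrow(result)
  message(sprintf(
    "filter: removed %d rows (%.0f%%), %d rows remaining",
    n_dropped, 100 * n_dropped / n_before, nrow(result)
  ))
  result
}

four_cyl <- logged_filter(mtcars, cyl == 4)
```

Everything apart from the dplyr::filter call itself (the row counts, the string formatting, the message) is cost that a plain dplyr call does not pay.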
The benchmarks below give some impression of how large the overhead is.
Filtering a small dataset:
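library(dplyr)  # provides %>%, tibble(), and the band_members / band_instruments data
library(knitr)  # provides kable()
bench::mark(
  dplyr::filter(mtcars, cyl == 4),
  tidylog::filter(mtcars, cyl == 4),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()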
expression | min | median | n_itr |
---|---|---|---|
dplyr::filter(mtcars, cyl == 4) | 617.17µs | 638.58µs | 98 |
tidylog::filter(mtcars, cyl == 4) | 1.54ms | 1.58ms | 95 |
Filtering a larger dataset:
df <- tibble(x = rnorm(100000))
bench::mark(
  dplyr::filter(df, x > 0),
  tidylog::filter(df, x > 0),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()
expression | min | median | n_itr |
---|---|---|---|
dplyr::filter(df, x > 0) | 1.09ms | 1.14ms | 96 |
tidylog::filter(df, x > 0) | 2.21ms | 2.27ms | 95 |
Mutating a small dataset:
bench::mark(
  dplyr::mutate(mtcars, cyl = as.factor(cyl)),
  tidylog::mutate(mtcars, cyl = as.factor(cyl)),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()
expression | min | median | n_itr |
---|---|---|---|
dplyr::mutate(mtcars, cyl = as.factor(cyl)) | 726.12µs | 760.75µs | 98 |
tidylog::mutate(mtcars, cyl = as.factor(cyl)) | 1.83ms | 1.88ms | 94 |
Mutating a larger dataset:
df <- tibble(x = round(runif(10000) * 10))
bench::mark(
  dplyr::mutate(df, x = as.factor(x)),
  tidylog::mutate(df, x = as.factor(x)),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()
expression | min | median | n_itr |
---|---|---|---|
dplyr::mutate(df, x = as.factor(x)) | 5.07ms | 5.14ms | 98 |
tidylog::mutate(df, x = as.factor(x)) | 6.11ms | 6.19ms | 95 |
Joins are the most expensive operation, as tidylog has to do two additional joins behind the scenes.
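As a rough illustration of where two extra joins could come from, the sketch below counts unmatched rows on each side with two anti-joins. Which joins tidylog actually runs is not stated here, so the choice of dplyr::anti_join and the wording of the message are assumptions made for illustration only.

```r
library(dplyr)  # also provides the band_members and band_instruments example data

x <- band_members
y <- band_instruments

joined <- inner_join(x, y, by = "name")

# Hypothetical bookkeeping, not tidylog's actual code:
# two extra joins reveal which rows had no partner on the other side.
only_in_x <- anti_join(x, y, by = "name")
only_in_y <- anti_join(y, x, by = "name")

message(sprintf(
  "inner_join: %d rows only in x, %d rows only in y, %d rows in result",
  nrow(only_in_x), nrow(only_in_y), nrow(joined)
))
```

Each of the extra joins has to match keys across both tables again, which is why a logged join can cost several times as much as the plain one on small inputs.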
Joining two small datasets:
bench::mark(
  dplyr::inner_join(band_members, band_instruments, by = "name"),
  tidylog::inner_join(band_members, band_instruments, by = "name"),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()
expression | min | median | n_itr |
---|---|---|---|
dplyr::inner_join(band_members, band_instruments, by = "name") | 935.66µs | 967.36µs | 97 |
tidylog::inner_join(band_members, band_instruments, by = "name") | 6.64ms | 6.79ms | 82 |
On a larger dataset (with many row duplications):
N <- 1000
df1 <- tibble(x1 = rnorm(N), key = round(runif(N) * 10))
df2 <- tibble(x2 = rnorm(N), key = round(runif(N) * 10))
bench::mark(
  dplyr::inner_join(df1, df2, by = "key"),
  tidylog::inner_join(df1, df2, by = "key"),
  iterations = 100
) %>%
  dplyr::select(expression, min, median, n_itr) %>%
  kable()
#> Warning in dplyr::inner_join(df1, df2, by = "key"): Detected an unexpected many-to-many relationship between `x` and `y`.
#> ℹ Row 1 of `x` matches multiple rows in `y`.
#> ℹ Row 3 of `y` matches multiple rows in `x`.
#> ℹ If a many-to-many relationship is expected, set `relationship =
#> "many-to-many"` to silence this warning.
expression | min | median | n_itr |
---|---|---|---|
dplyr::inner_join(df1, df2, by = "key") | 11.77ms | 12.12ms | 83 |
tidylog::inner_join(df1, df2, by = "key") | 7.59ms | 7.74ms | 87 |
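The many-to-many warning in the output above comes from dplyr, which suggests setting `relationship = "many-to-many"` when duplicated keys on both sides are expected. A minimal sketch of that, assuming dplyr 1.1.0 or later (where the `relationship` argument is available); the seed is added here only to make the example reproducible and is not part of the original benchmark.

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only to make the sketch reproducible
N <- 1000
df1 <- tibble(x1 = rnorm(N), key = round(runif(N) * 10))
df2 <- tibble(x2 = rnorm(N), key = round(runif(N) * 10))

# Declaring the relationship up front tells dplyr that duplicated keys
# on both sides are expected, so the warning is not raised.
joined <- inner_join(df1, df2, by = "key", relationship = "many-to-many")
nrow(joined)
```

With only 11 distinct key values across 1,000 rows per table, every key matches many rows on both sides, so the result is far larger than either input.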