Fix NestedLoopJoin performance regression #12531

Merged

Conversation

@alihan-synnada alihan-synnada commented Sep 19, 2024

Which issue does this PR close?

Closes #12528.

Rationale for this change

Iterating over the right rows instead of the left while calling build_join_indices increased the number of calls to apply_join_filter_to_indices whenever the right table has more rows, which covers most common use cases (for example, with a 25-row build side and an 8192-row right batch, the filter would be applied 8192 times instead of once). Building all the indices inside build_join_indices reduces the number of calls to apply_join_filter_to_indices.

The same indices are created inside build_join_indices every time, using an expensive operation, so it makes sense to cache them.

What changes are included in this PR?

This PR builds the indices in one go inside build_join_indices, removing the outer iteration, and reduces the number of calls to apply_join_filter_to_indices to 1 per join_left_and_right_batch call.

Also adds caching for join indices.
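
For illustration, a minimal sketch of what the one-pass construction could look like (a hypothetical helper, not the PR's exact code; all names here are assumptions):

```rust
use arrow::array::{UInt32Array, UInt64Array};

// Hypothetical sketch: materialize the full cross-product indices in one
// pass, so the filter runs once per (left batch, right batch) pair.
fn build_all_indices(
    left_row_count: usize,
    right_row_count: usize,
) -> (UInt64Array, UInt32Array) {
    let capacity = left_row_count * right_row_count;
    // Left indices: 0..left_row_count, repeated right_row_count times.
    let mut left_indices = Vec::with_capacity(capacity);
    // Right indices: each right row index, repeated left_row_count times.
    let mut right_indices = Vec::with_capacity(capacity);
    for right_index in 0..right_row_count {
        for left_index in 0..left_row_count {
            left_indices.push(left_index as u64);
            right_indices.push(right_index as u32);
        }
    }
    (
        UInt64Array::from(left_indices),
        UInt32Array::from(right_indices),
    )
}
```

Because these arrays depend only on the two row counts, they can be cached and reused across calls, which is what the caching change exploits.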

Are these changes tested?

Tested with the following query using TPC-H data (SF=1):

```sql
EXPLAIN ANALYZE SELECT count(1) FROM nation n JOIN lineitem li ON n.n_nationkey < li.l_orderkey;
```

|                   | join time     | percent change |
|-------------------|---------------|----------------|
| before regression | 9.421251914s  | 0%             |
| after regression  | 51.073303629s | +442.11%       |
| fix without cache | 10.450086252s | +10.92%        |
| fix with cache    | 7.390388154s  | -21.56%        |

Are there any user-facing changes?

Only performance changes.

@github-actions github-actions bot added the physical-expr Physical Expressions label Sep 19, 2024
@alihan-synnada alihan-synnada changed the title Optimize apply_join_filter_to_indices calls Fix NestedLoopJoin performance regression Sep 19, 2024
```rust
let capacity = left_row_count * right_row_count;

// Left indices are 0..left_row_count repeated right_row_count times
let mut left_indices_builder = UInt64Array::builder(capacity);
```
Contributor:

Using Vec to build the indices is slightly faster and has some nicer syntax.
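
A sketch of the Vec-based construction being suggested (hypothetical; `left_row_count` and `right_row_count` come from the surrounding diff):

```rust
// Hypothetical: collect the repeated 0..left_row_count sequence into a Vec,
// then convert it into an arrow array without going through a builder.
let left_indices: UInt64Array = (0..right_row_count)
    .flat_map(|_| 0..left_row_count as u64)
    .collect::<Vec<u64>>()
    .into();
```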

```rust
// Right indices are each right row index repeated left_row_count times
let mut right_indices_builder = UInt32Array::builder(capacity);
for right_index in 0..right_row_count {
    right_indices_builder.extend(vec![Some(right_index as u32); left_row_count])
}
```
Contributor:

We should avoid this intermediate Vec

Contributor Author:

Thanks for the heads up
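
One way to avoid the intermediate Vec is to extend the builder straight from an iterator (a sketch, relying on the same PrimitiveBuilder Extend impl the diff above already uses):

```rust
use std::iter::repeat;
use arrow::array::UInt32Array;

// Hypothetical sketch: each right row index repeated left_row_count times,
// fed to the builder without allocating a temporary Vec per right row.
fn right_indices(left_row_count: usize, right_row_count: usize) -> UInt32Array {
    let mut builder = UInt32Array::builder(left_row_count * right_row_count);
    for right_index in 0..right_row_count {
        builder.extend(repeat(Some(right_index as u32)).take(left_row_count));
    }
    builder.finish()
}
```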

@ozankabak (Contributor):

/benchmark

@ozankabak ozankabak marked this pull request as ready for review September 19, 2024 15:07

@ozankabak ozankabak left a comment

LGTM but let's get some more eyes on this.

The accidental regression made us look into index calculations more closely, enabling us to optimize the code relative to how it was before the regression-inducing PR.

datafusion/physical-plan/src/joins/nested_loop_join.rs (review thread resolved)
@berkaysynnada (Contributor):

cc @korowa, @comphead

@korowa korowa left a comment

LGTM overall -- I have some doubts about whether caching is really required here -- we can discuss it.

Also, I've noticed that this PR increases the size of intermediate batches, which seems acceptable for now, as long as NLJ is allowed to emit massive batches (significantly larger than the configured batch size) as its output -- it works as before, and there is an issue for fixing this.


```rust
// We always use the same indices before applying the filter, so we can cache them
let (left_indices_cache, right_indices_cache) = indices_cache;
let cached_output_row_count = left_indices_cache.len();
```
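
A guess at the control flow around this excerpt (hypothetical, not the PR's exact code; it assumes the cache grows monotonically and is sliced per batch, and reuses the hypothetical build_all_indices helper sketched earlier):

```rust
// Hypothetical: rebuild the cache only when this batch pair needs more output
// rows than are cached; otherwise take zero-copy slices of the cached arrays.
let output_row_count = left_row_count * right_row_count;
if output_row_count > cached_output_row_count {
    let (left, right) = build_all_indices(left_row_count, right_row_count);
    *left_indices_cache = left;
    *right_indices_cache = right;
}
let left_indices = left_indices_cache.slice(0, output_row_count);
let right_indices = right_indices_cache.slice(0, output_row_count);
```

Slicing a longer cached array is valid here because the NLJ build side is fixed, so a prefix of the cached pattern is exactly the pattern for a smaller right batch.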
Contributor:

In the case of a 25-row build side, these cached arrays hold ~200k entries; for 500 rows, ~4 million, and so on (I suppose we don't need that much data on the build side for these arrays to reach GBs in size).

I understand that we will still have to create intermediate batches to apply the filter and produce output batches, but I suppose that, starting from some point, the size of these caches will become meaningful.

@alihan-synnada alihan-synnada Sep 20, 2024

I guess we can do away with the cache or make it optional. If we remove the cache, we could create the indices and apply the filter in chunks, similar to before. If we pass in a range to calculate the indices for, instead of creating right_batch.num_rows() chunks, we can control the size of the intermediate batches too. Something like (0..output_row_count).chunks(CHUNK_SIZE) should do the trick, now that we create the indices by mapping the current row index.

I believe this can bring the uncached performance down to a level similar to before the regression, maybe even better. I'll run a few benchmarks with this setup (without a cache) and update the benchmarks table.
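
A minimal sketch of the chunked variant described above (hypothetical; CHUNK_SIZE is an assumption, step_by stands in for the chunks call, and the index math assumes left indices vary fastest, matching the diff above):

```rust
use arrow::array::{UInt32Array, UInt64Array};

const CHUNK_SIZE: usize = 8192; // assumption; e.g. the configured batch size

// Hypothetical sketch: derive each chunk's indices directly from the flat
// output row range, so intermediate batches stay bounded by CHUNK_SIZE.
fn for_each_chunk(left_row_count: usize, right_row_count: usize) {
    let output_row_count = left_row_count * right_row_count;
    for start in (0..output_row_count).step_by(CHUNK_SIZE) {
        let end = (start + CHUNK_SIZE).min(output_row_count);
        // Map each flat output index i back to a (left, right) index pair.
        let left_indices = UInt64Array::from_iter_values(
            (start..end).map(|i| (i % left_row_count) as u64),
        );
        let right_indices = UInt32Array::from_iter_values(
            (start..end).map(|i| (i / left_row_count) as u32),
        );
        // ...apply the join filter to just this chunk's indices...
        let _ = (left_indices, right_indices);
    }
}
```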

Contributor Author:

The chunks approach didn't change the performance, but it helped reduce the sizes of the intermediate batches. The 10% performance hit without a cache comes from the way the arrays are constructed, and I couldn't find a faster approach for now. I suggest we go with the cached approach for now; when the issue that enables NLJ to emit massive batches is implemented, we can choose between the cached and chunked approaches depending on NLJ's output size. I'll open an issue about it.

Contributor:

Sounds good. I will merge this soon to avoid performance issues in any upcoming release, unless there is more feedback. We seem to gain 20% performance relative to before the regression with the cache, and we can migrate to a cached-vs-chunked approach (depending on output batch size) in the future.

Contributor:

> The chunks approach didn't change the performance, but it helped reduce the sizes of the intermediate batches.

Thank you for checking this option.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Sep 20, 2024
@ozankabak ozankabak merged commit 8397855 into apache:main Sep 20, 2024
24 checks passed