Optimize select #234

Open
Totktonada opened this issue Nov 11, 2021 · 1 comment

Comments

Totktonada commented Nov 11, 2021

Using the test case from #220, I saw a significant difference between a pure vshard call and crud's select, even after PR #222 and PR #226. The difference was around 2 times, but I have no precise numbers at hand now. The bottleneck is on the router.

Brief jit.p profiling does not show any low-hanging fruit anymore. Selectively enabling and disabling parts of crud's code shows that, for example, a considerable amount of time is spent on creating the select plan. Surprisingly, pcall() also leads to a visible slowdown. Checks also showed up in the jit.p profile (around 5%, but that is a share of the profile, not of RPS).
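
One way to check the pcall() observation in isolation is a tiny micro-benchmark. This is only a sketch under assumptions: `build_plan` and `time_it` are hypothetical stand-ins, not crud's code, and the table built here is far cheaper than a real select plan.

```lua
-- Hypothetical micro-benchmark: direct call vs. the same call wrapped in pcall().
-- `build_plan` is a stand-in for real work, not crud's actual plan builder.
local function build_plan(n)
    return { limit = n, after = n + 1 }
end

-- Run `fn` `iterations` times and return the elapsed CPU time in seconds.
local function time_it(fn, iterations)
    local started = os.clock()
    for i = 1, iterations do
        fn(i)
    end
    return os.clock() - started
end

local iterations = 1e6

-- Baseline: call the function directly.
local direct = time_it(function(i)
    return build_plan(i)
end, iterations)

-- Same work, but every call goes through pcall().
local protected = time_it(function(i)
    local ok, plan = pcall(build_plan, i)
    return ok and plan
end, iterations)

print(string.format('direct:    %.3f s', direct))
print(string.format('protected: %.3f s', protected))
```

The numbers depend heavily on the runtime and on whether the loop gets JIT-compiled, so treat them as indicative only; RPS on a real router is still the measurement that matters.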

Anyway, my observations were quite brief, so let's look into this more systematically: carefully review the involved code and evaluate the time spent in its different parts. Next, see what we can do to speed up our case.

The goal is to shrink crud / vshard RPS ratio to ~1.5.

Let's start from #225. (Update: it was implemented in PR #251.)


Observation: crud calculates a sharding key even when bucket_id is known (see plan.new()). (Update: this particular problem was resolved in PR #252.)
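
The idea behind this observation can be sketched as a fast path. All names below (`extract_sharding_key`, `bucket_id_from_key`, `resolve_bucket_id`) are hypothetical and are not crud's real API; the modulo mapping is a placeholder for whatever hashing the router actually uses.

```lua
-- Hypothetical sketch; none of these names are crud's real API.
local sharding_key_calls = 0

-- Stand-in for the expensive step: extracting a sharding key from conditions.
local function extract_sharding_key(conditions)
    sharding_key_calls = sharding_key_calls + 1
    return conditions[1]
end

-- Stand-in for mapping a sharding key to a bucket (placeholder arithmetic).
local function bucket_id_from_key(key, bucket_count)
    return key % bucket_count + 1
end

local function resolve_bucket_id(conditions, opts, bucket_count)
    -- Fast path: the caller already knows the bucket, so skip key extraction.
    if opts.bucket_id ~= nil then
        return opts.bucket_id
    end
    local key = extract_sharding_key(conditions)
    return bucket_id_from_key(key, bucket_count)
end

-- bucket_id supplied by the caller: no sharding key work is done.
local known = resolve_bucket_id({ 42 }, { bucket_id = 7 }, 3000)
-- bucket_id missing: the sharding key is extracted exactly once.
local computed = resolve_bucket_id({ 42 }, {}, 3000)
print(known, computed, sharding_key_calls)
```

The counter is there only to make the saving visible: the sharding key is extracted once across the two calls, not twice.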


After we are done here, we can look at optimizing the automatic sharding key determination case and check whether there is a difference from the user-provided bucket id case (that is not in the scope of this issue).

@Totktonada Totktonada changed the title from "Optimize crud in the single replicaset select case" to "Optimize select with known bucket id" Nov 11, 2021
DifferentialOrange added a commit that referenced this issue Dec 17, 2021
After this patch, select and pairs requests will no longer fetch
sharding key info and extract sharding key info if bucket_id specified.
Since calls with specified bucket_id already ignore sharding key
values, behavior will not change. Other crud operations already have
this optimization.

Based on test runs on HP ProBook 440 G7 i7/16Gb, performance had
increased by 7%.

Closes #234
DifferentialOrange added a commit that referenced this issue Dec 20, 2021
After this patch, select and pairs requests will no longer fetch
sharding key info and extract sharding key info if bucket_id specified.
Since calls with specified bucket_id already ignore sharding key
values, behavior will not change. Other crud operations already have
this optimization.

Based on test runs on HP ProBook 440 G7 i7/16Gb, performance had
increased by 6-7%.

Closes #234
@Totktonada Totktonada added the 5sp label Feb 9, 2022
DifferentialOrange added a commit that referenced this issue Feb 25, 2022
After this patch, select and pairs requests will no longer fetch
sharding key info and extract sharding key info if bucket_id specified.
Since calls with specified bucket_id already ignore sharding key
values, behavior will not change. Other crud operations already have
this optimization.

Based on test runs on HP ProBook 440 G7 i7/16Gb, performance had
increased by 6-7%.

Part of #234
DifferentialOrange added a commit that referenced this issue Mar 4, 2022
After this patch, select and pairs requests will no longer fetch
sharding key info and extract sharding key info if bucket_id specified.
Since calls with specified bucket_id already ignore sharding key
values, behavior will not change. Other crud operations already have
this optimization.

Based on test runs on HP ProBook 440 G7 i7/16Gb, performance had
increased by 6-7%.

Part of #234
@DifferentialOrange DifferentialOrange removed their assignment Jun 13, 2023
@DifferentialOrange

This ticket has become something like a "Select optimization" epic because of goals like "The goal is to shrink crud / vshard RPS ratio to ~1.5". We need to re-estimate it if we decide to continue working on it.

@DifferentialOrange DifferentialOrange changed the title from "Optimize select with known bucket id" to "Optimize select" Jun 13, 2023