Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimized count queries by removing all joins if there are no where #2210

Merged
merged 1 commit into from
May 18, 2023

Conversation

colinmollenhour
Copy link
Member

…auses using joined tables and all joins are left joins.

This has been used for a few years now, first mentioned to OpenMage at #1335 (comment).

Joins are absolute performance killers for large count queries in some cases so this can make loading some grids much faster.

@github-actions github-actions bot added Component: lib/Varien Relates to lib/Varien Component: lib/* Relates to lib/* labels Jun 9, 2022
@kiatng kiatng added the performance Performance related label Jun 10, 2022
@kiatng
Copy link
Contributor

kiatng commented Jun 12, 2022

How to test this?

@ADDISON74
Copy link
Contributor

Here @colinmollenhour said the issue is not confirmed #1335 (comment). If we can have some clues about testing we can check it.

@colinmollenhour
Copy link
Member Author

The code can be confirmed to work by just visiting the various grids and viewing the counts to ensure they are correct. The originally reported issue is what I could not reproduce before this patch or after.

However, it makes a huge performance difference on large tables when joins are used. The proper way to add columns in the grid is to add them statically to the _grid table but many times people don't or can't and this helps a lot. May also help in other areas like APIs, reporting, any third party modules that use joins in grids, etc.

@fballiano
Copy link
Contributor

I was thinking about the splitting the query with AND/OR, what if that query has a and/or within quotes like in part of a text? isn't it a bit dangerous to split that way? I don't know if it could be done in another way.

@tmotyl
Copy link
Contributor

tmotyl commented Jun 21, 2022

what about group by and having ? should it also be taken into account?

@colinmollenhour
Copy link
Member Author

colinmollenhour commented Jun 22, 2022

I was thinking about the splitting the query with AND/OR, what if that query has a and/or within quotes like in part of a text? isn't it a bit dangerous to split that way? I don't know if it could be done in another way.

The parentheses are stripped out so it is really just looking for a match of the table name anywhere in the where clause so I think nested conditions should be fine.

what about group by and having ? should it also be taken into account?

Group by is already considered on line 245. I haven't taken "having" into account.. I don't think I've ever seen that used in a grid collection but it could be added easily to line 245.

EDIT: Having shouldn't be used without group by anyway so I think it is implicitly handled in line 245, right?
This may not be bulletproof, there could be odd cases that break it.. Personally I think it is worth it since it brought some of our grids from 20s+ load times to sub-second and there was really no alternative.

@sreichel
Copy link
Contributor

I haven't taken "having" into account.. I don't think I've ever seen that used in a grid collection but it could be added easily to line 245.

Not sure if "having" was required ... adding product SKUs to Sales Order Grid?

@colinmollenhour
Copy link
Member Author

Not sure if "having" was required ... adding product SKUs to Sales Order Grid?

Yes but I believe "having" can only be used in conjunction with "group by" which is already considered so "having" is not an issue.

…auses using joined tables and all joins are left joins.
@colinmollenhour colinmollenhour changed the base branch from 1.9.4.x to main April 4, 2023 18:46
@colinmollenhour colinmollenhour added the Rebased: RFC-0002 This PR has been rebased onto either 'main' or 'next' according to RFC-0002 label Apr 4, 2023
@luigifab
Copy link
Contributor

I tested with customers grid with our million customers.

Line 266 of the PR, I added echo (string)$countSelect; to debug the SQL query.
The SQL displayed without and with the PR are here:

MariaDB [om]> SELECT COUNT(*) FROM `customer_entity` AS `e` LEFT JOIN `customer_entity_varchar` AS `at_prefix` ON (`at_prefix`.`entity_id` = `e`.`entity_id`) AND (`at_prefix`.`attribute_id` = 4) LEFT JOIN `customer_entity_varchar` AS `at_firstname` ON (`at_firstname`.`entity_id` = `e`.`entity_id`) AND (`at_firstname`.`attribute_id` = 5) LEFT JOIN `customer_entity_varchar` AS `at_middlename` ON (`at_middlename`.`entity_id` = `e`.`entity_id`) AND (`at_middlename`.`attribute_id` = 6) LEFT JOIN `customer_entity_varchar` AS `at_lastname` ON (`at_lastname`.`entity_id` = `e`.`entity_id`) AND (`at_lastname`.`attribute_id` = 7) LEFT JOIN `customer_entity_varchar` AS `at_suffix` ON (`at_suffix`.`entity_id` = `e`.`entity_id`) AND (`at_suffix`.`attribute_id` = 8) LEFT JOIN `customer_entity_int` AS `at_default_billing` ON (`at_default_billing`.`entity_id` = `e`.`entity_id`) AND (`at_default_billing`.`attribute_id` = 13) LEFT JOIN `customer_address_entity_varchar` AS `at_billing_postcode` ON (`at_billing_postcode`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_postcode`.`attribute_id` = 30) LEFT JOIN `customer_address_entity_varchar` AS `at_billing_city` ON (`at_billing_city`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_city`.`attribute_id` = 26) LEFT JOIN `customer_address_entity_varchar` AS `at_billing_telephone` ON (`at_billing_telephone`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_telephone`.`attribute_id` = 31) LEFT JOIN `customer_address_entity_varchar` AS `at_billing_region` ON (`at_billing_region`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_region`.`attribute_id` = 28) LEFT JOIN `customer_address_entity_varchar` AS `at_billing_country_id` ON (`at_billing_country_id`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_country_id`.`attribute_id` = 27);
+----------+
| COUNT(*) |
+----------+
|  1299000 |
+----------+
1 row in set (6.196 sec)

MariaDB [om]> SELECT COUNT(*) FROM `customer_entity` AS `e`;
+----------+
| COUNT(*) |
+----------+
|  1299000 |
+----------+
1 row in set (0.286 sec)

It looks good... but without the PR the webpage doesn't take 6s to load.

I profiled the customers grid with Blackfire without the PR, and the SQL used for counting is:
select count(*) from customer_entity as e

So I think there is an optimisation elsewhere.

@fballiano
Copy link
Contributor

so this PR adds 6 seconds to your query? auch... well..

@colinmollenhour
Copy link
Member Author

so this PR adds 6 seconds to your query? auch... well..

I think he's saying that when he does a real page load without the PR the page is loading in less than 6 seconds so he thinks there must be an optimization somewhere else already making this one have no effect in this case. The unoptimized join query is definitely slower than the optimized query.

@kyrena
Copy link
Contributor

kyrena commented May 16, 2023

Yes, he said that. It seems that for customer grid the PR is useless, because the SQL request is optimized by someone else.

@colinmollenhour
Copy link
Member Author

So I think there is an optimisation elsewhere.

@luigifab Can you provide some more info here? I'm struggling to see how that is possible.. Also is it the case with other grids?

However, I'm not sure it matters, this is still a good optimization that could benefit other areas besides just grids.

Copy link
Contributor

@kiatng kiatng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a way to test this with a custom table:

$result = Mage::getResourceModel('finance/ledger_collection')
    ->joinBillingData() // 3 left joins
    ->addFieldToFilter('main_table.name', 'prepaid')
    ->getSize();

It's about 60% faster with the PR.

@sreichel
Copy link
Contributor

It's about 60% faster with the PR.

👍

Can we replace array_filter with foreach?

@colinmollenhour
Copy link
Member Author

Can we replace array_filter with foreach?

For what reason? As we say in the South, "If it ain't broke, don't fix it." 😄

@sreichel
Copy link
Contributor

foreach seems to be faster ... https://stackoverflow.com/a/62691264/5703627

@colinmollenhour
Copy link
Member Author

foreach seems to be faster ... https://stackoverflow.com/a/62691264/5703627

I can't imagine it making any difference in this case, there will never be more than a few elements in the arrays being filtered (number of tables and where clauses is "fixed" and "small") and this code runs once per collection so in most cases just once per page. So since there is no performance gain to be had beyond a few nanoseconds, the argument as to which is better highly subjective.. Given the code as written is already highly tested I don't think it it worthwhile to rewrite, but I do appreciate your enthusiasm for optimizing performance. 👍

@fballiano fballiano changed the title Optimize count queries by removing all joins if there are no where cl… Optimize count queries by removing all joins if there are no where May 18, 2023
@fballiano fballiano changed the title Optimize count queries by removing all joins if there are no where Optimized count queries by removing all joins if there are no where May 18, 2023
@fballiano fballiano merged commit d6927f9 into OpenMage:main May 18, 2023
@sreichel
Copy link
Contributor

@fballiano wtf?

Is this you playground now?

At other PRs we discuss about "performance", but here ... 👎

@fballiano
Copy link
Contributor

it was approved, and I though @colinmollenhour explanation was enough, I genuinely didn't think you have this much problem with a foreach.

@sreichel
Copy link
Contributor

sreichel commented May 18, 2023

"it was approved", but it had open questions ....

it was approved

Maybe you dont see it, but with faster progress a lot of errors were introduced. Simple PRs got approved/merged without being tested by contributor and/or reviewers ....

Is this they way you want to work?

@fballiano
Copy link
Contributor

this PR is like 10 months old, the moment it's decided to merge you complain about a minor thing, I wait for a long explanation by the author but still is not good enough for you.
also, are you seriously complaining about performance with the code of somebody who is probably the biggest improver of the whole M1 performance since 10+ years?

anyway, I've said it since the first day: openmage is not magento1, we don't have the manpower to do things in that same way, we'll do our best but I'll not babysit a dieing project which never want to change anything because "maybe a foreach is better than array_filter".

the benefits we had in the last couple of years are huge, also if we had a few problems, wth, we're held to a standard which is impossible for projects with 100 times the manpower, it does not make any sense.

we're in a "-rc" phase since how many months now? doesn't seem to me we're recklessly releasing whatever...

last thing: please rebase your PRs, if I do it I can do it only squashing commits and you will for sure find a reason to complain about that too.

@justinbeaty
Copy link
Contributor

While I have not had too much time lately to contribute, I have to step in here and defend @fballiano.

As he said, it is not right for someone to have no participation in this PR for nearly a year, and only after two approvals are left, expect to be able to block a PR over such a minor thing such as foreach.

First, there is no speed benefit to changing it to a foreach in this case. Are we forbidden from ever using array_filter in openmage? If that's someone's opinion, they they should make a code style RFC that says so.

Second, since this is Colin's PR, he is free to keep it as he wrote it. That is exactly what PR authors get to do. Person makes PR, two people approve it, and it gets merged. Or people don't approve it and it doesn't get merged...

@sreichel
Copy link
Contributor

are you seriously complaining about performance with the code of somebody who is probably the biggest improver

You are mixing things up!

Of course i know Collins work, but he isn't perfect too.

Asking for foreach replacement was to make it as fast as possible ... nothing else. We changed array_walk and stristr to foreach and stripos in #3238 for micro-optimization, but here ... ?

I recently changed Varien_Object::getData() (#3245) to save nanoseconds ...

WHY do introduce new code, knowing we can do better?

@tmotyl
Copy link
Contributor

tmotyl commented May 19, 2023

Lets do some math:

20 - number of array elements in this case.
php 8.2 (time for 1000000 elements)
array_filter: 0.069692134857178
foreach: 0.018320083618164

0.069692134857178 - 0.018320083618164 = 0.05137205123 (delta per 1000000 elements)

0.05137205123 / 1000000 * 20 = 0.00000102744s

So the performance gain, in this case, is 0.00000102744 seconds.

In contrast, this patch is often improving the page load by a few seconds.

My take:
Such a small gain is not worth rewriting the code and going through the review and test process again.
Changing the code would also mean we merge the change weeks or months later. So postponing the patch, just to make it perfect and win 0.00000102744s when the patch as is giving us 6s is a high cost of delay.

The time we would spend reviewing and testing the patch after changes would be much better used by spending on other patches.
Of course there is still an open path to submit a PR to rewrite the code to foreach, if you think its worth doing @sreichel .

@fballiano
Copy link
Contributor

WHY do introduce new code, knowing we can do better?

how many times did you answer "will do it in another PR" just to get something merged and then improve it later? I'll answer: many, many times. so why the heck you want to ruin it for everybody now?

some things takes us ages to get reviewed and approved, so:

  • to change a minor thing we restart everything from scratch and never get the PR merge wasting everybody's time
  • we merge something that's 90% good, which has a good value, although not perfect (perfect? what is perfect in our codebase anyway? what is perfect in anything actually?) and we add a bit of value anyway

when we merge something that's not perfect (although obviously we all try not to be reckless) we are improving, showing that we are alive, creating a discussion about a new feature. maybe it generates a bug? ok, whatever, we will fix it together (what the heck is a community project anyway? and lately we're extremely responsive with bug reports). since we're not a corporation we can't do otherwise anyway.

fballiano added a commit that referenced this pull request May 19, 2023
@ADDISON74
Copy link
Contributor

In the history of OpenMage, many times it happened to merge a PR and make subsequent changes or even revert it. We should implement a rule to avoid such contradictory discussions.

In the present case, the merging process has been completed. I agree optimization of the code to save execution time was requested before but it got an answer from the author. If someone wants it please open a PR to be discussed, tested and approved.

In my opinion, nothing can be perfect in the material world and I guess that many times it doesn't work out for God either.

@elidrissidev
Copy link
Member

elidrissidev commented May 25, 2023

I found a small issue, if there are any bound parameters used in the removed JOIN clauses they are not being cleared, therefore causing an error: SQLSTATE[HY093]: Invalid parameter number: number of bound variables does not match number of tokens

Found this when doing a clean install just now, a data upgrade script in Mage_Directory was failing because the of the param :region_locale:

$this->addBindParam(':region_locale', $locale);
$this->getSelect()->joinLeft(
['rname' => $this->_regionNameTable],
'main_table.region_id = rname.region_id AND rname.locale = :region_locale',
['name']
);

@kiatng
Copy link
Contributor

kiatng commented May 26, 2023

To replicate error reported by @elidrissidev

        $collection = Mage::getResourceModel('directory/region_collection');
        $collection->addBindParam('id', 'MY');
        $collection->getSelect()->where('main_table.country_id = :id');
        $collection->getSize(); // throws SQLSTATE[HY093]: Invalid parameter number: number of bound variables does not match number of tokens

No error when inserting return $countSelect; before the code introduced in this PR.

@kiatng
Copy link
Contributor

kiatng commented May 26, 2023

An ugly patch:

            // insert before the PR Code
            if (count($this->_bindParams)) {
                return $countSelect;
            }
            // ... PR code ...

May be @colinmollenhour has a better fix.

@elidrissidev
Copy link
Member

To replicate error reported by @elidrissidev

        $collection = Mage::getResourceModel('directory/region_collection');
        $collection->addBindParam('id', 'MY');
        $collection->getSelect()->where('main_table.country_id = :id');
        $collection->getSize(); // throws SQLSTATE[HY093]: Invalid parameter number: number of bound variables does not match number of tokens

No error when inserting return $countSelect; before the code introduced in this PR.

Actually, that seems to be a second issue. The bound params are also not cleared when a removed WHERE was using them.

For my original issue, the code below seems is enough to reproduce it, but only in PHP 8.x, with PHP 7.4 it doesn't throw any error for me 🧐

Mage::getResourceModel('directory/region_collection')->getSize();

Screen Shot 2023-05-26 at 10 09 18 AM

colinmollenhour added a commit to colinmollenhour/magento-lts that referenced this pull request May 26, 2023
@colinmollenhour
Copy link
Member Author

I pushed #3279 which I believe is all that is needed.

Actually, that seems to be a second issue. The bound params are also not cleared when a removed WHERE was using them.

There are no where conditions removed by this optimization.. The reason that Mage::getResourceModel('directory/region_collection')->getSize(); is affected is that collection is always joined with a bind param in _initSelect.

@luigifab
Copy link
Contributor

@luigifab Can you provide some more info here? I'm struggling to see how that is possible.. Also is it the case with other grids?
However, I'm not sure it matters, this is still a good optimization that could benefit other areas besides just grids.

@colinmollenhour I didn't search, and I didn't have more info except the test I did, but yes this is a good optimisation, just not sure how to show it, but now @kiatng found a way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: lib/Varien Relates to lib/Varien Component: lib/* Relates to lib/* performance Performance related Rebased: RFC-0002 This PR has been rebased onto either 'main' or 'next' according to RFC-0002
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants