Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discover] [Unified Data Table] Improve Discover / Unified Data Table performance in ES|QL mode #191249

Merged
merged 15 commits into from
Sep 4, 2024

Conversation

davismcphee
Copy link
Contributor

@davismcphee davismcphee commented Aug 26, 2024

Summary

This PR includes several performance improvements for ES|QL mode in Discover:

  • Memoize UnifiedDataTableRenderCellValue to reduce re-renders.
  • Show the loading spinner immediately on state changes that should trigger a refetch so that changed state isn't applied to the grid before the fetch completes. This also helps to reduce re-renders on state changes.
  • Replace EuiDataGrid in-memory sorting with a custom implementation using the @kbn/sort-predicates package (the same one Lens uses for its data grid). EUI in-memory sorting has a drastic impact on performance because it renders the entire grid off screen in order to parse cell values, detect their schema, and then sort the rows. This can cause rendering delays of several seconds for larger datasets on each render of the data grid. The new approach will sort in memory based on actual field values and pass the sorted rows to the data grid, which is both much more performant as well as improving the sorting behaviour across field types.
  • Use the overscanRowCount option from react-window to render some additional rows outside of view in the data grid, which minimizes pop-in of rows while scrolling quickly.

There are also a couple of bug fixes included in the PR as a result of the new sorting behaviour:

  • Numeric columns are now sorted correctly where previously they used alphabetic sorting.
  • The "Copy column" action in data grid columns now correctly copies the column values in the current sort order instead of copying them unsorted as it used to.

Before - 66,904 UnifiedDataTableRenderCellValue render calls:

render_old.mp4

After - 424 UnifiedDataTableRenderCellValue render calls:

render_new.mp4

Before profile:
profile_old

After profile:
profile_new

Before scrolling:

pop_in_old.mp4

After scrolling:

pop_in_new.mp4

Checklist

For maintainers

@davismcphee davismcphee added release_note:skip Skip the PR/issue when compiling release notes backport:skip This commit does not require backporting Team:DataDiscovery Discover App Team (Document Explorer, Saved Search, Surrounding documents, Graph) Team:ESQL ES|QL related features in Kibana labels Aug 26, 2024
@davismcphee davismcphee self-assigned this Aug 26, 2024
@davismcphee
Copy link
Contributor Author

/ci

1 similar comment
@davismcphee
Copy link
Contributor Author

/ci

@davismcphee davismcphee force-pushed the unified-data-table-performance branch from 7275c69 to 0942114 Compare August 27, 2024 03:20
@davismcphee
Copy link
Contributor Author

/ci

1 similar comment
@davismcphee
Copy link
Contributor Author

/ci

@davismcphee davismcphee force-pushed the unified-data-table-performance branch from 30c105f to eb58d4d Compare August 28, 2024 03:13
@davismcphee
Copy link
Contributor Author

/ci

@davismcphee davismcphee marked this pull request as ready for review August 28, 2024 04:05
@davismcphee davismcphee requested review from a team as code owners August 28, 2024 04:05
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-esql (Team:ESQL)

@davismcphee
Copy link
Contributor Author

/ci

@kertal kertal requested a review from a team August 28, 2024 07:56
@kertal
Copy link
Member

kertal commented Aug 28, 2024

EUI in-memory sorting has a drastic impact on performance because it renders the entire grid off screen in order to parse cell values, detect their schema, and then sort the rows.

This looks impressive! one note a bit unrelated to this PR, but this perfomance impact is also what we need to consider if we implement in-table search based on rendered values and not on raw values

@davismcphee
Copy link
Contributor Author

one note a bit unrelated to this PR, but this perfomance impact is also what we need to consider if we implement in-table search based on rendered values and not on raw values

True, we'll definitely have to consider performance impact for this, but I'm hopeful we'll be able to do it more efficiently than the data grid in-memory sorting. We may have to render certain cells in memory like React based ones, but hopefully we can deal with string renderers for most, and use caching, etc. to help speed things up.

@davismcphee davismcphee force-pushed the unified-data-table-performance branch from eb58d4d to cae0eb6 Compare August 28, 2024 17:57
// SKIP: This test runs forever and never completes
// after https://github.com/elastic/kibana/pull/191249
// but the actual functionality works fine
it.skip(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elastic/security-threat-hunting-investigations (maybe @logeekal or @michaelolo24 specifically?) For some reason after these changes, this test consistently fails and hangs until CI times out (locally it just ran forever until I killed the process). It seems to have something to do with the fireEvent.change call below, but I couldn't pinpoint what was happening. As far as I can tell the actual functionality isn't impacted and Timeline works as expected, plus it's the only Jest test failing, so maybe you could confirm things work as expected and this could be investigated as a followup if all seems good?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @davismcphee , I will pull down the branch and check what is the issue. If something needed to be fixed in the test, I will push that..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to add-on, I think it might EUI data grid test mode creating issue with memoized components. I did try to memoize render cell value before when we were working on improving the performance of unified data table but I had to remove it becuase Euidatagrid was not playing nice with it in jest as you can see here.

But I will check again to confirm that it is the same issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this more and found the warnings were related to incorrect prop type validation generated by EUI for Jest testing components, specifically from this file. I addressed it in this commit by skipping memoization for Jest environments, which gets rid of the warnings: b6ceeb1.

But unfortunately it looks like this wasn't the source of the original issue and this test still runs forever, so I'm really not sure what could be going on here. I even resolved all of the warnings in our Unified Data Table tests to be sure I wasn't missing something, but they're warning free now with no similar issues occurring.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @davismcphee for looking into it and adding JEST clause. I should have done that initially but it did not occur to me.

I have fixed the test here: d93e90d.

Issue was huge amount of data in jsdom. Since changing rowHeight re-renders the complete table, it was causing jsdom to take huge amount of time to re-render and thus firevent.change never reports that the value has been changed.

I changed the passed data to just one row and test passed.

Regarding NaN warning, I will check it separately, it comes everytime the data grid renders, probably will need to mock some EuiDataGrid height calculation function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into that and fixing the test @logeekal!

@stratoula
Copy link
Contributor

stratoula commented Aug 29, 2024

I wonder what change raises the async bundle 100+KB 🤔

Update: I played with it and indeed feels lighter 🎉 I just wonder if we can do something for the async bundle size bump

Copy link
Member

@kertal kertal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're a smooth (UI) operator, Davis, great to focus on performance!
Sorting in my testing was snappier
Discover_-Elastic_und_Discover-_Elastic_und_localhost
Loading a result with many fields in the table, I wouldn't call this snappier, it was like riding a VW beetle (no offense!) vs a Koenigsegg Jesko Absolut (didn't know, just googled fastest car in the world
Discover_-Elastic_und_Discover-_Elastic_und_localhost
), a few seconds vs a minute

Nice bugfixes under the hood also, I agree with @stratoula that reducing the bundle size would be the icing on the cake, but given this improvements I'd take the cake, since the cake wasn't added / eaten when loading Kibana.

When in doubt that memoization causes problems with tests that can't be fixed, I'd suggest to extract this to another PR, so it would stay a beetle for a little longer. better beetle than a bug.

@stratoula
Copy link
Contributor

stratoula commented Aug 30, 2024

Nice bugfixes under the hood also, I agree with @stratoula that reducing the bundle size would be the icing on the cake, but given this improvements I'd take the cake, since the cake wasn't added / eaten when loading Kibana.

Let's understand what causes the bump first though, is a significant bump. I would love to understand what change caused it

@davismcphee
Copy link
Contributor Author

/ci

@davismcphee
Copy link
Contributor Author

@stratoula The increase is 100kb overall because packages are imported separately to each plugin, but the individual increase to Unified Data Table was about ~20kb. I looked into it and a majority of the increase is from node_modules due to imports in @kbn/sort-predicates for semver and IP address sorting. I tried optimizing the imports but it didn't have an impact, I guess it just takes a lot of code to sort these things 😅

Main:
stats_main

This PR:
stats_pr

@stratoula
Copy link
Contributor

Thanx for looking into this Davis! This works for me, just wanted to be sure that this increase is reasonable

Copy link
Contributor

@logeekal logeekal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making these changes. 👌

Comment on lines +335 to +341
const state = kbnRison.encode({
dataSource: { type: 'esql' },
query: { esql: 'from kibana_sample_data_flights' },
});
await PageObjects.common.navigateToActualUrl('discover', `?_a=${state}`, {
ensureCurrentUrl: false,
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lukasolson I'm not sure exactly why this test started failing, but as far as I can tell it shouldn't have passed originally since from kibana_sample_data_flights doesn't show the chart and should only trigger 1 request for Table. I suspect we have some race condition (which we should investigate further separately) and the performance improvements cause this scenario to behave as expected, but WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you must be right. Here's a recording showing that it was broken before:

Screen.Recording.2024-09-03.at.3.03.57.PM.mov

With this fix, it looks like things are working properly.

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
cloudSecurityPosture 636 640 +4
discover 957 970 +13
esqlDataGrid 345 358 +13
lens 1438 1439 +1
logsExplorer 546 559 +13
securitySolution 5654 5658 +4
slo 827 840 +13
total +61

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
cloudSecurityPosture 480.7KB 493.6KB +12.9KB
discover 769.8KB 792.4KB +22.6KB
esqlDataGrid 129.5KB 152.1KB +22.6KB
securitySolution 17.9MB 18.0MB +61.6KB
slo 829.0KB 851.7KB +22.6KB
total +142.4KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
esqlDataGrid 9.3KB 9.3KB +50.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @davismcphee

Comment on lines +335 to +341
const state = kbnRison.encode({
dataSource: { type: 'esql' },
query: { esql: 'from kibana_sample_data_flights' },
});
await PageObjects.common.navigateToActualUrl('discover', `?_a=${state}`, {
ensureCurrentUrl: false,
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you must be right. Here's a recording showing that it was broken before:

Screen.Recording.2024-09-03.at.3.03.57.PM.mov

With this fix, it looks like things are working properly.

@davismcphee davismcphee merged commit 2629fa8 into elastic:main Sep 4, 2024
41 checks passed
@davismcphee davismcphee deleted the unified-data-table-performance branch September 4, 2024 03:26
davismcphee added a commit that referenced this pull request Sep 9, 2024
…when Document column isn't visible (#192332)

## Summary

Part of #191249 included using the `overscanRowCount` property of EUI
data grid to render some additional data grid rows off screen, which
greatly reduces pop in when scrolling through the grid. While the
performance cost of this low in most cases, it has an impact in certain
situations such as when documents have many fields and the Document
column is visible (e.g. Discover's `many_fields` performance journey).
This is because the Document column can render _many_ DOM elements in
each of its cells, which EUI data grid struggles to handle.

This PR updates Unified Data Table to only use `overscanRowCount` when
the Document column isn't visible, so we'll still benefit from the
scrolling improvements when columns are selected, but won't take the
performance hit for the Document column.

Perf journey run:
https://buildkite.com/elastic/kibana-single-user-performance/builds/14355.

<img width="842" alt="Screenshot 2024-09-09 at 14 07 46"
src="https://github.com/user-attachments/assets/b074fd7a-ef8a-4917-ac5e-0675b0e60b7d">


### Checklist

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)

### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
gergoabraham pushed a commit to gergoabraham/kibana that referenced this pull request Sep 13, 2024
…when Document column isn't visible (elastic#192332)

## Summary

Part of elastic#191249 included using the `overscanRowCount` property of EUI
data grid to render some additional data grid rows off screen, which
greatly reduces pop in when scrolling through the grid. While the
performance cost of this low in most cases, it has an impact in certain
situations such as when documents have many fields and the Document
column is visible (e.g. Discover's `many_fields` performance journey).
This is because the Document column can render _many_ DOM elements in
each of its cells, which EUI data grid struggles to handle.

This PR updates Unified Data Table to only use `overscanRowCount` when
the Document column isn't visible, so we'll still benefit from the
scrolling improvements when columns are selected, but won't take the
performance hit for the Document column.

Perf journey run:
https://buildkite.com/elastic/kibana-single-user-performance/builds/14355.

<img width="842" alt="Screenshot 2024-09-09 at 14 07 46"
src="https://github.com/user-attachments/assets/b074fd7a-ef8a-4917-ac5e-0675b0e60b7d">


### Checklist

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)

### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting ci:build-webpack-bundle-analyzer release_note:skip Skip the PR/issue when compiling release notes Team:DataDiscovery Discover App Team (Document Explorer, Saved Search, Surrounding documents, Graph) Team:ESQL ES|QL related features in Kibana v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants