[ML] Data Frame Analytics: Scatterplot Matrix #84420

Open
22 of 41 tasks
walterra opened this issue Nov 26, 2020 · 3 comments

@walterra
Contributor

walterra commented Nov 26, 2020

Meta issue to track progress on scatterplot matrices for outlier detection

7.11 (initial PR)

  • outlier detection results view
  • creation wizard
  • fix tooltips (not showing up)
  • translations
  • don't hard code outlier_score field
  • dropdown to select fields
  • store collapse state in URL
  • Kibana/EUI colour palettes
  • filter non-numeric fields from axis selection
  • extend tooltips to show more information
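
As a rough illustration of several items above (field selection, numeric-only axes, and a configurable outlier score field), here is a minimal sketch of how a Vega-Lite `repeat` spec for a scatterplot matrix could be assembled. The `FieldInfo` shape, the `NUMERIC_TYPES` list, and both function names are hypothetical and not taken from the Kibana code base; only the `repeat`/`spec` structure follows Vega-Lite itself.

```typescript
// Hypothetical field descriptor; the real Kibana types may differ.
interface FieldInfo {
  name: string;
  type: string; // Elasticsearch mapping type, e.g. 'double' or 'keyword'
}

// Assumption: these Elasticsearch mapping types are treated as numeric.
const NUMERIC_TYPES: string[] = [
  'byte',
  'short',
  'integer',
  'long',
  'half_float',
  'float',
  'double',
  'scaled_float',
];

// Keep only fields that can be plotted on a quantitative axis.
function getNumericFieldNames(fields: FieldInfo[]): string[] {
  return fields
    .filter((f) => NUMERIC_TYPES.indexOf(f.type) !== -1)
    .map((f) => f.name);
}

// Build a Vega-Lite spec that repeats one scatterplot per field pair.
// The outlier score field is a parameter rather than a hard-coded name.
function buildScatterplotMatrixSpec(
  values: Array<Record<string, unknown>>,
  fieldNames: string[],
  outlierScoreField: string
) {
  return {
    $schema: 'https://vega.github.io/schema/vega-lite/v4.json',
    data: { values },
    repeat: {
      column: fieldNames,
      // Reversed copy so the matrix diagonal runs bottom-left to top-right.
      row: fieldNames.slice().reverse(),
    },
    spec: {
      mark: 'circle',
      encoding: {
        x: { field: { repeat: 'column' }, type: 'quantitative' },
        y: { field: { repeat: 'row' }, type: 'quantitative' },
        color: { field: outlierScoreField, type: 'quantitative' },
      },
    },
  };
}
```

A fields dropdown could then pass its current selection as `fieldNames`, so the matrix is rebuilt whenever the selection changes.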

7.11 (bugfixing)

7.12

8.6

8.7

Follow up / Bugfixing / Nice to have

  • Concept to add scatterplot matrix to Data Visualizer
  • Concept to add scatterplot matrix to Dashboards as part of a workflow to create ingest pipelines
  • For regression, show only one row of small multiples against the dependent variable
  • threshold slider styling
  • migrate to alternative Vega v5 version that doesn't show axis for each chart of the matrix
  • dynamically get top x influential fields
  • display (or sort by) feature importance value in fields combobox
  • histograms for diagonal charts with same attribute on x/y axis
  • switch VegaLite to TypeScript (see [Vega] Update vega version #78390 (comment))
  • Vega adds 1+ MB to our bundle size (luckily not the page load bundle). It would be great to explore whether the Vega library itself could be shared with the Kibana Vega plugin to avoid that.
  • Fix dark theme (Vega Lite related issue, doesn't pass on view settings to repeated charts)
  • store selected fields in the URL state.
  • sync Vega state of outlier threshold with react state so we can retain the setting on updates.
  • improve the contrast of the color legend
  • improve the messaging or the warning callout about docs containing arrays, e.g. mention the affected fields.
  • scatterplot makes the UI slow for large datasets; investigate the root cause. Since we're just using a plain search to get the data, it's unlikely that the search itself makes the page slow.
  • Dots in field names end up with backslashes for the axis labels
  • When charts don’t fit, there is no wrapper or even a horizontal scrollbar,
    so the charts to the right are simply hidden from the user.
    [ML] Data Frame analytics outlier detection scatterplot matrix does not adhere to bounding box #144709
  • Investigate if we can make the chart responsive, for example by increasing the size of individual cells when there are fewer columns.
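
On the item about dots in field names: Vega-Lite treats a dot in a `field` value as nested property access, so flat field names such as `agent.name` have to be escaped with a backslash, and the escaped string presumably then leaks into the default axis title. One possible workaround, sketched below for a single (non-repeated) channel, is to escape the name for data access and set the original name as an explicit `title`. Both helper names are hypothetical, and this is not necessarily how the issue would be resolved in Kibana.

```typescript
// Escape characters that Vega-Lite would otherwise interpret as
// nested property access (dots and square brackets).
function escapeVegaFieldName(fieldName: string): string {
  return fieldName.replace(/([.\[\]])/g, '\\$1');
}

// Hypothetical helper: build a positional channel definition that uses the
// escaped name for data access but the original name as the axis title.
function buildAxisChannel(fieldName: string) {
  return {
    field: escapeVegaFieldName(fieldName),
    type: 'quantitative',
    title: fieldName,
  };
}
```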

@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@Winterflower
Contributor

Hi @walterra!
I have been testing the scatterplot on Cloud Staging as part of my work helping @alvarezmelissa87 with the DFA validation part. The scatterplot introduces significant lag when the dataset is big (several GB, as is the case with Ember, for example). Would it be possible to add a button to the UI that generates the scatterplot on demand if the user wishes to see it? For datasets like Ember that have too many fields to plot simultaneously, the scatterplot may not be very useful.
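
To make the on-demand idea concrete, here is a minimal sketch of a loader that defers the expensive work until it is explicitly requested (for example from a button click) and reuses the result afterwards. `createOnDemandLoader` and its method names are hypothetical; `fetchData` stands in for whatever search the scatterplot uses and may return a promise.

```typescript
// Hypothetical wrapper that defers an expensive fetch until the user
// explicitly asks for it, and reuses the result on later calls.
function createOnDemandLoader<T>(fetchData: () => T) {
  let requested = false;
  let result: T | undefined;

  return {
    // True once load() has been called at least once.
    isRequested: (): boolean => requested,
    // Run the fetch on the first call; later calls return the cached result.
    load: (): T => {
      if (!requested) {
        result = fetchData();
        requested = true;
      }
      return result as T;
    },
  };
}
```

In this sketch the UI would render only the button while `isRequested()` is false, and call `load()` from the click handler.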

@walterra
Contributor Author

Thanks for the feedback! We get the data for the scatterplot with a regular search, with no ES aggregations, so I wonder what actually makes it slow. I'll investigate. With more tools to assess results, we're also thinking about improvements to the overall layout of the page and whether everything needs to be on display by default. I added a related item to the issue description.

@walterra walterra removed the v7.14.0 label Jun 29, 2021

4 participants