
[ML] make xpack.ml.max_ml_node_size and xpack.ml.use_auto_machine_memory_percent dynamically settable #66132

Conversation

@benwtrent (Member) commented Dec 9, 2020

With this commit the following settings are all dynamic:

  • xpack.ml.max_ml_node_size
  • xpack.ml.use_auto_machine_memory_percent
  • xpack.ml.max_lazy_ml_nodes

Since these settings can easily be interrelated, the ability to update a cluster with a single settings call is useful (see the sketch below).

Additionally, setting some of these values at the node level (on a new node in a mixed-version cluster) could cause issues, with the master attempting to read a setting or value newer than it understands.
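
As a rough sketch of what this enables (the values below are illustrative, not taken from this PR), all three settings can now be updated together with a single call to the cluster settings API:

```
PUT _cluster/settings
{
  "persistent": {
    "xpack.ml.max_ml_node_size": "64gb",
    "xpack.ml.use_auto_machine_memory_percent": true,
    "xpack.ml.max_lazy_ml_nodes": 3
  }
}
```

Because these are persistent cluster-wide settings rather than per-node `elasticsearch.yml` entries, every master-eligible node sees the same values regardless of when it was started.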

@elasticmachine (Collaborator) commented

Pinging @elastic/ml-core (:ml)

@droberts195 (Contributor) commented

When setting `xpack.ml.max_ml_node_size`, it is possible that the value is different PER NODE, because nodes with different roles may support different maximum sizes.

We have developed autoscaling on the assumption that it will only be used in clusters where the ML nodes are dedicated ML nodes. Thus every ML node in the cluster should have only the `ml` role (a minimal configuration is sketched below). Maybe a better way to resolve this problem is to document more clearly that we expect autoscaling to only be used in clusters where all ML nodes are dedicated ML nodes.
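
For reference, a dedicated ML node in this sense is one whose `node.roles` list contains nothing but `ml` (assuming 7.9+, where `node.roles` replaced the legacy `node.ml` flag):

```yaml
# elasticsearch.yml on a dedicated ML node: only the ml role,
# so the node runs ML workloads and nothing else
node.roles: [ ml ]
```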

But I can also see that the old code was problematic: it assumed that the value of this setting on the current master node applied to the ML nodes in the cluster, when in fact the appropriate value may not have been known when the current master node was started. Another way to get around that would be to make the setting a dynamic cluster-wide setting (sketched below). Then the cluster operator, given their knowledge of the infrastructure, would set it appropriately via a call to the cluster settings endpoint.
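
For context, this is roughly what a dynamic cluster-wide setting looks like in the Elasticsearch codebase. This is a hypothetical sketch using the standard `Setting` API, not the exact code from this PR; the class and field names are invented:

```java
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;
import org.elasticsearch.common.unit.ByteSizeValue;

// Hypothetical holder class; names are invented for illustration.
public class MlDynamicSettingsSketch {

    // Property.Dynamic makes the setting updatable at runtime through the
    // cluster settings API; Property.NodeScope marks it as a node/cluster
    // setting rather than an index setting. The setting must also be
    // registered, e.g. via the plugin's getSettings() list.
    public static final Setting<ByteSizeValue> MAX_ML_NODE_SIZE =
        Setting.byteSizeSetting(
            "xpack.ml.max_ml_node_size",
            ByteSizeValue.ZERO,   // default: node size unknown
            Property.Dynamic,
            Property.NodeScope);

    private volatile ByteSizeValue maxMlNodeSize;

    MlDynamicSettingsSketch(ClusterService clusterService) {
        this.maxMlNodeSize = MAX_ML_NODE_SIZE.get(clusterService.getSettings());
        // React to operator updates at runtime, so the master no longer
        // depends on the value it happened to read at its own startup.
        clusterService.getClusterSettings()
            .addSettingsUpdateConsumer(MAX_ML_NODE_SIZE, value -> this.maxMlNodeSize = value);
    }
}
```

The key points are `Property.Dynamic`, which allows updates through the cluster settings API, and the update consumer, which lets the master react to operator changes without a restart.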

@benwtrent changed the title [ML] handle largest possible node size settings at the node level → [ML] make `xpack.ml.max_ml_node_size` and `xpack.ml.use_auto_machine_memory_percent` dynamically settable Dec 10, 2020
@droberts195 (Contributor) left a review

LGTM

@benwtrent merged commit 3406422 into elastic:master Dec 14, 2020
@benwtrent deleted the feature/ml-autoscale-handle-max-node-size-for-individual-nodes branch December 14, 2020
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Dec 14, 2020
[ML] make `xpack.ml.max_ml_node_size` and `xpack.ml.use_auto_machine_memory_percent` dynamically settable (elastic#66132)

benwtrent added a commit that referenced this pull request Dec 14, 2020
[ML] make `xpack.ml.max_ml_node_size` and `xpack.ml.use_auto_machine_memory_percent` dynamically settable (#66132) (#66270)

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Dec 14, 2020
* elastic/master: (33 commits)
  Add searchable snapshot cache folder to NodeEnvironment (elastic#66297)
  [DOCS] Add dynamic runtime fields to docs (elastic#66194)
  Add HDFS searchable snapshot integration (elastic#66185)
  Support canceling cross-clusters search requests (elastic#66206)
  Mute testCacheSurviveRestart (elastic#66289)
  Fix cat tasks api params in spec and handler (elastic#66272)
  Snapshot of a searchable snapshot should be empty (elastic#66162)
  [ML] DFA _explain API should not fail when none field is included (elastic#66281)
  Add action to decommission legacy monitoring cluster alerts (elastic#64373)
  move rollup_index param out of RollupActionConfig (elastic#66139)
  Improve FieldFetcher retrieval of fields (elastic#66160)
  Remove unsed fields in `RestAnalyzeAction` (elastic#66215)
  Simplify searchable snapshot CacheKey (elastic#66263)
  Autoscaling remove feature flags (elastic#65973)
  Improve searchable snapshot mount time (elastic#66198)
  [ML] Report cause when datafeed extraction encounters error (elastic#66167)
  Remove suggest reference in some API specs (elastic#66180)
  Fix warning when installing a plugin for different ES version (elastic#66146)
  [ML] make `xpack.ml.max_ml_node_size` and `xpack.ml.use_auto_machine_memory_percent` dynamically settable (elastic#66132)
  [DOCS] Add `require_alias` to Bulk API (elastic#66259)
  ...