[SIEM] [ML] Deployments without ML Nodes do not show an error when enabling jobs #63155
Labels
bug
Fixes for quality problems that affect the customer experience
Feature:ML Rule
Security Solution ML Rule feature
impact:medium
Addressing this issue will have a medium level of impact on the quality/strength of our product.
Team:Detection Rule Management
Security Detection Rule Management Team
Team:Detections and Resp
Security Detection Response Team
Team: SecuritySolution
Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc.
Team:SIEM
Similar to #54382, this is an issue around messaging/error states when an ML Node is unavailable.
On a fresh 7.7-BC5 cloud deployment without an ML Node, users can still enable ML Rules/Jobs as if ML is available. Jobs will fail to start, and there is no additional messaging to the user as to why.
Proposed fix would be to add additional health checks (as added in #50766) to/around our MlCapabilitiesContext such that we have insight into: and then use this information to provide the necessary details to the user.
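A rough sketch of what such a check might look like, to make the proposal concrete. The `MlHealthSummary` shape, its field names, and `getJobStartBlocker` are all hypothetical and are not the actual MlCapabilitiesContext API; how the node count would be obtained is left open here.

```typescript
// Hypothetical summary of ML health; NOT the real MlCapabilitiesContext shape.
interface MlHealthSummary {
  mlEnabled: boolean; // assumed flag: ML feature is enabled for this user/space
  mlNodeCount: number; // assumed value: number of ML nodes in the cluster
}

// Returns a user-facing reason why jobs cannot start, or null if nothing blocks them.
function getJobStartBlocker(health: MlHealthSummary): string | null {
  if (!health.mlEnabled) {
    return 'Machine learning is not enabled for this deployment.';
  }
  if (health.mlNodeCount < 1) {
    return 'No ML nodes are available, so jobs can be installed but will never start.';
  }
  return null;
}

// Example: a deployment like the one in this issue (ML enabled, no ML node).
const blocker = getJobStartBlocker({ mlEnabled: true, mlNodeCount: 0 });
if (blocker !== null) {
  console.warn(blocker);
}
```

With something along these lines, the UI could show the returned message next to the job toggle instead of leaving the loading spinner running.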
For details, here are the response payloads when clicking/enabling a job. As you'll see, jobs will install without issue; however, the responses don't include any information indicating that there isn't an ML Node available or that the job will never start.
Request:
/api/ml/modules/setup/siem_auditbeat
Response
Request:
/api/ml/jobs/force_start_datafeeds
Response
Request:
/api/ml/jobs/jobs_summary
Response
Request:
/api/ml/modules/recognize/apm-*-transaction*,auditbeat-*,endgame-*,filebeat-*,packetbeat-*,winlogbeat-*
Response
ML Job Settings after enabling a job (loading spinner is perpetual):
ML App: