[SIEM] [ML] Deployments without ML Nodes do not show an error when enabling jobs #63155

Closed
spong opened this issue Apr 9, 2020 · 3 comments · Fixed by #150166
Labels
bug, Feature:ML Rule, impact:medium, Team:Detection Rule Management, Team:Detections and Resp, Team: SecuritySolution, Team:SIEM

Comments

@spong
Member

spong commented Apr 9, 2020

Similar to #54382, this is an issue around messaging/error states when an ML Node is unavailable.

On a fresh 7.7-BC5 cloud deployment without an ML Node, users can still enable ML Rules/Jobs as if ML were available. The jobs fail to start, and nothing tells the user why.

The proposed fix is to add health checks (like those added in #50766) to or around our MlCapabilitiesContext so that we have insight into:

  • Number of ML Nodes
  • Health status of ML Nodes (migration in progress, low on memory, etc.)

and then use this information to provide the necessary details to the user.
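
As a rough illustration of the proposed check, the sketch below maps ML node information to a message for the user. The `MlNodeInfo` shape, the `describeMlNodeAvailability` function, and the message wording are all hypothetical. They are not part of Kibana or of MlCapabilitiesContext, and how the node count is actually obtained is left out.

```typescript
// Hypothetical shape for ML node health; not an actual Kibana type.
interface MlNodeInfo {
  nodeCount: number;
  // Optional free-form health notes, e.g. "migration in progress", "low on memory".
  healthIssues?: string[];
}

// Hypothetical result: whether jobs can start, plus an optional user-facing message.
interface MlAvailability {
  canStartJobs: boolean;
  message: string | null;
}

// Decide whether jobs can start and what, if anything, to tell the user.
function describeMlNodeAvailability(info: MlNodeInfo): MlAvailability {
  if (info.nodeCount <= 0) {
    return {
      canStartJobs: false,
      message: 'No ML nodes are available in this deployment, so ML jobs cannot start.',
    };
  }
  const issues = info.healthIssues || [];
  if (issues.length > 0) {
    return {
      canStartJobs: true,
      message: 'ML nodes report issues: ' + issues.join(', ') + '.',
    };
  }
  return { canStartJobs: true, message: null };
}
```

With a result like this, the UI could disable the job toggle and show the message when `canStartJobs` is false, instead of letting the job sit in a perpetual loading state.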


For details, here are the response payloads returned when enabling a job. As you'll see, the jobs install without issue, but the responses contain nothing to indicate that no ML Node is available or that the job will never start.

Request: /api/ml/modules/setup/siem_auditbeat

Response

{
  "jobs": [
    { "id": "rare_process_by_host_linux_ecs", "success": true },
    { "id": "linux_anomalous_network_activity_ecs", "success": true },
    { "id": "linux_anomalous_network_port_activity_ecs", "success": true },
    { "id": "linux_anomalous_network_service", "success": true },
    { "id": "linux_anomalous_network_url_activity_ecs", "success": true },
    { "id": "linux_anomalous_user_name_ecs", "success": true },
    { "id": "linux_anomalous_process_all_hosts_ecs", "success": true }
  ],
  "datafeeds": [
    {
      "id": "datafeed-linux_anomalous_network_activity_ecs",
      "success": true,
      "started": false
    },
    {
      "id": "datafeed-linux_anomalous_network_port_activity_ecs",
      "success": true,
      "started": false
    },
    {
      "id": "datafeed-linux_anomalous_network_service",
      "success": true,
      "started": false
    },
    {
      "id": "datafeed-rare_process_by_host_linux_ecs",
      "success": true,
      "started": false
    },
    {
      "id": "datafeed-linux_anomalous_process_all_hosts_ecs",
      "success": true,
      "started": false
    },
    {
      "id": "datafeed-linux_anomalous_user_name_ecs",
      "success": true,
      "started": false
    },
    {
      "id": "datafeed-linux_anomalous_network_url_activity_ecs",
      "success": true,
      "started": false
    }
  ],
  "kibana": {}
}
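
To make the gap concrete, here is a minimal sketch that lists datafeeds reported as installed but not started in a setup response like the one above. The interfaces mirror only the fields visible in that payload, and `findInstalledButNotStarted` is a hypothetical helper, not an existing Kibana function. Note that `"started": false` on its own does not prove an ML Node is missing; it only shows that the response gives no reason either way.

```typescript
// Minimal shapes covering only the fields visible in the setup response.
interface SetupResult {
  id: string;
  success: boolean;
}

interface DatafeedSetupResult extends SetupResult {
  started: boolean;
}

interface ModuleSetupResponse {
  jobs: SetupResult[];
  datafeeds: DatafeedSetupResult[];
}

// Hypothetical helper: ids of datafeeds that installed successfully but were not started.
function findInstalledButNotStarted(response: ModuleSetupResponse): string[] {
  return response.datafeeds
    .filter((datafeed) => datafeed.success && !datafeed.started)
    .map((datafeed) => datafeed.id);
}
```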

Request: /api/ml/jobs/force_start_datafeeds

Response

{ "datafeed-linux_anomalous_network_activity_ecs": { "started": true } }

Request: /api/ml/jobs/jobs_summary

Response

[
  {
    "id": "linux_anomalous_network_activity_ecs",
    "description": "SIEM Auditbeat: Looks for unusual processes using the network which could indicate command-and-control, lateral movement, persistence, or data exfiltration activity (beta)",
    "groups": ["auditbeat", "process", "siem"],
    "processed_record_count": 0,
    "memory_status": "ok",
    "jobState": "opening",
    "hasDatafeed": true,
    "datafeedId": "datafeed-linux_anomalous_network_activity_ecs",
    "datafeedIndices": ["auditbeat-*"],
    "datafeedState": "starting",
    "isSingleMetricViewerJob": true,
    "auditMessage": {
      "level": "warning",
      "text": "No node found to start datafeed [datafeed-linux_anomalous_network_activity_ecs]. Reasons [datafeed awaiting job assignment.]"
    }
  },
  {
    "id": "linux_anomalous_network_port_activity_ecs",
    "description": "SIEM Auditbeat: Looks for unusual destination port activity that could indicate command-and-control, persistence mechanism, or data exfiltration activity (beta)",
    "groups": ["auditbeat", "process", "siem"],
    "processed_record_count": 0,
    "memory_status": "ok",
    "jobState": "closed",
    "hasDatafeed": true,
    "datafeedId": "datafeed-linux_anomalous_network_port_activity_ecs",
    "datafeedIndices": ["auditbeat-*"],
    "datafeedState": "stopped",
    "isSingleMetricViewerJob": true
  },
  {
    "id": "linux_anomalous_network_service",
    "description": "SIEM Auditbeat: Looks for unusual listening ports that could indicate execution of unauthorized services, backdoors, or persistence mechanisms (beta)",
    "groups": ["auditbeat", "process", "siem"],
    "processed_record_count": 0,
    "memory_status": "ok",
    "jobState": "closed",
    "hasDatafeed": true,
    "datafeedId": "datafeed-linux_anomalous_network_service",
    "datafeedIndices": ["auditbeat-*"],
    "datafeedState": "stopped",
    "isSingleMetricViewerJob": true
  },
  {
    "id": "linux_anomalous_network_url_activity_ecs",
    "description": "SIEM Auditbeat: Looks for an unusual web URL request from a Linux instance. Curl and wget web request activity is very common but unusual web requests from a Linux server can sometimes be malware delivery or execution (beta)",
    "groups": ["auditbeat", "process", "siem"],
    "processed_record_count": 0,
    "memory_status": "ok",
    "jobState": "closed",
    "hasDatafeed": true,
    "datafeedId": "datafeed-linux_anomalous_network_url_activity_ecs",
    "datafeedIndices": ["auditbeat-*"],
    "datafeedState": "stopped",
    "isSingleMetricViewerJob": true
  },
  {
    "id": "linux_anomalous_process_all_hosts_ecs",
    "description": "SIEM Auditbeat: Looks for processes that are unusual to all Linux hosts. Such unusual processes may indicate unauthorized services, malware, or persistence mechanisms (beta)",
    "groups": ["auditbeat", "process", "siem"],
    "processed_record_count": 0,
    "memory_status": "ok",
    "jobState": "closed",
    "hasDatafeed": true,
    "datafeedId": "datafeed-linux_anomalous_process_all_hosts_ecs",
    "datafeedIndices": ["auditbeat-*"],
    "datafeedState": "stopped",
    "isSingleMetricViewerJob": true
  },
  {
    "id": "linux_anomalous_user_name_ecs",
    "description": "SIEM Auditbeat: Rare and unusual users that are not normally active may indicate unauthorized changes or activity by an unauthorized user which may be credentialed access or lateral movement (beta)",
    "groups": ["auditbeat", "process", "siem"],
    "processed_record_count": 0,
    "memory_status": "ok",
    "jobState": "closed",
    "hasDatafeed": true,
    "datafeedId": "datafeed-linux_anomalous_user_name_ecs",
    "datafeedIndices": ["auditbeat-*"],
    "datafeedState": "stopped",
    "isSingleMetricViewerJob": true
  },
  {
    "id": "rare_process_by_host_linux_ecs",
    "description": "SIEM Auditbeat: Detect unusually rare processes on Linux (beta)",
    "groups": ["auditbeat", "process", "siem"],
    "processed_record_count": 0,
    "memory_status": "ok",
    "jobState": "closed",
    "hasDatafeed": true,
    "datafeedId": "datafeed-rare_process_by_host_linux_ecs",
    "datafeedIndices": ["auditbeat-*"],
    "datafeedState": "stopped",
    "isSingleMetricViewerJob": true
  }
]
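
The only hint in this payload is the `auditMessage` on the first job. As a sketch of how that hint could be surfaced until a proper node health check exists, the hypothetical `findJobsWithoutNode` helper below picks out jobs whose audit message starts with "No node found". Matching on message text is an assumption made for illustration and would be brittle in practice.

```typescript
// Minimal shapes covering only the fields used from the jobs_summary response.
interface AuditMessage {
  level: string;
  text: string;
}

interface JobSummary {
  id: string;
  jobState: string;
  datafeedState: string;
  auditMessage?: AuditMessage;
}

// Hypothetical helper: ids of jobs whose audit message says no node was found.
function findJobsWithoutNode(summaries: JobSummary[]): string[] {
  return summaries
    .filter(
      (job) =>
        job.auditMessage !== undefined &&
        job.auditMessage.text.startsWith('No node found')
    )
    .map((job) => job.id);
}
```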

Request: /api/ml/modules/recognize/apm-*-transaction*,auditbeat-*,endgame-*,filebeat-*,packetbeat-*,winlogbeat-*

Response

[
  {
    "id": "siem_auditbeat",
    "title": "SIEM Auditbeat",
    "query": {
      "bool": { "filter": [{ "term": { "agent.type": "auditbeat" } }] }
    },
    "description": "Detect suspicious network activity and unusual processes in Auditbeat data (beta)",
    "logo": { "icon": "securityAnalyticsApp" }
  }
]


ML Job Settings after enabling a job (loading spinner is perpetual):

ML App:

@spong added the bug, Team:SIEM, and v7.7.0 labels Apr 9, 2020
@elasticmachine
Contributor

Pinging @elastic/siem (Team:SIEM)

@spalger added the v7.7.1 label and removed the v7.7.0 label May 14, 2020
@spong removed the v7.7.1 label Jun 25, 2020
@spong
Member Author

spong commented Jun 25, 2020

@MadameSheema this is still relevant as of the most recent 7.9.0-snapshot and should be prioritized with the addition of new ML Jobs to improve UX.

@MindyRS added the Team: SecuritySolution label Oct 27, 2020
@MadameSheema added the Team:Detections and Resp, Feature:ML Rule, and impact:low labels Jan 21, 2021
@MadameSheema added the impact:medium label and removed the impact:low label Mar 25, 2021
@dontcallmesherryli added the impact:high label and removed the impact:medium label Apr 21, 2021
@MadameSheema added the impact:medium label and removed the impact:high label Jun 9, 2021
@peluja1012 added the Team:Detection Rule Management label Sep 15, 2021
@spong linked a pull request Feb 22, 2023 that will close this issue
@spong
Member Author

spong commented Feb 22, 2023

To be resolved by #150166! 🎉 🙌
