[8.0 only] Should the alerting services plugin always be enabled? #90934

mikecote · 2021-02-10T13:19:04Z

There don't seem to be use cases where users of Kibana would want to disable any of our plugins. If we can't find a valid reason, it feels like we should prevent our plugins from being disabled by deprecating the setting (ex: xpack.actions.enabled) in 7.x and removing the capability in 8.0. This will make things much more straightforward and allow other plugins to make the alerting-related plugins a required dependency.

elasticmachine · 2021-02-10T13:19:06Z

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

mikecote · 2021-02-10T13:19:57Z

cc @kobelb

pmuellr · 2021-02-10T13:48:37Z

If there's still some requirement that customers be able to run Kibana without alerts "enabled", we could implement a "soft-disable" config key, that does not run alerts at all - perhaps disable CRUD operations or something as well, maybe still let you read though.

Seems preferable to making other plugins treat alerting as an optional plugin, with all that entails.

You can already "soft-disable" actions via the enabledActionTypes config key - you can just set it to an empty array to disable all the action types.

Crazybus · 2021-05-28T07:04:44Z

In the meantime, would it make sense to update the wording in the documentation to make it clearer what this setting does?

https://www.elastic.co/guide/en/kibana/current/alert-action-settings-kb.html#general-alert-action-settings

xpack.actions.enabled | Feature toggle that enables Actions in Kibana. Defaults to true.

The current wording makes it sound like this setting is a feature toggle that only disables Kibana alerting, when in reality it disables the plugin and any other Kibana applications that depend on it.

mikecote · 2021-08-10T15:07:12Z

Based on #89584, I could not find explicit requirements to have our plugins disable-able today so I'm moving this discussion in the direction of removing .enabled in all alerting plugins (event log, task manager, alerting, actions).

Having dedicated Kibanas for alerting should be researched to ensure it addresses the requirements while thinking about concerns like having 0 Kibanas for alerting (by accident), situations where we'd be unable to schedule tasks / create rules in certain Kibanas, etc.

pmuellr · 2021-08-10T17:08:09Z

Having dedicated Kibanas for alerting should be researched to ensure it addresses the requirements while thinking about concerns like having 0 Kibanas for alerting (by accident), situations where we'd be unable to schedule tasks / create rules in certain Kibanas, etc.

I think "Kibanas that only run alerts" and "Kibanas that don't run alerts" kind of story is probably independent of whether the base plugins should be enabled. You'd probably still want the alerting UIs to work in "Kibanas that don't run alerts", you just don't run the tasks on those Kibanas. But of course, we're nowhere near figuring out how this would really work, so ... just my 2¢.

mikecote · 2021-08-10T18:20:37Z

I think "Kibanas that only run alerts" and "Kibanas that don't run alerts" kind of story is probably independent of whether the base plugins should be enabled. You'd probably still want the alerting UIs to work in "Kibanas that don't run alerts", you just don't run the tasks on those Kibanas. But of course, we're nowhere near figuring out how this would really work, so ... just my 2¢.

++ I agree with everything mentioned here.

chrisronline · 2021-08-12T15:17:07Z

To clarify here, it sounds like the plan is to:

Deprecate the ability to disable these plugins in 7.x
Remove the .enabled config for these plugins in 8.0

Are we still discussing this, or are we happy with the above approach?

mikecote · 2021-08-12T15:46:57Z

Are we still discussing this, or are we happy with the above approach?

I'm 👍 to say we're happy with the above approach. The reason we originally added support for .enabled was to comply with the standard of being able to disable any plugin in Kibana. Now that the narrative has changed, I'm thinking we follow along, as there weren't prior requirements for this.

To clarify here, it sounds like the plan is to:

Deprecate the ability to disable these plugins in 7.x

Remove the .enabled config for these plugins in 8.0

Yup that's the correct plan 👍

mikecote · 2021-09-08T13:45:31Z

@kobelb @stacey-gammon @lukeelmers since deprecating the .enabled flags in our plugins (#108281), we've started hearing use-cases that our customers are using these flags for.

The most recent one is about preventing certain Kibana instances from running the alerting or actions plugin(s), because those users don't want these instances running actions or alerting tasks. We don't support dedicated alerting instances at this time because disabling the plugins comes with side effects: some plugins (like Observability) become completely disabled, and the non-alerting instances cannot enqueue rule or action tasks for other Kibanas to pick up.

It seems removing the .enabled flag will cause some friction for users who relied on such a flag to set up their deployment, but it's also something we don't recommend / support. We are transitioning towards an internal xpack.task_manager.internal.exclude_task_types flag (#111036) that allows us to temporarily disable certain task types to debug Kibana, but we feel it will also become the new norm and prevent us from keeping it internal / removing such configuration at a future time.

Question: Given the context above, this will be a breaking change and cause some friction. Is your recommendation still to move forward with the removal of .enabled, and to create an ER for ourselves to officially support dedicated alerting Kibanas (which some users may be waiting on before upgrading)?

kobelb · 2021-09-08T17:47:55Z

Question: Given the context above, this will be a breaking change and cause some friction. Is your recommendation still to move forward with the removal of .enabled, and to create an ER for ourselves to officially support dedicated alerting Kibanas (which some users may be waiting on before upgrading)?

Based on all of the information in this thread, it's my understanding that we do see benefit from preventing a specific Kibana node from executing alerts; however, we don't want to make all consumers of the alerting framework deal with the complexities of the plugin being entirely disabled. If this is the case, I think it'd make sense for us to remove the xpack.alerting.enabled and xpack.actions.enabled settings that disable the plugins entirely and instead add an xpack.alerting.rule_execution.enabled flag that only prevents the task manager from executing the alerting rules themselves.

Is this feasible?

chrisronline · 2021-09-08T17:51:27Z

@mikecote Does it feel possible that users might want to keep rule execution but disable action execution? A config like xpack.task_manager.internal.exclude_task_types gives us this flexibility out of the box - if we went down the route of xpack.alerting.rule_execution.enabled, we'd have to potentially duplicate this config across, at least, the actions plugin too (which is doable too)

mikecote · 2021-09-09T13:34:50Z

Thanks @kobelb for input! After chatting with @chrisronline offline (chrisroffline?) we're still not sure how a customer could manage to make Kibana alerting work on dedicated instances. So we've decided to pursue the path of having an internal xpack.task_manager.internal.exclude_task_types configuration for now (for ourselves, debugging purposes, etc) and take space/time to properly develop xpack.alerting.rule_execution.enabled and xpack.actions.action_execution.enabled in a way we would be comfortable to support when it becomes a priority.

chrisronline · 2021-09-09T15:13:46Z

@YulNaumenko @ymao1 @pmuellr Does anyone else have a strong opinion about the above direction?

pmuellr · 2021-09-09T15:53:53Z

General direction seems fine to me.

w/r/t xpack.task_manager.internal.exclude_task_types - we already have xpack.actions.enabledActionTypes:

A list of action types that are enabled. It defaults to [*], enabling all types. The names for built-in Kibana action types are prefixed with a . and include: .server-log, .slack, .email, .index, .pagerduty, and .webhook. An empty list [] will disable all action types.

Disabled action types will not appear as an option when creating new connectors, but existing connectors and actions of that type will remain in Kibana and will not function.
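To make the quoted semantics concrete, here is a minimal TypeScript sketch of an allow-list check. It is not the actions plugin's code: the function name isActionTypeEnabled is hypothetical, and the only behavior taken from the quoted docs is that * enables every type and an empty list disables all of them.

```typescript
// Hypothetical helper, not the actions plugin's implementation.
// Models the documented behavior: ["*"] enables every type, [] disables all.
function isActionTypeEnabled(enabledActionTypes: string[], actionTypeId: string): boolean {
  if (enabledActionTypes.indexOf("*") !== -1) {
    return true;
  }
  return enabledActionTypes.indexOf(actionTypeId) !== -1;
}

const defaults: string[] = ["*"];       // the documented default
const onlyEmail: string[] = [".email"]; // a narrowed allow-list
const softDisabled: string[] = [];      // the "soft-disable" case

console.log(isActionTypeEnabled(defaults, ".slack"));     // true
console.log(isActionTypeEnabled(onlyEmail, ".slack"));    // false
console.log(isActionTypeEnabled(softDisabled, ".email")); // false
```

A rule-type equivalent, as floated in the next paragraph, could presumably follow the same shape with rule type ids in place of action type ids.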

Would it make sense to have a similar setting for rule types? Customers could then essentially disable running rules by setting the value to [].

Then we'd be left with non-alerting tasks, and whether we'd want to disable those as well. Presumably task manager could have a similar setting (or does it already?).

I'm also wondering about our alerting tasks that aren't rule / connector execution - api key invalidation, etc - I'm assuming these won't have issues if "alerting is disabled", but not sure.

Also wondering what happens to rules already scheduled, when "alerting is disabled". Presumably the task docs still exist, the rules are still executed from a task manager POV, but then the rule executor would basically just return immediately. Could log a warning, maybe, since presumably the customer should go ahead and disable these.

gmmorris · 2021-09-09T17:11:31Z

There's a potential unintended consequence to the ability of disabling specific task types on a Kibana instance - it can break features like "Run Now".

For example:
Say a customer has two Kibanas:

Kibana A is configured to run all tasks
Kibana B is configured to disable all alerting task types

User uses Stack Management to update an existing rule from interval:1h to interval:5m.
When they click "save" the API call hits Kibana B which updates the rule SO and then calls runNow on the rule in order to force the task to pull a fresh rule configuration from the SO. The runNow fails because that task type is disabled in Kibana B, but I think † this failure is silent (limited to Server Log) as runNow is async.
The user is given the feedback that the update was successful, and the UX reflects back that the interval is now 5m because the rule SO was updated. Sadly, the task was not, and the rule doesn't run until the original schedule is reached, which might be a whole hour away.

In general, I worry this config could be naively used to create "alerting only Kibana" instances, and this will likely have unintended consequences.
If we add this config it has to be marked experimental and unsafe somehow.

†: it might actually fail the update, but that still sucks, as the user randomly fails to update the rule 50% of the time.
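The scenario above can be modeled in a few lines of TypeScript. This is only an in-memory sketch under the assumptions stated in the comment (runNow fails silently when the task type is disabled on the node handling the request); every name here is hypothetical and none of it is Kibana's API.

```typescript
// Hypothetical in-memory model of the scenario above; none of these names are Kibana APIs.
interface RuleSavedObject {
  id: string;
  interval: string;
}

interface TaskDoc {
  ruleId: string;
  taskType: string;
  interval: string;
}

// Models runNow: the task picks up the fresh rule configuration,
// unless the node handling the request excludes that task type.
function runNow(excludedTaskTypes: string[], task: TaskDoc, rule: RuleSavedObject): boolean {
  if (excludedTaskTypes.indexOf(task.taskType) !== -1) {
    return false; // assumed to be silent apart from a server log entry
  }
  task.interval = rule.interval;
  return true;
}

// Models the update API: the rule SO is written first, then runNow is attempted.
function updateRuleInterval(
  excludedTaskTypes: string[],
  rule: RuleSavedObject,
  task: TaskDoc,
  newInterval: string
): { saved: boolean; taskRefreshed: boolean } {
  rule.interval = newInterval;
  const taskRefreshed = runNow(excludedTaskTypes, task, rule);
  return { saved: true, taskRefreshed: taskRefreshed };
}

const rule: RuleSavedObject = { id: "rule-1", interval: "1h" };
const task: TaskDoc = { ruleId: "rule-1", taskType: "alerting:example", interval: "1h" };

// The request lands on Kibana B, which excludes the rule's task type.
const kibanaBExcluded: string[] = ["alerting:example"];
const outcome = updateRuleInterval(kibanaBExcluded, rule, task, "5m");

console.log(outcome.saved, outcome.taskRefreshed); // true false
console.log(rule.interval); // 5m (what the UI reflects back)
console.log(task.interval); // 1h (the schedule the task still follows)
```

Routing the same request to a node with an empty exclusion list refreshes the task as expected, which is why the failure would look intermittent behind a load balancer.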

pmuellr · 2021-09-09T17:24:08Z

Ya, it's complicated. RFC?

Another thing I happened to think of, is that whatever we come up with here to "disable alerting", could also potentially be used to prevent the "rando Kibana with different encryption key screws up all the ESOs". Like, you should really opt-in to alerting (or task manager) via some pre-arranged "id", so that rando Kibanas wouldn't actually upset alerting / task manager the way they do today. Not saying it HAS to, but it would be nice to solve that issue as well, and whatever we do in general regarding this issue might help :-)

chrisronline · 2021-09-09T18:48:47Z

I'd like to revisit the problem statement as I understand it.

Problem

A user is experiencing a performance issue in Kibana and isn't sure where it's coming from. They'd like to disable various features of Kibana to isolate the issue. Because we deprecated the ability to disable the various plugins alerting owns (with the goal to remove this functionality in 8.x), users are unable to determine if the performance issue stems from background work or something else.

To ensure users still have a way to isolate performance problems, we need to expose a config (that is not a feature, not supported, and inherently unsafe to use for more than a short period of time while debugging) that allows them to stop all/some background tasks while they are debugging the performance issue. This config should exist at the task manager level so they have better control over what they want to disable (everything, just actions, just rules, some other background task, etc)

Concerns

A concern is if users start using this config to enable different use cases (such as a dedicated Kibana to execute rules/actions). This is not the intended use case and therefore the config will explicitly state that it is "not safe" to use outside of debugging/troubleshooting. I know there are varying opinions about how much control we should give users to "shoot themselves in the foot" but I feel we can counteract that in two ways:

Be very explicit that this config is unsafe/unstable
Ensure this config is "supportable", meaning it will not take long to solve a support case where the root cause is this config (logging, an additional field in the TM health API, etc.)

I'm fairly sure our current solution solves this well, and I don't think we need to worry about edge cases, as we're not building a supported feature.

Thoughts?

kobelb · 2021-09-09T19:38:52Z

I had a quick call with @chrisronline and @mikecote on Zoom, and I'll summarize my conclusions. I think that all of the proposed options are tolerable:

1. Add an xpack.task_manager.unsafe.exclude_task_types setting
2. Don't add new settings and tell users to configure xpack.task_manager.poll_interval: 12345678901234567890 and xpack.task_manager.max_workers: 1
3. Add xpack.alerting.rule_execution.enabled and xpack.actions.action_execution.enabled

I will support whatever decision the team makes.

mikecote · 2021-09-09T20:02:36Z

We will be going with Option 1 as @kobelb mentioned above. The reason we are working on such a flag is what @chrisronline described here: #90934 (comment). The other discussion items provide use cases if this was an official feature, but we don't have time or scope to develop an official feature for 7.16 as we remove the ability to disable our plugins. To mitigate the concerns, we've documented in the Task Manager's README that "This configuration is experimental, unsupported and can only be used for temporary debugging purposes as it will cause Kibana to behave in unexpected ways.", which seems to capture the concerns mentioned above. A warning will also appear when starting Kibana.

we already have xpack.actions.enabledActionTypes

This configuration would allow Task Manager to claim tasks and fail them during executions (due to being disabled). It would be preferable not to have any part of the task run.

I'm also wondering about our alerting tasks that aren't rule / connector execution - api key invalidation, etc - I'm assuming these won't have issues if "alerting is disabled", but not sure.

The xpack.task_manager.unsafe.exclude_task_types route can handle this by setting the relevant task types as values.

Also wondering what happens to rules already scheduled, when "alerting is disabled"

In the xpack.task_manager.unsafe.exclude_task_types use case, Kibana will not be claiming as many or any tasks, so no claiming mutations will happen to those task documents, and they would remain untouched from a running perspective (Kibana can still schedule tasks, etc).
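As a rough illustration of the difference from enabledActionTypes, the sketch below filters task types before anything is claimed, so excluded tasks are never picked up on that node. It is a simplified model, not Task Manager's implementation: the function names are hypothetical, the task type names are illustrative, and the trailing * wildcard handling is an assumption rather than something stated in this thread.

```typescript
// Hypothetical sketch; not how Task Manager actually builds its claim queries.
// Assumption: a pattern ending in "*" matches any task type with that prefix.
function isTaskTypeExcluded(excludePatterns: string[], taskType: string): boolean {
  return excludePatterns.some((pattern) => {
    if (pattern.charAt(pattern.length - 1) === "*") {
      const prefix = pattern.slice(0, -1);
      return taskType.indexOf(prefix) === 0;
    }
    return pattern === taskType;
  });
}

// Only the task types returned here would be considered for claiming on this node.
function claimableTaskTypes(registered: string[], excludePatterns: string[]): string[] {
  return registered.filter((taskType) => !isTaskTypeExcluded(excludePatterns, taskType));
}

// Illustrative task type names, not an authoritative list.
const registered: string[] = ["alerting:.index-threshold", "actions:.email", "example_cleanup"];

console.log(claimableTaskTypes(registered, ["alerting:*"]));
console.log(claimableTaskTypes(registered, ["alerting:*", "actions:*"]));
console.log(claimableTaskTypes(registered, []));
```

Because the filter applies before claiming, the task documents for excluded types stay untouched on this node, which matches the behavior described above; scheduling is not modeled here.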

it can break features like "Run Now".

We've aimed our README documentation to mention that it will make Kibana behave in unexpected ways (This configuration is experimental, unsupported and can only be used for temporary debugging purposes as it will cause Kibana to behave in unexpected ways.). So we are covered under that use case. We're also refraining from officially documenting this setting so we can remove it at a future time when we have a better idea of how to officially support dedicated Kibanas to run alerting and address these types of problems.

leandrojmp · 2024-07-17T02:56:42Z

Hello, did this go anywhere?

So we are covered under that use case. We're also refraining from officially documenting this setting so we can remove it at a future time when we have a better idea of how to officially support dedicated Kibanas to run alerting and address these types of problems.

We are having some performance issues related to alerts and would like to have dedicated Kibana instances to run the alert tasks, because it is impacting the overall usage of Kibana.

I couldn't find anything in the documentation about this, so I'm assuming that this has not moved forward.

Is this still on the roadmap?

shanisagiv1 · 2024-07-23T08:34:12Z

After discussing this internally recently, unfortunately, there is no plan to support that in the near future.

pmuellr · 2024-07-23T12:58:59Z

We are having some performance issues related to alerts and would like to have dedicated Kibana instances to run the alert tasks, because it is impacting the overall usage of Kibana.

I couldn't find anything in the documentation about this, so I'm assuming that this has not moved forward.

Sounds like the node.roles configuration may work for you: https://www.elastic.co/guide/en/kibana/current/settings.html

The default setting is *, which means Kibana operates in both the ui and background_tasks roles. But you can also run a set of Kibanas where each has only one role. Anything scheduled by task manager (like alerting rules) will run in the Kibanas with a background_tasks role, and HTTP requests are only served by Kibanas with a ui role.
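A small TypeScript sketch of the role split described above may help. It only models the behavior stated in this comment (the * default covers both roles, background_tasks nodes run scheduled tasks, ui nodes serve HTTP requests); the function names are hypothetical and this is not Kibana's node service.

```typescript
type NodeRole = "ui" | "background_tasks";

// Hypothetical helper: expands the "*" default into both roles.
function resolveRoles(configured: string[]): NodeRole[] {
  const all: NodeRole[] = ["ui", "background_tasks"];
  if (configured.indexOf("*") !== -1) {
    return all;
  }
  return all.filter((role) => configured.indexOf(role) !== -1);
}

// Summarizes what a node with the given node.roles value would do in this model.
function describeNode(configured: string[]): { servesHttp: boolean; runsTasks: boolean } {
  const roles = resolveRoles(configured);
  return {
    servesHttp: roles.indexOf("ui") !== -1,
    runsTasks: roles.indexOf("background_tasks") !== -1,
  };
}

const defaultNode = describeNode(["*"]);
const uiNode = describeNode(["ui"]);
const taskNode = describeNode(["background_tasks"]);

console.log(defaultNode.servesHttp, defaultNode.runsTasks); // true true
console.log(uiNode.servesHttp, uiNode.runsTasks);           // true false
console.log(taskNode.servesHttp, taskNode.runsTasks);       // false true
```

In this model a deployment needs at least one node of each kind, otherwise either HTTP requests or scheduled tasks have nowhere to run.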

leandrojmp · 2024-07-23T18:07:15Z

Hello @pmuellr,

I'm not sure this would work. I couldn't find a description of which tasks Kibana executes with the ui role versus the background_tasks role, and I'm not sure this is documented.

But according to support it is not possible to have Kibana instances dedicated to alerts only; even if I add more instances and do not put them behind our current Kibana LB, they would still run all tasks.

Support also said that there are already some enhancement requests about this and they opened another one with the number #22253.

We had a call with some Elastic Engineers today about other stuff and we also mentioned this, being able to uncouple the alerting functions from the other kibana functions.

pmuellr · 2024-07-24T16:49:15Z

But according to support it is not possible to have Kibana instances dedicated to alerts only

True, but only because there are other tasks (clean up, telemetry, etc) that run in the background_tasks nodes, besides alerting rules (and connector executions for alert notifications). Some of those can be expensive as well, so actually good to have all background tasks separate from "ui" processing.

I'm not sure this would work ...

But according to support it is not possible to have Kibana instances dedicated to alerts only; even if I add more instances and do not put them behind our current Kibana LB, they would still run all tasks.

With the default config, for on-prem, true. For ESS, you'll start getting separate background_tasks and ui nodes once you go past 8GB in stateful. For serverless, there are always separate background_tasks and ui nodes, which are autoscaled separately.

pmuellr · 2024-07-24T17:38:55Z

And ya, sorry, we don't have much in the way of docs on this. I've opened the issue "add doc for node.roles" #189116 to track ...

leandrojmp · 2024-07-24T18:01:00Z

As noted in ER Dedicated instances for Kibana rules #22253, this is already in use in stateful and serverless in ESS.

This issue is private, but would this also be the case for on-prem deployments? The answer I got from support is that this is not possible.

Not sure now if I should reopen the case linking this thread to get more information.

pmuellr · 2024-07-24T18:47:49Z

Yes, this works for on-prem as well.

If you already had a support case open on this topic, I'd suggest re-opening it or opening a new one.

Sorry about the inaccessible link. It doesn't really say much more than my comment on ESS / serverless above ... (figured I'd just duplicate the info here).

mikecote added the discuss, Feature:Alerting, Feature:Task Manager, Feature:Actions, Team:ResponseOps, and Feature:EventLog labels Feb 10, 2021

mikecote changed the title ~~Should the alerting services plugin be allowed to be disabled?~~ Should the alerting services plugin always be enabled? Feb 10, 2021

mikecote mentioned this issue Mar 2, 2021

[Alerting][Docs] Changed alerting documentation to point to a single source of explaining the configurations. #92942

Merged

1 task

mikecote changed the title ~~Should the alerting services plugin always be enabled?~~ [8.0 only] Should the alerting services plugin always be enabled? Jun 15, 2021

mikecote added the Breaking Change label Jun 15, 2021

This was referenced Jul 5, 2021

Triggers action ui x-pack configuration error #64920

Closed

[RAC][Epic] Retiring deprecated alerting framework features #104320

Closed

gmmorris added the loe:large label Jul 6, 2021

chrisronline mentioned this issue Aug 12, 2021

Deprecate ability to disable alerting, actions, task manager, stack alerts, and event log plugins #108281

Merged

chrisronline mentioned this issue Aug 12, 2021

[7.x] Deprecate ability to disable alerting, actions, task manager, stack alerts, and event log plugins #108396

Closed

gmmorris added the core services Issues related to enabling features across Kibana to leverage core services across domains label Aug 16, 2021

chrisronline self-assigned this Aug 17, 2021

gmmorris added the estimate:medium label Aug 18, 2021

gmmorris removed the loe:large label Sep 2, 2021

chrisronline mentioned this issue Sep 29, 2021

[Alerting] Prevent our plugins from being disabled #113461

Merged

chrisronline closed this as completed in #113461 Oct 5, 2021

stefnestor mentioned this issue Jan 10, 2022

Bulk enable/disable Kibana Rules #116017

Open

kobelb added the needs-team label Jan 31, 2022

botelastic bot removed the needs-team label Jan 31, 2022
