[Alerts][Actions] The Email connector stopped working after upgrading Kibana from 7.10.1 to 7.11.1. #94021

Closed
YulNaumenko opened this issue Mar 8, 2021 · 20 comments
Labels
bug Fixes for quality problems that affect the customer experience Feature:Actions Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@YulNaumenko
Contributor

Kibana version:
7.11.1.

Describe the bug:

Two customers have reported a problem with the Email connector after upgrading from 7.10.1 to 7.11.1:

@maggieghamry

We recently upgraded from 7.10.1 to 7.11.1. The alert email connector we had was working until after the upgrade. We did not change anything on the config. We checked with the mail team and they do not see any issues on their side.
This is our config -
host - appmail.federated.fds, port:25
The error we are getting is -
Action failed to run
The following error was found:
error sending email
Details:
Hostname/IP does not match certificate's altnames: Host: appmail.fds.com. is not in the cert's altnames: DNS:securemail.fds.com, DNS:www.securemail.fds.com
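
For reference, the altnames error above comes down to a name comparison between the host being connected to and the DNS entries in the certificate. The following is a minimal sketch of that comparison, not the actual check performed by Node.js or Kibana; the function names are hypothetical and the wildcard handling is simplified (a leading `*` label matches exactly one label).

```python
from typing import Iterable


def normalize(name: str) -> str:
    """Lower-case a DNS name and drop surrounding whitespace and a trailing dot."""
    return name.strip().rstrip(".").lower()


def matches_altname(hostname: str, pattern: str) -> bool:
    """Return True if hostname matches one certificate DNS altname (simplified rules)."""
    host_labels = normalize(hostname).split(".")
    pattern_labels = normalize(pattern).split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    # Simplification: a leading "*" label matches exactly one host label.
    if pattern_labels[0] == "*":
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


def hostname_in_altnames(hostname: str, altnames: Iterable[str]) -> bool:
    """Return True if hostname matches any of the certificate's DNS altnames."""
    return any(matches_altname(hostname, alt) for alt in altnames)


# Values taken from the error message reported above.
cert_altnames = ["securemail.fds.com", "www.securemail.fds.com"]
print(hostname_in_altnames("appmail.fds.com", cert_altnames))
```

With the values from the report, the host does not match either altname, which is consistent with the error text.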

and

@predogma

This customer was running Kubernetes on OpenShift with version 7.10.1 and was hitting a known issue (https://github.com/elastic/kibana/issues/88733) with the Slack connector. We had him update to v7.11.1, which corrected the Slack connector issue but broke the email connector, which had been functioning.
Error "message": "action execution failure: .email:f38886b0-7797-11eb-aa11-01cc07c9a284: email: error sending email: Greeting never received"
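
One way to narrow down a "Greeting never received" error is to check, from the Kibana host, whether the SMTP server sends its plaintext banner at all. The sketch below is an assumption about how such a check could look, using only the Python standard library; `read_greeting` and `is_smtp_greeting` are hypothetical helper names. Note that a server expecting implicit TLS (commonly port 465) waits for a TLS handshake and will not send a plaintext banner, so a timeout there does not by itself indicate a broken server.

```python
import socket


def is_smtp_greeting(line: str) -> bool:
    """Return True if the line looks like an SMTP service-ready banner (code 220)."""
    return line.startswith("220")


def read_greeting(host: str, port: int, timeout: float = 10.0) -> str:
    """Connect over plain TCP and return the first bytes the server sends.

    Raises socket.timeout (an OSError) if nothing arrives within `timeout` seconds.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = sock.recv(1024)
    return data.decode("ascii", errors="replace").strip()


# Example usage (placeholder host; replace with the real SMTP server):
# banner = read_greeting("smtp.example.com", 25)
# print(banner, is_smtp_greeting(banner))
```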
@YulNaumenko YulNaumenko added bug Fixes for quality problems that affect the customer experience Feature:Alerting Feature:Actions Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Mar 8, 2021
@predogma

predogma commented Mar 9, 2021

FYI, the deployment updated from 7.10 to 7.11.1 already has these set. The issue did not occur in 7.10, only after updating to 7.11.1 (same configs):

  xpack.actions.proxyRejectUnauthorizedCertificates: false
  xpack.actions.rejectUnauthorized: false

@ymao1
Contributor

ymao1 commented Mar 11, 2021

@predogma Can you provide the customer's email configuration that is not working? Thanks!

@ymao1 ymao1 self-assigned this Mar 11, 2021
@predogma

@ymao1 The customer is on Kubernetes (OpenShift); the following is from the kibana-configmap.yml:

...
    xpack:
      security:
        encryptionKey: ${KIBANA_ENCRYPTION_KEY}
        session:
          idleTimeout: 3600000
          lifespan: '30d'
        enabled: 'true'
        loginAssistanceMessage: ''
        loginHelp: ''
      encryptedSavedObjects:
        encryptionKey: ${KIBANA_ENCRYPTION_KEY}
      reporting:
        encryptionKey: ${KIBANA_ENCRYPTION_KEY}
      actions:
        proxyUrl: 'https://*********:8083'
        allowedHosts: ['*']
        proxyRejectUnauthorizedCertificates: false
        rejectUnauthorized: false
        enabledActionTypes: ['*']
...

@ymao1
Contributor

ymao1 commented Mar 11, 2021

@predogma Thank you for providing the Kibana config! Can you provide the email connector configuration? You can access this in the .kibana index using this query:

GET .kibana/_search
{
  "query": {
    "bool": {
      "filter": [{
        "terms": {
          "type": [
            "action"
          ]
        }
      },{
        "terms": {
          "action.actionTypeId": [
            ".email"
          ]
        }
      }]
    }
  }
}
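
Once the query above returns, the fields that matter for this investigation can be pulled out of each hit. The following is a rough sketch, assuming the standard Elasticsearch search response shape (`hits.hits[]._source.action`); `EMAIL_CONNECTOR_QUERY` and `summarize_email_connectors` are hypothetical names, and how the response is fetched (Dev Tools, curl, a client library) is left out.

```python
from typing import Any, Dict, List

# Same query body as above, expressed as a Python dict.
EMAIL_CONNECTOR_QUERY: Dict[str, Any] = {
    "query": {
        "bool": {
            "filter": [
                {"terms": {"type": ["action"]}},
                {"terms": {"action.actionTypeId": [".email"]}},
            ]
        }
    }
}


def summarize_email_connectors(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Extract id, name, host, port, secure and hasAuth from each email connector hit."""
    summaries = []
    for hit in response.get("hits", {}).get("hits", []):
        action = hit.get("_source", {}).get("action", {})
        if action.get("actionTypeId") != ".email":
            continue  # Skip anything that is not an email connector.
        config = action.get("config", {})
        summaries.append(
            {
                "id": hit.get("_id"),
                "name": action.get("name"),
                "host": config.get("host"),
                "port": config.get("port"),
                "secure": config.get("secure"),
                "hasAuth": config.get("hasAuth"),
            }
        )
    return summaries
```

Passing the parsed JSON response to `summarize_email_connectors` gives one small dict per connector, which is easier to compare across versions than the raw saved objects.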

@pmuellr
Member

pmuellr commented Mar 25, 2021

This seems like the bug that was resolved in 7.11.2 - issue #91363 (comment)

7.11.2 should be avoided because of some migration errors with some alerts, but the fix should be in 7.12.0 which is now out, as well.

@ymao1
Contributor

ymao1 commented Mar 25, 2021

Thanks @pmuellr! I thought the workaround until upgrading to 7.11.2/7.12 was to set rejectUnauthorized to false, but it sounds like it is already set to false and still not working?

@predogma

@ymao1 @pmuellr Unfortunately they don't seem to have configuration any more, but it was basic.

I am a bit confused by this. Is it resolved in 7.12 or not? If so, what was fixed or changed? In 7.11.1 the customer already had the settings set to false and it was not working, whereas it worked in 7.10. I would like to know what changed in 7.12 to fix this before I ask the customer to update again.

@ymao1
Contributor

ymao1 commented Mar 25, 2021

In 7.11 (but not 7.10), email connectors for simple mail servers that don't support TLS were broken unless the global config xpack.actions.rejectUnauthorized was set to false. This PR available in 7.12 fixed that issue so you wouldn't need to set that global config to get simple mail servers working.

I guess it is hard to say if this would resolve the issue without knowing what type of email connector they had set up. If the email server did not support TLS, then this bug fix would likely resolve the issue. If it does support TLS, then it might be a different issue. Is that a fair assessment @pmuellr ?
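
The assessment above can be written down as a small decision helper. This is only a sketch of the reasoning in this thread, not Kibana code: the function name is hypothetical, and the affected range (7.11.0 and 7.11.1, fixed in 7.11.2 and 7.12.0) is taken from the comments in this issue.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as "7.11.1" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def known_tls_bug_applies(
    kibana_version: str,
    server_supports_tls: bool,
    reject_unauthorized: bool,
) -> bool:
    """Return True if the known 7.11 email bug would be expected to break the connector.

    Per this thread: servers without TLS support broke in 7.11.0/7.11.1 unless
    xpack.actions.rejectUnauthorized was set to false.
    """
    version = parse_version(kibana_version)
    in_affected_release = (7, 11, 0) <= version < (7, 11, 2)
    return in_affected_release and not server_supports_tls and reject_unauthorized


# Customer case from this issue: 7.11.1, workaround already applied.
print(known_tls_bug_applies("7.11.1", server_supports_tls=False, reject_unauthorized=False))
```

For the customer's setup the helper returns `False`, which mirrors the open question here: with the workaround already in place, the known bug alone would not be expected to explain the failure.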

@predogma

predogma commented Mar 25, 2021

I cross-referenced another case from the same customer; here are examples of the connectors being used.
I will check whether I can get the current one and confirm whether they are still having this issue.

      {
        "_index" : ".kibana_1",
        "_type" : "_doc",
        "_id" : "***:action:caa53c1f-2a03-4e39-935a-be1db517dcee",
        "_score" : 1.0,
        "_source" : {
          "action" : {
            "actionTypeId" : ".email",
            "name" : "dummy",
            "config" : {
              "hasAuth" : true,
              "from" : "*****@****",
              "host" : "*******",
              "port" : 465,
              "service" : null,
              "secure" : null
            },
            "secrets" : "***************"
          },
          "type" : "action",
          "references" : [ ],
          "namespace" : "***",
          "migrationVersion" : {
            "action" : "7.10.0"
          },
          "updated_at" : "2021-01-29T22:09:44.939Z"
        }
      },
....
      {
        "_index" : ".kibana_1",
        "_type" : "_doc",
        "_id" : "***:action:095f345d-44d6-4a2b-820b-07e39d9d73ae",
        "_score" : 1.0,
        "_source" : {
          "action" : {
            "actionTypeId" : ".email",
            "name" : "test",
            "config" : {
              "hasAuth" : false,
              "from" : "******@*****",
              "host" : "********",
              "port" : 25,
              "service" : null,
              "secure" : null
            },
            "secrets" : "***************"
          },
          "type" : "action",
          "references" : [ ],
          "namespace" : "***",
          "migrationVersion" : {
            "action" : "7.10.0"
          },
          "updated_at" : "2021-01-29T22:35:58.987Z"
        }
      },

@predogma

I have a call today at 2 PM EST and will see about getting the current config he has for the connectors in 7.11.1.

@predogma

predogma commented Mar 25, 2021

@ymao1 Yes, it's not really any different:

    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 0.0,
        "hits" : [
          {
            "_index" : ".kibana_1",
            "_type" : "_doc",
            "_id" : "action:3948ca40-8d94-11eb-bbc3-99d2f782c597",
            "_score" : 0.0,
            "_source" : {
              "action" : {
                "actionTypeId" : ".email",
                "name" : "test-email",
                "config" : {
                  "hasAuth" : false,
                  "from" : "******@***",
                  "host" : "*******",
                  "port" : 25,
                  "service" : null,
                  "secure" : null
                },
                "secrets" : "************"
              },
              "type" : "action",
              "references" : [ ],
              "migrationVersion" : {
                "action" : "7.11.0"
              },
              "updated_at" : "2021-03-25T18:02:12.083Z"
            }
          }
        ]
      }
    }

He says the server does not require a secure config.

@pmuellr
Member

pmuellr commented Mar 25, 2021

Here's the release-based summary, AFAIK, looking through issues:

So I'd expect that email connector bug to exist in 7.11.0 and 7.11.1, but fixed in 7.11.2.

However, I think that was all concerning using the mail server directly, and not through a proxy, and it looks like a proxy is being used here for one of the customers. Is the other customer also using a proxy?

In issue #91686 (comment) , I note that setting xpack.actions.rejectUnauthorized to false is a work-around. Looking at the code again, that still seems like the case.

With a proxy in place, we follow a slightly different path, but should still be setting the ultimate rejectUnauthorized based on xpack.actions.proxyRejectUnauthorizedCertificates set to false.

So it's concerning that using those flags doesn't work for the customer in 7.11.1 - it seems like there could be a different bug involving using TLS AND a proxy for the email.

I also realized that we jumped Node.js versions, from Node 10 to Node 14, between 7.10 and 7.11. It's possible there was some change to the underlying TLS support in Node that somehow broke this as well. Seems unlikely, but something to consider if we have to debug the email connector further.

One piece of info that would be good to have is if they have a user/pass set for the email or not. Some of the logic in the email connector is sensitive to that.

@pmuellr
Member

pmuellr commented Mar 25, 2021

Unfortunately they don't seem to have configuration any more, but it was basic.

I'm not sure what this is in reference to, but I'm guessing the email connectors? As in, they deleted the connectors that had been working and then broke, and then created some new ones that have never worked? Or does "configuration" refer to the Kibana config?

I assume a proxy is still in use here? We had a different proxy problem that we also fixed in the same time frame, there's an odd chance that by fixing that, we somehow broke the email connector, but seems unlikely to me.

@predogma

predogma commented Mar 25, 2021

Sorry, they had the email connector working in 7.10.0, but the Slack connector was not working. They updated to 7.11.1, after which Slack worked and email broke. So the configs (rejectUnauthorized, proxyRejectUnauthorizedCertificates) were set in this 7.11.1 environment and email was still broken. The 7.11.0 shown for that email connector in the API call is misleading: I just saw him restart the pod, and it reports 7.11.1 for Kibana.

@predogma

@pmuellr The customer is going to update the deployment to 7.12.0. I will reschedule with them if the issue still exists. Bear with us; it will take them a couple of days.

@pmuellr
Member

pmuellr commented Mar 26, 2021

I assume a proxy is still in use here?

Given proxyRejectUnauthorizedCertificates is mentioned, I assume a proxy IS in use here, but wanted to double-check.

Can we find out if their proxy supports connecting to SMTP servers? Not clear that they all do, but that's the route the code flow is going to take now, because we don't have a way of bypassing the proxy for certain servers, yet. See issue #92949 for more details and a link to a work-in-progress PR for that.

Mentioning this as it's possible in the older code, somehow the email DID (inadvertently) not use the proxy, allowing things to work.
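
To find out whether a proxy will tunnel to an SMTP server, one option is to send it an HTTP `CONNECT` request for the mail host and port and look at the status line it returns. The sketch below assumes a plain `http://` proxy that tunnels via `CONNECT`; an `https://` proxy like the one in the config above would additionally need the socket wrapped in TLS, and whether Kibana's email path uses exactly this mechanism is an assumption here. All function names are hypothetical.

```python
import socket


def build_connect_request(smtp_host: str, smtp_port: int) -> bytes:
    """Build a minimal HTTP/1.1 CONNECT request for the given SMTP host and port."""
    target = f"{smtp_host}:{smtp_port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def connect_allowed(status_line: str) -> bool:
    """Return True if the proxy's status line reports a 2xx response."""
    parts = status_line.strip().split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/") or not parts[1].isdigit():
        return False
    return 200 <= int(parts[1]) < 300


def probe_proxy(
    proxy_host: str,
    proxy_port: int,
    smtp_host: str,
    smtp_port: int,
    timeout: float = 10.0,
) -> bool:
    """Ask the proxy to open a tunnel to the SMTP server and report whether it agreed."""
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(build_connect_request(smtp_host, smtp_port))
        reply = sock.recv(4096).decode("latin-1")
    status_line = reply.split("\r\n", 1)[0]
    return connect_allowed(status_line)


# Example usage (placeholder hosts; replace with the real proxy and mail server):
# print(probe_proxy("proxy.example.com", 8083, "smtp.example.com", 25))
```

A refusal (for example a 403) would suggest the proxy does not permit tunnels to that SMTP port, which would fit the theory that the older code path bypassed the proxy.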

@predogma

Why would that change between 7.10.0 and 7.11.1, though? It worked with the same SMTP servers (two different servers/connectors) prior to updating to 7.11.1.

@gmmorris
Contributor

I'm a little confused by this issue:
Have we identified a concrete problem yet that isn't addressed by #91760?

@predogma

@ymao1 I checked back on the original case. It was closed with no confirmation of whether 7.12.x resolved the issue.

@ymao1
Contributor

ymao1 commented May 24, 2021

@predogma Thank you for checking! I will close this issue for now and reopen if it comes up again.

@ymao1 ymao1 closed this as completed May 24, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022

6 participants