[TSDB] k8 package upgrade failure from 8.6.2 to 8.8.1 #6554

Closed
constanca-m opened this issue Jun 13, 2023 · 19 comments

@constanca-m
Contributor

constanca-m commented Jun 13, 2023

This issue is for the problem exposed by elastic/elasticsearch#96254 (comment).

Summary:

  • ES 8.6.2, K8s package 1.32.2
  • Upgrade ES to 8.7.1
  • Upgrading ES to 8.8.1 and updating the K8s package to 1.41.0 caused this error:
"statusCode": 500,
"error": "Internal Server Error",
"message": "illegal_argument_exception\n\tCaused by:\n\t\tillegal_argument_exception: invalid composite mappings for [metrics-kubernetes.state_daemonset]\n\tRoot causes:\n\t\tillegal_argument_exception: composable template [metrics-kubernetes.state_daemonset] template after composition with component templates [metrics-kubernetes.state_daemonset@package, metrics-kubernetes.state_daemonset@custom, .fleet_globals-1, .fleet_agent_id_verification-1] is invalid"

Deleting all Kubernetes data in Index Management does not seem to fix the problem.


@constanca-m @lalit-satapathy

Good to know I shouldn't be encountering this error. I am using an Elastic cluster via Elastic Cloud.

How did you install the k8 package? Did you upgrade the stack? Did you upgrade the package version?
Originally I had it installed under ES 8.6.2 and the k8s package was 1.32.2. On 06/05 I upgraded the cluster first to 8.7.1 and then to 8.8.0 and tried to update the k8s package to 1.41.0. The error I observed from the call via browser dev tools was this:
{"statusCode":500,"error":"Internal Server Error","message":"illegal_argument_exception\n\tCaused by:\n\t\tillegal_argument_exception: invalid composite mappings for [metrics-kubernetes.pod]\n\tRoot causes:\n\t\tillegal_argument_exception: composable template [metrics-kubernetes.pod] template after composition with component templates [metrics-kubernetes.pod@package, metrics-kubernetes.pod@custom, .fleet_globals-1, .fleet_agent_id_verification-1] is invalid"}

I opened a support ticket and was advised to remove mappings and templates. I uninstalled the k8s package and tried to install it fresh as 1.41.0, but it continues to run into this error. The data stream whose mappings it fails on changes from attempt to attempt, but the general error is always the same.

I've gone through and verified there are no component templates, data streams, indices or index templates with kubernetes in the name. In the logs when I try to install it, I can see it add some of the templates and then remove them once it fails.
I've also updated to 8.8.1 but get the same error.

Which version of the package are you getting the error with?
1.41.0 on ES 8.8.1

Please provide the specific error you are getting.
I get this error if I click either to add Kubernetes and then choose an agent policy, or if I click to install the assets for the integration.
{
"statusCode": 500,
"error": "Internal Server Error",
"message": "illegal_argument_exception\n\tCaused by:\n\t\tillegal_argument_exception: invalid composite mappings for [metrics-kubernetes.state_daemonset]\n\tRoot causes:\n\t\tillegal_argument_exception: composable template [metrics-kubernetes.state_daemonset] template after composition with component templates [metrics-kubernetes.state_daemonset@package, metrics-kubernetes.state_daemonset@custom, .fleet_globals-1, .fleet_agent_id_verification-1] is invalid"
}

You'll see this one fails on metrics-kubernetes.state_daemonset, but if I click the same button to install the assets again it fails on metrics-kubernetes.state_deployment, so it's seemingly random which one it fails on.
{
"statusCode": 500,
"error": "Internal Server Error",
"message": "illegal_argument_exception\n\tCaused by:\n\t\tillegal_argument_exception: invalid composite mappings for [metrics-kubernetes.state_deployment]\n\tRoot causes:\n\t\tillegal_argument_exception: composable template [metrics-kubernetes.state_deployment] template after composition with component templates [metrics-kubernetes.state_deployment@package, metrics-kubernetes.state_deployment@custom, .fleet_globals-1, .fleet_agent_id_verification-1] is invalid"
}

I also attached the ES log from when I tried to add it, but nothing in it stood out to me.
k8saddassets.txt

@lincolncoe-vca That should not happen, since TSDB is not enabled on the events data stream. Are you enabling it yourself?

I did not know what TSDB was until I came across this thread so I believe the answer is no but I am happy to check if you let me know how I can do so.

@lalit-satapathy changed the title from "[Kubernetes] Illegal argument exception" to "[TSDB] k8 package upgrade failure from 8.6.2 to 8.8.1" on Jun 13, 2023
@lincolncoe-vca

Thanks @constanca-m for separating. Is there anything else I can provide to help narrow this down?

@lalit-satapathy
Collaborator

lalit-satapathy commented Jun 13, 2023

Unfortunately doing a whole new cluster won't work, we have a lot of other data on this one.
Unfortunately I also have nothing under indices, data streams, index templates or component templates related to kubernetes but it still fails to add the integration.

@lincolncoe-vca,

We need to confirm whether the issue occurs only because of the series of upgrade steps used here (ES 8.6.2 to 8.7.1 to 8.8.0). Can you re-confirm that on a fresh installation of 8.8.0 with the k8 package, no such issue occurs on your side?

@lincolncoe-vca

@lalit-satapathy I can try later today, but if it worked for you all I would think that is probably fine.

I think the bigger question for me is if it's ES 8.6.2 to 8.7.1 to 8.8.0, or if it's related to the k8 package update from 1.32.2 to 1.41.0.

@constanca-m
Contributor Author

I did the following:

  • Installed K8s 1.32.2 on 8.6.2
  • Upgraded to 8.7.1
  • Upgraded to 8.8.0
  • Upgraded K8s to 1.42.0

Still could not reproduce the error.

Then I uninstalled Kubernetes 1.42.0 and installed 1.41.0. Still nothing.

@lincolncoe-vca

@constanca-m Hmm, is there anything I can run to try to get more details on which component field it's failing on?

@constanca-m
Contributor Author

I am out of ideas for now @lincolncoe-vca. Did you try to install Kubernetes 1.42.0 (latest version) after you deleted everything?

@lincolncoe-vca

@constanca-m Just tried 1.42.0, same thing
{
"statusCode": 500,
"error": "Internal Server Error",
"message": "illegal_argument_exception\n\tCaused by:\n\t\tillegal_argument_exception: invalid composite mappings for [metrics-kubernetes.state_resourcequota]\n\tRoot causes:\n\t\tillegal_argument_exception: composable template [metrics-kubernetes.state_resourcequota] template after composition with component templates [metrics-kubernetes.state_resourcequota@package, metrics-kubernetes.state_resourcequota@custom, .fleet_globals-1, .fleet_agent_id_verification-1] is invalid"
}
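
Since the failing data stream changes between attempts, one way to narrow things down is to compare the component template lists across the error messages and see which templates are named every time. The following is a minimal sketch, assuming the message format quoted in this thread; the helper names (`parse_composition_error`, `shared_components`, `sample_message`) are hypothetical and the messages are shortened to their final sentence.

```python
import re

# Pattern for the "is invalid" sentence in the error messages quoted in this thread.
COMPOSITION_ERROR = re.compile(
    r"composable template \[(?P<template>[^\]]+)\] template after composition "
    r"with component templates \[(?P<components>[^\]]+)\] is invalid"
)


def parse_composition_error(message):
    """Return (index_template, component_templates), or None if the text does not match."""
    match = COMPOSITION_ERROR.search(message)
    if match is None:
        return None
    components = [name.strip() for name in match.group("components").split(",")]
    return match.group("template"), components


def shared_components(messages):
    """Return the component templates named in every message that parses."""
    parsed = [p for p in (parse_composition_error(m) for m in messages) if p is not None]
    if not parsed:
        return set()
    common = set(parsed[0][1])
    for _, components in parsed[1:]:
        common &= set(components)
    return common


def sample_message(data_stream):
    """Rebuild the tail of the error text for one data stream (shortened for the example)."""
    return (
        f"illegal_argument_exception: composable template [{data_stream}] template after "
        f"composition with component templates [{data_stream}@package, {data_stream}@custom, "
        ".fleet_globals-1, .fleet_agent_id_verification-1] is invalid"
    )


messages = [
    sample_message("metrics-kubernetes.state_daemonset"),
    sample_message("metrics-kubernetes.state_deployment"),
    sample_message("metrics-kubernetes.state_resourcequota"),
]
print(sorted(shared_components(messages)))
```

Only the two Fleet-managed component templates appear in all three failures, which suggests looking at those rather than at any single data stream.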

@tetianakravchenko
Contributor

Hey @lincolncoe-vca, could you please provide some details on your setup? What is your k8s environment (GKE, EKS, etc.)? Do you have any custom integration configuration (like processors), or do you use the default configuration? If you have the index mapping of any of the now-failing data streams from before the package was upgraded to 1.41.0, could you provide it as well?

I opened a support ticket and was advised to remove mappings and templates. I uninstalled the k8s package and tried to install it fresh as 1.41.0, but it continues to run into this error. The data stream whose mappings it fails on changes from attempt to attempt, but the general error is always the same.

Do I understand correctly that you see this error at package installation time? Or does it fail after running for some time?

@lincolncoe-vca

@tetianakravchenko It's when trying to install the integration in Fleet, not even down to the k8s integration part.

None of the datastreams persist, it removes them all after it fails. These are the 2 persisting component templates it's failing on -

.fleet_agent_id_verification-1
{ "_routing": { "required": false }, "numeric_detection": false, "dynamic_date_formats": [ "strict_date_optional_time", "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z" ], "dynamic": true, "_source": { "excludes": [], "includes": [], "enabled": true }, "date_detection": true, "properties": { "event": { "type": "object", "properties": { "agent_id_status": { "ignore_above": 1024, "type": "keyword" }, "ingested": { "format": "strict_date_time_no_millis||strict_date_optional_time||epoch_millis", "type": "date" } } } } }

.fleet_globals-1
{ "_routing": { "required": false }, "numeric_detection": false, "_meta": { "managed_by": "fleet", "managed": true }, "dynamic": true, "_source": { "excludes": [], "includes": [], "enabled": true }, "dynamic_templates": [ { "strings_as_keyword": { "mapping": { "ignore_above": 1024, "type": "keyword" }, "match_mapping_type": "string" } } ], "date_detection": false }

@lalit-satapathy
Collaborator

@lincolncoe-vca,

On 06/05 I upgraded the cluster first to 8.7.1 and then to 8.8.0 and tried to update the k8s package to 1.41.0

Was the issue first seen during an upgrade only? On which date was the 8.8.0 upgrade first run?

@lincolncoe-vca

@lalit-satapathy It was after I updated to 8.8.0 that I tried to update the k8s package from 1.32.2 to 1.41.0. Unfortunately, I do not know whether updating the k8s package from 1.32.2 to 1.41.0 while on ES 8.6.2 would have caused the issue.

06/05 is when the upgrade was run the first time.

@tetianakravchenko
Contributor

tetianakravchenko commented Jun 14, 2023

To me, the .fleet_agent_id_verification-1 and .fleet_globals-1 templates shared in #6554 (comment) look correct. @juliaElastic could you please double check the comment above, #6554 (comment)?

@juliaElastic do you also have any idea of changes on the Fleet side that could cause such an issue after upgrading a stack from 8.6.2 -> 8.7.1 and then to 8.8.0? As mentioned in the comment, "It's when trying to install the integration in Fleet, not even down to the k8s integration part", so it shouldn't be related to the environment.

@lincolncoe-vca could you please try to install version 1.39.1 and check if it works? It is the latest version before the changes that I think could have caused this error.

@juliaElastic
Contributor

juliaElastic commented Jun 14, 2023

I could reproduce this by installing the latest kubernetes package (1.42.0) and then trying to update the .fleet_globals-1 or the .fleet_agent_id_verification-1 component template with the content provided above.
The root cause is visible if we add the ?error_trace=true parameter to the ES update.

PUT _component_template/.fleet_globals-1?error_trace=true
{
  "template": {
    "mappings":  { "_routing": { "required": false }, "numeric_detection": false, "_meta": { "managed_by": "fleet", "managed": true }, "dynamic": true, "_source": { "excludes": [], "includes": [], "enabled": true }, "dynamic_templates": [ { "strings_as_keyword": { "mapping": { "ignore_above": 1024, "type": "keyword" }, "match_mapping_type": "string" } } ], "date_detection": false }}
}

Caused by: org.elasticsearch.index.mapper.MapperParsingException: Failed to parse mapping: Cannot set both [mode] and [enabled] parameters
		at org.elasticsearch.server@8.9.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.parseMapping(MapperService.java:385)
		at org.elasticsearch.server@8.9.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:361)
		at org.elasticsearch.server@8.9.0-SNAPSHOT/org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.lambda$validateCompositeTemplate$28(MetadataIndexTemplateService.java:1593)
		... 22 more

So the issue is that the kubernetes package defines _source.mode: synthetic while the fleet component templates define _source.enabled: true, and Elasticsearch does not allow both to be set.
This could be fixed by removing the enabled: true from the fleet templates.
Do we know if the enabled config was added by the client? It doesn't seem to be there on a fresh cluster.
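
The conflict described above can be illustrated with a small offline check over the component template mappings. This is a simplified sketch, not Elasticsearch's actual merge or validation logic: it only reports which components set `_source.mode` and which set `_source.enabled`. The function name `find_source_conflict` is hypothetical, and the package mapping is abbreviated to the one setting mentioned in this comment.

```python
def find_source_conflict(components):
    """Report components that set `_source.mode` alongside others that set `_source.enabled`.

    `components` maps a component template name to its `mappings` dict.
    Returns None when the two parameters are not both present.
    """
    mode_setters = [
        name for name, mappings in components.items()
        if "mode" in mappings.get("_source", {})
    ]
    enabled_setters = [
        name for name, mappings in components.items()
        if "enabled" in mappings.get("_source", {})
    ]
    if mode_setters and enabled_setters:
        return {"mode": mode_setters, "enabled": enabled_setters}
    return None


components = {
    # Abbreviated: only the setting named in this comment is shown.
    "metrics-kubernetes.pod@package": {"_source": {"mode": "synthetic"}},
    # `_source` section as posted earlier in this thread.
    ".fleet_globals-1": {"_source": {"excludes": [], "includes": [], "enabled": True}},
    ".fleet_agent_id_verification-1": {"_source": {"excludes": [], "includes": [], "enabled": True}},
}

print(find_source_conflict(components))
```

A result of None would mean this particular conflict is absent; it would not rule out other mapping problems.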

@tetianakravchenko
Contributor

tetianakravchenko commented Jun 14, 2023

@lincolncoe-vca could you please try to update the .fleet_agent_id_verification-1 and .fleet_globals-1 templates, either via the UI or using https://www.elastic.co/guide/en/elasticsearch/reference/8.8/indices-component-template.html, with the same content as before but without _source.enabled: true?

Have you updated .fleet_agent_id_verification-1 and .fleet_globals-1 manually before?
It seems that synthetic source is enabled in your environment. How was it enabled?
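
As a rough sketch of the suggested edit, the snippet below removes `_source.enabled` from a mappings dict and prints a request in the same console form used earlier in this thread. The helper names (`strip_source_enabled`, `component_template_request`) are hypothetical, and nothing is sent to a cluster. It assumes the component template contains only a `mappings` section; any other sections would need to be included in the body as well.

```python
import copy
import json


def strip_source_enabled(mappings):
    """Return a copy of `mappings` with `_source.enabled` removed, leaving the input untouched."""
    cleaned = copy.deepcopy(mappings)
    source = cleaned.get("_source")
    if isinstance(source, dict):
        source.pop("enabled", None)
    return cleaned


def component_template_request(name, mappings):
    """Build (method, path, body) for a component template update with error_trace enabled."""
    path = f"_component_template/{name}?error_trace=true"
    body = {"template": {"mappings": strip_source_enabled(mappings)}}
    return "PUT", path, json.dumps(body, indent=2)


# Mappings for .fleet_globals-1 as posted earlier in this thread.
fleet_globals = {
    "_routing": {"required": False},
    "numeric_detection": False,
    "_meta": {"managed_by": "fleet", "managed": True},
    "dynamic": True,
    "_source": {"excludes": [], "includes": [], "enabled": True},
    "dynamic_templates": [
        {
            "strings_as_keyword": {
                "mapping": {"ignore_above": 1024, "type": "keyword"},
                "match_mapping_type": "string",
            }
        }
    ],
    "date_detection": False,
}

method, path, body = component_template_request(".fleet_globals-1", fleet_globals)
print(f"{method} {path}")
print(body)
```

The printed request can be pasted into Kibana Dev Tools; keeping `error_trace=true` means any remaining mapping problem is reported with its root cause.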

@lincolncoe-vca

@tetianakravchenko
Success! Removing _source.enabled: true fixed it.

@juliaElastic I don't know how enabled was added. I rarely touch the APIs though; the vast majority of things I manage are via Kibana. I went back and looked at the dev tools history in Kibana and don't see anything that would suggest I touched that previously.

Thanks for all your help on this.

@tetianakravchenko
Contributor

tetianakravchenko commented Jun 14, 2023

@lincolncoe-vca did you maybe have synthetic source enabled for the kubernetes package data_stream in stack version 8.6.2, or enable it some other way?

Screenshot 2023-06-14 at 19 51 32

@lincolncoe-vca

@tetianakravchenko
I don't think so but also happy to look if it's on any others - where do I find that setting?

@tetianakravchenko
Contributor

@lincolncoe-vca
If you upgraded your stack to 8.8.x, it is no longer available. In 8.6.x it was available in the integration package under the Advanced options:
Screenshot 2023-06-14 at 20 15 36

@tetianakravchenko
Contributor

As the reported issue was resolved, I am closing it.
