[Elasticsearch] Add dimensions fields for TSDB migration #6623
Conversation
Signed-off-by: constanca-m <constanca.manteigas@elastic.co>
LGTM!
All data_streams are missing fields that were defined in this convention: #5193 (comment)

CCR:
- doesn't it need an elasticsearch.cluster.id as a dimension?

enrich:
- doesn't it need an elasticsearch.cluster.id as a dimension? node.id could be enough, but elasticsearch.cluster.id ensures uniqueness if we have multiple clusters

index:
- I think elasticsearch.cluster.id is missing; there could be a case of the same index name in multiple clusters
- the same for index_recovery; I think it applies to all data_streams

ingest_pipeline:
- do you think node.id needs to be added, or is the pipeline name/id the same on all nodes?

ml:
- should node.id be added?
"node": { | |
"id": "2eRkSFTXSLie_seiHf4Y1A", | |
"name": "efacd89a6e88" |
@@ -89,6 +89,7 @@
      type: keyword
    - name: cluster.name
      type: keyword
+     dimension: true
wondering if cluster.id wouldn't be a better candidate for the dimension?
I did not migrate this one, it is still pending (description).
> I did not migrate this one, it is still pending (description).

but are you planning to add, in this PR, dimension fields for the data_streams that are blocked by the issues mentioned in the description? Or do you plan to move those data_streams to another PR?
They will be moved to another PR. I will remove this dimension, but I will leave the ECS ones, just to avoid confusion later.
I removed the one for cluster stats (as I believe it is also not necessary). I am leaving the enrich dimensions though, even though enrich is not migrated - I will validate it again when the issue is resolved.
processors:
  - fingerprint:
      fields:
        - elasticsearch.ingest_pipeline.name
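As a rough illustration of why the fingerprint helps here: it turns an arbitrary wildcard-typed pipeline name into a fixed-length, keyword-safe string. This is only a sketch, assuming a SHA-1 digest of the field value, base64-encoded (the observed `name_fingerprint` values are 28-character base64 strings, which is consistent with SHA-1, but the processor's actual hash method and input encoding are configuration-dependent):

```python
import base64
import hashlib


def fingerprint(value: str) -> str:
    """Illustrative fingerprint: SHA-1 of the field value, base64-encoded.

    Assumption: the real processor's method/encoding may differ; this only
    shows how a variable-length name maps to a fixed-length keyword value
    that is safe to use as a dimension.
    """
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# A pipeline name of type wildcard cannot be a dimension, but its
# fixed-length fingerprint can be indexed as a keyword dimension.
fp = fingerprint("metrics-elasticsearch.stack_monitoring.cluster_stats-1.7.4")
print(fp)  # a 28-character base64 string
```

The key property is that equal names always produce the same fingerprint, so the fingerprint is a stable substitute dimension for the name.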
Is type: wildcard the same case as type: object?

fields:
  - name: name
    type: wildcard
    description: Name / id of the ingest pipeline

can you please share a sample of it?
> can you please share a sample of it?

Sorry, I don't understand. A sample of the error?
> Sorry, I don't understand. A sample of the error?

A sample of the document - the part of the document that includes this field. The sample_event for this data_stream is missing, so I can't check it there.
Here is a sample document
{
"_index": ".ds-metrics-elasticsearch.ingest_pipeline-default-2023.06.23-000001",
"_id": "5p2D54gBH7q8D4JF6839",
"_version": 1,
"_score": 0,
"_source": {
"agent": {
"name": "kind-control-plane",
"id": "a781ce37-a210-49d3-8344-6518fb35d4ac",
"type": "metricbeat",
"ephemeral_id": "55936ecf-0dfb-474d-9031-284992efdf8a",
"version": "8.8.0"
},
"@timestamp": "2023-06-23T09:09:21.280Z",
"elasticsearch": {
"node": {
"roles": [
"data_content",
"data_hot",
"ingest",
"master",
"remote_cluster_client",
"transform"
],
"name": "instance-0000000000",
"id": "J_W-dXFXTxuXnGCwbCb6Iw"
},
"cluster": {
"name": "985f2ca8e1a74327aa2c698275330b90",
"id": "SyM7nU1DRmKd3soposFsXg"
},
"ingest_pipeline": {
"total": {
"count": 428,
"failed": 0,
"time": {
"total": {
"ms": 0
},
"self": {
"ms": 0
}
}
},
"name": "metrics-elasticsearch.stack_monitoring.cluster_stats-1.7.4",
"name_fingerprint": "LX8WOW8tc72gcK7v5HOrWtDf6v4="
}
},
"ecs": {
"version": "8.0.0"
},
"data_stream": {
"namespace": "default",
"type": "metrics",
"dataset": "elasticsearch.ingest_pipeline"
},
"service": {
"address": "https://test-es-3.es.us-central1.gcp.cloud.es.io:9243",
"type": "elasticsearch"
},
"elastic_agent": {
"id": "a781ce37-a210-49d3-8344-6518fb35d4ac",
"version": "8.8.0",
"snapshot": true
},
"host": {
"hostname": "kind-control-plane",
"os": {
"kernel": "5.15.49-linuxkit",
"codename": "focal",
"name": "Ubuntu",
"type": "linux",
"family": "debian",
"version": "20.04.6 LTS (Focal Fossa)",
"platform": "ubuntu"
},
"containerized": false,
"ip": [
"10.244.0.1",
"10.244.0.1",
"10.244.0.1",
"172.18.0.2",
"fc00:f853:ccd:e793::2",
"fe80::42:acff:fe12:2",
"172.25.0.4"
],
"name": "kind-control-plane",
"id": "e12fa0193ee24a5cae5f9665f6e7eb8c",
"mac": [
"02-42-AC-12-00-02",
"02-42-AC-19-00-04",
"22-DE-5A-26-82-AC",
"3A-AE-FC-E1-7E-8C",
"7E-91-38-58-97-2B"
],
"architecture": "x86_64"
},
"metricset": {
"period": 10000,
"name": "ingest_pipeline"
},
"event": {
"duration": 275991722,
"agent_id_status": "verified",
"ingested": "2023-06-23T09:09:22Z",
"module": "elasticsearch",
"dataset": "elasticsearch.ingest_pipeline"
}
},
"fields": {
"elastic_agent.version": [
"8.8.0"
],
"elasticsearch.ingest_pipeline.name_fingerprint": [
"LX8WOW8tc72gcK7v5HOrWtDf6v4="
],
"host.hostname": [
"kind-control-plane"
],
"host.mac": [
"02-42-AC-12-00-02",
"02-42-AC-19-00-04",
"22-DE-5A-26-82-AC",
"3A-AE-FC-E1-7E-8C",
"7E-91-38-58-97-2B"
],
"service.type": [
"elasticsearch"
],
"host.ip": [
"10.244.0.1",
"10.244.0.1",
"10.244.0.1",
"172.18.0.2",
"fc00:f853:ccd:e793::2",
"fe80::42:acff:fe12:2",
"172.25.0.4"
],
"agent.type": [
"metricbeat"
],
"event.module": [
"elasticsearch"
],
"host.os.version": [
"20.04.6 LTS (Focal Fossa)"
],
"elasticsearch.ingest_pipeline.total.time.total.ms": [
0
],
"host.os.kernel": [
"5.15.49-linuxkit"
],
"host.os.name": [
"Ubuntu"
],
"agent.name": [
"kind-control-plane"
],
"host.name": [
"kind-control-plane"
],
"elastic_agent.snapshot": [
true
],
"event.agent_id_status": [
"verified"
],
"host.id": [
"e12fa0193ee24a5cae5f9665f6e7eb8c"
],
"elasticsearch.node.roles": [
"data_content",
"data_hot",
"ingest",
"master",
"remote_cluster_client",
"transform"
],
"elasticsearch.node.id": [
"J_W-dXFXTxuXnGCwbCb6Iw"
],
"elasticsearch.cluster.name": [
"985f2ca8e1a74327aa2c698275330b90"
],
"elasticsearch.ingest_pipeline.total.failed": [
0
],
"host.os.type": [
"linux"
],
"elastic_agent.id": [
"a781ce37-a210-49d3-8344-6518fb35d4ac"
],
"data_stream.namespace": [
"default"
],
"elasticsearch.ingest_pipeline.total.time.self.ms": [
0
],
"metricset.period": [
10000
],
"host.os.codename": [
"focal"
],
"elasticsearch.ingest_pipeline.name": [
"metrics-elasticsearch.stack_monitoring.cluster_stats-1.7.4"
],
"data_stream.type": [
"metrics"
],
"event.duration": [
275991722
],
"elasticsearch.cluster.id": [
"SyM7nU1DRmKd3soposFsXg"
],
"host.architecture": [
"x86_64"
],
"metricset.name": [
"ingest_pipeline"
],
"event.ingested": [
"2023-06-23T09:09:22.000Z"
],
"@timestamp": [
"2023-06-23T09:09:21.280Z"
],
"elasticsearch.node.name": [
"instance-0000000000"
],
"agent.id": [
"a781ce37-a210-49d3-8344-6518fb35d4ac"
],
"elasticsearch.ingest_pipeline.total.count": [
428
],
"ecs.version": [
"8.0.0"
],
"host.os.platform": [
"ubuntu"
],
"host.containerized": [
false
],
"service.address": [
"https://test-es-3.es.us-central1.gcp.cloud.es.io:9243"
],
"data_stream.dataset": [
"elasticsearch.ingest_pipeline"
],
"agent.ephemeral_id": [
"55936ecf-0dfb-474d-9031-284992efdf8a"
],
"agent.version": [
"8.8.0"
],
"host.os.family": [
"debian"
],
"event.dataset": [
"elasticsearch.ingest_pipeline"
]
}
}
Looking at this sample, "name": "metrics-elasticsearch.stack_monitoring.cluster_stats-1.7.4" seems to be a keyword. And it seems that wildcard belongs to the keyword family - https://www.elastic.co/guide/en/elasticsearch/reference/7.17/keyword.html#keyword

Does adding a dimension on the name field fail?
Yes, wildcard is not valid as a dimension.
@@ -37,6 +37,7 @@
      Node ID
    - name: name
      type: keyword
+     dimension: true
wouldn't node.id be a better candidate? name might not be unique across multiple clusters
Each cluster only has one service.address, so the combination service.address + node.name should be unique.
service.address is a host address defined in the configuration, so it could be for example localhost:9200 if the agent is running on the same instance as Elasticsearch - that is not unique enough.

node.name, from my understanding, is a hostname, isn't it? So it can be the same for multiple clusters.
localhost:9200 is a default value
For the integration to work you need to provide the service.address. This way, if you give the same ES integration the same service.address, you will be receiving metrics from the same clusters as before. I tested this with a local cluster and one on the cloud.
> node.name from my understanding it is a hostname, isnt it? so it can be the same for multiple clusters

The service.address uniquely identifies a cluster for an ES integration, and since node.name is unique per cluster, that combination is enough.
> For the integration to work you need to provide the service.address. This way, if you give the same ES integration the same service.address, you will be receiving metrics from the same clusters as before.

why? If I set service.address as localhost:9200, install the agent on different nodes, and use the same policy for those nodes, I will get correct data.

> The service.address uniquely identifies a cluster for an ES integration, and since node.name is unique per cluster, that combination is enough.

but there can be the same node.name for 2 different clusters. It is not unique.

Example: I have 2 different instances, es-test and es-test2, in the same GCP account (this is just for the test; more realistic: instances with the same name in different accounts or in different cloud providers; just for the test I've changed the hostname of es-test2 to es-test). service.address is the same for both nodes, and node.name as well. Since I did not change the default value, cluster.name is the same as well.
I am a bit confused.

To install the integration in some policy you need to set the service.address. This way, the service.address is unique. You cannot connect to two different clusters using the same service.address, so the service.address uniquely identifies a cluster.

> If I set service.address as localhost:9200, install agent on different nodes and use the same policy for those node, I will get correct data

So install two different agents? The agent.id is a dimension, so there is no overlap. If the service.address for the ES is different, there is also no overlap. Otherwise, there is, as there should be.

> but there can be the same node.name for 2 different clusters. It is not unique

We always have a value for service.address. The node.name is unique per cluster, so service.address + node.name is unique.

I tested it by adding to the policy:
- 1 local elastic agent
- 1 cluster with 3 nodes
- another cluster with 3 nodes (this one so I could update the version)

I didn't get any overlap.
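The disagreement above comes down to how the time-series identity is built: two documents collide only if every dimension value matches. A minimal sketch with hypothetical values (field names from this integration, addresses invented for illustration):

```python
# Hypothetical documents: the tuple of dimension values
# (service.address, node.name, agent.id) identifies one time series.
DIMENSIONS = ("service.address", "node.name", "agent.id")

docs = [
    {"service.address": "https://cluster-a:9243", "node.name": "node-0", "agent.id": "agent-1"},
    {"service.address": "https://cluster-b:9243", "node.name": "node-0", "agent.id": "agent-1"},
]


def series_key(doc):
    # The combination of all dimension fields is the series identity.
    return tuple(doc[f] for f in DIMENSIONS)


keys = {series_key(d) for d in docs}
# Same node.name on two clusters, but distinct service.address values keep
# the series distinct. If both clusters were scraped via a default like
# localhost:9200 by agents sharing an agent.id, the keys would collide.
print(len(keys))  # 2
```

This is the crux of the review thread: the combination is unique only under the assumption that no two monitored clusters share the same service.address and agent.id.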
@@ -316,6 +316,7 @@
      Node ID
    - name: name
      type: keyword
+     dimension: true
the same as for node

I did not migrate these data streams. There are issues still pending (more in the description) @tetianakravchenko

Both the id for an ML job and the name for an ingest pipeline are unique per cluster, so setting them as a dimension is enough.
Any specific reason why all common dimensions (8 of them) are not included?

Are you talking about the ECS dimensions? It is in the description.

I've added this comment as well - #6623 (review)
LGTM! However, my recommendation is to have all ECS common dimensions included to avoid the risk.
Package elasticsearch - 1.8.0 containing this change is available at https://epr.elastic.co/search?package=elasticsearch
What does this PR do?

Set some fields as dimensions so the data streams can be migrated to TSDB in the future.

This PR sets dimensions on all metrics data streams, except for CCR, Cluster Stats, Enrich, Shard and Pending Tasks. Read the next section for more details.

Details

To enable TSDB we need some fields set as dimensions. The combination of the dimension fields + the timestamp must be unique; otherwise, documents will be overwritten. Check this for more information.
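The overwrite behavior can be sketched as follows. This is a minimal simulation of the semantics, not the actual TSDB implementation; the field names come from this integration, the values are invented:

```python
# Minimal sketch of TSDB overwrite semantics (assumed simplification):
# a document is keyed by (dimension values, timestamp); a later document
# with the same key replaces the earlier one instead of coexisting.
index = {}


def ingest(doc, dimensions):
    key = (tuple(doc[d] for d in dimensions), doc["@timestamp"])
    index[key] = doc  # same key -> silent overwrite


dims = ("service.address", "agent.id")
ingest({"service.address": "localhost:9200", "agent.id": "a1",
        "@timestamp": "2023-06-23T09:09:21Z", "value": 1}, dims)
ingest({"service.address": "localhost:9200", "agent.id": "a1",
        "@timestamp": "2023-06-23T09:09:21Z", "value": 2}, dims)

print(len(index))  # 1 - the second document overwrote the first
```

This is why the dimension set has to be rich enough that two distinct metric sources never produce the same dimension values at the same timestamp.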
The same set of ECS fields were set as dimensions: service.address, host.name, agent.id. This decision was based on #5193 (comment). None of the cloud fields are considered needed, since the service address needs to be set for the integration to work. If two different integrations are used for the same service address, then documents can be overwritten, since they will have the exact same values.

- nested: issue [Elasticsearch][CCR] Change type nested to object #6604.
- text: issue [TSDB] TSDB enablement fails when there is a field of type text elasticsearch#96254.
- text: issue [TSDB] TSDB enablement fails when there is a field of type text elasticsearch#96254.
- elasticsearch.ingest_pipeline.name_fingerprint: the name of a pipeline is of type wildcard. Since that type does not qualify for a dimension, we need to create a fingerprint for it.
- elasticsearch.ingest_pipeline.processor.order_index: added to distinguish between the processors and the pipeline itself. The processors should have that field set, and the pipeline should not.
- elasticsearch.index.name: unique per cluster; should be enough.
- service.address: only sends one document per timestamp, so no more fields should need to be set as dimensions.
- elasticsearch.index.name: since it is unique per cluster.
- elasticsearch.index.recovery.id: since an index can be distributed over more than one shard.
- elasticsearch.node.name: unique per cluster.
- elasticsearch.node.name: unique per cluster.
- elasticsearch.ml.job.id: unique.

Checklist

- I have added an entry to the changelog.yml file.

Related issues

Relates to #6618.