cluster.routing.allocation.awareness.force does not work like promised #104777

toughcoding · 2024-01-25T20:59:13Z

Elasticsearch Version

8.12.0

Installed Plugins

No response

Java Version

bundled

OS Version

Linux 1d1f1835b648 6.4.16-linuxkit #1 SMP PREEMPT Thu Nov 16 10:49:20 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

Problem Description

According to the docs, unless nodes are available for ALL of the values that make up the complete list configured for cluster.routing.allocation.awareness.attributes,
replicas should not be STARTED but should stay in the UNASSIGNED state.

In practice this does not work. It fails not only when a node leaves the cluster (node_left), which I simulate on my blog and which can kill the cluster, but also when not all of the nodes meant to be part of the cluster have been started.

Steps to Reproduce

Starting network, volumes and first node

docker network create elknodes
docker volume create --opt type=tmpfs --opt device=tmpfs --opt o=size=2m europe01data
docker volume create --opt type=tmpfs --opt device=tmpfs --opt o=size=2m africa01data


docker run --rm \
--name europe01 \
--net elknodes \
-d \
-e ES_JAVA_OPTS="-Xms2g -Xmx2g" \
-e node.name="europe01" \
-p 9200:9200 \
-e node.attr.continent="europe" \
-v europe01data:/usr/share/elasticsearch/data \
-e cluster.routing.allocation.awareness.attributes=continent \
-e cluster.routing.allocation.awareness.force.continent.values="europe,arctica,africa" \
docker.elastic.co/elasticsearch/elasticsearch:8.12.0

Reset password

docker exec -it europe01 bash -c "(mkfifo pipe1); ( (elasticsearch-reset-password -u elastic -i < pipe1) & ( echo $'y\n123456\n123456' > pipe1) );sleep 5;rm pipe1"

Get enrollment token

token=`docker exec -it europe01 elasticsearch-create-enrollment-token -s node | tr -d '\r\n'`

Start second node, so values europe,africa are used but arctica is missing

docker run --rm \
-e ENROLLMENT_TOKEN="$token" \
-e node.name="africa01" \
-e node.attr.continent="africa" \
-v africa01data:/usr/share/elasticsearch/data \
-e cluster.routing.allocation.awareness.attributes=continent \
-e cluster.routing.allocation.awareness.force.continent.values="europe,arctica,africa" \
-p 9201:9200 \
--name africa01 \
--net elknodes \
-d \
-m 2GB docker.elastic.co/elasticsearch/elasticsearch:8.12.0

Create index

curl -k -u elastic:123456 -XPUT "https://localhost:9200/customerdata" \
-H 'content-type: application/json' -d'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'

Check shard allocation

curl -k -u elastic:123456 -XGET "https://localhost:9200/_cat/shards?v&s=state:asc&index=customerdata"

will return

index        shard prirep state   docs store dataset ip         node
customerdata 0     r      STARTED    0  227b    227b 172.26.0.3 africa01
customerdata 0     p      STARTED    0  227b    227b 172.26.0.2 europe01

The planned arctica node was never started, yet Elasticsearch had already assigned the replica.
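To make the check above repeatable, the `_cat/shards` output can be reduced to a single number. The following is a minimal sketch, not part of the original report: the function name `count_started_replicas` is hypothetical, and it assumes the output was requested with `?v` so that the header row is present.

```shell
# Hypothetical helper: count replica shards in STARTED state.
# Reads `_cat/shards?v` output on stdin; columns are located by header name,
# so the column order does not matter.
count_started_replicas() {
  awk '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i   # map header name -> column index
      next
    }
    $(col["prirep"]) == "r" && $(col["state"]) == "STARTED" { n++ }
    END { print n + 0 }                        # print 0 when nothing matched
  '
}
```

With the cluster from the steps above it could be used as `curl -sk -u elastic:123456 "https://localhost:9200/_cat/shards?v&index=customerdata" | count_started_replicas`. A result of 0 would mean the replica stayed UNASSIGNED; for the output shown above it prints 1.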

Logs (if relevant)

No response


DaveCTurner · 2024-01-26T11:06:52Z

Thanks very much for your interest in Elasticsearch.

This appears to be a user question, and we'd like to direct these kinds of things to the Elasticsearch forum. If you can stop by there, we'd appreciate it. This allows us to use GitHub for verified bug reports, feature requests, and pull requests. Specifically, the behaviour you describe is expected.

There's an active community in the forum that should be able to help get an answer to your question. As such, I hope you don't mind that I close this.

DaveCTurner · 2024-01-26T11:40:46Z

Ah wait I see, the behaviour is as expected but the docs say something else. ~~I'll fix that up in a sec.~~ See #104800

elasticsearchmachine · 2024-01-26T11:41:18Z

Pinging @elastic/es-docs (Team:Docs)

elasticsearchmachine · 2024-01-26T11:41:18Z

Pinging @elastic/es-distributed (Team:Distributed)

The docs for forced awareness indicate that no replicas will be assigned until all zones are available, which is definitely undesirable and also not the actual behaviour. This commit fixes the wording to match what really happens. Closes elastic#104777
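A rough way to see why the allocation reported in this issue is expected: the sketch below assumes that the awareness decider caps the copies of one shard per attribute value at ceil(copies / number of values), and that forced values count toward the number of values even when no node with that value has joined. That formula is an assumption made for illustration and is not stated anywhere in this thread; the function name `max_copies_per_value` is hypothetical.

```shell
# Hypothetical helper: upper bound on copies of one shard per attribute value,
# ASSUMING the limit is ceil(copies / values). Integer arithmetic only.
max_copies_per_value() {
  copies=$1   # primary + replicas
  values=$2   # distinct awareness values, including forced ones
  echo $(( (copies + values - 1) / values ))
}

# Scenario from this issue: 1 primary + 1 replica,
# forced values europe,arctica,africa (3 values).
max_copies_per_value 2 3
```

Under that assumption the limit is 1 copy per continent, so one copy on europe01 and one on africa01 violates nothing and both shards start. With `number_of_replicas: 2` the limit would still be 1, and only the third copy would wait for an arctica node.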

toughcoding · 2024-01-26T11:52:26Z

> Ah wait I see, the behaviour is as expected but the docs say something else. I'll fix that up in a sec.

Thanks for understanding :)
I had previously raised a ticket with the ELK support team, and their answers were based on this documentation, so it's really crucial to have it clarified.


toughcoding added the >bug and needs:triage labels on Jan 25, 2024

DaveCTurner closed this as not planned on Jan 26, 2024

DaveCTurner reopened this on Jan 26, 2024

DaveCTurner added the >docs and :Distributed/Allocation labels and removed the needs:triage label on Jan 26, 2024

elasticsearchmachine added the Team:Distributed and Team:Docs labels on Jan 26, 2024

DaveCTurner mentioned this issue Jan 26, 2024

Allocation awareness allocates some replicas #104800

Merged

DaveCTurner closed this as completed in #104800 Jan 29, 2024
