cluster.routing.allocation.awareness.force does not work like promised #104777

Closed
toughcoding opened this issue Jan 25, 2024 · 5 comments · Fixed by #104800
Labels
>bug :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >docs General docs changes Team:Distributed Meta label for distributed team Team:Docs Meta label for docs team

Comments

@toughcoding

toughcoding commented Jan 25, 2024

Elasticsearch Version

8.12.0

Installed Plugins

No response

Java Version

bundled

OS Version

Linux 1d1f1835b648 6.4.16-linuxkit #1 SMP PREEMPT Thu Nov 16 10:49:20 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

Problem Description

According to the docs, if the available nodes do not cover the complete list of values configured for cluster.routing.allocation.awareness.attributes,
then replicas should not be STARTED but should stay in the UNASSIGNED state.

In practice this does not work. It fails not only when a node leaves the cluster (node_left), which I simulate on my blog and which can kill the cluster, but even when not all of the nodes that are meant to be part of the cluster have been started.

Steps to Reproduce

Starting network, volumes and first node

docker network create elknodes
docker volume create --opt type=tmpfs --opt device=tmpfs --opt o=size=2m europe01data
docker volume create --opt type=tmpfs --opt device=tmpfs --opt o=size=2m africa01data

docker run --rm \
--name europe01 \
--net elknodes \
-d \
-e ES_JAVA_OPTS="-Xms2g -Xmx2g" \
-e node.name="europe01" \
-p 9200:9200 \
-e node.attr.continent="europe" \
-v europe01data:/usr/share/elasticsearch/data \
-e cluster.routing.allocation.awareness.attributes=continent \
-e cluster.routing.allocation.awareness.force.continent.values="europe,arctica,africa" \
docker.elastic.co/elasticsearch/elasticsearch:8.12.0

Reset password

docker exec -it europe01 bash -c "(mkfifo pipe1); ( (elasticsearch-reset-password -u elastic -i < pipe1) & ( echo $'y\n123456\n123456' > pipe1) );sleep 5;rm pipe1"

Get enrollment token

token=$(docker exec -it europe01 elasticsearch-create-enrollment-token -s node | tr -d '\r\n')

Start the second node, so that the values europe and africa are in use but arctica is missing

docker run --rm \
-e ENROLLMENT_TOKEN="$token" \
-e node.name="africa01" \
-e node.attr.continent="africa" \
-v africa01data:/usr/share/elasticsearch/data \
-e cluster.routing.allocation.awareness.attributes=continent \
-e cluster.routing.allocation.awareness.force.continent.values="europe,arctica,africa" \
-p 9201:9200 \
--name africa01 \
--net elknodes \
-d \
-m 2GB docker.elastic.co/elasticsearch/elasticsearch:8.12.0

Create index

curl -k -u elastic:123456 -XPUT "https://localhost:9200/customerdata" \
-H 'content-type: application/json' -d'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'

Check shard allocation

curl -k -u elastic:123456 -XGET "https://localhost:9200/_cat/shards?v&s=state:asc&index=customerdata"

will return

index        shard prirep state   docs store dataset ip         node
customerdata 0     r      STARTED    0  227b    227b 172.26.0.3 africa01
customerdata 0     p      STARTED    0  227b    227b 172.26.0.2 europe01

The planned arctica node was never started, yet Elasticsearch has already assigned the replica.

Logs (if relevant)

No response

@toughcoding toughcoding added >bug needs:triage Requires assignment of a team area label labels Jan 25, 2024
@DaveCTurner
Contributor

Thanks very much for your interest in Elasticsearch.

This appears to be a user question, and we'd like to direct these kinds of things to the Elasticsearch forum. If you can stop by there, we'd appreciate it. This allows us to use GitHub for verified bug reports, feature requests, and pull requests. Specifically, the behaviour you describe is expected.

There's an active community in the forum that should be able to help get an answer to your question. As such, I hope you don't mind that I close this.

@DaveCTurner DaveCTurner closed this as not planned Won't fix, can't repro, duplicate, stale Jan 26, 2024
@DaveCTurner
Contributor

DaveCTurner commented Jan 26, 2024

Ah wait I see, the behaviour is as expected but the docs say something else. I'll fix that up in a sec. See #104800
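
A rough sketch of why the allocation in the report is expected. It assumes (this is not stated in the thread) that forced awareness caps the number of copies of a shard per attribute value at the ceiling of copies divided by zones, where the zones include every value listed in cluster.routing.allocation.awareness.force.continent.values even if no node with that value has joined. The function name max_copies_per_zone is hypothetical and used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the per-zone cap is ceil(copies / zones), where
# "zones" counts every forced awareness value, whether or not a node
# with that value is currently in the cluster.

# max_copies_per_zone COPIES ZONES -> prints the assumed cap
# (hypothetical helper, not an Elasticsearch command)
max_copies_per_zone() {
  local copies=$1 zones=$2
  # Integer ceiling division
  echo $(( (copies + zones - 1) / zones ))
}

# Scenario from this issue: 1 primary + 1 replica = 2 copies,
# forced values europe,arctica,africa = 3 zones.
cap=$(max_copies_per_zone 2 3)
echo "cap per zone: $cap"
```

Under that assumption the cap is 1 copy per zone, so the primary on europe01 and the replica on africa01 both fit and are STARTED even though arctica is absent. The replica would only stay UNASSIGNED if placing it meant exceeding the cap, for example with forced values europe,africa and only the europe node running.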

@DaveCTurner DaveCTurner reopened this Jan 26, 2024
@DaveCTurner DaveCTurner added >docs General docs changes :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) and removed needs:triage Requires assignment of a team area label labels Jan 26, 2024
@elasticsearchmachine elasticsearchmachine added Team:Distributed Meta label for distributed team Team:Docs Meta label for docs team labels Jan 26, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Jan 26, 2024
The docs for forced awareness indicate that no replicas will be assigned
until all zones are available, which is definitely undesirable and also
not the actual behaviour. This commit fixes the wording to match what
really happens.

Closes elastic#104777
@toughcoding
Author

Ah wait I see, the behaviour is as expected but the docs say something else. I'll fix that up in a sec.

Thanks for understanding :)
I had previously raised a ticket with the ELK support team, and their answers were based on this documentation, so it's really crucial to have it clarified.

DaveCTurner added a commit that referenced this issue Jan 29, 2024
The docs for forced awareness indicate that no replicas will be assigned
until all zones are available, which is definitely undesirable and also
not the actual behaviour. This commit fixes the wording to match what
really happens.

Closes #104777