Double-check local checkpoint for staleness #29276

Conversation

DaveCTurner
Contributor

Today, when determining if an operation is stale, we compare the seqno against
the local checkpoint before looking in the version map. However, in between
these two checks the local checkpoint could advance, causing some tombstones to
become stale, and then the stale tombstones could be collected. In this
situation we might incorrectly decide that the operation is fresh and apply it.

To avoid this situation, check the local checkpoint again after calling
getVersionFromMap(). Since it only ever increases, this gives the right result
despite other concurrent activity.
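A minimal, self-contained sketch of the race and the double-check described above (the names and types here are illustrative assumptions, not the actual InternalEngine code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class StalenessCheckSketch {
    private final AtomicLong localCheckpoint = new AtomicLong(-1);          // only ever increases
    private final Map<String, Long> versionMap = new ConcurrentHashMap<>(); // uid -> max seq# seen; tombstones may be pruned

    /** Returns true if the operation identified by uid/seqNo is stale and must not be applied. */
    boolean isStale(String uid, long seqNo) {
        // First check: ops at or below the local checkpoint have already been fully processed.
        if (seqNo <= localCheckpoint.get()) {
            return true;
        }
        // Look the doc up in the version map. Between the check above and this lookup the
        // checkpoint may advance and stale tombstones may be collected, so a missing entry
        // does not by itself prove the op is fresh.
        final Long seenSeqNo = versionMap.get(uid);
        final boolean staleAccordingToMap = seenSeqNo != null && seqNo <= seenSeqNo;
        // Second check: re-read the checkpoint. Since it only ever increases, a stale op
        // cannot slip past both checkpoint checks and the map check.
        return staleAccordingToMap || seqNo <= localCheckpoint.get();
    }
}
```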

@DaveCTurner DaveCTurner added >bug v7.0.0 v6.3.0 :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. labels Mar 28, 2018
@DaveCTurner
Contributor Author

Actually turning this into something that can reliably fail a test case seems tricky. Your thoughts welcome.

@bleskes
Contributor

bleskes commented Mar 28, 2018

Thanks @DaveCTurner. The change is semantically correct, but I prefer we bundle things more. Right now we first compare the seq# to the local checkpoint and then call compareOpToLuceneDocBasedOnSeqNo. Instead I would prefer to move all of it into compareOpToLuceneDocBasedOnSeqNo so it's all in one place (or alternatively move both checks out of the method). This will allow us to make the flow "cleaner":

  1. Preflight against the local checkpoint.
  2. Check the version map and Lucene.
  3. Check again against the local checkpoint (in case it advanced, making step 2 unreliable).

This will bundle the version map with Lucene as a single unit with a requirement to remember everything above the local checkpoint. I think that's cleaner and aligns better with the future direction. WDYT?

Actually turning this into something that can reliably fail a test case seems tricky. Your thoughts welcome.

I was thinking about this this morning. I was hoping that if we add refreshes to testConcurrentOutOfDocsOnReplica and sometimes run it with gc deletes equal to 0, it will reproduce.

DaveCTurner added a commit to elastic/elasticsearch-formal-models that referenced this pull request Mar 28, 2018
This models how indexing and deletion operations are handled on the replica,
including the optimisations for append-only operations and the interaction with
Lucene commits and the version map.

It incorporates

- elastic/elasticsearch#28787
- elastic/elasticsearch#28790
- elastic/elasticsearch#29276
- a proposal to always prune tombstones
@DaveCTurner
Contributor Author

Instead I would prefer to move all of it into compareOpToLuceneDocBasedOnSeqNo

Sure, I like that idea too.

I was thinking about this this morning. I was hoping that if we add refreshes to testConcurrentOutOfDocsOnReplica and sometimes run it with gc deletes equal to 0, it will reproduce.

🤞

I was wondering about adding some kind of

assert randomDelay();

at strategic points in order to expose rare races more frequently.
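For illustration, a minimal sketch of what such a helper could look like (the name randomDelay comes from the comment above; the delay probability and bounds are assumptions):

```java
import java.util.concurrent.ThreadLocalRandom;

final class Races {
    /**
     * Intended to be called as "assert randomDelay();" so the call is elided entirely
     * when assertions are disabled. With -ea, it occasionally pauses the thread to
     * widen rare race windows.
     */
    static boolean randomDelay() {
        if (ThreadLocalRandom.current().nextInt(10) == 0) { // pause roughly 1 in 10 calls
            try {
                Thread.sleep(ThreadLocalRandom.current().nextLong(1, 20)); // 1-19 ms
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return true; // always true, so the assertion itself never fails
    }
}
```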

@hub-cap hub-cap added :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. labels Mar 28, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@DaveCTurner
Contributor Author

@bleskes I looked at testConcurrentOutOfDocsOnReplica and it already performs refreshes (every 4 operations) but does not simulate GC deletes as per assertOpsOnPrimary. I thought it'd be useful to wrap up each operation along with the things to do after applying it (refresh/flush/GC) as per 5d46eee, and then make the tests more consistent about applying these things, but didn't want to go too far down that path without your thoughts. WDYT?
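As a rough illustration of that wrapping idea (all names here are hypothetical and not taken from 5d46eee):

```java
import java.util.List;

/** One test operation bundled with the follow-up actions to run after applying it. */
record OperationWithFollowUps(CheckedAction apply, List<CheckedAction> followUps) {
    void execute() throws Exception {
        apply.run();
        for (CheckedAction followUp : followUps) { // e.g. refresh, flush, prune GC deletes
            followUp.run();
        }
    }
}

interface CheckedAction {
    void run() throws Exception;
}
```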

@bleskes
Contributor

bleskes commented Apr 30, 2018

@DaveCTurner shall we close this in favour of #30121 ?

@DaveCTurner
Contributor Author

Yes, let's.

@DaveCTurner DaveCTurner deleted the 2018-03-28-double-check-local-checkpoint branch July 23, 2022 10:44
Labels
>bug :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. >non-issue v6.4.0 v7.0.0-beta1
6 participants