-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce long polling for changes #33683
Introduce long polling for changes #33683
Conversation
Rather than scheduling pings to the leader index when we are caught up to the leader, this commit introduces long polling for changes. We will fire off a request to the leader which if we are already caught up will enter a poll on the leader side to listen for global checkpoint changes. These polls will timeout after a default of one minute, but can also be specified when creating the following task. We use these time outs as a way to keep statistics up to date, to not exaggerate time since last fetches, and to avoid pipes being broken.
Pinging @elastic/es-distributed |
@elasticmachine run gradle build tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good! I left two questions about ShardChangesAction.TransportAction#asyncShardOperation(...)
.
x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/ShardChangesAction.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/ShardChangesAction.java
Show resolved
Hide resolved
…polling * elastic/master: SQL: Return correct catalog separator in JDBC (elastic#33670) [CCR] Add validation for max_retry_delay (elastic#33648) [CCR] Add monitoring mapping verification test (elastic#33662) CORE: Disable Setting Type Validation (elastic#33660) (elastic#33669) Revert "Use serializable exception in GCP listeners (elastic#33657)" Adding index refresh (elastic#33647) [DOCS] Moves securing-communications to docs (elastic#33640) [HLRC][ML] Add ML delete datafeed API to HLRC (elastic#33667) Mute testRecoveryWithConcurrentIndexing TEST: decrease logging level in the flush test DOC: Add SQL section on client applications Fix race in global checkpoint listeners test Use serializable exception in GCP listeners (elastic#33657)
@martijnvg I pushed some commits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not know asyncShardOperation
before. This is really nice.
final SeqNoStats seqNoStats = indexShard.seqNoStats(); | ||
|
||
if (request.getFromSeqNo() > seqNoStats.getGlobalCheckpoint()) { | ||
assert request.getFromSeqNo() == 1 + seqNoStats.getGlobalCheckpoint(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this assertion always holds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elasticmachine run gradle build tests |
* master: (24 commits) Only notify ready global checkpoint listeners (elastic#33690) Don't count hits via the collector if the hit count can be computed from index stats. (elastic#33701) Expose retries for CCR fetch failures (elastic#33694) Test fix - Graph vertices could appear in different orders based on map insertion sequence (elastic#33709) Structured audit logging (elastic#31931) Core: Add DateFormatter interface for java time parsing (elastic#33467) [CCR] Check whether the rejected execution exception has the shutdown flag set (elastic#33703) Mute ClusterDisruptionIT#testSendingShardFailure Revert "Mute FullClusterRestartSettingsUpgradeIT" Adjust BWC version on settings upgrade test (elastic#33650) [ML] Allow overrides for some file structure detection decisions (elastic#33630) Adapt skip version for doc_values format deprecation [TEST] wait for no initializing shards [Docs] Minor fix in `has_child` javadoc comment (elastic#33674) Mute FullClusterRestartSettingsUpgradeIT [Kerberos] Add realm name & UPN to user metadata (elastic#33338) [TESTS] Disable specific locales for RestrictedTrustManagerTest (elastic#33299) SQL: Return functions in JDBC driver metadata (elastic#33672) SCRIPTING: Move terms_set Context to its Own Class (elastic#33602) AwaitsFix testRestoreMinmal ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This job triggered CI during a migration of the master. Kicking off an additional build for you manually... Jenkins, test this please. |
@elasticmachine run gradle build tests |
* master: Add script to cache dependencies (elastic#33726) [DOCS] Moves security reference to docs folder (elastic#33643) Cleanup assertions in global checkpoint listeners (elastic#33722) [CCR] Move ccr tests in core module back to ccr module (elastic#33711) HLRC: ML PUT Calendar (elastic#33362) [Tests] Fix randomization in StringTermsIT (elastic#33678) Pin TLS1.2 in SSLConfigurationReloaderTests
To fix the build failures here, I want to integrate #33731 and then I want to add a check to the pending tasks check in the REST test to skip the shard changes action tasks that can be outstanding from polls. 😇 |
* master: Move CCR REST tests to a sub-project of ccr Move CCR REST tests to ccr sub-project (elastic#33731) Move CCR monitoring tests to ccr sub-project (elastic#33730)
This pull request requires #33738 and then it can be merged. |
* master: Do not count shard changes tasks against REST tests (elastic#33738) [HLRC][ML] Add ML get datafeed API to HLRC (elastic#33715)
Rather than scheduling pings to the leader index when we are caught up to the leader, this commit introduces long polling for changes. We will fire off a request to the leader which if we are already caught up will enter a poll on the leader side to listen for global checkpoint changes. These polls will timeout after a default of one minute, but can also be specified when creating the following task. We use these time outs as a way to keep statistics up to date, to not exaggerate time since last fetches, and to avoid pipes being broken.
Rather than scheduling pings to the leader index when we are caught up to the leader, this commit introduces long polling for changes. We will fire off a request to the leader which if we are already caught up will enter a poll on the leader side to listen for global checkpoint changes. These polls will timeout after a default of one minute, but can also be specified when creating the following task. We use these time outs as a way to keep statistics up to date, to not exaggerate time since last fetches, and to avoid pipes being broken.
Relates #32651