Introduce long polling for changes #33683

jasontedor · 2018-09-13T18:04:42Z

Rather than scheduling pings to the leader index when we are caught up to the leader, this commit introduces long polling for changes. We fire off a request to the leader which, if we are already caught up, enters a poll on the leader side to listen for global checkpoint changes. These polls time out after one minute by default, but the timeout can also be specified when creating the following task. We use these timeouts to keep statistics up to date, to avoid exaggerating the time since the last fetch, and to avoid broken pipes.

Relates #32651
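
As a rough illustration of the polling semantics in the description above, here is a minimal blocking sketch. It is not the code in this PR, which appears to hook into global checkpoint listeners asynchronously rather than parking a thread. The class `LongPollSketch` and its methods `advance` and `poll` are hypothetical names chosen for the example.

```java
import java.util.concurrent.TimeUnit;

public final class LongPollSketch {
    // Hypothetical stand-in for the leader shard's global checkpoint.
    private long globalCheckpoint = -1;

    /** Leader side: record a newer checkpoint and wake any waiting polls. */
    public synchronized void advance(long newCheckpoint) {
        if (newCheckpoint > globalCheckpoint) {
            globalCheckpoint = newCheckpoint;
            notifyAll();
        }
    }

    /**
     * Follower request: returns at once if the leader already has fromSeqNo,
     * otherwise waits until the checkpoint reaches it or the timeout elapses.
     * Either way the caller gets the checkpoint as of the time of return.
     */
    public synchronized long poll(long fromSeqNo, long timeoutMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (globalCheckpoint < fromSeqNo) {
            final long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                break; // timed out; the follower simply issues another poll
            }
            try {
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return globalCheckpoint;
    }
}
```

A follower that is caught up would call `poll(globalCheckpoint + 1, timeout)` in a loop. A timeout returns the unchanged checkpoint, which still gives the follower a chance to refresh its statistics.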


elasticmachine · 2018-09-13T18:04:43Z

Pinging @elastic/es-distributed

jasontedor · 2018-09-13T18:42:10Z

@elasticmachine run gradle build tests

martijnvg

This looks good! I left two questions about ShardChangesAction.TransportAction#asyncShardOperation(...).

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/ShardChangesAction.java

…polling * elastic/master: SQL: Return correct catalog separator in JDBC (elastic#33670) [CCR] Add validation for max_retry_delay (elastic#33648) [CCR] Add monitoring mapping verification test (elastic#33662) CORE: Disable Setting Type Validation (elastic#33660) (elastic#33669) Revert "Use serializable exception in GCP listeners (elastic#33657)" Adding index refresh (elastic#33647) [DOCS] Moves securing-communications to docs (elastic#33640) [HLRC][ML] Add ML delete datafeed API to HLRC (elastic#33667) Mute testRecoveryWithConcurrentIndexing TEST: decrease logging level in the flush test DOC: Add SQL section on client applications Fix race in global checkpoint listeners test Use serializable exception in GCP listeners (elastic#33657)

jasontedor · 2018-09-13T19:06:41Z

@martijnvg I pushed some commits.

martijnvg

LGTM

dnhatn

I did not know asyncShardOperation before. This is really nice.

dnhatn · 2018-09-13T19:54:41Z

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/ShardChangesAction.java

+            final SeqNoStats seqNoStats = indexShard.seqNoStats();
+
+            if (request.getFromSeqNo() > seqNoStats.getGlobalCheckpoint()) {
+                assert request.getFromSeqNo() == 1 + seqNoStats.getGlobalCheckpoint();


I am not sure if this assertion always holds.

jasontedor · 2018-09-14

@dnhatn Please see my latest pushes. I have integrated #33690 so that we only wake up if the global checkpoint has advanced to the request's from sequence number, or on timeout.
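
To illustrate the "only wake up when ready, or on timeout" behaviour mentioned in this reply, a minimal sketch follows. It is a stand-in written under assumptions, not the listener code from #33690: `ReadyListenersSketch`, `add`, and `update` are hypothetical names, and the timeout relies on `CompletableFuture.orTimeout` (Java 9+).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public final class ReadyListenersSketch {
    private static final class Waiter {
        final long waitingFor;
        final CompletableFuture<Long> future;

        Waiter(long waitingFor, CompletableFuture<Long> future) {
            this.waitingFor = waitingFor;
            this.future = future;
        }
    }

    private final List<Waiter> waiters = new ArrayList<>();
    private long globalCheckpoint = -1;

    /** Completes with the checkpoint once it reaches waitingFor, or fails with TimeoutException. */
    public synchronized CompletableFuture<Long> add(long waitingFor, long timeoutMillis) {
        if (globalCheckpoint >= waitingFor) {
            return CompletableFuture.completedFuture(globalCheckpoint);
        }
        final CompletableFuture<Long> future = new CompletableFuture<>();
        waiters.add(new Waiter(waitingFor, future));
        return future.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Advances the checkpoint and notifies only the waiters whose threshold has been reached. */
    public void update(long newCheckpoint) {
        final List<Waiter> ready = new ArrayList<>();
        synchronized (this) {
            if (newCheckpoint <= globalCheckpoint) {
                return;
            }
            globalCheckpoint = newCheckpoint;
            final Iterator<Waiter> it = waiters.iterator();
            while (it.hasNext()) {
                final Waiter w = it.next();
                // Also drop waiters that already timed out; complete() on them is a no-op.
                if (w.future.isDone() || w.waitingFor <= newCheckpoint) {
                    it.remove();
                    ready.add(w);
                }
            }
        }
        // Complete outside the lock so callbacks cannot block other updates.
        for (Waiter w : ready) {
            w.future.complete(newCheckpoint);
        }
    }
}
```

With this shape, a waiter registered for sequence number 5 stays parked through an update to 3 and is completed only by an update to 5 or higher, which is the case the assertion question above was concerned with.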

dnhatn · 2018-09-13T19:57:46Z

@elasticmachine run gradle build tests

* master: (24 commits) Only notify ready global checkpoint listeners (elastic#33690) Don't count hits via the collector if the hit count can be computed from index stats. (elastic#33701) Expose retries for CCR fetch failures (elastic#33694) Test fix - Graph vertices could appear in different orders based on map insertion sequence (elastic#33709) Structured audit logging (elastic#31931) Core: Add DateFormatter interface for java time parsing (elastic#33467) [CCR] Check whether the rejected execution exception has the shutdown flag set (elastic#33703) Mute ClusterDisruptionIT#testSendingShardFailure Revert "Mute FullClusterRestartSettingsUpgradeIT" Adjust BWC version on settings upgrade test (elastic#33650) [ML] Allow overrides for some file structure detection decisions (elastic#33630) Adapt skip version for doc_values format deprecation [TEST] wait for no initializing shards [Docs] Minor fix in `has_child` javadoc comment (elastic#33674) Mute FullClusterRestartSettingsUpgradeIT [Kerberos] Add realm name & UPN to user metadata (elastic#33338) [TESTS] Disable specific locales for RestrictedTrustManagerTest (elastic#33299) SQL: Return functions in JDBC driver metadata (elastic#33672) SCRIPTING: Move terms_set Context to its Own Class (elastic#33602) AwaitsFix testRestoreMinmal ...

dnhatn

LGTM.

elasticdog · 2018-09-14T15:27:58Z

This job triggered CI during a migration of the master. Kicking off an additional build for you manually...

Jenkins, test this please.

jasontedor · 2018-09-14T20:00:01Z

@elasticmachine run gradle build tests

* master: Add script to cache dependencies (elastic#33726) [DOCS] Moves security reference to docs folder (elastic#33643) Cleanup assertions in global checkpoint listeners (elastic#33722) [CCR] Move ccr tests in core module back to ccr module (elastic#33711) HLRC: ML PUT Calendar (elastic#33362) [Tests] Fix randomization in StringTermsIT (elastic#33678) Pin TLS1.2 in SSLConfigurationReloaderTests

jasontedor · 2018-09-15T03:57:43Z

To fix the build failures here, I want to integrate #33731 and then I want to add a check to the pending tasks check in the REST test to skip the shard changes action tasks that can be outstanding from polls. 😇

* master: Move CCR REST tests to a sub-project of ccr Move CCR REST tests to ccr sub-project (elastic#33731) Move CCR monitoring tests to ccr sub-project (elastic#33730)

jasontedor · 2018-09-15T17:39:59Z

This pull request requires #33738 and then it can be merged.

* master: Do not count shard changes tasks against REST tests (elastic#33738) [HLRC][ML] Add ML get datafeed API to HLRC (elastic#33715)


jasontedor added the labels review and :Distributed/CCR (Issues around the Cross Cluster State Replication features) Sep 13, 2018

jasontedor requested review from martijnvg and dnhatn September 13, 2018 18:04

martijnvg reviewed Sep 13, 2018

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/ShardChangesAction.java (outdated)

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/ShardChangesAction.java

jasontedor added 3 commits September 13, 2018 14:50

Fork it over

d4f8a98

Ensure we do not lose exceptions

2890693

martijnvg approved these changes Sep 13, 2018


dnhatn approved these changes Sep 13, 2018


jasontedor added 2 commits September 14, 2018 09:34

Iteration

15d1af9

dnhatn approved these changes Sep 14, 2018


jasontedor added 2 commits September 14, 2018 09:58

Add some trace logging

9e32f93

Add shard ID

7d1f975

Merge branch 'master' into global-checkpoint-polling

e072468


jasontedor mentioned this pull request Sep 15, 2018

Do not count shard changes tasks against REST tests #33738

Merged

Merge branch 'master' into global-checkpoint-polling

086b701


jasontedor merged commit 770ad53 into elastic:master Sep 16, 2018

jasontedor deleted the global-checkpoint-polling branch September 16, 2018 15:01

jasontedor mentioned this pull request Sep 17, 2018

Add global checkpoint polling to cross-cluster replication #32651

Closed

3 tasks
