Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle scheduler exceptions #38014

Merged

Conversation

henningandersen
Copy link
Contributor

This is a continuation of #28667, #36137 and also fixes #37708. The goal is to log warnings on all exceptions thrown out of Runnables passed to Scheduler.schedule. There were no usages of the "Future.get()" on the ScheduledFuture returned from schedule and we have therefore redefined the semantics to be that scheduled tasks should never fail and if they do this should always be logged as a warning.

Parts of this will be backported to 6.x, notably we will avoid following breaking changes (breaking-java label only applies to 7.0):

  1. Keep Scheduler.schedule with old signature. This way 6.x remains backwards compatible while we can still use the new safe method everywhere. The signature change of schedule was primarily motivated in this (and that the new order is more in line with ScheduledExecutorService.schedule).
  2. Keep Processor.Parameters.scheduler as is in 6.x
  3. Keep ThreadWatchDog (in particular newInstance) as is in 6.x

Please pay special attention to the change in Processor.Parameters, I would like input on the todo in that file.

Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of elastic#28667, elastic#36317 and also fixes elastic#37708.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done a first pass. Overall looking good.

Fixed review comments: removed todo, use FutureUtils.cancel and removed
scheduler task decoration since this adds more complexity than it
benefits.

This is a continuation of elastic#28667, elastic#36137 and also fixes elastic#37708.
@henningandersen
Copy link
Contributor Author

@elasticmachine please run elasticsearch-ci/2

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@henningandersen henningandersen merged commit 68ed72b into elastic:master Jan 31, 2019
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jan 31, 2019
…ersion

* elastic/master:
  Do not set up NodeAndClusterIdStateListener in test (elastic#38110)
  ML: better handle task state race condition (elastic#38040)
  Soft-deletes policy should always fetch latest leases (elastic#37940)
  Handle scheduler exceptions (elastic#38014)
  Minor logging improvements (elastic#38084)
  Fix Painless void return bug (elastic#38046)
  Update PutFollowAction serialization post-backport (elastic#37989)
  fix a few versionAdded values in ElasticsearchExceptions (elastic#37877)
  Reenable BWC tests after backport of elastic#37899 (elastic#38093)
  Mute failing test
  Mute failing test
  Fail start on obsolete indices documentation (elastic#37786)
  SQL: Implement FIRST/LAST aggregate functions (elastic#37936)
  Un-mute NoMasterNodeIT.testNoMasterActionsWriteMasterBlock
  remove unused parser fields in RemoteResponseParsers
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Feb 1, 2019
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

In this backport to 6.x, source backwards compatibility is maintained
and some of the changes has therefore not been carried out (notably
the signature change on Processor.Parameters.scheduler).

This is a continuation of elastic#28667, elastic#36137 and also fixes elastic#37708.
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Feb 4, 2019
Instead of logging warnings we now rethrow exceptions thrown inside
scheduled/submitted tasks. This will still log them as warnings in
production but has the added benefit that if they are thrown during
unit/integration test runs, the test will be flagged as an error.

This is a continuation of elastic#38014
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Feb 5, 2019
Instead of logging warnings we now rethrow exceptions thrown inside
scheduled/submitted tasks. This will still log them as warnings in
production but has the added benefit that if they are thrown during
unit/integration test runs, the test will be flagged as an error.

Fixed NPE in GlobalCheckPointListners that caused CCR tests
(IndexFollowingIT and likely others) to fail.

This is a continuation of elastic#38014
henningandersen added a commit that referenced this pull request Feb 5, 2019
Instead of logging warnings we now rethrow exceptions thrown inside
scheduled/submitted tasks. This will still log them as warnings in
production but has the added benefit that if they are thrown during
unit/integration test runs, the test will be flagged as an error.

Fixed NPE in GlobalCheckPointListeners that caused CCR tests
(IndexFollowingIT and likely others) to fail.

This is a continuation of #38014

Backports #38317
henningandersen added a commit that referenced this pull request Feb 5, 2019
Instead of logging warnings we now rethrow exceptions thrown inside
scheduled/submitted tasks. This will still log them as warnings in
production but has the added benefit that if they are thrown during
unit/integration test runs, the test will be flagged as an error.

This is a continuation of #38014

Fixed NPE that caused CCR tests (IndexFollowingIT and likely others)
to fail.

schedule could bubble rejected exception to uncaught exception
handler when not using SAME executor if thread pool is terminated.
Now ignore rejected exception silently if executor is shutdown.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI: EvilThreadPoolTests.testExecutionExceptionOnSinglePrioritizingThreadPoolExecutor fails
4 participants