Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent experiment hanging when worker is killed by OS [Resolves #501] #506

Merged
merged 1 commit into from
Nov 15, 2018

Conversation

thcrock
Copy link
Contributor

@thcrock thcrock commented Nov 12, 2018

Converts the MulticoreExperiment to use Pebble instead of
multiprocessing.

This allows us to wrap the ProcessExpired exception in the case of
the operating system (or something else) killing the subprocess. This
ensures the process pool can keep going and provide useful information
in the logs.

@codecov-io
Copy link

codecov-io commented Nov 12, 2018

Codecov Report

Merging #506 into master will decrease coverage by 0.12%.
The diff coverage is 67.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #506      +/-   ##
==========================================
- Coverage   82.63%   82.51%   -0.13%     
==========================================
  Files          81       81              
  Lines        4629     4660      +31     
==========================================
+ Hits         3825     3845      +20     
- Misses        804      815      +11
Impacted Files Coverage Δ
src/triage/component/catwalk/utils.py 98.83% <100%> (+0.08%) ⬆️
src/triage/experiments/multicore.py 74.71% <45.45%> (-8.85%) ⬇️
src/triage/experiments/base.py 92.79% <90.9%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 24f81e9...49b5d60. Read the comment docs.

Copy link
Member

@jesteria jesteria left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me.

But I'm curious, is the OOM-handling specific to Pebble, rather than to all the concurrency stuff that was added in Python 3? There's so much high-level concurrency stuff in Python 3, I was just a little surprised that another library would be necessary.

It looks like the big change is that we use a future, and can handle exceptions in child processes. But, abstractly, that's not unique to Pebble.

@thcrock
Copy link
Contributor Author

thcrock commented Nov 14, 2018

I did look into vanilla concurrent.futures. It does claim to provide the process dying functionality. But it does not allow a maxtasksperchild equivalent. Pebble does both; well, it uses concurrent.futures, as far as I know, but then adds on a bit more functionality.

Copy link
Member

@jesteria jesteria left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh yeah – that's weird, but OK 👍

num_failures += 1
results.append(result)
logging.exception('Child failure')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really prefer the clarity of try/except/else but NBD:

try:
    result = next(iterator)
except StopIteration:
    break
except Exception:
    logging.exception(...)
    num_failures += 1
else:
    results.append(result)
    num_successes += 1

Copy link
Contributor

@tweddielin tweddielin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Converts the MulticoreExperiment to use Pebble instead of
multiprocessing.

This allows us to wrap the ProcessExpired exception in the case of
the operating system (or something else) killing the subprocess. This
ensures the process pool can keep going and provide useful information
in the logs.
@thcrock thcrock merged commit 19696ba into master Nov 15, 2018
@thcrock thcrock deleted the concurrent_futures branch November 15, 2018 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants