-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix WithContainerStepTest.stop and WithContainerStepTest.death #257
Conversation
src/test/java/org/jenkinsci/plugins/docker/workflow/WithContainerStepTest.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jenkinsci/plugins/docker/workflow/WithContainerStep.java
Outdated
Show resolved
Hide resolved
Let's rebuild and hope that |
It looks like the build might fail to record Linux test results, but |
" withDockerContainer('httpd:2.4.12') {\n" + | ||
" sh \"sleep 5; ps -e -o pid,command | egrep '${pwd tmp: true}/durable-.+/script.sh' | fgrep -v grep | sort -n | tr -s ' ' | cut -d ' ' -f2 | xargs kill -9\"\n" + | ||
" withDockerContainer('httpd:2.4.54-alpine') {\n" + | ||
" sh \"set -e; sleep 5; ps -e -o pid,args | egrep '${pwd tmp: true}/durable-.+/script.sh' | fgrep -v grep | sort -n | tr -s ' ' | cut -d ' ' -f2 | xargs kill -9\"\n" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doing this whole parsing looks so painful to me 😢
Would it be easier to test if the heartbeat file still existed/was being written to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this test is trying to check whether sh
steps inside of withContainer
respond correctly if the remote script just suddenly dies, so it is using internal knowledge of durable-task
to kill all of the processes related to the sh
step from within, which is pretty awkward. It used to be simpler before jenkinsci/durable-task-plugin#49, but it had to be updated in #121 since the pid
file didn't exist anymore.
I think that if we deleted the heartbeat file here it wouldn't do anything, since the control script would just touch the file 3 seconds later, recreating it. Not 100% sure though. Maybe we could chmod
it to be unwritable or something, but IDK if that would be much easier to maintain.
I saw these flake in #256 and didn't think much of it, but then I saw them also fail in #253, and took a look. There seem to be a lot of problems with our use of
ps
.The real head scratcher for me though is that the tests pass locally with
httpd:2.4.12
. They only fail if I usehttpd:2.4.35
or newer (which I believe is due to the switch to a-slim
version of Debian in docker-library/httpd#113 which removesps
from the container entirely, but it is hard to track things down because that repo does not have tags/releases). If I do update to 2.4.35 or newer then I get the exact error messages seen in CI. Is something in the CI environment forcing newer images to be used? Is that even possible?