Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix WithContainerStepTest.stop and WithContainerStepTest.death #257

Merged
merged 2 commits into from
Jun 22, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -350,6 +350,15 @@ private static class Decorator extends LauncherDecorator implements Serializable
@Override public void kill(Map<String,String> modelEnvVars) throws IOException, InterruptedException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
String executable = getExecutable();
// For this code to be able to successfully kill processes:
// 1: The image must contain `ps`, which notably excludes slim variants of Debian.
// 2. The version of `ps` being used must support `-o command`. Busybox does not (so no Alpine!).
// `-o args` seems to be a more portable equivalent, but even then, see JENKINS-52881.
// 3. The version of `ps` being used must support the `e` output modifier to include environment
// variables in the command's output. Otherwise, the output only includes `jsc=<cookie here>` and
// `JENKINS_SERVER_COOKIE=$jsc`, but the output must include `JENKINS_SERVER_COOKIE=<cookie here>
// for this code to work.
// In practice, this means that this code only works with images that include procps.
if (getInner().launch().cmds(executable, "exec", container, "ps", "-A", "-o", "pid,command", "e").stdout(baos).quiet(true).start().joinWithTimeout(DockerClient.CLIENT_TIMEOUT, TimeUnit.SECONDS, listener) != 0) {
throw new IOException("failed to run ps");
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -164,7 +164,7 @@ public class WithContainerStepTest {
WorkflowJob p = story.j.jenkins.createProject(WorkflowJob.class, "prj");
p.setDefinition(new CpsFlowDefinition(
"node {\n" +
" withDockerContainer('httpd:2.4.12') {\n" +
" withDockerContainer('ubuntu:kinetic-20220602') {\n" +
" sh 'trap \\'echo got SIGTERM\\' TERM; trap \\'echo exiting; exit 99\\' EXIT; echo sleeping now with JENKINS_SERVER_COOKIE=$JENKINS_SERVER_COOKIE; sleep 999'\n" +
" }\n" +
"}", true));
Expand All @@ -185,8 +185,8 @@ public class WithContainerStepTest {
WorkflowJob p = story.j.jenkins.createProject(WorkflowJob.class, "prj");
p.setDefinition(new CpsFlowDefinition(
"node {\n" +
" withDockerContainer('httpd:2.4.12') {\n" +
" sh \"sleep 5; ps -e -o pid,command | egrep '${pwd tmp: true}/durable-.+/script.sh' | fgrep -v grep | sort -n | tr -s ' ' | cut -d ' ' -f2 | xargs kill -9\"\n" +
" withDockerContainer('httpd:2.4.54-alpine') {\n" +
" sh \"set -e; sleep 5; ps -e -o pid,args | egrep '${pwd tmp: true}/durable-.+/script.sh' | fgrep -v grep | sort -n | tr -s ' ' | cut -d ' ' -f2 | xargs kill -9\"\n" +
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing this whole parsing looks so painful to me 😢
Would it be easier to test if the heartbeat file still existed/was being written to?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test is trying to check whether sh steps inside of withContainer respond correctly if the remote script just suddenly dies, so it is using internal knowledge of durable-task to kill all of the processes related to the sh step from within, which is pretty awkward. It used to be simpler before jenkinsci/durable-task-plugin#49, but it had to be updated in #121 since the pid file didn't exist anymore.

I think that if we deleted the heartbeat file here it wouldn't do anything, since the control script would just touch the file 3 seconds later, recreating it. Not 100% sure though. Maybe we could chmod it to be unwritable or something, but IDK if that would be much easier to maintain.

" }\n" +
"}", true));
Field hci = BourneShellScript.class.getDeclaredField("HEARTBEAT_CHECK_INTERVAL");
Expand Down