Perform analysis on multistage Dockerfiles #612

Closed
nishakm opened this issue Mar 27, 2020 · 3 comments
@nishakm
Contributor

nishakm commented Mar 27, 2020

Describe the Feature
Tern can read a Dockerfile, build an image from it, and analyze that image to produce a report and pin the Dockerfile. However, it cannot do this accurately for multistage Dockerfiles because the intermediate stages are thrown away once the build is complete. Why not enable analysis and pinning for multistage Dockerfiles?

Use Cases
Multistage Dockerfiles are the de facto build mechanism for golang projects in particular. A Dockerfile will typically use the golang image to build the binary and an alpine image to deploy it.

Implementation Changes
A proposed implementation can look like this:

  1. Split the Dockerfile by stage, making a single Dockerfile for each stage
  2. Build and analyze each stage
  3. For reporting, perhaps organize each stage as a different section and indicate that each is a build stage of the next. Pinning a multistage Dockerfile is straightforward.
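
A minimal sketch of step 1, assuming a stage starts at every line that begins with a FROM instruction. The helper name `split_by_stage` is hypothetical and is not part of Tern; it also ignores edge cases such as a continuation line that happens to begin with FROM.

```python
import re

# Matches a FROM instruction at the start of a line (case-insensitive).
FROM_RE = re.compile(r"^\s*FROM\s", re.IGNORECASE)


def split_by_stage(dockerfile_text):
    """Return a list of stages, each a list of the lines belonging to it.

    Hypothetical helper for illustration only; not Tern's implementation.
    Lines before the first FROM (for example global ARGs) stay with stage 0.
    """
    stages = []
    preamble = []
    for line in dockerfile_text.splitlines():
        if FROM_RE.match(line):
            # A new stage begins; attach any preamble to the first stage.
            stages.append(preamble + [line])
            preamble = []
        elif stages:
            stages[-1].append(line)
        else:
            preamble.append(line)
    return stages
```

Each returned stage could then be written out as its own Dockerfile, although a stage that uses COPY --from cannot be built without the stage it copies from.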
@nishakm nishakm added feature new feature proposal Propose a change to the project GSoC For Google Summer of Code labels Mar 27, 2020
@ForgetMe17
Contributor

Maybe I can work on this issue!

@nishakm nishakm added this to the Near Future milestone Apr 21, 2020
@nishakm nishakm modified the milestones: Near Future, Release 2.2.0 Jul 9, 2020
@ForgetMe17
Contributor

ForgetMe17 commented Jul 14, 2020

Hi nisha, here is my plan:
When analyzing a single-stage Dockerfile, we first build it and then analyze the resulting image. For a multistage Dockerfile it should work similarly.

1. Build
For a two-stage Dockerfile, we need to get two images, one for each stage in the Dockerfile. So in the build part, we can build the images by splitting the Dockerfile as shown here.
Multistage Dockerfile

FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html  
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest  
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app . 
CMD ["./app"]  

into
Dockerfile1: Stage 1

FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html  
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

Dockerfile2: Stage 1 + Stage 2

FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html  
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest  
RUN apk --no-cache add ca-certificates
WORKDIR /root/
# Copy file from previous stage.
COPY --from=0 /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]  

I am not splitting the Dockerfile into separate stages because the second stage may need to copy files from the previous stage (Stage 2 depends on Stage 1).
While building Dockerfile2, the engine builds Stage 1 first and then Stage 2; during the Stage 2 build, files from Stage 1 are copied into Stage 2. Stage 1 is discarded once Stage 2 is finished.
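
As a rough sketch of this build plan, the cumulative Dockerfiles (Stage 1, then Stage 1 + Stage 2, and so on) could be generated like this. The function name `cumulative_dockerfiles` is made up for illustration and is not the function added to Tern; it assumes every stage starts at a line beginning with FROM.

```python
import re

# Matches a FROM instruction at the start of a line (case-insensitive).
FROM_RE = re.compile(r"^\s*FROM\s", re.IGNORECASE)


def cumulative_dockerfiles(dockerfile_text):
    """Return one Dockerfile string per stage; entry i holds stages 0..i.

    Hypothetical helper for illustration only.
    """
    lines = dockerfile_text.splitlines()
    from_indexes = [i for i, line in enumerate(lines) if FROM_RE.match(line)]
    if not from_indexes:
        return []
    # Each Dockerfile ends just before the next FROM; the last is the whole file.
    ends = from_indexes[1:] + [len(lines)]
    return ["\n".join(lines[:end]).rstrip() + "\n" for end in ends]
```

For the two-stage example above, the first entry corresponds to Dockerfile1 and the second to Dockerfile2, so every COPY --from=0 still finds the stage it refers to.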

2. Analyze
After we get the images, we can analyze each one and lock the Dockerfile. Here we can use analyze_docker_image(image_obj, redo=False, dfile_lock=False, dfobj=None), but the dfobj passed for each image should cover only that image's own stage:
Stage 1

FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html  
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

Stage 2

FROM alpine:latest  
RUN apk --no-cache add ca-certificates
WORKDIR /root/
# Copy file from previous stage.
COPY --from=0 /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]  

This differs from the build part, where Dockerfile2 contains both stages: here each image is matched with its own stage only, so we can lock each stage by analyzing its image.
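
To make the build/analyze distinction concrete, here is a small sketch that pairs, for each stage, the text to build (all stages up to and including it) with the text to analyze and lock (that stage alone). `StagePlan` and `plan_stages` are hypothetical names, not Tern classes or functions, and the input is assumed to be a list of stage texts that has already been split.

```python
from dataclasses import dataclass


@dataclass
class StagePlan:
    index: int          # position of the stage in the Dockerfile
    build_text: str     # stages 0..index: what gets built into an image
    analyze_text: str   # stage index only: what gets analyzed and locked


def plan_stages(stages):
    """Build one StagePlan per stage from an ordered list of stage texts.

    Hypothetical helper for illustration only.
    """
    plans = []
    for i, stage in enumerate(stages):
        build_text = "\n".join(stages[: i + 1])
        plans.append(StagePlan(index=i, build_text=build_text, analyze_text=stage))
    return plans
```

For the first stage the two texts are identical; from the second stage on, the build text grows while the analyze text stays limited to a single stage.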

@nishakm
Contributor Author

nishakm commented Jul 14, 2020

@ForgetMe17 That makes sense to me. All the information collected when analyzing an image built from Dockerfile1 is stored in the Image object. So you can instantiate stage1 = DockerImage(repotag1) for the first image and stage2 = DockerImage(repotag2) for the second image. But some of the report formats do not accept more than one image (most notably the SPDX tag-value), so focusing on the default report and the Dockerfile lock is enough for this part.

ForgetMe17 added a commit to ForgetMe17/tern that referenced this issue Jul 18, 2020
Implemented two functions:
1. check_mutistage_dockerfile(): Given a dockerfile object, return whether
it is a multistage dockerfile.
2. split_multistage_dockerfile(): Given a multistage dockerfile object,
return the list of split dockerfile objects.

Works towards tern-tools#767.
Super issue tern-tools#612.

Signed-off-by: WangJL <hazard15020@gmail.com>
ForgetMe17 added a commit to ForgetMe17/tern that referenced this issue Jul 23, 2020
This commit adds two functions to implement building a multistage
dockerfile to get its images by the stages.

1. get_multistage_image_dockerfiles() at
tern\analyze\docker\dockerfile.py: This function splits a multistage
dockerfile into dockerfiles by its stage for build.

2. build_multistage() at tern\analyze\docker\run.py: This function
builds a multistage dockerfile to get the images for analysis. So far we
can build the dockerfile; further jobs like analysis and clean up
are implemented in other commits.

Works towards tern-tools#612.

Signed-off-by: WangJL <hazard15020@gmail.com>
ForgetMe17 added a commit to ForgetMe17/tern that referenced this issue Aug 4, 2020
This commit adds 3 functions in tern\analyze\docker\dockerfile.py.

1. check_multistage_dockerfile() to check if the given dockerfile is a
multistage dockerfile, return True or False and the index of FROM line.

2. get_multistage_image_dockerfiles() to split multistage dockerfile
into separate dockerfiles for build.

3. write_dockerfile_by_structure() to write a dockerfile by the dfobj
structure.

Works towards tern-tools#767.
Super issue tern-tools#612.

Signed-off-by: WangJL <hazard15020@gmail.com>
ForgetMe17 added a commit to ForgetMe17/tern that referenced this issue Aug 12, 2020
This commit adds 3 functions in tern\analyze\docker\dockerfile.py.

1. check_multistage_dockerfile() to check if the given dockerfile is a
multistage dockerfile, return True or False and the index of FROM line.

2. get_multistage_image_dockerfiles() to split multistage dockerfile
into separate dockerfiles for build.

3. write_dockerfile_by_structure() to write a dockerfile by the dfobj
structure.

Works towards tern-tools#767.
Super issue tern-tools#612.

Signed-off-by: WangJL <hazard15020@gmail.com>
ForgetMe17 added a commit to ForgetMe17/tern that referenced this issue Aug 13, 2020
1. In tern\analyze\docker\dockerfile.py, add function
split_multistage_dockerfile_by_stage() to split by stage.

2. In tern\analyze\docker\run.py, add function
build_multistage(). This is a draft version for building and analyzing
a multistage dockerfile.

Works towards tern-tools#612.

Signed-off-by: WangJL <hazard15020@gmail.com>
ForgetMe17 added a commit to ForgetMe17/tern that referenced this issue Aug 21, 2020
1. In tern\analyze\docker\dockerfile.py, add function
split_multistage_dockerfile_by_stage() to split by stage.

2. In tern\analyze\docker\run.py, add function
build_multistage(). This is a draft version for building and analyzing
a multistage dockerfile.

Works towards tern-tools#612.

Signed-off-by: WangJL <hazard15020@gmail.com>
@nishakm nishakm removed the GSoC For Google Summer of Code label Aug 26, 2020
@nishakm nishakm modified the milestones: Release 2.2.0, Near Future Aug 26, 2020
@nishakm nishakm modified the milestones: Near Future, 3.0.0 Sep 3, 2020
ForgetMe17 added a commit to ForgetMe17/tern that referenced this issue Sep 13, 2020
Add function execute_multistage_dockerfile(args) in
tern\analyze\docker\run.py. This function iterates over the stages in the
multistage dockerfile and uses execute_dockerfile() for each stage.

Works towards tern-tools#612.

Signed-off-by: WangJL <hazard15020@gmail.com>
@nishakm nishakm modified the milestones: Release 3.0.0, Release 2.3.0 Nov 18, 2020
nishakm pushed a commit to nishakm/tern that referenced this issue Nov 18, 2020
Add function execute_multistage_dockerfile(args).
This function iterates over the stages in the
multistage dockerfile and uses execute_dockerfile() for each stage.

Works towards tern-tools#612.

Signed-off-by: WangJL <hazard15020@gmail.com>
Signed-off-by: Nisha K <nishak@vmware.com>