Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docker image: add vim and nettools (netstat) #10815

Closed
wants to merge 1 commit into from

Conversation

eolivelli
Copy link
Contributor

@eolivelli eolivelli commented Jun 3, 2021

Motivation

Now that we have rootless images the user (system administrator in this case) cannot install new tools easily, so it is better to add to the base image some common tools, like "vim" and "netstat"
e. What is the problem you're trying to solve.*

Modifications

Add "vim" and "netstat" and "unzip" to the base image.
The image will become a little bigger but it is worth

Verifying this change

This change is already covered by existing tests, the integration tests.

@eolivelli eolivelli requested review from tuteng and lhotari June 3, 2021 13:13
@eolivelli
Copy link
Contributor Author

@lhotari is it true that now we are building the official docker image on CI ?

IIRC you switched to some lightweight docker image the integration tests

can you please confirm ?

@lhotari
Copy link
Member

lhotari commented Jun 3, 2021

IIRC you switched to some lightweight docker image the integration tests

There's the java-test-image available added by #9625 . The docker file for that is https://github.com/apache/pulsar/blob/master/tests/docker-images/java-test-image/Dockerfile .

java-test-image isn't used for integration tests in CI atm. There's support for that, but it's not used in CI.

@eolivelli
Copy link
Contributor Author

@lhotari thanks

Copy link
Member

@sijie sijie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a production image. so please be thoughtful when adding an additional package.

@eolivelli
Copy link
Contributor Author

@sijie sure I know.
this image will be used by many users.

Now that we are rootless it is very difficult to operate inside the containers.
I believe that we should add some guide in the docs about to work in a rootless world
on the other hand we should provide the most common tools.
without "vim" or "nano" it is impossible to edit files easily on Linux
the same applies to "netstat" it is something that everyone is used to use

if you deploy Pulsar using the Open Source docker images and the helm chart you should be able to work

@Anonymitaet
Copy link
Member

@eolivelli thanks for your contribution. For this PR, do we need to update docs?

@eolivelli
Copy link
Contributor Author

@sijie @addisonj @michaeljmarshall do you have some suggestion about how to help users work on the pods now that we are rootless ?

@michaeljmarshall
Copy link
Member

do you have some suggestion about how to help users work on the pods now that we are rootless ?

I think the largest difference is that you cannot download dependencies via apt. I've found Bitnami blogs pretty helpful with the non-root container work. They have a blog here that mentions the main challenges that users face when using them: https://engineering.bitnami.com/articles/running-non-root-containers-on-openshift.html and https://docs.bitnami.com/tutorials/work-with-non-root-containers/. (Note that the blogs have some outdated references to limitations of kubernetes.)

Depending on how we move forward here, I can add docs for troubleshooting with the non-root docker images. (As the above links describe, one note is that you can still run the container as the root user. If you're starting the docker container yourself, you can specify --user root. If the container is already running, you can specify the user when using docker exec. In Kubernetes, you can do this by specifying the pod's securityContext.)

This PR does raise one question for me: what are our container security goals for Pulsar? I agree that development is easier with these tools (and possibly others) in the docker image. However, I'd posit the main use case for these docker images is not development, but is rather running in a production environment where there shouldn't be frequent debugging. These tools technically introduce potential attack vectors and increase the size of these docker images. One option is to create "debug" docker images that include more tools.

Given that we already include several pulsar clients in the basic pulsar docker image, it's probably reasonable to include these three packages as the PR proposes. If we change direction and have smaller docker images or if we need to add many more packages, we could discuss have "debug" images and limiting the packages shipped in the basic pulsar and pulsar-all images.

Full disclosure, my perspective is influenced by Google's "distroless" work: https://github.com/GoogleContainerTools/distroless. Their perspective is strong (and if you go to their README, you'll see that they make use of "debug" images):

Why should I use distroless images?
Restricting what's in your runtime container to precisely what's necessary for your app is a best practice employed by Google and other tech giants that have used containers in production for many years. It improves the signal to noise of scanners (e.g. CVE) and reduces the burden of establishing provenance to just what you need.

I'm not sure that Pulsar should (or can) become "distroless", but I do find the principles compelling.

@eolivelli
Copy link
Contributor Author

Given that we already include several pulsar clients in the basic pulsar docker image, it's probably reasonable to include these three packages as the PR proposes. If we change direction and have smaller docker images or if we need to add many more packages, we could discuss have "debug" images and limiting the packages shipped in the basic pulsar and pulsar-all images.

It is really a pain to upgrade to 2.8.0 docker images (a part from the fact that you have to fix the owner of the files on the bk and zk pods).

I wonder if anyone will be happy to upgrade to 2.8.0 docker images.
I would like that users won't be forced to build their own custom docker images or to be forced to use third-party docker images just because they do not have a way to edit files (vim) or to see open connections (netstat)

@michaeljmarshall
Copy link
Member

I wonder if anyone will be happy to upgrade to 2.8.0 docker images.

I think security conscious users/companies will be happy to see we're moving in this direction. Further, the Pulsar 2.8.0 image is the first pulsar image that is compliant with Open Shift's requirements to run a container as a random non root user.

We can mitigate much of the complexity in the helm chart. In the official Helm chart, we are going to override the default user for the bookkeeper and zookeeper containers so they run as root until a user opts in to running them as non root.

I would like that users won't be forced to build their own custom docker images or to be forced to use third-party docker images just because they do not have a way to edit files (vim) or to see open connections (netstat)

I agree that we want to provide a docker image that is useful to the majority of our users. A "debug" docker image could eventually make sense too, but for now, we can just add in these packages.

@sijie
Copy link
Member

sijie commented Jun 8, 2021

Give we have reverted the docker image from rootless to a root image, we should hold on to this change and push the discussion later when we switched back to rootless.

@eolivelli
Copy link
Contributor Author

@sijie

Give we have reverted the docker image from rootless to a root image, we should hold on to this change and push the discussion later when we switched back to rootless.

Makes sense to me.
We should think more about the impact of rootless container images.

I am keeping open this PR and add it to 2.9.0, so that we won't forget

@codelipenghui
Copy link
Contributor

The pr had no activity for 30 days, mark with Stale label.

@github-actions
Copy link

github-actions bot commented Mar 4, 2022

@eolivelli:Thanks for your contribution. For this PR, do we need to update docs?
(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants