Add bk, zk securityContext to support upgrade to non-root docker image #266
Conversation
The patch makes sense to me, but I am not a k8s expert, so please wait for other reviews before merging (also, CI failed...)
The error comes from the
Specifically, the Bookie init logs show:
Zookeeper logs during the above timeout:
After talking with @lhotari, it looks like these are the known TLS/ZK 3.6.3 issues. I'll retry the tests a few times to see if they pass.
All good in Kubernetes (tested on GKE), but encountering some problems in OpenShift (tested on OC 4.10).
The `fsGroup: 0` setting will cause the following error:
create Pod pulsar-zookeeper-0 in StatefulSet pulsar-zookeeper failed error: pods "pulsar-zookeeper-0" is forbidden: unable to validate against any security context constraint:
provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{0}: 0 is not an allowed group,
This is because OpenShift defines SecurityContextConstraints, so we may need to do more work to support deploying on OpenShift. But we can still merge this PR and add OpenShift support in a separate PR.
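For context, this is an assumption about standard OpenShift behavior rather than something verified in this PR: the restricted SCC derives the allowed `fsGroup` values from per-namespace annotations, which is why a hard-coded `fsGroup: 0` fails validation. A namespace might look roughly like this:

```yaml
# Illustrative OpenShift namespace metadata (the range values are made up).
# The restricted SCC only permits fsGroup/uid values that fall inside these
# annotated ranges, so a fixed fsGroup: 0 is rejected at admission.
apiVersion: v1
kind: Namespace
metadata:
  name: pulsar
  annotations:
    openshift.io/sa.scc.supplemental-groups: 1000620000/10000
    openshift.io/sa.scc.uid-range: 1000620000/10000
```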
@maxsxu - thank you for testing. Do you know if OpenShift works when you set
By setting
@michaeljmarshall Unfortunately, it still doesn't work with that setting. The Broker and Proxy keep initializing due to the error below:
Logs from the ZK Pod:
From my observation, OpenShift generates a random non-zero user id. We can observe the following inside the ZK Pod:
So, as for "the container user is always a member of the root group...": yes indeed, but that does not extend to the PV's group.
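As a minimal generic Kubernetes illustration of that distinction (not from this chart): it is `fsGroup` that makes the kubelet apply group ownership to mounted volumes; the container user's supplementary root-group membership by itself does not change the volume's group:

```yaml
# Generic example (not from the Pulsar chart): because of fsGroup, the
# kubelet chowns the mounted volume to gid 2000 before the container
# starts. Without fsGroup, the volume keeps whatever group the storage
# provider assigned, regardless of the container user's groups.
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo
spec:
  securityContext:
    fsGroup: 2000
  containers:
    - name: demo
      image: busybox
      command: ["sh", "-c", "ls -ln /data && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      emptyDir: {}
```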
@maxsxu - thanks for testing out my theory. I realize now that the
Thanks @michaeljmarshall - shouldn't the helm chart version change?
@frankjkelly good question. I know our recent release procedure has tied individual PRs to version bumps, but it seems more appropriate to me to separate releases and features into separate PRs. Regarding releases, I think the Pulsar Community needs to revisit the Helm Chart release procedure. We're operating in a gray area by performing releases that are not first voted on the dev mailing list. By my understanding, all Apache project releases are supposed to have a vote.
@michaeljmarshall Thanks for the information - all good points, especially the Apache requirement. The challenge, however, is that two Pulsar deployments with the same Helm Chart version could potentially behave very differently due to different configurations, despite the "immutability" of the images.
@frankjkelly - are you saying that a release is overwritten when we merge a PR without incrementing the version? I assumed it only published a release when the version number changed. If it is overwriting existing helm binaries, that definitely needs to be fixed.
@michaeljmarshall I'm not sure whether a new helm chart release is made even if the version number has not changed. But either way - even if no release is done - it's a source of confusion.
LGTM
Master Issue: apache/pulsar#11269
Motivation
Apache Pulsar's docker images for 2.10.0 and above are non-root by default. In order to ensure there is a safe upgrade path, we need to expose the `securityContext` for the Bookkeeper and Zookeeper StatefulSets. Here is the relevant k8s documentation on this feature: https://kubernetes.io/docs/tasks/configure-pod-container/security-context.

Once released, all deployments using the default `values.yaml` configuration for the `securityContext` will pay a one-time penalty on upgrade, where the kubelet will recursively chown files to be root group writable. It's possible to temporarily avoid this penalty by setting `securityContext: {}`.
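For reference, a sketch of the opt-out mentioned above, assuming the chart nests these values under the `bookkeeper.securityContext` and `zookeeper.securityContext` keys introduced in this PR (verify against the released chart before use):

```yaml
# Hypothetical values.yaml override: an empty securityContext omits
# fsGroup, so the kubelet performs no recursive chown on the volumes,
# deferring (not eliminating) the one-time upgrade penalty.
bookkeeper:
  securityContext: {}
zookeeper:
  securityContext: {}
```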
Modifications

- Expose `bookkeeper.securityContext` and `zookeeper.securityContext`.
- Default to `fsGroup: 0`. This is already the default group id in the docker image, and the docker image assumes the user has root group permission.
- Default to `fsGroupChangePolicy: "OnRootMismatch"`. This configuration will work for all deployments where the user id is stable. If the user id switches between restarts, like it does in OpenShift, please set it to `Always`.
- Remove the GC log configuration (`/pulsar/log/bookie-gc.log`?)

Verifying this change
I first attempted verification of this change with minikube. It did not work because minikube uses hostPath volumes by default. I then tested on EKS v1.21.9-eks-0d102a7. I tested by deploying the current, latest version of the helm chart (2.9.3) and then upgrading to this PR's version of the helm chart along with using the 2.10.0 docker image. I also tested upgrading from a default version
Test 1 is a plain upgrade using the default 2.9.3 version of the chart, then upgrading to this PR's version of the chart with the modification to use the 2.10.0 docker images. It worked as expected.
Test 2 is a plain upgrade using the default 2.9.3 version of the chart, then an upgrade to this PR's version of the chart, then an upgrade to this PR's version of the chart using 2.10.0 docker images. There is a minor error described in the `README.md`. The solution is to chown the bookie's data directory.

GC Logging
In my testing, I ran into the following errors when using `-Xlog:gc:/var/log/bookie-gc.log`:

I resolved the error by removing the setting.
OpenShift Observations
I wanted to seamlessly support OpenShift, so I investigated configuring the bookkeeper and zookeeper processes with `umask 002` so that they would create files and directories that are group writable (OpenShift has a stable group id, but gives the process a random user id). That worked for most tools when switching the user id, but not for RocksDB, which creates a lock file at `/pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK` with permission `0644`, ignoring the umask. Here is the relevant error:

As such, in order to support OpenShift, I exposed the `fsGroupChangePolicy`, which allows for OpenShift support, but not necessarily seamless support.
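Under that design, an OpenShift deployment might opt into the stricter policy via values like the following. This is a sketch based on the `fsGroupChangePolicy` exposure described above, not a tested configuration; `fsGroup` is omitted since OpenShift's restricted SCC rejects `fsGroup: 0` and injects its own value:

```yaml
# Hypothetical override for OpenShift, where the pod's user id can change
# between restarts: "Always" forces the kubelet to fix volume ownership on
# every pod start instead of only when the root ownership mismatches.
bookkeeper:
  securityContext:
    fsGroupChangePolicy: "Always"
zookeeper:
  securityContext:
    fsGroupChangePolicy: "Always"
```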