Add bk, zk securityContext to support upgrade to non-root docker image #266

michaeljmarshall
Member

Master Issue: apache/pulsar#11269

Motivation

Apache Pulsar's docker images for 2.10.0 and above are non-root by default. To ensure there is a safe upgrade path, we need to expose the securityContext for the Bookkeeper and Zookeeper StatefulSets. Here is the relevant Kubernetes documentation on this feature: https://kubernetes.io/docs/tasks/configure-pod-container/security-context.

Once released, all deployments using the default values.yaml configuration for the securityContext will pay a one-time penalty on upgrade, where the kubelet recursively chowns files to be root group writable. It's possible to temporarily avoid this penalty by setting securityContext: {}.

Modifications

  • Add config blocks for the bookkeeper.securityContext and zookeeper.securityContext.
  • Default to fsGroup: 0. This is already the default group id in the docker image, and the docker image assumes the user has root group permission.
  • Default to fsGroupChangePolicy: "OnRootMismatch". This configuration will work for all deployments where the user id is stable. If the user id switches between restarts, like it does in OpenShift, set it to "Always" instead.
  • Remove the GC logging configuration that writes to a directory the non-root user lacks permission to write to. (Perhaps we want to write to /pulsar/log/bookie-gc.log?)
  • Add documentation to the README.
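
As a rough illustration of the defaults listed above, the following sketch writes a values fragment with the two new blocks. The nesting under bookkeeper and zookeeper is inferred from the key names in the bullets, and the file name security-context-defaults.yaml is made up for this example; check the chart's values.yaml for the authoritative layout.

```shell
# Sketch only: a values fragment mirroring the defaults described in this PR.
# The file name is hypothetical; the key nesting is inferred from
# "bookkeeper.securityContext" and "zookeeper.securityContext".
cat > security-context-defaults.yaml <<'EOF'
bookkeeper:
  securityContext:
    fsGroup: 0
    fsGroupChangePolicy: "OnRootMismatch"
zookeeper:
  securityContext:
    fsGroup: 0
    fsGroupChangePolicy: "OnRootMismatch"
EOF

# Show the fragment that was written.
cat security-context-defaults.yaml
```

Both StatefulSets get the same pod-level settings, so a deployment that needs different values should only have to override these two blocks.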

Verifying this change

I first attempted verification of this change with minikube. It did not work because minikube uses hostPath volumes by default. I then tested on EKS v1.21.9-eks-0d102a7. I tested by deploying the current, latest version of the helm chart (2.9.3) and then upgrading to this PR's version of the helm chart along with using the 2.10.0 docker image. I also tested upgrading from a default version

Test 1 is a plain upgrade using the default 2.9.3 version of the chart, then upgrading to this PR's version of the chart with the modification to use the 2.10.0 docker images. It worked as expected.

$ helm install test apache/pulsar
$ # Wait for chart to deploy, then run the following, which uses Pulsar version 2.10.0:
$  helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/

Test 2 is a plain upgrade using the default 2.9.3 version of the chart, then an upgrade to this PR's version of the chart, then an upgrade to this PR's version of the chart using 2.10.0 docker images. There is a minor error described in the README.md. The solution is to chown the bookie's data directory.

$ helm install test apache/pulsar
$ # Wait for chart to deploy, then run the following, which uses Pulsar version 2.9.2:
$  helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/
$ # Upgrade using Pulsar version 2.10.0
$  helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/

GC Logging

In my testing, I ran into the following errors when using -Xlog:gc:/var/log/bookie-gc.log:

pulsar-bookkeeper-verify-clusterid [0.008s] Error opening log file '/var/log/bookie-gc.log': Permission denied
pulsar-bookkeeper-verify-clusterid [0.008s] Initialization of output 'file=/var/log/bookie-gc.log' using options '(null)' failed.
pulsar-bookkeeper-verify-clusterid [0.005s] Error opening log file '/var/log/bookie-gc.log': Permission denied
pulsar-bookkeeper-verify-clusterid [0.006s] Initialization of output 'file=/var/log/bookie-gc.log' using options '(null)' failed.
pulsar-bookkeeper-verify-clusterid Invalid -Xlog option '-Xlog:gc:/var/log/bookie-gc.log', see error log for details.
pulsar-bookkeeper-verify-clusterid Error: Could not create the Java Virtual Machine.
pulsar-bookkeeper-verify-clusterid Error: A fatal exception has occurred. Program will exit.
pulsar-bookkeeper-verify-clusterid Invalid -Xlog option '-Xlog:gc:/var/log/bookie-gc.log', see error log for details.
pulsar-bookkeeper-verify-clusterid Error: Could not create the Java Virtual Machine.
pulsar-bookkeeper-verify-clusterid Error: A fatal exception has occurred. Program will exit.

I resolved the error by removing the setting.
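
If file-based GC logging is wanted again later, one option is to pick the flag based on whether the target directory is writable by the current user. This is only a sketch, not what the PR does: gc_log_opt is a hypothetical helper, and the fallback assumes that a bare -Xlog:gc (JVM unified logging without a file output) is acceptable.

```shell
# Hypothetical helper: print a GC logging flag for the given directory.
# Falls back to "-Xlog:gc" (no file output) when the directory is missing
# or not writable by the current user.
gc_log_opt() {
  dir=$1
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    printf '%s\n' "-Xlog:gc:$dir/bookie-gc.log"
  else
    printf '%s\n' "-Xlog:gc"
  fi
}

# Example call; the result depends on the permissions of the user running it.
gc_log_opt /var/log
```

The check runs as the same user as the JVM, so it reflects the non-root permissions that caused the failure above.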

OpenShift Observations

I wanted to seamlessly support OpenShift, so I investigated configuring the bookkeeper and zookeeper processes with umask 002 so that they would create files and directories that are group writable (OpenShift has a stable group id, but gives the process a random user id). That worked for most tools when switching the user id, but not for RocksDB, which creates a lock file at /pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK with the permission 0644, ignoring the umask. Here is the relevant error:

2022-05-14T03:45:06,903+0000  ERROR org.apache.bookkeeper.server.Main - Failed to build bookie server
java.io.IOException: Error open RocksDB database
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:199) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:88) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.lambda$static$0(KeyValueStorageRocksDB.java:62) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.LedgerMetadataIndex.<init>(LedgerMetadataIndex.java:68) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.<init>(SingleDirectoryDbLedgerStorage.java:169) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.newSingleDirectoryDbLedgerStorage(DbLedgerStorage.java:150) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.initialize(DbLedgerStorage.java:129) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:818) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:152) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:52) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:304) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.server.Main.doMain(Main.java:226) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.server.Main.main(Main.java:208) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
Caused by: org.rocksdb.RocksDBException: while open a file for lock: /pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK: Permission denied
    at org.rocksdb.RocksDB.open(Native Method) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
    at org.rocksdb.RocksDB.open(RocksDB.java:239) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:196) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    ... 13 more

As such, in order to support OpenShift, I exposed the fsGroupChangePolicy, which allows for OpenShift support, but not necessarily seamless support.
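
The umask observation above can be illustrated with a little arithmetic. A umask only clears permission bits from the mode a program requests; it never adds any. The sketch below assumes, consistent with the 0644 seen on the LOCK file, that the file is requested with mode 0644 rather than the more common 0666. effective_mode is a hypothetical helper written for this example.

```shell
# Hypothetical helper: the mode a new file ends up with is the requested
# mode with the umask bits cleared (requested AND NOT umask), in octal.
effective_mode() {
  requested=$1
  mask=$2
  printf '%o\n' $(( $requested & ~$mask ))
}

effective_mode 0666 0002   # typical request: group write bit survives (664)
effective_mode 0644 0002   # explicit 0644 request: no group write bit to keep (644)
```

Under that assumption, no umask value can make the lock file group writable, which is why a different user id with the same group still gets "Permission denied".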

Contributor

@eolivelli eolivelli left a comment

The patch makes sense to me, but I am not a k8s expert, so please wait for other reviews before merging (also, CI failed...)

@michaeljmarshall
Member Author

michaeljmarshall commented May 17, 2022

The error comes from the bookie-init job in the Zookeeper TLS tests. We're seeing similar errors in #260. It fails running this code:

bin/apply-config-from-env.py conf/bookkeeper.conf; /pulsar/keytool/keytool.sh toolset ${HOSTNAME}.pulsar-ci-toolset.pulsar.svc.cluster.local true; if bin/bookkeeper shell whatisinstanceid; then
  echo "bookkeeper cluster already initialized";
else
  bin/bookkeeper shell initnewcluster;
fi

Specifically, the logs show that bin/bookkeeper shell whatisinstanceid connects to zookeeper, fails to find the instance id, and then bin/bookkeeper shell initnewcluster is run and times out while opening a connection to zookeeper. I looked at this a bit tonight, but couldn't find anything relevant. The one oddity in the bookie init logs is the last log line where it logs SSL handler added for channel. That time is somewhat close to the time that the zookeeper expires an old connection. It could just be a coincidence.

Bookie init logs:

[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:49:40,679+0000 [main] INFO  org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 1048575 Bytes
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:49:40,919+0000 [main] INFO  org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=false
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:49:41,161+0000 [main-SendThread(pulsar-ci-zookeeper:2281)] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server pulsar-ci-zookeeper/10.244.1.17:2281.
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:49:41,161+0000 [main-SendThread(pulsar-ci-zookeeper:2281)] INFO  org.apache.zookeeper.ClientCnxn - SASL config status: Will not attempt to authenticate using SASL (unknown error)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:50:21,359+0000 [main-SendThread(pulsar-ci-zookeeper:2281)] WARN  org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 40358ms for session id 0x0
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:50:22,663+0000 [main-SendThread(pulsar-ci-zookeeper:2281)] WARN  org.apache.zookeeper.ClientCnxn - An exception was thrown while closing send thread for session 0x0.
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 40358ms for session id 0x0
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1258) [org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:50:27,458+0000 [main] INFO  org.apache.zookeeper.ClientCnxnSocketNetty - channel is told closing
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:50:27,677+0000 [main] INFO  org.apache.zookeeper.ZooKeeper - Session: 0x0 closed
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:50:27,776+0000 [main] ERROR org.apache.bookkeeper.meta.zk.ZKMetadataDriverBase - Failed to create zookeeper client to pulsar-ci-zookeeper:2281
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase.waitForConnection(ZooKeeperWatcherBase.java:159) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$Builder.build(ZooKeeperClient.java:260) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.zk.ZKMetadataDriverBase.initialize(ZKMetadataDriverBase.java:207) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.zk.ZKMetadataBookieDriver.initialize(ZKMetadataBookieDriver.java:60) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.MetadataDrivers.runFunctionWithMetadataBookieDriver(MetadataDrivers.java:345) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.MetadataDrivers.runFunctionWithRegistrationManager(MetadataDrivers.java:372) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.client.BookKeeperAdmin.initNewCluster(BookKeeperAdmin.java:1278) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.tools.cli.commands.bookies.InitCommand.apply(InitCommand.java:56) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.bookie.BookieShell$InitNewCluster.runCmd(BookieShell.java:334) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.bookie.BookieShell$MyCommand.runCmd(BookieShell.java:238) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.bookie.BookieShell.run(BookieShell.java:2278) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.bookie.BookieShell.main(BookieShell.java:2369) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:50:27,432+0000 [globalEventExecutor-1-1] WARN  org.apache.zookeeper.ClientCnxnSocketNetty - future isn't success.
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] io.netty.util.concurrent.DefaultPromise$LeanCancellationException: null
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at io.netty.util.concurrent.DefaultPromise.cancel(...)(Unknown Source) ~[io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:50:27,722+0000 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x0
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] Exception in thread "main" com.google.common.util.concurrent.UncheckedExecutionException: Failed to create zookeeper client to pulsar-ci-zookeeper:2281
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.tools.cli.commands.bookies.InitCommand.apply(InitCommand.java:58)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.bookie.BookieShell$InitNewCluster.runCmd(BookieShell.java:334)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.bookie.BookieShell$MyCommand.runCmd(BookieShell.java:238)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.bookie.BookieShell.run(BookieShell.java:2278)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.bookie.BookieShell.main(BookieShell.java:2369)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] Caused by: org.apache.bookkeeper.meta.exceptions.MetadataException: Failed to create zookeeper client to pulsar-ci-zookeeper:2281
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.zk.ZKMetadataDriverBase.initialize(ZKMetadataDriverBase.java:227)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.zk.ZKMetadataBookieDriver.initialize(ZKMetadataBookieDriver.java:60)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.MetadataDrivers.runFunctionWithMetadataBookieDriver(MetadataDrivers.java:345)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.MetadataDrivers.runFunctionWithRegistrationManager(MetadataDrivers.java:372)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.client.BookKeeperAdmin.initNewCluster(BookKeeperAdmin.java:1278)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.tools.cli.commands.bookies.InitCommand.apply(InitCommand.java:56)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	... 4 more
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase.waitForConnection(ZooKeeperWatcherBase.java:159)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$Builder.build(ZooKeeperClient.java:260)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	at org.apache.bookkeeper.meta.zk.ZKMetadataDriverBase.initialize(ZKMetadataDriverBase.java:207)
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 	... 9 more
[pod/pulsar-ci-bookie-init-ffxw6/pulsar-ci-bookie-init] 2022-05-14T04:50:28,965+0000 [epollEventLoopGroup-2-1] INFO  org.apache.zookeeper.ClientCnxnSocketNetty - SSL handler added for channel: [id: 0xbcd9c9b9]

Zookeeper logs during the above timeout:

[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:40,253+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e001f, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:44,255+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e0020, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:49,717+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:49,900+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-broker,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:52,254+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e0022, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:52,575+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:52,643+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:54,252+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e0025, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:54,582+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-zookeeper,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:54,586+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.NettyServerCnxn - Processing ruok command from /127.0.0.1:43088
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:58,254+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e0026, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:49:59,714+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-broker,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:00,398+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-zookeeper,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:00,413+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.NettyServerCnxn - Processing ruok command from /127.0.0.1:43090
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:03,414+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:08,145+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:08,252+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e0028, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:08,331+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:10,202+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-broker,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:10,254+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e002b, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:15,393+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:19,342+0000 [epollEventLoopGroup-7-2] ERROR org.apache.zookeeper.server.NettyServerCnxnFactory - Unsuccessful handshake with session 0x0
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:20,254+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e002e, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:20,255+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e002c, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:22,179+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-broker,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:23,699+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:24,531+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:24,713+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-zookeeper,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:24,721+0000 [epollEventLoopGroup-7-2] INFO  org.apache.zookeeper.server.NettyServerCnxn - Processing ruok command from /127.0.0.1:43092
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:28,256+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e0031, timeout of 30000ms exceeded
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:28,630+0000 [epollEventLoopGroup-7-1] INFO  org.apache.zookeeper.server.auth.X509AuthenticationProvider - Authenticated Id 'CN=pulsar-ci-bookie,OU=IT Department,O=StreamNative,ST=San Francisco,C=US' for Scheme 'x509'
[pod/pulsar-ci-zookeeper-0/pulsar-ci-zookeeper] 2022-05-14T04:50:30,253+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x10000059c2e0032, timeout of 30000ms exceeded

@michaeljmarshall
Member Author

After talking with @lhotari, it looks like the issues are the known TLS/ZK 3.6.3 issues. I'll retry the tests a few times to see if they pass.

@maxsxu maxsxu left a comment

All good in Kubernetes (tested on GKE), but encountering some problems in OpenShift (tested on OC 4.10).

Setting fsGroup: 0 causes the following error:

create Pod pulsar-zookeeper-0 in StatefulSet pulsar-zookeeper failed error: pods "pulsar-zookeeper-0" is forbidden: unable to validate against any security context constraint: 
provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{0}: 0 is not an allowed group,

This is because OpenShift defines SecurityContextConstraints, so we might need to do more work to support deploying on OpenShift.

But we can still merge this PR and add OpenShift support in a separate PR.

@michaeljmarshall
Member Author

@maxsxu - thank you for testing. Do you know if OpenShift works when you set securityContext: {} for zookeeper and bookkeeper? I had assumed (perhaps incorrectly) the PR's current security context would work because the OpenShift documentation for how to create a docker image to run on OpenShift explicitly says:

Because the container user is always a member of the root group, the container user can read and write these files.

By setting securityContext: {}, the user should be a member of the root group, but I'm not sure that OpenShift will recursively make the persistent volumes root group writable.

@maxsxu
maxsxu commented May 21, 2022

@michaeljmarshall Unfortunately, it still does not work when setting securityContext: {}.

The Broker and Proxy keep initializing due to the error below:

WATCHER::
WatchedEvent state:SyncConnected type:None path:null
Node does not exist: /admin/clusters/pulsar
2022-05-21T07:45:10,734+0000 [main] ERROR org.apache.zookeeper.util.ServiceUtils - Exiting JVM with code 1
pulsar cluster pulsar isn't initialized yet ... check in 3 seconds ...

Logs from the ZK Pod:

org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /10.129.5.225:38320, session = 0x2002e4f5ac42ab8
    at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:163) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:326) [org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522) [org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154) [org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
2022-05-21T10:05:52,110+0000 [SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x2002e4f5ac42aa7, timeout of 30000ms exceeded
2022-05-21T10:05:52,110+0000 [SessionTracker] INFO org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x3002e4f34832aa7, timeout of 30000ms exceeded
2022-05-21T10:05:52,110+0000 [RequestThrottler] INFO org.apache.zookeeper.server.ZooKeeperServer - Submitting global closeSession request for session 0x2002e4f5ac42aa7
2022-05-21T10:05:52,110+0000 [RequestThrottler] INFO org.apache.zookeeper.server.ZooKeeperServer - Submitting global closeSession request for session 0x3002e4f34832aa7
2022-05-21T10:05:52,614+0000 [CommitProcessor:2] INFO org.apache.zookeeper.server.quorum.LeaderSessionTracker - Committing global session 0x2002e4f5ac42ab9
2022-05-21T10:05:52,956+0000 [NIOWorkerThread-1] WARN org.apache.zookeeper.server.NIOServerCnxn - Unexpected exception

From my observation, OpenShift generates a random non-zero fsGroup for the Pod when none is specified. So the group of the PV (the /pulsar/data directory) will be that random non-zero fsGroup rather than the root group.

We can observe the following inside the ZK Pod:

$ id
uid=1001060000(1001060000) gid=0(root) groups=0(root),1001060000
$ ls -al
total 84
drwxrwxr-x. 1 root       root          42 May 21 12:10 .
dr-xr-xr-x. 1 root       root          53 May 21 12:10 ..
-rw-r--r--. 1 root       root       32333 Jan 22  2020 LICENSE
-rw-r--r--. 1 root       root        6612 Jan 22  2020 NOTICE
-rw-r--r--. 1 root       root        1269 Jan 22  2020 README
drwxr-xr-x. 3 root       root        4096 Mar 26 04:02 bin
drwxrwxr-x. 1 root       root          28 Jan 22  2020 conf
drwxr-xr-x. 2 root       root        4096 Mar 26 04:05 connectors
drwxrwsr-x. 4 root       1001060000  4096 May 21 03:12 data
drwxr-xr-x. 3 root       root         132 Mar 26 04:02 examples
drwxr-xr-x. 4 root       root          66 Mar 26 04:02 instances
drwxr-xr-x. 3 root       root       20480 Mar 26 04:02 lib
drwxr-xr-x. 2 root       root        4096 Jan 22  2020 licenses
drwxr-xr-x. 2 1001060000 root          50 May 21 12:10 logs
drwxr-xr-x. 2 root       root          91 Mar 26 04:05 offloaders
drwxr-xr-x. 2 root       root          66 Mar 26 04:02 pulsar-client

So, as for "the container user is always a member of the root group...": yes indeed, but that does not apply to the PV's group.

@michaeljmarshall
Member Author

michaeljmarshall commented May 25, 2022

@maxsxu - thanks for testing out my theory. I realize now that the fsGroup does not technically need to be 0 or root in this configuration. A user deploying to OpenShift can choose any GID that is acceptable. The docker image will work correctly, because the .conf files are writable by the root group, and the configured fsGroup will own the PVCs. Essentially, OpenShift users will just need to select an fsGroup that passes the security context requirements.
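
A sketch of what such an override might look like follows. write_openshift_values is a hypothetical helper and openshift-values.yaml a made-up file name. The GID 1001060000 is simply the supplementary group from the id output earlier in this thread and will differ per cluster and namespace. "Always" follows the PR description's advice for environments where the user id changes between restarts.

```shell
# Hypothetical helper: write override values using a caller-chosen fsGroup.
# The GID must be one that the cluster's security context constraints accept.
write_openshift_values() {
  fs_group=$1
  out=$2
  cat > "$out" <<EOF
bookkeeper:
  securityContext:
    fsGroup: ${fs_group}
    fsGroupChangePolicy: "Always"
zookeeper:
  securityContext:
    fsGroup: ${fs_group}
    fsGroupChangePolicy: "Always"
EOF
}

# 1001060000 is only the example value seen in the ZK Pod above.
write_openshift_values 1001060000 openshift-values.yaml

# The file could then be passed to helm, for example (not run here):
#   helm upgrade test -f openshift-values.yaml charts/pulsar/
```

This has not been verified on OpenShift as part of this thread; it only shows the shape of the override.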

Contributor

@frankjkelly frankjkelly left a comment

Thanks @michaeljmarshall shouldn't the helm chart version change?

@michaeljmarshall
Member Author

Thanks @michaeljmarshall shouldn't the helm chart version change?

@frankjkelly good question. I know our recent release procedure has tied individual PRs with version bumps, but it seems more appropriate to me that we should separate releases and features into separate PRs.

Regarding releases, I think the Pulsar Community needs to revisit Helm Chart release procedures. We're operating in a gray area by performing releases that are not first voted upon on the dev mailing list. By my understanding, all Apache project releases are supposed to have a vote.

@frankjkelly
Contributor

@michaeljmarshall Thanks for the information - all good points - esp. the Apache requirement. The challenge I think becomes however that two Pulsar deployments with the same Helm Chart version could potentially behave very differently due to different configurations despite the "immutability" of the images.

@michaeljmarshall
Member Author

@frankjkelly - are you saying that a release is overwritten when we merge a PR without incrementing the version? I assumed it only published the version when the version number changed. If it is overwriting existing helm binaries, that definitely needs to be fixed.

@frankjkelly
Contributor

@michaeljmarshall I'm not sure if a new helm chart release is made even if the version number has not changed. But either way - even if no release is done - it's a source for confusion.

Member

@lhotari lhotari left a comment

LGTM

@michaeljmarshall michaeljmarshall merged commit 428736c into apache:master Jun 14, 2022
@michaeljmarshall michaeljmarshall deleted the prepare-for-non-root-container branch June 14, 2022 03:11
5 participants