pulsar-helm-chart

Author	SHA1	Message	Date
Rajan Dhabalia	89f28bca9c	Support mechanism to provide external zookeeper-server list to build global/configuration zookeeper (#269 ) * Support mechanism to provide external zookeeper-server list to build global/configuration zookeeper * Add external zk example * add external zk list into values.yaml Fixes #268 ### Motivation Right now, the [chart dynamically](https://github.com/apache/pulsar-helm-chart/blob/master/charts/pulsar/templates/zookeeper-statefulset.yaml#L140) creates a zk cluster with zk pods initialized in the same namespace. However, for a global/configuration zookeeper, the user needs to build zk clusters with pods deployed in different namespaces. Therefore, the user needs a mechanism to pass an external list of zk-servers to the chart and build a zk-cluster with pods across different namespaces. ### Modification - The chart should take the external zookeeper configuration from the zk values into account and generate a zk-configuration file with the appropriate zk-server list and the unique id of that zookeeper. This PR sets the `ZOOKEEPER_SERVERS` value provided by the user and also sets an override-value flag, which is used by [generate-zookeeper-config.sh](https://github.com/apache/pulsar/blob/master/docker/pulsar/scripts/generate-zookeeper-config.sh) to override the external zk list in the config file and assign the appropriate id to the host. https://github.com/apache/pulsar/pull/15987 makes the corresponding changes to [generate-zookeeper-config.sh](https://github.com/apache/pulsar/blob/master/docker/pulsar/scripts/generate-zookeeper-config.sh). ### Result - The user can add a `ZOOKEEPER_SERVERS` string to `zookeeper.configData` in the [Values.yaml](https://github.com/apache/pulsar-helm-chart/blob/master/charts/pulsar/values.yaml#L385) file to override the external zk-server list.	2022-10-18 17:41:43 -05:00
Lari Hotari	25f355e6e2	Use appVersion as default tag for Pulsar images (#200 ) Co-authored-by: Michael Marshall <mmarshall@apache.org> ### Motivation There was a suggestion [in a dev mailing list discussion](https://lists.apache.org/thread/bgkvcyt1qq6h67p2k8xwp89xlncbqn3d) that the Helm chart's appVersion should be used as the default image tag. ### Additional context There are some limitations in Helm. It is not possible to set "appVersion" from the command line. There is an open feature request https://github.com/helm/helm/issues/8194 to add such a feature to Helm. ### Modifications - change default values.yaml and set the tags for the images that use the Pulsar image to an empty value - add "defaultPulsarImageTag" to values.yaml - add a helper template "pulsar.imageFullName" that contains the logic to fall back to .Values.defaultPulsarImageTag and, if it's not set, to .Chart.AppVersion - use the helper template in all other templates that require the logic	2022-10-17 15:42:58 -05:00
HuynhKevin	3c59b43f28	Add imagePullSecrets zookeeper (#244 ) * Add imagePullSecrets for zookeeper * Add imagePullSecrets for zookeeper Co-authored-by: Kevin Huynh <khuynh@littlebigcode.fr> All components except zookeeper already have imagePullSecrets, which lets their pods initialize correctly without hitting the quota limit	2022-06-26 00:01:48 -05:00
Michael Marshall	428736c788	Add bk, zk securityContext to support upgrade to non-root docker image (#266 ) Master Issue: https://github.com/apache/pulsar/issues/11269 ### Motivation Apache Pulsar's docker images for 2.10.0 and above are non-root by default. In order to ensure there is a safe upgrade path, we need to expose the `securityContext` for the Bookkeeper and Zookeeper StatefulSets. Here is the relevant k8s documentation on this k8s feature: https://kubernetes.io/docs/tasks/configure-pod-container/security-context. Once released, all deployments using the default `values.yaml` configuration for the `securityContext` will pay a one time penalty on upgrade where the kubelet will recursively chown files to be root group writable. It's possible to temporarily avoid this penalty by setting `securityContext: {}`. ### Modifications * Add config blocks for the `bookkeeper.securityContext` and `zookeeper.securityContext`. * Default to `fsGroup: 0`. This is already the default group id in the docker image, and the docker image assumes the user has root group permission. * Default to `fsGroupChangePolicy: "OnRootMismatch"`. This configuration will work for all deployments where the user id is stable. If the user id switches between restarts, like it does in OpenShift, please set to `Always`. * Remove gc configuration writing to directory that the user lacks permission. (Perhaps we want to write to `/pulsar/log/bookie-gc.log`?) * Add documentation to the README. ### Verifying this change I first attempted verification of this change with minikube. It did not work because minikube uses hostPath volumes by default. I then tested on EKS v1.21.9-eks-0d102a7. I tested by deploying the current, latest version of the helm chart (2.9.3) and then upgrading to this PR's version of the helm chart along with using the 2.10.0 docker image. 
I also tested upgrading from a default version. Test 1 is a plain upgrade using the default 2.9.3 version of the chart, then upgrading to this PR's version of the chart with the modification to use the 2.10.0 docker images. It worked as expected. ```bash $ helm install test apache/pulsar $ # Wait for chart to deploy, then run the following, which uses Pulsar version 2.10.0: $ helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/ ``` Test 2 is a plain upgrade using the default 2.9.3 version of the chart, then an upgrade to this PR's version of the chart, then an upgrade to this PR's version of the chart using 2.10.0 docker images. There is a minor error described in the `README.md`. The solution is to chown the bookie's data directory. ```bash $ helm install test apache/pulsar $ # Wait for chart to deploy, then run the following, which uses Pulsar version 2.9.2: $ helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/ $ # Upgrade using Pulsar version 2.10.0 $ helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/ ``` ### GC Logging In my testing, I ran into the following errors when using `-Xlog:gc:/var/log/bookie-gc.log`: ``` pulsar-bookkeeper-verify-clusterid [0.008s] Error opening log file '/var/log/bookie-gc.log': Permission denied pulsar-bookkeeper-verify-clusterid [0.008s] Initialization of output 'file=/var/log/bookie-gc.log' using options '(null)' failed. pulsar-bookkeeper-verify-clusterid [0.005s] Error opening log file '/var/log/bookie-gc.log': Permission denied pulsar-bookkeeper-verify-clusterid [0.006s] Initialization of output 'file=/var/log/bookie-gc.log' using options '(null)' failed. pulsar-bookkeeper-verify-clusterid Invalid -Xlog option '-Xlog:gc:/var/log/bookie-gc.log', see error log for details. pulsar-bookkeeper-verify-clusterid Error: Could not create the Java Virtual Machine. pulsar-bookkeeper-verify-clusterid Error: A fatal exception has occurred. Program will exit. 
pulsar-bookkeeper-verify-clusterid Invalid -Xlog option '-Xlog:gc:/var/log/bookie-gc.log', see error log for details. pulsar-bookkeeper-verify-clusterid Error: Could not create the Java Virtual Machine. pulsar-bookkeeper-verify-clusterid Error: A fatal exception has occurred. Program will exit. ``` I resolved the error by removing the setting. ### OpenShift Observations I wanted to seamlessly support OpenShift, so I investigated configuring the bookkeeper and zookeeper processes with `umask 002` so that they would create files and directories that are group writable (OpenShift has a stable group id, but gives the process a random user id). That worked for most tools when switching the user id, but not for RocksDB, which creates a lock file at `/pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK` with the permission `0644`, ignoring the umask. Here is the relevant error: ``` 2022-05-14T03:45:06,903+0000 ERROR org.apache.bookkeeper.server.Main - Failed to build bookie server java.io.IOException: Error open RocksDB database at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:199) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:88) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.lambda$static$0(KeyValueStorageRocksDB.java:62) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.bookie.storage.ldb.LedgerMetadataIndex.<init>(LedgerMetadataIndex.java:68) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.<init>(SingleDirectoryDbLedgerStorage.java:169) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at 
org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.newSingleDirectoryDbLedgerStorage(DbLedgerStorage.java:150) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.initialize(DbLedgerStorage.java:129) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:818) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:152) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:52) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:304) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.server.Main.doMain(Main.java:226) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] at org.apache.bookkeeper.server.Main.main(Main.java:208) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] Caused by: org.rocksdb.RocksDBException: while open a file for lock: /pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK: Permission denied at org.rocksdb.RocksDB.open(Native Method) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?] at org.rocksdb.RocksDB.open(RocksDB.java:239) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?] at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:196) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4] ... 13 more ``` As such, in order to support OpenShift, I exposed the `fsGroupChangePolicy`, which allows for OpenShift support, but not necessarily _seamless_ support.	2022-06-13 22:11:13 -05:00
Lari Hotari	1c4f745941	Improve Zookeeper "ruok" probes: use TLS port when TLS is enabled, specify "-q 1" for nc (#223 ) - NOTICE: we no longer use "bin/pulsar-zookeeper-ruok.sh" from the apachepulsar/pulsar docker image. The probe script is part of the chart. * Pass "-q 1" to netcat (nc) to fix issue with Zookeeper ruok probe - see https://github.com/apache/pulsar/pull/14088 * Send ruok to TLS port when TLS is enabled * Bump chart version	2022-02-17 07:48:20 +02:00
Lari Hotari	22f4b9b3bd	Wrap Zookeeper probe script with timeout command (#214 ) so that the probe doesn't continue running indefinitely - resolves the issue with Kubernetes <1.20 "Before Kubernetes 1.20, the field timeoutSeconds was not respected for exec probes: probes continued running indefinitely, even past their configured deadline, until a result was returned." in https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes - #179 already fixed the issue for Kubernetes 1.20+	2022-01-26 15:17:15 +02:00
Aaron Johnson	cee3b5c5e6	added additionalCommand parameter (#150 ) Co-authored-by: Aaron Johnson <aaron.johnson@crowdstrike.com>	2022-01-05 10:26:55 -06:00
Lari Hotari	b4b2fa7b80	[Security] Workaround for CVE-2021-44228 Log4J RCE when Log4J >= 2.10.0 (#186 ) * [Security] Workaround for CVE-2021-44228 Log4J RCE when Log4J >= 2.10.0 - prevents the exploit by disabling message pattern lookups * Bump the chart version	2021-12-10 18:30:01 +02:00
Lari Hotari	a16c6bbf19	Make k8s probe timeoutSeconds configurable and set default to 5s for k8s 1.20+ compatibility (#179 ) - set to 5 seconds by default - address compatibility with Kubernetes 1.20+. This impacts "bin/pulsar-zookeeper-ruok.sh" exec probe used in ZK. "Before Kubernetes 1.20, the field timeoutSeconds was not respected for exec probes: probes continued running indefinitely, even past their configured deadline, until a result was returned." https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes	2021-11-25 08:46:42 +01:00
Peter Tinti	f307cc32af	updates pulsar ca name generation to use suffix making cert swappable (#141 ) Makes CA name generation configurable so that a different CA can be swapped in. ### Motivation We recently swapped out cert issuers and found that with the current helm chart we were unable to do a hot swap without downtime (via helm) because the CA cert name is not configurable. Being able to change the name of the CA allows us to create a new CA first -> Validate -> then swap over in a follow-up apply/release. ### Modifications Adds the ability to specify the suffix used to generate the CA name (not the whole name, in order to preserve backward compatibility regardless of the release name.)	2021-08-25 23:14:03 -07:00
Aaron Johnson	c45813ffe5	added extraVolumes and extraVolumeMounts (#149 ) Fixes #147 ### Motivation This gives the helm chart user the ability to specify a secret or other type of volume to be mounted into any of the statefulset pods ### Modifications * Added conditionals to `bookkeeper`, `broker`, `proxy`, `toolset`, and `zookeeper` statefulsets which allow the chart user to specify extraVolumes and extraVolumeMounts for deployed pods. * Added `extraVolumes` and `extraVolumeMounts` parameters to values.yaml	2021-08-25 23:13:27 -07:00
MMeent	11a1d578dd	Fix indentation issue on `checksum/config` (#117 ) Fixes #116 ### Motivation There are indentation issues for the `checksum/config` annotation in these templates, which would either fail linting or not apply at all in some situations. ### Modifications I've added indentation at the specified places such that this isn't an issue anymore. ### Verifying this change - [ ] Make sure that the change passes the CI checks.	2021-06-23 21:11:38 -07:00
Miecio	23ba8ac948	Fix for missing PSP for bookie initialize and other (#101 ) ### Motivation When using the standard bookkeeper installation on a PSP cluster, initialization fails because it has to be started as root ### Modifications Add the same ServiceAccount and SecurityContext for bookkeeper-cluster-initialize as in the bookkeeper specification. UPDATE: It seems that when using in-cluster TLS encryption, other components also require RW access to the root FS, so I added PSPs for proxy, zookeeper, broker and toolset ### Verifying this change - [x] Make sure that the change passes the CI checks.	2021-01-30 09:22:52 -08:00
Miloš Matijašević	c2f672881e	Updating pods on configmap change (#73 ) Fixes #71 ### Motivation Pods are not restarted when config maps change after the values.yaml file is edited, so they need to be restarted manually in order to pick up new values from the config map. ### Modifications A `restartPodsOnConfigMapChange` flag is added for each component in the values.yaml file to control whether pods restart on configmap change; the default is `false`. The statefulset template for each component now adds an annotation containing the hash of the corresponding configmap if `restartPodsOnConfigMapChange` is `true`, which causes pods to restart when the corresponding configmap has changed (https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments). ### Verifying this change - [ ] Make sure that the change passes the CI checks.	2021-01-07 21:28:11 -08:00
Jean Helou	6c9856a1af	Use `.Release.Namespace` by default to handle namespaces (#80 ) It remains possible to override the current release namespace by setting the `namespace` value, though this may lead to having the helm metadata and the pulsar components in different namespaces Fixes #66 ### Motivation Trying to deploy the chart in a namespace using the usual helm pattern fails, for example: ``` kubectl create ns pulsartest helm upgrade --install pulsar -n pulsartest apache/pulsar Error: namespaces "pulsar" not found ``` Fixing that while keeping the helm metadata and the deployed objects in the same namespace requires declaring the namespace twice: ``` kubectl create ns pulsartest helm upgrade --install pulsar -n pulsartest apache/pulsar --set namespace=pulsartest Error: namespaces "pulsar" not found ``` This is needlessly confusing for newcomers who follow the helm documentation and is contrary to helm best practices. ### Modifications I changed the chart to use the context namespace `.Release.Namespace` by default while preserving the ability to override that by explicitly providing a namespace on the command line. With this modification, both examples behave as expected ### Verifying this change - [x] Make sure that the change passes the CI checks.	2020-12-03 19:32:05 -08:00
Lari Hotari	6c2edba8b1	Get OS signals passed to container process by using shell built-in "exec" (#59 ) ### Changes - using "exec" to run a command replaces the shell process with the executed process - this is required so that the process running in the container is able to receive OS signals - explained in https://docs.docker.com/develop/develop-images/dockerfile_best-practices/ and https://docs.docker.com/engine/reference/builder/#entrypoint - receiving the SIGTERM signal is required for graceful shutdown. This is explained in https://pracucci.com/graceful-shutdown-of-kubernetes-pods.html This change might fix issues such as https://github.com/apache/pulsar/issues/6603 . One expectation of this fix is that graceful shutdown would allow Pulsar components such as bookies to deregister from Zookeeper properly before shutdown. ### Motivation Dockerfile best practices mention that "exec" should be used so that the process running in a container can receive OS signals. This is explained in https://docs.docker.com/develop/develop-images/dockerfile_best-practices/ and https://docs.docker.com/engine/reference/builder/#entrypoint . Kubernetes documentation explains pod termination in https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination : "Typically, the container runtime sends a TERM signal to the main process in each container. Once the grace period has expired, the KILL signal is sent to any remaining processes, and the Pod is then deleted from the API Server ." Currently some issues while running Pulsar are caused by the lack of graceful shutdown. Graceful shutdown isn't happening at all since the Pulsar processes never receive the TERM signal that would allow graceful shutdown. This PR fixes that. This PR was inspired by https://github.com/kafkaesque-io/pulsar-helm-chart/pull/31	2020-08-30 23:05:49 -06:00
Thomas O'Neill	207d697bed	Fix zookeeper antiaffinity (#52 ) Fixes #39 ### Motivation The match expression for the "app" label was incorrect breaking the antiaffinity since they would never match. Fixing this makes the podAntiAffinity work, but now requires at least N nodes to be in the cluster where N = largest replica set with affinity. Added the option to set the affinity type to preferredDuringSchedulingIgnoredDuringExecution where it will try to follow the affinity, but will still deploy a pod if it needs to break it. ### Modifications - Fixed app matchExpression - Added option to set the affinity type - bumped chart version ### Verifying this change - [X] Make sure that the change passes the CI checks.	2020-08-13 10:19:01 -07:00
John Harris	6b92881149	Add zookeeper metrics port and PodMonitors (#44 ) * Add 'http' port specification to zookeeper statefulset This brings the zookeeper spec in line with the other statefulset specs in this chart and provides a port target for custom podMonitors * Added PodMonitors for bookie, broker, proxy, and zookeeper New PodMonitors are needed for prometheus-operator to pick up scrape targets. Defaults to disabled so users need to opt in to deploy * Added Apache license info to podmonitor yamls	2020-07-23 10:34:43 +08:00
Sijie Guo	1c8a434ef6	Don't substitute environment variables (#28 ) Motivation environment variables are already taken by bash scripts. We don't need to substitute them.	2020-06-25 20:24:03 -07:00
Sijie Guo	0338d17b89	Publish chart index to gh-pages branch (#3 ) Motivation Release helm chart when new tags are created	2020-04-21 02:44:58 -07:00
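
The `exec` behaviour described in commit 6c2edba8b1 above can be illustrated with a small shell sketch. It is not taken from the chart's templates: the variable names and the `same_pid` helper are hypothetical, and the sketch only compares process IDs to show that `exec` replaces the wrapper shell, which is the property that lets a TERM signal reach the real process.

```shell
#!/bin/sh
# Illustration only: compare the PID of a wrapper shell with the PID of the
# command it starts, with and without the "exec" built-in.

# With exec: the inner shell replaces the wrapper, so both PIDs are identical
# and a signal sent to the wrapper's PID is delivered to the real process.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

# Without exec: the wrapper forks a child and stays alive as its parent.
# The trailing ":" keeps the child from being the last command, so a shell
# cannot silently turn it into an exec.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; :')

# Hypothetical helper: print "yes" when the two lines of its argument are
# equal, "no" otherwise.
same_pid() {
  first=$(printf '%s\n' "$1" | sed -n 1p)
  second=$(printf '%s\n' "$1" | sed -n 2p)
  if [ "$first" = "$second" ]; then echo yes; else echo no; fi
}

same_with_exec=$(same_pid "$with_exec")
same_without_exec=$(same_pid "$without_exec")
echo "with exec, same PID: $same_with_exec"
echo "without exec, same PID: $same_without_exec"
```

In a container, the wrapper shell is typically the main process, so the first case is the one in which the signal sent by the container runtime arrives at the Pulsar process itself.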

20 Commits