* Allow using selectors with volumeClaimTemplates
* Fixed naming inconsistency, added null value
Co-authored-by: Claudio Vellage <claudio.vellage@pm.me>
Co-authored-by: Michael Marshall <mmarshall@apache.org>
### Motivation
Currently it's not possible to use selectors with `volumeClaimTemplates`, which makes it hard or impossible to bind statically provisioned PVs.
### Modifications
Added (optional) selectors to `volumeClaimTemplates` and documented them in the values file.
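As an illustration of what the rendered StatefulSet gains from this, here is a minimal sketch of a `volumeClaimTemplates` entry with a selector; the label values are purely illustrative, not the chart's actual defaults:
```yaml
volumeClaimTemplates:
  - metadata:
      name: journal
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
      # Only PVs carrying this label can be bound, which is what makes
      # statically provisioned PVs usable.
      selector:
        matchLabels:
          app: pulsar-bookkeeper-journal
```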
### Verifying this change
- [ ] Make sure that the change passes the CI checks.
* Allow specifying the nodeSelector for the init jobs
* Use pulsar_metadata.nodeSelector
Co-authored-by: samuel <samuel.verstraete@aprimo.com>
### Motivation
When deploying Pulsar to an AKS cluster with Windows node pools, I was unable to specify that the Jobs of the initialize release had to run on Linux nodes. With this change I can now specify a node selector for the init jobs.
### Modifications
Add a nodeSelector to the pulsar_init and bookie_init jobs.
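For example, a hedged values snippet: the `pulsar_metadata.nodeSelector` key comes from this change, and the label is the well-known Kubernetes OS label, which keeps the init Jobs off Windows nodes:
```yaml
pulsar_metadata:
  nodeSelector:
    kubernetes.io/os: linux   # schedule the init Jobs onto Linux nodes only
```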
### Verifying this change
- [ ] Make sure that the change passes the CI checks.
### Motivation
In #269, we added a way to configure external zookeeper servers. However, it was added to the wrong section of the zookeeper config. The `zookeeper.configData` section is mapped directly into the zookeeper configmap.
### Modifications
Move `zookeeper.configData.ZOOKEEPER_SERVERS` to `zookeeper.externalZookeeperServerList`
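A minimal sketch of the corrected values layout (server addresses are illustrative):
```yaml
zookeeper:
  # previously (incorrectly) placed under zookeeper.configData.ZOOKEEPER_SERVERS,
  # which is rendered verbatim into the ZooKeeper ConfigMap
  externalZookeeperServerList: "zk-0.example.com:2181,zk-1.example.com:2181"
```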
### Verifying this change
This is a cosmetic change on an unreleased feature.
* Replace monitoring solution with kube-prometheus-stack dependency
* Enable pod monitors
* Download necessary chart dependencies for CI
* Actually run dependency update
* Enable missed podMonitor
* Disable alertmanager by default for feature parity
Related issues: #294, #65
Supersedes #296 and #297
### Motivation
Our helm chart is out of date. I propose we make a breaking change for the monitoring solution and start using the `kube-prometheus-stack` as a dependency. This should make upgrades easier and will let users leverage all of that chart's features.
This change will result in the removal of the StreamNative Grafana Dashboards. We'll need to figure out the right way to address that. The apache/pulsar project has grafana dashboards, but they have not been maintained. With this added dependency, we'll have the benefit of being able to use k8s `ConfigMap`s to configure grafana dashboards.
### Modifications
* Remove old prometheus and grafana configuration
* Add the kube-prometheus-stack chart as a dependency (see the sketch after this list)
* Enable several components by default. I am not opinionated on these; the choices are based on the other values in the chart.
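A hypothetical `Chart.yaml` dependency entry for this change; the version is a placeholder and the `condition` key is an assumption:
```yaml
dependencies:
  - name: kube-prometheus-stack
    repository: https://prometheus-community.github.io/helm-charts
    version: "x.y.z"                          # pin to whatever the chart actually ships with
    condition: kube-prometheus-stack.enabled  # assumed toggle in values.yaml
```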
### Verifying this change
This is a large change that will require manual validation, and may break deployments. I propose this triggers a helm chart 3.0.0 release.
Co-authored-by: Stepan Mazurov <smazurov@quantummetric.com>
### Motivation
In #204, the API version of the cert resources was updated to v1. This was insufficient because `v1` has a different spec from `v1alpha1`.
This MR finishes the work that #204 and @lhotari started.
### Modifications
Changed the spec of certs to match v1 cert manager spec.
### Verifying this change
- [ ] Make sure that the change passes the CI checks.
Co-authored-by: Michael Marshall <mmarshall@apache.org>
### Motivation
There was a suggestion [in a dev mailing list discussion](https://lists.apache.org/thread/bgkvcyt1qq6h67p2k8xwp89xlncbqn3d) that the Helm chart's appVersion should be used as the default image tag.
### Additional context
There are some limitations in Helm. It is not possible to set "appVersion" from the command line. There is an open feature request https://github.com/helm/helm/issues/8194 to add such a feature to Helm.
### Modifications
- change the default values.yaml and set the tags for the images that use the Pulsar image to an empty value
- add "defaultPulsarImageTag" to values.yaml
- add a helper template "pulsar.imageFullName" that contains the logic to fall back to .Values.defaultPulsarImageTag and, if that's not set, to .Chart.AppVersion
- use the helper template in all other templates that require this logic (a sketch of the helper follows this list)
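A minimal sketch of the fallback logic described above, hard-wired to the broker image for readability; the `images.broker.*` value paths and the argument-free form are assumptions, and the chart's real helper may differ:
```yaml
{{- define "pulsar.imageFullName" -}}
{{- /* empty tag -> defaultPulsarImageTag -> chart appVersion */ -}}
{{- printf "%s:%s" .Values.images.broker.repository (.Values.images.broker.tag | default .Values.defaultPulsarImageTag | default .Chart.AppVersion) -}}
{{- end }}
```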
* Add nodeSelector to cluster initialize pod
* Add option to values file
* Update charts/pulsar/templates/pulsar-cluster-initialize.yaml
Co-authored-by: Michael Marshall <mikemarsh17@gmail.com>
* Fix typo in values
Co-authored-by: Michael Marshall <mikemarsh17@gmail.com>
### Motivation
Add an option to choose where the pulsar-cluster-initialize pod runs. Sometimes it is necessary to run it only on certain nodes.
### Modifications
Added nodeSelector option to the pulsar-cluster-initialize job.
Fixes https://github.com/apache/pulsar-helm-chart/issues/250
### Motivation
`httpNumThreads` is hardcoded to 8 in `charts/pulsar/templates/proxy-configmap.yaml`
When trying to override it in `values.yaml` using `proxy.configData.httpNumThreads`, we get an error because the key gets duplicated.
This happens because `{{ toYaml .Values.proxy.configData | indent 2 }}` doesn't deduplicate the keys, and there is no other way to set `httpNumThreads`.
### Modifications
Removing the key from charts/pulsar/templates/proxy-configmap.yaml and adding it to values.yaml solves the problem.
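With the key removed from the template, an override is now just a plain values entry, e.g.:
```yaml
proxy:
  configData:
    httpNumThreads: "16"   # previously this collided with the value hardcoded in proxy-configmap.yaml
```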
### Verifying this change
- [x] Make sure that the change passes the CI checks.
Master Issue: https://github.com/apache/pulsar/issues/11269
### Motivation
Apache Pulsar's docker images for 2.10.0 and above are non-root by default. In order to ensure there is a safe upgrade path, we need to expose the `securityContext` for the Bookkeeper and Zookeeper StatefulSets. Here is the relevant k8s documentation on this feature: https://kubernetes.io/docs/tasks/configure-pod-container/security-context.
Once released, all deployments using the default `values.yaml` configuration for the `securityContext` will pay a one-time penalty on upgrade, where the kubelet recursively chowns files to be root-group writable. It's possible to temporarily avoid this penalty by setting `securityContext: {}`.
### Modifications
* Add config blocks for the `bookkeeper.securityContext` and `zookeeper.securityContext`.
* Default to `fsGroup: 0`. This is already the default group id in the docker image, and the docker image assumes the user has root group permission.
* Default to `fsGroupChangePolicy: "OnRootMismatch"`. This configuration will work for all deployments where the user id is stable. If the user id switches between restarts, like it does in OpenShift, please set to `Always`.
* Remove gc configuration writing to directory that the user lacks permission. (Perhaps we want to write to `/pulsar/log/bookie-gc.log`?)
* Add documentation to the README.
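A sketch of the resulting defaults described in the list above (set `securityContext: {}` to skip the one-time chown on upgrade):
```yaml
bookkeeper:
  securityContext:
    fsGroup: 0
    fsGroupChangePolicy: "OnRootMismatch"   # use "Always" if the user id changes between restarts (e.g. OpenShift)
zookeeper:
  securityContext:
    fsGroup: 0
    fsGroupChangePolicy: "OnRootMismatch"
```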
### Verifying this change
I first attempted verification of this change with minikube. It did not work because minikube uses hostPath volumes by default. I then tested on EKS v1.21.9-eks-0d102a7. I tested by deploying the current, latest version of the helm chart (2.9.3) and then upgrading to this PR's version of the helm chart along with the 2.10.0 docker image. I also tested upgrading from a default version of the chart.
Test 1 is a plain upgrade using the default 2.9.3 version of the chart, then upgrading to this PR's version of the chart with the modification to use the 2.10.0 docker images. It worked as expected.
```bash
$ helm install test apache/pulsar
$ # Wait for chart to deploy, then run the following, which uses Pulsar version 2.10.0:
$ helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/
```
Test 2 is a plain upgrade using the default 2.9.3 version of the chart, then an upgrade to this PR's version of the chart, then an upgrade to this PR's version of the chart using 2.10.0 docker images. There is a minor error described in the `README.md`. The solution is to chown the bookie's data directory.
```bash
$ helm install test apache/pulsar
$ # Wait for chart to deploy, then run the following, which uses Pulsar version 2.9.2:
$ helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/
$ # Upgrade using Pulsar version 2.10.0
$ helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/
```
### GC Logging
In my testing, I ran into the following errors when using `-Xlog:gc:/var/log/bookie-gc.log`:
```
pulsar-bookkeeper-verify-clusterid [0.008s] Error opening log file '/var/log/bookie-gc.log': Permission denied
pulsar-bookkeeper-verify-clusterid [0.008s] Initialization of output 'file=/var/log/bookie-gc.log' using options '(null)' failed.
pulsar-bookkeeper-verify-clusterid [0.005s] Error opening log file '/var/log/bookie-gc.log': Permission denied
pulsar-bookkeeper-verify-clusterid [0.006s] Initialization of output 'file=/var/log/bookie-gc.log' using options '(null)' failed.
pulsar-bookkeeper-verify-clusterid Invalid -Xlog option '-Xlog:gc:/var/log/bookie-gc.log', see error log for details.
pulsar-bookkeeper-verify-clusterid Error: Could not create the Java Virtual Machine.
pulsar-bookkeeper-verify-clusterid Error: A fatal exception has occurred. Program will exit.
pulsar-bookkeeper-verify-clusterid Invalid -Xlog option '-Xlog:gc:/var/log/bookie-gc.log', see error log for details.
pulsar-bookkeeper-verify-clusterid Error: Could not create the Java Virtual Machine.
pulsar-bookkeeper-verify-clusterid Error: A fatal exception has occurred. Program will exit.
```
I resolved the error by removing the setting.
### OpenShift Observations
I wanted to seamlessly support OpenShift, so I investigated configuring the bookkeeper and zookeeper processes with `umask 002` so that they would create files and directories that are group writable (OpenShift has a stable group id, but gives the process a random user id). That worked for most tools when switching the user id, but not for RocksDB, which creates a lock file at `/pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK` with permission `0644`, ignoring the umask. Here is the relevant error:
```
2022-05-14T03:45:06,903+0000 ERROR org.apache.bookkeeper.server.Main - Failed to build bookie server
java.io.IOException: Error open RocksDB database
at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:199) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:88) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.lambda$static$0(KeyValueStorageRocksDB.java:62) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.bookie.storage.ldb.LedgerMetadataIndex.<init>(LedgerMetadataIndex.java:68) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.<init>(SingleDirectoryDbLedgerStorage.java:169) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.newSingleDirectoryDbLedgerStorage(DbLedgerStorage.java:150) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.initialize(DbLedgerStorage.java:129) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:818) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:152) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:52) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:304) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.server.Main.doMain(Main.java:226) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
at org.apache.bookkeeper.server.Main.main(Main.java:208) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
Caused by: org.rocksdb.RocksDBException: while open a file for lock: /pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK: Permission denied
at org.rocksdb.RocksDB.open(Native Method) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
at org.rocksdb.RocksDB.open(RocksDB.java:239) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:196) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
... 13 more
```
As such, I exposed the `fsGroupChangePolicy`, which allows for OpenShift support, though not necessarily _seamless_ support.
* Bump version to `2.9.2`
* Because the latest Pulsar image is based on Java 11, some JVM parameters for printing GC information are no longer supported; switch to the new JVM parameters (see the sketch after this list). Refer to https://docs.oracle.com/en/java/javase/11/tools/java.html#GUID-BE93ABDC-999C-4CB5-A88B-1994AAAC74D5 and https://issues.redhat.com/browse/CLOUD-3040.
original param | new param
--|--
`-XX:+PrintGCDetails` | `-Xlog:gc*`
`-XX:+PrintGCApplicationStoppedTime` | `-Xlog:safepoint`
`-XX:+PrintHeapAtGC` | `-Xlog:gc+heap=trace`
`-XX:+PrintGCTimeStamps` | `-Xlog:gc::utctime`
* remove JVM param `-XX:G1LogLevel=finest`
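A sketch of how the new flags from the table above might be wired in, assuming the chart passes GC options through a `PULSAR_GC` entry in `configData`; the exact wiring in the chart may differ:
```yaml
broker:
  configData:
    # replaces the old -XX:+PrintGC* flags with Java 11 unified logging options
    PULSAR_GC: "-XX:+UseG1GC -Xlog:gc* -Xlog:safepoint -Xlog:gc+heap=trace -Xlog:gc::utctime"
```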
* Add multi-volume support in bookkeeper. (#112)
* Add multi-volume support in the bookkeeper configmap.
Co-authored-by: druidliu <druidliu@tencent.com>
Fixes #112
### Motivation
Add an option for the user to choose whether to use multiple volumes in bookkeeper, especially when using `local-storage`.
### Modifications
Add `useMultiVolumes` option under `.Values.bookkeeper.volumes.journal` and `.Values.bookkeeper.volumes.ledgers`.
Users can choose how many volumes to use for the bookkeeper journal or ledgers (see the sketch below).
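A hypothetical values snippet; only `useMultiVolumes` is named by this change, the surrounding keys are illustrative:
```yaml
bookkeeper:
  volumes:
    journal:
      useMultiVolumes: true
      multiVolumes:           # illustrative: one PVC per entry
        - name: journal0
          size: 10Gi
        - name: journal1
          size: 10Gi
```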
### Verifying this change
- [x] Make sure that the change passes the CI checks.
Updates CA name generation to be configurable, allowing a new CA to be swapped in.
### Motivation
We recently swapped out cert issuers and found that, with the current helm chart, we were unable to do a hot swap without downtime (via helm) because the CA cert name is not configurable. Being able to change the name of the CA allows us to create a new CA first -> validate -> then swap over in a follow-up apply/release.
### Modifications
Adds the ability to specify the suffix used to generate the CA name (not the whole name, in order to preserve backward compatibility regardless of the release name).
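A hypothetical template fragment showing the idea; the value name is illustrative and the default preserves the existing `-ca-tls` name:
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  # suffix is configurable; the default keeps the old "<release>-ca-tls" name
  name: {{ .Release.Name }}-{{ .Values.tls.ca_suffix | default "ca-tls" }}
```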
Fixes #147
### Motivation
This gives the helm chart user the ability to specify a secret or other type of volume to be mounted into any of the statefulset pods.
### Modifications
* Added conditionals to `bookkeeper`, `broker`, `proxy`, `toolset`, and `zookeeper` statefulsets which allow the chart user to specify extraVolumes and extraVolumeMounts for deployed pods.
* Added `extraVolumes` and `extraVolumeMounts` parameters to values.yaml
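For example (names and paths are illustrative):
```yaml
broker:
  extraVolumes:
    - name: extra-secret
      secret:
        secretName: my-app-secret
  extraVolumeMounts:
    - name: extra-secret
      mountPath: /pulsar/extra-secret
      readOnly: true
```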
Fixes #<xyz>
### Motivation
It would be nice to have this option so people can run admin commands against Prometheus.
### Modifications
Added a new value and modified the deployment, adapted from the official Prometheus Helm chart.
### Verifying this change
- [ ] Make sure that the change passes the CI checks.
### Motivation
* While component certs can be configured with a custom duration, the CA cert for the self-signed configuration uses default values. It can be convenient to have this certificate expire more than a month out.
### Modifications
* Updates the internal issuer `{{ .Release.Name }}-ca-tls` certificate to make `duration` and `renewBefore` configurable. Does not use `common` so that the CA can be configured to last much longer than individual component certs if desired.
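A hedged values sketch; the key path is assumed to live under the existing `certs.internal_issuer` section:
```yaml
certs:
  internal_issuer:
    duration: 2160h     # 90 days; cert-manager expects a Go-style duration
    renewBefore: 360h   # start renewal 15 days before expiry
```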
### Verifying this change
- [x] Make sure that the change passes the CI checks.
This commit lets users override the apiVersion referenced in this
chart so that the chart can be used with newer cert-manager releases.
(script/cert-manager/install-cert-manager.sh installs 0.13.0 while the
current version is 1.2.0...)
Fixes #68
### Motivation
The cert-manager apiVersion changed after cert-manager 1.0.0 was released, which prevents the chart from provisioning certificates with newer cert-manager installations because of an incompatible apiVersion.
I have a cluster with cert-manager >1.0.0 installed; making `apiVersion` overridable makes it easy for me to install Pulsar on that cluster.
### Modifications
I introduced the value `certs.internal_issuer.apiVersion`, which by default uses the apiVersion that was previously hardcoded (`cert-manager.io/v1alpha2`).
I replaced all occurrences of that apiVersion with a reference to the value so that users can override it to `cert-manager.io/v1` if they have a newer version of cert-manager installed.
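For example:
```yaml
certs:
  internal_issuer:
    apiVersion: cert-manager.io/v1   # default remains cert-manager.io/v1alpha2 for older installations
```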
### Verifying this change
- [x] Make sure that the change passes the CI checks.
### Motivation
In some cases, my k8s node has only one large-capacity SSD, so to deploy one bookie I need to either:
- Partition the SSD into two disks and create two PVs over them.
- Create just one PV over it, with the journal and ledgers under the same mount path (what this PR does).
Neither option isolates I/O for the journal and ledgers, so I prefer the second one for reusability.
### Modifications
values.yaml:
- add the `useSingleCommonVolume` option, default `false` (see the sketch after this list)
bookkeeper-statefulset.yaml:
- mount the single PV at `/pulsar/data/bookkeeper`
- use the configured common storageClassName
bookkeeper-storageclass.yaml:
- use the configured provisioner for the common storageClass
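A hypothetical values snippet; only `useSingleCommonVolume` is named by this change, the other keys are illustrative:
```yaml
bookkeeper:
  volumes:
    useSingleCommonVolume: true
    common:
      size: 100Gi
      storageClassName: local-storage   # journal and ledgers share this one PV under /pulsar/data/bookkeeper
```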
### Others
This may not be an issue for everyone; if it's not necessary to merge, I'll just use it locally.
### Verifying this change
- [x] Make sure that the change passes the CI checks.
### Motivation
As I wanted to use [streamnative/apache-pulsar-grafana-dashboard](https://github.com/streamnative/apache-pulsar-grafana-dashboard) with this helm chart and my own cluster-wide Prometheus stack, I decided that using the PodMonitor CRD is a good way to do it. Unfortunately, the Prometheus config has some metric relabelings that are required by the Grafana dashboard. I decided to port them directly to the PodMonitor definitions.
### Modifications
* Added missing PodMonitor for autorecovery
* Port relabelings from `prometheus-configmap.yaml` to each PodMonitor
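An illustrative-only sketch of the shape of such a PodMonitor; the actual labels and relabeling rules in the chart may differ:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: pulsar-autorecovery
spec:
  selector:
    matchLabels:
      component: autorecovery
  podMetricsEndpoints:
    - port: http
      relabelings:
        # example of a rule ported from prometheus-configmap.yaml
        - sourceLabels: [__meta_kubernetes_pod_label_cluster]
          targetLabel: cluster
```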
### Verifying this change
- [x] Make sure that the change passes the CI checks.
Fixes #71
### Motivation
Pods do not restart when ConfigMaps change after updating the values.yaml file, so they have to be restarted manually in order to pick up the new values from the ConfigMap.
### Modifications
As mentioned, a `restartPodsOnConfigMapChange` flag is added for each component in the values.yaml file to control whether pods are restarted on ConfigMap change; the default is `false`.
Each component's StatefulSet template now adds an annotation containing the hash of the corresponding ConfigMap when `restartPodsOnConfigMapChange` is `true`, which causes pods to restart whenever that ConfigMap changes (https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments). A minimal sketch of the annotation follows.
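A minimal sketch of the mechanism, following the Helm trick linked above; the template file name and value path are illustrative:
```yaml
# placed under the StatefulSet's spec.template.metadata
{{- if .Values.broker.restartPodsOnConfigMapChange }}
annotations:
  checksum/config: {{ include (print $.Template.BasePath "/broker-configmap.yaml") . | sha256sum }}
{{- end }}
```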
### Verifying this change
- [ ] Make sure that the change passes the CI checks.
Add PSP and add/modify RBAC. I'm open to discussion.
### Motivation
On clusters that use PSP with a restrictive default policy, Pulsar cannot be installed because it runs as the root user and requires a writable container root directory. Additionally, the default RBAC for the broker is, in my opinion, too permissive (use of ClusterRoleBinding).
### Modifications
Add PSP and RBAC for bookkeeper and autorecovery to add an exception that allows startup even in secure environments where containers cannot get read-write access to the root filesystem by default.
Add an option to limit the broker's ClusterRoleBinding to a single namespace by replacing it with a RoleBinding.
### Verifying this change
- [x] Make sure that the change passes the CI checks.
It remains possible to override the current release namespace by setting
the `namespace` value, though this may lead to having the helm metadata
and the Pulsar components in different namespaces.
Fixes #66
### Motivation
Trying to deploy the chart in a namespace using the usual helm pattern fails, for example:
```
kubectl create ns pulsartest
helm upgrade --install pulsar -n pulsartest apache/pulsar
Error: namespaces "pulsar" not found
```
Fixing that while keeping the helm metadata and the deployed objects in the same namespace requires declaring the namespace twice:
```
kubectl create ns pulsartest
helm upgrade --install pulsar -n pulsartest apache/pulsar --set namespace=pulsartest
```
This is needlessly confusing for newcomers who follow the helm documentation and is contrary to helm best practices.
### Modifications
I changed the chart to use the context namespace `.Release.Namespace` by default while preserving the ability to override it by explicitly providing a namespace on the command line. With this modification, both examples behave as expected.
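A minimal sketch of the fallback, with an illustrative helper name:
```yaml
{{- define "pulsar.namespace" -}}
{{- /* use the explicit value if given, otherwise the helm release namespace */ -}}
{{- .Values.namespace | default .Release.Namespace -}}
{{- end }}
```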
### Verifying this change
- [x] Make sure that the change passes the CI checks.
Signed-off-by: xiaolong.ran <rxl@apache.org>
### Motivation
Bump the image version to 2.6.2
### Verifying this change
- [x] Make sure that the change passes the CI checks.
### Motivation
* ```publishNotReadyAddresses``` is a Service spec field and not a service annotation. This is mentioned in the K8s API docs at https://v1-17.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.15/#servicespec-v1-core
### Modifications
* Modified ```publishNotReadyAddresses``` from an annotation to a Service spec field
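For reference, the field now sits in the Service spec rather than in `metadata.annotations` (the service name is illustrative):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: pulsar-broker
spec:
  clusterIP: None
  publishNotReadyAddresses: true
```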
### Verifying this change
- [x] Make sure that the change passes the CI checks.
### Motivation
* It's not recommended to run a production ZooKeeper cluster with forceSync set to "no". This is also mentioned in the forceSync section at https://pulsar.apache.org/docs/en/next/reference-configuration/#zookeeper
### Modifications
* Removed ```-Dzookeeper.forceSync=no``` from ```values.yaml```, as the default ```forceSync``` is ```yes```.
Fixes #50
### Motivation
The host option is not required to set up an ingress, so I made it an optional value.
### Modifications
Made setting the host optional.
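A hedged values sketch (the key path is an assumption): leaving the host empty simply omits it from the rendered Ingress rule.
```yaml
ingress:
  proxy:
    enabled: true
    hostname: ""   # optional; when empty, no host is set on the Ingress rule
```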
Co-authored-by: Elad Dolev <elad@firebolt.io>
### Motivation
Give the ability to deploy a multi-cluster instance on K8s clusters with a non-default `clusterDomain`, and to connect to an external configuration store.
### Modifications
- give the ability to change cluster's name
- give the ability to change `clusterDomain`
- fix external configuration store functionality
- use broker ports variables
- use label templates, and add `component` label in several places
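An illustrative set of values for the knobs listed above (key names are assumptions):
```yaml
clusterName: "us-east"
clusterDomain: "cluster.local"                        # match the cluster's actual DNS domain
pulsar_metadata:
  configurationStore: "zk-global.example.com:2181"    # external configuration store
```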
### Verifying this change
- [x] Make sure that the change passes the CI checks.