180 Commits

Author SHA1 Message Date
Emre Aydın
9542c7b226
Use container ports for proxy stateful set probes (#410)
Using service ports causes the probes to fail.

Co-authored-by: emre <emre.aydin@zapata.ai>
2023-12-11 00:20:07 -08:00
Frank Kelly
8cb3c18377
Allow Proxy and Broker HPA to specify scaling policies on scaleUp or scaleDown. (#391) 2023-09-15 14:12:12 -05:00
Michał Koziorowski
ea5404c421
Fixed bookkeeper volume mounts indentation (#384) 2023-08-24 09:32:58 +08:00
Ethan-Merrill
73fe688a43
Add support for stateful functions using the bookie as state storage (#171)
### Motivation

Enables support for using the Pulsar bookies as persistent state storage for functions.

### Modifications

- Added an option to enable/disable using bookies as state storage
- Adds extra server components options to the bookkeeper to enable necessary features for bookies to be used as state storage
- Adds stateStorageServiceUrl to the broker configmap
2023-07-18 21:37:03 -05:00
Tomasz Jegorow
042fd5b6d4
Configure custom topologyKey for podAntiAffinity (#351) 2023-07-12 18:19:49 +03:00
Atkins
79ec5ba333
Fix pod annotations when restartPodsOnConfigMapChange is true (#353)
Signed-off-by: Atkins Chang <atkinschang@gmail.com>
2023-07-12 18:18:48 +03:00
Atkins
b30eb6fff8
Improve HPA (#354)
* Use `autoscaling/v2` if Kubernetes version >= 1.23

Signed-off-by: Atkins Chang <atkinschang@gmail.com>

* Disable replicas when autoscaling enabled

Signed-off-by: Atkins Chang <atkinschang@gmail.com>

---------

Signed-off-by: Atkins Chang <atkinschang@gmail.com>
2023-07-12 18:18:19 +03:00
Brad Shelton
f8ad65066e
To address the function role vs clusterrole issue (#236)
* To address the function role vs clusterrole issue

* making backwards compatible

* updated values.yaml to include the option to limit functions to a namespace

* Added documentation to clarify the new attribute

* moved limit_to_namespace under functions.rbac
2023-07-12 18:11:36 +03:00
Lari Hotari
49f4acdf5a
Refactor GitHub Actions CI to a single workflow (#371)
* Refactor GitHub Actions CI to a single workflow

* Handle case where "ct lint" fails because of no chart changes

* Re-order scenarios

* Remove excessive default GC logging

* Bump cert-manager version to v1.12.2

* Use compatible cert-manager version

* Install debugging tools (k9s) for ssh access

* Only apply for interactive shells

* Fix JWT symmetric test

* Fix part that was missing from #356

* Install k9s on the fly when k9s is used

- set KUBECONFIG on the fly for kubectl too
2023-07-11 15:55:35 +03:00
huis
2d646f4efe
Fix PVC selector scope error when bookkeeper uses multiple data volumes (#342)
Fix PVC selector scope error when bookkeeper uses multiple data volumes
2023-07-11 10:03:38 +03:00
Robert Moucha
71450334cf
Fix zookeeper annotations (#348)
Fixed the case where Helm cannot render the template when no ZK annotations are set and `zookeeper.restartPodsOnConfigMapChange: true`.
2023-07-11 10:01:54 +03:00
mfuxi
786e182de4
add ingressClassName (#360) 2023-07-11 10:00:45 +03:00
Chris Johnson
90a26b2dc8
fix: proxy should not use privileged port numbers (#356)
* fix: proxy should not use privileged port numbers

This fixes issue #335

* fix: making the change backward compatible
2023-07-11 10:00:17 +03:00
Lari Hotari
f43c6f6d9e
Fix GitHub Actions based CI checks which have been failing (#370)
* Upgrade kind, chart releaser and helm versions

* Disable podMonitor for values-broker-tls.yaml file

- was missing from #317

* Use k8s 1.18.20

* Use ubuntu-20.04 runtime

- k8s < 1.19 doesn't support cgroup v2

* Upgrade to k8s 1.19 as baseline

* Baseline to k8s 1.20

* Set ip family to ipv4

* Add more logging to kind cluster creation

* Simplify duplicate job deletion

* use verbosity flag

* Upgrade to k8s 1.24

* Replace removed tolerate-unready-endpoints annotation with publishNotReadyAddresses

(cherry picked from commit e90926053a2b01bb95529fbaddc8d2ce2cdeec63)

* Use k8s 1.21 as baseline

* Run on ubuntu-22.04

* Use Pulsar 2.10.4
2023-07-10 12:45:37 -07:00
Michael Marshall
687060aa27
Chart: Bump version to 3.0.0 2022-10-21 00:26:42 -05:00
edward.zeng
95c218b218
Fix PodMonitor name conflicts for multiple releases in same namespace (#258)
* Fix PodMonitor name conflicts for multiple releases in same namespace

Signed-off-by: Edward Zeng <jie.zeng@zilliz.com>

* Use pulsar.fullname for PodMonitor name prefix

Signed-off-by: Edward Zeng <jie.zeng@zilliz.com>
Co-authored-by: Michael Marshall <mmarshall@apache.org>

Signed-off-by: Edward Zeng <jie.zeng@zilliz.com>

Fixes #257

### Motivation

Fix PodMonitor name conflicts for multiple releases in same namespace

### Modifications

Use the release name instead of the hardcoded `pulsar.name` for the PodMonitor name.

### Verifying this change

- [x] Make sure that the change passes the CI checks.
2022-10-20 21:15:16 -05:00
Michael Marshall
d9769a9519
Add missing license headers and .rat-excludes (#319)
* Add missing license headers and .rat-excludes

* Fix .rat-excludes files

### Motivation

As part of our updated release process, we need to make sure that all relevant files have license headers.

### Modifications

* Add license headers formatted appropriately for each file type

### Verifying this change

The following script shows that the solution is complete:

```shell
$ java -jar ../apache-rat-0.15/apache-rat-0.15.jar . -E .rat-excludes 
Ignored 18 lines in your exclusion files as comments or empty lines.


*****************************************************
Summary
-------
Generated at: 2022-10-20T17:54:42-05:00

Notes: 4
Binaries: 1
Archives: 0
Standards: 92

Apache Licensed: 92
Generated Documents: 0

JavaDocs are generated, thus a license header is optional.
Generated files do not require license headers.

0 Unknown Licenses

*****************************************************
  Files with Apache License headers will be marked AL
  Binary files (which do not require any license headers) will be marked B
  Compressed archives will be marked A
  Notices, licenses etc. will be marked N
  AL    ./.asf.yaml
  AL    ./.rat-excludes
  N     ./LICENSE
  N     ./NOTICE
  AL    ./README.md
  AL    ./Vagrantfile
  AL    ./license_test.go
  AL    ./charts/pulsar/.helmignore
  AL    ./charts/pulsar/Chart.yaml
  N     ./charts/pulsar/LICENSE
  N     ./charts/pulsar/NOTICE
  AL    ./charts/pulsar/values.yaml
  B     ./charts/pulsar/charts/kube-prometheus-stack-41.5.1.tgz
  AL    ./charts/pulsar/templates/_autorecovery.tpl
  AL    ./charts/pulsar/templates/_bookkeeper.tpl
  AL    ./charts/pulsar/templates/_broker.tpl
  AL    ./charts/pulsar/templates/_configurationstore.tpl
  AL    ./charts/pulsar/templates/_helpers.tpl
  AL    ./charts/pulsar/templates/_toolset.tpl
  AL    ./charts/pulsar/templates/_zookeeper.tpl
  AL    ./charts/pulsar/templates/autorecovery-configmap.yaml
  AL    ./charts/pulsar/templates/autorecovery-podmonitor.yaml
  AL    ./charts/pulsar/templates/autorecovery-rbac.yaml
  AL    ./charts/pulsar/templates/autorecovery-service.yaml
  AL    ./charts/pulsar/templates/autorecovery-statefulset.yaml
  AL    ./charts/pulsar/templates/bookkeeper-cluster-initialize.yaml
  AL    ./charts/pulsar/templates/bookkeeper-configmap.yaml
  AL    ./charts/pulsar/templates/bookkeeper-pdb.yaml
  AL    ./charts/pulsar/templates/bookkeeper-podmonitor.yaml
  AL    ./charts/pulsar/templates/bookkeeper-rbac.yaml
  AL    ./charts/pulsar/templates/bookkeeper-service.yaml
  AL    ./charts/pulsar/templates/bookkeeper-statefulset.yaml
  AL    ./charts/pulsar/templates/bookkeeper-storageclass.yaml
  AL    ./charts/pulsar/templates/broker-cluster-role-binding.yaml
  AL    ./charts/pulsar/templates/broker-configmap.yaml
  AL    ./charts/pulsar/templates/broker-hpa.yaml
  AL    ./charts/pulsar/templates/broker-pdb.yaml
  AL    ./charts/pulsar/templates/broker-podmonitor.yaml
  AL    ./charts/pulsar/templates/broker-rbac.yaml
  AL    ./charts/pulsar/templates/broker-service-account.yaml
  AL    ./charts/pulsar/templates/broker-service.yaml
  AL    ./charts/pulsar/templates/broker-statefulset.yaml
  AL    ./charts/pulsar/templates/dashboard-deployment.yaml
  AL    ./charts/pulsar/templates/dashboard-ingress.yaml
  AL    ./charts/pulsar/templates/dashboard-service.yaml
  AL    ./charts/pulsar/templates/function-worker-configmap.yaml
  AL    ./charts/pulsar/templates/keytool.yaml
  AL    ./charts/pulsar/templates/namespace.yaml
  AL    ./charts/pulsar/templates/proxy-configmap.yaml
  AL    ./charts/pulsar/templates/proxy-hpa.yaml
  AL    ./charts/pulsar/templates/proxy-ingress.yaml
  AL    ./charts/pulsar/templates/proxy-pdb.yaml
  AL    ./charts/pulsar/templates/proxy-podmonitor.yaml
  AL    ./charts/pulsar/templates/proxy-rbac.yaml
  AL    ./charts/pulsar/templates/proxy-service.yaml
  AL    ./charts/pulsar/templates/proxy-statefulset.yaml
  AL    ./charts/pulsar/templates/pulsar-cluster-initialize.yaml
  AL    ./charts/pulsar/templates/pulsar-manager-admin-secret.yaml
  AL    ./charts/pulsar/templates/pulsar-manager-configmap.yaml
  AL    ./charts/pulsar/templates/pulsar-manager-deployment.yaml
  AL    ./charts/pulsar/templates/pulsar-manager-ingress.yaml
  AL    ./charts/pulsar/templates/pulsar-manager-service.yaml
  AL    ./charts/pulsar/templates/tls-cert-internal-issuer.yaml
  AL    ./charts/pulsar/templates/tls-certs-internal.yaml
  AL    ./charts/pulsar/templates/toolset-configmap.yaml
  AL    ./charts/pulsar/templates/toolset-rbac.yaml
  AL    ./charts/pulsar/templates/toolset-service.yaml
  AL    ./charts/pulsar/templates/toolset-statefulset.yaml
  AL    ./charts/pulsar/templates/zookeeper-configmap.yaml
  AL    ./charts/pulsar/templates/zookeeper-pdb.yaml
  AL    ./charts/pulsar/templates/zookeeper-podmonitor.yaml
  AL    ./charts/pulsar/templates/zookeeper-rbac.yaml
  AL    ./charts/pulsar/templates/zookeeper-service.yaml
  AL    ./charts/pulsar/templates/zookeeper-statefulset.yaml
  AL    ./charts/pulsar/templates/zookeeper-storageclass.yaml
  AL    ./examples/values-bookkeeper-aws.yaml
  AL    ./examples/values-cs.yaml
  AL    ./examples/values-jwt-asymmetric.yaml
  AL    ./examples/values-jwt-symmetric.yaml
  AL    ./examples/values-local-cluster.yaml
  AL    ./examples/values-local-pv.yaml
  AL    ./examples/values-minikube.yaml
  AL    ./examples/values-no-persistence.yaml
  AL    ./examples/values-one-node.yaml
  AL    ./examples/values-tls.yaml
  AL    ./examples/values-zookeeper-aws.yaml
  AL    ./hack/common.sh
  AL    ./hack/kind-cluster-build.sh
  AL    ./scripts/set-pulsar-version.sh
  AL    ./scripts/cert-manager/install-cert-manager.sh
  AL    ./scripts/pulsar/cleanup_helm_release.sh
  AL    ./scripts/pulsar/common.sh
  AL    ./scripts/pulsar/common_auth.sh
  AL    ./scripts/pulsar/generate_token.sh
  AL    ./scripts/pulsar/generate_token_secret_key.sh
  AL    ./scripts/pulsar/get_token.sh
  AL    ./scripts/pulsar/prepare_helm_release.sh
 
*****************************************************

```
2022-10-20 20:29:09 -05:00
Michael Marshall
35090ec822
Include LICENSE and NOTICE in distribution 2022-10-20 15:48:07 -05:00
Michael Marshall
9324a9a270
Fix bookkeeper metadata init when specifying metadataPrefix (#316)
Fixes #309

### Motivation

Fix the metadataPrefix initialization.

### Modifications

* Fix the script by adding `&& echo`

### Verifying this change

I manually verified that this change works and correctly puts the metadata in the prefixed location.
2022-10-20 15:24:20 -05:00
Claudio Vellage
343ce0527d
Allow to use selectors with volumeClaimTemplates (#286)
* Allow to use selectors with volumeClaimTemplates

* Fixed naming inconsistency, added null value

Co-authored-by: Claudio Vellage <claudio.vellage@pm.me>
Co-authored-by: Michael Marshall <mmarshall@apache.org>

### Motivation

Currently it's not possible to use selectors with volumeClaimTemplates which makes it hard/impossible to bind statically provisioned PVs.

### Modifications

Added (optional) selectors to `volumeClaimTemplates` and documented in values file.

### Verifying this change

- [ ] Make sure that the change passes the CI checks.
2022-10-20 13:46:23 -05:00
Michael Marshall
1e8491aebd
Fix CI by modifying Chart.yaml and updating ct lint command (#315)
### Motivation

Fix the CI lint step by modifying the Chart.yaml and by removing the maintainers validation step.
2022-10-20 13:17:51 -05:00
Samuel Verstraete
8f033bd1a5
allow specifying the nodeSelector for the init jobs (#225)
* allow specifying the nodeSelector for the init jobs

* Use pulsar_metadata.nodeSelector

Co-authored-by: samuel <samuel.verstraete@aprimo.com>

### Motivation

When deploying Pulsar to an AKS cluster with Windows node pools, I was unable to specify that the Jobs of the initialize release had to run on Linux nodes. With this change, I can now specify a node selector for the init jobs.

### Modifications

add nodeSelector on pulsar_init and bookie_init

### Verifying this change

- [ ] Make sure that the change passes the CI checks.
2022-10-19 23:41:39 -05:00
JiangHaiting
da6ce85c66
Bump 2.10.2 (#310)
### Motivation

Bump Apache Pulsar 2.10.2


### Verifying this change

- [ ] Make sure that the change passes the CI checks.
2022-10-19 22:51:08 -05:00
Michael Marshall
42ce7caa55
Update how to configure external zookeeper servers (#308)
### Motivation

In #269, we added a way to configure external zookeeper servers. However, it was added to the wrong section of the zookeeper config. The `zookeeper.configData` section is mapped directly into the zookeeper configmap.

### Modifications

Move `zookeeper.configData.ZOOKEEPER_SERVERS` to `zookeeper.externalZookeeperServerList`

### Verifying this change
This is a cosmetic change on an unreleased feature.
2022-10-19 16:28:33 -05:00
Michael Marshall
7f23af26b7
Replace monitoring solution with kube-prometheus-stack dependency (#299)
* Replace monitoring solution with kube-prometheus-stack dependency

* Enable pod monitors

* Download necessary chart dependencies for CI

* Actually run dependency update

* Enable missed podMonitor

* Disable alertmanager by default for feature parity

Related issues #294 #65

Supersedes #296 and #297

### Motivation

Our helm chart is out of date. I propose we make a breaking change for the monitoring solution and start using the `kube-prometheus-stack` as a dependency. This should make upgrades easier and will let users leverage all of that chart's features.

This change will result in the removal of the StreamNative Grafana Dashboards. We'll need to figure out the right way to address that. The apache/pulsar project has grafana dashboards, but they have not been maintained. With this added dependency, we'll have the benefit of being able to use k8s `ConfigMap`s to configure grafana dashboards.

### Modifications

* Remove old prometheus and grafana configuration
* Add kube-prometheus-stack chart as a dependency
* Enable several components by default. I am not opinionated on these, but it is based on the other values in the chart.

### Verifying this change

This is a large change that will require manual validation, and may break deployments. I propose this triggers a helm chart 3.0.0 release.
2022-10-19 10:23:08 -05:00
Yuwei Sung
816d88c942
added pdb version detection (#260)
* added pdb version detection

* refresh

* Update bookkeeper-pdb.yaml

update the capabilities syntax

* Update broker-pdb.yaml

update capability syntax

* Update proxy-pdb.yaml

update capability version syntax

* Update zookeeper-pdb.yaml

update capability version syntax

* Update zookeeper-pdb.yaml

fix typo

* Update bookkeeper-pdb.yaml

Co-authored-by: Marvin Cai <cai19930303@gmail.com>

Fixes pod disruption budget version warning

### Motivation

The PDB API version `policy/v1beta1` is deprecated in k8s 1.21+ (and not available in 1.25+).

### Modifications

The zookeeper-pdb, proxy-pdb, broker-pdb and bookkeeper-pdb templates are modified. If the k8s api-resources contain policy/v1, each *-pdb.yaml will generate the respective apiVersion.
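
The selection rule described above can be sketched roughly as follows. This is a Python model for illustration only, not the chart's template code; the function name `pdb_api_version` and the set-based capability check are assumptions.

```python
def pdb_api_version(available_api_versions: set) -> str:
    """Pick the PodDisruptionBudget apiVersion from the API versions a cluster offers.

    Hypothetical helper: prefer policy/v1 when the cluster reports it,
    otherwise keep the older policy/v1beta1.
    """
    if "policy/v1" in available_api_versions:
        return "policy/v1"
    return "policy/v1beta1"


# A cluster that serves both versions gets the newer one.
newer = pdb_api_version({"policy/v1", "policy/v1beta1", "apps/v1"})
# A cluster that only serves the beta version keeps it.
older = pdb_api_version({"policy/v1beta1", "apps/v1"})
```

Keying the choice on what the cluster reports, rather than on a version number, means the same template renders correctly on both old and new clusters.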

### Verifying this change

- [ ] Make sure that the change passes the CI checks.
2022-10-18 22:52:11 -05:00
Rajan Dhabalia
89f28bca9c
Support mechanism to provide external zookeeper-server list to build global/configuration zookeeper (#269)
* Support mechanism to provide external zookeeper-server list to build global/configuration zookeeper

* Add external zk example

* add external zk list into values.yaml

Fixes #268

### Motivation
Right now, the [chart dynamically](https://github.com/apache/pulsar-helm-chart/blob/master/charts/pulsar/templates/zookeeper-statefulset.yaml#L140) creates a zk cluster with zk pods initialized in the same namespace. However, for a global/configuration zookeeper, the user needs to build zk clusters with pods deployed in different namespaces. Therefore, the user needs a mechanism to pass an external list of zk-servers to the chart and build a zk-cluster with pods across different namespaces.

### Modification
- The chart should consider the zk values configuration for an external zookeeper and generate the zk-configuration file with the appropriate zk-server list and the unique id of that zookeeper.

This PR sets the `ZOOKEEPER_SERVERS` value provided by the user and also sets an override-value flag, which is used by [generate-zookeeper-config.sh](https://github.com/apache/pulsar/blob/master/docker/pulsar/scripts/generate-zookeeper-config.sh) to override the external zk list in the config file and assign the appropriate id to the host.

https://github.com/apache/pulsar/pull/15987 fixes [generate-zookeeper-config.sh](https://github.com/apache/pulsar/blob/master/docker/pulsar/scripts/generate-zookeeper-config.sh) changes.


### Result
- The user can add the `ZOOKEEPER_SERVERS` string to `zookeeper.configData` in the [Values.yaml](https://github.com/apache/pulsar-helm-chart/blob/master/charts/pulsar/values.yaml#L385) file to override the external zk-server list.
2022-10-18 17:41:43 -05:00
Stepan Mazurov
1bcf255e12
feat(certs): use actual v1 spec for certs (#233)
Co-authored-by: Stepan Mazurov <smazurov@quantummetric.com>

### Motivation

In #204, the API version of the cert resources was updated to v1. This was insufficient because `v1` has a different spec from `v1alpha1`.

This MR finishes the work that #204 and @lhotari started.

### Modifications

Changed the spec of certs to match v1 cert manager spec.

### Verifying this change

- [ ] Make sure that the change passes the CI checks.
2022-10-18 15:40:43 -05:00
Penghui Li
8f1ca065b3
Bump Apache Pulsar 2.10.1 (#274)
* Bump Apache Pulsar 2.10.1

* Do not bump .Chart.version

* Remove unnecessary jq download that was failing with Permission Denied

Co-authored-by: Michael Marshall <mmarshall@apache.org>
2022-10-18 13:16:51 -05:00
Michael Marshall
58cd43fe8b
Remove '|| yes' in bk cluster init script (#305) 2022-10-18 18:46:07 +03:00
Michael Marshall
48501ebe84
Allow bk cluster init to restart on failure (#303)
### Motivation

This is essentially the same as https://github.com/apache/pulsar-helm-chart/pull/176. Without this change, an init pod can fail and be in `Error` state even though the second pod succeeded. This will prevent misleading errors.

### Modifications

* Replace `Never` with `OnFailure`

### Verifying this change

This is a trivial change.
2022-10-17 17:59:05 -05:00
Lari Hotari
25f355e6e2
Use appVersion as default tag for Pulsar images (#200)
Co-authored-by: Michael Marshall <mmarshall@apache.org>

### Motivation

There was a suggestion [in a dev mailing list discussion](https://lists.apache.org/thread/bgkvcyt1qq6h67p2k8xwp89xlncbqn3d) that the Helm chart's appVersion should be used as the default image tag.

### Additional context

There are some limitations in Helm. It is not possible to set "appVersion" from the command line. There's an open feature request https://github.com/helm/helm/issues/8194 to add such a feature to Helm.

### Modifications

- change default values.yaml and set the tags for the images that use the Pulsar image to an empty value
- add "defaultPulsarImageTag" to values.yaml
- add a helper template "pulsar.imageFullName" that contains the logic to fall back to .Values.defaultPulsarImageTag and if it's not set, falling back to .Chart.AppVersion
- use the helper template in all other templates that require the logic
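
The fallback order described for "pulsar.imageFullName" can be modelled roughly as below. This is a minimal Python sketch, not the chart's Go template; the function name `image_full_name`, the dictionary layout and the `example/pulsar` repository are assumptions made for illustration.

```python
def image_full_name(image: dict, default_pulsar_image_tag: str, chart_app_version: str) -> str:
    """Return 'repository:tag' using the fallback order from the commit message.

    Hypothetical model of the helper:
      1. an explicit tag on the image wins,
      2. otherwise fall back to defaultPulsarImageTag,
      3. otherwise fall back to the chart's appVersion.
    """
    tag = image.get("tag") or default_pulsar_image_tag or chart_app_version
    return f"{image['repository']}:{tag}"


# 'example/pulsar' is a placeholder repository name, not a value from the chart.
explicit = image_full_name({"repository": "example/pulsar", "tag": "2.9.3"}, "", "2.10.2")
from_default = image_full_name({"repository": "example/pulsar", "tag": ""}, "2.10.1", "2.10.2")
from_app_version = image_full_name({"repository": "example/pulsar", "tag": ""}, "", "2.10.2")
```

With the per-image tags left empty, bumping the chart's appVersion is enough to move every Pulsar image to the new release, while a single image can still be pinned explicitly.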
2022-10-17 15:42:58 -05:00
Arnar
f3ba780ab5
Alphabetically sort list of super users (#291)
Fixes #288 

### Motivation

When specifying multiple roles in `.Values.auth.superUsers`, the values are converted to a comma-separated list by piping the dict through `values` and `join` in Helm templating. However, `values` doesn't guarantee that the order of elements will be the same every time, so it is recommended to also pass the result through `sortAlpha` to sort the list alphabetically.

This is problematic when `.Values.broker.restartPodsOnConfigMapChange` is enabled because the checksum of the configmap changes every time the list's order changes, resulting in the statefulsets rolling out a new version of the pods.

### Modifications

Pass list through `sortAlpha`.
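
The effect of the unstable ordering on the checksum can be illustrated with a small Python model. It is a sketch only: the role names are made up, `render_super_users` is a hypothetical stand-in for the `values | join` pipeline, and two dicts with different insertion order emulate the unordered map iteration.

```python
import hashlib


def render_super_users(super_users: dict, sort: bool) -> str:
    """Join the role values into a comma-separated list, optionally sorted."""
    roles = list(super_users.values())
    if sort:
        roles = sorted(roles)  # plays the part of sortAlpha
    return ",".join(roles)


def checksum(text: str) -> str:
    """Checksum of the rendered config, standing in for the configmap checksum."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same content, different iteration order (hypothetical role names).
first = {"client": "admin", "proxy": "proxy-admin", "broker": "broker-admin"}
second = dict(reversed(list(first.items())))

unsorted_changed = checksum(render_super_users(first, sort=False)) != checksum(
    render_super_users(second, sort=False)
)
sorted_stable = checksum(render_super_users(first, sort=True)) == checksum(
    render_super_users(second, sort=True)
)
```

Without sorting, identical values can yield a different checksum and trigger a rollout; with sorting, the rendered string depends only on the set of roles.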

### Verifying this change

- [x] Make sure that the change passes the CI checks.
2022-10-17 14:36:22 -05:00
Aliaksandr Shulyak
8b42a61f2e
Add nodeSelector to cluster initialize pod (#284)
* Add nodeSelector to cluster initialize pod

* Add option to values file

* Update charts/pulsar/templates/pulsar-cluster-initialize.yaml

Co-authored-by: Michael Marshall <mikemarsh17@gmail.com>

* Fix typo in values

Co-authored-by: Michael Marshall <mikemarsh17@gmail.com>

### Motivation

Add an option to choose where to run the pulsar-cluster-initialize pod. Sometimes it is necessary to run it only on certain nodes.

### Modifications

Added nodeSelector option to the pulsar-cluster-initialize job.
2022-10-14 13:44:47 -05:00
Qiang Zhao
465d1726e2
Bump Apache Pulsar version to 2.9.3 (#277) 2022-07-18 23:24:46 +08:00
Michael Marshall
26bc26028b
Use https to get Apache Pulsar icon in Chart.yaml 2022-06-26 00:39:09 -05:00
HuynhKevin
3c59b43f28
Add imagePullSecrets zookeeper (#244)
* Add imagePullSecrets for zookeeper

* Add imagePullSecrets for zookeeper

Co-authored-by: Kevin Huynh <khuynh@littlebigcode.fr>

All components except zookeeper already have imagePullSecrets, which avoid hitting image pull quota limits so that the pods can initialize correctly.
2022-06-26 00:01:48 -05:00
Filipe Caixeta
c05f659ff4
make proxy httpNumThreads configurable (#251)
Fixes https://github.com/apache/pulsar-helm-chart/issues/250

### Motivation

`httpNumThreads` is hardcoded to 8 in `charts/pulsar/templates/proxy-configmap.yaml`
When trying to override in `values.yaml` by using `proxy.configData.httpNumThreads` we get an error because the keys get duplicated.
This happens because `{{ toYaml .Values.proxy.configData | indent 2 }}` doesn't deduplicate the keys and there is no other way to set `httpNumThreads`

### Modifications

Removing the key from charts/pulsar/templates/proxy-configmap.yaml and adding it to the values yaml solves the problem.
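
The duplicate-key problem and the fix can be modelled in a few lines of Python. This is a simplified sketch under assumptions: `render_old`, `render_new` and `duplicate_keys` are hypothetical names, and rendering is reduced to producing `key: "value"` lines.

```python
def render_old(config_data: dict) -> list:
    """Old behaviour: a hardcoded key in the template plus the user's configData."""
    lines = ['httpNumThreads: "8"']  # hardcoded in the template
    lines += [f'{key}: "{value}"' for key, value in config_data.items()]
    return lines


def render_new(defaults: dict, overrides: dict) -> list:
    """New behaviour: the default lives in the values, so an override replaces it."""
    merged = {**defaults, **overrides}
    return [f'{key}: "{value}"' for key, value in merged.items()]


def duplicate_keys(lines: list) -> list:
    """Return the keys that appear more than once in the rendered lines."""
    seen, dupes = set(), []
    for line in lines:
        key = line.split(":", 1)[0]
        if key in seen and key not in dupes:
            dupes.append(key)
        seen.add(key)
    return dupes


old_lines = render_old({"httpNumThreads": "16"})
new_lines = render_new({"httpNumThreads": "8"}, {"httpNumThreads": "16"})
```

In the old layout the override is appended next to the hardcoded entry, which produces the duplicated key; once the default is an ordinary value, the override simply replaces it.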

### Verifying this change

- [x] Make sure that the change passes the CI checks.
2022-06-25 23:57:30 -05:00
Yong Zhang
6afab51bad
Upgrade the pulsar manager image version to 0.3.0 (#271)
---

**Motivation**

Pulsar Manager 0.3.0 has been released, so we can upgrade it in our charts.
2022-06-25 23:52:20 -05:00
Marvin Cai
c6ab1d18e3
Support defining extra env for broker and proxy statefulsets. (#273) 2022-06-20 07:59:43 -07:00
Michael Marshall
428736c788
Add bk, zk securityContext to support upgrade to non-root docker image (#266)
Master Issue: https://github.com/apache/pulsar/issues/11269

### Motivation

Apache Pulsar's docker images for 2.10.0 and above are non-root by default. In order to ensure there is a safe upgrade path, we need to expose the `securityContext` for the Bookkeeper and Zookeeper StatefulSets. Here is the relevant k8s documentation on this k8s feature: https://kubernetes.io/docs/tasks/configure-pod-container/security-context.

Once released, all deployments using the default `values.yaml` configuration for the `securityContext` will pay a one time penalty on upgrade where the kubelet will recursively chown files to be root group writable. It's possible to temporarily avoid this penalty by setting `securityContext: {}`.

### Modifications

* Add config blocks for the `bookkeeper.securityContext` and `zookeeper.securityContext`.
* Default to `fsGroup: 0`. This is already the default group id in the docker image, and the docker image assumes the user has root group permission.
* Default to `fsGroupChangePolicy: "OnRootMismatch"`. This configuration will work for all deployments where the user id is stable. If the user id switches between restarts, like it does in OpenShift, please set to `Always`.
* Remove the gc configuration that writes to a directory for which the user lacks permission. (Perhaps we want to write to `/pulsar/log/bookie-gc.log`?)
* Add documentation to the README.

### Verifying this change

I first attempted verification of this change with minikube. It did not work because minikube uses hostPath volumes by default. I then tested on EKS v1.21.9-eks-0d102a7. I tested by deploying the current, latest version of the helm chart (2.9.3) and then upgrading to this PR's version of the helm chart along with using the 2.10.0 docker image. I also tested upgrading from a default version 

Test 1 is a plain upgrade using the default 2.9.3 version of the chart, then upgrading to this PR's version of the chart with the modification to use the 2.10.0 docker images. It worked as expected.

```bash
$ helm install test apache/pulsar
$ # Wait for chart to deploy, then run the following, which uses Pulsar version 2.10.0:
$  helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/
```

Test 2 is a plain upgrade using the default 2.9.3 version of the chart, then an upgrade to this PR's version of the chart, then an upgrade to this PR's version of the chart using 2.10.0 docker images. There is a minor error described in the `README.md`. The solution is to chown the bookie's data directory.

```bash
$ helm install test apache/pulsar
$ # Wait for chart to deploy, then run the following, which uses Pulsar version 2.9.2:
$  helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/
$ # Upgrade using Pulsar version 2.10.0
$  helm upgrade test -f charts/pulsar/values.yaml charts/pulsar/
```

### GC Logging

In my testing, I ran into the following errors when using `-Xlog:gc:/var/log/bookie-gc.log`:

```
pulsar-bookkeeper-verify-clusterid [0.008s] Error opening log file '/var/log/bookie-gc.log': Permission denied
pulsar-bookkeeper-verify-clusterid [0.008s] Initialization of output 'file=/var/log/bookie-gc.log' using options '(null)' failed.
pulsar-bookkeeper-verify-clusterid [0.005s] Error opening log file '/var/log/bookie-gc.log': Permission denied
pulsar-bookkeeper-verify-clusterid [0.006s] Initialization of output 'file=/var/log/bookie-gc.log' using options '(null)' failed.
pulsar-bookkeeper-verify-clusterid Invalid -Xlog option '-Xlog:gc:/var/log/bookie-gc.log', see error log for details.
pulsar-bookkeeper-verify-clusterid Error: Could not create the Java Virtual Machine.
pulsar-bookkeeper-verify-clusterid Error: A fatal exception has occurred. Program will exit.
pulsar-bookkeeper-verify-clusterid Invalid -Xlog option '-Xlog:gc:/var/log/bookie-gc.log', see error log for details.
pulsar-bookkeeper-verify-clusterid Error: Could not create the Java Virtual Machine.
pulsar-bookkeeper-verify-clusterid Error: A fatal exception has occurred. Program will exit.
```

I resolved the error by removing the setting.

### OpenShift Observations

I wanted to seamlessly support OpenShift, so I investigated configuring the bookkeeper and zookeeper processes with `umask 002` so that they would create files and directories that are group writable (OpenShift has a stable group id, but gives the process a random user id). That worked for most tools when switching the user id, but not for RocksDB, which creates a lock file at `/pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK` with the permission `0644`, ignoring the umask. Here is the relevant error:

```
2022-05-14T03:45:06,903+0000  ERROR org.apache.bookkeeper.server.Main - Failed to build bookie server
java.io.IOException: Error open RocksDB database
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:199) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:88) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.lambda$static$0(KeyValueStorageRocksDB.java:62) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.LedgerMetadataIndex.<init>(LedgerMetadataIndex.java:68) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.<init>(SingleDirectoryDbLedgerStorage.java:169) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.newSingleDirectoryDbLedgerStorage(DbLedgerStorage.java:150) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.initialize(DbLedgerStorage.java:129) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:818) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:152) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:52) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:304) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.server.Main.doMain(Main.java:226) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.server.Main.main(Main.java:208) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
Caused by: org.rocksdb.RocksDBException: while open a file for lock: /pulsar/data/bookkeeper/ledgers/current/ledgers/LOCK: Permission denied
    at org.rocksdb.RocksDB.open(Native Method) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
    at org.rocksdb.RocksDB.open(RocksDB.java:239) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:196) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    ... 13 more
```

As such, in order to support OpenShift, I exposed the `fsGroupChangePolicy`, which allows for OpenShift support, but not necessarily _seamless_ support.
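
One plausible reading of the RocksDB observation is that a umask can only clear permission bits, never add them, so a file requested with `0644` cannot become group writable under `umask 002`. The following Python sketch illustrates that on a POSIX system; `created_mode` is a hypothetical helper and the file named `LOCK` lives in a temporary directory, not in the bookie's data directory.

```python
import os
import stat
import tempfile


def created_mode(requested_mode: int, umask: int) -> int:
    """Create a file with requested_mode under the given umask and return its permission bits."""
    previous = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as directory:
            path = os.path.join(directory, "LOCK")
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, requested_mode)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(previous)  # always restore the caller's umask


# A file requested with 0666 ends up group writable under umask 002.
typical = created_mode(0o666, 0o002)
# A file requested with 0644 stays 0644: the umask cannot add the group write bit.
lock_like = created_mode(0o644, 0o002)
```

This is why relaxing the umask helps tools that request permissive modes but has no effect on a file that is created with a restrictive mode in the first place.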
2022-06-13 22:11:13 -05:00
Frank Kelly
bfb6985de8
Add support for Horizontal Pod Autoscaling for Broker and Proxy. (#262)
* Add support for Horizontal Pod Autoscaling for Broker and Proxy.

* Add license
2022-05-06 08:04:13 -06:00
ran
cee3fcfe56
Bump version to 2.9.2 (#255)
* Bump version to `2.9.2`

* Because the latest Pulsar image is based on Java 11, some JVM parameters for printing GC information are no longer supported, so switch to the new JVM parameters. Refer to https://docs.oracle.com/en/java/javase/11/tools/java.html#GUID-BE93ABDC-999C-4CB5-A88B-1994AAAC74D5 and https://issues.redhat.com/browse/CLOUD-3040.

original param | new param
--|--
`-XX:+PrintGCDetails` | `-Xlog:gc*`
`-XX:+PrintGCApplicationStoppedTime` | `-Xlog:safepoint`
`-XX:+PrintHeapAtGC` | `-Xlog:gc+heap=trace`
`-XX:+PrintGCTimeStamps` | `-Xlog:gc::utctime`
* remove JVM param `-XX:G1LogLevel=finest`
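
The mapping in the table above can be applied mechanically to an existing flag string. The sketch below is illustrative only: `migrate_gc_flags` is a hypothetical helper, not part of the chart, and it handles just the flags listed in this commit message.

```python
# Old Java 8 style flag -> Java 11 unified logging equivalent (from the table above).
LEGACY_TO_UNIFIED = {
    "-XX:+PrintGCDetails": "-Xlog:gc*",
    "-XX:+PrintGCApplicationStoppedTime": "-Xlog:safepoint",
    "-XX:+PrintHeapAtGC": "-Xlog:gc+heap=trace",
    "-XX:+PrintGCTimeStamps": "-Xlog:gc::utctime",
}

# Flags that are dropped without a replacement.
REMOVED = {"-XX:G1LogLevel=finest"}


def migrate_gc_flags(flags: str) -> str:
    """Rewrite a space-separated JVM flag string using the mapping above."""
    migrated = []
    for flag in flags.split():
        if flag in REMOVED:
            continue
        migrated.append(LEGACY_TO_UNIFIED.get(flag, flag))  # unknown flags pass through
    return " ".join(migrated)


result = migrate_gc_flags("-XX:+UseG1GC -XX:+PrintGCDetails -XX:G1LogLevel=finest")
```

Flags that are not in the table are left untouched, so the helper can be run over a complete `PULSAR_GC` style string.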
2022-04-11 15:33:29 +08:00
Chirag Modi
192b3ca2ef
Remove completed init jobs using ttl (#235)
* feat: added ttlSecondsAfterFinished configuration to delete completed jobs

* added comments for clarification
2022-02-23 08:24:37 -08:00
Lari Hotari
1c4f745941
Improve Zookeeper "ruok" probes: use TLS port when TLS is enabled, specify "-q 1" for nc (#223)
- NOTICE: we no longer use "bin/pulsar-zookeeper-ruok.sh" from the apachepulsar/pulsar docker image. The probe script is part of the chart.

* Pass "-q 1" to netcat (nc) to fix issue with Zookeeper ruok probe

- see https://github.com/apache/pulsar/pull/14088

* Send ruok to TLS port when TLS is enabled

* Bump chart version
2022-02-17 07:48:20 +02:00
Frank Kelly
9613ee0292
Make PodSecurityPolicy name unique in k8s cluster when rbac.limit_to_namespace is true (#224)
- allows having multiple Pulsar clusters in different K8S namespaces but having the same helm release name
  - PodSecurityPolicy is a cluster-level-resource and name would collide without this change
2022-02-04 10:41:10 +02:00
Lari Hotari
dd0e6d827d
Increase Zookeeper probe timeouts (#220)
- 5 seconds seems to be too short a probe timeout on a system with low resources such as in CI
2022-01-31 19:24:19 +02:00
MMeent
c0a8c1b97f
Use the 'pulsar.matchLabels' template for matching components of this chart. (#118)
This also limits the scope of the PodMonitors to the resources of only this install, instead of all installs that share `component:` label values.

Co-authored-by: Matthias van de Meent <matthias.vandemeent@cofano.nl>
2022-01-26 15:38:52 +02:00
Lari Hotari
41ff20ec5e
Don't enable pulsar manager by default (#213)
- because of security reasons
  - it increases the attack surface
- it's an unnecessary feature for most users
  - wasted resource consumption
2022-01-26 15:34:30 +02:00
Lari Hotari
fdf9dd7757
Add -XX:+ExitOnOutOfMemoryError to Zookeeper's PULSAR_GC parameters in default values.yaml (#211) 2022-01-26 15:34:07 +02:00