pulsar-helm-chart/pulsar/templates/proxy-deployment.yaml
roman-popenov 56e0d05e25 [Issue-5994]: Start proxy pods when at least one broker pod is running (#6158)
### Motivation
Fixes #5994:
If the proxy service comes up before the brokers are up and reachable, there will be HTTP 403 errors when running `bin/pulsar-admin` commands from inside the proxy pod.
 
The proxy will also not be able to connect to the brokers when data is pushed through the binary port, failing with the following error:
```bash
Caused by: org.apache.pulsar.broker.service.BrokerServiceException$PersistenceException: org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty bookies available
	... 14 more
Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty bookies available
22:11:07.633 [pulsar-web-32-6] INFO  org.eclipse.jetty.server.RequestLog - 172.17.0.6 - - [24/Jan/2020:22:11:07 +0000] "PUT /admin/v2/persistent/public/functions/assignments HTTP/1.1" 500 2528 "-" "Pulsar-Java-v2.5.0" 280
```

#### Workaround:
Restart the proxy pods once the broker pods are running.

#### Proposed solution:
Hold off starting the proxies until at least one broker is reachable in the cluster.

### Modifications

The changes are in the `proxy-deployment.yaml` Helm template file, which defines a new init container that runs before the proxy is started. The init container waits until at least one broker is reachable by running `nslookup` against the broker service, sleeping 30 seconds between retries, and retrying up to the number of broker replicas (a standalone sketch of the loop follows below).

An alternative solution that doesn't always work was `until nslookup broker-service; do sleep 2; done;`, but a 403 would still sometimes occur (it could have been a fluke, but I saw it happen once).
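
For reference outside the chart, here is the same wait loop as a standalone script; `pulsar-broker` and the replica count of `3` are placeholders for the values the template fills in from `.Values`:

```bash
# Standalone sketch of the wait loop. "pulsar-broker" and the replica
# count of 3 are placeholders; the template derives both from chart values.
for i in {0..3}; do
  # nslookup prints one "Name:" line per resolvable record, so a count
  # of at least 1 means the broker service resolves to an endpoint.
  brokerServiceNumber="$(nslookup -timeout=10 pulsar-broker | grep Name | wc -l)"
  if [[ ${brokerServiceNumber} -ge 1 ]]; then
    break
  fi
  sleep 30
done
```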

### Verifying this change
1) Follow the instructions on deploying with Helm and run:
`helm install pulsar --values pulsar/values-mini.yaml ./pulsar/`.
2) Wait until all the services are up and running.
3) Connect to a proxy pod and run `bin/pulsar-admin broker-stats monitoring-metrics` - no 403 or permission errors should arise.
4) Set up a tenant and a namespace (example commands follow this list).
5) Push data into a topic - there should be no errors in the proxy logs, and the client should be able to push data into the cluster through the proxies.
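
For steps 4 and 5, a minimal command sequence, assuming the cluster name `pulsar` and using hypothetical tenant/namespace/topic names, would look like this:

```bash
# Run from inside the proxy pod; all names below are placeholders.
bin/pulsar-admin tenants create my-tenant --allowed-clusters pulsar
bin/pulsar-admin namespaces create my-tenant/my-namespace
bin/pulsar-client produce my-tenant/my-namespace/my-topic --messages "hello-pulsar"
```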

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
{{- if .Values.extra.proxy }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: "{{ template "pulsar.fullname" . }}-{{ .Values.proxy.component }}"
  namespace: {{ .Values.namespace }}
  labels:
    app: {{ template "pulsar.name" . }}
    chart: {{ template "pulsar.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
    component: {{ .Values.proxy.component }}
    cluster: {{ template "pulsar.fullname" . }}
spec:
  replicas: {{ .Values.proxy.replicaCount }}
  selector:
    matchLabels:
      app: {{ template "pulsar.name" . }}
      release: {{ .Release.Name }}
      component: {{ .Values.proxy.component }}
  template:
    metadata:
      labels:
        app: {{ template "pulsar.name" . }}
        release: {{ .Release.Name }}
        component: {{ .Values.proxy.component }}
        cluster: {{ template "pulsar.fullname" . }}
      annotations:
{{ toYaml .Values.proxy.annotations | indent 8 }}
    spec:
    {{- if .Values.proxy.nodeSelector }}
      nodeSelector:
{{ toYaml .Values.proxy.nodeSelector | indent 8 }}
    {{- end }}
    {{- if .Values.proxy.tolerations }}
      tolerations:
{{ toYaml .Values.proxy.tolerations | indent 8 }}
    {{- end }}
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - "{{ template "pulsar.name" . }}"
              - key: "release"
                operator: In
                values:
                - {{ .Release.Name }}
              - key: "component"
                operator: In
                values:
                - {{ .Values.proxy.component }}
            topologyKey: "kubernetes.io/hostname"
      terminationGracePeriodSeconds: {{ .Values.proxy.gracePeriod }}
      initContainers:
      # This init container will wait for zookeeper to be ready before
      # deploying the proxies
      - name: wait-zookeeper-ready
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        command: ["sh", "-c"]
        args:
          - >-
            until bin/pulsar zookeeper-shell -server {{ template "pulsar.fullname" . }}-{{ .Values.zookeeper.component }} ls /admin/clusters/{{ template "pulsar.fullname" . }}; do
              sleep 3;
            done;
      # This init container will wait for at least one broker to be ready before
      # deploying the proxy
      - name: wait-broker-ready
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        command: ["bash", "-c"]
        args:
          - >-
            for i in {0..{{ .Values.broker.replicaCount }}}; do
              brokerServiceNumber="$(nslookup -timeout=10 {{ template "pulsar.fullname" . }}-{{ .Values.broker.component }} | grep Name | wc -l)"
              if [[ ${brokerServiceNumber} -ge 1 ]]; then
                break
              fi
              sleep 30;
            done;
      containers:
      - name: "{{ template "pulsar.fullname" . }}-{{ .Values.proxy.component }}"
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
      {{- if .Values.proxy.resources }}
        resources:
{{ toYaml .Values.proxy.resources | indent 10 }}
      {{- end }}
        command: ["sh", "-c"]
        args:
        - >
          bin/apply-config-from-env.py conf/proxy.conf &&
          bin/apply-config-from-env.py conf/pulsar_env.sh &&
          bin/pulsar proxy
        ports:
        - name: http
          containerPort: 8080
        envFrom:
        - configMapRef:
            name: "{{ template "pulsar.fullname" . }}-{{ .Values.proxy.component }}"
{{- end }}
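
As a quick way to observe the new behavior (not part of the chart), you can watch the proxy pods hold in the init phase until a broker resolves; the label selector below assumes the default `component` value:

```bash
# Pods report Init:0/2 or Init:1/2 while the wait containers run,
# then transition to Running once a broker is resolvable.
kubectl get pods -l component=proxy -w
```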