GCE Ingress creates a Network Endpoint Group with 0 configured #832

Closed
giladsh1 opened this issue Aug 21, 2019 · 5 comments · Fixed by #917
@giladsh1

GCE Ingress creates a network endpoint group with 0 endpoints configured when the Service's targetPort is set to the port name instead of the actual port number.
This config does not work: it essentially creates a useless network endpoint group that never recognises the pods.

apiVersion: v1
kind: Service
metadata:
  name: mgmt
  namespace: riscale-test
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    beta.cloud.google.com/backend-config: '{"ports": {"8080":"mgmt-service-backend"}}'
spec:
  selector:
    app: mgmt
  type: NodePort
  ports:
    - port: 8080
      targetPort: mgmt-port
      protocol: TCP

However, when the targetPort is changed to 8080, the network endpoint group recognises the running pods.
This was a tough bug to catch :-(
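
For anyone hitting the same thing, a minimal sketch of the workaround, i.e. the same Service as above with the named targetPort replaced by the numeric container port (everything else unchanged):

apiVersion: v1
kind: Service
metadata:
  name: mgmt
  namespace: riscale-test
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    beta.cloud.google.com/backend-config: '{"ports": {"8080":"mgmt-service-backend"}}'
spec:
  selector:
    app: mgmt
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080   # numeric instead of "mgmt-port" -- with this the NEG picks up the pods
      protocol: TCP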

@rramkumar1
Contributor

@giladsh1 It would be helpful if you could post the specification for your Deployment / Pods.

/assign @freehan

@giladsh1
Author

@rramkumar1 Adding the deployment config as requested:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mgmt
  namespace: riscale-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mgmt
  template:
    metadata:
      labels:
        app: mgmt
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: cloud.google.com/gke-nodepool
                    operator: In
                    values:
                      - default-pool
      restartPolicy: Always
      containers:
        - name: mgmt
          image: eu.gcr.io/riscale/mgmt
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              cpu: 50m
              memory: 10Mi
            limits:
              cpu: 150m
              memory: 50Mi
          ports:
            - name: mgmt-port
              containerPort: 8080
          readinessProbe:
            httpGet:
              path: /m/health
              port: mgmt-port
            initialDelaySeconds: 2
            periodSeconds: 15
            successThreshold: 2
            failureThreshold: 4
          livenessProbe:
            httpGet:
              path: /m/health
              port: mgmt-port
            periodSeconds: 15
            failureThreshold: 4
          envFrom:
            - configMapRef:
                name: common-config
            - configMapRef:
                name: service-discovery
            - configMapRef:
                name: postgres-config
            - configMapRef:
                name: mongo-config
            - secretRef:
                name: rethinkdb-secrets
            - secretRef:
                name: postgres-secrets
          env:
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /var/secrets/google/mgmt-kms-encrypt.json
          volumeMounts:
            - name: google-cloud-key
              mountPath: /var/secrets/google
              readOnly: true
      volumes:
        - name: google-cloud-key
          secret:
            secretName: mgmt-kms-encrypt-secret
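
For reference, the two pieces the NEG controller has to connect here are the Service's named targetPort and the container port definition above. A trimmed sketch of the relevant lines from the two manifests:

# From the Service
ports:
  - port: 8080
    targetPort: mgmt-port   # port name, expected to resolve against the container ports
# From the Deployment's container
ports:
  - name: mgmt-port
    containerPort: 8080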

@axot

axot commented Sep 3, 2019

We are suffering from exactly the same issue.

@axot

axot commented Sep 3, 2019

After changing targetPort to the port number, the ingress works now, but we found that pods in one zone never become ready. We deployed our pods across zones a, b, and c; in our case the pod in zone b had the issue. READINESS GATES shows 0/1, the endpoint never gets added to the service, and the deployment rolling update never completes.

$ kgp -o wide
NAME                       READY   STATUS    RESTARTS   AGE     IP             NODE                                                  NOMINATED NODE   READINESS GATES
haproxy-56f8cbc54f-96vqh   1/1     Running   0          4h36m   10.123.131.3   gke-done-production-proxy-1-haproxy-0-4ea1ef33-r08b   <none>           1/1
haproxy-56f8cbc54f-vt2c8   1/1     Running   0          52m     10.123.129.6   gke-done-production-proxy-1-haproxy-0-ec94ed76-hpkn   <none>           1/1
haproxy-67d4497588-8rtdr   1/1     Running   0          12m     10.123.140.3   gke-done-production-proxy-1-haproxy-0-ec94ed76-q2zw   <none>           1/1
haproxy-67d4497588-tg9zt   1/1     Running   0          12m     10.123.141.2   gke-done-production-proxy-1-haproxy-0-97c8c492-949s   <none>           0/1

UPDATE:
We recreated all resources in that namespace, and the issue went away.
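
For context, the 0/1 under READINESS GATES should correspond to the NEG readiness gate condition on the pod, which on the stuck pod presumably never flips to True. A rough sketch of what the pod status would look like in that state (the condition type cloud.google.com/load-balancer-neg-ready is the GKE NEG readiness gate as we understand it; the exact values here are illustrative, not copied from our cluster):

status:
  conditions:
    - type: cloud.google.com/load-balancer-neg-ready   # NEG readiness gate (assumed condition type)
      status: "False"                                   # never becomes True, so READINESS GATES stays 0/1
    - type: ContainersReady
      status: "True"                                    # containers themselves are ready (READY 1/1)
    - type: Ready
      status: "False"                                   # pod not Ready while the readiness gate is unsatisfied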

@freehan
Contributor

freehan commented Sep 20, 2019

Okay, I think I uncovered the problem. Will add a fix and an e2e test for this.
