Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for L7-ILB (take 2) #789

Merged
merged 1 commit into from
Jul 31, 2019

Conversation

spencerhance
Copy link
Contributor

@spencerhance spencerhance commented Jun 29, 2019

This PR supersedes #773

Adds support for the L7 Internal Load Balancer to the controller. A follow up PR with e2e tests will be added.

In the interest of time, I've added the L7-ILB work on top of #788 so that it can be reviewed in tandem. Unless you foresee gigantic changes in that PR, I would review this as well so it can be merged shortly after. The first commit on this PR is from that branch.

Currently, the feature is flag-gated with the '--enable-l7-ilb' flag. Enabling the flag will require resolving #767

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 29, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @spencerhance. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 29, 2019
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jun 29, 2019
@rramkumar1
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 29, 2019
@spencerhance
Copy link
Contributor Author

/cc @freehan

@k8s-ci-robot k8s-ci-robot requested a review from freehan July 1, 2019 20:36
@rramkumar1
Copy link
Contributor

@spencerhance Can you squash commits so that it's easier to see which commits are from the parent PR?

@spencerhance
Copy link
Contributor Author

@rramkumar1 good call, should be good now

pkg/backends/syncer.go Outdated Show resolved Hide resolved
@@ -88,3 +88,15 @@ func nodePorts(svcPorts []utils.ServicePort) []int64 {
}
return ports
}

func UpdateServicePortsForILB(ingSvcPorts []utils.ServicePort, ing *extensions.Ingress) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there is a better way to do this. You should probably fail at validation if there is service backend that does not have NEG enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but I do need to also handle the default backend case

@@ -78,6 +78,7 @@ func (fr *FirewallRules) Sync(nodeNames, additionalPorts []string) error {
sort.Strings(targetTags)

ports := sets.NewString(additionalPorts...)
fr.srcRanges = append(fr.srcRanges, additionalRanges...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to have a separate firewall rule for ILB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not a separate firewall rule, rather adding the ILB range to the firewall rule since that is a requirement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think I misunderstood your question. It's possible to add another rule but I'm not sure if that would make much of a difference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could make the features more error independent. For instance, you could get an error updating ILB ranges, but still successfully update NEG

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good point, I'll look at adding this today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it will require non-trivial refactoring since we currently only support one firewall. It's still possible, but it might make more sense to just update the src ranges separately to catch two different errors.

Copy link
Member

@bowei bowei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like we need more unit tests?

pkg/backends/backends.go Outdated Show resolved Hide resolved
pkg/backends/backends.go Outdated Show resolved Hide resolved
pkg/backends/features/features.go Outdated Show resolved Hide resolved
pkg/backends/syncer.go Outdated Show resolved Hide resolved
pkg/backends/syncer_test.go Outdated Show resolved Hide resolved
pkg/healthchecks/healthchecks.go Outdated Show resolved Hide resolved
pkg/loadbalancers/features/l7ilb.go Outdated Show resolved Hide resolved
pkg/loadbalancers/features/l7ilb.go Show resolved Hide resolved
pkg/loadbalancers/features/l7ilb.go Outdated Show resolved Hide resolved
pkg/utils/utils.go Outdated Show resolved Hide resolved
@spencerhance spencerhance force-pushed the add-l7-ilb-take-2 branch 11 times, most recently from b5c84d4 to 2872362 Compare July 8, 2019 21:33
@spencerhance
Copy link
Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 19, 2019
@spencerhance spencerhance mentioned this pull request Jul 22, 2019
10 tasks
pkg/utils/utils.go Show resolved Hide resolved
pkg/utils/utils.go Outdated Show resolved Hide resolved
pkg/loadbalancers/l7.go Outdated Show resolved Hide resolved
pkg/loadbalancers/l7.go Outdated Show resolved Hide resolved
pkg/loadbalancers/features/l7ilb.go Show resolved Hide resolved
pkg/healthchecks/healthchecks.go Show resolved Hide resolved
pkg/healthchecks/healthchecks.go Show resolved Hide resolved
pkg/healthchecks/healthchecks.go Show resolved Hide resolved
pkg/healthchecks/healthchecks.go Show resolved Hide resolved
pkg/firewalls/controller.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 23, 2019
pkg/backends/syncer_test.go Outdated Show resolved Hide resolved
pkg/composite/utils.go Show resolved Hide resolved
pkg/composite/utils.go Show resolved Hide resolved
versions := []meta.Version{meta.VersionGA}
keys := []*meta.Key{key1}
versions := []meta.Version{meta.VersionGA, meta.VersionAlpha}
keys := []*meta.Key{key1, key2}

for i := range versions {
list, err := ListBackendServices(gceCloud, keys[i], versions[i])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still seems quite strange to me that the list operation takes a key that it then only looks @ the scope for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, there's a TODO in pkg/composite/gen/main.go from when we last discussed it. I can update here or in another PR depending on your preference

pkg/firewalls/firewalls_test.go Outdated Show resolved Hide resolved
pkg/firewalls/firewalls_test.go Outdated Show resolved Hide resolved
return subnet.IpCidrRange, nil
}
}
return "", fmt.Errorf("L7 ILB Subnet not found in region: %s", region)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could do

var ErrSubnetNotFound = errors.New(""primary subnet not found")

Copy link
Member

@bowei bowei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the remaining issues and we can merge this. There are a lot of TODOs, but should pick them up in follow on PRs.

@@ -54,6 +54,7 @@ const (
IngressClassKey = "kubernetes.io/ingress.class"
GceIngressClass = "gce"
GceMultiIngressClass = "gce-multi-cluster"
GceL7ILBIngressClass = "gce-l7-ilb"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this "gce-ilb". L7 is already implicit in the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it to "gce-internal," which I believe was the original name fwiw

}

if err != nil || len(hs.HealthStatus) == 0 || hs.HealthStatus[0] == nil {
return "Unknown"
return "Unknown", fmt.Errorf("error getting health for backend %s: %v", name, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for strings that come from input, use %q

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if flags.F.EnableL7Ilb {
backends, err = composite.ListAllBackendServices(b.cloud)
} else {
backends, err = composite.ListBackendServices(b.cloud, meta.GlobalKey(""), meta.VersionGA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO that this needs to be changed to not take a key

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

func ListAllUrlMaps(gceCloud *gce.Cloud) ([]*UrlMap, error) {
resultMap := map[string]*UrlMap{}
key1, err := CreateKey(gceCloud, "", meta.Global)
if err != nil {
return nil, err
}
key2, err := CreateKey(gceCloud, "", meta.Regional)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// TODO(shance): this all needs to be replaced with `scope` and not key.
globalKey, err := CreateKey(gceCloud, "", meta.Global)
if err != nil {
  return nil, err
}
regionalKey, err := ...

for _, vk := range []struct {
  v meta.Version
  k meta.Key // TODO: change this to scope
} {
  {meta.VersionGA, GlobalKey},
  {meta.VersionAlpha, RegionalKey},
} {
  ...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -163,9 +170,13 @@ func (fwc *FirewallController) sync(key string) error {
return err
}
negPorts := fwc.translator.GatherEndpointPorts(gceSvcPorts)
additionalRanges, err := fwc.ilbFirewallSrcRanges(gceIngresses)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flag gate this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -123,7 +126,11 @@ func (fwc *FirewallController) ToSvcPorts(ings []*extensions.Ingress) []utils.Se
var knownPorts []utils.ServicePort
for _, ing := range ings {
urlMap, _ := fwc.translator.TranslateIngress(ing, fwc.ctx.DefaultBackendSvcPortID)
knownPorts = append(knownPorts, urlMap.AllServicePorts()...)
svcPorts := urlMap.AllServicePorts()
if utils.IsGCEL7ILBIngress(ing) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be if enableILBL7 && utils.IsGCEL7Ingress(ing) so it's completely disabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

for _, ing := range gceIngresses {
if utils.IsGCEL7ILBIngress(ing) {
ilbEnabled = true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

break

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

func (fwc *FirewallController) ilbFirewallSrcRanges(gceIngresses []*extensions.Ingress) ([]string, error) {
var additionalRanges []string
ilbEnabled := false
if flags.F.EnableL7Ilb {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like hiding the enablement inside the function. Then when we unit test this, it has different behavior based on a global variable flag

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if err != nil {
return nil, fmt.Errorf("error unable to get ILB subnet source ranges: %v", err)
}
additionalRanges = append(additionalRanges, L7ILBSrcRange)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to append if we are not in a loop

return L7ILBSrcRange, nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -46,10 +48,37 @@ var (
meta.Zonal: {fakeZonalFeature},
}

fakeVersionToFeatures = map[meta.Version][]string{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can reduce verbosity here:

makeEntry := func(ga, alpha, beta meta.Version) {
  return &map[meta.Version]{   meta.VersionGA: {ga}, meta.VersionAlpha:{alpha}, meta.VersionBeta:{beta}   }
}
...
UrlMap: ...,
ForwardingRule: makeEntry(fakeGaFeature, fakeAlphaFeature, fakeBetaFeature),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 30, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 30, 2019
@spencerhance spencerhance force-pushed the add-l7-ilb-take-2 branch 2 times, most recently from afae10b to 2de3d93 Compare July 30, 2019 21:53
@@ -94,6 +94,7 @@ func (l *instanceGroupLinker) Link(sp utils.ServicePort, groups []GroupKey) erro
}

// ig_linker only supports L7 HTTP(s) External Load Balancer
// Hardcoded here since IGs are not supported for non GA-Global right now
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to add a TODO here so we can find this later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Intiial support for L7-ILB and corresponding ingress class "gce-internal."
Functionality is flag-gated with "--enable-l7-ilb".
ForNEG bool
// ForILB designates whether the health check is for an ILB
// This means that the HC should be alpha + regional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you going to remove the comment?

desc: "foo ingress class",
ingress: &v1beta1.Ingress{
ObjectMeta: v1.ObjectMeta{
Annotations: map[string]string{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either do

{ ... }

or

&blah{\n
  ...\n
},\n

don't do

&blah{\n
   ...},

@bowei
Copy link
Member

bowei commented Jul 31, 2019

/lgtm
/approve

there are a ton of TODOs, but let's merge this for now and handle things in addons. The github UI is not handling PRs of this size very easily.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 31, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bowei, spencerhance

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 31, 2019
@k8s-ci-robot k8s-ci-robot merged commit 8458c23 into kubernetes:master Jul 31, 2019
@spencerhance spencerhance deleted the add-l7-ilb-take-2 branch July 31, 2019 23:31
@hach-que
Copy link

Is there a way to get this working on Google Kubernetes Engine?

I added the annotation kubernetes.io/ingress.class: "gce-internal" to an ingress on GKE, but it doesn't seem to be doing anything.

Do I have to manually deploy ingress-gce to the cluster, and disable the built-in version?

@hach-que
Copy link

hach-que commented Sep 15, 2019

For anyone trying to get this working on Google Kubernetes Engine, you need to:

  1. Make sure your GKE cluster is on 1.14. 1.13 definitely does not work.
  2. Ask your Google support channel or technical account manager to enable Alpha Access on the Compute API for the projects you want to deploy this in. This is because the GA and beta APIs do not yet surface subnets reserved for internal HTTPS load balancing (more on that in a bit).
    • If you're desperate to get around this and you're only using ILBs in one region, you can override the implementation of ILBSubnetSourceRange in i7ilb.go to just straight up return the CIDR of the subnet you reserve in step 5.
  3. Before running the setup script linked below, modify glbc.yaml and add the --enable-l7-ilb=true argument after --enable-csm=[ENABLE_CSM].
  4. Follow the instructions here: https://github.com/kubernetes/ingress-gce/blob/master/docs/deploy/gke/README.md
  5. I had to make some fixes to the script e.g. to support multiple node pools and to make API server detection more reliable, which can be found in our fork: https://github.com/networknext/ingress-gce/blob/master/docs/deploy/gke/gke-self-managed.sh
  6. You will need to remove the IAM permission from the GLBC service account and re-add Compute Admin permission. For some reason, the controller kept saying access denied until we did this.
  7. In the Google Cloud Console, go to add a new HTTPS load balancer, select "Only between my VMs", select a specific region, then you will be prompted to "Reserve subnet". Click that and choose a CIDR mask that doesn't overlap with an existing subnet (e.g. 10.28.0.0/20). Give it whatever name you like. Finish creating the load balancer (just pick whatever), and then delete it after it's spun up.
  8. If you rolled out your ingress before this point, you need to delete it as the GCE ingress controller won't have created it properly.
  9. Make sure your ingress is using networking.k8s.io/v1beta1, e.g. it should look something like this:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: blah
  annotations:
    kubernetes.io/ingress.allow-http: "false"   
    kubernetes.io/ingress.class: "gce-internal"
spec:
  backend:
    serviceName: blah
    servicePort: grpc

@spencerhance
Copy link
Contributor Author

@hach-que I'm glad you were able to to get it working! As you figured out, the feature hasn't been released to GKE yet but can be used if you deploy the controller yourself. As for step 3, you can actually avoid enabling CSM by adding the enable NEG annotation to your k8s service. Here is an example service yaml:

apiVersion: v1
kind: Service
metadata:
  name: internal-lb-svc
  namespace: default
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
spec:
  ports:
  - name: host1
    port: 80
    protocol: TCP
    targetPort: 9376
  selector:
    run: internal-lb-app
  type: ClusterIP

Also feel free to file issues if you notice any bugs 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants