Remove etcd2 from non-upgrade CI tests #9672

Merged: 4 commits into kubernetes:master from rm-etcd2-from-tests on Oct 3, 2018


Member

@spiffxp spiffxp commented Oct 3, 2018

ref: #7602
ref: kubernetes/enhancements#622
ref: kubernetes/kubernetes#69310

I would like to address the upgrade tests in a later PR.

I've broken this into one commit per env file, with a list of jobs
that could be impacted by each commit.

/hold
I would like to get a gut check on how green these jobs are today before
we go flipping this on them. But the idea would be to merge all at once,
and then revert whichever commit(s) are necessary if jobs start failing.

This will affect the following jobs:
- ci-containerd-e2e-gci-gce
- ci-containerd-e2e-gci-gce-1-1
- ci-cri-containerd-e2e-gci-gce
- ci-kubernetes-e2e-gci-gce
- ci-kubernetes-e2e-gce-canary

This will affect the following jobs:
- ci-kubernetes-e2e-gci-gce-sig-cli

This affects the following jobs:
- ci-kubernetes-e2e-gci-gce-ip-alias
- ci-cri-containerd-e2e-gci-gce-ip-alias

This affects the following jobs:
- ci-kubernetes-e2e-gce-alpha-api

@k8s-ci-robot added the do-not-merge/hold, cncf-cla: yes, size/S, and area/config labels Oct 3, 2018
@k8s-ci-robot added the approved label Oct 3, 2018
Member Author

spiffxp commented Oct 3, 2018

Things that are intentionally not being addressed here:

$ ag '2\.2\.1|etcd2'
config/jobs/kubernetes/sig-cluster-lifecycle/sig-cluster-lifecycle-misc.yaml
24:      - --upgrade_args=--ginkgo.focus=\[Feature:EtcdUpgrade\] --etcd-upgrade-storage=etcd2 --etcd-upgrade-version=2.2.1

config/jobs/kubernetes/sig-cluster-lifecycle/k8s-upgrade-gce.yaml
220:      - --env=STORAGE_BACKEND=etcd2
221:      - --env=ETCD_VERSION=2.2.1
222:      - --env=ETCD_IMAGE=2.2.1
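
For anyone without `ag` installed, the same check can be sketched in Python. This is only an illustration of the search above: the function name `find_etcd2_references` and the `suffixes` filter are hypothetical, and the regex is the one passed to `ag`.

```python
import re
from pathlib import Path

# Same pattern as the `ag` invocation above.
PATTERN = re.compile(r"2\.2\.1|etcd2")


def find_etcd2_references(root, pattern=PATTERN, suffixes=(".yaml", ".env")):
    """Return (path, line_number, stripped_line) for each matching line under root.

    The suffix filter is an assumption for this sketch; ag itself searches
    every non-ignored file.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_etcd2_references("config/jobs")` from a checkout of the repo would list the remaining references, which should be limited to the upgrade jobs shown above.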

Member Author

spiffxp commented Oct 3, 2018

/cc @liggitt @BenTheElder

Member Author

spiffxp commented Oct 3, 2018

/uncc @wojtek-t @pwittrock

Member

@BenTheElder BenTheElder left a comment


/lgtm
/hold
thanks Aaron!

@k8s-ci-robot added the lgtm label Oct 3, 2018
Member

krzyzacy commented Oct 3, 2018

weeeeeee

/lgtm

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, krzyzacy, spiffxp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member Author

spiffxp commented Oct 3, 2018

Scraped testgrid for the affected jobs. There are multiple dashboards for some jobs, some with regexes, some are just dupes on different dashboards. Not bothering to filter.

| job | failing tests | status |
| --- | --- | --- |
| ci-containerd-e2e-gci-gce | 0 | "257 of 92421 tests (0.3%) and 33 of 163 runs (20.2%) failing in the past week" |
| ci-containerd-e2e-gci-gce-1-1 | 0 | "154 of 70632 tests (0.2%) and 39 of 162 runs (24.1%) failing in the past week" |
| ci-cri-containerd-e2e-gci-gce | 0 | "278 of 91854 tests (0.3%) and 34 of 162 runs (21.0%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "129 of 144840 tests (0.1%) and 40 of 255 runs (15.7%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "129 of 144840 tests (0.1%) and 40 of 255 runs (15.7%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "129 of 144840 tests (0.1%) and 40 of 255 runs (15.7%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "5 of 12495 tests (0.0%) and 5 of 255 runs (2.0%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "3 of 2805 tests (0.1%) and 3 of 255 runs (1.2%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "34 of 70635 tests (0.0%) and 25 of 255 runs (9.8%) failing in the past week" |
| ci-kubernetes-e2e-gce-canary | 0 | "4 of 7150 tests (0.1%) and 2 of 325 runs (0.6%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce-sig-cli | 0 | "6 of 5312 tests (0.1%) and 2 of 83 runs (2.4%) failing in the past week" |
| ci-cri-containerd-e2e-gci-gce-ip-alias | 0 | "137 of 46646 tests (0.3%) and 19 of 83 runs (22.9%) failing in the past week" |
| ci-kubernetes-e2e-gce-alpha-api | 0 | "3 of 6400 tests (0.0%) and 1 of 320 runs (0.3%) failing in the past week" |
| ci-kubernetes-e2e-gce-alpha-api | 0 | "3 of 6400 tests (0.0%) and 1 of 320 runs (0.3%) failing in the past week" |
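
The status strings above follow a regular shape, so they can be turned into numbers for comparison. The following is a minimal sketch, not the script used for the scrape: `JobStatus`, `parse_status`, and the regex are hypothetical names written against the strings shown in this comment only.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches e.g. "257 of 92421 tests (0.3%) and 33 of 163 runs (20.2%) failing ..."
STATUS_RE = re.compile(
    r"(?P<failed_tests>\d+) of (?P<total_tests>\d+) tests \([\d.]+%\) and "
    r"(?P<failed_runs>\d+) of (?P<total_runs>\d+) runs \([\d.]+%\) failing"
)


@dataclass
class JobStatus:
    failed_tests: int
    total_tests: int
    failed_runs: int
    total_runs: int

    @property
    def run_failure_rate(self) -> float:
        """Fraction of runs with at least one failure (0.0 if there were no runs)."""
        return self.failed_runs / self.total_runs if self.total_runs else 0.0


def parse_status(status: str) -> Optional[JobStatus]:
    """Parse a summary string; return None for "All passing" or unrecognized text."""
    match = STATUS_RE.search(status)
    if match is None:
        return None
    return JobStatus(**{key: int(value) for key, value in match.groupdict().items()})
```

Returning `None` for "All passing in the past week" keeps the parser strict about what it recognizes; a caller that wants a zero-failure record for those rows would need the run count from elsewhere.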
ci-kubernetes-e2e-gce-alpha-api 0 "3 of 6400 tests (0.0%) and 1 of 320 runs (0.3%) failing in the past week"

Member Author

spiffxp commented Oct 3, 2018

/hold cancel
let these gather data overnight (PT) when code under test is less likely to change

@k8s-ci-robot removed the do-not-merge/hold label Oct 3, 2018
@k8s-ci-robot merged commit e0a7f8b into kubernetes:master Oct 3, 2018
@k8s-ci-robot
Contributor

@spiffxp: Updated the job-config configmap using the following files:

  • key sig-cli-config.yaml using file config/jobs/kubernetes/sig-cli/sig-cli-config.yaml

@spiffxp spiffxp deleted the rm-etcd2-from-tests branch October 3, 2018 04:43
Member

liggitt commented Oct 3, 2018

> let these gather data overnight (PT) when code under test is less likely to change

we are all agog

Member Author

spiffxp commented Oct 3, 2018

Visual inspection shows no appreciable change in flakiness, no new sudden failures.

| job | failing tests | status |
| --- | --- | --- |
| ci-containerd-e2e-gci-gce | 0 | "263 of 92910 tests (0.3%) and 35 of 163 runs (21.5%) failing in the past week" |
| ci-containerd-e2e-gci-gce-1-1 | 0 | "155 of 71231 tests (0.2%) and 39 of 163 runs (23.9%) failing in the past week" |
| ci-cri-containerd-e2e-gci-gce | 0 | "279 of 92910 tests (0.3%) and 34 of 163 runs (20.9%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "140 of 146432 tests (0.1%) and 41 of 256 runs (16.0%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "140 of 146432 tests (0.1%) and 41 of 256 runs (16.0%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "140 of 146432 tests (0.1%) and 41 of 256 runs (16.0%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "5 of 12544 tests (0.0%) and 5 of 256 runs (2.0%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "3 of 2816 tests (0.1%) and 3 of 256 runs (1.2%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "1 of 9472 tests (0.0%) and 1 of 256 runs (0.4%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "41 of 71168 tests (0.1%) and 28 of 256 runs (10.9%) failing in the past week" |
| ci-kubernetes-e2e-gce-canary | 0 | "4 of 7150 tests (0.1%) and 2 of 325 runs (0.6%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce-sig-cli | 0 | "3 of 5312 tests (0.1%) and 1 of 83 runs (1.2%) failing in the past week" |
| ci-cri-containerd-e2e-gci-gce-ip-alias | 0 | "133 of 46978 tests (0.3%) and 19 of 83 runs (22.9%) failing in the past week" |
| ci-kubernetes-e2e-gce-alpha-api | 0 | "3 of 6400 tests (0.0%) and 1 of 320 runs (0.3%) failing in the past week" |
| ci-kubernetes-e2e-gce-alpha-api | 0 | "3 of 6400 tests (0.0%) and 1 of 320 runs (0.3%) failing in the past week" |
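
The visual inspection could be backed by a small before/after comparison of run failure rates. The sketch below is an assumption-laden illustration, not what was actually done here: `run_failure_deltas` is a hypothetical helper, the 5-percentage-point threshold is arbitrary, and the sample data takes one row per job from the two tables in this thread.

```python
def _percent(failed, total):
    """Run failure rate in percent; 0.0 when there were no runs."""
    return 100.0 * failed / total if total else 0.0


def run_failure_deltas(before, after, threshold_points=5.0):
    """Compare per-job run failure rates between two snapshots.

    before/after map job name -> (failed_runs, total_runs).
    Returns {job: (before_pct, after_pct, flagged)}, where flagged is True if
    the rate rose by more than threshold_points (an arbitrary cutoff).
    """
    report = {}
    for job in sorted(before.keys() & after.keys()):
        b_pct = _percent(*before[job])
        a_pct = _percent(*after[job])
        flagged = (a_pct - b_pct) > threshold_points
        report[job] = (round(b_pct, 1), round(a_pct, 1), flagged)
    return report


# One row per job, copied from the two snapshots posted in this thread.
before = {
    "ci-containerd-e2e-gci-gce": (33, 163),
    "ci-kubernetes-e2e-gce-canary": (2, 325),
    "ci-kubernetes-e2e-gci-gce-sig-cli": (2, 83),
}
after = {
    "ci-containerd-e2e-gci-gce": (35, 163),
    "ci-kubernetes-e2e-gce-canary": (2, 325),
    "ci-kubernetes-e2e-gci-gce-sig-cli": (1, 83),
}

report = run_failure_deltas(before, after)
```

With these inputs none of the three jobs is flagged. A weekly window that mostly predates the merge moves slowly, so a comparison like this is a coarse signal at best.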

Member Author

spiffxp commented Oct 3, 2018

I'm still going to keep an eye on ci-kubernetes-e2e-gci-gce, since it's on sig-release-master-blocking, in case flakiness changes noticeably over the next N days. But this gives me enough confidence to move on to the upgrade tests.

Member Author

spiffxp commented Oct 8, 2018

Checking back in a while later now that etcd2 code has been removed from k/k. Still no appreciable difference.

| job | failing tests | status |
| --- | --- | --- |
| ci-containerd-e2e-gci-gce | 0 | "204 of 94380 tests (0.2%) and 61 of 165 runs (37.0%) failing in the past week" |
| ci-containerd-e2e-gci-gce-1-1 | 0 | "136 of 72105 tests (0.2%) and 40 of 165 runs (24.2%) failing in the past week" |
| ci-cri-containerd-e2e-gci-gce | 0 | "168 of 94380 tests (0.2%) and 51 of 165 runs (30.9%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "202 of 147261 tests (0.1%) and 61 of 257 runs (23.7%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "202 of 147261 tests (0.1%) and 61 of 257 runs (23.7%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "202 of 147261 tests (0.1%) and 61 of 257 runs (23.7%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "3 of 12593 tests (0.0%) and 3 of 257 runs (1.2%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "2 of 2827 tests (0.1%) and 2 of 257 runs (0.8%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "2 of 9509 tests (0.0%) and 2 of 257 runs (0.8%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "All passing in the past week" |
| ci-kubernetes-e2e-gci-gce | 0 | "68 of 71703 tests (0.1%) and 47 of 257 runs (18.3%) failing in the past week" |
| ci-kubernetes-e2e-gce-canary | 0 | "4 of 7150 tests (0.1%) and 2 of 325 runs (0.6%) failing in the past week" |
| ci-kubernetes-e2e-gci-gce-sig-cli | 0 | "5 of 5376 tests (0.1%) and 2 of 84 runs (2.4%) failing in the past week" |
| ci-cri-containerd-e2e-gci-gce-ip-alias | 0 | "66 of 47628 tests (0.1%) and 21 of 84 runs (25.0%) failing in the past week" |
| ci-kubernetes-e2e-gce-alpha-api | 0 | "3 of 6500 tests (0.0%) and 1 of 325 runs (0.3%) failing in the past week" |
| ci-kubernetes-e2e-gce-alpha-api | 0 | "3 of 6500 tests (0.0%) and 1 of 325 runs (0.3%) failing in the past week" |
