USHIFT-3133: Skip cloud provider disruption monitors for MicroShift #28767
Conversation
@pacevedom: This pull request references USHIFT-3133 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Why does MicroShift not have an Infrastructure resource? I'm slightly worried there could be a lot of plumbing out there using that to determine what the cluster is.
There are several reasons for doing the change this way:
However, if you think we should go with a deeper change for all the disruption tests, that's also OK with me. I haven't checked for any further issues, though, so this was just a brief analysis.
Does MicroShift deploy this shim by default? I'm not aware of all the history here, but all these APIs sound pretty core to OpenShift to me. It seems like the absence of these could cause a lot of problems with operator portability, things suddenly breaking when you deploy in MicroShift. If I had to guess, I'd imagine their absence causes a lot of problems for you all, is that the case? Could they be present by default with static responses indicating that this is a MicroShift cluster? Is there another way to detect you're running in a MicroShift cluster?
ClusterVersion is also broadly used and, I would think, a very important API to have present in any OpenShift cluster, but in this specific case that error handling can probably just be adjusted in origin.
Disruption testing will tell you if you're losing connectivity to a variety of endpoints in the cluster during your test run. It's a framework capable of catching the very worst of bugs before they make it to customers. We consider it very high priority, and it seems like it would also be a very good signal that Microshift is healthy. There are some cloud provider endpoint checks we use to gather broad data that we can reach cloud X, but these have no impact on the cluster under test, they run on the CI cluster and poll static endpoints in the clouds. I don't think any adjustments should be needed for Microshift jobs there afaict.
The shim is something you need to configure manually before running the tests, as openshift-tests checks locally whether these files exist. Right now we are working on a PR to include a simple shim with both.
The absence of these API groups dates back to the origins of MicroShift, where only a minimal set of APIs was included on purpose (SCCs and routes, specifically). As for operators, we only recently added support for OLM, and several upstream operators seem to work (cert-manager, for example). Any OCP operators deployed on MicroShift cannot assume these API groups are present.
The "official" way is to look for the
I am afraid this is also what happens with the other config API group resources. In this case, though, the version for MicroShift is taken from the
You mean handling errors from non-existing
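Since the thread mentions an "official" way to detect a MicroShift cluster and notes that the config API groups are simply absent there, one possible detection approach is to check whether the `config.openshift.io` group is served at all. A hypothetical sketch (the helper name and the hard-coded group lists are illustrative; in a real cluster the list would come from the discovery client):

```go
package main

import "fmt"

// looksLikeMicroShift reports whether a cluster appears to be MicroShift,
// under the assumption that MicroShift does not serve config.openshift.io
// (home of Infrastructure, ClusterVersion, etc.). The servedGroups slice
// would come from API discovery in a real implementation.
func looksLikeMicroShift(servedGroups []string) bool {
	for _, g := range servedGroups {
		if g == "config.openshift.io" {
			return false // core config APIs present: regular OpenShift
		}
	}
	return true
}

func main() {
	ocp := []string{"apps", "config.openshift.io", "route.openshift.io"}
	microshift := []string{"apps", "route.openshift.io", "security.openshift.io"}
	fmt.Println(looksLikeMicroShift(ocp))        // false
	fmt.Println(looksLikeMicroShift(microshift)) // true
}
```

Detecting by group absence is fragile (any minimal cluster would match), which is part of why the thread asks whether MicroShift could instead serve these APIs with static responses.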
I agree with the importance of these tests, in fact we are considering introducing the same logic to our own e2e tests, but there are some significant differences.
MicroShift does not really use the cloud in the ways OpenShift may use it. In fact, we are only using AWS as a way of getting VMs where we install our packages.
These disruption monitors are just polling from the CI build cluster to static cloud endpoints; they won't impact the MicroShift cluster under test, so I'm surprised that's hitting the infra API. However, we can live with MicroShift not running them; we've got lots of signal from all the other jobs. /lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: dgoodwin, pacevedom. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/retest-required |
@pacevedom: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Job Failure Risk Analysis for sha: 5fe6dc8
/hold Revision 5fe6dc8 was retested 3 times: holding
/retest-required |
/hold cancel |
Merged commit 8045e28 into openshift:master
These monitors are permafailing in MicroShift because their requirements are not met there:
https://prow.ci.openshift.org/job-history/gs/test-platform-results/logs/periodic-ci-openshift-microshift-main-e2e-aws-ovn-ocp-conformance
/cc @dgoodwin
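The fix the PR title describes, skipping the cloud provider disruption monitors when the cluster is MicroShift, can be sketched as a simple guard. Everything here is hypothetical (function names, monitor names, and the boolean flag are illustrative, not the PR's actual code):

```go
package main

import "fmt"

// cloudMonitorsToRun filters the cloud provider disruption monitors.
// isMicroShift is a hypothetical flag derived elsewhere (for example,
// from the absence of the config.openshift.io API group).
func cloudMonitorsToRun(isMicroShift bool, all []string) []string {
	if isMicroShift {
		// Cloud provider disruption monitors permafail on MicroShift
		// because their requirements (Infrastructure, ClusterVersion)
		// are not met there, so skip them entirely.
		return nil
	}
	return all
}

func main() {
	monitors := []string{"aws-network-liveness", "gcp-network-liveness"}
	fmt.Println(cloudMonitorsToRun(true, monitors))  // []
	fmt.Println(cloudMonitorsToRun(false, monitors)) // [aws-network-liveness gcp-network-liveness]
}
```

Skipping at registration time, rather than letting the monitors run and fail, keeps the permafailing signal out of MicroShift job history without touching the monitors' own logic.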