Allow other location account #104
Conversation
- let the user run on any container image server and account
- let the user use either docker or podman
- more common code shared by all wrappers (build_wrapper_image)
- ensure the ripsaw namespace goes away before each wrapper test
- describe how to run the tests in README.md
- log where each image got pulled from, so we know it came from the location and account that we specify

Now I can actually reproduce CI failures on my laptop to fix them.
Here's what happened when I ran it:

```
[bengland@bene-laptop-6 snafu]$ awk '/PASS/||/FAIL/' full_ci.log
```
I don't know why cluster loader fails; I didn't touch it.

/rerun all
Results for SNAFU CI Test
ci/common.sh
if [ "$image_builder" = "docker" ] ; then | ||
docker build --tag=$image_spec -f $wrapper_dir/Dockerfile . && docker push $image_spec | ||
elif [ "$image_builder" = "podman" ] ; then | ||
buildah bud --tag $image_spec -f $wrapper_dir/Dockerfile . && podman push $image_spec |
This is causing an issue in the CI env:

```
+ buildah bud --tag quay.io/rht_perf_ci/fs-drift:snafu_ci -f fs_drift_wrapper/Dockerfile .
ci/common.sh: line 48: buildah: command not found
```
We already build the image with the operator-sdk build command on line 21. Why are we trying to build it again? It should just be a podman/docker push.
The ripsaw image is built with operator-sdk, not the snafu images.
/rerun all

Looking at the CI log I see that "buildah" is not installed on the CI system. Is podman supported? If so, then we should be installing the buildah package on the CI system. The intent of the commit was to allow either docker- or crio-built containers to be used. That's what the image_builder var was for.
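For reference, a minimal sketch of what a shared build-and-push helper in ci/common.sh could look like, given the image_builder variable described above. This is illustrative only (the function name and podman-instead-of-buildah choice are assumptions, not necessarily what was merged):

```sh
# Hypothetical helper; image_builder selects the container tooling.
build_and_push() {
    local image_spec=$1 wrapper_dir=$2
    if [ "$image_builder" = "docker" ] ; then
        docker build --tag "$image_spec" -f "$wrapper_dir/Dockerfile" . && \
            docker push "$image_spec"
    elif [ "$image_builder" = "podman" ] ; then
        # "podman build" avoids the "buildah: command not found" failure
        # seen in CI, since it does not require a separate buildah package.
        podman build --tag "$image_spec" -f "$wrapper_dir/Dockerfile" . && \
            podman push "$image_spec"
    else
        echo "unsupported image_builder: $image_builder" >&2
        return 1
    fi
}
```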
Results for SNAFU CI Test
/rerun all
Results for SNAFU CI Test
@aakarsh I think I figured something out about this: all podman+buildah commands must be run with sudo. I made changes for this and now it works. Docker does not require this because docker runs as a service. I will commit a change for this in ci/common.sh and retry CI once I've run it on my laptop successfully.
I want to get ripsaw PR 260 merged; then this will go better.
/rerun all
Results for SNAFU CI Test
/rerun all
Results for SNAFU CI Test
I just rebased to use Raul's new ci/common.sh build_and_push routine.
Results for SNAFU CI Test
CI is failing hard on this PR because the individual workload images aren't accessible? I see the following error:

So I checked the images in the rht_perf_ci namespace, and most of them were uploaded for the first time, since we had been using the cloud-bulldozer namespace even for CI; these new images were private by default (quay.io :/). I made them public, so I will hit rerun now; hopefully that fixed it.
/rerun all
Results for SNAFU CI Test
LGTM. Will merge once @dry923 approves as well.
LGTM, however why is the hammerdb CI job failing?
hammerdb won't run in my minikube. One issue for me was the "image:" lines in ripsaw/roles/hammerdb/templates/db_workload.yml.j2, where Marko Karg's image was the default rather than cloud-bulldozer's. This is exactly why I wrote this PR: so that we wouldn't have to edit the benchmark YAMLs to run them in our private minikube. @mkarg75 can you fix the image: in db_workload.yml.j2?
@bengland2 ah ok, makes sense. @mkarg75 is there any reason we couldn't change the ripsaw role to point to hammerdb:master instead of latest, to fall in line with the others as well as make the snafu CI a little more predictable?
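As an illustration of the change being discussed, the image line in db_workload.yml.j2 could default to the cloud-bulldozer image on the master tag while remaining overridable. This is a hedged sketch; the hammerdb_image variable name is hypothetical:

```yaml
# Hypothetical snippet for db_workload.yml.j2: default to the
# cloud-bulldozer image on the master tag, but allow an override
# (e.g. quay.io/rht_perf_ci/hammerdb:snafu_ci for CI runs).
image: "{{ hammerdb_image | default('quay.io/cloud-bulldozer/hammerdb:master') }}"
```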
@mkarg75 @dry923 when I run it in my minikube, I get:
This is because it changes the tag to snafu_ci but leaves the account as mkarg, so there is no such image in quay.io. This is why it doesn't run. The same thing should be happening on the CI server, because it's still trying to use quay.io/mkarg/hammerdb:snafu_ci, which does not exist. Marko's PR should help; the remaining thing is to have my PR edit the YAML file to change quay.io/cloud-bulldozer/hammerdb:latest ==> quay.io/rht_perf_ci/hammerdb:snafu_ci (a sketch of that substitution appears after this comment).

I found this syntax error in tests/mssql.yaml:
Also, I see that it never starts the sql-server pod to create the database
Also, it appears that this does not generate an error when the snafu wrapper is run.
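The image substitution mentioned above could be done with something like the following. This is a minimal sketch; whether the image line actually lives in tests/mssql.yaml is an assumption:

```sh
# Hypothetical rewrite of the benchmark YAML so CI pulls from the
# rht_perf_ci account with the snafu_ci tag instead of the default.
sed -i 's#quay.io/cloud-bulldozer/hammerdb:latest#quay.io/rht_perf_ci/hammerdb:snafu_ci#' \
    tests/mssql.yaml
```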
I'm looking into the issues Ben faced.
Hm, I've tried the same steps as Ben:

So there is a namespace, a deployment, and a service created and deleted for me. And the pod is also running for me:

As long as I'm pointing to my hammerdb image, the db initialization and tests work fine too, so there's something missing in the cloud-bulldozer image.
@mkarg75 can you PTAL at the Dockerfile and see what's missing between your image and the one in the cloud-bulldozer org?
I just started a test run on my minishift instance:

So that's the pod from the cloud-bulldozer repo.

The only thing that is probably different from Ben's setup is that I've commented out the metadata_collection, as that was causing issues.
@mkarg75, it's useful to see how you check that it's running right. I noticed you are using "oc", not "kubectl". Are you running OpenShift or minikube? How much disk space is needed for the benchmark to run? I'll try running it in my OpenShift cluster and see if it's any different. @dry923 what version of minikube and kubectl are you using? Use the "kubectl version" command. I'm using minikube v1.6.1 on Fedora 31, and my client version is 1.16 and my server version is 1.17.
@mkarg75 I don't use minishift anymore because of severe problems getting CI tests to pass with it; I use minikube instead. However, with an OpenShift cluster the mssql.yaml was accepted; perhaps kube 1.17 doesn't like it? I can downgrade my minikube to 1.16. However, there are other problems.

If we get a timeout waiting for the hammerdb pod to complete, then I don't think the test was a success. The workload pod exited in an error status:
The reason for this was:
Which resulted from this patch:
This patch would make it easier to diagnose such problems.
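The patch itself is not shown above. For context, this is a minimal sketch of the standard kubectl calls one would use to diagnose a failed workload pod; the ripsaw namespace comes from this PR's test setup, and the pod name placeholder is illustrative:

```sh
# Hypothetical diagnosis of a failed hammerdb workload pod.
kubectl -n ripsaw get pods                      # find the workload pod and its status
kubectl -n ripsaw describe pod <hammerdb-pod>   # events, image pull errors, etc.
kubectl -n ripsaw logs <hammerdb-pod>           # stdout/stderr from the wrapper
```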
These issues are all new to me; I'm inclined to say that they mostly originate from the move to stockpile.
I am retrying with minikube-1.5.2-0.x86_64 (which uses k8s 1.16) on my laptop, with just hammerdb.
…ion_account rebase in light of numerous merged PRs
/rerun all
Results for SNAFU CI Test
Results for SNAFU CI Test
/rerun all
Results for SNAFU CI Test
@bengland2 looks good now. If you can fix the conflicts we should be good to go :-)
@dry923 I see no conflicts at this time reported by git or GitHub.
@bengland2 GitHub is showing me:
That's not what GitHub is showing me at this URL. Are you somehow at your fork of snafu rather than cloud-bulldozer/snafu?
@bengland2 that's very strange. I'm at the same URL and it's showing the rebase-conflicts message. @aakarshg are you showing this as having conflicts or good to merge? Ben and I are seeing different things :-/
@bengland2 @dry923 I see the conflicts too:
@bengland2 maybe you'll need to hit refresh once?
@aakarshg @dry923 I just refreshed the page for the 2nd time (the first was before my last post) and I still see "This branch has no conflicts with the base branch"! Not only that, but git itself believes this. This totally contradicts my entire experience with GitHub and git: there should be conflicts because of the merged PRs! So I just cancelled this entire pull request, saved the patches, and will close this PR and start over, because I do not understand what is going on here. It's discouraging since I put so much time into this PR, but there is no other way forward.
I deleted my snafu directory tree, re-cloned my fork from bengland2/snafu, checked out the branch, and re-pulled from cloud-bulldozer/snafu: still no conflicts! That's why I haven't closed it yet. @rsevilla87 do you think this is worth an effort to resurrect this PR? Would you do it differently? All I wanted was a way to avoid constantly editing the YAML by hand to insert a different quay.io account name, and a README.md that explained what a developer has to do in order to develop a snafu wrapper. Thanks for any ideas you can provide.
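For the record, the reproduction steps described above amount to something like the following; the branch name is assumed from the PR title, and the remote layout is illustrative:

```sh
# Hypothetical re-check for merge conflicts against upstream.
git clone https://github.com/bengland2/snafu.git && cd snafu
git checkout other_location_account          # the PR branch (name assumed)
git remote add upstream https://github.com/cloud-bulldozer/snafu.git
git fetch upstream
git merge upstream/master                    # reports conflicts, if any
```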
@bengland2 I like the idea behind this PR; we really need it. I would like to avoid using sudo.
@aakarsh, @rsevilla87 kindly posted an issue about the reason for sudo. I don't like having to do it, but I don't want to run everything as root on my laptop. The way I've done it, the sudo disappears when run from the root account (using $SUDO), so it shouldn't affect the CI at all. We can get rid of it later, when the workaround is not needed and rootless podman is error-free.
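A minimal sketch of the $SUDO pattern described here, assuming ci/common.sh sets the variable once near the top (the real script's variable handling may differ):

```sh
# Hypothetical: prefix podman calls with sudo only when not already
# running as root; docker needs no prefix since it talks to a
# root-owned daemon.
SUDO=sudo
if [ "$(id -u)" -eq 0 ] ; then
    SUDO=""
fi

$SUDO podman build --tag "$image_spec" -f "$wrapper_dir/Dockerfile" .
$SUDO podman push "$image_spec"
```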
LGTM
Snafu CI was tested on my laptop running F31 and minikube 1.6.1. I was able to use docker to run the CI (except for ycsb, not sure why yet) but was not able to use podman; this may just be a problem on my laptop rather than a general one. Also updated README.md to describe how CI works and how to test CI for your wrapper. Do not test this without ripsaw PR 260 (which allows the container image location and account to be overridden by the user). Should address some of Snafu issue 90 (lack of user documentation).