No es index smf fsd #99

Merged
merged 9 commits into from
Jan 10, 2020

Conversation

bengland2
Contributor

There is no need to use the es_index environment variable in a wrapper, since the part of the index name after the "ripsaw" prefix is determined by the wrapper. Also, the documentation has been updated to be consistent with ripsaw's use of elasticsearch.server and elasticsearch.port in CRs.
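
A minimal sketch of the naming scheme described above, assuming the wrapper yields (document, index suffix) pairs and the caller joins a fixed prefix to the suffix with a single hyphen. The names `emit_documents`, `full_index_name`, `build_actions` and the `smallfile-results` suffix are hypothetical and are not taken from run_snafu.py.

```python
from typing import Dict, Iterator, List, Tuple

PREFIX = "ripsaw"  # assumed fixed prefix, per the description above


def emit_documents() -> Iterator[Tuple[Dict, str]]:
    """Hypothetical wrapper generator: yields (document, index suffix)."""
    yield {"iops": 1200}, "smallfile-results"
    yield {"rsptime_ms": 3.4}, "smallfile-rsptimes"


def full_index_name(suffix: str, prefix: str = PREFIX) -> str:
    """Join prefix and suffix with exactly one hyphen."""
    return f"{prefix}-{suffix}"


def build_actions() -> List[Dict]:
    """Pair each document with its full index name, ready for an upload step."""
    return [
        {"_index": full_index_name(suffix), "_source": doc}
        for doc, suffix in emit_documents()
    ]
```

With this split, the wrapper only knows its own suffix, and no environment variable is needed to carry the index name.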

@bengland2
Contributor Author

@acalhounRH does this look ok?

clarify what run_snafu.py wrapper developer has to do to post an ES doc
remove prefix hyphen from index names in yield statements
do not associate uuid, test_user and clustername with elasticsearch in CR
fix wrappers that use run_snafu.py to work this way
@bengland2
Contributor Author

This commit has undergone considerable change; let me know if you like it now. Tested standalone with fs-drift and smallfile, now testing with ripsaw. The goal is to pass CI tests.

Review threads on run_snafu.py and README.md (outdated, resolved)
@acalhounRH
Contributor

Looks good. Okay to Merge.

@bengland2
Contributor Author

Needs "OK to test" label once PR97 merges (i.e. snafu CI is in place).

@dry923 added the "ok to test" label (Kick off our CI framework) on Dec 4, 2019
@dry923
Member

dry923 commented Dec 4, 2019

/rerun all

@rht-perf-ci

Results for SNAFU CI Test

Test Result Runtime
cluster_loader FAIL 00:00:00
fio_wrapper PASS 00:06:16
fs_drift_wrapper FAIL 00:06:43
hammerdb FAIL 00:06:43
iperf PASS 00:05:31
pgbench-wrapper PASS 00:06:36
smallfile_wrapper FAIL 00:05:59
sysbench PASS 00:05:13
uperf-wrapper PASS 00:06:21
ycsb-wrapper PASS 00:05:12

@bengland2
Contributor Author

@dry923 @aakarsh do we have a situation where snafu PR 99 and ripsaw PR 249 depend on each other and can't work unless both are committed? I think so. If so, I'd suggest committing ripsaw PR 249, which only affects smallfile and fs-drift test CRs. Then retest snafu PR 99. OK?

@aakarshg
Contributor

aakarshg commented Dec 4, 2019

@bengland2 oh okay so oddly 249 ripsaw PR is failing fs_drift and smallfile o.O have to look into why.

@bengland2
Contributor Author

Oops, ripsaw PR 249 affects more than the smallfile+fs-drift test CRs; the Thanksgiving break destroyed my memory ;-) So when I tested these two benchmarks successfully in ripsaw, I used both PRs together, not one at a time. @acalhounRH FYI

@aakarshg
Contributor

aakarshg commented Dec 9, 2019

/rerun all

@rht-perf-ci

Results for SNAFU CI Test

Test Result Runtime
cluster_loader FAIL 00:00:00
fio_wrapper PASS 00:06:17
fs_drift_wrapper PASS 00:07:13
hammerdb FAIL 00:06:37
iperf PASS 00:05:43
pgbench-wrapper PASS 00:06:45
smallfile_wrapper FAIL 00:06:00
sysbench PASS 00:05:13
uperf-wrapper PASS 00:06:41
ycsb-wrapper PASS 00:05:38

@aakarshg
Contributor

aakarshg commented Dec 9, 2019

@bengland2 so the CI is correctly failing on smallfile_wrapper, as there seem to be no documents indexed into ripsaw-smallfile-rsptimes. I have to check what's up.
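
For illustration, the kind of check the CI performs here could look like the sketch below. It assumes a client object with a `count(index=...)` method that returns a dict containing a `count` key, which is the shape the elasticsearch Python client uses; `StubClient`, `index_has_documents` and `empty_indices` are hypothetical names, and the real CI script may work differently.

```python
class StubClient:
    """Stand-in for an Elasticsearch client, used only for this sketch."""

    def __init__(self, counts):
        self._counts = counts

    def count(self, index):
        # Mirrors the shape of a count response: {"count": <int>, ...}
        return {"count": self._counts.get(index, 0)}


def index_has_documents(client, index, minimum=1):
    """Return True if the index holds at least `minimum` documents."""
    return client.count(index=index)["count"] >= minimum


def empty_indices(client, indices):
    """List the indices that received no documents during the test run."""
    return [name for name in indices if not index_has_documents(client, name)]
```

A real client may raise an error for an index that was never created; that case is not handled in this sketch.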

@bengland2
Contributor Author

ok, I'll look at it, perhaps last change to ripsaw PR 249 somehow interfered with it.

@bengland2
Contributor Author

@aakarsh once ripsaw PR 261 merges (lengthen smallfile CI test), then this problem will go away and we can finally be done with this, I tested that today.

@bengland2
Contributor Author

ripsaw PR 261 (lengthen smallfile CI test) fixes smallfile problem here.

@aakarshg
Contributor

merged ripsaw pr 261, will recheck this

@aakarshg
Contributor

/rerun all

@rht-perf-ci

Results for SNAFU CI Test

Test Result Runtime
cluster_loader FAIL 00:00:00
fio_wrapper PASS 00:06:11
fs_drift_wrapper PASS 00:06:46
hammerdb FAIL 00:06:35
iperf PASS 00:05:27
pgbench-wrapper PASS 00:06:41
smallfile_wrapper PASS 00:07:55
sysbench PASS 00:05:10
uperf-wrapper PASS 00:06:23
ycsb-wrapper PASS 00:05:15

Contributor

@aakarshg aakarshg left a comment

This is missing the fio analyzed-result case, so https://github.com/cloud-bulldozer/snafu/blob/master/fio_wrapper/fio_analyzer.py#L135 will also need to change; once that's done I'll merge it. What's odd is that this should have failed CI with fio, since the documents would have gone to ripsaw-fio--analyzed_result, but the CI script doesn't look there.
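
To illustrate the double hyphen: if the caller already inserts a hyphen between prefix and suffix, a suffix that still carries its own leading hyphen produces a name like ripsaw-fio--analyzed_result. A small sketch, assuming that join rule; `join_index` and `check_suffix` are hypothetical helpers, and the split of the name into "ripsaw-fio" and "-analyzed_result" is an assumption made for the example.

```python
def join_index(prefix: str, suffix: str) -> str:
    """Assumed join rule: prefix, one hyphen, then the suffix as given."""
    return f"{prefix}-{suffix}"


def check_suffix(suffix: str) -> str:
    """Reject suffixes that would create an empty segment ('--') in the name."""
    if suffix.startswith("-"):
        raise ValueError(f"index suffix must not start with a hyphen: {suffix!r}")
    return suffix


# A suffix with a leftover leading hyphen yields the doubled separator.
bad_name = join_index("ripsaw-fio", "-analyzed_result")
# Dropping the leading hyphen gives the intended name.
good_name = join_index("ripsaw-fio", check_suffix("analyzed_result"))
```

Validating the suffix at the point where documents are yielded would surface a missed spot like this one before anything is written to the wrong index.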

@bengland2
Contributor Author

this needs to be re-based, and README.md needs to be tweaked. Trying to do that now.

@bengland2
Contributor Author

/rerun minikube_jjb

@bengland2
Contributor Author

@aakarshg I did the code change you requested, you were right, I missed a spot.

@rht-perf-ci

Results for SNAFU CI Test

Test Result Runtime
cluster_loader FAIL 00:00:00
fio_wrapper PASS 00:07:14
fs_drift_wrapper PASS 00:07:37
hammerdb FAIL 00:13:25
iperf PASS 00:05:19
pgbench-wrapper PASS 00:06:42
smallfile_wrapper PASS 00:05:59
sysbench PASS 00:05:18
uperf-wrapper PASS 00:06:31
ycsb-wrapper PASS 00:05:18

@aakarshg
Contributor

aakarshg commented Jan 7, 2020

/rerun all

@rht-perf-ci

Results for SNAFU CI Test

Test Result Runtime
cluster_loader FAIL 00:00:00
fio_wrapper PASS 00:06:27
fs_drift_wrapper PASS 00:07:09
hammerdb FAIL 00:13:46
iperf PASS 00:05:30
pgbench-wrapper PASS 00:08:21
smallfile_wrapper PASS 00:06:31
sysbench PASS 00:05:37
uperf-wrapper PASS 00:07:02
ycsb-wrapper PASS 00:06:42

@bengland2
Contributor Author

The hammerdb FAIL seems to be caused by it pushing the benchmark image to cloud-bulldozer instead of rht_perf_ci, whereas the operator image is pushed to rht_perf_ci. Not a problem with this PR. It is the same kind of problem we're seeing elsewhere: failures to push an image to the repo are not detected. The other failure is cluster loader.

@bengland2
Contributor Author

cluster loader failed because there was no cluster_loader/ci_test.sh for the CI to run -- issue 112. pgbench failed because of a random non-reproducible error in uploading pgbench image.

Error: Error copying image to the remote destination: Error trying to reuse blob sha256:b05580fca2f9aabb2d8fa975b29146c9147c8418e559f197c54a4fac04babb95 at destination: unexpected http code: 500 (Internal Server Error), URL: https://quay.io/v2/auth?account=bengland2&scope=repository%3Abengland2%2Fpgbench%3Apull%2Cpush&service=quay.io

So I consider this to be a pass. CI reliability issue 111 is where intermittent errors like this should be addressed.

Contributor

@aakarshg aakarshg left a comment

Overall looks good (ignoring all the unrelated CI errors), but it is missing the updates to cluster loader, specifically https://github.com/cloud-bulldozer/snafu/blob/master/cluster_loader/trigger_cluster_loader.py#L86, where the index needs to be an empty string since 'snafu-cl' will be added by this PR's change. Will merge as soon as it's fixed; sorry to have kept this PR waiting for a while.

@bengland2
Contributor Author

/rerun minikube_jjb

@bengland2
Contributor Author

I made the change you requested and rebased, @aakarshg but it's not running the CI for some reason, can you clear your requested change because I can't.

@aakarshg
Contributor

aakarshg commented Jan 9, 2020

/rerun all

@aakarshg
Contributor

aakarshg commented Jan 9, 2020

> I made the change you requested and rebased, @aakarshg but it's not running the CI for some reason, can you clear your requested change because I can't.

Looks like this build has been in the queue, since the other PRs were triggered before this one; that's why it's taking long.

@rht-perf-ci

Results for SNAFU CI Test

Test Result Runtime
fio_wrapper PASS 00:06:48
fs_drift_wrapper PASS 00:07:14
hammerdb FAIL 00:13:41
iperf PASS 00:05:44
pgbench-wrapper PASS 00:07:08
smallfile_wrapper PASS 00:06:24
sysbench PASS 00:05:30
uperf-wrapper PASS 00:07:03
ycsb-wrapper PASS 00:05:44

@bengland2
Contributor Author

hammerdb failed because of:

++ grep 'SEQUENCE COMPLETE'
Error from server (BadRequest): container "hammerdb" in pod "hammerdb-workload-ba4af13d-st9gt" is waiting to start: trying and failing to pull image
++ echo 'Hammerdb test: Success'
Hammerdb test: Success

and then it did not see any update to its ES index. But that doesn't seem to have anything to do with this PR, since hammerdb does not use run_snafu.py. I think the hammerdb image is big enough that it is timing out on the download?
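
The log above shows success being echoed even though the log fetch failed and `grep` never matched. A hedged sketch of a stricter check, written in Python for illustration (the trace above looks like shell, and `sequence_completed` and `check_workload_log` are hypothetical names): report success only if the marker actually appears in the pod log.

```python
MARKER = "SEQUENCE COMPLETE"  # the string the CI greps for, per the log above


def sequence_completed(log_text: str, marker: str = MARKER) -> bool:
    """Return True only if the completion marker is present in the log."""
    return marker in log_text


def check_workload_log(log_text: str) -> str:
    """Raise instead of reporting success when the marker is missing."""
    if not sequence_completed(log_text):
        raise RuntimeError("hammerdb log does not contain the completion marker")
    return "Hammerdb test: Success"
```

With a check like this, an image-pull failure would show up as a test failure at this step rather than later as a missing ES index update.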

Contributor

@aakarshg aakarshg left a comment

LGTM nice work @bengland2

@aakarshg aakarshg merged commit d25226b into cloud-bulldozer:master Jan 10, 2020
Labels
ok to test Kick off our CI framework
6 participants