[CI] MlDeprecationIT testMlDeprecationChecks failing #79058

Closed
danhermann opened this issue Oct 13, 2021 · 7 comments
Labels
:ml Machine learning Team:ML Meta label for the ML team >test-failure Triaged test failures from CI

Comments

@danhermann
Contributor

Build scan:
https://gradle-enterprise.elastic.co/s/u2qkxilapasye/tests/:x-pack:plugin:deprecation:qa:rest:javaRestTest/org.elasticsearch.xpack.deprecation.MlDeprecationIT/testMlDeprecationChecks

Reproduction line:
./gradlew ':x-pack:plugin:deprecation:qa:rest:javaRestTest' --tests "org.elasticsearch.xpack.deprecation.MlDeprecationIT.testMlDeprecationChecks" -Dtests.seed=9D1455189E3BE57E -Dtests.locale=sk-SK -Dtests.timezone=Etc/GMT+12 -Druntime.java=11

Applicable branches:
master

Reproduces locally?:
Didn't try

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.xpack.deprecation.MlDeprecationIT&tests.test=testMlDeprecationChecks

Failure excerpt:

java.lang.RuntimeException: failed to delete policy: .deprecation-indexing-ilm-policy

  at __randomizedtesting.SeedInfo.seed([9D1455189E3BE57E:B20D02CBADA83A35]:0)
  at org.elasticsearch.test.rest.ESRestTestCase.lambda$deleteAllILMPolicies$17(ESRestTestCase.java:1015)
  at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
  at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
  at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1603)
  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
  at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
  at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
  at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
  at org.elasticsearch.test.rest.ESRestTestCase.deleteAllILMPolicies(ESRestTestCase.java:1011)
  at org.elasticsearch.test.rest.ESRestTestCase.wipeCluster(ESRestTestCase.java:748)
  at org.elasticsearch.test.rest.ESRestTestCase.cleanUpCluster(ESRestTestCase.java:371)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:566)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)
  at java.lang.Thread.run(Thread.java:834)

  Caused by: org.elasticsearch.client.ResponseException: method [DELETE], host [http://127.0.0.1:35261], URI [/_ilm/policy/.deprecation-indexing-ilm-policy], status line [HTTP/1.1 400 Bad Request]
  {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Cannot delete policy [.deprecation-indexing-ilm-policy]. It is in use by one or more indices: [.ds-.logs-deprecation.elasticsearch-default-2021.10.13-000001]"}],"type":"illegal_argument_exception","reason":"Cannot delete policy [.deprecation-indexing-ilm-policy]. It is in use by one or more indices: [.ds-.logs-deprecation.elasticsearch-default-2021.10.13-000001]"},"status":400}

    at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:329)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:295)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:269)
    at org.elasticsearch.test.rest.ESRestTestCase.lambda$deleteAllILMPolicies$17(ESRestTestCase.java:1013)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1603)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at org.elasticsearch.test.rest.ESRestTestCase.deleteAllILMPolicies(ESRestTestCase.java:1011)
    at org.elasticsearch.test.rest.ESRestTestCase.wipeCluster(ESRestTestCase.java:748)
    at org.elasticsearch.test.rest.ESRestTestCase.cleanUpCluster(ESRestTestCase.java:371)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:566)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)
    at java.lang.Thread.run(Thread.java:834)

@danhermann danhermann added Team:ML Meta label for the ML team >test-failure Triaged test failures from CI labels Oct 13, 2021
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@pgomulka
Contributor

This might be related to #78850.
I guess there is no such thing as a hidden ILM policy?

@danhermann
Contributor Author

> might be related #78850
> I guess there is no such thing like a hidden ILM policy?

Ah, you're right. It's definitely related to that issue. I got distracted by the warning about system indices but that's not the source of the failure here.

@droberts195
Contributor

The server side logs look like this:

[2021-10-13T12:14:30,365][INFO ][o.e.c.m.MetadataDeleteIndexService] [javaRestTest-0] [.ml-annotations-6/eyNejBqQSt6HKVTxHLq6yg] deleting index
[2021-10-13T12:14:30,366][INFO ][o.e.c.m.MetadataDeleteIndexService] [javaRestTest-0] [.ml-config/JbpMG-JNRLKsXgLZNy7Hgg] deleting index
[2021-10-13T12:14:30,366][INFO ][o.e.c.m.MetadataDeleteIndexService] [javaRestTest-0] [.ml-anomalies-shared/5R8dYNNkTwuaeyaf7QnwIQ] deleting index
[2021-10-13T12:14:30,366][INFO ][o.e.c.m.MetadataDeleteIndexService] [javaRestTest-0] [.ml-notifications-000002/U88A4C7HQT68BkFGtMztPQ] deleting index
[2021-10-13T12:14:30,976][INFO ][o.e.c.m.MetadataCreateIndexService] [javaRestTest-0] [.ds-.logs-deprecation.elasticsearch-default-2021.10.13-000001] creating index, cause [initialize_data_stream], templates [.deprecation-indexing-template], shards [1]/[1]
[2021-10-13T12:14:31,008][INFO ][o.e.c.m.MetadataCreateDataStreamService] [javaRestTest-0] adding data stream [.logs-deprecation.elasticsearch-default] with write index [.ds-.logs-deprecation.elasticsearch-default-2021.10.13-000001], backing indices [], and aliases []
[2021-10-13T12:14:31,367][INFO ][o.e.x.i.a.TransportPutLifecycleAction] [javaRestTest-0] adding index lifecycle policy [90-days-default]
[2021-10-13T12:14:31,500][INFO ][o.e.c.m.MetadataMappingService] [javaRestTest-0] [.ds-.logs-deprecation.elasticsearch-default-2021.10.13-000001/OxRDhnqlRN6lD69G82XvSw] update_mapping [_doc]
[2021-10-13T12:14:31,786][INFO ][o.e.x.i.IndexLifecycleTransition] [javaRestTest-0] moving index [.ds-.logs-deprecation.elasticsearch-default-2021.10.13-000001] from [null] to [{"phase":"new","action":"complete","name":"complete"}] in policy [.deprecation-indexing-ilm-policy]
[2021-10-13T12:14:31,873][INFO ][o.e.x.i.a.TransportPutLifecycleAction] [javaRestTest-0] adding index lifecycle policy [30-days-default]
[2021-10-13T12:14:32,086][INFO ][o.e.x.i.IndexLifecycleTransition] [javaRestTest-0] moving index [.ds-.logs-deprecation.elasticsearch-default-2021.10.13-000001] from [{"phase":"new","action":"complete","name":"complete"}] to [{"phase":"hot","action":"unfollow","name":"branch-check-unfollow-prerequisites"}] in policy [.deprecation-indexing-ilm-policy]

The test cleanup is not working as expected. The cleanup code that deletes all indices between tests deletes system indices in a deprecated way instead of using the feature reset API. Each such deletion produces a deprecation warning, and indexing that warning creates a new data stream, .logs-deprecation.elasticsearch-default. As the logs above show, its backing index is managed by .deprecation-indexing-ilm-policy, so the later attempt to delete that policy fails because the policy is in use.

There might also be another problem related to hidden ILM policies, but the simplest quick fix to avoid running into that in this test will probably be to call feature reset explicitly at the end of the single ML integration test that's in the deprecation plugin test suite.
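For anyone following along, here is a minimal sketch of what an explicit feature reset call looks like at the HTTP level. It only builds a POST to the reset features endpoint (/_features/_reset) and prints it; nothing is sent. It uses the JDK's java.net.http types purely to stay self-contained, which is an assumption for illustration and not how the test code talks to the cluster. The class name, method name, and the host and port are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, for illustration only. It is not part of the
// Elasticsearch test framework.
final class FeatureResetSketch {

    // Builds (but does not send) a POST to the reset features endpoint.
    // baseUrl is an assumption, e.g. the address of a local test cluster.
    static HttpRequest buildResetRequest(String baseUrl) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/_features/_reset"))
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
    }

    public static void main(String[] args) {
        // Host and port are placeholders; real test clusters use a random port.
        HttpRequest request = buildResetRequest("http://127.0.0.1:9200");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

In the test itself this call would need to run before the generic post-test cleanup, so that the ML system indices are already gone by the time the cleanup deletes the remaining indices.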

droberts195 added a commit to droberts195/elasticsearch that referenced this issue Oct 13, 2021
The generic post test cleanup code causes deprecation warnings
when deleting system indices. This is particularly problematic
for deprecation plugin integration tests.

This change makes the ML test in the deprecation plugin integration
tests clean up the ML system indices before the generic post test
cleanup runs.

Fixes elastic#79058
@pgomulka
Contributor

I merged #79071, which closes this issue, but maybe we should wait to close it until #79077 is merged?

@droberts195
Contributor

I think #79071 is better than #79077 because it uses the low-level REST client (LLRC) to do the feature reset rather than the high-level REST client (HLRC), and the HLRC is going away soon. I've closed #79077. I'll close this issue in a few hours if it stops failing now that #79071 is merged.

@droberts195 droberts195 added the :ml Machine learning label Oct 14, 2021
@droberts195
Contributor

This hasn't failed in the last 4 hours - closing.
