[CI] AzureStorageCleanupThirdPartyTests testIndexLatest failing #107720

Closed
maxhniebergall opened this issue Apr 22, 2024 · 8 comments · Fixed by #107928 or #108336
Labels
:Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. medium-risk An open issue or test failure that is a medium risk to future releases Team:Distributed Meta label for distributed team >test-failure Triaged test failures from CI

Comments

@maxhniebergall
Member

maxhniebergall commented Apr 22, 2024

Multiple test failures:
AzureStorageCleanupThirdPartyTests » testIndexLatest
AzureStorageCleanupThirdPartyTests » testMultiBlockUpload
AzureStorageCleanupThirdPartyTests » testCleanup
AzureStorageCleanupThirdPartyTests » testListChildren
AzureStorageCleanupThirdPartyTests » testCreateSnapshot

Build scan:
https://gradle-enterprise.elastic.co/s/tddxo7sdi4fqs/tests/:modules:repository-azure:azureThirdPartyUnitTest/org.elasticsearch.repositories.azure.AzureStorageCleanupThirdPartyTests/testIndexLatest

Reproduction line:

./gradlew ':modules:repository-azure:azureThirdPartyUnitTest' --tests "org.elasticsearch.repositories.azure.AzureStorageCleanupThirdPartyTests.testIndexLatest" -Dtests.seed=60262F6D1781A7E3 -Dtests.locale=en-GB -Dtests.timezone=PNT -Druntime.java=21

Applicable branches:
main, 8.13, 7.17, 8.14

Reproduces locally?:
Didn't try

Failure history:
Failure dashboard for org.elasticsearch.repositories.azure.AzureStorageCleanupThirdPartyTests#testIndexLatest

Failure excerpt:

org.elasticsearch.repositories.RepositoryVerificationException: [test-repo] path [8.13_third_party_tests_ZXumGTOq] is not accessible on master node


  Caused by: com.azure.storage.blob.models.BlobStorageException: If you are using a StorageSharedKeyCredential, and the server returned an error message that says 'Signature did not match', you can compare the string to sign with the one generated by the SDK. To log the string to sign, pass in the context key value pair 'Azure-Storage-Log-String-To-Sign': true to the appropriate method call.
  If you are using a SAS token, and the server returned an error message that says 'Signature did not match', you can compare the string to sign with the one generated by the SDK. To log the string to sign, pass in the context key value pair 'Azure-Storage-Log-String-To-Sign': true to the appropriate generateSas method call.
  Please remember to disable 'Azure-Storage-Log-String-To-Sign' before going to production as this string can potentially contain PII.
  Status code 403, "<?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
  RequestId:20befcf0-901e-0039-6e1d-93950a000000
  Time:2024-04-20T12:24:38.3947907Z</Message><AuthenticationErrorDetail>Signed expiry time [Thu, 18 Apr 2024 12:24:00 GMT] must be after signed start time [Sat, 20 Apr 2024 12:24:38 GMT]</AuthenticationErrorDetail></Error>"

    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:733)
    at com.azure.core.implementation.http.rest.ResponseExceptionConstructorCache.invoke(ResponseExceptionConstructorCache.java:56)
    at com.azure.core.implementation.http.rest.RestProxyBase.instantiateUnexpectedException(RestProxyBase.java:378)
    at com.azure.core.implementation.http.rest.AsyncRestProxy.lambda$ensureExpectedStatus$1(AsyncRestProxy.java:115)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:113)
    at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2400)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:171)
    at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.set(Operators.java:2196)
    at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.onSubscribe(Operators.java:2070)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
    at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:157)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.FluxHide$SuppressFuseableSubscriber.onNext(FluxHide.java:137)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.FluxHide$SuppressFuseableSubscriber.onNext(FluxHide.java:137)
    at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1839)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:151)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
    at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onNext(MonoPeekTerminal.java:180)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableConditionalSubscriber.onNext(FluxMapFuseable.java:299)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableConditionalSubscriber.onNext(FluxMapFuseable.java:299)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1839)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:151)
    at reactor.core.publisher.SerializedSubscriber.onNext(SerializedSubscriber.java:99)
    at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onNext(FluxRetryWhen.java:174)
    at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
    at reactor.core.publisher.Operators$MonoInnerProducerBase.complete(Operators.java:2666)
    at reactor.core.publisher.MonoSingle$SingleSubscriber.onComplete(MonoSingle.java:180)
    at reactor.core.publisher.MonoFlatMapMany$FlatMapManyInner.onComplete(MonoFlatMapMany.java:260)
    at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onComplete(FluxContextWrite.java:126)
    at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onComplete(FluxMap.java:275)
    at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onComplete(FluxSwitchIfEmpty.java:85)
    at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onComplete(FluxDoFinally.java:128)
    at reactor.core.publisher.FluxHandle$HandleSubscriber.onComplete(FluxHandle.java:220)
    at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onComplete(FluxMap.java:275)
    at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onComplete(FluxDoFinally.java:128)
    at reactor.core.publisher.FluxHandleFuseable$HandleFuseableSubscriber.onComplete(FluxHandleFuseable.java:236)
    at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onComplete(FluxContextWrite.java:126)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1840)
    at reactor.core.publisher.MonoCollectList$MonoCollectListSubscriber.onComplete(MonoCollectList.java:129)
    at reactor.core.publisher.FluxPeek$PeekSubscriber.onComplete(FluxPeek.java:260)
    at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:144)
    at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:415)
    at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:439)
    at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:493)
    at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:768)
    at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:114)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1383)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1246)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1295)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at org.elasticsearch.repositories.azure.SocketAccess.lambda$doPrivilegedVoidException$0(SocketAccess.java:46)
    at java.security.AccessController.doPrivileged(AccessController.java:571)
    at org.elasticsearch.repositories.azure.SocketAccess.doPrivilegedVoidException(SocketAccess.java:45)
    at org.elasticsearch.repositories.azure.executors.PrivilegedExecutor.lambda$execute$0(PrivilegedExecutor.java:27)
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.lang.Thread.run(Thread.java:1583)
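
The `AuthenticationErrorDetail` above reports a signed expiry time of 18 Apr 2024, two days before the request on 20 Apr 2024, which is consistent with an expired SAS token. A minimal sketch of how such a token could be checked before running the tests is shown below. It assumes the credential is a SAS query string whose `se` (signed expiry) parameter is a full ISO-8601 UTC timestamp. The class name, method names, and the sample token are hypothetical and are not part of the Elasticsearch code base or the Azure SDK.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.util.Optional;

// Hypothetical helper: not part of Elasticsearch or the Azure SDK.
public class SasExpiryCheck {

    /** Returns the instant encoded in the "se" (signed expiry) parameter, if present. */
    public static Optional<Instant> signedExpiry(String sasToken) {
        String query = sasToken.startsWith("?") ? sasToken.substring(1) : sasToken;
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("se")) {
                // The value is URL-encoded in the token, e.g. 2024-04-18T12%3A24%3A00Z
                String value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
                // Assumption: a timestamp with an offset or "Z"; date-only values are not handled
                return Optional.of(OffsetDateTime.parse(value).toInstant());
            }
        }
        return Optional.empty();
    }

    /** True when the token carries an expiry that is not after the given instant. */
    public static boolean isExpired(String sasToken, Instant now) {
        return signedExpiry(sasToken).map(expiry -> !expiry.isAfter(now)).orElse(false);
    }

    public static void main(String[] args) {
        // Made-up token; only the "se" value mirrors the expiry time in the error above
        String token = "sv=2022-11-02&sp=rwdl&se=2024-04-18T12%3A24%3A00Z&sig=REDACTED";
        Instant requestTime = Instant.parse("2024-04-20T12:24:38Z");
        System.out.println(isExpired(token, requestTime)); // prints "true"
    }
}
```

A check like this only inspects the token locally; it does not confirm that the signature or permissions are valid, so a 403 from the service remains the authoritative signal.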

@maxhniebergall maxhniebergall added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. >test-failure Triaged test failures from CI labels Apr 22, 2024
@elasticsearchmachine elasticsearchmachine added blocker Team:Distributed Meta label for distributed team labels Apr 22, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine elasticsearchmachine added the needs:risk Requires assignment of a risk label (low, medium, blocker) label Apr 24, 2024
@ywangd ywangd added medium-risk An open issue or test failure that is a medium risk to future releases and removed needs:risk Requires assignment of a risk label (low, medium, blocker) labels Apr 25, 2024
@ywangd
Member

ywangd commented Apr 25, 2024

It seems the Azure credentials from Vault have expired?
Based on the Git history, @brianseeders could you please take a look? Thanks!

@brianseeders
Contributor

It looks like these start failing every year, and a member of the @elastic/es-distributed team typically rotates them.

#94868
#84231 (comment)

@ywangd
Member

ywangd commented Apr 26, 2024

Thanks Brian. I have now rotated the credentials and will re-enable the muted tests.

@thecoop thecoop reopened this Apr 26, 2024
@davidkyle
Member

@brianseeders
Contributor

@ywangd how were the credentials updated? Did you update secret/ci/elastic-elasticsearch/migrated/azure_thirdparty_sas_test_creds in vault-ci-prod?

@ywangd
Member

ywangd commented Apr 30, 2024

@brianseeders I sent you a message over Slack about it. Let's check whether the instructions are outdated. Thanks!

ywangd added a commit to ywangd/elasticsearch that referenced this issue May 7, 2024
Unmute Azure 3rd party tests (again) after re-generating credentials
following the updated guide.

Relates: elastic#107928
Fixes: elastic#107720
Fixes: elastic#107502
arteam pushed a commit that referenced this issue May 7, 2024
Unmute Azure 3rd party tests (again) after re-generating credentials
following the updated guide.

Relates: #107928
Fixes: #107720
Fixes: #107502
ywangd added a commit to ywangd/elasticsearch that referenced this issue May 7, 2024
Unmute Azure 3rd party tests (again) after re-generating credentials
following the updated guide.

Relates: elastic#107928
Fixes: elastic#107720
Fixes: elastic#107502
(cherry picked from commit eba6a84)

# Conflicts:
#	modules/repository-azure/src/internalClusterTest/java/org/elasticsearch/repositories/azure/AzureStorageCleanupThirdPartyTests.java
#	x-pack/plugin/snapshot-repo-test-kit/qa/azure/src/javaRestTest/java/org/elasticsearch/repositories/blobstore/testkit/AzureSnapshotRepoTestKitIT.java
elasticsearchmachine pushed a commit that referenced this issue May 7, 2024
* Unmute Azure 3rd party tests (#108336)

Unmute Azure 3rd party tests (again) after re-generating credentials
following the updated guide.

Relates: #107928
Fixes: #107720
Fixes: #107502
(cherry picked from commit eba6a84)

# Conflicts:
#	modules/repository-azure/src/internalClusterTest/java/org/elasticsearch/repositories/azure/AzureStorageCleanupThirdPartyTests.java
#	x-pack/plugin/snapshot-repo-test-kit/qa/azure/src/javaRestTest/java/org/elasticsearch/repositories/blobstore/testkit/AzureSnapshotRepoTestKitIT.java

* remove redundant change