Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HttpClient fails to initialize SSL. #64492

Closed
meareindel opened this issue Jan 29, 2022 · 31 comments · Fixed by #65501 or #65435
Closed

HttpClient fails to initialize SSL. #64492

meareindel opened this issue Jan 29, 2022 · 31 comments · Fixed by #65501 or #65435
Assignees
Labels
area-System.Net.Security bug os-linux Linux OS (any supported distro)
Milestone

Comments

@meareindel
Copy link

meareindel commented Jan 29, 2022

Hey! We've got some strange issue with ssl initializing after upgrading to .net 6.0 from .net 5.0.

We have multiple microservices deployed on our own Kubernetes. There are 4 physical nodes on Debian Buster, each of node has several virtual machines as Kubernetes nodes. The microservices images are based on mcr.microsoft.com/dotnet/aspnet:6.0 with some debugging/servicing appendings like apt-get install curl and so on. We've also tried :6.0-focal and it was like it helped but I'll describe it below.

The microservices mainly exchanges data with each other by plain http through their internal hostnames, but for some specific reason sometimes they are connecting to each other through global network by https (in general it may be some external data provider which is not under our control).

The thing is sometimes when the https connection is starting it fails to initialize the openssl library with some strange errors - error:04067084:rsa routines:rsa_ossl_public_decrypt:data too large for modulus and (or) error:04067072:rsa routines:rsa_ossl_public_decrypt:padding check failed. As far as I know, there is no calls of routines:rsa_ossl_public_decrypt:data in openssl initialization. It fails randomly on different pods, different nodes etc. If one connection on one pod didn't fail, it won't fail ever. If one connection on one pod failed - it'll be failing again and again until pod restart, and then it repeats, meaning it can fail or not again. Is seems like there is some threading issue here, but it is just by suggestion. I've tried to figure out the ssl initialization process by reading the sources of System.Net.Security and Crypto.Native wrapper but no luck so far.

I still cannot reproduce this on some simple application, environment or test. It is not reproducing on windows and on local wsl2 docker - only in our Kubernetes environment so far (we have some other environments, but that one mentioned is our dev and we cannot upgrade others while dev is failing).

The other thing I've found that it happens only if we have /etc/ssl/openssl.cnf with some non-empty system default settings (system_default_sect, like described in #46271). If we leave system_default_sect empty or just provide no config file at all (that is the case of :6.0-focal, it has no openssl config file by default) - everything works fine. If we put there just the ciphers that are already default as described here (https://docs.microsoft.com/en-us/dotnet/core/compatibility/cryptography/5.0/default-cipher-suites-for-tls-on-linux) - it starts failing. And we need this config defaults (just like in :6.0 image) because we still need to connect to the sites with some obsolete weak ciphers.

It worked fine on .net 5.0.

Could you please suggest something or direct me to the way I can investigate it further?

Two exception stack examples attached.
modulus.txt
padding.txt

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jan 29, 2022
@ghost
Copy link

ghost commented Jan 29, 2022

Tagging subscribers to this area: @dotnet/ncl, @vcsjones
See info in area-owners.md if you want to be subscribed.

Issue Details

Hey! We've got some strange issue with ssl initializing after upgrading to .net 6.0 from .net 5.0.

We have multiple microservices deployed on our own Kubernetes. There are 4 physical nodes on Debian Buster, each of node has several virtual machines as Kubernetes nodes. The microservices images are based on mcr.microsoft.com/dotnet/aspnet:6.0 with some debugging/servicing appendings like apt-get install curl and so on. We've also tried :6.0-focal and it was like it helped but I'll describe it below.

The microservices mainly exchanges data with each other by plain http through their internal hostnames, but for some specific reason sometimes they are connecting to each other through global network by https (in general it may be some external data provider which is not under our control).

The thing is sometimes when the https connection is starting it fails to initialize the openssl library with some strange errors - error:04067084:rsa routines:rsa_ossl_public_decrypt:data too large for modulus and (or) error:04067072:rsa routines:rsa_ossl_public_decrypt:padding check failed. As far as I know, there is no calls of routines:rsa_ossl_public_decrypt:data in openssl initialization. It fails randomly on different pods, different nodes etc. If one connection on one pod didn't fail, it won't fail ever. If one connection on one pod failed - it'll be failing again and again until pod restart, and then it repeats, meaning it can fail or not again. Is seems like there is some threading issue here, but it is just by suggestion. I've tried to figure out the ssl initialization process by reading the sources of System.Net.Security and Crypto.Native wrapper but no luck so far.

I still cannot reproduce this on some simple application, environment or test. It is not reproducing on windows and on local wsl2 docker - only in our Kubernetes environment so far (we have some other environments, but that one mentioned is our dev and we cannot upgrade others while dev is failing).

The other thing I've found that it happens only if we have /etc/ssl/openssl.cnf with some non-empty system default settings (system_default_sect, like described in #46271). If we leave system_default_sect empty or just provide no config file at all (that is the case of :6.0-focal, it has no openssl config file by default) - everything works fine. If we put there just the ciphers that are already default as described here (https://docs.microsoft.com/en-us/dotnet/core/compatibility/cryptography/5.0/default-cipher-suites-for-tls-on-linux) - it starts failing. And we need this config defaults (just like in :6.0 image) because we still need to connect to the sites with some obsolete weak ciphers.

It worked fine on .net 5.0.

Could you please suggest something or direct me to the way I can investigate it further?

Two exception stack examples attached.
modulus.txt
padding.txt

Author: meareindel
Assignees: -
Labels:

area-System.Net.Security, untriaged

Milestone: -

@wfurt
Copy link
Member

wfurt commented Jan 29, 2022

cc: @bartonjs

@meareindel
Copy link
Author

Any ideas? At least if it is bug or some configuration/behavior breaking changes etc?

@vcsjones
Copy link
Member

vcsjones commented Feb 2, 2022

The other thing I've found that it happens only if we have /etc/ssl/openssl.cnf with some non-empty system default settings (system_default_sect, like described in #46271). If we leave system_default_sect empty or just provide no config file at all (that is the case of :6.0-focal, it has no openssl config file by default) - everything works fine

A few questions:

First, Can you include the exact openssl.cnf that you are using?

The thing is sometimes when the https connection is starting it fails to initialize the openssl library with some strange errors

Do you have an estimate for how often is "sometimes"?

@meareindel
Copy link
Author

meareindel commented Feb 2, 2022

First, Can you include the exact openssl.cnf that you are using?

Well we are not using it explicitly, it was just a part of investigation. We just use the aspnet:6.0 image with no modifications of openssl.cnf - and by the way it is the very same of aspnet:5.0 image. Anyway, attached it for convenience (but with .txt extension because github doesn't know .cnf) - openssl.txt

Do you have an estimate for how often is "sometimes"?

Well we have a deploy job in our gitlab ci which deploys every microservice we're using at our dev environment. There are about 20 microservices, but some of them are scaled and some of them don't make a https connections at all. In summary there are about 12 pods deployed which make https connections. and at least 5 of them are affected with that problem every deploy, meaning that there are 5 pods right after deploy which cannot make https requests until restart (but pod restart has the same probability to fix or not to fix) or next deploy (where we will just have another 5 pods or even the very same).

@am11
Copy link
Member

am11 commented Feb 2, 2022

Found a similar-looking conversation on the internets: https://www.mail-archive.com/openssl-users@openssl.org/msg89470.html.

(based on which I have a guess) Maybe we just need to move this ERR_clear_error():

out of the if/else block (so it runs in both cases)?

One thing jumped out at me (perhaps intentional?) is we end up p/invoking CryptoNative_EnsureLibSslInitialized twice:

CryptoInitializer.Initialize();
//Call ssl specific initializer
Ssl.EnsureLibSslInitialized();
once from CryptoInitializer's cctor and then again from line 26. On native side, CryptoNative_EnsureOpenSslInitialized is called by CryptoNative_EnsureLibSslInitialized
void CryptoNative_EnsureLibSslInitialized()
{
CryptoNative_EnsureOpenSslInitialized();
which has pthread_once call, but rest of this function's body (including DetectCiphersuiteConfiguration()) will run for each call to CryptoNative_EnsureLibSslInitialized. If first call encountered an error and error list wasn't cleared, the next time we can end up having state errors (per the conversation in linked thread).

@bartonjs
Copy link
Member

bartonjs commented Feb 2, 2022

I don't think it's a double/concurrency problem (even though apparently both System.Net.Http and System.Net.Security will call EnsureLibSslInitialized), but a tainted error queue problem.

The SSL initializer throws if there's any error in the error queue after it calls the P/Invoke. What it means by that is "if any error was raised during initialization"... but if the error was left over by some previous operation (which perhaps hit the error and considered it skippable/part of a fallback without throwing) at just the core crypto layer, then we'll think SSL failed to initialize.

Alternatively, it could be that the error is coming from inside our configuration code itself... but if that's the case then it probably just means we're reporting the wrong error (the last one from a cascade instead of the first/correct one for the operation... which is a problem we're already looking to systematically address for vNext).

If you can get a repro under a debugger, doing something like call ERR_print_errors_fp(stdout) at the end of CryptoNative_EnsureLibSslInitialized might reveal something more interesting going on.

@meareindel
Copy link
Author

meareindel commented Feb 3, 2022

If you can get a repro under a debugger, doing something like call ERR_print_errors_fp(stdout)

Could you please elaborate? Does it mean I can attach to dotnet-process on affected pod with some debugger like gdb/SOS, put a breakpoint somehow at the right place and then call some code within it? Or should it be some part of runtime (System.Net.Security in particular) or the whole runtime be rebuilt with that call ERR_print_errors_fp(stdout)? Can I find some additional info about it somewhere, in particular how exactly I can do such a thing?

@bartonjs
Copy link
Member

bartonjs commented Feb 3, 2022

SOS isn't required, since no managed code is involved.

For LLDB it'd be something like

$ lldb dotnet -- bin/Debug/netcoreapp6.0/HttpClientStartupBug.dll
Added Microsoft public symbol server
(lldb) target create "/usr/bin/dotnet"
Current executable set to '/usr/bin/dotnet' (x86_64).
(lldb) settings set -- target.run-args  "bin/Debug/netcoreapp6.0/HttpClientStartupBug.dll"
(lldb) b CryptoNative_EnsureLibSslInitialized
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.
(lldb) run
Process 21619 launched: '/usr/bin/dotnet' (x86_64)
1 location added to breakpoint 1
Process 21619 stopped
* thread #14, name = '.NET ThreadPool', stop reason = breakpoint 1.1
    frame #0: 0x00007ffff4134bf0 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_EnsureLibSslInitialized
libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_EnsureLibSslInitialized:
->  0x7ffff4134bf0 <+0>: pushq  %rbp
    0x7ffff4134bf1 <+1>: pushq  %r15
    0x7ffff4134bf3 <+3>: pushq  %r14
    0x7ffff4134bf5 <+5>: pushq  %r13
(lldb) finish
Process 21619 stopped
* thread #14, name = '.NET ThreadPool', stop reason = step out
    frame #0: 0x00007fff7dce2a1f
->  0x7fff7dce2a1f: leaq   -0x80(%rbp), %rdi
    0x7fff7dce2a23: callq  *0x835af(%rip)
    0x7fff7dce2a29: leaq   -0x80(%rbp), %rdi
    0x7fff7dce2a2d: callq  *0x8359d(%rip)
(lldb) call (void)ERR_print_errors_fp(stdout);
(lldb)

Mine did nothing, because there were no errors in the queue. Had there been, it'd look like

(lldb) call (void)ERR_print_errors_fp(stdout);
140737353987904:error:0406706C:rsa routines:rsa_ossl_public_decrypt:data greater than mod len:../crypto/rsa/rsa_ossl.c:539:
(lldb)

@meareindel
Copy link
Author

Thanks for such a detailed explanation! I've managed to debug our apps with similar way but still no luck. It seems that attached debugger affects somehow how app works, and the bug doesn't reproduce.

I've made a script for gdb(not lldb for it can't 'finish' and 'call' in one breakpoint command) to attach to dotnet process in a running kubernetes pod and I run it just after the container start to catch ssl initialization routine. And I've changed the method for breakpoint to CryptoNative_ErrPeekLastError to be sure that thread will suspend right after initialization so there will be less debug affect. However, everything works fine with it. If I don't attach a debugger, the error comes back.

There also were some thread suspends because of SIG34 and SIGPIPE, maybe these also take part somehow, but I don't know. Added ignore of these in script.

I'll try to run applications with just an attached debugger (without breakpoints, just attach and continue) to see if it helps, and if it will, it probably would be some workaround (very odd though), but I wonder if attached debugger will affect overall performance of application. And as far as I understood if I run the dotnet process with gdb (define entrypoint of docker image with something like 'gdb run dotnet ...') I won't be able to detach gdb after breakpoint hit to ensure there is no performance hit at least right after ssl initialization.

Maybe some other ideas?

Gdb script I've used (the whole shell command which runs it):

cat << EOF > script && gdb --pid 1 -iex 'set pagination off' -x script -q
set logging on
handle SIG34 nostop noprint pass noignore
handle SIGPIPE nostop noprint pass noignore

define ssl_trace
call (void)ERR_print_errors_fp(stdout)
continue
end

b CryptoNative_ErrPeekLastError
commands
ssl_trace
end
continue
EOF

@bartonjs
Copy link
Member

bartonjs commented Feb 4, 2022

Well, the idea/hope is that hitting the error once will let the full error queue be displayed, and that from there it'll be "obvious" what's wrong.

@meareindel
Copy link
Author

If I remove that 'call (void)ERR_print_errors_fp(stdout)' from script, the error comes back. It seems that it actually do the trick, because it pops all the errors from queue, and then the execution continue as if there were no errors. But for some reason it does not print these errors to the stdout neither to gdb log or lldb stdout. Tested by 'call fprintf(stdout, "test")'. It seems that there is no tty for dotnet process (ps aux shows question mark). Will try to figure it out. At least maybe changing the image entrypoint to 'gdb run dotnet...' will probably work.

@wfurt
Copy link
Member

wfurt commented Feb 4, 2022

ERR_print_errors_fp will empty the error queue. You can try stderr as well. Some of it depends how to process is started and the output can be closed or redirected. You can also try "docker logs " (or Kubernetes equivalent)

@meareindel
Copy link
Author

Finally managed to get the output of 'call (void)ERR_print_errors_fp(stdout)' - it was just needed to 'call fflush(stdout)' additionally to see the output in pod logs.

Unfortunately there is no much info:

140052935976704:error:04067084:rsa routines:rsa_ossl_public_decrypt:data too large for modulus:../crypto/rsa/rsa_ossl.c:553:

and

139813453678336:error:0407008A:rsa routines:RSA_padding_check_PKCS1_type_1:invalid padding:../crypto/rsa/rsa_pk1.c:66:
139813453678336:error:04067072:rsa routines:rsa_ossl_public_decrypt:padding check failed:../crypto/rsa/rsa_ossl.c:588:

Sometimes there are twice of these errors, eg two 04067084 errors and(or) two pairs of 0407008A and 04067072.

I've also tried to build two custom aspnet images - one for Debian Buster with .net6 runtime and one for Debian Bullseye with .net5 runtime, and tried to use them as the base images for our microservices. Those with net5 runs ok, and other with net6 have this issue reproducing.

Also there are no parallel https connections needed to reproduce this - I mean, just one api request right after the pod start which makes a https connection can cause this issue, so it seems this is not caused by somehow multiple openssl init neither any threading synchronization. Perhaps it is kind of memory corruption somewhere or maybe some pointers are not NULL'ed on init... or maybe there is really something that calls that rsa_ossl_public_decrypt.

I've looked at the pal_ssl.c which @am11 was talking about and I wonder why there is ERR_clear_error call at all in the 'no system default' if branch. It would be some kind of a workaround though if that would get called unconditionally and just hide the real issue happening somewhere.

I'll try to see if CryptoNative_EnsureOpenSslInitialized or OPENSSL_init_ssl get called before CryptoNative_EnsureLibSslInitialized and also if there is any call of rsa_ossl_public_decrypt.

@meareindel
Copy link
Author

meareindel commented Feb 6, 2022

So that is what is going on:

Thread 40 ".NET ThreadPool" hit Breakpoint 3, 0x00007fc58c6cdb00 in RSA_padding_check_PKCS1_type_1 () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#0  0x00007fc58c6cdb00 in RSA_padding_check_PKCS1_type_1 () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#1  0x00007fc58c6cc9b6 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#2  0x00007fc58c6d0b6b in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#3  0x00007fc58c6d0e68 in RSA_verify () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#4  0x00007fca346091ab in CryptoNative_RsaVerifyHash () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libSystem.Security.Cryptography.Native.OpenSsl.so
#5  0x00007fc9c5a22836 in ?? ()
#6  0x0000000000000100 in ?? ()
#7  0x00007fc556ff9d20 in ?? ()
#8  0x00007fc556ff9d40 in ?? ()
#9  0x00000000d0363d97 in ?? ()
#10 0x00007fca37a43900 in ?? () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libcoreclr.so
#11 0x00007fc556ffccf0 in ?? ()
#12 0x00007fc9c5d0e7c8 in ?? ()
#13 0x00007fc9c5d0e7c8 in ?? ()
#14 0x00007fc556ff9d20 in ?? ()
#15 0x00007fc9c5a22836 in ?? ()
#16 0x00007fc556ff9de0 in ?? ()
#17 0x00007fc9c5a22078 in ?? ()
#18 0x00007fc9c5d0e7c8 in ?? ()
#19 0x00007fc668227920 in ?? ()
#20 0x00007fc590010340 in ?? ()
#21 0x00007fc688828220 in ?? ()
#22 0x00007fc688828770 in ?? ()
#23 0x00007fc900000001 in ?? ()
#24 0x0000002056ff9d80 in ?? ()
#25 0x00007fc556ffa9e0 in ?? ()
#26 0x00007fc68880c518 in ?? ()
#27 0x00007fc68880c518 in ?? ()
#28 0x00007fc6881516b0 in ?? ()
#29 0x00007fc68880ded0 in ?? ()
#30 0x00007fc556ff9e50 in ?? ()
#31 0x00007fc9c5a226f3 in ?? ()
#32 0x00007fc600000100 in ?? ()
#33 0x00007fc9c5a1ac6d in ?? ()
#34 0x0000010000000000 in ?? ()
#35 0x00007fc688828220 in ?? ()
#36 0x0000002056ff9e50 in ?? ()
#37 0x00007fc688828770 in ?? ()
#38 0x0000000000000000 in ?? ()

Thread 40 ".NET ThreadPool" hit Breakpoint 3, 0x00007fc58c6cdb00 in RSA_padding_check_PKCS1_type_1 () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#0  0x00007fc58c6cdb00 in RSA_padding_check_PKCS1_type_1 () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#1  0x00007fc58c6cc9b6 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#2  0x00007fc58c6d0b6b in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#3  0x00007fc58c6d0e68 in RSA_verify () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#4  0x00007fca346091ab in CryptoNative_RsaVerifyHash () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libSystem.Security.Cryptography.Native.OpenSsl.so
#5  0x00007fc9c5a25586 in ?? ()
#6  0x0000000000000100 in ?? ()
#7  0x00007fc556ff9d20 in ?? ()
#8  0x00007fc556ff9d40 in ?? ()
#9  0x00000000d0363d97 in ?? ()
#10 0x00007fca37a43900 in ?? () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libcoreclr.so
#11 0x00007fc556ffccf0 in ?? ()
#12 0x00007fc9c515af10 in ?? ()
#13 0x00007fc9c515af10 in ?? ()
#14 0x00007fc556ff9d20 in ?? ()
#15 0x00007fc9c5a25586 in ?? ()
#16 0x00007fc556ff9de0 in ?? ()
#17 0x00007fc9c4880848 in ?? ()
#18 0x00007fc9c515af10 in ?? ()
#19 0x00007fc66822a988 in ?? ()
#20 0x00007fc590010340 in ?? ()
#21 0x00007fc688828220 in ?? ()
#22 0x00007fc688828e78 in ?? ()
#23 0x00007fc900000001 in ?? ()
#24 0x0000002056ff9d80 in ?? ()
#25 0x00007fc556ffa9e0 in ?? ()
#26 0x00007fc68880c518 in ?? ()
#27 0x00007fc68880c518 in ?? ()
#28 0x00007fc6881516b0 in ?? ()
#29 0x00007fc68880ded0 in ?? ()
#30 0x00007fc556ff9e50 in ?? ()
#31 0x00007fc9c5a25443 in ?? ()
#32 0x00007fc600000100 in ?? ()
#33 0x00007fc9c5a24c1d in ?? ()
#34 0x0000010000000000 in ?? ()
#35 0x00007fc688828220 in ?? ()
#36 0x0000002056ff9e50 in ?? ()
#37 0x00007fc688828e78 in ?? ()
#38 0x0000000000000000 in ?? ()

Thread 40 ".NET ThreadPool" hit Breakpoint 3, 0x00007fc58c6cdb00 in RSA_padding_check_PKCS1_type_1 () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#0  0x00007fc58c6cdb00 in RSA_padding_check_PKCS1_type_1 () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#1  0x00007fc58c6cc9b6 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#2  0x00007fc58c6d0b6b in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#3  0x00007fc58c6d0e68 in RSA_verify () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#4  0x00007fca346091ab in CryptoNative_RsaVerifyHash () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libSystem.Security.Cryptography.Native.OpenSsl.so
#5  0x00007fc9c5a22836 in ?? ()
#6  0x0000000000000100 in ?? ()
#7  0x00007fc556ff9d20 in ?? ()
#8  0x00007fc556ff9d40 in ?? ()
#9  0x00000000d0363d97 in ?? ()
#10 0x00007fca37a43900 in ?? () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libcoreclr.so
#11 0x00007fc556ffccf0 in ?? ()
#12 0x00007fc9c5d0e7c8 in ?? ()
#13 0x00007fc9c5d0e7c8 in ?? ()
#14 0x00007fc556ff9d20 in ?? ()
#15 0x00007fc9c5a22836 in ?? ()
#16 0x00007fc556ff9de0 in ?? ()
#17 0x00007fc9c5a22078 in ?? ()
#18 0x00007fc9c5d0e7c8 in ?? ()
#19 0x00007fc6283b10b0 in ?? ()
#20 0x00007fc590010340 in ?? ()
#21 0x00007fc688836cc8 in ?? ()
#22 0x00007fc688836fc8 in ?? ()
#23 0x00007fc900000001 in ?? ()
#24 0x0000002056ff9d80 in ?? ()
#25 0x00007fc556ffa9e0 in ?? ()
#26 0x00007fc68880c518 in ?? ()
#27 0x00007fc68880c518 in ?? ()
#28 0x00007fc688160f60 in ?? ()
#29 0x00007fc68880ded0 in ?? ()
#30 0x00007fc556ff9e50 in ?? ()
#31 0x00007fc9c5a226f3 in ?? ()
#32 0x00007fc600000100 in ?? ()
#33 0x00007fc9c5a1ac6d in ?? ()
#34 0x0000010000000000 in ?? ()
#35 0x00007fc688836cc8 in ?? ()
#36 0x0000002056ff9e50 in ?? ()
#37 0x00007fc688836fc8 in ?? ()
#38 0x0000000000000000 in ?? ()
[Switching to Thread 0x7fc5dd5d1700 (LWP 28)]

Thread 22 ".NET ThreadPool" hit Breakpoint 1, 0x00007fca34609bf0 in CryptoNative_EnsureLibSslInitialized () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libSystem.Security.Cryptography.Native.OpenSsl.so
#0  0x00007fca34609bf0 in CryptoNative_EnsureLibSslInitialized () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libSystem.Security.Cryptography.Native.OpenSsl.so
#1  0x00007fc9c7e59852 in ?? ()
#2  0x00790053002e0073 in ?? ()
#3  0x00000000d0363d97 in ?? ()
#4  0x00007fca37a43900 in ?? () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libcoreclr.so
#5  0x00007fc5dd5cd7e8 in ?? ()
#6  0x00007fc9c818cd10 in ?? ()
#7  0x0002ab6154f93bec in ?? ()
#8  0x00007fc5dd5cd000 in ?? ()
#9  0x00007fc9c7e59852 in ?? ()
#10 0x00007fc5dd5cd080 in ?? ()
#11 0x00007fca37a500a0 in ?? () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libcoreclr.so
#12 0x00007fc5a8004b00 in ?? ()
#13 0x00007fc5dd5cd178 in ?? ()
#14 0x00007fc5dd5cd178 in ?? ()
#15 0x000056159fbd1360 in ?? ()
#16 0x00007fc5dd5cd0c0 in ?? ()
#17 0x00007fc5dd5cd100 in ?? ()
#18 0x00007fc5dd5cd0a0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 22 ".NET ThreadPool" hit Breakpoint 2, 0x00007fca3459cfe0 in OPENSSL_init_ssl () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#0  0x00007fca3459cfe0 in OPENSSL_init_ssl () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#1  0x00007fca345a0e8e in SSL_CTX_new () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#2  0x00007fca34609c56 in CryptoNative_EnsureLibSslInitialized () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libSystem.Security.Cryptography.Native.OpenSsl.so
#3  0x00007fc9c7e59852 in ?? ()
#4  0x00790053002e0073 in ?? ()
#5  0x00000000d0363d97 in ?? ()
#6  0x00007fca37a43900 in ?? () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libcoreclr.so
#7  0x00007fc5dd5cd7e8 in ?? ()
#8  0x00007fc9c818cd10 in ?? ()
#9  0x0002ab6154f93bec in ?? ()
#10 0x00007fc5dd5cd000 in ?? ()
#11 0x00007fc9c7e59852 in ?? ()
#12 0x00007fc5dd5cd080 in ?? ()
#13 0x00007fca37a500a0 in ?? () from /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.1/libcoreclr.so
#14 0x00007fc5a8004b00 in ?? ()
#15 0x00007fc5dd5cd178 in ?? ()
#16 0x00007fc5dd5cd178 in ?? ()
#17 0x000056159fbd1360 in ?? ()
#18 0x00007fc5dd5cd0c0 in ?? ()
#19 0x00007fc5dd5cd100 in ?? ()
#20 0x00007fc5dd5cd0a0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

There is some call of CryptoNative_RsaVerifyHash just right before (or even looks like at the same time) the call of CryptoNative_EnsureLibSslInitialized, so it seems that there are errors in queue from former call.

I've tried to find through all runtime sources of 6.0.1 branch where the CryptoNative_RsaVerifyHash call is, and the only place I've found so far is the libraries/Common/tests/System/Security/Cryptography/AlgorithmImplementations/RSA/SignVerify.cs and it's descendants, where are several tests that calls this. And it is very odd but it is looks like these tests are constantly fired by some timer in the running application, because I see that calls runned periodically within same thread then application is idle (but there is different count of calls, some times there are seven, sometimes there are five, some of them has shorter stack length, but they still goes one by one in same thread).

Any thoughts?

@meareindel
Copy link
Author

I've searched too little. These are indeed tests - not runtime though but rather our e2e tests :) and that calls were from JWT signature validation routine in the authentication middleware.

Looks like I've stuck. I suggested there is some race or whatever with that signature validation, but the issue seems to be repeated even without using authorization header at all (at our dev environment there is no strict requirement of authorization). Tried to regenerate RSA-keys within IDM (we're using Keycloak) but still no luck.

@meareindel
Copy link
Author

Finally managed to reproduce it on simple case. It turned out the combination of two things:

  1. We are using multiple authentication schemes, each of them trying to authenticate incoming request.
  2. We are using plain http connection to IDM because we're in same kubernetes network, so we do not initialize ssl when IDM metadata are fetched.

So in my simple case I have two schemas (AddJwtBearer), and first added scheme is not the scheme suited for incoming request; but first scheme tries to authenticate with incoming token and fails, and then something happens, so there are errors left in openssl error queue, and then the outgoing https connection init fails.

I'll try to reproduce this on a local docker a bit later, and will anyway attach small reproducible solution.

@wfurt
Copy link
Member

wfurt commented Feb 7, 2022

I'm wondering @bartonjs if there is path to crypto that avoids the OpenSsl initialization through SslStream. In 5.x we used the library init so everything happened automatically when library was loaded - similar to #62037

@bartonjs
Copy link
Member

bartonjs commented Feb 7, 2022

if there is path to crypto that avoids the OpenSsl initialization through SslStream

Sure. Use anything other than SslStream ;)

// Fires the crypto initializer, but not the "Ssl" initializer
RSA.Create().Dispose();

// Re-fires the crypto initializer, and also the "Ssl" initializer
[something with SslStream]

// This might actually fire the crypto initializer a third time, and the Ssl one a second time, but we can probably remove that (unless we still have the curl handler?)
[make HttpClient do an HTTPS thing]

But there's not a problem with the crypto initializer firing twice (or more). It's designed for that.

Looking at what's going on, we have

  • Call the crypto initializer (in OpenSSL 0.9.* and OpenSSL 1.0.* this initialized libcrypto without really waking up libssl; with OpenSSL 1.1.0+ it actually does everything to wake up OpenSSL)
    • This routine has no failure reporting, no exceptions are expected (other than "couldn't load a module")
  • At this point we... could... clear the error queue
  • If we're OpenSSL 1.0.*, call libssl!SSL_library_init
    • This routine technically has a return code. Except it's always 1. I don't think anything it does could contaminate the queue, and even if it did it doesn't look like looking at it is at all normal.
  • Do the config test to see if openssl.cnf has specified ciphersuites, or if we should be opinionated and use our restricted list.
    • This routine clears the error queue in the one case it knows it produces an error... so in that one case, no matter what, we've erased any previous error.
  • Done with the ssl initializer
    • It doesn't report a success/failure.
  • From managed code, ask if there's anything in the error queue.

Given that there's already a clearing path, and nothing ever reports failure (meaning nothing should ever have a reason to look at the error queue), I don't understand why that last check is there. It got added in dotnet/corefx#29171, but I don't see anything particularly justifying it.

I think my recommendation is to just... stop checking.

@meareindel
Copy link
Author

meareindel commented Feb 8, 2022

Hey! Here is the solution to reproduce this issue, added comments to clarify what's going on.
ssl-breaking.zip

@meareindel
Copy link
Author

meareindel commented Feb 8, 2022

We've ended up with this workaround (calling this method at the very start of our services):

        private void InitSsl()
        {
            var sslInitializerType = typeof(SslStream).Assembly.GetType("Interop+SslInitializer");
            if (sslInitializerType != null)
            {
                try
                {
                    sslInitializerType.InvokeMember("Initialize", BindingFlags.Static | BindingFlags.NonPublic | BindingFlags.InvokeMethod, null, null, null);
                    _logger.LogInformation("OpenSSL successfully inited");
                }
                catch (Exception ex)
                {
                    _logger.LogWarning(ex, "Failed to initialize OpenSSL");
                }
            }
            else
            {
                _logger.LogWarning("Failed to initialize OpenSSL: no initializer type found");
            }
        }

Wonder if it could be done with more 'intelligent' way.

I'll also try to make a docker-compose for the whole reproducible solution, but it needs some time to make a prepared test keycloak db and to make an image with that.

@wfurt
Copy link
Member

wfurt commented Feb 8, 2022

How do you trigger or observe the failure @meareindel? I build and run the docker image but nothing obvious shows up.

@meareindel
Copy link
Author

I had to describe it yesterday but was already bad at thinking.

For this solution to reproduce you need to have some additional IDM be deployed, like Keycloak, Identity4 or whatever, which could be connected by plain http. Then you target two authorization schemes to it, modifying constants in UrlsHelper (if it is Keycloak, but I doubt if it will work unchanged with some other IDM because of urls forming). Then you run both of it under docker and run the BreakSsl endpoint of ClientController. It will get the JWT from the second scheme and send it to Authenticate endpoint of ServerController just to run the token validation routine so that first scheme will reject the token first and then second scheme will get it validated. Then it will make a call to MakeHttps endpoint of ServerController, which in turn will try to make a https connection to github - and at this point it should fail.

A bit complicated though, that's why I'll try to make the docker-compose for the whole pipeline, but a bit later.

@meareindel
Copy link
Author

meareindel commented Feb 9, 2022

Here we are:

  1. Unpack break-ssl.zip to some folder, for example, /ssl-break
  2. cd /ssl-break
  3. build docker image with "docker build -t webapplication1 ." or from Visual Studio
  4. docker-compose up (change the host forward ports if needed in docker-compose.yml)
  5. wait until keycloak started (there will be something like 'webapplication1-keycloak-1 | 08:48:01,602 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://127.0.0.1:9990' in console)
  6. curl http://localhost:8181/Client/BreakSsl or open that url in browser (use other host port you've forwarded, if you changed it)
  7. see the error in docker-compose console

I've simplified urls forming comparing to previous code (but it now depends on how docker-compose.yml is written thought) and added preconfigured realms to keycloak so it imports these at start. The realm one is just empty realm (I mean no users) and the realm two has user 'test/test' with which net-app logs in to get the JWT.

@sagarsumant
Copy link

sagarsumant commented Feb 10, 2022

Btw i am getting this error
System.Net.Http.HttpRequestException : An error occurred while sending the request.
---- System.IO.IOException : The decryption operation failed, see inner exception.
-------- Interop+OpenSsl+SslException : Decrypt failed with OpenSSL error - SSL_ERROR_SSL.
------------ Interop+Crypto+OpenSslCryptographicException : error:04067072:rsa routines:rsa_ossl_public_decrypt:padding check failed

@meareindel - do you think it could be same issue or different issue?
I am trying this workaround in my app, but I see this,

image

@meareindel
Copy link
Author

meareindel commented Feb 10, 2022

@sagarsumant , it could be this issue if you have the same prerequisites (multiple authentication schemes with metadata and keys fetching by plain http - or maybe at least these are manually set from some config at the app start; and you send the JWTs which the first scheme can not authenticate); or it could be issue with self-signed certificates, for example, like in #62810, my workaroung will not help you in that case; or it could be some another issue at all, maybe even still unknown. You can try to search within existing runtime issues or/and google or create a new one.

The type 'Interop+SslInitializer' exists only when you run your app on Unix/Linux, so on IISExpress or Kestrel in Visual Studio it will always be null. You can run on Docker with right click on project in VS and "Add -> Docker support... -> Linux" in case you've never done it before someway; you will also need Docker Desktop running on WSL2. After that you will be able to select the "Docker" option in the Debug target list:
image

And you also will need to add ports config to the "Docker" section of launchSettings.json to be able to access your app from host browser or whatever:
image

@karelz
Copy link
Member

karelz commented Feb 10, 2022

Triage: Regression against 5.0. We should look at it in 7.0 and consider backporting to servicing once we know scope and risk of the fix.

@karelz karelz added this to the 7.0.0 milestone Feb 10, 2022
@karelz karelz added bug and removed untriaged New issue has not been triaged by the area owner labels Feb 10, 2022
@rzikm rzikm self-assigned this Feb 15, 2022
@rzikm
Copy link
Member

rzikm commented Feb 16, 2022

I have been unable to reproduce this in main, I suspect #65148 has fixed it. Either way, I have noticed something strange, I noticed that the errors in question ("data too large for modulus" or "padding check failed") should always be cleared when the method returns from CryptoNative_RsaVerifyHash, cause the code should follow the ret == -1 path and go into CryptoNative_ErrGetExceptionError or CryptoNative_ErrGetErrorAlloc (was renamed by #65148). So I tried the following lldb script

b ERR_put_error 
br mod -c "$rdi == 4 && ($rdx == 114 || $rdx == 132)" 1
br com add 1
th i
c
DONE

b CryptoNative_ErrGetErrorAlloc
br com add 2
th i
c
DONE

c

and gathered the threads from which the breakpoints were hit, and according to the log, the calls to CryptoNative_ErrGetErrorAlloc were from completely different threads than those to the ERR_put_error.

LLDB output data
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #42: tid = 1345, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #42: tid = 1345, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #50: tid = 1353, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #54: tid = 1357, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #40: tid = 1343, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #37: tid = 1340, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #41: tid = 1344, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #51: tid = 1354, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #39: tid = 1342, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #38: tid = 1341, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #38: tid = 1341, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #37: tid = 1340, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #43: tid = 1346, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #41: tid = 1344, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #41: tid = 1344, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #38: tid = 1341, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #38: tid = 1341, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #38: tid = 1341, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #39: tid = 1342, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #38: tid = 1341, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #39: tid = 1342, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #52: tid = 1355, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #36: tid = 1339, 0x00007f9989e3c600 libcrypto.so.1.1`ERR_put_error, name = '.NET ThreadPool', stop reason = breakpoint 1.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1
thread #52: tid = 1355, 0x00007f99f0202c20 libSystem.Security.Cryptography.Native.OpenSsl.so`CryptoNative_ErrGetErrorAlloc, name = '.NET ThreadPool', stop reason = breakpoint 2.1

@bartonjs am I missing something vital? cause this seems very wrong to me as we might be potentially missing many errors.

Anyway, removing the check in SslInitializer will fix this particular issue, so I will put up a PR in a minute (for main and later backport it to 6.0 if needed)

@bartonjs
Copy link
Member

am I missing something vital? cause this seems very wrong to me as we might be potentially missing many errors.

Yeah, I was having trouble reproducing it, too. The best I've come up with is that .NET doesn't claim exclusive control over OpenSSL and the repro may involve something else also calling into OpenSSL which is contaminating the error queue.

With #65148 we're still throwing based on the last reported error, but we're clearing at the start of operations to reduce the risk of cross-reporting errors.

@karelz karelz added the os-linux Linux OS (any supported distro) label Feb 21, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Mar 10, 2022
@rzikm
Copy link
Member

rzikm commented Mar 16, 2022

A fix has been merged both in main and release/6.0 branch (to be released in 6.0.4). closing as fixed.

Fixed in 7.0 (main) in PR #65435 and in 6.0.4 in PR #65501.

@rzikm rzikm closed this as completed Mar 16, 2022
@karelz karelz modified the milestones: 7.0.0, 6.0.x Apr 8, 2022
@ghost ghost locked as resolved and limited conversation to collaborators May 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.