Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cert validation: use Android cert validation APIs #2525

Merged
merged 50 commits into from
Oct 18, 2022

Conversation

danzh2010
Copy link
Contributor

@danzh2010 danzh2010 commented Sep 6, 2022

Description: add engine API to allow user config to use Android cert validation APIs.
Risk Level: high
Testing: added tests in Http2TestServerTest.java
Docs Changes:
Release Notes:
Fixes #1575
Part of #2144

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Contributor Author

/retest

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Contributor Author

/retest

Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Contributor Author

/retest

Copy link
Contributor

@RyanTheOptimist RyanTheOptimist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this looks awesome. A number of mostly small comments. We'll definitely need to get some other folks on the review since it's touching a lot of EM stuff that I'm not an expert in, but this looks great!

@@ -24,5 +24,7 @@ EXTENSIONS = {
"envoy.transport_sockets.raw_buffer": "//source/extensions/transport_sockets/raw_buffer:config",
"envoy.transport_sockets.tls": "//source/extensions/transport_sockets/tls:config",
"envoy.http.stateful_header_formatters.preserve_case": "//source/extensions/http/header_formatters/preserve_case:config",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Remove newline?

@@ -85,3 +85,7 @@ extern const char* socket_tag_config_insert;
* direct responses to mutate headers which are then later used for routing.
*/
extern const char* route_cache_reset_filter_insert;

extern const char* default_cert_validation_context_template;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comments, please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


// NOLINT(namespace-envoy)

typedef struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments on all the structs/functions, please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

CertValidatorPtr PlatformBridgeCertValidatorFactory::createCertValidator(
const Envoy::Ssl::CertificateValidationContextConfig* config, SslStats& stats,
TimeSource& /*time_source*/) {
#if defined(__APPLE__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the platform verifier work on non-Android platforms? If so, perhaps we should make this check "if not android"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though we don't have non-Android platform supporting this, the current cronet tests do mock up the API calls so at least those cronet tests worked on non-Android platform. So I'm not going to enforce platform checking here.

Non-cronet enginer builders don't provide the interface to choose platform based cert validator after all.

}

private:
const envoy_cert_validator* platform_bridge_api_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since this member is actually the validator, I'd be inclined to name the member validator_, but your call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Return empty string
std::string getCaFileName() const override { return ""; }

void verifyCertChainByPlatform(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment, please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

private:
const Envoy::Ssl::CertificateValidationContextConfig* config_;
SslStats& stats_;
bool allow_untrusted_certificate_{false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: does = false work here? If so, let's use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -315,6 +315,8 @@ protected static native int recordHistogramValue(long engine, String elements, b
*/
public static native String brotliConfigInsert();

public static native String certValidationTemplate(boolean use_platform);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Comments here, please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's java I think that the convention is to use the camel case naming (see logLevel argument in one of the methods in this file). So let's rename to usePlatform (and update the corresponding doc string)

@@ -78,6 +103,96 @@ public void getSchemeIsHttps() throws Exception {
assertThat(response.getEnvoyError()).isNull();
}

@Test
public void testGetRequest() throws Exception {
System.out.println("TEST_testGetRequest");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: presumably we should remove these printlns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Though I really feel like it will always be needed. Hopefully there will be a better way to distinguish tests.

@danzh2010
Copy link
Contributor Author

/retest

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Contributor Author

/retest

Copy link
Contributor

@Augustyniak Augustyniak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for simplifying the config template logic. It looks great now!

Left a few comments - mostly related to the registerAPI part and the logic responsible for the interpolation of strings in builders.

const std::string& cert_validation_template =
(this->platform_certificates_validation_on_ ? platform_cert_validation_context_template
: default_cert_validation_context_template);
config_builder << "- &validation_context" << cert_validation_template << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I personally think that it is much simpler now - thank you for working on this!

// TODO(danzh) this object leaks, but it's tied to the life of the engine.
envoy_cert_validator* api = (envoy_cert_validator*)safe_malloc(sizeof(envoy_cert_validator));
api->validate_cert = verify_x509_cert_chain;
api->release_validator = jvm_detach_thread;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we could release the api in here too (as part of release_validator). Optional comment as you've already added TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

release_validator is called at the end of each validation to detach thread. We shouldn't release the api during the life time of the engine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things might change if we stop creating a new thread for each validation, but coalesce validations to a few dedicated threads.

bssl::UniquePtr<X509> leaf_cert(d2i_X509(
nullptr, const_cast<const unsigned char**>(&leaf_cert_der.bytes), leaf_cert_der.length));
envoy_cert_validation_result result =
platform_validator_->validate_cert(cert_chain.data(), cert_chain.size(), host_name.c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that last week (in the community meeting) there was an idea to remove the use of registerAPI calls at all and just make platform_bridge_cert_validator call into verify_x509_cert_chain directly. I think that socket tagging example was brought up https://github.com/envoyproxy/envoy-mobile/pull/2423/files#diff-bf08f22bd707e11bdfa9ac4d50f12976176e2722ad82d3e66f5b88457b98f9a0R30 as it does call into platform specific stuff directly from c++ code without going through a register.

Would that be possible to stop using registerApi and call verify_x509_cert_chain directly from here? If yes, maybe we should do it to simplify the logic - we are doing additional work to register the API and I am not sure whether we need to. Do you think that using register is helping us in some way?

Copy link
Contributor Author

@danzh2010 danzh2010 Oct 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to remove registerApi, but ran into a very subtle difficulty in tests where we run cronet engine in MacOS, i.e. https://github.com/envoyproxy/envoy-mobile/actions/runs/3003635334/jobs/4822128405. If the platform_validator_ API gets loaded based on platform, those tests will not be able to load verify_x509_cert_chain. And to make things more complex, one day we will have IOS validation API. Those tests will be very confused about which API to load if the engine doesn't register the right one during initialization. I think java_tests_mac CI is the only blocker for me to get rid of registerApi stuff in this PR. Any thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, not planning to keep blocking on this - just want to ask some questions so that I understand the current situation better. Thank you :)

The logs from https://github.com/envoyproxy/envoy-mobile/actions/runs/3003635334/jobs/4822128405 are expired - I cannot see them :( .

If the platform_validator_ API gets loaded based on platform, those tests will not be able to load verify_x509_cert_chain

Since I do not have access to the logs of the failing tests - could you elaborate on this? Any chance that you remember which tests were failing specifically? Some of our existing platform specific JNI utility methods branch on #if(_ANDROID_API_) check (example here) but not sure whether this would be useful in your case.

one day we will have IOS validation API

yes, that's when the API register will come in handy. I just wonder whether for now we could avoid that additional complexity as between now and then (when we have iOS platform cert validation) a lot can change.

Copy link
Contributor Author

@danzh2010 danzh2010 Oct 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I do not have access to the logs of the failing tests - could you elaborate on this? Any chance that you remember which tests were failing specifically?

Some tests under https://github.com/envoyproxy/envoy-mobile/tree/main/test/java/org/chromium/net/testing runs by android_tests / java_tests_mac CI, and they uses cronet engines which will always use platform API for cert validation.

Some of our existing platform specific JNI utility methods branch on #if(_ANDROID_API_) check (example here) but not sure whether this would be useful in your case.

The AndroidNetworkLibrary APIs has mock interfaces, so that even on non-Android platform we can still run Java tests which call into verify_x509_cert_chain as long as those tests setup the mock interface. And there is no need of #if(_ANDROID_API_) for all the verification related JNI calls. But if we want to call validation interfaces based on #if not defined(__APPLE__) or #if(_ANDROID_API_), the cronet engine tests running on MacOS will not be able to make the right JNI calls.

That being said, today we are not supporting iOS cert validation APIs, so a temporary solution to make java tests on MacOS happy could be hard coding those JNI calls instead of calling them in #ifdef block. As a result, any engine enables platform based cert validation will call into those JNI functions. But this PR doesn't enable swift engine to use platform based cert validation, and in C++ engine we check platform before enabling it. This won't work once we support iOS cert validation API but still want to run tests under https://github.com/envoyproxy/envoy-mobile/tree/main/test/java/org/chromium/net/testing on MacOS.

one day we will have IOS validation API

yes, that's when the API register will come in handy. I just wonder whether for now we could avoid that additional complexity as between now and then (when we have iOS platform cert validation) a lot can change.

Will there be any chance not to run tests under https://github.com/envoyproxy/envoy-mobile/tree/main/test/java/org/chromium/net/testing on MacOS in the future? If that's the case, we won't even need platform registry after we support iOS cert validation API. Those CIs are helpful for Android developer using Mac, but not very valuable in real device scenario.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for a detailed answer(s). I think that the current approach of registering filters makes sense if it helps us to have a better test coverage in here. I wish there was an easier way to make all of this works (actual implementation + tests) - maybe that's something that we will be able to make easier in the future.

The one remaining comment that I have left is the one with regard to moving the registration of the filter from its current place to ...initialize method instead? I left more details in #2525 (comment). What do you think about this one?

Will there be any chance not to run tests under https://github.com/envoyproxy/envoy-mobile/tree/main/test/java/org/chromium/net/testing on MacOS in the future?

If we decided that this is the way to go I think that there is nothing that should prevent us from doing so from the technical point of view. Not entirely sure why we run some of Android tests on macOS and some on Linux but I am guessing that we just wanted to cover testing on more platforms - not sure how useful is that, we can discuss in the community meeting if needed.


set_vm(vm);
// At this point, we know Android APIs are available. Register cert chain validation JNI calls.
envoy_status_t result = jvm_register_cert_validator();
Copy link
Contributor

@Augustyniak Augustyniak Oct 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decided that we need to keep registerAPI logic in I think that we should move the registration to AndoridJNILibrary.initialize in here because:

  1. The initialize method is all about the initialization so registering API sounds like a good fit for it.
  2. The initialize method is Android specific and so is AndroidJNILibrary.initialize
  3. find_class does not work before AndroidJNILibrary.initialize gets called and AndroidJNILibrary.initialize gets called after JNI_onLoad so in general if your implementation of jvm_register_cert_validator ever needs to access find_class it will fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we decided to keep using registerAPI, doing that during AndoridJNILibrary.initialize makes sense. In order do to so, I have to move the relevant JNI calls from jni_interface.cc to android_jni_utility.cc. PTAL!

@@ -192,6 +192,12 @@ - (nullable NSString *)resolveTemplate:(NSString *)templateYAML {
[definitions
appendFormat:@"- &stats_flush_interval %lus\n", (unsigned long)self.statsFlushSeconds];

NSString *cert_validator_template =
[[NSString alloc] initWithUTF8String:default_cert_validation_context_template];
[definitions appendFormat:@"- &validation_context_config_trust_chain%@\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the c++ builder is what we want other builders to look like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -284,6 +287,9 @@ String resolveTemplate(final String configTemplate, final String platformFilterT
configBuilder.append("] \n");
}

// Add a new anchor to override the default anchors in config header.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to prepend it with &validation_context or similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I myself got confused about what was included in the template and what wasn't. &validation_context is already included in the template string. So in the engine implementations, there is no need to append it.
The Java engine here is doing the right thing. The C++ engine shouldn't have it appended again, though it seems still to be a valid YAML as the unit test passed. And the swift engine was out-dated.

Signed-off-by: Dan Zhang <danzh@google.com>
@@ -156,5 +156,25 @@ TEST(TestConfig, RemainingTemplatesThrows) {
}
}

#if not defined(__APPLE__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we add a doc string explaining that this test is effectively not run if being executed on a mac (which is the case currently)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, I added #else block where calling enablePlatformCertificatesValidation(true) causes death.

bssl::UniquePtr<X509> leaf_cert(d2i_X509(
nullptr, const_cast<const unsigned char**>(&leaf_cert_der.bytes), leaf_cert_der.length));
envoy_cert_validation_result result =
platform_validator_->validate_cert(cert_chain.data(), cert_chain.size(), host_name.c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for a detailed answer(s). I think that the current approach of registering filters makes sense if it helps us to have a better test coverage in here. I wish there was an easier way to make all of this works (actual implementation + tests) - maybe that's something that we will be able to make easier in the future.

The one remaining comment that I have left is the one with regard to moving the registration of the filter from its current place to ...initialize method instead? I left more details in #2525 (comment). What do you think about this one?

Will there be any chance not to run tests under https://github.com/envoyproxy/envoy-mobile/tree/main/test/java/org/chromium/net/testing on MacOS in the future?

If we decided that this is the way to go I think that there is nothing that should prevent us from doing so from the technical point of view. Not entirely sure why we run some of Android tests on macOS and some on Linux but I am guessing that we just wanted to cover testing on more platforms - not sure how useful is that, we can discuss in the community meeting if needed.

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
HasSubstr("envoy_mobile.cert_validator.platform_bridge_cert_validator"));
ASSERT_THAT(bootstrap.DebugString(), Not(HasSubstr("trusted_ca")));
#else
EXPECT_DEATH(engine_builder.enablePlatformCertificatesValidation(true),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is great - thank you

Augustyniak
Augustyniak previously approved these changes Oct 13, 2022
Copy link
Contributor

@Augustyniak Augustyniak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are red.

At this point I am fine with merging this PR as is but I left a few tiny comments. It would be great if you could address/respond to them.

Thank you for working with me on this PR - I've learnt a lot about the JIN and c++ engine builder while working/talking with you.

envoy_status_t result =
register_platform_api(cert_validator_name, get_android_cert_validator_api());
if (result == ENVOY_FAILURE) {
return -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we should just return register_platform_api... in here? ENVOY_SUCCESS is equal to 1 which is different than -1 that we return in here but I guess we decided to use ENVOY_SUCCESS/ENVOY_FAILURE constants so shiould be probably fine to return to simplify this and do return register_platform_api...

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline - looks good the way it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to return ENVOY_FAILURE here and made android_test_jni_inteface.cc return directly.

// At this point, we know Android APIs are available. Register cert chain validation JNI calls.
envoy_status_t result =
register_platform_api(cert_validator_name, get_android_cert_validator_api());
if (result == ENVOY_FAILURE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment with regard to a returns value as in my comment above.

Signed-off-by: Dan Zhang <danzh@google.com>
@danzh2010
Copy link
Contributor Author

/retest

1 similar comment
@Augustyniak
Copy link
Contributor

/retest

@Augustyniak
Copy link
Contributor

@goaway can you re-review/approve this PR? I think that your concerns have been addressed

@danzh2010
Copy link
Contributor Author

Friendly ping on this :)

@Augustyniak
Copy link
Contributor

@goaway I am going to merge this PR as I think that your concerns with regard to configuration template have been addressed.

@Augustyniak Augustyniak merged commit 6b07e7d into envoyproxy:main Oct 18, 2022
@Augustyniak Augustyniak mentioned this pull request Oct 18, 2022
colibie pushed a commit to colibie/envoy-mobile that referenced this pull request Oct 22, 2022
Description: add engine API to allow user config to use Android cert validation APIs.
Risk Level: high
Testing: added tests in Http2TestServerTest.java
Docs Changes:
Release Notes:
Fixes envoyproxy#1575
Part of envoyproxy#2144

Signed-off-by: danzh <danzh2010@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cronvoy: support Android TLS mechanism
5 participants