Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use hash func that generates deterministic result for protobuf #5814

Merged
merged 10 commits into from
Feb 8, 2019

Conversation

quanjielin
Copy link
Contributor

This PR uses hash func that generates deterministic result for proto.

Background -
When integrate envoy SDS with Istio, we found envoy sends out multiple requests for same sdsconfig(details in #5744);

After some debugging, we found the issue describes in protocolbuffers/protobuf#5668.

@quanjielin quanjielin force-pushed the quanlinhash0201 branch 2 times, most recently from 5eddb10 to 2d018ec Compare February 2, 2019 01:42
@quanjielin
Copy link
Contributor Author

/cc @qiwzhang @lizan PTAL.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the investigation and fix. Can we craft a failing test for both of these call sites? Thank you!

/wait

@@ -196,7 +196,7 @@ Router::RouteConfigProviderPtr RouteConfigProviderManagerImpl::createRdsRouteCon
// RdsRouteConfigSubscriptions are unique based on their serialized RDS config.
// TODO(htuch): Full serialization here gives large IDs, could get away with a
// strong hash instead.
const std::string manager_identifier = rds.SerializeAsString();
const std::string manager_identifier = std::to_string(MessageUtil::hash(rds));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's probably OK to use an 8-byte hash for this (we do similar things elsewhere for collision detection). Assuming people agree, can you remove the TODO above? (The alternative would be to use deterministic serialization to string much like MessageUtil::hash does.)

@@ -50,7 +50,8 @@ class SecretManagerImpl : public SecretManager {
findOrCreate(const envoy::api::v2::core::ConfigSource& sds_config_source,
const std::string& config_name,
Server::Configuration::TransportSocketFactoryContext& secret_provider_context) {
const std::string map_key = sds_config_source.SerializeAsString() + config_name;
const std::string map_key =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just use uint64_t for map_key type and change type elsewhere, no need convert to string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since resourcename is also needed to construct mapkey, are you suggesting using something like MessageUtil::hash(sds_config_source+config_name) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry I meant rds_impl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@quanjielin
Copy link
Contributor Author

/cc @PiotrSikora

Signed-off-by: Quanjie Lin <quanlin@google.com>
Signed-off-by: Quanjie Lin <quanlin@google.com>
Signed-off-by: Quanjie Lin <quanlin@google.com>
Signed-off-by: Quanjie Lin <quanlin@google.com>
@@ -56,7 +56,7 @@ StaticRouteConfigProviderImpl::~StaticRouteConfigProviderImpl() {
// initialization needs to be fixed.
RdsRouteConfigSubscription::RdsRouteConfigSubscription(
const envoy::config::filter::network::http_connection_manager::v2::Rds& rds,
const std::string& manager_identifier, Server::Configuration::FactoryContext& factory_context,
const uint64_t& manager_identifier, Server::Configuration::FactoryContext& factory_context,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to drop & here (and in the header file).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Quanjie Lin <quanlin@google.com>
@lizan
Copy link
Member

lizan commented Feb 8, 2019

how about tests?

@PiotrSikora
Copy link
Contributor

@lizan what test(s) would you like to see here? Showing that SerializeAsString() is nondeterministic and MessageUtil::hash() is? Do we have examples of such tests for other xDS services? Thanks!

htuch
htuch previously approved these changes Feb 8, 2019
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is good without tests. Have we grepped for other uses of SerializeAsString? Ideally squash them all (doesn't have to be this PR).

@htuch
Copy link
Member

htuch commented Feb 8, 2019

To clarify, good without tests because I imagine we're talking about protobuf internal implementation details. If you happen to have a repeatable test case that can easily show this then we should add it, but I'm suspecting not.

PiotrSikora
PiotrSikora previously approved these changes Feb 8, 2019
@PiotrSikora
Copy link
Contributor

@quanjielin could you also fix source/common/config/config_provider_impl.h?

@mattklein123
Copy link
Member

Why are we ok without tests here? It seems like it shouldn't be very difficult to craft a case that reproduces this? Or am I missing the problem?

@htuch
Copy link
Member

htuch commented Feb 8, 2019

@mattklein123 you're asking to come up with two situations in which SerializeAsString() produces different results in the existing code base; maybe it's as simple as two back-to-back calls, but maybe it involves some stateful internals of protobuf library. If it's the former, seems totally reasonable to add a test, but the latter will have less value due to its tight coupling with a single implementation.

@mattklein123
Copy link
Member

Ok fair enough. Please open up a tech debt issue to remove all use of serializeasstring as well as a format check to block it's use at the very least? (A stronger version of @htuch previous comment). Thank you.

@quanjielin quanjielin dismissed stale reviews from PiotrSikora and htuch via 0535bba February 8, 2019 05:38
@@ -372,12 +373,12 @@ class ConfigProviderManagerImplBase : public ConfigProviderManager, public Singl
std::shared_ptr<T> getSubscription(
const Protobuf::Message& config_source_proto, Init::Manager& init_manager,
const std::function<ConfigSubscriptionInstanceBaseSharedPtr(
const std::string&, ConfigProviderManagerImplBase&)>& subscription_factory_fn) {
const uint64_t, ConfigProviderManagerImplBase&)>& subscription_factory_fn) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: fix format (extra spaces).

Signed-off-by: Quanjie Lin<quanlin@google.com>
Signed-off-by: Quanjie Lin <quanlin@google.com>
@PiotrSikora
Copy link
Contributor

(I've asked @quanjielin to force-push, due to missed DCO in previous commit)

PiotrSikora
PiotrSikora previously approved these changes Feb 8, 2019
Copy link
Contributor

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with small nit. Thanks for the quick turn-around!

@@ -345,8 +345,9 @@ class ConfigProviderManagerImplBase : public ConfigProviderManager, public Singl
using ConfigProviderSet = std::unordered_set<ConfigProvider*>;
using ConfigProviderMap = std::unordered_map<ConfigProviderInstanceType,
std::unique_ptr<ConfigProviderSet>, EnumClassHash>;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: new extra line.

@@ -50,7 +50,8 @@ class SecretManagerImpl : public SecretManager {
findOrCreate(const envoy::api::v2::core::ConfigSource& sds_config_source,
const std::string& config_name,
Server::Configuration::TransportSocketFactoryContext& secret_provider_context) {
const std::string map_key = sds_config_source.SerializeAsString() + config_name;
const std::string map_key =
std::to_string(MessageUtil::hash(sds_config_source)) + config_name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will a number started config_name potentially conflicts with other hashes? i.e. a config hash 123 with config name 0-abc conflicts with a config hash 12 with config name 30-abc, this happens much easier than hash conflict.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, yes, though I'm not sure how well do we guard against that in rest of the codebase.

@quanjielin could you add a separator? e.g.

const std::string map_key =
    absl::StrCat(MessageUtil::hash(sds_config_source), ".", config_name);

Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that will be good enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: Quanjie Lin <quanlin@google.com>
@htuch
Copy link
Member

htuch commented Feb 8, 2019

@quanjielin also, can you avoid force pushing from now on? Please DCO as appropriate to avoid this. It makes it hard to understand the diff between reviews.

@lizan
Copy link
Member

lizan commented Feb 8, 2019

LGTM, but please add a check like https://github.com/envoyproxy/envoy/blob/master/tools/check_format.py#L393 for SerializeAsString and some test like https://github.com/envoyproxy/envoy/blob/master/tools/check_format_test_helper.py#L194.

Not sure if we really need that, some use cases doesn't need deterministic serialization (e.g. gRPC codec, TAP writing to file.)

@htuch
Copy link
Member

htuch commented Feb 8, 2019

@lizan these an be whitelisted at a file-level. I rather play it safe with this one as we have been burned.

Signed-off-by: Quanjie Lin <quanlin@google.com>
Signed-off-by: Quanjie Lin <quanlin@google.com>
Signed-off-by: Quanjie Lin <quanlin@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@htuch htuch merged commit 925810d into envoyproxy:master Feb 8, 2019
@quanjielin quanjielin deleted the quanlinhash0201 branch February 9, 2019 02:41
fredlas pushed a commit to fredlas/envoy that referenced this pull request Mar 5, 2019
…proxy#5814)

This PR uses hash func that generates deterministic result for proto.

Background -
When integrate envoy SDS with Istio, we found envoy sends out multiple requests for same sdsconfig(details in envoyproxy#5744);

After some debugging, we found the issue describes in [protocolbuffers/protobuf#5668](protocolbuffers/protobuf#5668).

Signed-off-by: Quanjie Lin <quanlin@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants