-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metadata Exchange Filter #2325
Metadata Exchange Filter #2325
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: gargnupur The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
FYI The upstream filter facility is now merge into Envoy, so it should be ready to use. We don't have to combine both filters in one PR, upstream can be done separately from downstream. |
data.drain(proxy_header_length); | ||
if (proxy_header.has_metadata()) { | ||
read_callbacks_->connection().streamInfo().setDynamicMetadata( | ||
"filters.network.alpn_proxy.upstream", proxy_header.metadata()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make this a constant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
} | ||
|
||
Network::FilterStatus AlpnProxyFilter::onWrite(Buffer::Instance& data, bool) { | ||
config_->stats().metadata_found_.inc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it incremented per each write?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left from testing... removed
namespace Tcp { | ||
namespace AlpnProxy { | ||
namespace { | ||
bool protoMapEq(const ProtobufWkt::Struct& obj, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MessageDifferencer is public now, you can find a use of it in envoy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks..Fixed
5300206
to
1d7be25
Compare
@kyessenov, @mandarjog , @lizan : Looks like tests are failing because of the following error: and I see this in other PR's too.. |
Yes it is failing on PRs since prow nodes were upgraded. |
/retest |
conn_state_ = WriteMetadata; | ||
config_->stats().alpn_protocol_found_.inc(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this has to fall-through to WriteMetadata. Can you make that explicit or add a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.. removed the else stmt.
if (config_->filter_direction_ == FilterDirection::Downstream) { | ||
writeNodeMetadata(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, this is intended to fall through I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
} | ||
const uint32_t initial_header_length = sizeof(AlpnProxyInitialHeader); | ||
if (data.length() < initial_header_length) { | ||
config_->stats().initial_header_not_found_.inc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not seem right to increment header on buffering. This is mostly random depending how TCP data is chunked, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check replied above...
I thought "return Network::FilterStatus::StopIteration;" is doing that and we don't need to drain "data" and buffer bytes read till now..
if (conn_state_ == Invalid) { | ||
return Network::FilterStatus::Continue; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. If buffer is large, then we need to drain the proxy header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought "return Network::FilterStatus::StopIteration;" is doing that and we don't need to drain "data" and buffer bytes read till now..
} | ||
} | ||
|
||
void AlpnProxyFilter::writeMetadata(const std::string key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: naming suggestion setMetadata so as not to confuse with network writes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.. to be in sync with set.. changed readMetadata to getMetadata too.
writeMetadata(UpstreamDynamicDataKey, struct_metadata); | ||
} else { | ||
writeMetadata(DownstreamDynamicDataKey, struct_metadata); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set conn_state_ = Done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
State Machine fall through will take care of it.. in default section?
case WriteMetadata: { | ||
// TODO(gargnupur): Try to move this just after alpn protocol is | ||
// determined and first onWrite is called in Upstream filter. | ||
if (config_->filter_direction_ == FilterDirection::Upstream) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking this guard may not be necessary. This essentially validates the expectation that onData is called first for downstream and onWrite is called first upstream. However, this is not important since one of them has to be called first anyways, and both callbacks do the right thing and transition out of the writing state (so not possible to write twice).
In fact, it might be necessary to remove this guard for server-first protocol. If the server (envoy) writes first on the downstream connection, then there is a bug with this guard. This can happen with MySQL, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that assumption was needed.. Thanks for the info.. Removed the guard.
Thank you for changing it to use a state machine. This is more readable IMHO. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From implementation this looks much better than last time.
A few higher level question/request:
-
Can you have a one pager document (or is there any design doc for the protocol other than https://docs.google.com/document/d/1bWQAsrBZguk5HCmBVDEgEMVGS91r9uh3SIr7D7ELZBk/edit#heading=h.n6jc1obl82s7?) describing what the wire protocol is? I could read from the implementation but it is not clear and hard to read implementation against implementation to check if they can communicate correctly.
-
The name "ALPN Proxy" is very confusing because it has nothing to do with ALPN, we use ALPN to select this filter but itself doesn't deal with ALPN, I guess it is better to name "metadata_proxy" or something else (cc @PiotrSikora)
struct AlpnProxyInitialHeader { | ||
uint32_t magic; // Magic number in network byte order. Most significant byte | ||
// is placed first. | ||
static const uint32_t magic_number = 0x23071961; // decimal 587667809 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how this number is chosen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a random number..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you double check this doesn't conflict with major protocols? (i.e. couldn't be first 4 bytes of TLS, HTTP/1.1, HTTP/2, MySQL, etc..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done..
@lizan: One Pager Implementation doc can be found at: https://docs.google.com/document/d/15jKV_muOZKX4CccnA1w-X73SkqGt8cV3a9TBXxzs0TA/edit?usp=sharing. |
Thanks for putting the doc!
I'm in favor of doing that in this PR. At very least the registered name should be changed because it appears in config. |
I agree, lets rename it to Re If we can't be certain, we should have the ability override |
@lizan, @mandarjog , @kyessenov , @PiotrSikora: looks like newly generated random 0x3D230467 magic number should not conflict with Http, http/2, TLS, Redis, SQL... (googling this number doesn't say it's any protocol's magic number.) Summarizing the findings from chat here(thanks to @kyessenov,@PiotrSikora for the help):
|
This validates that having these inspectors doesnt affect Metadata Exchange filter
@lizan ping ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly lgtm. defer to @PiotrSikora
|
||
ProtobufTypes::MessagePtr | ||
MetadataExchangeUpstreamConfigFactory::createEmptyConfigProto() { | ||
return ProtobufTypes::MessagePtr{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: std::make_unique
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
std::string initial_header_buf = std::string( | ||
static_cast<const char*>(data.linearize(initial_header_length)), | ||
initial_header_length); | ||
const MetadataExchangeInitialHeader* initial_header = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is an undefined behavior (unaligned access) when initial_header_buf.c_str()
is not aligned to uint32_t
, UBSAN doesn't fail just because of luck.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But shouldn't marking MetadataExchangeInitialHeader struct as packed attribute help with misalignment errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No packed struct doesn't mitigate misalignment errors, because you're doing reinterpret_cast
from a raw pointer which is not guaranteed to be aligned.
Since you're already doing a copy here, why not just do:
MetadataExchangeInitialHeader initial_header;
data.copyOut(0, initial_header_length, &initial_header);
then where you copied to is aligned.
Work left:
onNewConnection gets called when connection is established but before handshake is complete. Thus couldnt do metadata write here.
Things to note: