Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stateful_session: added strict mode to header based stateful session #30573

Merged
merged 16 commits into from
Dec 7, 2023

Conversation

cpakulski
Copy link
Contributor

Commit Message:
Added strict mode to header based stateful session
Additional Description:
Risk Level: Low
Testing: Extended existing tests and added new tests to cover this change
Docs Changes: Yes
Release Notes: Yes
Platform Specific Features: No
Fixes #30475

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
…th strict mode.

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #30573 was opened by cpakulski.

see: more, trace.

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Copy link
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This look great. Only some minor comments.

Comment on lines 43 to 44
// If set to True, the HTTP request must be routed to the requested destination.
// If the requested destination is not available, Envoy returns 503. Defaults to False.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If the request destination is not available, Envoy will treat that as no healthy upstream and response with 503 status will be returned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A high-level question: is this valid only for header, or will this also be used for other types of stateful-session (e.g., cookie). If the latter, then I propose creating a "common" proto, that they all share, and include that in each of them.
Also, consider clarifying what does "not available" means (i.e., non-existent because it was removed, or unhealthy because of active/passive health-check).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @adisuissa here is previous response from @cpakulski to this question in another issue. #30475 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wbpcode That comment seems to be an argument why this should be configured in the filter instead of in the cluster, and I agree that doing it at the filter level makes sense. But I don't think that answers the question @adisuissa raised about whether this is specific to header-based SSA or could also apply to cookie-based SSA. It seems like it should apply the same to either mechanism -- regardless of whether the affinity information is stored in a header or a cookie, we may want to say that if the specified host is not available, we fail the request instead of picking a different host.

Perhaps this field should be moved to the stateful session filter config itself, so that it applies regardless of what session state extension is used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that strict setting does not make sense for cookie-based SSA. Let's imagine that there is strict setting for cookie-based SSA. Envoy receives a request, determines that the desired host is not not available and returns 503. A client must send another request WITHOUT cookie. Envoy selects next available upstream host, generates cookie and attaches it to the response. From now on, the client attaches that cookie to each subsequent request to get stickiness. Currently, when upstream host encoded in the cookie is not available, Envoy selects next available upstream host and encodes it in the cookie return in the response. So the final result is the same.
The difference between cookie-based and header-based stateful sessions is that in header-based, encoding format of the header is documented and the client sort of takes over load balancing logic. In cookie-based stateful session, the format and encoding of the cookie are not documented and should not be used by user.
IMHO strict settings only makes sense when user encodes the desired destination and says "I am only interested in that one destination" and such situation takes place only in header-based stateful session.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating that tracking issue, @wbpcode! We can follow up on this discussion over there.

How shall we proceed with this PR? Is strict mode still useful while the question of the session state being a public API is still undecided? If not, then I guess we should hold off on this PR until we resolve that issue. But if it is still useful, then how about we move this new field to the StatefulSession proto, so that it applies equally to both session state mechanisms?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this strict mode and the format could be completely independent parts and could be implemented in different PR. So, I prefer to proceed with this PR.

But if it is still useful, then how about we move this new field to the StatefulSession proto, so that it applies equally to both session state mechanisms?

Yeah, I agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to make it a part of StatefulSession proto.

I agree that strict mode is separate and independent from the format. Just need to think about the use case when strict is enabled and the received cookie is expired. I guess that in that case, the stateful filter should immediately return 503 without selecting a new host.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned above, I would argue that we should not have tried to solve #26347 on the data plane side in the first place. I think cookie expiration should be strictly an issue for the application to deal with, and the data plane shouldn't even know about the expiration time. I think if the cookie is sent, the data plane should honor it, and strict mode should apply, just as it would be expected to if the cookie were not expired.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markdroth I think that in this case managing cookie's expiration time falls into grey area. I think that issuer of the cookie should verify expiration time. Normally it is a server.
So, here - on one side, it should not be part of data plane processing, but on the other side Envoy issued the cookie so Envoy should verify it's content.

@@ -757,13 +757,14 @@ class StreamDecoderFilterCallbacks : public virtual StreamFilterCallbacks,
* host list of the routed cluster, the host should be selected first.
* @param host The override host address.
*/
virtual void setUpstreamOverrideHost(absl::string_view host) PURE;
virtual void setUpstreamOverrideHost(absl::string_view host, bool strict) PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

virtual void setUpstreamOverrideHost(Upstream::LoadBalancerContext::OverrideHost host) PURE;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually left that intensionally in that form. The reason is that info about the desired upstream host is assembled from two distinct pieces: config option strict and IP address from decoded header. There must be a point in the code where those two pieces are assembled into Upstream::LoadBalancerContext::OverrideHost, which is a pair. I thought that it would be cleaner to assemble them in the setUpstreamOverrideHost method, but maybe it would be better to assemble them into a pair before calling that method. Let me experiment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored.

…eader

based stateful sessions.

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
@cpakulski
Copy link
Contributor Author

Moved 'strict' option to main section of API, so it applies to cookie-based and header-based stateful sessions.
I think that this feature is independent from the internal format of the header/cookie.
Let's continue discussion of the format in #30678.

Copy link
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm api

@@ -23,6 +23,10 @@ message StatefulSession {
//
// [#extension-category: envoy.http.stateful_session]
config.core.v3.TypedExtensionConfig session_state = 1;

// If set to True, the HTTP request must be routed to the requested destination.
// If the requested destination is not available, Envoy returns 503. Defaults to False.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, note that if/when we implement this feature in gRPC, we would return status UNAVAILABLE instead of an HTTP status code. But HTTP 503 maps to UNAVAILABLE anyway, so we don't necessarily need to explicitly note this here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that info and reviewing API.
So, it would be HTTP 200 with grpc-status header code different than zero, correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In gRPC, the SSA feature operates on the client side, not the server side, so there is no HTTP code involved; it would simply report to the application that the RPC failed with status UNAVAILABLE.

If someone used gRPC with an Envoy sidecar proxy and the SSA feature was used in Envoy instead of in gRPC, then Envoy would return HTTP status 503, which the gRPC client would map to status UNAVAILABLE. So the client application would see the same result in both cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining!

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
@cpakulski
Copy link
Contributor Author

/retest

Copy link
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with only one minor comment. Thanks for your such great contribution. 🌷

virtual void setUpstreamOverrideHost(absl::string_view host) PURE;
virtual void setUpstreamOverrideHost(Upstream::LoadBalancerContext::OverrideHost&&) PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this move semantics make sense? The override host just a pair of string view and bool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. Move will most likely apply only to std::pair internals. string_view and bool will just copy their values. I will change it to pass by value. Thanks!

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Copy link
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for your great work!!!

@wbpcode wbpcode merged commit 62f4a14 into envoyproxy:main Dec 7, 2023
106 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stateful session: add strict mode to header based stateful session
6 participants