Support ES API Keys for client operations #1067

gingerwizard · 2020-10-01T10:44:09Z

The Elastic agent uses API keys to communicate with ES. This is the recommended way for clients to communicate with ES. We should do this by default. For benchmark-only this is trivial. For cases where we orchestrate ES we will need to create this API key using the creds.

@danielmitterdorfer @dliappis


gingerwizard · 2020-10-01T13:24:02Z

This will be simplified greatly once we have universal API keys, i.e. cloud API keys that can be mapped to roles in ES clusters. This avoids us having to create anything.

gingerwizard · 2020-10-19T09:27:35Z

I see two potential modes here:

- Every client uses the same API key, passed as a parameter like basic auth.
- Every client authenticates and uses its own API key. This is the more realistic scenario if replicating our agent.

hub-cap · 2020-10-21T20:23:52Z

Do we have any idea for the permission set to be used for each key? If we do not define a fine-grained permission set, the key will default to the same auth as the Rally basic auth user. If we do not care about exercising the fine-grained access control, it is definitely easiest to let every key default to the same permissions the HTTP auth user has when creating it.

(Optional, array-of-role-descriptor) An array of role descriptors for this API key. This parameter is 
optional. When it is not specified or is an empty array, then the API key will have a point in time 
snapshot of permissions of the authenticated user. If you supply role descriptors then the resultant 
permissions would be an intersection of API keys permissions and authenticated user’s permissions 
thereby limiting the access scope for API keys. The structure of role descriptor is the same as the 
request for create role API. For more details, see create or update roles API.
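
To make the two options concrete, here is a minimal sketch of the request body for the create API key endpoint (`POST /_security/api_key`). The helper name `build_create_api_key_body`, the key names, the role name and the index pattern are assumptions made up for illustration. Leaving out `role_descriptors` gives a key that inherits the creating user's permissions, as the quoted documentation describes.

```python
import json


def build_create_api_key_body(name, role_descriptors=None):
    """Build the JSON body for POST /_security/api_key.

    Hypothetical helper, not part of Rally. When role_descriptors is
    omitted or empty, the key gets a snapshot of the creating user's
    permissions.
    """
    body = {"name": name}
    if role_descriptors:
        # Limits the key to the intersection of these privileges and the
        # authenticated user's privileges.
        body["role_descriptors"] = role_descriptors
    return body


# Simplest case: inherit everything from the basic auth user.
inherit_all = build_create_api_key_body("rally-client-0")

# Fine-grained case; the role name and index pattern are made up.
restricted = build_create_api_key_body(
    "rally-client-1",
    role_descriptors={
        "rally-writer": {
            "indices": [{"names": ["logs-*"], "privileges": ["write"]}],
        }
    },
)

print(json.dumps(inherit_all))
print(json.dumps(restricted, indent=2))
```

The first body is all that is needed if every key should default to the permissions of the user that creates it.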

gingerwizard · 2020-10-22T10:45:52Z

Given Rally is in most cases running in a privileged mode and should never be used against a production cluster, I'm personally ok with the token defaulting to the same permissions as the Rally basic auth user. I can't see any cases where we would want to test restricted permissions unless we believe it has the potential to impact performance. Even if the latter does become true, this feels like an enhancement that can be addressed in later iterations.

This commit adds an optional client arg, use_api_key, which will generate and use an API key for every client and async client created by Rally. It uses an initial call via the existing auth scheme to ES to generate the key, and then creates a new client using the api_key for general use by Rally. The async client also does this: it temporarily creates a sync client to generate the key, and then uses that key when creating an async client. Closes elastic#1067

danielmitterdorfer · 2020-11-03T12:24:30Z

Please refer to these definitions for terms that I use below:

- Client: A logical client simulated by Rally. The number of simulated clients is specified by the clients property on a task.
- Worker (process): The Rally actor that simulates load. Rally creates one worker per CPU core. Each worker may simulate multiple clients (using asyncio).
- Elasticsearch Python client: The Python library that Rally uses to send requests to Elasticsearch.

I agree with #1067 (comment) that we need two modes, but I think we should first implement the mode where each client uses its own API key, as this is more relevant for realistic benchmarks.

@hub-cap and I had a chat about this already, and the implementation in #1098 covers a scenario where each worker process (potentially simulating multiple clients) uses its own API key, i.e. we neither use a shared key for all clients nor does each client use its own key.

In order to implement this, we need to consider the following aspects:

- Clients and Elasticsearch Python clients are created per task (see the lifecycle of AsyncIoAdapter, which gets recreated for each task). As creating many (thousands of) API keys is time-consuming, they should only be created once for the benchmark and reused across tasks. This means we need a (generic) facility to create API keys at the beginning of a benchmark if needed, i.e. when the user specifies the respective client option.
- API keys need to be stored in a data structure that allows each client to look up its API key. We should not tie this data structure specifically to API keys but rather introduce a more generic data structure like a "client context" which allows storing arbitrary information per client.
- Clients should be able to pass the API key transparently to the Elasticsearch Python client. At the moment the Elasticsearch Python client assumes that the API key is provided in the constructor of the elasticsearch.Elasticsearch class. It is passed to the constructor of elasticsearch.connection.base.Connection, where the API key parameter is converted into the HTTP header authorization.

While it would be possible to set the API key ourselves per request in Rally's AIOHttpConnection with something like the following sketch, we should clarify whether the Elasticsearch Python client can provide a per-request override for the API key:

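    async def perform_request(self, method, url, params=None, body=None, timeout=None, ignore=(), headers=None):
        # Copy the caller's headers; headers defaults to None, which dict() cannot copy.
        final_headers = dict(headers) if headers else {}
        # retrieve_api_key_for_current_client() is a placeholder for the per-client lookup.
        final_headers["authorization"] = self._get_api_key_header_val(api_key=retrieve_api_key_for_current_client())
        return await super().perform_request(method, url, params, body, timeout, ignore, final_headers)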
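
The "client context" idea and the header conversion can be sketched roughly as follows. The class names `ClientContext` and `ClientContexts`, the function `api_key_header_value` and the placeholder credentials are all assumptions for illustration, not Rally code. The only fixed fact used here is the header format Elasticsearch expects for API keys: `ApiKey` followed by the base64 encoding of `id:api_key`.

```python
import base64
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ClientContext:
    """Arbitrary per-client state; an API key is just one possible entry."""
    client_id: int
    data: Dict[str, Any] = field(default_factory=dict)


class ClientContexts:
    """Meant to be created once per benchmark and reused across tasks."""

    def __init__(self):
        self._contexts: Dict[int, ClientContext] = {}

    def for_client(self, client_id: int) -> ClientContext:
        # Create the context on first access so lookups never fail.
        if client_id not in self._contexts:
            self._contexts[client_id] = ClientContext(client_id)
        return self._contexts[client_id]


def api_key_header_value(key_id: str, api_key: str) -> str:
    """Encode an API key as 'ApiKey base64(id:api_key)'."""
    token = base64.b64encode(f"{key_id}:{api_key}".encode("utf-8"))
    return "ApiKey " + token.decode("ascii")


contexts = ClientContexts()
# Placeholder credentials; real ones come from the create API key response.
contexts.for_client(0).data["api_key"] = ("key-id-0", "secret-0")
contexts.for_client(1).data["api_key"] = ("key-id-1", "secret-1")

key_id, secret = contexts.for_client(1).data["api_key"]
print(api_key_header_value(key_id, secret))
```

Keeping the store generic means other per-client state could later live next to the API key without another data structure.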

sethmlarson · 2020-11-04T15:20:10Z

Supporting per-request api_key and http_auth shouldn't be too hard as it's being done for the Enterprise Search client. I've created an issue here: elastic/elasticsearch-py#1421

YohanSciubukgian · 2021-02-24T15:28:35Z

Any update on this issue? It would be great to use API key authentication to run Rally against our ES cluster.

DJRickyB · 2021-02-24T15:56:47Z

Hi @YohanSciubukgian ,

I asked around internally; sorry, there hasn't been any work done here yet. Once this gets traction, the ticket will be updated.

Thanks

sethmlarson · 2021-04-20T15:24:21Z

The upstream issue blocking this feature has been implemented in elastic/elasticsearch-py#1577 and will be available in 7.13.0. You can test against the 7.x branch when developing this feature.

inqueue · 2021-12-16T04:13:56Z

API key authentication should also be supported for the ES-based metrics store.

michaelbaamonde · 2022-05-16T15:33:59Z

I'm working on this at the moment and wanted to separate some concerns and clarify scope. As initially proposed and later reiterated, the primary motivation is supporting a unique API key per simulated client. Secondarily, we should support a global, user-provided API key across all clients. I agree that we should also support API key authentication for the metrics store as suggested, but this is a bit orthogonal and requires changes to completely different code paths, so we can tackle that separately. I'll file a separate issue for this.

Since the primary use case entails Rally creating an API key per simulated client, it will need to be provided with credentials sufficient for calling the relevant Elasticsearch APIs for creating API keys and, I'd argue, for invalidating/deleting them upon benchmark completion.

This results in 3 different scenarios that we'll need to support:

1. Rally creates a unique API key per simulated client. Each request submitted by a simulated client passes its unique generated API key as an argument to the relevant ES Python client method on a per-request basis. All other requests issued to the ES benchmark candidate--chiefly the creation and eventual deletion of these API keys--authenticate via basic auth.
2. Same as above, but all other operations against the ES benchmark candidate authenticate via a user-provided API key instead of basic auth.
3. All operations against the benchmark candidate ES cluster use a global, user-provided API key. To be clear, this means that every client uses the same API key, and Rally does not generate any API keys itself.
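
The create-then-clean-up lifecycle that the first two scenarios imply could look roughly like the sketch below. Everything named here is hypothetical: `per_client_api_keys`, the `admin` object with its `create_api_key` and `invalidate_api_keys` methods, and the in-memory stand-in that lets the sketch run without a cluster. In a real implementation the admin object would wrap the Elasticsearch security APIs, authenticated via basic auth or a user-provided key.

```python
from contextlib import contextmanager


@contextmanager
def per_client_api_keys(admin, client_count):
    """Create one API key per simulated client and invalidate them on exit.

    `admin` is any object offering create_api_key(name), returning a dict
    with "id" and "api_key", and invalidate_api_keys(ids). Both method
    names are assumptions for this sketch.
    """
    keys = {}
    try:
        for client_id in range(client_count):
            response = admin.create_api_key(f"rally-client-{client_id}")
            keys[client_id] = (response["id"], response["api_key"])
        yield keys
    finally:
        # Clean up whatever was created, even if the benchmark failed.
        if keys:
            admin.invalidate_api_keys([key_id for key_id, _ in keys.values()])


class InMemoryAdmin:
    """Stand-in for the admin connection so the sketch runs offline."""

    def __init__(self):
        self.active = set()

    def create_api_key(self, name):
        key_id = f"id-{name}"
        self.active.add(key_id)
        return {"id": key_id, "name": name, "api_key": f"secret-{name}"}

    def invalidate_api_keys(self, ids):
        self.active.difference_update(ids)


admin = InMemoryAdmin()
with per_client_api_keys(admin, client_count=3) as keys:
    print(sorted(keys))       # client ids that received a key
    print(len(admin.active))  # keys that exist during the benchmark
print(len(admin.active))      # keys left after cleanup
```

Using a context manager keeps creation and invalidation paired, so keys do not outlive a benchmark that aborts midway.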

To support these scenarios, we need to introduce two new client options:

create_api_key_per_client
- Boolean that instructs Rally to generate a unique API key per simulated client. These generated API keys are used for authentication on a per-request basis, overriding the global authentication mechanism.
global_api_key
- Provides an API key to be used to authenticate and issue requests to the candidate ES cluster.
- Used for all requests, unless create_api_key_per_client is also provided and set to true.
  - In that case, the global_api_key will be used only for administrative tasks, principally generating client API keys before the benchmark begins and deleting them after it is complete.
- Mutually exclusive with the basic_auth_user and basic_auth_password client options. If a user provides both, it's an error condition.

Here's how each of the above scenarios would be configured using the two new client options:

| Unique API key per client? | Global auth | Required client options | Forbidden client options |
| --- | --- | --- | --- |
| Yes | Basic | `create_api_key_per_client`, `basic_auth_user`, `basic_auth_password` | `global_api_key` |
| Yes | API key | `create_api_key_per_client`, `global_api_key` | `basic_auth_user`, `basic_auth_password` |
| No | API key | `global_api_key` | `basic_auth_user`, `basic_auth_password` |
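
As a rough sketch, the rules in the table could be enforced when client options are parsed. The function `classify_auth` and its return strings are hypothetical, and `global_api_key` is only the placeholder name from this proposal. The check that per-client keys need some form of global credentials is an assumption derived from the "Required client options" column.

```python
def classify_auth(client_options):
    """Return a description of the scenario that the options select.

    Hypothetical illustration of the proposal, not Rally's implementation.
    """
    per_client = bool(client_options.get("create_api_key_per_client", False))
    has_global_key = "global_api_key" in client_options
    has_basic = (
        "basic_auth_user" in client_options
        or "basic_auth_password" in client_options
    )

    if has_global_key and has_basic:
        raise ValueError("global_api_key cannot be combined with basic auth options")
    if per_client and not (has_global_key or has_basic):
        # Assumption: Rally needs credentials to create the per-client keys.
        raise ValueError("create_api_key_per_client requires global credentials")

    if per_client:
        admin = "API key" if has_global_key else "basic auth"
        return f"per-client API keys, administrative requests via {admin}"
    if has_global_key:
        return "one global API key for all requests"
    return "no API key authentication"


# One example per table row; all credential values are placeholders.
rows = [
    {"create_api_key_per_client": True, "basic_auth_user": "u", "basic_auth_password": "p"},
    {"create_api_key_per_client": True, "global_api_key": "k"},
    {"global_api_key": "k"},
]
for options in rows:
    print(classify_auth(options))
```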

We can hash out the details around naming on the PR, but I'll mention here that global_api_key is just a placeholder and I'm happy to come up with something better if it'll clarify that it can actually be overridden by create_api_key_per_client.

Happy to hear any feedback on this approach!

Otherwise, the implementation will largely follow the broad strokes outlined in this comment. We'll aim to support scenario 1 first (generated API keys for client operations, basic auth otherwise) given that it's highest priority.

inqueue · 2022-05-16T16:47:00Z

+1 for this approach.

> Rally creates a unique API key per simulated client. Each request submitted by a simulated client passes its unique generated API key as an argument to the relevant ES Python client method on a per-request basis. All other requests issued to the ES benchmark candidate--chiefly the creation and eventual deletion of these API keys--authenticate via basic auth.

API keys created with basic authentication inherit permissions of the authenticated user, though they can be created with optional role descriptors (as mentioned in item 2 from the docs). To be clear, the intention as stated in #1067 (comment) is to create unique API keys per simulated client from a single authenticated user without additional or unique role descriptors?

+1 to deleting generated API keys, but preserving the global_api_key. Does it make sense to preserve generated API keys as a user preference for scenarios where running multiple benchmarks is desired, but the user wishes to reuse API keys already generated?

michaelbaamonde · 2022-05-17T13:29:25Z

> To be clear, the intention as stated in #1067 (comment) is to create unique API keys per simulated client from a single authenticated user without additional or unique role descriptors?

Yeah, that's what I'm thinking, at least for a first implementation. It's worth looking into how this is done by default by Agent, though, since the intention is to mimic its behavior. Good point.

> Does it make sense to preserve generated API keys as a user preference for scenarios where running multiple benchmarks is desired, but the user wishes to reuse API keys already generated?

I could see this maybe being useful if it turns out that API key generation for a large number of clients is painfully slow, but it seems like a stretch. Did you have other scenarios in mind where this would be advantageous? It shouldn't be difficult to support this, but my leaning is to file it away as a future enhancement unless we foresee a compelling need for it. What do you think?

pquentin · 2022-05-18T05:43:02Z

The approach sounds good to me too. I believe we should first focus on the first scenario in the table: generating per-client API keys with basic auth. I think it will provide 95% of the value since anything else won't be in the hot path.

inqueue · 2022-05-18T11:51:24Z

> > Does it make sense to preserve generated API keys as a user preference for scenarios where running multiple benchmarks is desired, but the user wishes to reuse API keys already generated?
>
> I could see this maybe being useful if it turns out that API key generation for a large number of clients is painfully slow, but it seems like a stretch. Did you have other scenarios in mind where this would be advantageous? It shouldn't be difficult to support this, but my leaning is to file it away as a future enhancement unless we foresee a compelling need for it. What do you think?

We can let this rest and see if a need surfaces. I do not see any compelling reason to add it now.

michaelbaamonde · 2022-05-18T13:17:04Z

> The approach sounds good to me too. I believe we should first focus on the first scenario in the table: generating per-client API keys with basic auth. I think it will provide 95% of the value since anything else won't be in the hot path.

100% agree. That's the plan!

michaelbaamonde · 2022-05-18T21:24:34Z

I could use some input on a core implementation detail that I've been thinking about and have discussed a bit with @DJRickyB. In particular, I'd love to hear your thoughts @pquentin.

As discussed previously, a key implementation detail is that a client should be able to pass its unique API key to the Elasticsearch client transparently. This implies request-level overrides of the authentication mechanism set on the Elasticsearch client instance. As noted, elasticsearch-py introduced support for this in version 7.13.0. It requires simply providing an api_key kwarg to any client API method, which is then injected into the headers for the corresponding request, requiring no mutation of the Elasticsearch client instance.

My initial thinking was to rely on this mechanism. This ultimately requires that we make each runner aware of the API key associated with its caller and ensure that it properly passes the api_key kwarg to whatever Elasticsearch client method(s) it invokes. Given the current Runner interface, the most obvious way to do this is to add an api_key key to the params dict that runners expect, and then ensure by some means that all runners include it in their ES client requests.

This works, but it does mean that runners and parameter sources would now need to concern themselves with authentication. For default runners in the Rally code, maybe this is okay, but this could be rather trappy for authors of custom runners and/or parameter sources. They would need to make sure that the api_key parameter is preserved and included in Elasticsearch client method calls. I don't think it's a good idea to introduce what's essentially a change to the Runner API in order for custom runners to use auto-generated API keys. It's surprising behavior, and it mixes concerns a bit.

Ideally, runners shouldn't need to care about ES client options such as authentication. They currently don't need to, but per-request auth overrides as required here complicate matters.

There are probably all sorts of ways that we could approach this within Rally. But! Conveniently, elasticsearch-py has introduced the .options() method on clients, which can be used to set transport options, including setting the api_key attribute per request (docs). This creates a new ES client instance that (in the case we're concerned with) overrides whatever auth was configured on the ES client instance that options() was called on and instead uses an API key.

This seems like exactly what we need. Instead of requiring that runners pass the api_key to the ES client, the client calling the runner would instead provide the runner with an ES client instance already configured with its unique API key. At a relatively high level, the key move would be to have AsyncExecutor call the .options() method on its ES client instance with the appropriate API key and pass the resulting instance to the execute_single function, which invokes the runner.

Done in this manner, I think all runners--including custom ones that Rally doesn't control--should Just Work(TM) with per-client API keys. It seems like the right separation of concerns to me.
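
The separation of concerns described above can be sketched as follows. `StubEsClient` is a stand-in written for this example that mimics only the shape of `.options(api_key=...)`; it is not the elasticsearch-py client, and `execute_for_client` is a hypothetical simplification of what AsyncExecutor would do. The point is that the runner receives an already-configured client and never touches authentication.

```python
import asyncio


class StubEsClient:
    """Stand-in that mimics only the shape of `.options(api_key=...)`.

    As described for the real method, options() returns a new instance and
    leaves the original untouched.
    """

    def __init__(self, auth="basic"):
        self.auth = auth

    def options(self, api_key=None):
        return StubEsClient(auth=f"ApiKey {api_key}" if api_key else self.auth)

    async def search(self, index):
        # A real client would send a request; here we report the auth used.
        return {"index": index, "auth": self.auth}


async def search_runner(es, params):
    """A runner that takes (es, params) and knows nothing about auth."""
    return await es.search(index=params["index"])


async def execute_for_client(worker_es, api_key, runner, params):
    # Derive a per-client view of the client, then hand it to the runner.
    client_es = worker_es.options(api_key=api_key)
    return await runner(client_es, params)


async def main():
    worker_es = StubEsClient()
    keys = {0: "key-0", 1: "key-1"}  # placeholder per-client keys
    tasks = [
        execute_for_client(worker_es, key, search_runner, {"index": "logs"})
        for key in keys.values()
    ]
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
print(results)
```

Because the override happens before the runner is invoked, custom runners written against the existing `(es, params)` interface would need no changes under this approach.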

The catch is that we'd need to update the elasticsearch-py dependency to at least 8.0.0, which will require some unknown but certainly non-zero amount of work due to deprecations and such. I'm also not sure if creating an ES client instance per Rally client would cause any issues with a large number of clients and introduce a potential bottleneck within the load driver. I'd think and hope not, but we'd need to verify.

We could probably hack this general approach into Rally itself using the 7.14.0 Python client without too much fuss (we already subclass elasticsearch.AsyncElasticsearch and could extend it, or we could more bluntly just create an ES client per simulated client as opposed to per worker), but we'll eventually want to upgrade anyway, and we might as well just do it now given that a need has arisen. The benefits seem worth it.

Assuming, of course, that this idea doesn't have a fatal flaw that I haven't considered. 😄 Hence my request for feedback. Thanks!

pquentin · 2022-05-19T12:26:22Z

> Done in this manner, I think all runners--including custom ones that Rally doesn't control--should Just Work(TM) with per-client API keys. It seems like the right separation of concerns to me.

💯 This is great, and I absolutely agree that this separates concerns nicely.

Regarding upgrading the Elasticsearch Python client to 8.0, this is the right thing to do and long overdue, so it would be nice if we get it done, but I'm afraid "non-zero amount of work" is an understatement - this sounds like the same amount of work as this issue, if not more. Maybe you can try to work on it for a timeboxed period (1 week?) and if it turns out to be too much work, we can instead reimplement .options() ourselves until we upgrade.

michaelbaamonde · 2022-05-19T14:03:59Z

> I'm afraid "non-zero amount of work" is an understatement - this sounds like the same amount of work as this issue, if not more

I had a hunch it might be pretty involved, but hadn't yet looked in depth, so thanks for confirming that.

> Maybe you can try to work on it for a timeboxed period (1 week?) and if it turns out to be too much work, we can instead reimplement .options() ourselves until we upgrade.

This seems reasonable to me.

gingerwizard added the enhancement Improves the status quo label Oct 1, 2020

hub-cap self-assigned this Oct 19, 2020

hub-cap mentioned this issue Oct 22, 2020

Optionally use generated api_key auth to ES #1098

Closed

danielmitterdorfer unassigned hub-cap Dec 9, 2020

michaelbaamonde self-assigned this May 10, 2022

michaelbaamonde changed the title ~~Support ES API Keys~~ Support ES API Keys for client operations May 16, 2022

michaelbaamonde mentioned this issue May 16, 2022

Support ES API keys for metrics store authentication #1490

Open

michaelbaamonde mentioned this issue Jun 13, 2022

Create an ES client per simulated client instead of per worker. #1516

Merged

michaelbaamonde mentioned this issue Jun 21, 2022

Create and use a unique ES API key for each simulated client #1520

Merged

michaelbaamonde closed this as completed Nov 9, 2022
