
Put operation with lease takes linear apply time depending on number of keys already attached to lease #15993

Closed
marseel opened this issue Jun 1, 2023 · 11 comments

@marseel

marseel commented Jun 1, 2023

What happened?

A put request with a lease takes apply time linear in the number of keys already attached to the lease. With an increasing number of keys and constant QPS, this grows to the point where etcd spends almost 100% of its time applying put requests, causing lease renewals to fail.

What did you expect to happen?

I would expect a put with a lease not to depend linearly on the number of keys already attached to the lease.

How can we reproduce it (as minimally and precisely as possible)?

Run Etcd with:

docker run \
  -p 2379:2379 -p 2380:2380 \
  --mount type=bind,source=/tmp/etcd-data.tmp,destination=/etcd-data \
  --name etcd-gcr-v3.5.9 \
  gcr.io/etcd-development/etcd:v3.5.9 \
  /usr/local/bin/etcd \
  --name s1 \
  --data-dir /etcd-data \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://0.0.0.0:2379 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --initial-advertise-peer-urls http://0.0.0.0:2380 \
  --initial-cluster s1=http://0.0.0.0:2380 \
  --initial-cluster-token tkn \
  --initial-cluster-state new \
  --auto-compaction-retention=1 \
  --log-level info \
  --logger zap \
  --log-outputs stderr

Test client:

package main

import (
	"context"
	"fmt"
	"strconv"
	"sync"
	"time"

	client "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
	"golang.org/x/time/rate"
)

func main() {
	c, err := client.New(client.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second})
	if err != nil {
		fmt.Println("failed to create client:", err)
		return
	}
	defer c.Close()
	fmt.Println("Opening session")
	session, err := concurrency.NewSession(c, concurrency.WithTTL(30))
	if err != nil {
		fmt.Println("failed to open session:", err)
		return
	}

	limiter := rate.NewLimiter(1000, 20)
	maxDuration := time.Duration(0)
	mu := sync.Mutex{}

	fmt.Println("Starting writes")
	for i := 0; i <= 10000000; i++ {
		i := i
		limiter.Wait(context.Background())
		key := "asdasdasdasdasd" + strconv.Itoa(i)
		go func() {
			leaseID := session.Lease()
			start := time.Now()
			c.Put(context.Background(), key, key, client.WithLease(leaseID))
			duration := time.Since(start)

			mu.Lock()
			defer mu.Unlock()
			if duration > maxDuration {
				fmt.Printf("MAX latency: %v, entries: %d\n", duration, i)
				maxDuration = duration
			}
			if i%1000 == 0 {
				fmt.Printf("latency: %v, entries: %d\n", duration, i)
			}
		}()
	}
}

Apply duration increases linearly:
Screenshot from 2023-06-01 12-55-31

Anything else we need to know?

The above repro uses quite a significant QPS of 1k, but I've observed a similar issue with only ~50k entries attached to the lease and just 20 QPS of put requests:
image
As you can see above, as sum(rate(etcd_server_apply_duration_seconds_sum[10s])) approaches 1.0, leases start failing to renew (the red line is lease renewal returning an unavailable error).

Etcd version (please run commands below)

I've tested both 3.5.9 (repro) and 3.5.4 (production setup with 20 QPS)

Etcd configuration (command line flags or environment variables)

Provided in reproduction

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

No response

Relevant log output

No response

@marseel
Author

marseel commented Jun 2, 2023

I just realized that this performance issue was actually the reason we saw such a huge improvement in Kubernetes when we started limiting the maximum number of events attached to a single lease (a 10x CPU reduction in etcd): kubernetes/kubernetes#98257 (comment)

@tjungblu
Contributor

tjungblu commented Jun 2, 2023

Thanks for the detailed report and repro @marseel :)

I've quickly run this through a profiler with the following result:

image

That should be relatively simple to optimize; most of the time is spent creating a slice from a map, which is then checked key by key:

func (aa *authApplierV3) checkLeasePuts(leaseID lease.LeaseID) error {
	l := aa.lessor.Lookup(leaseID)
	if l != nil {
		for _, key := range l.Keys() {
			if err := aa.as.IsPutPermitted(&aa.authInfo, []byte(key)); err != nil {
				return err
			}
		}
	}
	return nil
}

This TODO also looks like a nice candidate to optimize:

etcd/server/auth/store.go

Lines 847 to 850 in 004195b

// TODO(mitake): this function would be costly so we need a caching mechanism
if !as.IsAuthEnabled() {
	return nil
}
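As a sketch of what that caching could look like (with a stand-in store type, not etcd's real one): keep the enabled flag in an atomic so the hot-path check is a single load, updated only when auth is toggled at runtime:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// authStore is an illustrative stand-in for etcd's auth store; only the
// enabled flag is modeled here.
type authStore struct {
	enabled atomic.Bool
}

// AuthEnable/AuthDisable are the rare, cold-path toggles.
func (as *authStore) AuthEnable()  { as.enabled.Store(true) }
func (as *authStore) AuthDisable() { as.enabled.Store(false) }

// IsAuthEnabled is now a single atomic load on the hot path, instead of
// a potentially costly store lookup per request.
func (as *authStore) IsAuthEnabled() bool { return as.enabled.Load() }

func main() {
	as := &authStore{}
	fmt.Println(as.IsAuthEnabled()) // false
	as.AuthEnable()
	fmt.Println(as.IsAuthEnabled()) // true
}
```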

tjungblu added a commit to tjungblu/etcd that referenced this issue Jun 2, 2023
Mitigates etcd-io#15993 by not checking each key individually for permission
when auth is entirely disabled.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
@marseel
Author

marseel commented Jun 2, 2023

Thanks for looking into this.

I think it explains why I've seen different results between this repro and our production cluster.
If I understand correctly, my repro doesn't use auth, but the production setup does; that's why I've seen such a big difference in QPS between production and the repro.
For our use case, it would make sense to have a "short" path that checks whether the user is root only once instead of for every key -

if hasRootRole(user) {

So maybe something like isAuthDisabledOrRootUser before this loop:

for _, key := range l.Keys() {
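For illustration, a minimal sketch of such a fast path; the types and helpers below (authStore, isPutPermitted) are stand-ins, not etcd's real API:

```go
package main

import "fmt"

// authStore is an illustrative stand-in for etcd's auth store.
type authStore struct {
	enabled  bool
	rootUser string
}

func (as *authStore) isPutPermitted(user, key string) error {
	if user != as.rootUser {
		return fmt.Errorf("permission denied for %q on key %q", user, key)
	}
	return nil
}

// checkLeasePuts with a fast path: when auth is disabled or the caller
// is root, skip the per-key loop entirely, turning O(keys) into O(1).
func checkLeasePuts(as *authStore, user string, keys []string) error {
	if !as.enabled || user == as.rootUser {
		return nil // fast path: no per-key permission checks needed
	}
	for _, key := range keys {
		if err := as.isPutPermitted(user, key); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	as := &authStore{enabled: true, rootUser: "root"}
	// 100k attached keys, but the root check short-circuits the loop.
	err := checkLeasePuts(as, "root", make([]string, 100000))
	fmt.Println(err) // <nil>
}
```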

@tjungblu
Contributor

tjungblu commented Jun 2, 2023

I spoke a little too soon about the simplicity of the fix; this is actually a much bigger story, for the following reasons.

  1. Keys() as a string slice

As you can see, most of the time is spent converting between string and byte slices here. Map keys can't be []byte, so I assume that's why string keys were chosen instead. The map and its amortized O(1) access also make sense, especially in the attach/detach scenarios.

Because etcd uses []byte everywhere, there's a significant toll in converting back and forth here and in many other places around the lease. What do you folks think about using a different map implementation?

Obviously, the IsPutPermitted would also require a batched version where we pass multiple keys.
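As a side note, string-keyed maps with []byte lookups are not inherently expensive: the gc compiler optimizes the m[string(b)] pattern so the lookup avoids a temporary string allocation. The conversion toll shows up when keys are materialized into a []string, as Keys() does. A self-contained illustration:

```go
package main

import (
	"fmt"
	"testing"
)

func main() {
	// Go map keys must be comparable, so []byte cannot be a key type;
	// a string-keyed map is the usual workaround.
	set := map[string]struct{}{"foo": {}}
	key := []byte("foo")

	// The gc compiler recognizes the m[string(b)] lookup pattern and
	// performs it without allocating a temporary string; the cost
	// appears only when strings are materialized into a slice.
	allocs := testing.AllocsPerRun(1000, func() {
		_, _ = set[string(key)]
	})
	fmt.Println(allocs) // typically 0 with the gc compiler
}
```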

  2. runtime enable/disable of auth

It's a cool feature, but it means we have to continuously check whether auth is enabled. Otherwise we could just swap out the authBackend entirely and skip all the overhead that comes with it.


I wrote a little mitigation that would (at least) fix it for the cases where auth is not enabled:
tjungblu@dff670b

which I assume would be most relevant for k8s anyway. Let me know if this is too pragmatic 🧷

@tjungblu
Contributor

tjungblu commented Jun 2, 2023

If I understand correctly, my repro doesn't use auth, but the production setup does - that's why I have seen such a big difference in QPS between production and repro.

ah gotcha, how are you configuring etcd in that case?

@marseel
Author

marseel commented Jun 2, 2023

Similar, but with auth enabled and only the root user writing keys. As we can see here:

if hasRootRole(user) {

the permission check for the root user does not depend on the key value, so we could have a short path for it too.

tjungblu added a commit to tjungblu/etcd that referenced this issue Jun 2, 2023
Mitigates etcd-io#15993 by not checking each key individually for permission
when auth is entirely disabled or admin user is calling the method.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
@tjungblu
Contributor

tjungblu commented Jun 2, 2023

luckily there's a check for that already:

tjungblu@014b17d#diff-85ead21002d52058142cc354a8367cfe808d7a5d7cdc65ff315aa84edb947c48R128-R132

but yeah, let's hear the thoughts of the others. I think this would be a much bigger refactoring otherwise.

@marseel
Author

marseel commented Jun 2, 2023

lgtm from my point of view. Thanks!

@tjungblu
Contributor

tjungblu commented Jun 5, 2023

Thanks, opened it as a PR then, let's see what the reviewers think.

tjungblu added a commit to tjungblu/etcd that referenced this issue Jun 6, 2023
tjungblu added a commit to tjungblu/etcd that referenced this issue Jun 6, 2023
Mitigates etcd-io#15993 by not checking each key individually for permission
when auth is entirely disabled or admin user is calling the method.

Backport of etcd-io#16005

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
giorio94 added a commit to giorio94/cilium that referenced this issue Jun 7, 2023
Currently, each etcd client is associated with a single session, and the
corresponding lease is attached to all upserted keys (if the lease
parameter is set). This approach, though, suffers from performance
issues, because put requests in etcd take linear apply time depending
on the number of keys already attached to the lease (etcd-io/etcd#15993).
This performance penalty is planned to be fixed in etcd (at least for the
common case in which the user which performs the request has root role).

In the meanwhile, let's make sure that we attach a limited number of
keys to a single lease. In particular, this commit introduces the etcd
lease manager, which is responsible for managing the lease acquisitions,
tracking the keys that are attached to each of them. Once the number of
keys per lease exceeds the configured threshold, a new lease gets
automatically acquired. The lease usage counter is decremented when a
given key gets deleted. Finally, in case one of the leases fails to be
renewed, the manager allows to emit a notification event for all the keys
that were attached to it.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
tklauser pushed a commit to cilium/cilium that referenced this issue Jun 8, 2023
tjungblu added a commit to tjungblu/etcd that referenced this issue Jun 16, 2023
romanspb80 pushed a commit to romanspb80/cilium that referenced this issue Jun 22, 2023
wenjiaswe pushed a commit to wenjiaswe/etcd that referenced this issue Jul 7, 2023
@stale

stale bot commented Sep 17, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 17, 2023
@ahrtr ahrtr added stage/tracked and removed stale labels Sep 18, 2023
@serathius
Member

This was fixed via #16005
