Throttle per-index snapshot deletes #100793

Conversation

DaveCTurner (Contributor)

Each per-index process during snapshot deletion takes some nonzero
amount of working memory to hold the relevant snapshot IDs and metadata
generations etc. which we can keep under tighter limits and release
sooner if we limit the number of per-index processes running
concurrently. That's what this commit does.
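
As a rough sketch of the idea (hypothetical names and a plain semaphore; the actual change uses ThrottledIterator, discussed below): limit how many per-index processes run concurrently so that each one's working state is held only briefly.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

// Hypothetical standalone sketch, not the code in this PR: cap the number of
// per-index deletion processes in flight so the working memory they hold
// (snapshot IDs, metadata generations, ...) is bounded and released sooner.
class PerIndexDeleteThrottler {
    private final ExecutorService executor = Executors.newFixedThreadPool(4);
    private final Semaphore permits;

    PerIndexDeleteThrottler(int maxConcurrentIndices) {
        this.permits = new Semaphore(maxConcurrentIndices);
    }

    void deleteAll(List<Runnable> perIndexWork) throws InterruptedException {
        for (Runnable work : perIndexWork) {
            permits.acquire(); // wait for a free slot, bounding work-in-progress
            executor.execute(() -> {
                try {
                    work.run(); // read snapshot IDs and metadata, update, then drop them
                } finally {
                    permits.release(); // free the slot (and the memory) promptly
                }
            });
        }
    }
}
```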

@DaveCTurner DaveCTurner added the >non-issue, WIP, :Distributed/Snapshot/Restore and v8.12.0 labels on Oct 13, 2023
@DaveCTurner DaveCTurner force-pushed the 2023/10/13/throttle-snapshot-index-deletes branch 3 times, most recently from c1f29e0 to b236c7a on October 16, 2023 20:56
@DaveCTurner DaveCTurner force-pushed the 2023/10/13/throttle-snapshot-index-deletes branch from b236c7a to 9d5d363 on October 16, 2023 21:01
@DaveCTurner DaveCTurner marked this pull request as ready for review October 16, 2023 21:59
@ywangd ywangd (Member) left a comment

I have a couple of questions. They may all be inconsequential. I'd appreciate your input to help my understanding. Thanks!

Comment on lines +351 to +352
```java
if (snapshotsToDelete.containsAll(snapshotsContainingIndex)) {
    return snapshotsContainingIndex.size() > snapshotsToDelete.size();
```
ywangd (Member)

I think this means there are possibly duplicated SnapshotId entries in snapshotsContainingIndex? Why would that be possible? If there are duplicated entries, why do we consider this index to be "updated but not removed"? It seems to me that any duplicated entries are removed when computing survivingSnapshots.
Given these questions, I think some comments are necessary.

DaveCTurner (Contributor Author)

This is copied verbatim from the previous code. I also find it unclear, but I haven't had a chance to think about it very hard. Some comments would be welcome, but maybe not in this PR.

ywangd (Member)

Yeah, I am aware this is existing code; I assumed that you knew the answers. Since it's unclear to you as well, perhaps adding a TODO would be useful?
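
To spell out the arithmetic behind the question, here is a standalone, hypothetical restatement of the quoted check, with String standing in for SnapshotId; why duplicates could arise in the first place remains the open question above.

```java
import java.util.List;
import java.util.Set;

class ContainsAllIllustration {
    // Since containsAll ignores multiplicity, once every entry of
    // snapshotsContainingIndex is in snapshotsToDelete, the size comparison
    // can only be true when snapshotsContainingIndex holds duplicates:
    // |distinct(containing)| <= |toDelete| < |containing| forces duplicates.
    static boolean quotedCheck(List<String> snapshotsContainingIndex, Set<String> snapshotsToDelete) {
        if (snapshotsToDelete.containsAll(snapshotsContainingIndex)) {
            // e.g. containing = [s1, s1], toDelete = {s1}: 2 > 1, so true
            return snapshotsContainingIndex.size() > snapshotsToDelete.size();
        }
        return false; // the quoted fragment does not show this branch
    }
}
```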

Comment on lines +47 to +48
```java
// This test ensures that we appropriately throttle the per-index activity when deleting a snapshot by marking an index as "active" when
// its index metadata is read, and then as "inactive" when it updates the shard-level metadata. Without throttling, we would pretty much
```
ywangd (Member)

IIUC, this depends on the index having only a single shard. If that is the case, it may be worth mentioning it in this comment.

```java
// etc. which we can keep under tighter limits and release sooner if we limit the number of concurrently processing indices.
// Each one needs at least one snapshot thread at all times, so threadPool.info(SNAPSHOT).getMax() of them at once is enough to
// keep the threadpool fully utilized.
ThrottledIterator.run(
```
ywangd (Member)

This is a neat class, TIL. In BlobStoreRepository, we have 3 ways of running jobs on a thread pool. I wonder whether there is some possibility to consolidate them:

  • ThreadPool#executor#execute - eagerly runs jobs on all available threads and eagerly queues remaining jobs.
  • ThrottledTaskRunner#enqueueTask - uses at most a specified number of threads in the underlying threadpool. Manages its own queue to avoid flooding the threadpool's queue.
  • ThrottledIterator#run - uses at most a specified number of threads in the underlying threadpool. Does not manage its own queue, but still avoids flooding the threadpool's queue by drawing work from an iterator. The use of an iterator also makes it more memory efficient than the above classes because jobs are not materialized until they are needed. One important difference with this class is that all jobs must be supplied up front via the Iterator; it is not possible to add more jobs afterwards. (A sketch of this pattern follows below.)
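
A hedged sketch of the ThrottledIterator-style pattern just described, with illustrative names and simplifications rather than the actual Elasticsearch class:

```java
import java.util.Iterator;
import java.util.function.BiConsumer;

// Pull at most maxConcurrent items from the iterator, and pull the next item
// only when a running one signals completion, so unstarted work is never
// materialized. (The real class also guards against stack overflow when items
// complete synchronously; that is omitted here.)
final class ThrottledIteratorSketch {
    static <T> void run(Iterator<T> items, BiConsumer<Runnable, T> itemConsumer, int maxConcurrent) {
        for (int i = 0; i < maxConcurrent; i++) {
            spawnNext(items, itemConsumer);
        }
    }

    private static <T> void spawnNext(Iterator<T> items, BiConsumer<Runnable, T> itemConsumer) {
        final T next;
        synchronized (items) { // plain iterators are not thread-safe
            if (items.hasNext() == false) {
                return;
            }
            next = items.next();
        }
        // Hand the item a completion handle; when the item invokes it
        // (possibly on another thread), the next item is pulled and started.
        itemConsumer.accept(() -> spawnNext(items, itemConsumer), next);
    }
}
```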

I am sure there are details that I am not aware of yet. But conceptually it seems useful to have one abstraction that does the following:

  1. Configurable max number of threads to use
  2. Configurable queuing strategy, aggressive or conservative.
  3. Able to take new jobs beyond its instantiation time
  4. Eagerly run jobs on a single (or configurable) thread similar to ThrottledTaskRunner

Its most basic form would essentially behave just like a ThreadPool#executor, and the advanced forms would be like ThrottledTaskRunner or ThrottledIterator. If such a thing is possible, we could potentially use it in all places in BlobStoreRepository. Right now, we are still using a plain ThreadPool#executor at the shard-deletion level. If this usage also needs to be migrated, I wonder whether it would be helpful to have an overarching Threading abstraction that manages all jobs arising from a snapshot deletion.

Apologies, this feels a bit rambling on my side. Feel free to disagree. Thanks!

@DaveCTurner DaveCTurner (Contributor Author) Oct 17, 2023

I'm not sure we need to consolidate these things; they're quite different in many important ways. Conceptually:

  • ThrottledTaskRunner is about fairness/priority in a threadpool queue (amongst different computations).
  • ThrottledIterator is about limiting work-in-progress (within the context of a single computation) and not really anything to do with threading.

It's appropriate to use a plain ThreadPool#executor for the shard-level operations; there are effectively no other operations for the threadpool to do at this point (except perhaps some older background cleanup work, but that can wait). It's important for the shard-level operations to go through as fast as possible because they're holding up the next snapshot. We don't want to interleave them with any background cleanup work from a previous delete, which is what a ThrottledTaskRunner would do.
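
To make the contrast concrete, a hedged sketch of the fairness-oriented pattern: a hypothetical ThrottledTaskRunnerSketch, not the actual Elasticsearch class. Tasks can keep arriving, they wait in the runner's own queue, and at most maxRunning of them occupy the shared executor at once.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;

// Own queue plus bounded occupancy of a shared executor, so other
// computations using the same executor still get a fair share of its threads.
final class ThrottledTaskRunnerSketch {
    private final Executor executor;
    private final Semaphore permits;
    private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();

    ThrottledTaskRunnerSketch(Executor executor, int maxRunning) {
        this.executor = executor;
        this.permits = new Semaphore(maxRunning);
    }

    void enqueueTask(Runnable task) {
        queue.add(task); // new tasks may arrive at any time, unlike an Iterator
        pollAndSpawn();
    }

    private void pollAndSpawn() {
        while (permits.tryAcquire()) {
            Runnable task = queue.poll();
            if (task == null) {
                permits.release();
                if (queue.isEmpty()) {
                    return; // re-check closes the add/poll race with enqueueTask
                }
            } else {
                executor.execute(() -> {
                    try {
                        task.run();
                    } finally {
                        permits.release();
                        pollAndSpawn(); // a slot freed up; start the next queued task
                    }
                });
            }
        }
    }
}
```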

DaveCTurner (Contributor Author)

> It's important for the shard-level operations to go through as fast as possible because they're holding up the next snapshot.

I realised after writing this that of course this change does have the effect of interleaving the shard-level metadata ops with any leftover background cleanup activity a bit more. Eh. My feeling is that this is ok: background cleanup activity is not normally very onerous, and I believe it's more important to limit the WIP in this area.

ywangd (Member)

Sorry for being dense here:

> ThrottledIterator is about limiting work-in-progress (within the context of a single computation) and not really anything to do with threading.

ThrottledIterator does not have an executor by itself, but it sort of expects the work item to fork. In the use case here, does it effectively control fairness/priority in the threadpool's queue?
By "a single computation", do you mean all tasks are passed in at once via the iterator? The way it triggers the next round of run is somewhat similar to how ThrottledTaskRunner#pollAndSpawn is triggered?

DaveCTurner (Contributor Author)

It's true that ThrottledIterator only makes sense if the tasks do not all complete immediately on the calling thread, but that might mean that they dispatch a single task to a threadpool executor, or that they dispatch several tasks (which in turn possibly dispatch further tasks), or indeed that they send some work to another node entirely.

Originally ThrottledTaskRunner did not take a Releasable that tasks could use to indicate their completion on a different thread; it was only about whether the forked task had completed execution on the calling thread or not. Now that we have that, I guess we can cover some similar cases with it. I'd still want it to take an Iterator which computes the tasks on the fly, rather than having a fully-materialized queue, though.
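
As a small illustration of that completion style (hypothetical names; a plain Runnable stands in for Elasticsearch's Releasable):

```java
import java.util.concurrent.Executor;

// Illustrative sketch only, not the real API: the concurrency slot is freed by
// the task's explicit completion signal, not by the forked runnable returning.
interface AsyncThrottledTask {
    void run(Runnable releaseSlot); // must invoke releaseSlot exactly once, on any thread
}

final class ReleasableStyleForking {
    static void fork(Executor executor, AsyncThrottledTask task, Runnable releaseSlot) {
        executor.execute(() -> task.run(releaseSlot));
        // Deliberately no releaseSlot.run() here: returning from the forked
        // runnable does not mean the logical task has completed; it may have
        // dispatched further work, locally or to another node, and will signal
        // completion from whichever thread finishes last.
    }
}
```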

@ywangd ywangd (Member) left a comment

LGTM

The ongoing discussion about ThrottledIterator etc. is really for my education and does not block this PR.

@DaveCTurner DaveCTurner added the auto-merge label and removed the WIP label on Oct 17, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Distributed label on Oct 17, 2023
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine elasticsearchmachine merged commit b0469e9 into elastic:main Oct 17, 2023
12 checks passed
@DaveCTurner DaveCTurner deleted the 2023/10/13/throttle-snapshot-index-deletes branch October 17, 2023 08:08
@DaveCTurner DaveCTurner restored the 2023/10/13/throttle-snapshot-index-deletes branch June 17, 2024 06:17