-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
repo: Change locking for summary regeneration to be shared #2493
Conversation
This is trying to address: https://pagure.io/fedora-iot/issue/48 Basically we changed rpm-ostree to start doing a shared lock during commit by default, but this broke because pungi is starting a process doing a commit for each architecture, and then trying to regenerate the summary after each one. This patch is deleting a big comment with a rationale for why summary regeneration should be exclusive. Point by point: > This makes sure the commits and deltas don't get > deleted while generating the summary. But prune operations require an exclusive lock, which means that data still can't be deleted when the summary grabs a shared lock. > It also means we can be sure refs > won't be created/updated/deleted during the operation, without having to > add exclusive locks to those operations which would prevent concurrent > commits from working. First: The status quo *has* prevented concurrent commits from working! There is no real locking solution to this problem. What we really need to do here is regenerate the summary after each commit *or* when the caller decides to do it and e.g. include deltas at the same time. It's OK if multiple threads race to regenerate the summary; last-one-wins behavior here is totally fine.
cc @dbnicholson |
If we're OK with multiple |
OK right, looking at the code, the obvious problem with this is that this can trivially error out due to |
In https://pagure.io/fedora-iot/issue/48 we found a regression from coreos@e0d2a95 I think this is really an ostree bug; see ostreedev/ostree#2493 But for now we need to allow callers to choose either the old or new behavior. This way e.g. pungi can be configured to lock or not without having to upgrade/downgrade rpm-ostree.
We talked about this at Endless a couple times and I think we talked about it here, but I can't find it. In my original PR I did have it shared as can be seen in eba8a17. However, when the locking was actually added in #1681 it was made exclusive. I believe that came from a conversation we'd had that I can't find. I think the idea was that you want the data to be stable so that there aren't inconsistencies between refs, commits and deltas. But I think it's more in line with the rest of ostree for this to be shared and that you'll reach eventual consistency if your processes update the repo metadata consistently. Anyways, I think it's fine to make this shared. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
One thing I did always want to address that was made possible with the exclusive lock was to ensure you weren't racing when trying to sign the summary. If the summary generation takes an exclusive lock and the summary signing takes an exclusive lock, then you can't accidentally publish a mismatching signature because one process will block the other. Now that the locking API is public, you can take an exclusive lock before doing this whole process if you desire. The other thing I was trying to achieve in my unfinished #1946 was to make the summary generation and signing non-racy (at least on the repo side), but that's kind of separate. |
This is trying to address:
https://pagure.io/fedora-iot/issue/48
Basically we changed rpm-ostree to start doing a shared lock during
commit by default, but this broke because pungi is starting a process
doing a commit for each architecture, and then trying to regenerate
the summary after each one.
This patch is deleting a big comment with a rationale for why
summary regeneration should be exclusive. Point by point:
But prune operations require an exclusive lock, which means that
data still can't be deleted when the summary grabs a shared lock.
First: The status quo has prevented concurrent commits from working!
There is no real locking solution to this problem. What we really
need to do here is regenerate the summary after each commit or
when the caller decides to do it and e.g. include deltas at the same
time.
It's OK if multiple threads race to regenerate the summary;
last-one-wins behavior here is totally fine.