
Implement very simple advisory lock mechanism #664

Open
wants to merge 1 commit into base: master
Conversation

@vlovich (Contributor) commented May 11, 2024

What does this PR do?

Introduces a very basic advisory-lock integration. As mentioned in the issue, this goes for a simple, non-fancy approach that only lets you acquire a single advisory lock per file entry.

Motivation

I'm implementing a database and would like a safe version of advisory locks to implement mutual exclusion between writers. Advisory locking can't be implemented soundly in pure safe code outside the library, because the resulting usage becomes unsound very easily (the Rust advisory-lock crate presents itself as safe, but I'd argue that crate is unsound because it doesn't guarantee the usage of the lock is correct within the context of the current process).
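To make concrete the kind of safe wrapper being discussed, here is a minimal Rust sketch (hypothetical, not glommio's actual API; the flock binding and constants are declared inline and assume Linux values): a non-blocking exclusive lock whose release is tied to an RAII guard, so safe code can't unlock a file it doesn't hold.

```rust
use std::fs::File;
use std::io;
use std::os::unix::io::AsRawFd;

// Raw flock(2) binding and constants (Linux values), declared inline to
// keep this sketch dependency-free.
extern "C" {
    fn flock(fd: i32, operation: i32) -> i32;
}
const LOCK_EX: i32 = 2;
const LOCK_NB: i32 = 4;
const LOCK_UN: i32 = 8;

/// RAII guard: the advisory lock is released when the guard is dropped,
/// tying the lock's lifetime to a scope within the current process.
/// Hypothetical sketch, not the PR's actual types.
pub struct AdvisoryLockGuard<'a> {
    file: &'a File,
}

impl<'a> Drop for AdvisoryLockGuard<'a> {
    fn drop(&mut self) {
        // Best-effort unlock; an error here cannot be surfaced from drop.
        unsafe { flock(self.file.as_raw_fd(), LOCK_UN) };
    }
}

/// Non-blocking exclusive-lock attempt. Ok(None) means another process
/// currently holds the lock.
pub fn try_lock_exclusive<'a>(file: &'a File) -> io::Result<Option<AdvisoryLockGuard<'a>>> {
    let rc = unsafe { flock(file.as_raw_fd(), LOCK_EX | LOCK_NB) };
    if rc == 0 {
        return Ok(Some(AdvisoryLockGuard { file }));
    }
    let err = io::Error::last_os_error();
    if err.kind() == io::ErrorKind::WouldBlock {
        Ok(None)
    } else {
        Err(err)
    }
}
```

Because the guard borrows the file, the lock cannot outlive the handle it protects, which is exactly the property the unsound crate mentioned above fails to enforce.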

Related issues

#663

Additional Notes

#448 prevents us from implementing "blocking" versions that wait indefinitely for the lock to become available: that would require spawning a new thread, which would leak memory (it would also be horribly slow and interact poorly with cancellation).

Checklist

[x] I have added unit tests to the code I am submitting
[x] My unit tests cover both failure and success scenarios
[ ] If applicable, I have discussed my architecture

@vlovich vlovich force-pushed the advisory-locks branch 8 times, most recently from e312871 to 531850c Compare May 11, 2024 16:31
"Not holding the lock!"
);

let source = self
@vlovich (Contributor, Author) May 11, 2024

One thing I've noticed here is that the background-thread dispatch is quite expensive: the syscall takes microseconds, but the round-trip to the background thread takes >10ms on my 3.9 GHz CPU, even though the background thread should be idle. That's fine for acquiring the lock (you can argue the latency is just the cost of acquisition), but it's unfortunate for unlock. I'm not sure whether I should do the funlock explicitly, since try_take_last_clone_unlocking_guard is much slower than drop + try_take_last_clone (except that pattern isn't quite as safe if you're trying to enforce a clean shutdown). I could be convinced that unlock should be a synchronous call - @glommer thoughts?

Collaborator

Is the main CPU busy? My suspicion is that your poll period is too high; if the main thread is busy, it won't know the operation completed until the next poll happens.

Collaborator

I am not very familiar with advisory locks in practice. Never used them, so it's hard for me to talk about the caveats. But it seems to me that:

  • you have state in userspace that tracks whether the lock is held (which is great, btw)
  • you will only try to acquire the lock in the kernel when you already know you will succeed.

If that's the case, maybe we can just make the blocking call? Blocking for microseconds is absolutely not a problem, especially for a lock (not something called thousands of times per second). You'd only want this to happen on the blocking thread if the operation could potentially take dozens of milliseconds.

Is there any case where we are concerned that locking (or unlocking) could take more than 1ms?
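The userspace-state idea in the first bullet can be sketched roughly like this (hypothetical types, not the PR's code): a per-file-entry flag rejects a second in-process acquisition before any kernel call is made, so the kernel lock only ever arbitrates between processes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Per-file-entry userspace state: tracks whether this process already
/// holds the advisory lock, so we never issue a doomed second kernel
/// acquisition from the same process. Hypothetical sketch.
pub struct LockState {
    held: AtomicBool,
}

impl LockState {
    pub fn new() -> Self {
        LockState { held: AtomicBool::new(false) }
    }

    /// Returns true if the caller may proceed to the kernel lock call;
    /// false means the lock is already held within this process.
    pub fn try_begin_acquire(&self) -> bool {
        self.held
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Clears the flag after the kernel unlock (or a failed acquire).
    pub fn release(&self) {
        self.held.store(false, Ordering::Release);
    }
}
```

With this in place, the kernel call is only attempted when the process-local state says it can't fail for intra-process reasons, which is what makes the "just block for microseconds" suggestion plausible.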

@vlovich (Contributor, Author) May 13, 2024

Benchmarking on my machine (not the best tool, but the one I have short of reading the Linux kernel source), a lock + unlock call pair takes ~220ns (lock and unlock appear equally expensive).

I originally had the blocking version in this PR as well, but that approach doesn't work: we can easily deadlock once we run out of background threads to distribute work to. There's probably no way to implement blocking versions without proper io_uring support (or doing something silly like a try-lock followed by spawning a completely new thread just to wait if it fails).

So yeah, I'll move these back onto the main thread.
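For reference, a figure like the ~220ns above can be obtained with a crude microbenchmark along these lines (a sketch, not the benchmark actually used; the inline flock binding assumes Linux constants, and wall-clock timing is only indicative):

```rust
use std::fs::File;
use std::os::unix::io::AsRawFd;
use std::time::{Duration, Instant};

// Raw flock(2) binding and constants (Linux values), inline so the
// sketch is dependency-free.
extern "C" {
    fn flock(fd: i32, operation: i32) -> i32;
}
const LOCK_EX: i32 = 2;
const LOCK_NB: i32 = 4;
const LOCK_UN: i32 = 8;

/// Times uncontended flock lock+unlock pairs and returns the mean
/// duration per pair. No warmup or outlier control; indicative only.
pub fn bench_flock_pair(file: &File, iters: u32) -> Duration {
    let fd = file.as_raw_fd();
    let start = Instant::now();
    for _ in 0..iters {
        unsafe {
            flock(fd, LOCK_EX | LOCK_NB);
            flock(fd, LOCK_UN);
        }
    }
    start.elapsed() / iters
}
```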

@vlovich (Contributor, Author) May 13, 2024

Actually, I just want to confirm one thing. While this is fine for local files, it may cause problems for NFS/SMB shares (which I don't have available to test), since locks there map to network I/O. Should we handle that via documentation, or stick to the blocking approach just in case?

@vlovich (Contributor, Author) May 13, 2024

I could use fstatfs to determine whether the file is on NFS/SMB and fall back to the blocking path in that case, if we want "optimal" behavior.

@vlovich (Contributor, Author)

I'm going to assume NFS/SMB is an unlikely use case for DmaFile for now, so I'm leaving it out. We can always add an fstatfs call later to distinguish network filesystems from local ones, at the cost of an extra syscall in the advisory-lock path.
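If that fstatfs fallback is ever added, it could look roughly like this (a hypothetical sketch: the struct layout assumes x86_64 Linux, and the magic numbers are taken from linux/magic.h):

```rust
use std::fs::File;
use std::io;
use std::os::unix::io::AsRawFd;

// Minimal fstatfs(2) binding assuming the x86_64 Linux struct layout.
// Only f_type is read, but the struct must match the kernel's size so
// the kernel write stays in bounds.
#[repr(C)]
struct StatFs {
    f_type: i64,
    f_bsize: i64,
    f_blocks: u64,
    f_bfree: u64,
    f_bavail: u64,
    f_files: u64,
    f_ffree: u64,
    f_fsid: [i32; 2],
    f_namelen: i64,
    f_frsize: i64,
    f_flags: i64,
    f_spare: [i64; 4],
}

extern "C" {
    fn fstatfs(fd: i32, buf: *mut StatFs) -> i32;
}

// Filesystem magic numbers from linux/magic.h.
const NFS_SUPER_MAGIC: i64 = 0x6969;
const SMB_SUPER_MAGIC: i64 = 0x517B;
const CIFS_MAGIC_NUMBER: i64 = 0xFF534D42;

/// Returns true if the file lives on a filesystem where advisory locks
/// may involve network I/O. Hypothetical sketch of the fallback check.
pub fn is_network_fs(file: &File) -> io::Result<bool> {
    let mut buf = StatFs {
        f_type: 0, f_bsize: 0, f_blocks: 0, f_bfree: 0, f_bavail: 0,
        f_files: 0, f_ffree: 0, f_fsid: [0; 2], f_namelen: 0,
        f_frsize: 0, f_flags: 0, f_spare: [0; 4],
    };
    if unsafe { fstatfs(file.as_raw_fd(), &mut buf) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(matches!(
        buf.f_type,
        NFS_SUPER_MAGIC | SMB_SUPER_MAGIC | CIFS_MAGIC_NUMBER
    ))
}
```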

@vlovich vlovich force-pushed the advisory-locks branch 2 times, most recently from 877bc4f to 83261bf Compare May 13, 2024 23:22
@glommer (Collaborator) commented May 28, 2024

Is this ready to go? It looks good on a fresh review.

In related news, I'm considering removing the clippy tests from CI. I still love the idea, but we've already struggled a lot trying to find a good way to pin versions so it stays stable, and it keeps breaking. Do you have any thoughts?

@vlovich (Contributor, Author) commented May 28, 2024

I thought it was, but I need to fix the failing tests.

I actually don't mind the clippy stuff in CI. It keeps the codebase consistent, which for me personally outweighs the nitpickiness. It could probably use a pre-push hook that runs the required clippy lints and doctests first, so that I don't churn on it so much :)

If we're running clippy on a floating version (e.g. nightly or latest stable), we could stop doing that and run clippy only on pinned versions, at the risk of more work when bumping the minimum/maximum Rust version we build against (plus downstream projects getting more complaints if they run a newer version than we test against). All options come with trade-offs 😃

@glommer (Collaborator) commented May 28, 2024

Yeah, the pre-push hooks may work. I'll take a look at it later this week.
Clippy was supposed to be on a pinned version, but somehow it keeps getting updated.

@vlovich (Contributor, Author) commented May 31, 2024

BTW the failure here was just a normal compile error :).

Blocking versions can't be implemented because they would require
spawning a new thread to avoid jamming up the background thread used for
other syscalls (the jamming can cause a deadlock, which is what testing
found). A blocking version would require spawning a new thread with a
standalone executor, but that's horribly expensive AND runs into
DataDog#448.