Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use openat when encountering ENAMETOOLONG #88731

Closed
wants to merge 7 commits into from

Conversation

the8472
Copy link
Member

@the8472 the8472 commented Sep 7, 2021

This adds a fallback to various fs methods that handles ENAMETOOLONG by splitting the path into relative segments and then opening each segment with openat(dirfd, segment, O_PATH).

Limitations:

  • not all fs methods are covered. the primary motivation is to get remove_dir_all working to delete a deep directory structure created by accident
  • currently linux-only. this could be extended to platforms which have either O_PATH or O_EXEC but there's no CI coverage for the BSDs, so I don't want to foist it on them.

fixes #53339

@rust-highfive
Copy link
Collaborator

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 7, 2021
@lukaslueg
Copy link
Contributor

IMHO this is very close to unexpected magic behaviour. Essentially ENAMETOLONG is filtered, which may provide comfort, but is unexpected. There is also no way Ask-For-Forgiveness-Not-Permission, expecting and handling ENAMETOLONG in the caller. Maybe this should get a separate entry point.

@the8472
Copy link
Member Author

the8472 commented Sep 7, 2021

File names which are too long (>255 bytes on many filesystems) would still result in ENAMETOOLONG, i.e. the error is not filtered. This fallback only handles paths exceeding PATH_MAX (4096 bytes on linux).

Having a separate deep version for all the std::fs functions would require them to be platform-specific extension because some platforms lack the -at syscalls or O_PATH/O_EXEC. That would be quite unergonomic, especially for the convenience methods like fs::write.

@the8472 the8472 added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Sep 8, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 3, 2021
@m-ou-se
Copy link
Member

m-ou-se commented Oct 3, 2021

@joshtriplett, @cuviper, what's your opinion on this change?

@m-ou-se m-ou-se assigned joshtriplett and unassigned m-ou-se Oct 3, 2021
@m-ou-se m-ou-se added the O-linux Operating system: Linux label Oct 3, 2021
@joshtriplett
Copy link
Member

Why does this need O_PATH, rather than O_DIRECTORY? The latter is supported by every kernel version we support, and I think it'll work here.

@joshtriplett joshtriplett added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 4, 2021
@the8472
Copy link
Member Author

the8472 commented Oct 4, 2021

Already answered on zulip, but repeating it here for the record:

O_PATH is necessary on linux because open requires permission flags to be passed (this uses O_RDONLY) but that would fail when traversing exec/search-only directories. O_PATH circumvents that because it ignores the permission flags. Some other platforms have O_EXEC that could be used instead.

@cuviper josh suggested I check with you whether RHEL happened to backport O_PATH support since the minimum kernel version the only thing keeping us from using it by default for remove_dir_all.

@the8472 the8472 added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 4, 2021
@cuviper
Copy link
Member

cuviper commented Oct 4, 2021

@cuviper josh suggested I check with you whether RHEL happened to backport O_PATH support since the minimum kernel version the only thing keeping us from using it by default for remove_dir_all.

No, that wasn't backported.

However, Linux doesn't check for invalid/unknown flags in open/openat -- this is how we used to get away with passing O_CLOEXEC even before we updated to a sufficient minimum kernel. So if we pass O_RDONLY | O_PATH, it should behave the same as O_RDONLY.

but that would fail when traversing exec/search-only directories.

So now we're talking about the failure of a new fallback, which means this is a corner case that didn't work before due to long paths, and will still fail (on old kernels) if it can't read. This seems acceptable to me, but should we let that EACCES through or return the original ENAMETOOLONG error?

@the8472
Copy link
Member Author

the8472 commented Oct 4, 2021

So if we pass O_RDONLY | O_PATH, it should behave the same as O_RDONLY.

That may be good enough to always use openat in remove_dir_all instead of the current approach + fallback for each file visited.

This seems acceptable to me, but should we let that EACCES through or return the original ENAMETOOLONG error?

I lean towards passing them through since on newer kernels that do support O_PATH it'll be a genuine traversal error. And a complete lack of permissions is probably more common than -r+x.

@rust-log-analyzer

This comment has been minimized.

@the8472
Copy link
Member Author

the8472 commented Oct 10, 2021

let's see if this also builds on the other unixes

@bors try

@bors
Copy link
Contributor

bors commented Oct 10, 2021

⌛ Trying commit c2ca4be16e5dc9c0bf006849c7b2f1d76c086806 with merge 8b572758695f4536fbe69894c98b5560ae8ff1cc...

@bors
Copy link
Contributor

bors commented Oct 11, 2021

☀️ Try build successful - checks-actions
Build commit: 8b572758695f4536fbe69894c98b5560ae8ff1cc (8b572758695f4536fbe69894c98b5560ae8ff1cc)

@the8472
Copy link
Member Author

the8472 commented Oct 11, 2021

It appears to work. remove_dir_all now walks the directory tree incrementally via openat which allows it to reach arbitrary depths. It now uses a loop instead of recursion and only keeps a limited number of file descriptors open, so it won't run into stack or rlimits either.

Protip while fiddling with recursive deletion: Use a sandbox or have btrfs snapshots ready 😅

@cuviper
Copy link
Member

cuviper commented Oct 12, 2021

let's see if this also builds on the other unixes

@bors try doesn't run full CI, just dist-x86_64-linux.

@workingjubilee
Copy link
Member

#95925 was closed so this is de facto unblocked, I guess.

@the8472
Copy link
Member Author

the8472 commented Jul 31, 2023

Ah yes. Though this PR was written prior to the fix for CVE-2022-21658 (which inconveniently has no associated issue...) so I'll have to go over it again to make sure it wouldn't reintroduce that.

@the8472
Copy link
Member Author

the8472 commented Jul 31, 2023

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 31, 2023
@workingjubilee
Copy link
Member

Oh. 😬

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rustbot
Copy link
Collaborator

rustbot commented Aug 5, 2023

The Miri subtree was changed

cc @rust-lang/miri

@rust-log-analyzer

This comment has been minimized.

Comment on lines +57 to +47
#[cfg(all(miri, any(target_os = "linux")))]
use libc::open64;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit surprised that this import is only present for Miri -- that seems strange?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two code-paths, one that uses openat and one that uses open. Miri only supports the latter, so the entire machinery around -at syscalls is configured-out under miri and the alternative is configured-in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, and there are some other import for open64 below for other targets. The entire set of cfgs is just tricky to have an overview of. Makes sense.

@RalfJung
Copy link
Member

RalfJung commented Aug 6, 2023

FWIW the reason Miri needs special cases here as that the standard library doesn't expose openat-like functionality itself itself -- something like a way to open a file on a ReadDir (or some other representation of a "directory FD"). Without that we can't implement an openat shim. (Miri can run Linux targets on Windows hosts, which is quite valuable and means falling back to the host openat isn't really an option.)

This adds a fallback to various fs methods that handles ENAMETOOLONG
by splitting the path into relative segments and then opening each
segment with openat(dirfd, segment, O_PATH).

Limitations:

* not all fs methods are covered. the primary motivation is to get
  remove_dir_all working to delete a deep directory structure
  created by accident
* requires O_PATH, so this can currently only be a fallback and
  not the default due to our current minimum kernel version. If you're
  not dealing with long paths it won't be used.
* currently linux-only. this could be extended to platforms which have
  either O_PATH or O_EXEC but there's no CI coverage for the BSDs
  so I don't want to foist it on them
* O(n²) performance if remove_dir_all has to use the fallback, albeit
  a small constant factor due to the long path segments used
  but ideally it should be rewritten to use openat in its recursion steps.
  But to do it properly we need a higher minimum kernel version.
it can remove arbitrarily-deep directory trees without exhausting
the stack or file descriptor limits.
The symlink attack/TOCTOU from CVE-2022-21658 that can occur when
traversing the directory hierarchy more than level at a time is
addressed by retracing the .. hierarchy after opening a descendant.
Opening .. isn't subject to symlink attacks so we can reliably compare
dev/ino numbers.
@the8472
Copy link
Member Author

the8472 commented Aug 14, 2023

FWIW the reason Miri needs special cases here as that the standard library doesn't expose openat-like functionality itself itself -- something like a way to open a file on a ReadDir

You could use cap-std for that which offers that functionality. Maybe it's not race-free on all platforms if they don't support the necessary primitives but I assume at least on the major unix-likes it would be using openat too.

But yeah, we should get the basics into std. Though that'll pose a compatibility question around windows since the necessary API there isn't officially stable.

@bors
Copy link
Contributor

bors commented Sep 2, 2023

☔ The latest upstream changes (presumably #115469) made this pull request unmergeable. Please resolve the merge conflicts.

@JohnCSimon
Copy link
Member

@the8472
ping from triage - can you post your status on this PR? This PR has not received an update in a few months, and the PR itself is ... 2021 . Thank you!

@the8472
Copy link
Member Author

the8472 commented Dec 19, 2023

Closing for now. This will be more easily implemented on top of ACP 259

@the8472 the8472 closed this Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-linux Operating system: Linux S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std::fs::remove_dir_all doesn't handle paths that are too long