Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make views require dedicated unsafe marker traits. #32

Merged
merged 8 commits into from
May 27, 2022

Conversation

sunfishcode
Copy link
Owner

Requiring Into and From conversions is not sufficient for
FilelikeView and SocketlikeView, becuase there's no guarantee
that the Into implementation will return the same handle as the From
implementation. If a type allows its handle to be reassigned, that
can lead to the old handle being freed twice, and the new handle being
leaked.

To fix this, introduce unsafe marker traits FilelikeViewType and
SocketlikeViewType, and have FilelikeView and SocketlikeView
require these traits.

Requiring `Into` and `From` conversions is not sufficient for
`FilelikeView` and `SocketlikeView`, becuase there's no guarantee
that the `Into` implementation will return the same handle as the `From`
implementation. If a type allows its handle to be reassigned, that
can lead to the old handle being freed twice, and the new handle being
leaked.

To fix this, introduce unsafe marker traits `FilelikeViewType` and
`SocketlikeViewType`, and have `FilelikeView` and `SocketlikeView`
require these traits.
@sunfishcode
Copy link
Owner Author

This PR as it stands is not enough. Here's another example:

use io_lifetimes::AsFilelike;
use std::fs::File;

fn main() {
    let a = File::open("Cargo.toml").unwrap();
    let b = File::open("Cargo.toml").unwrap();

    let mut v = a.as_filelike_view::<std::fs::File>();
    *v = b;
}

is enough to get a double-close.

Maybe we also need to remove the DerefMut implementation. That's pretty restrictive, because things like Read::read require &mut self. But maybe that means we need more utilities for doing read/write on non-mutable references.

@the8472
Copy link

the8472 commented May 14, 2022

because things like Read::read require &mut self

It's not that restrictive due to impl Read for &File, at worst you have to deref and then add a &mut.

@sunfishcode
Copy link
Owner Author

Oh wow, so this works:

    let v = f.as_filelike_view::<std::fs::File>();
    (&*v).read(&mut buf)

I would not have guessed that.

sunfishcode added a commit to bytecodealliance/rustix that referenced this pull request May 17, 2022
And similar for `dup3`.

The idea behind using `&OwnedFd` is that `dup2`'s second operand isn't like a
normal borrow. It effectively closes the old file descriptor, and creates a
new one with the same index. This could break assumptions of classes that have
an `AsFd` to allow users to do special I/O operations, but which don't expect
users can close and reopen their file descriptor as some completely unrelated
resource.

However, the existence of things like [`FilelikeView`], as well as the
`ManuallyDrop` pattern, mean that `&OwnedFd` doesn't actually prevent
users from using `dup2` on a `BorrowedFd`.

With sunfishcode/io-lifetimes#32 though, `&mut OwnedFd` would be
sufficient, because it removes the `DerefMut` implementation.

So change `rustix` stance to be that `dup2` requires `&mut OwnedFd`.

This means that it's no longer possible to pass the same file descriptor
to both operands of `dup2` or `dup3` with safe Rust, which means it's not
possible to observe the difference in behavior in that case, so remove
the `dup3.rs` test.

[`FilelikeView`]: https://docs.rs/io-lifetimes/latest/io_lifetimes/views/struct.FilelikeView.html
@sunfishcode
Copy link
Owner Author

The removal of DerefMut makes it possible to consider having rustix::io::dup2 take an &mut OwnedFd as its second argument. A PR for this is bytecodealliance/rustix#332.

This potentially means that the answer to "what's the practical difference between BorrowedFd<'a>, &'a OwnedFd, and &mut 'a OwnedFd is (a) BorrowedFd has a different representation and can be used in FFI, and (b) &mut 'a OwnedFd can be used as the second argument to dup2.

@sunfishcode
Copy link
Owner Author

sunfishcode commented May 17, 2022

The current PR here plus bytecodealliance/rustix#332 suggest a rule: there can only be at most one mutable OwnedFd for a given file descriptor, at any time.

An alternative would be to say "there can only be at most one OwnedFd for a given file descriptor, at any time". Same as before, but without mutable. We could still have views, because views don't hand out &OwnedFd; they hand out references to types that encapsulate and OwnedFd, like &File. OwnedFd doesn't implement the *likeViewType marker traits. There's an &OwnedFd on the inside of File's implementation, but it's not exposed. Edit: So we'd need to find some clever way to word this, so that it's ok to have multiple OwnedFds, as long as they're sufficiently encapsulated.

Another alternative would be to say you can have many OwnedFds for a given file descriptor at the same time, as long as you're careful that only one is dropped and none outlive the drop. That would mean that anyone can use dup2 to reassign the file description of any BorrowedFd, and we should do bytecodealliance/rustix#312.

@the8472
Copy link

the8472 commented May 17, 2022

Another alternative would be to say you can have many OwnedFds for a given file descriptor at the same time, as long as you're careful that only one is dropped and none outlive the drop.

That seems like it would take us back to RawFd territory that we wanted to avoid with the BorrowedFd<'a>/OwnedFd distinction.

@sunfishcode
Copy link
Owner Author

sunfishcode commented May 17, 2022

Another alternative would be to say you can have many OwnedFds for a given file descriptor at the same time, as long as you're careful that only one is dropped and none outlive the drop.

That seems like it would take us back to RawFd territory that we wanted to avoid with the BorrowedFd<'a>/OwnedFd distinction.

It would still be unsafe to do OwnedFd::from_raw_fd, for that reason. However, as_filelike_view::<File>() needs to do from_raw_fd to construct a temporary File that it can hand out references to. It does take care of making sure things aren't double-closed (after the fix in this PR). But, is it even ok for it to have a temporary OwnedFd during the lifetime of the real owning OwnedFd? The answers to the above options are, respectively:

  • Yes, because the temporary is not mutable.
  • Yes, because the temporary OwnedFd is sufficiently encapsulated inside File
  • Yes, because you can have any number of OwnedFds for the same fd live at the same time provided you take care.

@the8472
Copy link

the8472 commented May 17, 2022

Yes, because the temporary OwnedFd is sufficiently encapsulated inside File

This, of sorts. Because File promises, through the FilelikeViewType trait, to behave correctly when being used to wrap a not-really-owned-fd inside a View<File>.

Any of those in isolation would be unsafe. It's the constraints imposed by their combination that makes it safe. E.g. an owned File constructed from a BorrowedFd would be unsafe because you could drop it. But when you drop the View then it takes the necessary steps to restore safety. But this is only guaranteed to be correct if the wrapper promises to behave in a sane manner (fd that goes in is the fd that you'll take out).

@sunfishcode
Copy link
Owner Author

I'm having difficulty figuring out how to describe this rule. I think it's something like "types implementing FilelikeViewType shall not expose their internal OwnedFd as an &OwnedFd or &mut OwnedFd outside their encapsulation boundary", but I don't know exactly what the encapsulation boundary is.

@the8472
Copy link

the8472 commented May 18, 2022

I'm having difficulty figuring out how to describe this rule.

How about this?

Types implementing FilelikeViewType must, when constructed via From<OwnedFd>, return the same file descriptor, pointing to the same file description when deconstructed via Into<OwnedFd>.
Additionally they must not rely on exclusive access to the file descriptor that would normally be promised by OwnedFd.
E.g. struct Mmap(OwnedFd) cannot implement FilelikeViewType because otherwise the shared memory references borrowed for the lifetime of Mmap could be overwritten, thus violating immutability, through another BorrowedFd obtained from the original struct from which the view would have been derived.


types implementing FilelikeViewType shall not expose their internal OwnedFd as an &OwnedFd or &mut OwnedFd outside their encapsulation boundary

I think &OwnedFd is ok? That's very similar to a BorrowedFd except that you have a place to borrow from? On its own it's just not sufficient for the safety guarantees we need.

@sunfishcode
Copy link
Owner Author

Additionally they must not rely on exclusive access to the file descriptor that would normally be promised by OwnedFd.

Should OwnedFd normally promise this? How might we word such a promise?

E.g. struct Mmap(OwnedFd) cannot implement FilelikeViewType because otherwise the shared memory references borrowed for the lifetime of Mmap could be overwritten, thus violating immutability, through another BorrowedFd obtained from the original struct from which the view would have been derived.

Such an Mmap would presumably rely on exclusive access to the file description too. Someone with an OwnedFd obtained by try_clone() could do I/O and violate immutability in the same way. I suspect an Mmap(OwnedFd) that can be constructed from an arbitrary provided OwnedFd and that hands out references to mapped memory is always going to be unsound. Is there another example which would demonstrate the desired constraint here?

@the8472
Copy link

the8472 commented May 18, 2022

Such an Mmap would presumably rely on exclusive access to the file description too.

Ah, yes. I think it would still make a good example on OwnedFd to show a thing it can't promise.
But at least that would not prevent it from implementing Into<OwnedFd>, only a safe From<OwnedFd>.

Is there another example which would demonstrate the desired constraint here?

Other than close and dup2 I can't think of any unixy things that are really unsafe to perform without exclusive file descriptor access but safe with shared file description.
Maybe the distinction matters somewhere on windows?

@sunfishcode
Copy link
Owner Author

Such an Mmap would presumably rely on exclusive access to the file description too.

Ah, yes. I think it would still make a good example on OwnedFd to show a thing it can't promise. But at least that would not prevent it from implementing Into<OwnedFd>, only a safe From<OwnedFd>.

I am trying to avoid getting into the business of documenting how to use mmap safely in Rust here :-}. It would be useful, but it's complex and subtle enough that it should be a separate effort.

Is there another example which would demonstrate the desired constraint here?

Other than close and dup2 I can't think of any unixy things that are really unsafe to perform without exclusive file descriptor access but safe with shared file description. Maybe the distinction matters somewhere on windows?

Same here. I don't have comprehensive knowledge of Windows, but I'm not aware of anything like this. As far as I can tell, Windows doesn't even have anything corresponding to dup2.

And I know see use cases that need protection from dup2, such as using memfd_create + file sealing and subsequently assuming that the contents are not mutated.

And something I was missing above is that we don't need to define the abstraction boundary for a view type. Just saying "Into<OwnedFd> has to return the same fd as From<OwnedFd> received to be usable as a view" is sufficient, because any type which exposes the &mut OwnedFd to code it doesn't control would not be able to guarantee that.

So it might be sufficient to just say this:

  • dup2's second operand must be a &mut OwnedFd. That protects ordinary AsFd implementors from having their file descriptions rewritten out from underneath them, breaking invariants they might be relying on for safety. I don't know exactly how to word that yet, but we can figure that out.
  • FilelikeViewType requires types guarantee Into<OwnedFd> returns the same fd From<OwnedFd> gave them.

What do you think?

@sunfishcode
Copy link
Owner Author

One outcome of this approach is that &OwnedFd and BorrowedFd<'_> become conceptually interchangeable, except for representation and ergonomics. You can get a BorrowedFd<'_> from an &OwnedFd with as_fd, and view APIs will be able to encapsulate the unsafe needed to provide a &OwnedFd from a BorrowedFd<'_>.

@the8472
Copy link

the8472 commented May 24, 2022

A BorrowedFd works when you don't have a place to borrow from, e.g. because you're dealing with an opaque OS type from which the fd number can be queried but not retrieved via field access.

@sunfishcode
Copy link
Owner Author

Yes, that's what I'm calling ergonomics here. You could also use &OwnedFd, by constructing a temporary OwnedFd like as_filelike_view does, but that's less convenient.

In other words, these two questions:

  • "If I implement AsFd for a type, which system calls can users do with it?"
  • "If I have a type with a method that returns an &OwnedFd, what system calls can users do with it?"

have the same answer. Also, both BorrowedFd<'_> and &OwnedFd are tied to a lifetime. Both are copyable. And both are "immutable" references in the sense that you can't reassign a new file descriptor (or, with bytecodealliance/rustix#332, a new file description).

@the8472
Copy link

the8472 commented May 24, 2022

Yes, that's what I'm calling ergonomics here. You could also use &OwnedFd, by constructing a temporary OwnedFd like as_filelike_view does, but that's less convenient.

You couldn't because you wouldn't have a place to store the OwnedFd in from which you could borrow &OwnedFd. The method returns, its stackspace is gone.

@sunfishcode
Copy link
Owner Author

You couldn't because you wouldn't have a place to store the OwnedFd in from which you could borrow &OwnedFd. The method returns, its stackspace is gone.

You'd define a struct View(OwnedFd) and return one of those by value, and it would Deref to &OwnedFd, and implement Drop to not close the fd. Less convenient, but doable. This is what as_filelike_view does.

@the8472
Copy link

the8472 commented May 24, 2022

Ah, I it does collapse the concepts of View and BorrowedFd. But currently we only have the latter in std. So if they're not introduced at the same time then this creates a path dependency where View needs to be a separate type I think.
Or we could make the view the thing under the hood and make BorrowedFd a type alias without initially implementing as_filelike_view.

@sunfishcode
Copy link
Owner Author

I think it makes sense to keep BorrowedFd<'_> a separate type. View types internally need to use Option<T>, rather than just the T that they conceptually hold, because their drop implementations need to be able to leave self in a valid-but-empty state, so their representation is different. Option<ViewFd<OwnedFd> wouldn't get the niche optimization that Option<BorrowedFd<'_>> gets. And Option<ViewFd<OwnedFd>> wouldn't be usable in FFI declarations the way that Option<BorrowedFd<'_>> is.

@the8472
Copy link

the8472 commented May 24, 2022

How about struct BorrowedView(ManuallyDrop<OwnedFd>)?

@sunfishcode
Copy link
Owner Author

Hm, interesting. For a general-purpose view type, we'd want a type parameter, so we can have File and similar views. That would require it to hold a ManuallyDrop<T>, so that it can hand out &Ts. But if T has any dynamically allocated resources, like maybe a Vec buffer, that would leak those resources. But perhaps we could prevent that by making it a requirement of the marker trait that types be ok with leaking (other than the fd).

@the8472
Copy link

the8472 commented May 24, 2022

Hmm, no that seems too limiting. And MaybeUninit doesn't work either because it prevents niche optimization. So separate types it is then. That also has the advantage that BorrowFd doesn't need to implement Drop while View does.

`ManuallyDrop::take` allows the view types to properly consume their temporary
objects without needing an `.unwrap()`.
@sunfishcode
Copy link
Owner Author

I just noticed that ManuallyDrop has a take function, and that means the view types don't need to use Option. I've now added a patch here implementing that. It does require unsafe, but it kind of feels symmetric with the unsafe used to create the temporary objects in the first place.

This seems like an improvement on the current PR here, but it also suggests that it might be feasible to eliminate BorrowedFd and just use view types.

@sunfishcode
Copy link
Owner Author

I've now tried to implement this, and encountered the following problems:

  • BorrowedFd implements Copy, but the view types are unable to implement Copy since they implement Drop. Also, if we care about view types holding additional dynamically-allocated resources, not all view types will want to implement Copy.
  • BorrowedFd::from_raw is a const fn, but the view correspondent view_raw can't be a const fn because it uses From<OwnedFd>::from which is not a const fn.

Making BorrowedFd implement Copy supports features that make it more ergonomic in common cases, and a need for BorrowedFd::from_raw to be a const fn has come up in practice, so these might be the reasons to keep BorrowedFd separate.

sunfishcode added a commit to bytecodealliance/rustix that referenced this pull request May 26, 2022
And similar for `dup3`.

The idea behind using `&OwnedFd` is that `dup2`'s second operand isn't like a
normal borrow. It effectively closes the old file descriptor, and creates a
new one with the same index. This could break assumptions of classes that have
an `AsFd` to allow users to do special I/O operations, but which don't expect
users can close and reopen their file descriptor as some completely unrelated
resource.

However, the existence of things like [`FilelikeView`], as well as the
`ManuallyDrop` pattern, mean that `&OwnedFd` doesn't actually prevent
users from using `dup2` on a `BorrowedFd`.

With sunfishcode/io-lifetimes#32 though, `&mut OwnedFd` would be
sufficient, because it removes the `DerefMut` implementation.

So change `rustix` stance to be that `dup2` requires `&mut OwnedFd`.

This means that it's no longer possible to pass the same file descriptor
to both operands of `dup2` or `dup3` with safe Rust, which means it's not
possible to observe the difference in behavior in that case, so remove
the `dup3.rs` test.

[`FilelikeView`]: https://docs.rs/io-lifetimes/latest/io_lifetimes/views/struct.FilelikeView.html
@sunfishcode
Copy link
Owner Author

Making BorrowedFd implement Copy supports features that make it more ergonomic in common cases,

Thinking about this more, there wouldn't be as much of a need for BorrowedFd to implement Copy if it were a view, because it would deref to an &OwnedFd at each use, rather than being copied at each use. But there are other situations where it's useful for BorrowedFd to be Copy, such as when users want to wrap a BorrowedFd in some other type which wants to be Copy. So I still think the reasoning here holds.

@sunfishcode
Copy link
Owner Author

Thanks for the discussion, @the8472! I found it very valuable to talk through this here. I'm going to merge this PR, because I think we have a clear enough view (hah) of how this form of view needs to work. But if you have any further observations, please post about them, here or in the I/O safety tracking issue.

@sunfishcode sunfishcode merged commit b3c7b00 into main May 27, 2022
@sunfishcode sunfishcode deleted the sunfishcode/view-trait branch May 27, 2022 16:58
sunfishcode added a commit to sunfishcode/io-extras that referenced this pull request May 27, 2022
The main change here is that view types no longer implement `DerefMut`,
which is needed by `Read` and `Write`. Fortunately, there's a way to
get a `DerefMut` implementation:

sunfishcode/io-lifetimes#32 (comment)
sunfishcode added a commit to sunfishcode/io-extras that referenced this pull request May 27, 2022
* Update to io-lifetimes 0.7.0-beta.0.

The main change here is that view types no longer implement `DerefMut`,
which is needed by `Read` and `Write`. Fortunately, there's a way to
get a `DerefMut` implementation:

sunfishcode/io-lifetimes#32 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants