Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC for redirecting stdio of child processes to open file handles #1055

Closed
wants to merge 11 commits into from
Closed

RFC for redirecting stdio of child processes to open file handles #1055

wants to merge 11 commits into from

Conversation

ipetkov
Copy link
Contributor

@ipetkov ipetkov commented Apr 10, 2015

@ipetkov
Copy link
Contributor Author

ipetkov commented Apr 10, 2015

cc @alexcrichton


```rust
pub struct Pipe {
pub reader: File,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some details on how this structure is going to be created as well? (the more detailed an RFC is the better!)

Also, I think that File may not be the best type to use here, or at least it may depend on the implementation. I would guess that on unix this would use the pipe system call, but File has methods like set_len which may not work out too well for pipes.

As a final point, could you add some words as to how this will be implemented on Windows as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton I think you are right that File isn't the correct type to use here. I was thinking as a File as an abstraction over a file descriptor/HANDLE, but I didn't really think about the distinctions between pipes and files. I'll update the design with a separate wrapper for pipes to maintain those distinctions!

@rschmitt
Copy link

Isn't this what InheritFd did in old_io? Was that also implemented on Windows?

child process or redirect stdio to `/dev/null`. It would also be largely useful
to allow stdio redirection to any currently opened `std::fs::File` (henceforth
`File`) handle. This would allow redirecting stdio with a physical file or even
another process (via OS pipe) without forcing the parent process to buffer the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When directing to another process, the pipe actually already exist in a sense where Stdio::piped() means that the child will be created with a stdin/stdout/stderr handle, under which is just a file descriptor. That may end up being more useful in the long run to hook up processes together perhaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true that the pipes exist internally, but they only pipe between the parent and child only. Although an API for hooking up several processes directly may be useful, I think that extending the current API with this functionality may be too troublesome to make it truly flexible, besides simply allowing the caller to pass in some sort of file descriptor and let them worry about hooking things up properly.

@alexcrichton
Copy link
Member

Overall I like the direction that this is going in, but I'm hesitant with how it interacts with other aspects of the system. For example this interface should largely boil down to "just passing a file descriptor" on unix and something like "just passing a HANDLE" on windows. Right now the restriction of using a File itself may be too limiting, and I'm not quite so sure about borrowed-vs-owned aspect of the argument just yet, I think I'll need to ponder this a bit more.

Regardless though thanks for the RFC!

* Don't use `File` as the underlying pipe handle, it is too limiting and
  carries file related assumptions.
* Instead, define a new `Pipe` type whose reader/writer are wrappers over a
  fd/HANDLE.
* Note how pipes will be opened in Unix and Windows
* Propose new redirection API: based on ownership, defined for concrete types
  provided by the standard library
@ipetkov
Copy link
Contributor Author

ipetkov commented Apr 12, 2015

@rschmitt Yeah InheritFd allowed redirection of arbitrary file descriptors, even on Windows, except the caller would have to manually convert the HANDLE as an fd via libc calls themselves. Not to mention the unsafety of the file descriptor getting closed at some point before the command was actually spawned (or worse a different file opened under that same descriptor). This RFC seeks to make things a bit easier and safer when redirecting stdio!

@alexcrichton I agree that using a File is too limiting, so I've updated my design to create a new handle for pipes.

For example this interface should largely boil down to "just passing a file descriptor" on unix and something like "just passing a HANDLE" on windows.

I think this would be a good approach for the future, but I wanted to avoid Unix/Windows discrepancies in the API around sockets (they aren't quite HANDLES on Windows and I'm not sure if they can be substituted for each other) so I thought about using a "white list" of accepted types for redirection. Although not as flexible as accepting the raw OS handle type it seems like a decent compromise for now.

...
// sys::fs2::File is a safe wrapper over a fd/HANDLE,
// not to be confused with the public `std::fs::File`
Redirect(Rc<sys::fs2::File>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an Rc removes the Send and Sync bounds on Stdio, which is a breaking change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this would have to be Arc then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that should work.

* We don't want to break Stdio's already present Sync and Send bounds
wrapped file handle.

```rust
impl Clone for Stdio { ... }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that this may be a bit ambitious for the standard library, it's pretty rare to see an Arc under the hood of a primitive for the purpose of implementing Clone right now. There's definitely a use case of being able to share descriptors among spawned processes, but I think that we may want to expose that via raw handles for now.

For example unix could have Stdio::from_raw_fd and windows could have Stdio::from_raw_handle which will be low-level wrappers to pass I/O handles to the child. This would satisfy the ability to share the same handle among many processes (probably unsafely) for now as it's probably a somewhat rare desire.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the unsafety should be acceptable for now, especially since commands are probably executed as soon as they are built and all handles are still in scope.

Also I think it might be better to simply have a redirect method defined to take AsRawFd implementers on Unix, and AsRawHandle implementers on Windows to make usage of the API more portable. I'll update the proposal with the changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think it might be better to simply have a redirect method defined to take AsRawFd implementers on Unix, and AsRawHandle implementers on Windows to make usage of the API more portable

You've mentioned this in the RFC as written, but this'll need to be a platform-specific extension as the set of types implementing AsRawFoo is different across platforms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking something along the lines of a StdioExt trait which will be defined in std::os::$patform::io based on the appropriate AsRawFoo, but their method signatures would match lexically.

// Unix
pub trait StdioExt { fn redirect<T: AsRawFd>(t: &T) -> Stdio { ... } }
// Windows
pub trait StdioExt { fn redirect<T: AsRawHandle>(t: &T) -> Stdio { ... } }

So then you could write something like the following in both Unix and Windows

#[cfg(unix)] use std::os::unix::io::StdioExt as StdioExt;
#[cfg(windows)] use std::os::windows::io::StdioExt as StdioExt;

let file = File::open(...);
let stdio = unsafe { StdioExt::redirect(file) };
...

Instead of having to write the following everywhere you want to redirect something

#[cfg(unix)] use std::os::unix::io::StdioExt as StdioExt;
#[cfg(windows)] use std::os::windows::io::StdioExt as StdioExt;

let file = File::open(...);
let stdio = unsafe {
    if cfg!(unix) {
        StdioExt::from_raw_fd(file)
    } else {
        StdioExt::from_raw_handle(file)
    }
};
...

This way users are still somewhat aware that this is platform specific code, without suffering the constant need of cfg.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! Definitely sounds like a plausible idea to me!

@alexcrichton alexcrichton self-assigned this Apr 15, 2015
* Use an OS specific `StdioExt` trait to allow for redirection based on
  `AsRawFd` and `AsRawHandle` implementors.
* Leverage the OS pipes already present in `ChildStd{in, out, err}` after
  spawning a `Command` to easily allow piping data between processes.
@ipetkov
Copy link
Contributor Author

ipetkov commented Apr 18, 2015

I've updated and simplified the design to work with ChildStd{in, out, err} which are already using OS pipes to talk with the child process.

I've left out a public interface for safely creating OS pipes out of this version because it doesn't seem as necessary any more. It would be pretty trivial to expose a public wrapper over AnonPipe either now or in the future should this become a requested feature.

* `StdioExt` should be defined in `std::os::$platform::process` not
  `std::os::$platform::io`
* Default IO of `Command`s is to inherit the stdio handles, so we should
  specify `Stdio::piped` where applicable.
```

This would require that the internally defined `AnonPipe` wrapper be implemented
using HANDLEs (and not file descriptors) on Windows. This can easily be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was actually recently fixed in rust-lang/rust#24426 as well, so yay!

@alexcrichton
Copy link
Member

This looks great to me @ipetkov, thanks!

I'm a tad bit uncertain about whether unsafe is the right way to go here, but I think that it's the best course of action. Right now we're providing the guarantee that a File is the sole and unique owner of that file descriptor/handle, hence why from_raw_fd is now unsafe. If we were to make Stdio::redirect safe, then you could create the redirection, drop the source handle, open a new handle, and then spawn the child. This newly opened handle would then be used by the child process, violating the unique ownership semantics of File. As a result I think it's fine to call the API unsafe.

cc @nagisa, @retep998, @carllerche, @aturon

@l0kod
Copy link

l0kod commented Apr 21, 2015

👍 We definitely need this features to replace the lost std::old_io::pipe::PipeStream and StdioContainer::InheritFd.

The Redirect(RawFd) could be replaced with something like Redirect<T: AsRawFd>(&mut T) to be able to not need unsafe code: exclusive use of the file descriptor/handle during all it's lifetime.

StdioExt::redirect<T: AsRawFd>(t: &'a mut T) should return a Stdio with the 'a lifetime.

We should be able to connect the stdio/pipes before spawning any command.

Agree with @alexcrichton to no use File but something lighter. I previously proposed to expose something like FileDesc but common between UNIX and Windows (e.g. IoHandle, cf rust-lang/rust#15643) to keep a common function signature whatever OS is used. However, &mut As{RawFd,Handle} could do the job.

@alexcrichton
Copy link
Member

@l0kod the recommended unsafety here derives from the fact that there's no lifetime associated with the return value (see the section about backwards compatibility and my prior comment). It also already does take T: AsRawFd as Stdio::Redirect is not an expose enum variant (it's only exposed through an extension trait).

@ipetkov
Copy link
Contributor Author

ipetkov commented Apr 22, 2015

We should be able to connect the stdio/pipes before spawning any command.

@l0kod That's one argument for exposing a public Pipe interface which wraps over AnonPipe and would allow setting that all up before spawning, which would be pretty easy to add

@l0kod
Copy link

l0kod commented Apr 25, 2015

the recommended unsafety here derives from the fact that there's no lifetime associated with the return value

Well, I suggest to add a lifetime to the return value: rust-lang/rust#24251 (comment)

@ipetkov
Copy link
Contributor Author

ipetkov commented Apr 25, 2015

@l0kod I think @alexcrichton is referring to Stdio as the return value, and not the raw handle type. Stdio is stored by Command, so if it was to get a lifetime parameter due to AsRaw{Fd, Handle}, so would Command. At this point this would be a breaking change of a stabilized API.

@l0kod
Copy link

l0kod commented Apr 25, 2015

I didn't realize this was so impacting. I know Rust has a big part stabilized but I understood that some (rare) breaking changes could happen before the final 1.0.0
@alexcrichton Is it really too late to fix this objects' lifetimes?

@ipetkov
Copy link
Contributor Author

ipetkov commented May 8, 2015

This would take the (File) ownership and could get ride of the unsafe.

What about redirecting multiple child processes to the same File? Then either Stdio would have to implement Clone, or anything else that holds a file descriptor/HANDLE.

@l0kod
Copy link

l0kod commented May 9, 2015

What about redirecting multiple child processes to the same File? Then either Stdio would have to implement Clone, or anything else that holds a file descriptor/HANDLE.

A Stdio::tee(&self) -> Stdio could be in charge of writing multiple time to the same file description (i.e. Stdio). It seems appropriate to use tee(2) in the Stdio implementation for that.
This way, we can control and handle potentially unsafe operations in only one implementation.

* Take `FromRaw{Fd, Handle}` into account as a possible implementation
* Open discussion on a higher level API for child process stdio redirection
@ipetkov
Copy link
Contributor Author

ipetkov commented May 15, 2015

I've updated the proposal to consider FromRaw* as an implementation and proposed several approaches for a higher level API. After thinking about the tradeoffs for each approach, I'm still a fan of using StdioExt (because I find it cleaner than using an intermediary trait), however, I am open to other ideas!

@l0kod Is tee(2) available across platforms (e.g. Windows)?

@aturon aturon added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label May 22, 2015
@rschmitt
Copy link

rschmitt commented Jun 5, 2015

In the meantime, is there a workaround we can use on nightly and/or stable? Heatseeker has only been usable on Windows ever since old_io was removed (relevant code).

@retep998
Copy link
Member

retep998 commented Jun 5, 2015

@ipetkov I cannot find tee(2) or equivalent on Windows.

@ipetkov
Copy link
Contributor Author

ipetkov commented Jun 5, 2015

@rschmitt there are two workarounds that come to mind:

  • Spawn a thread to buffer data from the given descriptor to the child's stdin, but probably won't scale very well
  • dup2 the given descriptor to the parent's stdin fd and have the child inherit that

@rschmitt
Copy link

rschmitt commented Jun 5, 2015

@ipetkov Your second suggestion is exactly what I needed. Works great for what I need, sure beats doing the ioctls by hand. Thanks!

@l0kod
Copy link

l0kod commented Jun 6, 2015

Is tee(2) available across platforms (e.g. Windows)?

Even if tee(2) is not available, the Stdio could implement a tee() -> Stdio and use something similar to tee(2) internally (e.g. DuplicateHandle or manual copy?).

@alexcrichton
Copy link
Member

@ipetkov could you update this RFC with specific signatures for the functions you have in mind? I've found it useful to have the signatures in mind ahead of time as it makes it easier to discuss portions of the API in terms of generics and ownership.

@ipetkov
Copy link
Contributor Author

ipetkov commented Jun 14, 2015

@l0kod To my understanding of the tee syscall on Linux it doesn't quite behave like simply duplicating a descriptor, so it may prove difficult to emulate that functionality on other systems (lest we introduce confusion by using the same name).

Internal duplication of the OS handle is definitely a possibility, however, I personally lean towards allowing the caller to choose if they want duplication (and expecting them to use dup themselves) rather than forcing it on them.

@alexcrichton Updated the RFC!

ensure the handle will remain valid until the child is spawned, making this
method `unsafe` as well.

Below are several alternative ways of exposing a high level API. They are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend structuring this RFC to make a specific recommendation on which API route to choose and leave the other two to the "Alternatives" section. I would personally recommend something like:

  1. Having an extra StdioExt struct seems somewhat odd as it's not actually returned or needed to be a struct, so I'd list this as an alternative.
  2. Using an extension trait is currently the standard method for extending functionality on a type, so I'd recommend listing this as the what the RFC proposes.
  3. I would recommend listing this as an alternative, but definitely emphasize the non-cross-platform aspect as it's the reason I wouldn't want to pursue it.

* Low level `FromRaw*` `AsRaw*` APIs are already present in libstd, thus the
  focus of this RFC has shifted to just developing a high level API.
@ipetkov
Copy link
Contributor Author

ipetkov commented Jun 15, 2015

Updated the RFC to recognize the FromRaw* implementations on Stdio and to reflect @alexcrichton 's suggestions of the previously offered design alternatives

Unfortunately, since the actual methods pertaining to `FromRaw*` and `AsRaw*`
are OS specific, their usage requires either constant `cfg` checks or OS
specific code duplication. Moreover, the conversion to `Stdio` is `unsafe` and
requires the caller to ensure the OS handle remains valid until spawning the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API is actually a little subtle in this regard, but the contract of FromRawFd is that it takes ownership of the file descriptor (or handle) in question, so the unsafety is based on ownership, not on the lifetime of the resource itself. Specifically, the various primitives in the standard library provide the guarantee that they are the sole owner of the fd/handle in question, and this being safe would otherwise be a violation of that.

Also, if we assume that a IntoRawFd trait exists, using the raw underlying interfaces may not be so bad as you'd just have two functions (unix/windows) doing Stdio::from_raw_fd(file.into_raw_fd()). I suspect that adding traits like IntoRawFd are somewhat inevitable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more concrete, one concern I have about this is that this RFC is based on an ergonomic concern which will be alleviated soon in the future. The alternative (IntoRawFd) I don't think is unergonomic enough to warrant a specialized function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton

the contract of FromRawFd is that it takes ownership of the file descriptor (or handle) in question

Ah, I had overlooked this...

However, doesn't this require that the caller mem::forget the wrapper over the descriptor/HANDLE or at least manually dup it (and perform error checking)? Because the contract of AsRawFd does not transfer ownership and then both Stdio and the original wrapper will both try to close the handle.

I suspect that adding traits like IntoRawFd are somewhat inevitable.

I also agree, however, I feel like something like IntoRawFd could be very useful now, because being forced to mem::forget or dup handles (i.e. when using the current state of the API) is very unergonomic to force on userland, especially when the standard library does such a good job from abstracting much of that away (and the reason why the Command interface is so handy to have).

The alternative (IntoRawFd) I don't think is unergonomic enough to warrant a specialized function.

I could definitely live with a compromise like IntoRawFd, but a specialized function could avoid accidental use of AsRawFd instead of IntoRawFd and prevent someone from shooting their foot (since both traits would produce a fd which FromRawFd will happily accept).

Assuming the availability of an IntoRawFd trait the currently proposed API could be tweaked to use it more appropriately:

pub trait StdioExt {
   // Take ownership of the handle
   fn redirect<T: IntoRaw*>(t: T) -> Stdio;
   // Unsafely borrow the handle, letting caller ensure it is valid
   unsafe fn redirect_by_ref<T: AsRaw*>(t: &T) -> Stdio;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, doesn't this require that the caller mem::forget the wrapper over the descriptor/HANDLE or at least manually dup it (and perform error checking)?

I believe this is correct, yes.

Assuming the availability of an IntoRawFd trait the currently proposed API could be tweaked to use it more appropriately:

I think one part of this that I'm uneasy about is how Stdio::redirect(t) is the same as Stdio::from_raw_fd(t.into_raw_fd()). Along those lines it seems redundant to have the redirect method, and then only adding an unsafe redirect_by_ref method also seems somewhat out of place because it's relatively the same thing as Stdio::from_raw_fd(f.as_raw_fd()) + a mem::forget somewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a point there; I suppose that minus the cross-platform benefits the RFC's design does not offer any further benefits than something like IntoRawFd.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, in that case it may be good motivation to push harder on these traits perhaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so. Do you think this RFC would be an appropriate place to discuss them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would serve as good motivation, but I think that IntoRaw{Fd,Handle,Socket} may want to be its own RFC.

@ipetkov
Copy link
Contributor Author

ipetkov commented Jun 24, 2015

Based on @alexcrichton's feedback I've written up RFC #1174 to discuss adding new traits such as IntoRaw{Fd, Socket, Handle}.

This RFC still retains its cross-platform benefits over using IntoRaw*, however, if there is no strong desire for a cross-platform redirection API at the time, we can close this RFC in favor of #1174.

@alexcrichton alexcrichton added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Jul 1, 2015
@alexcrichton
Copy link
Member

This RFC is now entering its week long final comment period.

@SimonSapin
Copy link
Contributor

#[cfg(unix)] use std::os::unix::process::StdioExt;
#[cfg(windows)] use std::os::windows::process::StdioExt;

If some form of StdioExt::redirect is available on all supported platforms, the above is just noise IMO. I’d much prefer redirect being in inherent method of Stdio, as proposed in Alternatives.

I’d go even further: rather than have redirect itself have a platform-specific signature, it could use a new trait (AsStdioRedirect?) that internally uses eiher AsRawFd or AsRawHandle.

@sfackler
Copy link
Member

sfackler commented Jul 6, 2015

@SimonSapin a given type may not implement both AsRawFd on Unix and AsRawHandle on Windows (e.g. TcpStream which implements AsRawSocket instead of AsRawHandle). Having to import the StdioExt traits makes people opt into potential platform specific breakage.

Taking an AsRaw* by value and then storing it away seems a bit weird - particularly since it's not unreasonable to have non-owning types that implement AsRaw*. I'd be in favor of accepting #1174 and using that instead.

@alexcrichton
Copy link
Member

The consensus of the libs team on this RFC is to close this in favor of the IntoRawFd RFC, so I'm going to close this in favor of that. Thanks regardless though for the RFC @ipetkov!

@ipetkov
Copy link
Contributor Author

ipetkov commented Jul 8, 2015

Thanks for all the feedback on this RFC anyway @alexcrichton!

@ipetkov ipetkov deleted the process-stdio-redirection branch July 8, 2015 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants