fs::copy hangs on docker (Linux) #75446

Closed
h33p opened this issue Aug 12, 2020 · 19 comments · Fixed by #75428
Labels: C-bug (Category: This is a bug.), O-linux (Operating system: Linux), T-libs (Relevant to the library team, which will review and decide on the PR/issue.)

Comments

@h33p

h33p commented Aug 12, 2020

Running Fedora 32 with selinux set to non-enforcing, building a project with OpenCV.

I have a rather unusual docker setup, and the build hangs during the build stage of opencv-rust, around here.

I expected to see this happen: the build succeeds rather quickly.

Instead, this happened: execution froze in the fs::copy call, with the CPU stuck running at almost full speed.

My project was mounted using -v $PWD/project:/project:Z (name changed), opencv was manually cloned (for debugging) into the root of the docker image, at /opencv-rust.

Copies from /opencv-rust to /project don't work (example: /opencv-rust/bindings/cpp/opencv_4/aruco.cpp => /project/target/release/build/opencv-50ff47d79816a5ea/out/aruco.cpp)

File copies from /project to /opencv-rust work just fine (example: /project/target/release/build/opencv-50ff47d79816a5ea/out/xobjdetect_types.hpp => /opencv-rust/bindings/cpp/opencv_4/xobjdetect_types.hpp)

It appears to be an issue in the Linux-specific implementation of fs::copy, since using the more generic version above does not freeze.
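For reference, a generic copy of the kind mentioned above can be sketched as a plain read()/write() loop. This is only a minimal illustration: `generic_copy` is a hypothetical helper name, the 8 KiB buffer size is an arbitrary choice, and unlike `fs::copy` it does not copy file permissions.

```rust
use std::fs::File;
use std::io::{self, Read, Write};
use std::path::Path;

/// Copies `from` to `to` with a plain read()/write() loop and returns the
/// number of bytes copied. Hypothetical helper, not the std implementation.
fn generic_copy(from: &Path, to: &Path) -> io::Result<u64> {
    let mut reader = File::open(from)?;
    let mut writer = File::create(to)?;
    let mut buf = [0u8; 8192];
    let mut written = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // read() returning 0 means end of file
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..n])?;
        written += n as u64;
    }
    Ok(written)
}
```

Because the loop stops when read() reports end of file rather than counting down from a size queried up front, it cannot spin on a source whose reported size disagrees with what the kernel is willing to copy.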

Meta

rustc --version --verbose:

rustc 1.45.2 (d3fb005a3 2020-07-31)
binary: rustc
commit-hash: d3fb005a39e62501b8b0b356166e515ae24e2e54
commit-date: 2020-07-31
host: x86_64-unknown-linux-gnu
release: 1.45.2
LLVM version: 10.0

@h33p h33p added the C-bug Category: This is a bug. label Aug 12, 2020
@jonas-schievink jonas-schievink added T-libs Relevant to the library team, which will review and decide on the PR/issue. O-linux Operating system: Linux labels Aug 12, 2020
@h33p
Author

h33p commented Aug 12, 2020

Update: I reimplemented the Linux version, and the written count always stays stuck at 0 inside the function.

@tavianator
Contributor

Any chance you can strace the process that gets stuck?

@h33p
Author

h33p commented Aug 12, 2020

It's an endless copy_file_range(6, NULL, 11, NULL, 41205, 0) = 0

@h33p
Author

h33p commented Aug 12, 2020

According to the man page, "If the file offset of fd_in is at or past the end of file, no bytes are copied, and copy_file_range() returns zero", although the behaviour here is odd. If I were to guess, running this under docker does not return EXDEV, even though the file is technically on an external device (though not from the docker container's point of view).

@h33p
Author

h33p commented Aug 12, 2020

The current implementation handles only the errors, not this particular case of 0. Adding a match arm that redirects 0 to one of the error paths, forcing the fallback mode, could work. However, might there be cases where 0 gets returned legitimately?
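One way to separate the two meanings of a 0 return can be sketched without real syscalls. The assumption in this sketch is that a 0 before any byte has been copied is treated as "fast path unusable, fall back", while a 0 after some progress is treated as end of file. `drive_fast_copy`, `FastCopy`, and the `fast_copy` closure (standing in for one `copy_file_range` call) are hypothetical names, not the std implementation.

```rust
use std::io;

/// Outcome of driving a kernel-assisted copy, modelled without real syscalls.
#[derive(Debug, PartialEq)]
enum FastCopy {
    /// The fast path copied this many bytes.
    Done(u64),
    /// The fast path returned 0 before copying anything; use read()/write().
    NeedsFallback,
}

/// `fast_copy(remaining)` stands in for one `copy_file_range` call and
/// returns the number of bytes it copied.
fn drive_fast_copy<F>(len: u64, mut fast_copy: F) -> io::Result<FastCopy>
where
    F: FnMut(u64) -> io::Result<u64>,
{
    let mut written = 0u64;
    while written < len {
        match fast_copy(len - written)? {
            // Nothing copied at all: ambiguous, so let the caller fall back.
            0 if written == 0 => return Ok(FastCopy::NeedsFallback),
            // Zero after progress: the source ended early (e.g. truncated).
            0 => break,
            n => written += n,
        }
    }
    Ok(FastCopy::Done(written))
}
```

Falling back on an initial 0 is harmless for a genuinely empty source, since a read()/write() loop copies zero bytes and returns immediately.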

@the8472
Member

the8472 commented Aug 12, 2020

The current implementation queries the size of the source file and then does a decrementing loop. Does the file in question get truncated by another thread while the copy is in progress?
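The loop shape described here can be modelled in a few lines to show why a call that keeps returning 0 never terminates. This is a simulation only: `calls_until_done` and its `max_calls` guard are hypothetical and exist so the demonstration itself cannot hang; the real loop has no such guard.

```rust
/// Models a "query the size, then count down" copy loop. `copy_chunk(remaining)`
/// stands in for one kernel copy call and returns the bytes it copied.
/// Returns the number of calls needed, or `None` if `max_calls` was reached
/// without finishing (the real loop would spin forever in that case).
fn calls_until_done<F>(len: u64, max_calls: u32, mut copy_chunk: F) -> Option<u32>
where
    F: FnMut(u64) -> u64,
{
    let mut written = 0u64;
    let mut calls = 0u32;
    while written < len {
        if calls == max_calls {
            return None; // no progress is being made
        }
        written += copy_chunk(len - written);
        calls += 1;
    }
    Some(calls)
}
```

With a 41205-byte length and a chunk function that always returns 0, as in the strace output earlier in the thread, the model gives up at the guard, whereas a chunk function that copies everything it is asked for finishes in one call.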

@the8472
Member

the8472 commented Aug 12, 2020

Can you check which syscalls are used in the direction that works?

> If I were to guess, running this under docker does not return EXDEV, even though the file is technically on an external device (though not from the docker container's point of view).

Recent kernels support cross device copy_file_range under certain circumstances, e.g. overlayfs can delegate to the underlying device and btrfs can copy between volumes. Bind mounts also should count as the same device. But maybe there's a bug in there in combination with selinux?

@cuviper
Member

cuviper commented Aug 12, 2020

If selinux is non-enforcing (per OP), that should rule it out.

@the8472
Member

the8472 commented Aug 12, 2020

It still seems really odd. The kernel's copy_file_range implementation itself does a lot of fallbacks internally when the preferred methods return 0, i.e. 0 bytes copied ultimately leads to a splice operation from file to file, which is not all that different from what we're doing in userspace. So 0 bytes being copied is quite unexpected for a non-empty file. There are lots of layers involved though, and any of them could be the cause.

We could add yet another workaround, but if possible I'd like to identify the root cause so we can report it to the responsible parties.

@tavianator
Contributor

Coreutils tests for a zero-byte copy_file_range() and falls back on a read()/write() loop if that happens: coreutils/coreutils@4b04a0c

Their code has a comment mentioning that this happens for /proc special files. I tried to reproduce it with

fs::copy("/proc/self/cmdline", "./foo");

but ran into another bug: stat() says that file is empty, so fs::copy() creates an empty file and doesn't copy anything. I think trusting st_size is fundamentally broken, for both this reason and the race that @the8472 mentioned.
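The mismatch between the reported size and the readable content can be checked directly. The following is a small sketch, assuming a Linux system with /proc mounted; `reported_vs_actual` is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns (size reported by the file's metadata, bytes actually readable).
/// Hypothetical helper for comparing st_size against the real content length.
fn reported_vs_actual(path: &Path) -> io::Result<(u64, usize)> {
    let reported = fs::metadata(path)?.len();
    // fs::read keeps reading until end of file, regardless of the size hint.
    let actual = fs::read(path)?.len();
    Ok((reported, actual))
}
```

For an ordinary file the two numbers agree. For `/proc/self/cmdline` the reported size is expected to be 0 while the content is non-empty, which is why a copy loop bounded by the reported size produces an empty destination.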

@the8472
Member

the8472 commented Aug 13, 2020

@rustbot claim

@the8472
Member

the8472 commented Aug 13, 2020

@tavianator that's another good reason to change the logic, but it'd still be good to know what's actually causing it so we can add the root cause to comments or report things upstream if they haven't been fixed already.
The direction-dependence smells like something overlayfs-related, but that's just a guess without further details from @h33p

@h33p
Author

h33p commented Aug 13, 2020

This is interesting. The kernel's generic_copy_file_checks sets the length to zero upon return, because i_size_read returns 0. This ultimately means that the inode's size in the kernel is zero, while the metadata reports a size of 41205. Possible kernel bug?

Yet this behaviour only occurs when copying from docker's system over to the overlaid home directory, all technically on the same overlayfs, but on different filesystems on the host side.
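The length clamp being described can be modelled as a pure function. This is a simplified sketch of the source-side check as described above, written in Rust for illustration; `clamp_copy_len` is a hypothetical name, and the real kernel function performs further checks that are omitted here.

```rust
/// Simplified model of the source-side length clamp: the requested `count`
/// is limited to what lies between `pos_in` and the inode size `size_in`.
fn clamp_copy_len(pos_in: u64, size_in: u64, count: u64) -> u64 {
    if pos_in >= size_in {
        0 // at or past end of file: nothing to copy
    } else {
        count.min(size_in - pos_in)
    }
}
```

If the inode size read by the kernel is 0, a request for 41205 bytes at offset 0 is clamped to 0, which matches the `copy_file_range(...) = 0` result seen in the strace output even though stat reports 41205 bytes.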

@tavianator
Contributor

Maybe it's getting i_size from the stacked inode in the overlayfs, not the underlying one

@the8472
Member

the8472 commented Aug 13, 2020

I'm trying to reproduce it from basic pieces but so far it works just fine

```shell
mkdir direct overlay upper lower work
mount -t overlay overlay -o lowerdir=lower,upperdir=upper,workdir=work overlay
echo "foo" > lower/IN
strace -ffe openat,copy_file_range ./copy.rs
```

```rust
#!/usr/bin/env run-cargo-script

fn main() -> std::io::Result<()> {
  println!("{}", std::fs::copy("./overlay/IN", "./direct/OUT")?);
  Ok(())
}
```

```
[pid 22796] openat(AT_FDCWD, "./overlay/IN", O_RDONLY|O_CLOEXEC) = 3
[pid 22796] openat(AT_FDCWD, "./direct/OUT", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0100644) = 4
[pid 22796] copy_file_range(3, NULL, 4, NULL, 3, 0) = 3
```

  • kernel 5.7.12-arch1-1
  • btrfs

@h33p
Author

h33p commented Aug 14, 2020

I'm sorry, I made one horrible mistake: the directories are not mounted on the same overlayfs; they seem to be bind mounted.

Output on docker container:

/dev/mapper/fedora-home on /project type ext4 (rw,relatime,seclabel)
/dev/mapper/fedora-home on /test_out type ext4 (rw,relatime,seclabel)
/dev/mapper/fedora-home on /model type ext4 (rw,relatime,seclabel)
/dev/mapper/fedora-home on /dataset type ext4 (ro,relatime,seclabel)

I am not sure how exactly one would reproduce this environment without actually using docker/podman to do so.

@h33p
Author

h33p commented Aug 14, 2020

Not only that, I noticed that the behaviour is very strange. Some files do cause the lock ups, some don't.

Now, when building with opencv-rust taken directly from crates.io, running the build script creates this file:
/home/developer/.cargo/registry/src/github.com-1ecc6299db9ec823/opencv-0.45.0/bindings/cpp/opencv_4/aruco.cpp

Running the fs::copy script with that path causes it to lock up. But if I cp the file elsewhere and copy it back in place, fs::copy no longer locks up. If I append a new line to the file, the lockups stop too; even changing the file permissions fixes it. Basically, if I modify the file in any way, it seems to update itself and work just fine.

This really seems like some serious, hard to reproduce kernel bug to me.

@h33p
Author

h33p commented Aug 14, 2020

My current kernel version is 5.7.14-200.fc32.x86_64, but the same happened with an older 5.7 kernel as well.

@the8472
Member

the8472 commented Aug 14, 2020

> I'm sorry, I made one horrible mistake: the directories are not mounted on the same overlayfs; they seem to be bind mounted.

Those are the target directories. But the source also matters and depending on the docker storage driver you're using that might be overlayfs.

> Running the fs::copy script with that path causes it to lock up. But if I cp the file elsewhere and copy it back in place, fs::copy no longer locks up. If I append a new line to the file, the lockups stop too; even changing the file permissions fixes it. Basically, if I modify the file in any way, it seems to update itself and work just fine.

Yeah, again, if overlayfs is involved, it may make a difference whether the file comes from the upper or the lower layer.

> I am not sure how exactly one would reproduce this environment without actually using docker/podman to do so.

A step-by-step reduced testcase would help. I grasp the rough outline of what is happening, but there are many details that might make a difference.

I can add a workaround without that, but then I can't verify that the issue is fixed and it would make reporting things upstream more difficult too.

5 participants