cp: fix possible OOM and partial write with large files #6694

Open
wants to merge 1 commit into base: main
Conversation

neyo8826

There are 2 fixes:

  • pwrite does not guarantee that all bytes will be written; write_all_at is a proper wrapper over it
  • sparse_copy_without_hole (triggered on ZFS) can eat RAM if the file is large and there are not enough holes; the write now happens in chunks of at most 16 MiB (as far as I tested, this saturates an SSD anyway without much CPU)

I have tested it by checking the sha256sum of a 4 GiB file.
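
For illustration, a minimal sketch of the chunked-write idea (this is not the exact code in the PR; copy_in_chunks, its signature, and the error handling are made up, and it assumes std::os::unix::fs::FileExt for write_all_at):

```rust
use std::fs::File;
use std::io::{self, Read};
use std::os::unix::fs::FileExt;

const CHUNK: usize = 16 * 1024 * 1024; // cap the buffer at 16 MiB to bound memory use

// Illustrative only: read the source in bounded chunks and write each chunk
// with write_all_at, which retries until every byte is written, instead of
// relying on a single pwrite over a whole-file buffer.
fn copy_in_chunks(src: &mut File, dst: &File, len: u64) -> io::Result<()> {
    let buf_len = CHUNK.min(len as usize).max(1); // small files get one small buffer
    let mut buf = vec![0u8; buf_len];
    let mut offset: u64 = 0;
    while offset < len {
        let want = buf_len.min((len - offset) as usize);
        src.read_exact(&mut buf[..want])?;
        dst.write_all_at(&buf[..want], offset)?;
        offset += want as u64;
    }
    Ok(())
}
```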

@@ -141,6 +141,8 @@ where
}
let src_fd = src_file.as_raw_fd();
let mut current_offset: isize = 0;
let step = std::cmp::min(size, 16 * 1024 * 1024) as usize; // 16 MiB

Review comment: please add a comment to explain why you are doing this

current_offset.try_into().unwrap(),
)
};
for i in (0..len).step_by(step) {

Review comment: same, please document this a bit more :) (i know it wasn't documented before :)

@sylvestre

thanks
did you do some benchmarking to see the impact on perf?
https://github.com/sharkdp/hyperfine/ is great for this
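
For example, a comparison against the system cp could look something like this (file names and the binary path are just placeholders):

```sh
hyperfine --warmup 1 \
  'cp big.bin /tmp/out.bin' \
  './target/release/coreutils cp big.bin /tmp/out.bin'
```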

@neyo8826

More docs :)

I have "benchmarked" it with my large file (technically not sparse, but heuristic triggered it I suppose).
I was slower even before implementing the chunking because now it actually writes the whole file (thank god for the checksumming in my build tool, that's how I discovered the bug).
After adding the chunking, the performance stayed the same, it was IO saturated (GCP high perf SSD)
For small files it should behave the same with 1 read and write call.

@sylvestre

any idea of the impact on a slow hard drive?

@neyo8826

Well, if the copy is between 2 HDDs, then it shouldn't matter.
If they are on the same drive, there will be more head movement, and the drive's write cache will matter a lot (8 MB to 1 GB?).
We do not sync manually, so even the FS cache will help.
I would like to see a proper benchmark with both consumer and enterprise HDDs, but it is not possible for me right now.

16 MiB seems like a lot; it should balance out anything with the FS's help (just my opinion).


GNU testsuite comparison:

GNU test failed: tests/cp/sparse-2. tests/cp/sparse-2 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/cp/sparse-extents-2. tests/cp/sparse-extents-2 is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)


GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)


GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
