Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resize2fs corrupts extents on ext4 filesystem (offline shrinking) #146

Open
viavi-ab opened this issue May 25, 2023 · 5 comments
Open

resize2fs corrupts extents on ext4 filesystem (offline shrinking) #146

viavi-ab opened this issue May 25, 2023 · 5 comments

Comments

@viavi-ab
Copy link

viavi-ab commented May 25, 2023

I have come across a situation where resize2fs corrupts an ext4 filesytem and I can reproduce it reliably.

I have captured an image file with "e2image -r" prior to the resize and have put together a script that reproduces the bug on 1.47.0 (I can't seem to upload the large file to github, so I have cloned this repo and committed my test files at https://github.com/viavi-ab/e2fsprogs/tree/master/ab_test).

The ext4 image that I'm resizing is from a working server that, prior to the resize, had no filesystem problems.

This is an offline resize (shrink) of a large (2TB) filesystem containing a single (very large) sparse file scattered across ~8000 extents (the original image was sanitized, all other files removed, and then zerofree'ed for better compression and to hide data that I'm not allowed to share).

My script goes like this:

e2fsck -f -p work.img
resize2fs work.img 1500000000k
e2fsck -f -p work.img

The filesystem passes the first e2fsck, resize2fs reports no problems, yet the second e2fsck complains (amongst other things) about:

Inode 12, end of extent exceeds allowed value
        (logical block 2507128, physical block 640097, len 1)

Inode 12 has an invalid extent
        (logical block 2507128, invalid physical block 389065080, len 1)

Inode 12, end of extent exceeds allowed value
        (logical block 2515444, physical block 648221, len 1)

Inode 12 has an invalid extent
        (logical block 2515444, invalid physical block 389073396, len 1)

Bug does not depend on the presence of other files or the amount of free space in my image, but seems to be related to the particular dispersion of extents of that particular file.

I think what is happening is that as resize2fs is relocating data blocks, it somehow trips on some of the extents resulting in the situation shown above (note how the two logical blocks are duplicated; also, the large physical block numbers were once within the limits of the filesystem, but after resize they have become invalid).

I may be wrong, but looking at the extents allocated to the file, it seems like it has two particular interior extents (level 1) that, after resize, end up with no leaf extents (level 2).

Bug was initially found on e2fsprogs 1.45.6 (stock ubuntu 22 package) but can be reproduced on fresh binaries compiled from the latest e2fsprogs source (1.47.0).

Rearranging the extents of the file inside the image seems to work around the problem; for example, any of these will allow resize2fs to do its work without corrupting the filesystem:

  • Mount the image, duplicate the file, overwrite the original ("cp file file.new && mv file.new file"). I believe this causes the extent tree of file to be replaced by a fresh one, probably tidier.
  • Run e2fsck with "-E optimize_extents" prior to resize2fs.

If the bug is in the way resize2fs handles the extents, however, neither of these completely rules out the possibility of corrupting the filesystem anyway.

@viavi-ab
Copy link
Author

The ext4 image is available at https://github.com/viavi-ab/e2fsprogs/blob/master/ab_test/sanitized_e2.img.zst (~90MB).

The script that reproduces the bug is sanitized_reproducer (requires unzstd to extract the raw image; uncompressed image is a 1.8TB sparse file).

Attached are the output of debugfs stat and extents commands before and after the resize:

ab_test.zip

@antmat
Copy link

antmat commented May 30, 2023

Hi, I believe that this is connected to #145, I'm not a maintainer and don't have working solution, but maybe you could take a look at it.

@viavi-ab
Copy link
Author

Hi @antmat, I think I've found the bug and am putting together a patch.
What’s the preferred way of submitting a patch for review?

@antmat
Copy link

antmat commented Jun 21, 2023

Hi @antmat, I think I've found the bug and am putting together a patch. What’s the preferred way of submitting a patch for review?

Hi! Unfortunately I'm not a maintainer of this project - I'm just experiencing related issue. I believe @tytso could help. Maybe the right way is to apply patch on kernel mailing list - I don't know ¯_(ツ)_/¯

viavi-ab added a commit to viavi-ab/e2fsprogs that referenced this issue Jun 23, 2023
This patch modifies the logic of ext2fs_extent_set_bmap() so that:

- Handling of "only block in extent" special case is removed
  - This affects files during a shrink with resize2fs; for example, when a file
    has an extent of 100 blocks and all 100 are relocated, the current
    (unpatched) code does this:
    - (ref.: inode_scan_and_fix() -> ext2fs_block_iterate3() -> process_block()
      which remaps each block and returns BLOCK_CHANGED to
      ext2fs_block_iterate3(), which then calls ext2fs_extent_set_bmap() to
      update the extent tree).
    - A new extent is inserted for the same lblk (because the pblk has moved),
      and the original extent is pushed back/shrunk by 1 block (start pblk is
      incremented by 1, e_len reduced by 1).
    - All other blocks in the extent (except the last one) are appended to the
      newly-inserted extent, and the original extent is pushed back/shrunk
      accordingly (one block at a time).
    - When it's time to process the last remaining block, the code notices an
      extent of e_len = 1 (the last one remaining with the old pblk) and
      "optimizes" this by rewriting it in place (instead of appending it to the
      new extent).
    - The end result is that *every* file that has an extent with e_len > 1 is
      *always* split into two extents (the second one consisting of a single
      block).
  - I think the intention here was to make the code more efficient by making the
    update in place, but the way resize2fs calls this function practically
    causes the number of extents to double up during a shrink; in my testcase a
    large file with ~8300 extents before the resize ends up with ~8200 after
    resize (compared with ~15k extents with the unpatched code).

- More preference is given to growing the previous extent's end, by swapping two
  if blocks
    - This helps compacting the extents during resize2fs (swapping the blocks is
      necessary because of the removal above).
  - When an existing extent is shrunk and its e_len becomes zero, it is deleted.
    - I am not clear if an extent with length == 0 is valid; judging from
      https://www.mail-archive.com/ptxdist@pengutronix.de/msg20364.html,
      probably not.
  - Without this change, "extent.e_len += (extent.e_lblk - start)" in
    ext2fs_extent_fix_parents() may overflow and throw the end of the extent way
    outside the size of the filesystem. Check with the e2 image referenced in this bug.

In summary, this patch fixes tytso#146 and
optimizes the allocation of extents after a shrinking resize2fs.

        Signed-off-by: Andrea Biardi <andrea.biardi@viavisolutions.com>
@viavi-ab
Copy link
Author

Apologies @antmat , I thought you were involved in the project. I emailed @tytso directly.
In the meantime I've pushed my patch to https://github.com/viavi-ab/e2fsprogs/tree/master
It solves my problem with #146, perhaps you can check if it solves your #145 too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants