Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow zhack label repair to restore detached devices. #14773

Merged
merged 1 commit into from
May 3, 2023

Conversation

buzzingwires
Copy link
Contributor

@buzzingwires buzzingwires commented Apr 20, 2023

Motivation and Context

This change is intended to address #4187 and expand upon the zhack label repair command implemented in d04b5c9 with the ultimate goal of allowing the zpool of a detached device to be rendered reimportable. Additionally, it allows zhack label repair to handle block devices and other media, as well as sizes which are not divisible by the size of a label.

Description

The concepts in Jeff Bonwick's original code at https://gist.github.com/jjwhitney/baaa63144da89726e482 are given a modernized reimplementation. Using the -u option, uberblocks are regenerated using the birth TXG of the root block pointer, and the nvlist is also set to reference this TXG. The previous checksum repairing functionality is retained with the -c option, or as a default behavior if no options are specified. Both options may be run together in the same command, for both functionalities. fstat64_blk is also used to determine the size of block devices and other media, so the code will work on them, in addition to normal files.

How Has This Been Tested?

The existing test suite for the zhack label repair command has been rewritten to have four test types.

  1. resembles the original test and determines if corrupted checksums can be successfully repaired.
  2. tests if a detached device can be made reimportable.
  3. is a combination of the last two tests, which corrupts checksums on a detached device, before attempting repair.
  4. is like the third test, but ensures the -c and -u options work when used in the same command, rather than separate ones.

Additionally, each test ensures that -c will not undetach a device, and that -u will not proceed with corrupted checksums.

Each test runs on a device of randomly determined size. Approximately half the time, the standard minimum device size will be used, and the other half, there will be a possible variation of 0 to 2^15-1 bytes more.

Lint and checkstyle have been run, as have the aforementioned tests.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@buzzingwires buzzingwires marked this pull request as ready for review April 22, 2023 03:20
@buzzingwires
Copy link
Contributor Author

buzzingwires commented Apr 22, 2023

As of the pull request first being opened, this code has seen a couple style revisions, the tests have been rewritten to use loopback devices instead of files, and fstat64_blk is now used to portably determine the size of a device or file. The tests specific to this code seem to pass without issue on the buildbot, though a some unrelated tests seem to fail.

Please let me know if there is anything else I need to do. This is my first ever pull request I've made anywhere, so a lot of the finer details are still unclear to me.

For future reference, does the 🚀 emoji have any special significance as a reaction? I've become aware that emojis on GitHub often do in this context, but am unaware of what norms the ZFS community may follow.

And, should successive changes after opening the pull request be in the form of extra commits, or should they just be squashed and force-pushed back to origin, as I have now?

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking up this work and opening this PR.

Instead of files, and fstat64_blk is now used to portably determine the size of a device or file. The tests specific to this code seem to pass without issue on the buildbot, though a some unrelated tests seem to fail.

We do have a couple unrelated failures in the CI you can safely ignore those.

Please let me know if there is anything else I need to do. This is my first ever pull request I've made anywhere, so a lot of the finer details are still unclear to me.

The next steps here are for reviewers to go over the PR and provide some feedback for discussion. I've gone ahead and made a first round of comments to get the ball rolling.

For future reference, does the rocket emoji have any special significance as a reaction?

Not really, we don't use the emoji's as part of the official workflow. However, folks do use them as a way to express interest in a change.

And, should successive changes after opening the pull request be in the form of extra commits, or should they just be squashed and force-pushed back to origin, as I have now?

Please go ahead and squash the commits and force update. This will need to happen as a final step before any change is integrated anyway so it's helpful to do frequently.

cmd/zhack.c Outdated Show resolved Hide resolved
cmd/zhack.c Outdated Show resolved Hide resolved
cmd/zhack.c Outdated Show resolved Hide resolved
cmd/zhack.c Outdated Show resolved Hide resolved
cmd/zhack.c Show resolved Hide resolved
man/man1/zhack.1 Outdated Show resolved Hide resolved
cmd/zhack.c Outdated Show resolved Hide resolved
cmd/zhack.c Outdated Show resolved Hide resolved
cmd/zhack.c Outdated Show resolved Hide resolved
This commit expands on the zhack label repair command in d04b5c9 by
adding the -u option to undetach a device by regenerating uberblocks,
in addition to the existing functionality of fixing checksums, now
represented by -c. Previous behavior is retained in the case of no
options.

The changes are heavily inspired by Jeff Bonwick's labelfix
utility, as archived at:

https://gist.github.com/jjwhitney/baaa63144da89726e482

Additionally, it is now capable of properly determining the size of
block devices and other media, as well as handling sizes which are
not divisible by 2^18. This should make it viable for use on physical
devices and partitions, in addition to files.

These changes should make it possible to import zpools that have had
their uberblocks erased, such as in the case of pools rendered
inaccessible by erroneous detach commands.

Signed-off-by: buzzingwires <buzzingwires@outlook.com>
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels May 2, 2023
@buzzingwires
Copy link
Contributor Author

Looks good!

I'm glad! Thank you so much for all your guidance over the last couple weeks. Please let me know if there's anything else you need.

@behlendorf behlendorf merged commit a46001a into openzfs:master May 3, 2023
andrewc12 pushed a commit to andrewc12/openzfs that referenced this pull request May 3, 2023
This commit expands on the zhack label repair command in d04b5c9 by
adding the -u option to undetach a device by regenerating uberblocks,
in addition to the existing functionality of fixing checksums, now
represented by -c. Previous behavior is retained in the case of no
options.

The changes are heavily inspired by Jeff Bonwick's labelfix
utility, as archived at:

https://gist.github.com/jjwhitney/baaa63144da89726e482

Additionally, it is now capable of properly determining the size of
block devices and other media, as well as handling sizes which are
not divisible by 2^18. This should make it viable for use on physical
devices and partitions, in addition to files.

These changes should make it possible to import zpools that have had
their uberblocks erased, such as in the case of pools rendered
inaccessible by erroneous detach commands.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: buzzingwires <buzzingwires@outlook.com>
Closes openzfs#14773
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants