systemd zvol target #9016

Open
aerusso wants to merge 4 commits into master from pulls/systemd-zvols

Conversation

@aerusso (Contributor) commented Jul 11, 2019

Motivation and Context

This is an alternative to #8975, implementing granular zfs-volume@.service units for each zvol.

Description

Introduce a zfs-volumes.target, which Requires= zfs-volume@.service units that are dynamically generated for each zvol by the existing systemd-generator infrastructure at boot time.

Each unit polls once per second and exits successfully as soon as it detects the zvol link; it also succeeds unconditionally after a 30-second timeout.
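For illustration, a generated per-zvol unit along these lines might look roughly like the sketch below (the ExecStart line matches the version discussed later in this review; the Description= and Type= lines are illustrative assumptions, not necessarily the exact unit in this PR):

[Unit]
Description=ZFS volume %I
DefaultDependencies=no

[Service]
Type=oneshot
# Poll for the /dev/zvol link once per second; give up (but still exit 0) after 30 seconds.
ExecStart=/bin/sh -c 'timeout=30 ; while [ $timeout -gt 0 ] ; do [ -e "/dev/zvol/$0" ] && exit 0 ; sleep 1 ; timeout=$((timeout-1)); done ; echo "zvol $0 unavailable"' %I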

How Has This Been Tested?

It builds. The target takes 30 seconds to start if a zvol link is missing, and starts within a second of the link being created. Appropriate systemd links were created after a systemctl daemon-reload. The modified version of zfs-list.cache did not interfere with mismatched versions of zfs-mount-generator.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

codecov bot commented Jul 11, 2019

Codecov Report

Merging #9016 (9fde4ad) into master (161ed82) will increase coverage by 3.92%.
The diff coverage is n/a.

❗ Current head 9fde4ad differs from pull request most recent head f34b249. Consider uploading reports for the commit f34b249 to get more accurate results


@@            Coverage Diff             @@
##           master    #9016      +/-   ##
==========================================
+ Coverage   75.17%   79.10%   +3.92%     
==========================================
  Files         402      419      +17     
  Lines      128071   123696    -4375     
==========================================
+ Hits        96283    97846    +1563     
+ Misses      31788    25850    -5938     
Flag Coverage Δ
kernel 79.75% <ø> (+0.99%) ⬆️
user 66.77% <ø> (+19.34%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
module/os/linux/spl/spl-vmem.c 54.16% <0.00%> (-45.84%) ⬇️
include/os/linux/zfs/sys/trace_acl.h 33.33% <0.00%> (-33.34%) ⬇️
cmd/zdb/zdb_il.c 30.43% <0.00%> (-23.27%) ⬇️
module/unicode/u8_textprep.c 10.90% <0.00%> (-22.67%) ⬇️
include/os/linux/zfs/sys/trace_arc.h 85.71% <0.00%> (-14.29%) ⬇️
module/os/linux/zfs/spa_misc_os.c 48.27% <0.00%> (-12.44%) ⬇️
lib/libzfs/os/linux/libzfs_mount_os.c 61.94% <0.00%> (-11.89%) ⬇️
module/zfs/zrlock.c 86.15% <0.00%> (-10.57%) ⬇️
module/zfs/zfs_onexit.c 76.19% <0.00%> (-9.86%) ⬇️
cmd/zed/zed_file.c 33.80% <0.00%> (-8.42%) ⬇️
... and 358 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e410130...f34b249.

@pzakha (Contributor) left a comment

I did some scalability testing with this code.

Not related to this code, but I've noticed that /usr/lib/systemd/system-generators/zfs-mount-generator introduced in #7329 is not a valid path for generators on Ubuntu, so I had to move this to /lib/systemd/system-generators/zfs-mount-generator to have this work. I'll file a separate bug for that.

I've tested this on the same system as for #8975, which had ~1000 zvols. I've noticed that many of the units (269 in total) timed out with this kind of message:
sh[6709]: zvol domain0/VOLS/zvol499 unavailable
We could probably extend the timeout if needed.

One scalability issue is that this feature requires enabling history_event-zfs-list-cacher.sh, which does a zfs list on each dataset addition or removal; this may not be practical on systems with a large number of datasets. There are also scalability issues with systemd itself, but that's another matter.
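For context, the cacher zedlet is enabled roughly as described in the zfs-mount-generator man page; the service name and paths below vary by distribution and the pool name is a placeholder:

# Enable the ZEDLET and seed a cache file for each pool
ln -s /usr/libexec/zfs/zed.d/history_event-zfs-list-cacher.sh /etc/zfs/zed.d/
mkdir -p /etc/zfs/zfs-list.cache
touch /etc/zfs/zfs-list.cache/poolname
systemctl restart zfs-zed.service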

There is definitely value in this feature for systems with a smaller number of zvols, so I think it's reasonable to keep this as an opt-in feature for now. I believe this feature should be able to coexist with #8975, which doesn't provide granularity but does provide scalability and a stand-alone script to wait for zvols to be ready (it can be called after a pool has been imported manually). To keep the naming a bit more consistent, I can rename zfs-volume.target to zfs-volumes.target.

DefaultDependencies=no

[Service]
ExecStart=sh -c 'timeout=30 ; while [ $timeout -gt 0 ] ; do [ -e "/dev/zvol/$0" ] && exit 0 ; sleep 1 ; timeout=$((timeout-1)); done ; echo "zvol $0 unavailable"' %I
Contributor

Should be /bin/sh. systemd requires a full path for ExecStart commands (at least on Ubuntu).

Contributor Author

The man page (in Debian and Ubuntu) says that "It is thus safe to use just the executable name in case of executables located in any of the "standard" directories". It does recommend absolute paths to "avoid ambiguity". Could you confirm that this is actually broken on Ubuntu? If there's even one example where this could be a problem, I think it's worth adjusting.

Contributor

Omfg. Ubuntu 12.04, 18.10, and 19.04 allow non-absolute paths, while 14.04, 16.04, and 18.04 require absolute paths (they fail with "Executable path is not absolute, ignoring: sh -c"). WTF are they doing...

In short: yes, ExecStart must be absolute here.

Contributor

Thanks for testing this @c0d3z3r0. Sorry I missed your comment @aerusso. This is indeed madness...

Contributor Author

@c0d3z3r0 Strange! Thanks for testing, I've updated this to use /bin/sh.

@behlendorf added the "Status: Code Review Needed" label (Ready for review and testing) on Jul 12, 2019
@aerusso (Contributor Author) commented Jul 13, 2019

Thanks for looking at this @pzakha!

Not related to this code, but I've noticed that /usr/lib/systemd/system-generators/zfs-mount-generator introduced in #7329 is not a valid path for generators on Ubuntu, so I had to move this to /lib/systemd/system-generators/zfs-mount-generator to have this work. I'll file a separate bug for that.

That's a packaging issue as far as I can tell. The Debian source (which I've made sure stayed synchronized) has the correct settings (in dh_auto_configure). It might make sense to provide more sensible defaults, but the feeling I've gotten on these kinds of issues is that they should be solved by packagers (though I can see merit in zfsonlinux doing distribution-specific builds, I've decided to draw the battle lines at native Debian packaging).

I've tested this on the same system as for #8975, which had ~1000 zvols. ...

I was worried about this, actually. Earlier versions of the zed scriptlet tried to update only the affected datasets. We dropped that at the time because it was getting complicated. Perhaps it can be revisited, given the demonstrated need.

To keep the naming a bit more consistent, I can rename zfs-volume.target to zfs-volumes.target

I feel like zfs-volumes.target makes more sense, so I think that's for the best!

@behlendorf (Contributor)

@aerusso @pzakha I think we should strongly consider pulling this change and #8975 into a single PR. This would let us easily resolve any discrepancies; for example, we'll need to settle on zfs-volumes.target or zfs-volume.target. Additionally, we need to decide whether we want this new target to be installed with WantedBy=zfs.target. That seems reasonable to me, but in this regard the two PRs still differ.

My suggestion would be to lay this out the same way as the zpool-import.target, so that either zfs-volume-wait.service or zfs-volume@.service could be used by zfs-volumes.target. Defaulting to zfs-volume-wait.service makes the most sense to me, since it requires no additional configuration; so does disabling zfs-volumes.target, at least initially. So:

enable zfs-volume-wait.service
disable zfs-volume@.service
disable zfs-volumes.target 

Does that sound like a reasonable way forward for both of these PRs?

@pzakha (Contributor) commented Jul 15, 2019

@aerusso @pzakha I think we should strongly consider pulling this change and #8975 into a single PR. [...] Does that sound like a reasonable way forward for both of these PRs?

I was planning to make two changes to #8975 to keep it consistent with this PR:

  1. Rename zfs-volume.target to zfs-volumes.target.
  2. Add WantedBy=zfs.target to zfs-volumes.target (but keep the target disabled).

I think given those changes, this PR could be a natural follow-up, as bundling them together in a single review might delay integration of #8975. @aerusso what do you think?

@pzakha (Contributor) commented Jul 29, 2019

Not related to this code, but I've noticed that /usr/lib/systemd/system-generators/zfs-mount-generator introduced in #7329 is not a valid path for generators on Ubuntu [...]

That's a packaging issue as far as I can tell. The Debian source (which I've made sure stayed synchronized) has the correct settings (in dh_auto_configure). [...]

Yes, you are right. It looks like we do not explicitly set --with-systemdgeneratordir; setting it should fix this problem.

@pzakha (Contributor) left a comment

This looks fine in my preliminary testing, although I'm wondering if we should extend the default timeout of 30 seconds.

@@ -46,6 +46,8 @@ fi
# avoid regressions, this dependency is reduced to "wants" rather than
# "requires". **THIS MAY CHANGE**
req_dir="${dest_norm}/local-fs.target.wants/"
zvol_reqdir="${dest_norm}/zfs-volumes.target.requires"
Contributor

Now that we've integrated #8975, do you think we still need to add an additional Requires= on zfs-volumes.target?

Contributor Author

I believe so (enabling the zfs-volume@ units should presumably allow the user to disable zfs-volume-wait.service).

I feel like I may be misunderstanding something.

Contributor

My original idea was to only use zfs-volume-wait.service as a dependency for zfs-volumes.target, but that's mostly because using it along with zfs-volume@ as a dependency was redundant. Disabling zfs-volume-wait.service when we are using zfs-volume@ would be another way to do this.

@aerusso (Contributor Author) commented Jul 31, 2019

I've rebased on master and implemented the changes. I have not tested how it interacts with #8975, but I've disabled the template units by default. Presumably, we should add some documentation about disabling zfs-volume-wait.service and enabling zfs-volume@.service somewhere. Maybe the zfs-mount-generator man page?

@pzakha (Contributor) commented Jul 31, 2019

Presumably, we should add some documentation about disabling zfs-volume-wait.service and enabling zfs-volume@.service somewhere. Maybe the zfs-mount-generator man page?

That sounds good to me. I think we may also want to mention the trade-offs between the two implementations, especially regarding scalability.

I'm also wondering if the 30-second timeout is too aggressive. If zfs-volume@.service reports success when the zvol is not actually ready, that could affect whatever depends on that zvol link being present.

@behlendorf (Contributor)

@aerusso @c0d3z3r0 what are your thoughts on moving this PR forward?

@@ -4,6 +4,8 @@ disable zfs-import-scan.service
enable zfs-import.target
enable zfs-mount.service
enable zfs-share.service
disable zfs-volumes.target
disable zfs-volume@.service
Contributor

Not sure if that is needed, since the service only comes into effect when some zfs-volume@volname is enabled.

Contributor Author

I will gladly enable this if this is the consensus. We seem to be moving forward very cautiously with this zfs-mount-generator project, and I don't want to be too aggressive.

Contributor

I think it would be reasonable to enable this and the zfs-mount-generator by default in the next tagged release. Why don't we go ahead and merge this while it's disabled, then open a new PR to enable the generators by default? This way it'll be enabled in the master branch by default well before anything is tagged.

Contributor

I'm OK with the target being disabled, but disabling the template is nonsense, as it does not disable anything ;-)

Contributor

So line 8 can be dropped, but line 7 must remain.

Contributor

Ah, one more thing... zfs-volume@.service isn't even an installable unit, so there is no need for the disable line, because it can't be enabled anyway.

Contributor Author

Yeah, disable zfs-volume@*.service might parse (the documentation on these preset files is pretty sparse), but it's also unnecessary, since the unit would only exist if you were trying to use this.

Also, line 7 should go. I added that disable out of an abundance of caution (not wanting to create and enable a feature at the same time). This patch doesn't create that .target anymore, so I'm not going to disable it.

etc/systemd/system/zfs-volume@.service.in (outdated review thread, resolved)
etc/systemd/system-generators/zfs-mount-generator.in (outdated review thread, resolved)
DefaultDependencies=no

[Service]
ExecStart=/bin/sh -c 'timeout=30 ; while [ $timeout -gt 0 ] ; do [ -e "/dev/zvol/$0" ] && exit 0 ; sleep 1 ; timeout=$((timeout-1)); done ; echo "zvol $0 unavailable"' %I
Contributor

This is systemd, no need for such a complicated construct :-)

BindsTo=%i.device
After=%i.device
Conflicts=shutdown.target  # maybe not needed, needs testing

Contributor Author

Wow, that's a lot nicer!

I'm not super familiar with how the timeout on this works. If the device is slow to show up, will this give up and enter a failed state, or remain in a waiting state forever? Is this what JobTimeoutSec= does (and if so, when does the job become "queued")? The other timeout setting I see in man systemd.unit seems to apply only once the job has fired. Is my understanding correct?

Contributor

Sorry for my late response, I will have a look again later today.

Contributor

JobTimeoutSec= is there for a) setting a timeout other than the default (infinity) for normal units and b) overriding the default device unit timeout (which is specified by DefaultTimeoutStartSec=). In our case b) applies, and we should not modify that default upstream. DefaultTimeoutStartSec= is 90s on Debian, for example.

When the timeout is reached, the unit's job gets aborted and the unit enters the "failed" state. What happens then depends on the dependencies specified by Wants= and Requires=. If the zvol target required all zvol@.service units and the zvol target were in turn required by multi-user.target, systemd would completely refuse to boot and go to recovery mode. That is not what we want, but since zfs-volumes.target is only WantedBy=zfs.target, it only gets marked as failed.

@c0d3z3r0 (Contributor)

Currently it seems the generator does not create the target directory, but I don't know why yet.

@behlendorf added the "Status: Work in Progress" label (Not yet ready for general review) on Nov 13, 2019
@aerusso (Contributor Author) commented Nov 16, 2019

@c0d3z3r0 Could you explain what you mean by it "does not create the target directory"? Are you calling the generator manually? This was not something I ever experienced during testing.

@c0d3z3r0 (Contributor)

@c0d3z3r0 Could you comment what you mean it "does not create the target directory"? Are you calling the generator manually? This was not something I ever experienced during testing.

Sorry, that was unclear - I meant the zvol_reqdir doesn't get created

@aerusso (Contributor Author) commented Nov 16, 2019

@c0d3z3r0 I'm very confused how that's possible. Is the local-fs.target.wants being populated with the links to the mount files? If you're calling it manually, are you passing /run/systemd/generator as its first argument?

@c0d3z3r0 (Contributor)

@c0d3z3r0 I'm very confused how that's possible. Is the local-fs.target.wants being populated with the links to the mount files? If you're calling it manually, are you passing /run/systemd/generator as its first argument?

not calling it manually; I will do a fresh zfs setup later and test this again

@aerusso (Contributor Author) commented Nov 16, 2019

@c0d3z3r0 I actually think I have a simpler way to do all of this. Let me get that up before we waste time testing an older patch.

@c0d3z3r0 (Contributor)

@c0d3z3r0 I actually think I have a simpler way to do all of this. Let me get that up before we waste time testing an older patch.

great, lmk when it's ready for testing :)

@aerusso (Contributor Author) commented Nov 16, 2019

@c0d3z3r0 Take a look at this version; I think you will be pleased. It drops the extra template unit entirely, opting just to Require=dev-zvol-%i.device directly (via a drop-in file). Cursory testing on my end shows that JobTimeoutSec=30 should match the existing behavior.

I also was unable to reproduce any issues creating the dependency directory here.
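For illustration, one plausible shape of what the generator emits per zvol under this approach (my reading of the description above, not necessarily the exact files in this revision; rpool/vol1 is a made-up dataset):

# The generator adds a Requires= dependency on the device unit by creating a
# symlink named dev-zvol-rpool-vol1.device inside zfs-volumes.target.requires/
# (the zvol_reqdir mentioned earlier), and bounds how long systemd waits for
# the device with a drop-in along these lines:
#
# dev-zvol-rpool-vol1.device.d/timeout.conf
[Unit]
JobTimeoutSec=30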

@c0d3z3r0 (Contributor)

@aerusso hah, very nice solution! congrats :)) LGTM

@pzakha (Contributor) commented Nov 18, 2019

It seems like we are now depending on .device units being automatically created by systemd for each zvol. I was curious to see how this works in practice.

I've manually created a new zvol rpool/zvol1 and looked for dev-zvol-rpool-zvol1.device, but it wasn't there. I've tried adding Requires=dev-zvol-rpool-zvol1.device to the zfs-volumes.target unit file, and that resulted in zfs-volumes.target being inactive on reboot, with systemd[1]: Dependency failed for ZFS volumes are ready.

Is there something else required to have .device files created? I do see that /dev/zvol/rpool/zvol1 exists, and I do see .device files for other devices such as /dev/xvda --> dev-xvda.device. Are we missing some udev rules?

@aerusso (Contributor Author) commented Nov 19, 2019

@pzakha I'm running systemd 243 (Debian unstable). I was worried I was going to have to do something, but creating the zvol "just worked for me."

In 99-systemd.rules, I found

SUBSYSTEM=="block", TAG+="systemd"

and, man systemd.device claims that:

systemd will dynamically create device units for all kernel devices that are marked with the "systemd" udev tag (by default all block and network devices, and a few others).

So this is supposedly "default" systemd behavior; does your distribution's man page reflect that?
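If it helps with reproducing this, a couple of standard commands for checking whether systemd and udev see a zvol (nothing here is specific to this PR; the zvol name is from pzakha's example):

# List the device units systemd currently knows about for zvols
systemctl list-units --type=device 'dev-zvol-*'
# Confirm udev has tagged the zvol for systemd
udevadm info --name=/dev/zvol/rpool/zvol1 | grep -E 'TAGS|SYSTEMD'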

@jgallag88 (Contributor)

@pzakha Were you testing this on a Delphix Engine? We disabled .device units for zvols because they were causing us to hit the hardcoded systemd limit for number of units on a system: delphix/delphix-platform#80

@pzakha (Contributor) commented Nov 19, 2019

Okay, thanks for the clarifications. I was not aware we had it disabled explicitly in the product, so sorry for the confusion.

So given that udev seems to already provide per-zvol units, which I believe was the main selling point for the original PR, what other advantages do we get with this new approach?

zfs-volume-wait.service already waits on zvols to be ready, and filters out zvols that are not expected to have links (there are several corner cases). From my understanding, once we ship this PR it will be redundant with zfs-volume-wait.service, and one of them would have to be removed.

@pzakha (Contributor) commented Nov 19, 2019

So in this case, the idea would be to manually edit the FSLIST to include only the zvols that the administrator cares about and not use history_event-zfs-list-cacher.sh, right?

Edit: The alternative would be to add the following statements to whatever service needs those zvols (which should work without this PR):

Requires=<path-to-zvol1>.device
Requires=<path-to-zvol2>.device
...
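For reference, the .device unit name for a given zvol link can be derived with systemd-escape (standard systemd tooling, not something added by this PR):

systemd-escape --path --suffix=device /dev/zvol/rpool/zvol1
# prints: dev-zvol-rpool-zvol1.device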

@aerusso (Contributor Author) commented Nov 19, 2019

@pzakha

So in this case, the idea would be to manually edit the FSLIST to include only the zvols that the administrator cares about and not use history_event-zfs-list-cacher.sh, right?

This is not my intention with this PR. The idea is to have a "proper" integration into systemd that "does the right thing" automatically. I read through your changes to zvol_wait, and this is not yet at feature parity (I'll have to fix that before it gets merged). In particular, it doesn't check redact_snaps.

The way I see it, if people want to fine-tune their systemd unit dependencies, they should depend on specific dev-zvol-*.device units. If they can't be bothered to do that, they can just depend on the blanket zfs-volumes.target. We need the zfs-mount-generator framework for the other features it provides, so ideally we push zfs-volumes.target management into the mount generator and drop the zvol_wait infrastructure once all the kinks in this have been worked out (much later on).
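For example, a unit that just needs all of the configured zvol links to be present could order itself after the blanket target (a sketch of the intended usage, not text from this PR):

[Unit]
# Pull in and wait for the target that tracks all known zvol links.
Wants=zfs-volumes.target
After=zfs-volumes.target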

@pzakha (Contributor) commented Nov 19, 2019

Okay, that makes more sense. Once we bring this to feature parity (excluding the right zvols), it would make sense to remove zfs-volume-wait.service. We may also want to extend the timeout to more than 30 seconds at that point, as in practice it should not be required anyway.

Keeping the zvol_wait script might still be worthwhile for cases where a pool is imported after boot has completed and we want to wait for its zvols to be ready.

This will still have the disadvantage of not scaling well, but I think it should be fine for most folks, and others can keep using zvol_wait.

@Rudd-O (Contributor) commented Apr 14, 2020

Can the code, instead of using polling alone, use a combination of inotify/fanotify and polling, or perhaps just inotify/fanotify alone, since it knows ahead of time which volumes are supposed to appear?

Does this work well with ZFS encryption of subvolumes?

@behlendorf (Contributor)

@aerusso if you still want to move forward with this tighter systemd integration, can you get this rebased and add the missing functionality?

@aerusso (Contributor Author) commented Nov 20, 2021

@behlendorf I'm going to wait for #12138 to land, rather than force it to rebase on this patch (or lose the race and then rebase this a second time).

@behlendorf (Contributor)

@aerusso so now that #12138 has landed, can I talk you into dusting off this PR so we can move forward with it?

@aerusso (Contributor Author) commented Feb 13, 2022

So, I have a (local, untested) version that is rebased on the new C version of zfs-mount-generator, with feature parity to what's currently here. However, it does NOT have feature parity with the mainlined zvol_wait. In particular, the following filter is going to be tricky:

list_zvols() {
	read -r default_volmode < /sys/module/zfs/parameters/zvol_volmode
	zfs list -t volume -H -o \
	    name,volmode,receive_resume_token,redact_snaps |
	    while IFS="	" read -r name volmode token redacted; do # IFS=\t here!

		# /dev links are not created for zvols with volmode = "none"
		# or for redacted zvols.
		[ "$volmode" = "none" ] && continue
		[ "$volmode" = "default" ] && [ "$default_volmode" = "3" ] &&
		    continue
		[ "$redacted" = "-" ] || continue

		# We also ignore partially received zvols if it is
		# not an incremental receive, as those won't even have a block
		# device minor node created yet.
		if [ "$token" != "-" ]; then

			# Incremental receives create an invisible clone that
			# is not automatically displayed by zfs list.
			if ! zfs list "$name/%recv" >/dev/null 2>&1; then
				continue
			fi
		fi
		echo "$name"
	done
}

The problem is that /sys/module/zfs/parameters/zvol_volmode won't be (guaranteed to be) available at systemd generator time. And I really don't think adding snapshot-receive events to the already overburdened history-cacher zedlet is a good idea (and I have a suspicion I'd miss at least one weird corner case).

I'd like to bounce the following proposal off people: refactor zvol_wait so it can act on a single dataset (or on all datasets by default, with no command line options), then call that from a systemd unit (sketched below). Unfortunately, this gives up a lot of the beautiful simplicity we achieved in the last version (where everything was handled nicely by systemd). We also can't use zvol_wait (AFAICT) just to filter the units: there is no systemd "ConditionExecSuccess".
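To make that concrete, the per-zvol unit could end up looking something like the sketch below; the single-dataset zvol_wait invocation is exactly the part being proposed, so its syntax is hypothetical, and @sbindir@ just follows the convention of the existing .service.in files:

[Unit]
Description=Wait for ZFS volume %I
DefaultDependencies=no

[Service]
Type=oneshot
# Hypothetical single-dataset mode of zvol_wait, as proposed above.
ExecStart=@sbindir@/zvol_wait %I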

@aerusso force-pushed the pulls/systemd-zvols branch 2 times, most recently from fd6bb46 to 80452d5 on February 19, 2022
Do not link the pthreads library into zfs-mount-generator

Signed-off-by: Antonio Russo <aerusso@aerusso.net>
zvol_wait provides a mechanism to check whether a zvol has been made
available.  By design, it only acts on zvols that are expected to become
ready in the normal course of operation.

This functionality is embedded inside an inner loop, making it unusable
for other purposes.

This patch isolates that logic and moves it into a common function.

Signed-off-by: Antonio Russo <aerusso@aerusso.net>
This extends zvol_wait to be able to wait for a specific zvol.

Signed-off-by: Antonio Russo <aerusso@aerusso.net>
The zfs-mount-generator is used to produce granular systemd mount
units that integrate the mounting of zfs datasets into systemd.

Extend the zfs-mount-generator to also inform systemd about the
initialization of zvols.

Signed-off-by: Antonio Russo <aerusso@aerusso.net>
@behlendorf (Contributor) left a comment

@pzakha @c0d3z3r0 it'd be great if you could take another look at this.

My feeling is that relying on the updated version of zvol_wait is still a step in the right direction, and it'll be nice to have this functionality available, even if it adds back some of the complexity.

@pzakha (Contributor) commented Feb 28, 2022

I took a quick glimpse and the new logic seems to make sense to me. I'll try to do a full review later this week.

Labels: Status: Code Review Needed (Ready for review and testing), Status: Work in Progress (Not yet ready for general review)
Projects: none yet
7 participants