Convey snapshot information #658

Merged — merged 8 commits from use-snapshot-information-46 into master on Sep 8, 2014

Conversation

@exarkun (Contributor) commented Sep 4, 2014

This partially addresses #46 - but does not completely resolve it.

Added here is a new flocker-volume snapshots command and a new method on IRemoteVolumeManager to invoke it. This allows the pushing side of an interaction to learn what snapshots exist on the receiving side of that interaction. A follow-up branch will use this information to generate incremental streams (using the functionality implemented in #657).
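For orientation, here is a minimal sketch of what such an interface method could look like, assuming a zope.interface-style declaration; the method name, signature, and docstring are illustrative guesses, not the actual code in this branch:

```python
from zope.interface import Interface


class IRemoteVolumeManager(Interface):
    """
    Operations that can be performed against a volume manager running on
    another node (illustrative subset only).
    """

    def snapshots(volume):
        """
        Return the names of the snapshots that exist on the remote node for
        the given volume's filesystem, e.g. by invoking
        ``flocker-volume snapshots`` over SSH and parsing its output.
        """
```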

This branch is based on #657. Review that first, then recompute the diff to review this one.

@wallrj (Contributor) commented Sep 5, 2014

Thanks @exarkun

I've only glanced at the code. It all looks good. What wasn't clear to me (yet) was what triggers a snapshot to be taken after the initial push... oh, I see that the snapshot occurs implicitly on every call to f.v.zfs.FileSystem.reader (sketched below).
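(For context, a rough sketch of that implicit-snapshot behaviour, assuming a context-manager style reader; the names and the use of ``zfs send`` here are illustrative, not Flocker's actual implementation:)

```python
from contextlib import contextmanager
from subprocess import PIPE, Popen, check_call


@contextmanager
def reader(filesystem, snapshot_name):
    """
    Take a new snapshot of ``filesystem`` and yield a file-like object
    streaming its contents (a ``zfs send`` stream).
    """
    snapshot = filesystem + b"@" + snapshot_name
    # The implicit snapshot being discussed: every call creates one
    # before the send begins.
    check_call([b"zfs", b"snapshot", snapshot])
    process = Popen([b"zfs", b"send", snapshot], stdout=PIPE)
    try:
        yield process.stdout
    finally:
        process.stdout.close()
        process.wait()
```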

What happens:

  • if the check_call([b"zfs", b"snapshot", snapshot]) raises CalledProcessError?
  • if the check_output([b"zfs"] + _list_snapshots_command(self)) fails?
  • if RemoteVolumeManager._destination.get_output([b"flocker-volume", ..."snapshots"...) fails?

Will sufficient information be logged (stderr) to allow us to debug the failures?
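One possible way to get that information into the logs (a sketch only, not what this branch does; the run_zfs helper is hypothetical): capture stderr when running the command and log it before re-raising.

```python
import logging
from subprocess import PIPE, CalledProcessError, Popen

logger = logging.getLogger(__name__)


def run_zfs(arguments):
    """
    Run a zfs subcommand, logging its stderr if it exits non-zero.
    """
    command = [b"zfs"] + arguments
    process = Popen(command, stdout=PIPE, stderr=PIPE)
    stdout, stderr = process.communicate()
    if process.returncode:
        logger.error(
            "zfs command %r failed with exit code %d: %s",
            command, process.returncode, stderr)
        raise CalledProcessError(process.returncode, command, output=stdout)
    return stdout
```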

And more generally, I was trying to imagine some failure scenarios, e.g. what would happen if a user, having modified their deployment.yml config, runs flocker-deploy to move an application but realises they've made a mistake: they wanted to move the application to a different node. So they kill flocker-deploy, correct the deployment config, and re-run it. Meanwhile the original phase 1 push:

  • may have completed.
  • may still be in progress.
  • may have failed.

I can't remember how a node knows that it is authoritative for a particular volume. Is it possible that two nodes might identify themselves as authoritative for a volume and attempt to send their data to the new recipient?

But I suspect those general problems (if they are problems) are addressed by the handoff in phase 2 (after stopping the application).

Anyway, that's all I can think of. Please address or answer the points above as you see fit and then merge so that you can continue with the final integration.

@exarkun (Contributor, Author) commented Sep 5, 2014

> The coverage of the new code seems mostly complete except http://build.clusterhq.com/results/use-snapshot-information-46/flocker-1884/complete/flocker_volume_script.html, but the other subcommand options aren't completely unit tested either, so that's probably OK.

I think maybe we have better coverage than the report suggests here. Some of our tests run the CLI tools in child processes. Lines executed in those child processes aren't measured and reported by the coverage tool.
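If it helps, coverage.py has documented support for measuring child processes: set parallel = True in the coverage config, point the COVERAGE_PROCESS_START environment variable at that config when spawning the children, and arrange for each child to call coverage.process_startup() early, for example from a sitecustomize.py (or .pth file) on its path; the per-process data files can then be merged with coverage combine. A minimal sketch of the hook module (nothing here is Flocker-specific):

```python
# sitecustomize.py -- importable by the child processes.
# When COVERAGE_PROCESS_START names a coverage config with
# "parallel = True", each child writes its own .coverage.* data file;
# "coverage combine" merges them into the main report afterwards.
import coverage
coverage.process_startup()
```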

@tomprince (Contributor) commented:

I've filed ClusterHQ/build.clusterhq.com#21 for collecting that coverage information.

@exarkun (Contributor, Author) commented Sep 8, 2014

> Will sufficient information be logged (stderr) to allow us to debug the failures?

Probably not. In general we're doing a really bad job of zfs error reporting right now. My story for now is that this will improve when we switch to libzfs_core bindings. Hopefully we'll do that real soon.

exarkun added a commit that referenced this pull request Sep 8, 2014
Expand the remote volume manager interface so it can convey information about the snapshots that exist for the filesystems it manages.
@exarkun exarkun merged commit e38b3d2 into master Sep 8, 2014
@exarkun exarkun deleted the use-snapshot-information-46 branch September 8, 2014 12:43
@exarkun exarkun removed the accepted label Sep 8, 2014