Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Incompatibility with rsync 3.2.4 or newer #1351

Merged
merged 3 commits into from
Nov 9, 2022
Merged

Fix: Incompatibility with rsync 3.2.4 or newer #1351

merged 3 commits into from
Nov 9, 2022

Conversation

buhtz
Copy link
Member

@buhtz buhtz commented Nov 4, 2022

Summary

This fix #1247 (and kind of #1320). The core problem is that BIT do call rsync in a way that is not compatible with newer versions of rsync. The version 3.2.4 of rsync activated a new method of argument protection. This PR does fix the rsync calls and did all necessary modifications. Important to know is that BIT still work with old rsync versions, too.

From my point of view that solution is final/stable. I introduced two new unit tests in former PR's (#1340 #1334). I did systematic manual tests with different GNU/Linux distributions including different rsync versions. I used all four types of snapshot profiles (local, SSH, local encrypted, SSH encrypted) in that test. I used git grep to find possibly relevant parts of the code. All that steps brought up extra issues/problems I deald with. But theoretically there still is a risk of side effects.

Because of that please carefully review that PR and also do multiple and pedantic tests yourself.

Short description of the solution

In the past BIT did call rsync (simplified) like this:

rsync user@localhost:"/home/user/folder with blank in name"

Now it is done without the quoting of the path via " and with the new flag -s:

rsync -s user@localhost:/home/user/folder with blank in name

The solution in detail

The rsync maintainer has checked and confirmed my approach.

The goals are

  • Not using workarounds like --old-args flag or environment variables like RSYNC_OLD_ARGS or RSYNC_PROTECT_ARGS to force newer rsync versions to use the old way of argument protection.
  • Use only one way to call rsync and don't use if statements and/or check the version of the current installed rsync.
  • The solution should work in the future, too.

About the quoting

In most cases quoting a path in BIT is done by the method Snapshots.rsyncRemotePath(). It has an argument quote which is by default quote='"'.

But I didn't modify that default value because of possible side effects.
I modified the calls of rsyncRemotePath() to explicit set that quote argument to an empty string (quote='').

After doing some tests and git grep searches I can say that the productive BIT code never call that method with its quote default value. That is how it should be. There are only some unittest doing it. And I wouldn't touch that tests yet.

Background

Rsync

The new argument protection was made the default behaviour with rsync version 3.2.4 but it was introduced (but optional) with version 3.0.0 in the year 2008.

The flag -s also have the synonym flags --protect-args and --secluded-args (read further). The latter two are not consistent exist in all versions. Only -s exists in all versions (since 3.0.0). That is why that short form is used in that approach.

Further reading about argument protection: ADVANCED USAGE section in rsync manpage.

Interactions with workarounds

Because of #1247 we advised the users a workaround. They should use --old-args flag (in BIT Expert options) or the environment variables like RSYNC_OLD_ARGS or RSYNC_PROTECT_ARGS to force newer rsync versions to use the old way of argument protection. Do this workarounds conflicts with that PR?

I checked it out. The environment variables having a lower priority then flags on commandline. It means -s will overwrite that environment variables. No problem here.

The --old-args combined with -s will cause an "conflict error" by rsync. Because of that this PR does remove --old-args automatically from the rsyncOptions config variable and give a WARNING message about it on the terminal (see config.py::Config.rsyncOptions()). IMHO it is OK not to inform the users via GUI in a more offensive way. It only affects a small number of users and won't hurt them or their data if we do it that way.

About Snapshot.backupConfig() and tools.Execute()

It is a general problem of BIT that tools.Execute() do only show a WARNING when the command had a problem. When using an encrypted SSH snapshot profile that rsync call there faild because of #1247 and no one noticed it. It even took me multiple manual tests in --debug mode to recognize that WARNING line in between the hundreds of other debug output lines.

I canceled my first intention to create a unit test for that case because the encryption (EncFS) need be replaced or rewritten in the near future. I choose a more simple approach and simply "transformed" the WARNING (only in that specific situation) into an (red) ERROR. But with this PR this ERROR shouldn't come up anymore.

Testing

TravisCI

I tested that PR in my own forked repo first. Travis gave me all lights green there. Here on bit-team the Travis account is currently out of credits and won't run.

Manual tests

  • tested all four types of snapshot profiles:
    • local
    • local encrypted
    • SSH
    • SSH encrypted
  • create snapshots (100 snapshots, one on each day; using faketimelib)
  • manually delete snapshots
  • smart (auto) delete snapshots
  • restore snapshots

Distros

  • Debian 11 stable (using the "old" rsync)
  • Debian (12) testing (using the "new" rsync)
  • Manjaro (Arch based)
  • Ubuntu 22
  • a special Debian 11 (on ARM)
  • via Travis: Ubuntu 20

Any suggestions about other important distros that are not based on Debian or Arch?

The core problem is that BIT do call rsync in a way that is not compatible
with newer versions of rsync. The version 3.2.4 of rsync activated a new
method of argument protection. This do fix the rsync calls and do all
necessary modifications.

Important to know is that BIT still work with old rsync versions, too.

Fix #1247
@emtiu
Copy link
Member

emtiu commented Nov 4, 2022

Congratulations! Our first big bugfix – thanks, @buhtzz 🥇

It is a general problem of BIT that tools.Execute() do only show a WARNING when the command had a problem. When using an encrypted SSH snapshot profile that rsync call there faild because of #1247 and no one noticed it. It even took me multiple manual tests in --debug mode to recognize that WARNING line in between the hundreds of other debug output lines.

Out of interest: Would you say that this is same problem described in #84 and #907 and #1121? If yes, then my understanding is: Your solution is a small improvement, because the error becomes more visible in the console/logs. But it's not yet a fix, the error is still invisible from the GUI/desktop. Correct?

I canceled my first intention to create a unit test for that case because the encryption (EncFS) need be replaced or rewritten in the near future. I choose a more simple approach and simply "transformed" the WARNING (only in that specific situation) into an (red) ERROR. But with this PR this ERROR shouldn't come up anymore.

Just my personal opinion on this (main discussion in #1248): There will be no replacement for encfs in the near future. Yes, it's known to be insecure under certain circumstances, but it's very well-tested and robust. The only possible replacement seems to be gocryptfs, and it would be a lot of work to integrate.

@buhtz
Copy link
Member Author

buhtz commented Nov 4, 2022

Out of interest: Would you say that this is same problem described in #84 and #907 and #1121?

Technically I would say "no". The problem is that tools.Excecute() do run shell commands in a generalized way but did not take care if they fail or not. To be really sure we could implement "the hammer" and add a dirty and hard raise RuntimeException in there when the return code is !0. But of course this is a bit to much. But it is maybe an option in debug situations.

Conceptual "yes". It is part of the problem that the core of BIT doesn't communicate with its other parts e.g. the GUI. We could open a meta issue about it. But I suspect it is also "to meta" for an issue and should better be discussed on bit-dev. IMHO it is part of a re-design and not just adding a new "communication system" (e.g. DBUS or other IPC solutions). BIT just shouldn't call itself via subprocess.Popen(). There are better ways.

my understanding is: Your solution is a small improvement, because the error becomes more visible in the console/logs. But it's not yet a fix, the error is still invisible from the GUI/desktop. Correct?

Correct. Not a fix. Just a workaround and only relevant in that very special use case where an "SSH encrypted" snapshot profile is used.

@what-the-diff
Copy link

what-the-diff bot commented Nov 4, 2022

ATTENTION: That content is auto-generated by a stupid algorithm. Some of that information's and links are not valid and wrong. It was just a test.

@buhtz
Copy link
Member Author

buhtz commented Nov 4, 2022

Sorry I couldn't resists. Just read that heise-Article about a GPT-3 based algorithm generate summary text from code diffs for PRs. I see no value in such a tool even if it would work 100% perfect. I just played a bit arround.

@emtiu
Copy link
Member

emtiu commented Nov 4, 2022

Sorry I couldn't resists. Just read that heise-Article about a GPT-3 based algorithm generate summary text from code diffs for PRs. I see no value in such a tool even if it would work 100% perfect. I just played a bit arround.

I was honestly surprised at the quality of the result 😀

@emtiu emtiu added this to the 1.3.3 milestone Nov 6, 2022
@aryoda
Copy link
Contributor

aryoda commented Nov 7, 2022

@buhtz Just to be sure I understood it right:

  1. -s is supported since 2008 (v3.0.0) in rsync with the same semantics
  2. rsync -s user@localhost:/home/user/folder with blank in name (with -s but without surrounding quotes) is the "new" way of doing it in BiT?

Which scenarios do you consider as the main risks to be tested by us?

Copy link
Contributor

@aryoda aryoda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code and found only one location that possibly could have been forgotten to change:

remote = self.rsyncRemotePath(sid.path(use_mode = ['ssh', 'ssh_encfs']), use_mode = [], quote = '\\\"')

@buhtz Could you please check this LOC which still contains an oddly escaped quote...? THX!

@buhtz
Copy link
Member Author

buhtz commented Nov 8, 2022

  1. -s is supported since 2008 (v3.0.0) in rsync with the same semantics
  2. rsync -s user@localhost:/home/user/folder with blank in name (with -s but without surrounding quotes) is the "new" way of doing it in BiT?

Yes, it is just that simple.

But to more precise with point 2. In BIT it is done that way

subprocess.Popen(['rsync', '-s', 'user@localhost:/home/user/folder with blank in name'])

So Popen() itself will surround the command elements with quotes if it sees the need (e.g. blanks) for it.

Which scenarios do you consider as the main risks to be tested by us?

Good question. I don't have a special scenario in mind. I would say that scenarios that seems to be not special. ;)
Technically most of the magic happens with SSH profiles. But I would always say to test nearly everything because I'm not enough into all details and features of BIT to judge that point.

I have reviewed the code and found only one location that possibly could have been forgotten to change:

remote = self.rsyncRemotePath(sid.path(use_mode = ['ssh', 'ssh_encfs']), use_mode = [], quote = '\\\"')

@buhtz Could you please check this LOC which still contains an oddly escaped quote...? THX!

I have an idea what's going on there and why it works. Removing a snapshot is done on the local filesystem because the remote is mounted. But I will take a closer look to it.

@buhtz
Copy link
Member Author

buhtz commented Nov 9, 2022

remote = self.rsyncRemotePath(sid.path(use_mode = ['ssh', 'ssh_encfs']), use_mode = [], quote = '\\\"')

@buhtz Could you please check this LOC which still contains an oddly escaped quote...? THX!

OK, I got it. When you look into the resulting rsync-path-string (when using smartRemove in background in an SSH snapshot profile) you can see it is not quoted.

This is the call

remote = self.rsyncRemotePath(sid.path(use_mode = ['ssh', 'ssh_encfs']), use_mode = [], quote = '\\\"')

Looking into

    def rsyncRemotePath(self, path, use_mode = ['ssh', 'ssh_encfs'], quote = '"'):
        mode = self.config.snapshotsMode()

        if mode in ['ssh', 'ssh_encfs'] and mode in use_mode:
            user = self.config.sshUser()
            host = tools.escapeIPv6Address(self.config.sshHost())
            return '%(u)s@%(h)s:%(q)s%(p)s%(q)s' %{'u': user,
                                                   'h': host,
                                                   'q': quote,
                                                   'p': path}
        else:
            return path

mode is "ssh" but use_mode is []. Because of that the if-block is never entered and the quote argument is ignored.
I don't really understand why it is intended that way but I would work on that in separate in #1359.

So I would say the PR is "clean" again and don't have any unclear problems.
But I would keep it open for some more weeks. It doesn't hurry. I want to make sure.
I will refresh my request for testing to the other reports and on the ML in some days.

@aryoda
Copy link
Contributor

aryoda commented Nov 9, 2022

I'd also say "go" for a merge (the earlier we use it the earlier we may find hidden issues).

I want to set up kinda integration test for myself (taking backups, restoring them and do a folder compare) but this will take a few more weeks (not much time ATM).

Any vetos?

@emtiu
Copy link
Member

emtiu commented Nov 9, 2022

No veto. I haven't found time to test this myself, but I trust the procedures @buhtz described enough to merge :)

@Germar
Copy link
Member

Germar commented Nov 9, 2022

Good job Christian! Thank you!
I'd suggest to add a deprecation warning in Snapshots.rsyncRemotePath if quote is not empty. This way we would find missed calls for this easily.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incompatibility with rsync 3.2.4
4 participants