bpo-41316: Make tarfile follow specs for FNAME (GH-21511)
tarfile writes the full path to the FNAME field of the GZIP header, instead of just the basename, when the user specifies an absolute path. Some archive viewers may then process the file incorrectly. It is also a security issue, because anyone who obtains the archive can learn the directory structure of the originating system, including the username or other personal information.

RFC1952 says about FNAME:
This is the original name of the file being compressed, with any directory components removed.

So tarfile must strip the directory components from FNAME and write only the basename of the file.
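
To check what a given Python build actually writes, the FNAME field can be read straight out of the gzip header. The following is a minimal sketch, not part of the commit: `read_gzip_fname` and `demo` are hypothetical helper names, and the parser only handles the header layout described in RFC 1952 (fixed 10-byte header, optional FEXTRA block, then the zero-terminated FNAME).

```python
import os
import struct
import tarfile
import tempfile

# Flag bits of the FLG byte as defined in RFC 1952.
FEXTRA = 0x04
FNAME = 0x08


def read_gzip_fname(path):
    """Return the FNAME field of a gzip file, or None if it is absent.

    Hypothetical helper for illustration; not part of tarfile.
    """
    with open(path, "rb") as f:
        header = f.read(10)  # ID1 ID2 CM FLG MTIME(4) XFL OS
        if len(header) < 10 or header[:2] != b"\x1f\x8b":
            raise ValueError("not a gzip file")
        flags = header[3]
        if flags & FEXTRA:
            # Skip the optional extra field: 2-byte little-endian length + data.
            (xlen,) = struct.unpack("<H", f.read(2))
            f.read(xlen)
        if not flags & FNAME:
            return None
        raw = bytearray()
        while True:
            byte = f.read(1)
            if byte in (b"", b"\x00"):  # EOF or terminating NUL
                break
            raw += byte
        return raw.decode("iso-8859-1")


def demo():
    """Write an empty stream-mode archive and return its FNAME field."""
    with tempfile.TemporaryDirectory() as tmpdir:
        archive = os.path.join(tmpdir, "example.tar.gz")
        tarfile.open(archive, "w|gz").close()
        return read_gzip_fname(archive)
```

On an interpreter that includes this fix, `demo()` should return only the file name with the `.gz` suffix stripped; on an unpatched one it would be expected to include the temporary directory as well.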

Automerge-Triggered-By: @jaraco
(cherry picked from commit 22748a8)

Co-authored-by: Artem Bulgakov <ArtemSBulgakov@ya.ru>
miss-islington and ArtemSBulgakov committed Oct 21, 2020
1 parent 19019ec commit 7917170
Showing 4 changed files with 17 additions and 1 deletion.
2 changes: 2 additions & 0 deletions Lib/tarfile.py
@@ -420,6 +420,8 @@ def _init_write_gz(self):
         self.__write(b"\037\213\010\010" + timestamp + b"\002\377")
         if self.name.endswith(".gz"):
             self.name = self.name[:-3]
+        # Honor "directory components removed" from RFC1952
+        self.name = os.path.basename(self.name)
         # RFC1952 says we must use ISO-8859-1 for the FNAME field.
         self.__write(self.name.encode("iso-8859-1", "replace") + NUL)
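
The name handling in this hunk can be read in isolation as a small pure function. The sketch below is an illustration only: `gzip_fname_field` is a hypothetical name, and it mirrors the three steps visible in the diff (strip a trailing `.gz`, drop directory components, encode as ISO-8859-1 with replacement and append a NUL) without touching the real `_Stream` class.

```python
import os

NUL = b"\0"


def gzip_fname_field(name):
    """Build the bytes for the gzip FNAME field from an archive path.

    Hypothetical stand-alone version of the logic shown in the diff.
    """
    if name.endswith(".gz"):
        name = name[:-3]
    # Honor "directory components removed" from RFC 1952.
    name = os.path.basename(name)
    # Characters outside ISO-8859-1 are replaced with "?".
    return name.encode("iso-8859-1", "replace") + NUL


print(gzip_fname_field("/home/user/backup.tar.gz"))  # b'backup.tar\x00'
```

Because the basename is taken after the suffix is stripped, the order of the first two steps does not affect the result for ordinary paths.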

14 changes: 13 additions & 1 deletion Lib/test/test_tarfile.py
@@ -1416,12 +1416,15 @@ def write(self, data):
                              pax_headers={'non': 'empty'})
             self.assertFalse(f.closed)

+
 class GzipWriteTest(GzipTest, WriteTest):
     pass

+
 class Bz2WriteTest(Bz2Test, WriteTest):
     pass

+
 class LzmaWriteTest(LzmaTest, WriteTest):
     pass

@@ -1464,8 +1467,17 @@ def test_file_mode(self):
         finally:
             os.umask(original_umask)

+
 class GzipStreamWriteTest(GzipTest, StreamWriteTest):
-    pass
+    def test_source_directory_not_leaked(self):
+        """
+        Ensure the source directory is not included in the tar header
+        per bpo-41316.
+        """
+        tarfile.open(tmpname, self.mode).close()
+        payload = pathlib.Path(tmpname).read_text(encoding='latin-1')
+        assert os.path.dirname(tmpname) not in payload
+

 class Bz2StreamWriteTest(Bz2Test, StreamWriteTest):
     decompressor = bz2.BZ2Decompressor if bz2 else None
1 change: 1 addition & 0 deletions Misc/ACKS
@@ -243,6 +243,7 @@ Colm Buckley
 Erik de Bueger
 Jan-Hein Bührman
 Lars Buitinck
+Artem Bulgakov
 Dick Bulterman
 Bill Bumgarner
 Jimmy Burgett
@@ -0,0 +1 @@
Fix the :mod:`tarfile` module to write only basename of TAR file to GZIP compression header.
