Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/7.0] Fix prefix writing on TarHeaderWrite #75373

Merged
merged 7 commits into from
Sep 14, 2022

Conversation

github-actions[bot]
Copy link
Contributor

@github-actions github-actions bot commented Sep 9, 2022

Backport of #75350 to release/7.0

/cc @jozkee

Customer Impact

Customers using the new Tar APIs to archive filenames larger than 200 characters in Pax format may find an unexpected and unclear exception:

System.ArgumentException : The output byte buffer is too small to contain the encoded data, encoding codepage '20127' and fallback 'System.Text.EncoderReplacementFallback'. (Parameter 'bytes')

It is an important scenario as Pax is the default format.

Testing

Tested with multiple formats, although the only one presenting the error was Pax.

Risk

Low.

@ghost
Copy link

ghost commented Sep 9, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Backport of #75350 to release/7.0

/cc @jozkee

Customer Impact

Testing

Risk

IMPORTANT: Is this backport for a servicing release? If so and this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.

Author: github-actions[bot]
Assignees: -
Labels:

area-System.IO

Milestone: -

@danmoseley
Copy link
Member

Is this filenames, or paths over 200?
If filenames, how common do we believe this will be for customers?

@danmoseley
Copy link
Member

@stephentoub any comments?

@danmoseley
Copy link
Member

Have you examined other metadata that the TAR API's read/write to check for similar arbitrary limits?

@jozkee
Copy link
Member

jozkee commented Sep 9, 2022

Is this filenames, or paths over 200?
If filenames, how common do we believe this will be for customers?

It is the TarEntry.Name property which holds the path, so, not just filename components, but nested files or entries could step into this issue.

@danmoseley
Copy link
Member

OK, we clearly want to be able to support TAR'ing 200 chars below a root. The severity meets the bar. The change seems straightforward, but I don't know the format. I'd like @stephentoub 's take.

BTW -- for the 'risk' in these templates, it's usually good to add some reasoning -- it helps think through. Eg., is change affecting other scenarios? is code already proven in another branch -- or reusing existing code tested elsewhere - or just self evidently correct? and my favorite question, reused from a past team - if I told you that I looked into the future and two weeks from now we discovered this change broke something -- what would that thing most likely have been, and why? 😺

@jozkee
Copy link
Member

jozkee commented Sep 9, 2022

is change affecting other scenarios?

no

is code already proven in another branch -- or reusing existing code tested elsewhere - or just self evidently correct?

It is self-evidently correct. Code that is manipulating a string was using incorrect consts to get a buffer size for the remaining of 255 (name(100) + prefix(155)). It is proven in main branch.

if I told you that I looked into the future and two weeks from now we discovered this change broke something -- what would that thing most likely have been, and why? 😺

The thing that could be broken is either the consts/math is still wrong or the assumption that n chars can fit in n bytes using Encoding.ASCII.GetBytes is wrong, the former also applies for the code without the fix, so I guess it can be discarded.

n chars can fit in n bytes using Encoding.ASCII.GetBytes

@carlossanlop is there a test for that? i.e: what if the Name contains non-UTF8 characters?

@jozkee
Copy link
Member

jozkee commented Sep 9, 2022

for the 'risk' in these templates, it's usually good to add some reasoning -- it helps think through. Eg., is change affecting other scenarios? is code already proven in another branch -- or reusing existing code tested elsewhere - or just self evidently correct? and my favorite question, reused from a past team - if I told you that I looked into the future and two weeks from now we discovered this change broke something -- what would that thing most likely have been, and why? 😺

@danmoseley it would be nice if those questions were in the template 😝.

@danmoseley
Copy link
Member

@danmoseley it would be nice if those questions were in the template 😝.

Haha fair.. maybe I will edit it ( in the bot)

@danmoseley
Copy link
Member

And no further objections from me ..@stephentoub?

@stephentoub
Copy link
Member

@stephentoub any comments?

I'd like @stephentoub 's take.

And no further objections from me ..@stephentoub?

It's 7:30pm on a Friday night. I'll review but it won't be right this second.... ;-)

@danmoseley
Copy link
Member

Haha I'm just over communicating

@carlossanlop
Copy link
Member

@stephentoub do the latest commits look good for merging?

[InlineData(TarEntryFormat.Gnu)]
public void WriteLongName(TarEntryFormat format)
{
var r = new Random();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally this would have a seed. That way the test is reproducible.

Suggested change
var r = new Random();
var r = new Random(42);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad you used 42. It's the answer to life, the universe, and everything. 😃

{
TarEntry entry;
MemoryStream ms = new();
using (TarWriter writer = new(ms, true))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
using (TarWriter writer = new(ms, true))
using (TarWriter writer = new(ms, leaveOpen: true))

@carlossanlop
Copy link
Member

The CI failure is unrelated (System.Net.Http.Functional.Tests timeout).

@carlossanlop
Copy link
Member

carlossanlop commented Sep 14, 2022

Since the two remaining feedback comments are in test code, I asked @jozkee to include them in the next backport PR so we don't restart the CI on this one.

@danmoseley can we get an approval?

Edit: I confirmed with Dan that this comment can be interpreted as approval #75373 (comment)

:shipit:

@carlossanlop carlossanlop merged commit 66a9136 into release/7.0 Sep 14, 2022
@carlossanlop carlossanlop deleted the backport/pr-75350-to-release/7.0 branch September 14, 2022 15:30
@ghost ghost locked as resolved and limited conversation to collaborators Oct 14, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants