Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more ZipArchive tests to confirm the Zip64 fix does not affect reading zip files generatd with the old bug #103152

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System;
using System.Linq;
using System.Reflection;
using System.Text;
using Xunit;

namespace System.IO.Compression.Tests;

public class ManualTests
{
[Fact]
public static void GenerateZip64File()
{
// This test has the purpose of generating a large zip file containing the Zip64 bug fix introduced in https://github.com/dotnet/runtime/pull/102053
// The file can later be used in the VerifyZip64FixInNetFramework test method in ManualTests.NetFramework.cs to confirm that it can read it correctly.

const ushort Zip64Version = 45;
const uint ZipLocalFileHeader_OffsetToVersionFromHeaderStart = 4;
byte[] largeBuffer = GC.AllocateUninitializedArray<byte>(1_000_000_000); // 1 GB
string zipArchivePath = Path.Combine(Path.GetTempPath(), "over4GB_WithoutBug.zip");

using FileStream fs = File.Open(zipArchivePath, FileMode.Create, FileAccess.ReadWrite);

// Create
using (ZipArchive archive = new(fs, ZipArchiveMode.Create, true))
{
ZipArchiveEntry file = archive.CreateEntry("file.txt", CompressionLevel.NoCompression);

using (Stream stream = file.Open())
{
// Write 5GB of data
for (var i = 0; i < 5; i++)
{
stream.Write(largeBuffer);
}
}
}
Assert.True(fs.Length > int.MaxValue, $"File size is not big enough to test the Zip64 fix: {fs.Length} vs {int.MaxValue}");

fs.Position = 0;

// Create an archive using .NET with the fix, to later use it in .NET Framework
using (ZipArchive zip = new ZipArchive(fs, ZipArchiveMode.Read, leaveOpen: true))
{
FieldInfo offsetOfLHField = typeof(ZipArchiveEntry).GetField("_offsetOfLocalHeader", BindingFlags.NonPublic | BindingFlags.Instance);

if (offsetOfLHField is null || offsetOfLHField.FieldType != typeof(long))
{
Assert.Fail("Cannot find the private field of _offsetOfLocalHeader in ZipArchiveEntry or the type is not long. Code may be changed after the test is written.");
}

ZipArchiveEntry entry = zip.Entries.First();

long currentPosition = fs.Position;

// Confirm it's set to the correct value
using (BinaryReader reader = new(fs, Encoding.UTF8, leaveOpen: true))
{
fs.Position = (long)offsetOfLHField.GetValue(entry) + ZipLocalFileHeader_OffsetToVersionFromHeaderStart;
ushort version = reader.ReadUInt16();
Assert.Equal(Zip64Version, version);
}
}

Console.WriteLine($"Zip file location: {zipArchivePath}");
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.IO.Compression;
using System.Linq;
using System.Reflection;
using System.Text;
using Xunit;

namespace System.IO.Compression.Tests;

public class ManualTests
{
[Fact]
public static void VerifyZip64FixInNetFramework()
{
// This test has the purpose of verifying that .NET Framework can read a zip file generated with the Zip64 fix introduced in .NET.
// The zip file with the fix needs to be manually generated by the GenerateZip64File test method in ManualTests.Net.cs.

string zipArchivePath = Path.Combine(Path.GetTempPath(), "over4GB_WithoutBug.zip");
Assert.True(File.Exists(zipArchivePath));

using FileStream fs = File.OpenRead(zipArchivePath);

// Open archive to verify that we can still read an archive with the buggy version
using (ZipArchive zip = new ZipArchive(fs, ZipArchiveMode.Read))
Copy link
Member

@ericstj ericstj Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug in NETFX was actually with the validation in System.IO.Packaging. So this would pass even with the broken zip.

I'm not sure there's value in having this test committed even if we made it for IO.Packaging, since it has manual steps.

Maybe there's some value in having some suite of compatibility tests that produce different zips / IO.Packaging packages on one framework, and then consume them on the other framework without intervention? Perhaps those could be done in a way that they are a theory based on a number of different scenarios? I don't see that as a necessary thing for this particular fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have prior art running two different test processes for different frameworks?

{
Assert.Equal(1, zip.Entries.Count);
ZipArchiveEntry entry = zip.Entries.First();
Assert.Equal("file.txt", entry.Name);
Assert.Equal(5_000_000_000, entry.Length);
}
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>$(NetCoreAppCurrent)</TargetFramework>
</PropertyGroup>

<ItemGroup>
<Compile Include="ManualTests.Net.cs" />
</ItemGroup>

</Project>
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>$(NetFrameworkCurrent)</TargetFramework>
</PropertyGroup>

<ItemGroup>
<Compile Include="ManualTests.NetFramework.cs" />
</ItemGroup>

<ItemGroup>
<Reference Include="System.IO.Compression" />
</ItemGroup>

</Project>
114 changes: 101 additions & 13 deletions src/libraries/System.IO.Compression/tests/ZipArchive/zip_LargeFiles.cs
Original file line number Diff line number Diff line change
@@ -1,14 +1,33 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Linq;
using System.Reflection;
using System.Text;
using Xunit;

namespace System.IO.Compression.Tests;

[Collection(nameof(DisableParallelization))]
public class zip_LargeFiles : ZipFileTestBase
{
private const ushort Zip64Version = 45;
private const uint ZipLocalFileHeader_OffsetToVersionFromHeaderStart = 4;

private static void FillWithHardToCompressData(byte[] buffer) => Random.Shared.NextBytes(buffer);

private static FieldInfo GetOffsetOfLHField()
{
FieldInfo offsetOfLHField = typeof(ZipArchiveEntry).GetField("_offsetOfLocalHeader", BindingFlags.NonPublic | BindingFlags.Instance);

if (offsetOfLHField is null || offsetOfLHField.FieldType != typeof(long))
{
Assert.Fail("Cannot find the private field of _offsetOfLocalHeader in ZipArchiveEntry or the type is not long. Code may be changed after the test is written.");
}

return offsetOfLHField;
}

[ConditionalFact(typeof(PlatformDetection), nameof(PlatformDetection.IsSpeedOptimized), nameof(PlatformDetection.Is64BitProcess))] // don't run it on slower runtimes
[OuterLoop("It requires almost 12 GB of free disk space")]
public static void UnzipOver4GBZipFile()
Expand Down Expand Up @@ -44,11 +63,6 @@ public static void UnzipOver4GBZipFile()
}
}

private static void FillWithHardToCompressData(byte[] buffer)
{
Random.Shared.NextBytes(buffer);
}

[ConditionalTheory(typeof(PlatformDetection), nameof(PlatformDetection.IsSpeedOptimized), nameof(PlatformDetection.Is64BitProcess))] // don't run it on slower runtimes
[OuterLoop("It requires 5~6 GB of free disk space and a lot of CPU time for compressed tests")]
[InlineData(false)]
Expand All @@ -63,8 +77,6 @@ public static void CheckZIP64VersionIsSet_ForSmallFilesAfterBigFiles(bool isComp
string zipArchivePath = Path.Combine(Path.GetTempPath(), "over4GB.zip");
string LargeFileName = "largefile";
string SmallFileName = "smallfile";
uint ZipLocalFileHeader_OffsetToVersionFromHeaderStart = 4;
ushort Zip64Version = 45;

try
{
Expand Down Expand Up @@ -104,12 +116,7 @@ public static void CheckZIP64VersionIsSet_ForSmallFilesAfterBigFiles(bool isComp
{
using var reader = new BinaryReader(fs);

FieldInfo offsetOfLHField = typeof(ZipArchiveEntry).GetField("_offsetOfLocalHeader", BindingFlags.NonPublic | BindingFlags.Instance);

if (offsetOfLHField is null || offsetOfLHField.FieldType != typeof(long))
{
Assert.Fail("Cannot find the private field of _offsetOfLocalHeader in ZipArchiveEntry or the type is not long. Code may be changed after the test is written.");
}
FieldInfo offsetOfLHField = GetOffsetOfLHField();

foreach (ZipArchiveEntry entry in archive.Entries)
{
Expand All @@ -126,4 +133,85 @@ public static void CheckZIP64VersionIsSet_ForSmallFilesAfterBigFiles(bool isComp
File.Delete(zipArchivePath);
}
}

[ConditionalFact(typeof(PlatformDetection), nameof(PlatformDetection.IsSpeedOptimized), nameof(PlatformDetection.Is64BitProcess))] // don't run it on slower runtimes
[OuterLoop("It requires around 6 GB of free disk space")]
public static void CompatZip64BeforeAndAfterFix()
{
// This test has the purpose of confirming that ZipArchive can still process a zip file that was created
// with these APIs before the Zip64 bug was fixed: https://github.com/dotnet/runtime/pull/102053

ushort buggyZip64Version = 20;
byte[] largeBuffer = GC.AllocateUninitializedArray<byte>(1_000_000_000); // 1 GB
string zipArchivePath = Path.Combine(Path.GetTempPath(), "over4GB.zip");

try
{
using FileStream fs = File.Open(zipArchivePath, FileMode.Create, FileAccess.ReadWrite);

// Create
using (ZipArchive archive = new(fs, ZipArchiveMode.Create, true))
{
ZipArchiveEntry file = archive.CreateEntry("file.txt", CompressionLevel.NoCompression);

using (Stream stream = file.Open())
{
// Write 5GB of data
for (var i = 0; i < 5; i++)
{
stream.Write(largeBuffer);
}
}
}
Assert.True(fs.Length > int.MaxValue, $"File size is not big enough to test the Zip64 fix: {fs.Length} vs {int.MaxValue}");

fs.Position = 0;

// Open archive to modify the bit as it used to look before fix
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed the source has a define for DEBUG_FORCE_ZIP64. I don't see us ever use that, but I imagine the idea was we could use reflection to set that field and it would make a smaller zip use Zip64. That might be an interesting method to test this without forcing things to outerloop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to keep it in mind. I wasn't aware that it could be used for testing. At the moment I felt that it was more important to confirm that the real life scenario is working. I'll consider using the define in the future.

using (ZipArchive zip = new ZipArchive(fs, ZipArchiveMode.Read, leaveOpen: true))
{
FieldInfo offsetOfLHField = GetOffsetOfLHField();

ZipArchiveEntry entry = zip.Entries.First();

long currentPosition = fs.Position;

// Confirm it's initially set to the correct value
using (BinaryReader reader = new(fs, Encoding.UTF8, leaveOpen: true))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can open the reader and writer at the same time, and just move the position back 2 bytes after you read it. That makes this more reabable and should reduce the code a bit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is more important to be able to close the ZipArchive, dispose it, then reopen it from the same stream, and confirm that the ZipArchive can be opened, the ZipArchiveEntry can be created without unexpected errors, and finally confirm the zip64 value was unmodified.

If I reduce the code as suggested, that would mean using the exact same ZipArchive object, without really verifying that the ZipArchive and the ZipArchiveEntry instances could still be created successfully.

{
fs.Position = (long)offsetOfLHField.GetValue(entry) + ZipLocalFileHeader_OffsetToVersionFromHeaderStart;
ushort version = reader.ReadUInt16();
Assert.Equal(Zip64Version, version);
}

fs.Position = currentPosition;

// Change it to the value that a version of .NET previous to the fix would have written
using (BinaryWriter writer = new(fs, Encoding.UTF8, leaveOpen: true))
{
fs.Position = (long)offsetOfLHField.GetValue(entry) + ZipLocalFileHeader_OffsetToVersionFromHeaderStart;
writer.Write(buggyZip64Version);
}
}

fs.Position = 0;

// Open archive to verify that we can still read an archive with the buggy version
using (ZipArchive zip = new ZipArchive(fs, ZipArchiveMode.Read))
{
FieldInfo offsetOfLHField = GetOffsetOfLHField();
ZipArchiveEntry entry = zip.Entries.First();

// Confirm it's still set to the old buggy value (to prove we could still read a malformed file)
using BinaryReader reader = new(fs, Encoding.UTF8);
fs.Position = (long)offsetOfLHField.GetValue(entry) + ZipLocalFileHeader_OffsetToVersionFromHeaderStart;
ushort version = reader.ReadUInt16();
Assert.Equal(buggyZip64Version, version);
}
}
finally
{
File.Delete(zipArchivePath);
}
}
}
Loading