Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Base64Encoder mini changes #28888

Merged
merged 1 commit into from
Apr 6, 2018
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 4 additions & 4 deletions src/System.Memory/src/System/Buffers/Text/Base64Encoder.cs
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ public static OperationStatus EncodeToUtf8(ReadOnlySpan<byte> bytes, Span<byte>
if (maxSrcLength != srcLength - 2)
goto DestinationSmallExit;

if (isFinalBlock != true)
if (!isFinalBlock)
goto NeedMoreDataExit;

if (sourceIndex == srcLength - 1)
Expand Down Expand Up @@ -108,7 +108,7 @@ public static OperationStatus EncodeToUtf8(ReadOnlySpan<byte> bytes, Span<byte>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int GetMaxEncodedToUtf8Length(int length)
{
if (length < 0 || length > MaximumEncodeLength)
if ((uint)length > MaximumEncodeLength)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case length < 0 is covered due the cast to uint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hides intent, though (that negative lengths are invalid). And without unchecked, it would throw if the correct compiler/runtime flags are flipped.
My naive guess is that the compiler has some built-in way to perform this particular optimization anyways; "positive-but-less-than" is a common category.

Copy link
Member Author

@gfoidl gfoidl Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length will just be reinterpreted as uint, so negative numbers are interpreted as very high (> int.MaxValue) numbers. MaximumEncodeLength is a constant (so no cast necessary) and on negative lengths the branch will be taken.

It's no compiler magic here, just interpretation of (raw-) values.

I don't think it hides intent, as used widely and as common trick to safe comparisons.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length will just be reinterpreted as uint, so negative numbers are interpreted as very high (> int.MaxValue) numbers.

Yeah. That's what causes the throw (checked here is a way to force the -checked compiler option) .

I don't think it hides intent, as used widely and as common trick to safe comparisons.

The bare-bones form is demonstrably not always safe.
The equivalent behavior in C++ is officially undefined (although pretty much all implementations will behave as anticipated, for multiple reasons)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Clockwork-Muse, we do this optimization all over the place.

ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.length);

return (((length + 2) / 3) * 4);
Expand All @@ -135,7 +135,7 @@ public static OperationStatus EncodeToUtf8InPlace(Span<byte> buffer, int dataLen
if (buffer.Length < encodedLength)
goto FalseExit;

int leftover = dataLength - dataLength / 3 * 3; // how many bytes after packs of 3
int leftover = dataLength - (dataLength / 3) * 3; // how many bytes after packs of 3
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to ensure that no compiler will "optimize" the / 3 * 3.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parenthesis only ever clarify intent, and are essentially removed at the parse stage. "Optimizing" such a statement that way would completely break intent and integer math (consider what that process would do to dataLength / 4 * 2).

Now, for clarifying intent, it might be a benefit to have them there.


int destinationIndex = encodedLength - 4;
int sourceIndex = dataLength - leftover;
Expand Down Expand Up @@ -228,6 +228,6 @@ private static int EncodeAndPadTwo(ref byte oneByte, ref byte encodingMap)

private const byte EncodingPad = (byte)'='; // '=', for padding

private const int MaximumEncodeLength = (int.MaxValue >> 2) * 3; // 1610612733
private const int MaximumEncodeLength = (int.MaxValue / 4) * 3; // 1610612733
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a constant, so it can be written "easier" to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is a constant, the change is OK. However, in general, division by 4 is slower than bit shifting by 2.

image

    public int UseBitShift(int value) {
        return (value >> 2) * 3;
    }
    
    public int UseDivision(int value) {
        return (value / 4) * 3;
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, in general, division by 4 is slower than bit shifting by 2.

For int. For uint they're identical.

}
}