Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Base64Encoder mini changes #28888

Merged
merged 1 commit into from
Apr 6, 2018
Merged

Base64Encoder mini changes #28888

merged 1 commit into from
Apr 6, 2018

Conversation

gfoidl
Copy link
Member

@gfoidl gfoidl commented Apr 6, 2018

Some minor changes to the Base64Encoder.

I missed the review of #24888, so it goes here.

/cc: @ahsonkhan

Copy link
Member Author

@gfoidl gfoidl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes for review

@@ -108,7 +108,7 @@ public static OperationStatus EncodeToUtf8(ReadOnlySpan<byte> bytes, Span<byte>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int GetMaxEncodedToUtf8Length(int length)
{
if (length < 0 || length > MaximumEncodeLength)
if ((uint)length > MaximumEncodeLength)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case length < 0 is covered due the cast to uint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hides intent, though (that negative lengths are invalid). And without unchecked, it would throw if the correct compiler/runtime flags are flipped.
My naive guess is that the compiler has some built-in way to perform this particular optimization anyways; "positive-but-less-than" is a common category.

Copy link
Member Author

@gfoidl gfoidl Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length will just be reinterpreted as uint, so negative numbers are interpreted as very high (> int.MaxValue) numbers. MaximumEncodeLength is a constant (so no cast necessary) and on negative lengths the branch will be taken.

It's no compiler magic here, just interpretation of (raw-) values.

I don't think it hides intent, as used widely and as common trick to safe comparisons.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length will just be reinterpreted as uint, so negative numbers are interpreted as very high (> int.MaxValue) numbers.

Yeah. That's what causes the throw (checked here is a way to force the -checked compiler option) .

I don't think it hides intent, as used widely and as common trick to safe comparisons.

The bare-bones form is demonstrably not always safe.
The equivalent behavior in C++ is officially undefined (although pretty much all implementations will behave as anticipated, for multiple reasons)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Clockwork-Muse, we do this optimization all over the place.

@@ -135,7 +135,7 @@ public static OperationStatus EncodeToUtf8InPlace(Span<byte> buffer, int dataLen
if (buffer.Length < encodedLength)
goto FalseExit;

int leftover = dataLength - dataLength / 3 * 3; // how many bytes after packs of 3
int leftover = dataLength - (dataLength / 3) * 3; // how many bytes after packs of 3
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to ensure that no compiler will "optimize" the / 3 * 3.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parenthesis only ever clarify intent, and are essentially removed at the parse stage. "Optimizing" such a statement that way would completely break intent and integer math (consider what that process would do to dataLength / 4 * 2).

Now, for clarifying intent, it might be a benefit to have them there.

@@ -228,6 +228,6 @@ private static int EncodeAndPadTwo(ref byte oneByte, ref byte encodingMap)

private const byte EncodingPad = (byte)'='; // '=', for padding

private const int MaximumEncodeLength = (int.MaxValue >> 2) * 3; // 1610612733
private const int MaximumEncodeLength = (int.MaxValue / 4) * 3; // 1610612733
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a constant, so it can be written "easier" to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is a constant, the change is OK. However, in general, division by 4 is slower than bit shifting by 2.

image

    public int UseBitShift(int value) {
        return (value >> 2) * 3;
    }
    
    public int UseDivision(int value) {
        return (value / 4) * 3;
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, in general, division by 4 is slower than bit shifting by 2.

For int. For uint they're identical.

@stephentoub stephentoub merged commit 11a5782 into dotnet:master Apr 6, 2018
@karelz karelz added this to the 2.1.0 milestone Apr 6, 2018
@gfoidl gfoidl deleted the base64 branch April 7, 2018 16:51
pjanotti pushed a commit to pjanotti/corefx that referenced this pull request Apr 8, 2018
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants