-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notes for review
@@ -108,7 +108,7 @@ public static OperationStatus EncodeToUtf8(ReadOnlySpan<byte> bytes, Span<byte> | |||
[MethodImpl(MethodImplOptions.AggressiveInlining)] | |||
public static int GetMaxEncodedToUtf8Length(int length) | |||
{ | |||
if (length < 0 || length > MaximumEncodeLength) | |||
if ((uint)length > MaximumEncodeLength) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case length < 0
is covered due the cast to uint
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hides intent, though (that negative lengths are invalid). And without unchecked
, it would throw if the correct compiler/runtime flags are flipped.
My naive guess is that the compiler has some built-in way to perform this particular optimization anyways; "positive-but-less-than" is a common category.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length
will just be reinterpreted as uint
, so negative numbers are interpreted as very high (> int.MaxValue) numbers. MaximumEncodeLength
is a constant (so no cast necessary) and on negative lengths the branch will be taken.
It's no compiler magic here, just interpretation of (raw-) values.
I don't think it hides intent, as used widely and as common trick to safe comparisons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length
will just be reinterpreted asuint
, so negative numbers are interpreted as very high (> int.MaxValue) numbers.
Yeah. That's what causes the throw (checked
here is a way to force the -checked
compiler option) .
I don't think it hides intent, as used widely and as common trick to safe comparisons.
The bare-bones form is demonstrably not always safe.
The equivalent behavior in C++ is officially undefined (although pretty much all implementations will behave as anticipated, for multiple reasons)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Clockwork-Muse, we do this optimization all over the place.
@@ -135,7 +135,7 @@ public static OperationStatus EncodeToUtf8InPlace(Span<byte> buffer, int dataLen | |||
if (buffer.Length < encodedLength) | |||
goto FalseExit; | |||
|
|||
int leftover = dataLength - dataLength / 3 * 3; // how many bytes after packs of 3 | |||
int leftover = dataLength - (dataLength / 3) * 3; // how many bytes after packs of 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to ensure that no compiler will "optimize" the / 3 * 3
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parenthesis only ever clarify intent, and are essentially removed at the parse stage. "Optimizing" such a statement that way would completely break intent and integer math (consider what that process would do to dataLength / 4 * 2
).
Now, for clarifying intent, it might be a benefit to have them there.
@@ -228,6 +228,6 @@ private static int EncodeAndPadTwo(ref byte oneByte, ref byte encodingMap) | |||
|
|||
private const byte EncodingPad = (byte)'='; // '=', for padding | |||
|
|||
private const int MaximumEncodeLength = (int.MaxValue >> 2) * 3; // 1610612733 | |||
private const int MaximumEncodeLength = (int.MaxValue / 4) * 3; // 1610612733 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a constant, so it can be written "easier" to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, in general, division by 4 is slower than bit shifting by 2.
For int. For uint they're identical.
Commit migrated from dotnet/corefx@11a5782
Some minor changes to the Base64Encoder.
I missed the review of #24888, so it goes here.
/cc: @ahsonkhan