Remove the branches from len_utf8
This changes `len_utf8` to add the results of all the range comparisons
together, rather than branching on each one. We should definitely test
performance though, because this may pessimize mostly-ASCII inputs, which
would previously have taken a short, well-predicted branch path.
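The equivalence of the two forms can be checked exhaustively, since there are only about 1.1 million code points. The following is a minimal sketch, not the library's own test: the function names `len_utf8_branchy`, `len_utf8_summed`, and `first_mismatch` are hypothetical, and the constant values are assumed to match the ones defined in `library/core/src/char/methods.rs` (the usual UTF-8 length boundaries).

```rust
// UTF-8 length boundaries. Values are assumed to match the constants of
// the same name in library/core/src/char/methods.rs.
const MAX_ONE_B: u32 = 0x80;
const MAX_TWO_B: u32 = 0x800;
const MAX_THREE_B: u32 = 0x10000;

/// Branching form, as it was before this commit.
const fn len_utf8_branchy(code: u32) -> usize {
    if code < MAX_ONE_B {
        1
    } else if code < MAX_TWO_B {
        2
    } else if code < MAX_THREE_B {
        3
    } else {
        4
    }
}

/// Summed form, as introduced by this commit: each comparison yields
/// 0 or 1, and every boundary that `code` reaches adds one byte.
const fn len_utf8_summed(code: u32) -> usize {
    1 + ((code >= MAX_ONE_B) as usize)
        + ((code >= MAX_TWO_B) as usize)
        + ((code >= MAX_THREE_B) as usize)
}

/// Hypothetical helper: returns the first value in 0..=char::MAX on which
/// the two forms disagree, or `None` if they agree everywhere.
fn first_mismatch() -> Option<u32> {
    (0..=char::MAX as u32).find(|&c| len_utf8_branchy(c) != len_utf8_summed(c))
}

fn main() {
    assert_eq!(first_mismatch(), None);

    // Spot-check against the public `char::len_utf8` for one character
    // of each encoded width.
    for c in ['a', '\u{e9}', '\u{20ac}', '\u{1f600}'] {
        assert_eq!(len_utf8_summed(c as u32), c.len_utf8());
    }
    println!("both forms agree on every code point");
}
```

The boundaries are increasing, so a value that passes a later comparison has also passed every earlier one; that is why the sum of the three comparisons counts exactly the extra bytes.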
cuviper committed May 14, 2024
1 parent ac385a5 commit 38f14be
Showing 1 changed file with 3 additions and 9 deletions.
12 changes: 3 additions & 9 deletions library/core/src/char/methods.rs
@@ -1739,15 +1739,9 @@ impl EscapeDebugExtArgs {

 #[inline]
 const fn len_utf8(code: u32) -> usize {
-    if code < MAX_ONE_B {
-        1
-    } else if code < MAX_TWO_B {
-        2
-    } else if code < MAX_THREE_B {
-        3
-    } else {
-        4
-    }
+    1 + ((code >= MAX_ONE_B) as usize)
+        + ((code >= MAX_TWO_B) as usize)
+        + ((code >= MAX_THREE_B) as usize)
 }

 /// Encodes a raw u32 value as UTF-8 into the provided byte buffer,
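One rough way to probe the mostly-ASCII concern raised in the commit message is to time both forms over an all-ASCII input and a mixed-width input. This is only a sketch under stated assumptions, not a rigorous benchmark and not something the commit itself contains: `branchy`, `summed`, `total_len`, and `time_it` are hypothetical names, the inputs are made up, and a single timed pass with `std::time::Instant` is noisy. It should be built with optimizations, and a proper harness would be needed before drawing conclusions.

```rust
use std::hint::black_box;
use std::time::Instant;

// Boundary values assumed to match the constants in core.
const MAX_ONE_B: u32 = 0x80;
const MAX_TWO_B: u32 = 0x800;
const MAX_THREE_B: u32 = 0x10000;

/// Branching form (before the commit).
fn branchy(code: u32) -> usize {
    if code < MAX_ONE_B {
        1
    } else if code < MAX_TWO_B {
        2
    } else if code < MAX_THREE_B {
        3
    } else {
        4
    }
}

/// Summed form (after the commit).
fn summed(code: u32) -> usize {
    1 + ((code >= MAX_ONE_B) as usize)
        + ((code >= MAX_TWO_B) as usize)
        + ((code >= MAX_THREE_B) as usize)
}

/// Sums the encoded length of every char in `text` using `f`.
/// `black_box` keeps the compiler from folding the computation away.
fn total_len(text: &str, f: impl Fn(u32) -> usize) -> usize {
    text.chars().map(|c| f(black_box(c as u32))).sum()
}

/// Times one pass of `total_len` and prints the result.
fn time_it(label: &str, text: &str, f: impl Fn(u32) -> usize) {
    let start = Instant::now();
    let total = total_len(black_box(text), f);
    println!("{label}: {total} bytes in {:?}", start.elapsed());
}

fn main() {
    // Made-up inputs: one pure ASCII, one cycling through all four widths.
    let ascii = "hello world ".repeat(100_000);
    let mixed = "a\u{e9}\u{20ac}\u{1f600}".repeat(300_000);

    for (name, text) in [("ascii", &ascii), ("mixed", &mixed)] {
        time_it(&format!("{name}/branchy"), text, branchy);
        time_it(&format!("{name}/summed"), text, summed);
    }
}
```

Because the summed lengths must equal `str::len()` for the same text, the printed byte totals double as a sanity check that both forms computed the same thing.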
