Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for unchecked math #59148

Merged
merged 4 commits into from
Jun 4, 2019
Merged

add support for unchecked math #59148

merged 4 commits into from
Jun 4, 2019

Conversation

lcnr
Copy link
Contributor

@lcnr lcnr commented Mar 12, 2019

add compiler support for

/// Returns the result of an unchecked addition, resulting in
/// undefined behavior when `x + y > T::max_value()` or `x + y < T::min_value()`.
pub fn unchecked_add<T>(x: T, y: T) -> T;

/// Returns the result of an unchecked substraction, resulting in
/// undefined behavior when `x - y > T::max_value()` or `x - y < T::min_value()`.
pub fn unchecked_sub<T>(x: T, y: T) -> T;

/// Returns the result of an unchecked multiplication, resulting in
/// undefined behavior when `x * y > T::max_value()` or `x * y < T::min_value()`.
pub fn unchecked_mul<T>(x: T, y: T) -> T;

cc rust-lang/rfcs#2508

@rust-highfive
Copy link
Collaborator

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 12, 2019
@lachlansneff
Copy link

What's the purpose of these? Wrapping math is zero-overhead on every supported architecture, I believe.

@scottmcm
Copy link
Member

Previous attempt: #52205

@lachlansneff It's not zero-overhead for optimizations, though. x.wrapping_add(1) < x can be true, but x.nowrap_add(1) < x is always false.

@lcnr
Copy link
Contributor Author

lcnr commented Mar 13, 2019

@lachlansneff This PR is partially inspired by this blog post: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

Signed integer overflow: If arithmetic on an 'int' type (for example) overflows, the result is undefined. One example is that "INT_MAX+1" is not guaranteed to be INT_MIN. This behavior enables certain classes of optimizations that are important for some code. For example, knowing that INT_MAX+1 is undefined allows optimizing "X+1 > X" to "true". Knowing the multiplication "cannot" overflow (because doing so would be undefined) allows optimizing "X*2/2" to "X". While these may seem trivial, these sorts of things are commonly exposed by inlining and macro expansion. A more important optimization that this allows is for "<=" loops like this:

for (i = 0; i <= N; ++i) { ... }

In this loop, the compiler can assume that the loop will iterate exactly N+1 times if "i" is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses "int" as induction variables.

@eddyb
Copy link
Member

eddyb commented Mar 13, 2019

@lcnr Yes, but that's caused by C making the wrong default (int) easier to write, in Rust the types force you to index with usize, which optimizes well without UB.

I'd defer to @rust-lang/wg-unsafe-code-guidelines and @pcwalton for more qualified opinions on UB but I doubt we need/want this.

@lcnr
Copy link
Contributor Author

lcnr commented Mar 13, 2019

@eddyb while I don't think that the increase in performance will be that relevant, I will try some quick benchmarks when all additions are replaced with add nsw nuw to see if it would make a difference.

@strega-nil
Copy link
Contributor

I simultaneously have two opinions here:

  1. this is not an optimization that should ever be necessary with well-written code.

  2. there's no real reason not to have these, and for specific code it may make a difference, however unlikely that might be.

@hanna-kruppe
Copy link
Contributor

My main reason for not wanting these is that I fear some people will cargo-cult them in the belief that it makes their program faster when really they just escalate all integer overflow bugs in their code to instant UB.

@strega-nil
Copy link
Contributor

@rkruppe personally, I'm far more worried about existing utilities in the standard library, like transmute.

@lcnr
Copy link
Contributor Author

lcnr commented Mar 17, 2019

I have looked at the speedup of ./x.py bench when all add, sub and mul are replaced with their unchecked variants. These benchmarks are obviously not very conclusive and can be partially attributed to random noise:

The repo where the default add, sub and mul are replaced with their unchecked counterparts can be found here, in case anyone wants to do some tests on their own.

file pastebin

@lcnr
Copy link
Contributor Author

lcnr commented Mar 17, 2019

The following functions/benchmarks seem like they are actually affected by this.

  • char: to_digit
  • iter: by_ref().sum()(the speed of sumwithout by_ref() remains equal)
  • str: find, contains, match_indices, split when used with str, probably in str::pattern::TwoWaySearcher

@eddyb
Copy link
Member

eddyb commented Mar 18, 2019

iter: by_ref().sum() (the speed of sum without by_ref() remains equal)

This one, at least, is definitely wrong, sum should, on overflow:

  • panic (in debug mode)
  • wrap around (in release mode)

Otherwise, something as simple as [255u8, 1].iter().cloned().by_ref().sum() becomes UB.

Some of the others may be too - if there are any who aren't (wrong optimizations, that is), you should instead open issues about missed optimizations.

@gnzlbg
Copy link
Contributor

gnzlbg commented Mar 28, 2019

These are only exposed in core::intrinsics as unsafe fns right? AFAICT core::intrinsics is perma unstable, so I don't think people will assume that these are intended for "general" usage. I don't see much harm in exposing them here, and I've wanted these a couple of times when diagnosing perf issues and filling bugs for SIMD vectors, so I can imagine these might be useful for primitive types as well.

One thing that one can do with them is implement saturated/checked/wrapping arithmetic in Rust itself on top of these intrinsics only, instead of using the LLVM specific ones. This is not a very practical thing to do, but if the intent is to experiment, I think that's fine.

@hanna-kruppe
Copy link
Contributor

These are only exposed in core::intrinsics as unsafe fns right? AFAICT core::intrinsics is perma unstable, so I don't think people will assume that these are intended for "general" usage.

Normally when people request a feature they want to use it in their project and most people want it have a trajectory towards (eventual) stability. I don't want to put words in anyone's mouths but I see no reason to expect that these will stay perma-unstable if they're added, except by accident because everyone forgets about them.

One thing that one can do with them is implement saturated/checked/wrapping arithmetic in Rust itself on top of these intrinsics only, instead of using the LLVM specific ones.

I don't understand this at all. For wrapping arithmetic you can't do better than remove the UB on overflow and let the processor do its thing. For checked and saturating arithmetic, even if they're implemented as a code sequence that "looks before it leaps" and thus could have a non-overflowing operating embedded in it, I don't see how the optimizations that need UB-on-overflow would be applicable.

@sanxiyn sanxiyn added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 4, 2019
@sanxiyn
Copy link
Member

sanxiyn commented Apr 4, 2019

As far as I can tell, this is not waiting on technical review but waiting on decision from the relevant team. (By the way, what is the relevant team here?)

@gnzlbg
Copy link
Contributor

gnzlbg commented Apr 4, 2019

@sanxiyn these are perma unstable core intrinsics, so probably the compiler team is the relevant team here. It wouldn't hurt for the lang and lib teams to know that these might become available.

@lcnr
Copy link
Contributor Author

lcnr commented Apr 4, 2019

After reading the conversation here and some personal though I am actually slightly against adding these intrinsics to the language. The most important reason for me is that even only emitting the unchecked versions during codegen does not lead to many performance improvements. All noticeable improvements I've looked at were due to enabling UB after these changes.

Personally I would like to only keep the undefined add/sub/mul for codegen and emit them for operations which are guaranteed to not overflow thanks to some kind of range analysis. In case someone wants to use these intrinsics, it should then be possible with the following code:

// this currently does not generate add nuw instead of a simple add instruction,
// requires some future optimizations
a.checked_add(b).unwrap_or_else(|| core::hint::unreachable_unchecked())

@Centril Centril added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 28, 2019
@Centril
Copy link
Contributor

Centril commented Apr 28, 2019

Adding T-Lang + T-Libs for the question of "do we want to expose this to users?" and T-Compiler for "do we want to use this internally for something else?" and nominating for all teams.

@alexcrichton alexcrichton removed the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 1, 2019
@alexcrichton
Copy link
Member

The libs team discussed this during triage and concluded that we see no issues including these eventually in the standard library. Stabilization would certainly be a different story, but adding them eventually was not objected to by anyone.

@scottmcm
Copy link
Member

scottmcm commented May 1, 2019

this is not an optimization that should ever be necessary with well-written code.

I'll note that we're already doing weird things to hack around the lack of these, for example

fn next(&mut self) -> Option<A> {
if self.start < self.end {
// We check for overflow here, even though it can't actually
// happen. Adding this check does however help llvm vectorize loops
// for some ranges that don't get vectorized otherwise,
// and this won't actually result in an extra check in an optimized build.
if let Some(mut n) = self.start.add_usize(1) {
mem::swap(&mut n, &mut self.start);
Some(n)
} else {
None
}
} else {
None
}
}

I'd much rather see that code as "I'm using add_unchecked(1) here for optimization and here's why it's safe" than the current "this is super weird but happens to work right now".

Even if these intrinsics never stabilize, I wouldn't be surprised to find a handful of places they're worth it in super-core library pieces like Range above or maybe things like slice iterators.

@scottmcm
Copy link
Member

scottmcm commented Jun 2, 2019

@lcnr Based on the commits now showing in here, it looks like something went awry in resolving the conflicts? Maybe try the rebase again?

@RalfJung
Copy link
Member

RalfJung commented Jun 2, 2019

And in particular, please rebase, don't merge.

@lcnr
Copy link
Contributor Author

lcnr commented Jun 2, 2019

this was the weirdest rebase I have ever done...
seems like it is somewhat sorted now, even if it still seems like I am adding fn not to llvm/builder.
Should be fine otherwise :D

Edit: nevermind, I just didn't fully understand the changes made by @eddyb

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:1bf9e0d9:start=1559506424020022212,finish=1559506426107512035,duration=2087489823
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
$ export AWS_ACCESS_KEY_ID=AKIA46X5W6CZEJZ6XT55

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@lcnr
Copy link
Contributor Author

lcnr commented Jun 3, 2019

@eddyb should be ready for review

@lcnr
Copy link
Contributor Author

lcnr commented Jun 3, 2019

Discussed in the compiler team triage meeting. General consensus was that we're ok landing this, it shouldn't add a big maintenance burden etc. Obviously, we would want to see tests that these are unstable and not usable outside rustc.

Added tests for both unsafe and unstable.
@nikomatsakis @eddyb

@eddyb
Copy link
Member

eddyb commented Jun 3, 2019

@bors r+

@bors
Copy link
Contributor

bors commented Jun 3, 2019

📌 Commit d7e0834 has been approved by eddyb

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 3, 2019
bors added a commit that referenced this pull request Jun 3, 2019
add support for unchecked math

add compiler support for
```rust
/// Returns the result of an unchecked addition, resulting in
/// undefined behavior when `x + y > T::max_value()` or `x + y < T::min_value()`.
pub fn unchecked_add<T>(x: T, y: T) -> T;

/// Returns the result of an unchecked substraction, resulting in
/// undefined behavior when `x - y > T::max_value()` or `x - y < T::min_value()`.
pub fn unchecked_sub<T>(x: T, y: T) -> T;

/// Returns the result of an unchecked multiplication, resulting in
/// undefined behavior when `x * y > T::max_value()` or `x * y < T::min_value()`.
pub fn unchecked_mul<T>(x: T, y: T) -> T;
```

cc rust-lang/rfcs#2508
@bors
Copy link
Contributor

bors commented Jun 3, 2019

⌛ Testing commit d7e0834 with merge e22b7a3...

@bors
Copy link
Contributor

bors commented Jun 4, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: eddyb
Pushing e22b7a3 to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.