Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize: add at least a small cost for most instructions #31455

Closed
wants to merge 2 commits into from

Conversation

vtjnash
Copy link
Sponsor Member

@vtjnash vtjnash commented Mar 22, 2019

This should better reflect the aggregate expected cost of their presence
and restrict endless inlining: SROA may eliminate many of them, but
probably not all of them.

This was part of #31338, but I realized that I need to separate the nanosoldier results.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 22, 2019

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@timholy
Copy link
Sponsor Member

timholy commented Mar 23, 2019

Looks like some regressions in array handling. Maybe pair this with an increase in the cost thresholds?

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 23, 2019

That would somewhat defeat the purpose of this. In the benchmark, it looks like this improved performance by 10% on many benchmarks, and up to 40% on others. I would rather understand why that one benchmark is not being optimized well.

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@timholy
Copy link
Sponsor Member

timholy commented Mar 24, 2019

That would somewhat defeat the purpose of this.

Depends on the purpose; it wouldn't defeat the "endless inlining" (in the sense of infinite). But agreed, best to focus on the few failures.

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 25, 2019

@nanosoldier runbenchmarks("broadcast", vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 29, 2019

Results seem pretty good to me now, so I'll plan to merge over the weekend, unless there's any more comments.

@KristofferC
Copy link
Sponsor Member

Why did you only run the broadcast benchmarks? There where categories with regression outside of that.

ID time ratio memory ratio
["array", "index", "("sumvector_view", "1:100000")"] 1.03 (50%) 1.40 (1%) ❌
["array", "index", "("sumvector_view", "BitArray{2}")"] 1.60 (50%) ❌ 2.49 (1%) ❌
["array", "index", "("sumvector_view", "SubArray{Float32,2,Base.Res...},true}")"] 2.26 (50%) ❌ 2.49 (1%) ❌
["array", "index", "("sumvector_view", "SubArray{Int32,2,Base.ReshapedArray{Int...true}")"] 2.23 (50%) ❌ 2.49 (1%) ❌

for example

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 29, 2019

Those are the only ones that would have changed. I looked into several others. Most were expected/intended. The usual causes were an insufficient benchmark complexity, that's now just measuring overhead noise (e.g. scalar iteration), and inaccurate cost modeling issues (harder to deal with, but maybe we should store the result for boundscheck=none at least? this problem existed before, we just don't have good benchmark coverage of the case.)

@mbauman
Copy link
Sponsor Member

mbauman commented Mar 29, 2019

View construction and indexing seems like quite a relevant microbenchmark... and is something we've tried hard to make as fast as possible. Why can't we just manually inline whatever is throwing a wrench into things here?

Edit: oh, without looking I'd bet it's the generic first that's not getting inlined. Alright, if that's the case I'm slightly more sympathetic.

@vtjnash vtjnash added the status:triage This should be discussed on a triage call label Apr 15, 2019
@StefanKarpinski StefanKarpinski removed the status:triage This should be discussed on a triage call label May 9, 2019
@JeffBezanson JeffBezanson self-requested a review May 9, 2019 18:30
@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Jul 27, 2020

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Jul 28, 2020

Wow, okay, that seems quite bad

@vtjnash vtjnash force-pushed the jn/getfield-cost branch 2 times, most recently from 185b62a to dd5a0a8 Compare June 4, 2021 03:38
@vtjnash

This comment has been minimized.

@nanosoldier

This comment has been minimized.

This removes the dependence on inlining for performance.
This should better reflect the aggregate expected cost of their presence
and restrict endless inlining: SROA may eliminate many of them, but
probably not all of them.
@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Jun 4, 2021

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @christopher-dG

@mbauman mbauman added the compiler:optimizer Optimization passes (mostly in base/compiler/ssair/) label Jun 4, 2021
@vtjnash vtjnash closed this Nov 22, 2021
@vtjnash vtjnash deleted the jn/getfield-cost branch November 22, 2021 21:31
@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Nov 22, 2021

Not sure that it is worthwhile trying to figure out those regressions for this minor gain

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:optimizer Optimization passes (mostly in base/compiler/ssair/)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants