exact BigFloat to IEEE FP conversion in pure Julia #50691

Conversation
""" | ||
ieee754_exponent_min(::Type{T}) where {T<:IEEEFloat} = Int(1 - exponent_max(T))::Int | ||
|
||
exponent_min(::Type{Float16}) = ieee754_exponent_min(Float16) |
Do these functions need to exist?
You mean why is `exponent_min` necessary in addition to `ieee754_exponent_min`?

The idea was to have `exponent_min` be consistent with functions like `exponent_max`, in that it only has methods that are strictly required. To be honest, I don't really understand the significance of that; I was just trying to follow the discussion in #50654, the "incorrectly switches the position / ordering of `Union` and `Type`" part, in particular.

Apart from that, the formula in the definition of `ieee754_exponent_min` comes from the IEEE 754 standard, and it's only known to be valid for IEEE 754 floating-point. So theoretically we could have non-IEEE floating-point types with different definitions of `exponent_min`.
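To make that IEEE 754 relationship concrete, here's a small standalone sketch (mine, not code from the PR) checking that emin = 1 - emax holds for the built-in IEEE types; `my_exponent_max` and `my_exponent_min` are hypothetical stand-ins for the internal helpers discussed above:

```julia
# Hypothetical stand-ins for the internal helpers discussed above;
# Base has its own (unexported) versions.
my_exponent_max(::Type{Float16}) = 15
my_exponent_max(::Type{Float32}) = 127
my_exponent_max(::Type{Float64}) = 1023

# The IEEE 754 identity: emin = 1 - emax (valid only for IEEE formats).
my_exponent_min(::Type{T}) where {T} = 1 - my_exponent_max(T)

# Check against the standard's emin values: -14, -126, -1022.
@assert my_exponent_min(Float16) == -14
@assert my_exponent_min(Float32) == -126
@assert my_exponent_min(Float64) == -1022
```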
```julia
end

# ±floatmax(T)
function ieee754_representation(
```
Do any of these functions need to exist? Why are they better than `floatmax` / `Inf` etc.?
The difference is that they give the representation of a floating-point number as an unsigned integer. In mpfr.jl, in `to_ieee754`, that representation is exactly what I need at that point, so it leads to simpler and, I suppose, slightly faster code.

Apart from that, these additions to float.jl could be used (in a future PR) to refactor float.jl so as to remove almost all "magic constants" from the file. Taking `floatmax` as an example: it is currently defined with a magic constant for each method, when it would be prettier to define it via `ieee754_representation`.
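For illustration, here's a rough sketch of what that refactoring could look like (my own hypothetical code, not the PR's actual `ieee754_representation`): the bit pattern of `floatmax(Float32)` built from the field widths instead of a magic constant.

```julia
# Hypothetical sketch, not the PR's implementation: derive the bit
# pattern of floatmax(Float32) from the exponent/significand widths.
const SIG_BITS = 23  # significand field width for Float32
const EXP_BITS = 8   # exponent field width for Float32

function floatmax_representation()
    e = (UInt32(1) << EXP_BITS) - UInt32(2)   # largest finite exponent field (0xfe)
    m = (UInt32(1) << SIG_BITS) - UInt32(1)   # all-ones significand
    return (e << SIG_BITS) | m                # sign bit is zero
end

@assert reinterpret(Float32, floatmax_representation()) == floatmax(Float32)
```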
Thank you so much for writing this! I have a few minor style nits, but overall it looks really good.

Added more tests.

Tweaked the new test in

Ping?
There's lots of code, but most of it seems like it will be useful in general. For example, I think I'll use the changes in float.jl and rounding.jl to improve the JuliaLang#49749 PR. The changes in float.jl could also be used to refactor float.jl to remove many magic constants.

Benchmarking script:

```julia
using BenchmarkTools
f(::Type{T} = BigFloat, n::Int = 2000) where {T} = rand(T, n)
g!(u, v) = map!(eltype(u), u, v)
@btime g!(u, v) setup=(u = f(Float16); v = f();)
@btime g!(u, v) setup=(u = f(Float32); v = f();)
@btime g!(u, v) setup=(u = f(Float64); v = f();)
```

On master (dc06468):

```
46.116 μs (0 allocations: 0 bytes)
38.842 μs (0 allocations: 0 bytes)
37.039 μs (0 allocations: 0 bytes)
```

With both this commit and JuliaLang#50674 applied:

```
42.310 μs (0 allocations: 0 bytes)
42.661 μs (0 allocations: 0 bytes)
41.608 μs (0 allocations: 0 bytes)
```

So, with this benchmark at least, on an AMD Zen 2 laptop, conversion to `Float16` is faster, but there's a slowdown for `Float32` and `Float64`.

Fixes JuliaLang#50642 (exact conversion to `Float16`)
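As a quick sanity check of the exactness the PR title promises, here's a round-trip test of my own (not the PR's actual test suite): a `Float16` converted to `BigFloat` and back should come out bit-identical.

```julia
using Test
# Round-trip every Float16 bit pattern through BigFloat; exact
# conversion means we get the same value back (NaNs skipped, since
# NaN != NaN and payload bits aren't preserved).
for bits in typemin(UInt16):typemax(UInt16)
    x = reinterpret(Float16, bits)
    isnan(x) && continue
    @test Float16(BigFloat(x)) === x
end
```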
Added another test set.

I think this is good to go.

ping

CI is green on this. @oscardssmith Is this good to merge? I don't see any "approval" reviews on this.
Fixes issues introduced in JuliaLang#50691 and found in JuliaLang#55906:

* use the `@inbounds` and `@boundscheck` macros in rawbigints, for catching OOB with `--check-bounds=yes`
* fix OOB in `truncate`

The fixes need backporting to v1.11, so this would ideally be merged before JuliaLang#55906.
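For context, the first bullet refers to the standard Julia idiom below (a generic sketch with a hypothetical accessor, not the actual rawbigints code): an explicit `@boundscheck` block can be elided when the caller is inlined under `@inbounds`, but it still runs under `--check-bounds=yes`, which is what lets CI catch out-of-bounds accesses.

```julia
# Generic sketch of the @boundscheck/@inbounds pattern (hypothetical
# accessor, not the actual rawbigints code).
@inline function get_word(words::Vector{UInt64}, i::Int)
    @boundscheck checkbounds(words, i)  # elidable, but active with --check-bounds=yes
    return @inbounds words[i]
end
```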