
exact BigFloat to IEEE FP conversion in pure Julia #50691

Merged: 2 commits into JuliaLang:master from nsajko:bf2f, Aug 21, 2023

Conversation

@nsajko (Contributor) commented Jul 27, 2023

There's lots of code, but most of it seems like it will be useful in general. For example, I think I'll use the changes in float.jl and rounding.jl to improve the #49749 PR. The changes in float.jl could also be used to refactor float.jl to remove many magic constants.

Benchmarking script:

```julia
using BenchmarkTools
f(::Type{T} = BigFloat, n::Int = 2000) where {T} = rand(T, n)
g!(u, v) = map!(eltype(u), u, v)
@btime g!(u, v) setup=(u = f(Float16); v = f();)
@btime g!(u, v) setup=(u = f(Float32); v = f();)
@btime g!(u, v) setup=(u = f(Float64); v = f();)
```

On master (dc06468):

```
  46.116 μs (0 allocations: 0 bytes)
  38.842 μs (0 allocations: 0 bytes)
  37.039 μs (0 allocations: 0 bytes)
```

With both this commit and #50674 applied:

```
  42.310 μs (0 allocations: 0 bytes)
  42.661 μs (0 allocations: 0 bytes)
  41.608 μs (0 allocations: 0 bytes)
```

So, with this benchmark at least, on an AMD Zen 2 laptop, conversion to Float16 is faster, but there's a slowdown for Float32 and Float64.

Fixes #50642 (exact conversion to Float16)

@nsajko nsajko marked this pull request as draft July 27, 2023 18:51
@nsajko nsajko marked this pull request as ready for review July 28, 2023 00:37
"""
ieee754_exponent_min(::Type{T}) where {T<:IEEEFloat} = Int(1 - exponent_max(T))::Int

exponent_min(::Type{Float16}) = ieee754_exponent_min(Float16)
Member

Do these functions need to exist?

Contributor Author

You mean why is exponent_min necessary in addition to ieee754_exponent_min?

The idea was to have exponent_min be consistent with functions like exponent_max, in that it only has methods that are strictly required. To be honest, I don't really understand the significance of that, I was just trying to follow the discussion in #50654, the "incorrectly switches the position / ordering of Union and Type" part, in particular.

Apart from that, the formula in the definition of ieee754_exponent_min comes from the IEEE 754 standard, so it's only known to be valid for IEEE 754 floating-point types. Theoretically, there could be non-IEEE floating-point types with different definitions of exponent_min.
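For illustration, the IEEE 754 relation emin = 1 - emax can be checked against values Julia already exposes through its public API (a sketch, not the PR's code):

```julia
# Sketch: IEEE 754 ties the minimum normal exponent to the maximum one
# via emin = 1 - emax. Check this against the exponents of the extreme
# positive normal values, floatmax(T) and floatmin(T).
for (T, emax) in ((Float16, 15), (Float32, 127), (Float64, 1023))
    @assert exponent(floatmax(T)) == emax
    @assert exponent(floatmin(T)) == 1 - emax  # emin
end
```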

```julia
end

# ±floatmax(T)
function ieee754_representation(
```
Member

Do any of these functions need to exist? Why are they better than floatmax/ Inf etc?

Contributor Author

The difference is that they give the representation of a floating-point number as an unsigned integer. In mpfr.jl, in to_ieee754, that representation is exactly what I need at that point, so it leads to simpler and, I suppose, slightly faster code.

Apart from that, these additions to float.jl could actually be used (in a future PR) for refactoring float.jl so as to remove almost all "magic constants" from the file. So, taking floatmax for example: floatmax is currently defined with a magic constant for each method, when it would be prettier to define it via ieee754_representation.
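As a hypothetical illustration of that refactoring idea (the function name here is invented, not the PR's actual helper), floatmax(Float64) could be assembled from its IEEE 754 field layout instead of a literal bit pattern:

```julia
# Hypothetical sketch (not the PR's ieee754_representation): build the
# bit pattern of floatmax(Float64) from its IEEE 754 fields rather than
# writing the magic constant 0x7fefffffffffffff directly.
function floatmax_bits_f64()
    exponent_field = UInt64(0x7fe) << 52       # largest finite biased exponent
    significand_field = (UInt64(1) << 52) - 1  # all 52 significand bits set
    return exponent_field | significand_field  # sign bit stays zero
end
```

Then `reinterpret(Float64, floatmax_bits_f64()) == floatmax(Float64)` holds.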

@oscardssmith (Member)

Thank you so much for writing this! I have a few minor style nits, but overall it looks really good.

@nsajko (Contributor Author) commented Jul 28, 2023

Added more tests.

@nsajko (Contributor Author) commented Jul 29, 2023

Tweaked the new test in test/rounding.jl.

@nsajko (Contributor Author) commented Aug 2, 2023

Ping?

@nsajko (Contributor Author) commented Aug 11, 2023

Added another test set.

@oscardssmith (Member)

I think this is good to go.

@nsajko (Contributor Author) commented Aug 20, 2023

ping

@oscardssmith oscardssmith added the status:merge me PR is reviewed. Merge when all tests are passing label Aug 20, 2023
@DilumAluthge (Member)

CI is green on this.

@oscardssmith Is this good to merge? I don't see any "approval" reviews on this.

@DilumAluthge DilumAluthge merged commit ac607dc into JuliaLang:master Aug 21, 2023
7 checks passed
@DilumAluthge DilumAluthge removed the status:merge me PR is reviewed. Merge when all tests are passing label Aug 21, 2023
@nsajko nsajko deleted the bf2f branch August 21, 2023 04:19
nsajko added a commit to nsajko/julia that referenced this pull request Sep 28, 2024
Fixes issues introduced in JuliaLang#50691 and found in JuliaLang#55906:
* use inbounds and boundscheck macros in rawbigints, for catching OOB
  with `--check-bounds=yes`
* fix OOB in `truncate`

The fixes need backporting to v1.11, so this would ideally be merged
before JuliaLang#55906.
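The `@inbounds`/`@boundscheck` pairing referred to above follows a standard Julia idiom, sketched generically here (illustrative only, not the rawbigints code itself):

```julia
# Illustrative sketch of the @boundscheck/@inbounds idiom: the check runs
# normally (and under --check-bounds=yes), but when an inlined caller
# wraps the call in @inbounds, the block marked @boundscheck is elided.
@inline function checked_get_demo(v::Vector{T}, i::Int) where {T}
    @boundscheck checkbounds(v, i)  # elidable bounds check
    return @inbounds v[i]           # access is safe once the check passed
end
```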
giordano pushed a commit that referenced this pull request Sep 29, 2024
Fixes issues introduced in #50691 and found in #55906:
* use `@inbounds` and `@boundscheck` macros in rawbigints, for catching
OOB with `--check-bounds=yes`
* fix OOB in `truncate`
KristofferC pushed a commit that referenced this pull request Sep 30, 2024
Fixes issues introduced in #50691 and found in #55906:
* use `@inbounds` and `@boundscheck` macros in rawbigints, for catching
OOB with `--check-bounds=yes`
* fix OOB in `truncate`

(cherry picked from commit 17445fe)
Labels: domain:bignums (BigInt and BigFloat), domain:float16, kind:bugfix (this change fixes an existing bug)
Successfully merging this pull request may close these issues.

BigFloat incorrectly rounded to Float16 (subnormals)