-
-
Notifications
You must be signed in to change notification settings - Fork 851
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AVX conversions for Luminance and Rgb #2039
Conversation
Looks good overall but benchmark results look strange, ran your fork on my machine:
Note that I'm using win10. There's no way
We had similar speedup of the P.S. It's also worth checking whether |
Using Linux, I converted a BMP image into Jpeg, using System.Drawing.Common package. Looks decent to me: System.Drawing.Common is indeed using Libgdiplus, which in turn is using libjpeg. See: Btw, I ran my banchmark on this configuration:
|
ref Vector256<float> destRef = ref yBlock.V0; | ||
|
||
const int bytesPerL8Stride = 8; | ||
for (int i = 0; i < 8; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for (int i = 0; i < 8; i++) | |
for (nint i = 0; i < bytesPerL8Stride; i++) |
Is 8
from the condition by coincidence the same as bytesPerL8Stride
? If not, just use 8
, but change to nint
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it is not. 8 is the number of floats in the destination Vector256. Using the nint
type does indeed eliminate the cast to IntPtr.
Thanks for the feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gfoidl for the reviews, always welcome!
const int bytesPerL8Stride = 8; | ||
for (int i = 0; i < 8; i++) | ||
{ | ||
Unsafe.Add(ref destRef, i) = Avx2.ConvertToVector256Single(Avx2.ConvertToVector256Int32(Unsafe.AddByteOffset(ref l8ByteSpan, (IntPtr)(bytesPerL8Stride * i)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsafe.Add(ref destRef, i) = Avx2.ConvertToVector256Single(Avx2.ConvertToVector256Int32(Unsafe.AddByteOffset(ref l8ByteSpan, (IntPtr)(bytesPerL8Stride * i)))); | |
Unsafe.Add(ref destRef, i) = Avx2.ConvertToVector256Single(Avx2.ConvertToVector256Int32(Unsafe.AddByteOffset(ref l8ByteSpan, bytesPerL8Stride * i))); |
The cast to IntPtr
can go away with nint
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Vector256<byte> rgb, rg, bx; | ||
|
||
const int bytesPerRgbStride = 24; | ||
for (int i = 0; i < 8; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for (int i = 0; i < 8; i++) | |
for (nint i = 0; i < 8; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
const int bytesPerRgbStride = 24; | ||
for (int i = 0; i < 8; i++) | ||
{ | ||
rgb = Avx2.PermuteVar8x32(Unsafe.AddByteOffset(ref rgbByteSpan, (IntPtr)(bytesPerRgbStride * i)).AsUInt32(), extractToLanesMask).AsByte(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rgb = Avx2.PermuteVar8x32(Unsafe.AddByteOffset(ref rgbByteSpan, (IntPtr)(bytesPerRgbStride * i)).AsUInt32(), extractToLanesMask).AsByte(); | |
rgb = Avx2.PermuteVar8x32(Unsafe.AddByteOffset(ref rgbByteSpan, bytesPerRgbStride * i).AsUInt32(), extractToLanesMask).AsByte(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I can't judge output quality without original input image, encoding parameters, Libjpeg is, of course, faster than ImageSharp but encoder part of the jpeg codec is almost complete wiht all the optimizations from libjpeg (not even counting stuff which ImageSharp has and libjpeg doesn't). All I want is to be sure that these results are true so 'ImageSharp is slow' won't happen again due to some incorrect encoding setup. IMO we should either resolve these performance questions here or at least open an issue/discussion. |
@br3aker Expanded benchmarks including a minimum of SkiaSharp is probably wise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for you help here!
Prerequisites
Description
This PR fixed #1763
Implemented AVX versions of Luminance and RGB conversions in JPEG Encoding, as described in the issue.
Added Benchmarks for these cases.
Results with and without AVX: