Add SSE2 versions of ClampedAddSubtractFull and ClampedAddSubtractHalf #1805

Merged: brianpopow merged 12 commits into master from bp/clampedaddsubtractsse on Nov 8, 2021

brianpopow (Collaborator) commented Nov 2, 2021

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practices as demonstrated in the repository. These follow strict StyleCop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This PR adds SSE2 versions of ClampedAddSubtractFull and ClampedAddSubtractHalf, which are used when encoding and decoding lossless WebP.
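
For readers unfamiliar with these two helpers, here is a minimal sketch of what they compute, written in C for illustration rather than taken from this PR's C# code. It assumes the definitions used by the WebP lossless predictors: the "full" variant clamps `c0 + c1 - c2` per channel, and the "half" variant clamps `avg + (avg - c2) / 2` with `avg` being the per-channel average of `c0` and `c1`. All function names below are hypothetical, and the SSE2 path only shows the general widen, add, subtract, saturating-pack idea; the actual ImageSharp implementation may differ.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Clamp a signed intermediate value to one channel's 0..255 range. */
static int clip255(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Scalar reference: per channel, clamp(c0 + c1 - c2). */
static uint32_t clamped_add_subtract_full(uint32_t c0, uint32_t c1, uint32_t c2) {
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const int a = (int)((c0 >> shift) & 0xff);
        const int b = (int)((c1 >> shift) & 0xff);
        const int c = (int)((c2 >> shift) & 0xff);
        out |= (uint32_t)clip255(a + b - c) << shift;
    }
    return out;
}

/* Per-channel average of two packed pixels, rounded down. */
static uint32_t average2(uint32_t a, uint32_t b) {
    return (((a ^ b) & 0xfefefefeu) >> 1) + (a & b);
}

/* Scalar reference: with avg = average2(c0, c1), compute per channel
   clamp(avg + (avg - c2) / 2); the division truncates toward zero. */
static uint32_t clamped_add_subtract_half(uint32_t c0, uint32_t c1, uint32_t c2) {
    const uint32_t avg = average2(c0, c1);
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const int a = (int)((avg >> shift) & 0xff);
        const int b = (int)((c2 >> shift) & 0xff);
        out |= (uint32_t)clip255(a + (a - b) / 2) << shift;
    }
    return out;
}

/* Same result as clamped_add_subtract_full; uses SSE2 when the
   compiler targets it and falls back to the scalar code otherwise. */
static uint32_t clamped_add_subtract_full_simd(uint32_t c0, uint32_t c1, uint32_t c2) {
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();
    /* Widen the four 8-bit channels of each pixel to 16-bit lanes. */
    const __m128i v0 = _mm_unpacklo_epi8(_mm_cvtsi32_si128((int)c0), zero);
    const __m128i v1 = _mm_unpacklo_epi8(_mm_cvtsi32_si128((int)c1), zero);
    const __m128i v2 = _mm_unpacklo_epi8(_mm_cvtsi32_si128((int)c2), zero);
    /* Signed 16-bit math cannot overflow here: the range is -255..510. */
    const __m128i sum = _mm_sub_epi16(_mm_add_epi16(v0, v1), v2);
    /* The unsigned saturating pack clamps each lane to 0..255. */
    const __m128i packed = _mm_packus_epi16(sum, sum);
    return (uint32_t)_mm_cvtsi128_si32(packed);
#else
    return clamped_add_subtract_full(c0, c1, c2);
#endif
}

/* Hand-computed examples; the scalar and SIMD paths must agree. */
void check_examples(void) {
    const uint32_t c0 = 0xFF804020u, c1 = 0x10204080u, c2 = 0x00102030u;
    assert(clamped_add_subtract_full(c0, c1, c2) == 0xFF906070u);
    assert(clamped_add_subtract_half(c0, c1, c2) == 0xCA705060u);
    assert(clamped_add_subtract_full_simd(c0, c1, c2) == 0xFF906070u);
}
```

The point of the vectorized form is that the saturating pack replaces four separate per-channel clamps with one instruction, which is presumably where a speed-up in the predictor code would come from.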

Related to: #1786

TODO:

  • Add tests.

Before:
[image: before]

After:
[image: after]

Benchmark results encoding:

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19043.1320 (21H1/May2021Update)
Intel Core i7-6700K CPU 4.00GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-rc.2.21505.57
  [Host]     : .NET 5.0.11 (5.0.1121.47308), X64 RyuJIT
  Job-RZASZW : .NET 5.0.11 (5.0.1121.47308), X64 RyuJIT
  Job-RSRLCF : .NET Core 3.1.20 (CoreCLR 4.700.21.47003, CoreFX 4.700.21.47101), X64 RyuJIT
  Job-JFGCRR : .NET Framework 4.8 (4.8.4420.0), X64 RyuJIT

IterationCount=3  LaunchCount=1  WarmupCount=3

|                     Method |        Job |              Runtime |             Arguments |    TestImage |      Mean |     Error |   StdDev | Ratio | RatioSD |       Gen 0 |     Gen 1 |     Gen 2 |  Allocated |
|--------------------------- |----------- |--------------------- |---------------------- |------------- |----------:|----------:|---------:|------:|--------:|------------:|----------:|----------:|-----------:|
|        'Magick Webp Lossy' | Job-RZASZW |             .NET 5.0 | /p:DebugType=portable | Png/Bike.png |  23.36 ms |  2.292 ms | 0.126 ms |  0.14 |    0.00 |           - |         - |         - |      68 KB |
|    'ImageSharp Webp Lossy' | Job-RZASZW |             .NET 5.0 | /p:DebugType=portable | Png/Bike.png | 247.26 ms | 29.033 ms | 1.591 ms |  1.53 |    0.00 | 135000.0000 |         - |         - | 552,714 KB |
|     'Magick Webp Lossless' | Job-RZASZW |             .NET 5.0 | /p:DebugType=portable | Png/Bike.png | 161.25 ms | 11.824 ms | 0.648 ms |  1.00 |    0.00 |           - |         - |         - |     519 KB |
| 'ImageSharp Webp Lossless' | Job-RZASZW |             .NET 5.0 | /p:DebugType=portable | Png/Bike.png | 305.26 ms | 66.906 ms | 3.667 ms |  1.89 |    0.03 |  34000.0000 | 5000.0000 | 2000.0000 | 161,674 KB |
|                            |            |                      |                       |              |           |           |          |       |         |             |           |           |            |
|        'Magick Webp Lossy' | Job-RSRLCF |        .NET Core 3.1 |               Default | Png/Bike.png |  23.38 ms |  0.190 ms | 0.010 ms |  0.14 |    0.00 |           - |         - |         - |      68 KB |
|    'ImageSharp Webp Lossy' | Job-RSRLCF |        .NET Core 3.1 |               Default | Png/Bike.png | 256.32 ms | 36.712 ms | 2.012 ms |  1.59 |    0.01 | 135000.0000 |         - |         - | 552,713 KB |
|     'Magick Webp Lossless' | Job-RSRLCF |        .NET Core 3.1 |               Default | Png/Bike.png | 161.41 ms |  8.593 ms | 0.471 ms |  1.00 |    0.00 |           - |         - |         - |     523 KB |
| 'ImageSharp Webp Lossless' | Job-RSRLCF |        .NET Core 3.1 |               Default | Png/Bike.png | 321.77 ms | 30.517 ms | 1.673 ms |  1.99 |    0.01 |  34000.0000 | 5000.0000 | 2000.0000 | 161,673 KB |
|                            |            |                      |                       |              |           |           |          |       |         |             |           |           |            |
|        'Magick Webp Lossy' | Job-JFGCRR | .NET Framework 4.7.2 |               Default | Png/Bike.png |  23.43 ms |  1.900 ms | 0.104 ms |  0.14 |    0.00 |           - |         - |         - |      68 KB |
|    'ImageSharp Webp Lossy' | Job-JFGCRR | .NET Framework 4.7.2 |               Default | Png/Bike.png | 377.71 ms | 34.625 ms | 1.898 ms |  2.33 |    0.02 | 135000.0000 |         - |         - | 554,351 KB |
|     'Magick Webp Lossless' | Job-JFGCRR | .NET Framework 4.7.2 |               Default | Png/Bike.png | 161.85 ms | 13.271 ms | 0.727 ms |  1.00 |    0.00 |           - |         - |         - |     520 KB |
| 'ImageSharp Webp Lossless' | Job-JFGCRR | .NET Framework 4.7.2 |               Default | Png/Bike.png | 387.72 ms | 14.809 ms | 0.812 ms |  2.40 |    0.01 |  34000.0000 | 5000.0000 | 2000.0000 | 162,118 KB |

Benchmark results decoding:

|                     Method |        Job |              Runtime |             Arguments |        TestImageLossy |        TestImageLossless |       Mean |       Error |   StdDev |    Gen 0 | Gen 1 | Gen 2 | Allocated |
|--------------------------- |----------- |--------------------- |---------------------- |---------------------- |------------------------- |-----------:|------------:|---------:|---------:|------:|------:|----------:|
|        'Magick Lossy Webp' | Job-RXAKRB |             .NET 5.0 | /p:DebugType=portable | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   111.0 ms |    40.93 ms |  2.24 ms |        - |     - |     - |     25 KB |
|    'ImageSharp Lossy Webp' | Job-RXAKRB |             .NET 5.0 | /p:DebugType=portable | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   286.2 ms |   108.16 ms |  5.93 ms |        - |     - |     - |  2,428 KB |
|     'Magick Lossless Webp' | Job-RXAKRB |             .NET 5.0 | /p:DebugType=portable | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   115.0 ms |    72.28 ms |  3.96 ms |        - |     - |     - |     16 KB |
| 'ImageSharp Lossless Webp' | Job-RXAKRB |             .NET 5.0 | /p:DebugType=portable | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   311.2 ms |   529.47 ms | 29.02 ms |        - |     - |     - |  2,091 KB |
|        'Magick Lossy Webp' | Job-SHZFXZ |        .NET Core 3.1 |               Default | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   124.1 ms |    74.47 ms |  4.08 ms |        - |     - |     - |     25 KB |
|    'ImageSharp Lossy Webp' | Job-SHZFXZ |        .NET Core 3.1 |               Default | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   308.8 ms |   271.16 ms | 14.86 ms | 500.0000 |     - |     - |  2,428 KB |
|     'Magick Lossless Webp' | Job-SHZFXZ |        .NET Core 3.1 |               Default | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   112.0 ms |    43.24 ms |  2.37 ms |        - |     - |     - |     15 KB |
| 'ImageSharp Lossless Webp' | Job-SHZFXZ |        .NET Core 3.1 |               Default | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   462.8 ms |   188.74 ms | 10.35 ms |        - |     - |     - |  2,092 KB |
|        'Magick Lossy Webp' | Job-VLFICQ | .NET Framework 4.7.2 |               Default | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   108.5 ms |    66.75 ms |  3.66 ms |        - |     - |     - |     32 KB |
|    'ImageSharp Lossy Webp' | Job-VLFICQ | .NET Framework 4.7.2 |               Default | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   582.0 ms |   169.32 ms |  9.28 ms |        - |     - |     - |  2,436 KB |
|     'Magick Lossless Webp' | Job-VLFICQ | .NET Framework 4.7.2 |               Default | Webp/earth_lossy.webp | Webp/earth_lossless.webp |   109.7 ms |    28.50 ms |  1.56 ms |        - |     - |     - |     18 KB |
| 'ImageSharp Lossless Webp' | Job-VLFICQ | .NET Framework 4.7.2 |               Default | Webp/earth_lossy.webp | Webp/earth_lossless.webp | 1,865.8 ms | 1,250.04 ms | 68.52 ms |        - |     - |     - |  9,729 KB |

codecov bot commented Nov 2, 2021

Codecov Report

Merging #1805 (9fa7ac2) into master (94b9962) will increase coverage by 0.14%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1805      +/-   ##
==========================================
+ Coverage   87.19%   87.33%   +0.14%     
==========================================
  Files         936      936              
  Lines       47922    47944      +22     
  Branches     6016     6018       +2     
==========================================
+ Hits        41785    41873      +88     
+ Misses       5144     5078      -66     
  Partials      993      993              
| Flag | Coverage Δ | |
|---|---|---|
| unittests | `87.33% <100.00%> (+0.14%)` | ⬆️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/ImageSharp/Formats/Webp/Lossless/ColorCache.cs | `94.11% <ø> (ø)` | |
| .../ImageSharp/Formats/Webp/Lossless/LosslessUtils.cs | `97.54% <100.00%> (+6.72%)` | ⬆️ |
| ...ageSharp/Formats/Webp/Lossless/PredictorEncoder.cs | `98.10% <0.00%> (+5.10%)` | ⬆️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 94b9962...9fa7ac2.

brianpopow changed the title from "WIP: Add SSE2 versions of ClampedAddSubtractFull and ClampedAddSubtractHalf" to "Add SSE2 versions of ClampedAddSubtractFull and ClampedAddSubtractHalf" on Nov 2, 2021
brianpopow requested a review from a team on November 3, 2021 14:23
JimBobSquarePants (Member) left a comment

LGTM 👍

brianpopow merged commit d8d2616 into master on Nov 8, 2021
brianpopow deleted the bp/clampedaddsubtractsse branch on November 8, 2021 15:51