Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fast path to MemoryExtensions.Trim for input that needs no trimming #84210

Merged
merged 8 commits into from
Apr 2, 2023

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Apr 1, 2023

Profiler states that majority of time during Guid.Parse is spent inside .Trim() and it seems that if we can trade around 1% of perf for inputs with trailing/leading whitespaces we can get up to 45% boost for other cases, e.g.:

[Benchmark]
public Guid ParseN() => Guid.Parse("14c63ab0f7cd464f8096539cbe70a80e");

[Benchmark]
public Guid ParseB() => Guid.Parse("{8FB23982-A2C4-4F46-B915-C0CE14241EFD}");

[Benchmark]
public Guid ParseDefault() => Guid.Parse("d1f168a8-ca0e-428d-8677-8b343b745376");

[Benchmark]
public Guid ParseDefaultWhitespaces() => Guid.Parse("  d1f168a8-ca0e-428d-8677-8b343b745376  ");

string _emptyInput = "";
[Benchmark]
public bool TryParseEmpty() => Guid.TryParse(_emptyInput, out Guid _);
|                  Method |                   Toolchain |      Mean |
|------------------------ |---------------------------- |----------:|
|                  ParseN |      \Core_Root\corerun.exe | 18.802 ns |
|                  ParseN | \Core_Root_base\corerun.exe | 24.441 ns |
|                         |                             |           |
|                  ParseB |      \Core_Root\corerun.exe | 12.714 ns |
|                  ParseB | \Core_Root_base\corerun.exe | 19.173 ns |
|                         |                             |           |
|            ParseDefault |      \Core_Root\corerun.exe | 18.007 ns |
|            ParseDefault | \Core_Root_base\corerun.exe | 24.995 ns |
|                         |                             |           |
| ParseDefaultWhitespaces |      \Core_Root\corerun.exe | 25.903 ns |
| ParseDefaultWhitespaces | \Core_Root_base\corerun.exe | 25.747 ns |
|                         |                             |           |
|           TryParseEmpty |      \Core_Root\corerun.exe |  2.544 ns |
|           TryParseEmpty | \Core_Root_base\corerun.exe |  6.919 ns |

Another mystery is why B format is faster than N (reproduces accross multiple runs)

PS: grep.app for Guid.Parse with constant input: https://grep.app/search?q=Guid.Parse%28%22&filter[lang][0]=C%23 (unlikely on hot path but anyway)

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Apr 1, 2023
@ghost ghost assigned EgorBo Apr 1, 2023
@stephentoub
Copy link
Member

It's often the case that trim is used when no trimming is actually needed. Is there a way to fix this in trim instead by prioritizing that use case?

@EgorBo
Copy link
Member Author

EgorBo commented Apr 1, 2023

It's often the case that trim is used when no trimming is actually needed. Is there a way to fix this in trim instead by prioritizing that use case?

Good point, e.g. rewrite Trim() function to be:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static ReadOnlySpan<char> Trim(this ReadOnlySpan<char> span)
{
    // Assume that in the most cases input doesn't need trimming
    if (span.Length == 0 ||
        (span.Length > 0 && !char.IsWhiteSpace(span[0]) && !char.IsWhiteSpace(span[span.Length - 1])))
    {
        return span;
    }

    return TrimFallback(span);

    [MethodImpl(MethodImplOptions.NoInlining)]
    static ReadOnlySpan<char> TrimFallback(ReadOnlySpan<char> span)
    {
        int start = 0;
        for (; start < span.Length; start++)
        {
            if (!char.IsWhiteSpace(span[start]))
            {
                break;
            }
        }

        int end = span.Length - 1;
        for (; end > start; end--)
        {
            if (!char.IsWhiteSpace(span[end]))
            {
                break;
            }
        }

        return span.Slice(start, end - start + 1);
    }
}

@EgorBo
Copy link
Member Author

EgorBo commented Apr 1, 2023

@stephentoub done, it seems to even improve benchmark results. Do I need to do the same for TrimStart/TrimEnd or those are fine and rarely used?

…ns.Trim.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>
@stephentoub
Copy link
Member

Do I need to do the same for TrimStart/TrimEnd or those are fine and rarely used?

I think we can leave those alone for now. They're used much less, and the one hot path I know about in our own code is already guarded.

break;
}
}
return span.Slice(start, end - start + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the clamp helpers in the fallback?

Copy link
Member Author

@EgorBo EgorBo Apr 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we add an additional overhead for cases when input does contain whitespaces I decided to have a slightly faster fallback (inlined clamp helpers) - didn't want to mark the helpers always-inlineable. Can revert to clamp helpers if you think an extra 1ns is not worth it

EgorBo and others added 2 commits April 1, 2023 23:36
Co-authored-by: Stephen Toub <stoub@microsoft.com>
@EgorBo EgorBo changed the title Improve Guid.Parse for input without whitespaces Add fast path to MemoryExtensions.Trim for input that needs no trimming Apr 10, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 10, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants