IndexOf and LastIndexOf are for some test cases few times slower on Linux #13676

Open
adamsitnik opened this issue Oct 29, 2019 · 1 comment
Labels
area-System.Globalization os-linux Linux OS (any supported distro) tenet-performance Performance related issue

@adamsitnik
Member

For some test cases, IndexOf and LastIndexOf are several times slower on Linux than on Windows.

| Slower on Linux | Lin/Win | Win Median (ns) | Lin Median (ns) | Modality |
| --- | ---: | ---: | ---: | --- |
| System.Memory.ReadOnlySpan.IndexOfString(input: "だ", value: "た", comparisonType: InvariantCulture) | 34.21 | 107.43 | 3675.43 | |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (, None, True)) | 7.66 | 2952.92 | 22631.17 | |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, IgnoreCase, True)) | 7.62 | 2954.37 | 22515.44 | |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, None, True)) | 5.75 | 2963.99 | 17052.76 | |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (, IgnoreCase, True)) | 5.55 | 2958.29 | 16412.98 | |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, IgnoreSymbols, False)) | 4.87 | 3222.91 | 15680.86 | |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, OrdinalIgnoreCase, False)) | 3.47 | 596.96 | 2072.40 | |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, IgnoreSymbols, False)) | 3.17 | 3183.26 | 10084.88 | |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, OrdinalIgnoreCase, False)) | 3.12 | 678.58 | 2118.31 | |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, IgnoreCase, True)) | 2.73 | 3713.38 | 10119.90 | |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (, None, True)) | 2.67 | 3850.16 | 10264.14 | |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, None, True)) | 2.64 | 3935.11 | 10406.63 | |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (, IgnoreCase, True)) | 2.64 | 3707.07 | 9803.11 | |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, Ordinal, False)) | 2.36 | 131.66 | 311.09 | |
| System.Memory.ReadOnlySpan.IndexOfString(input: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAX | 2.35 | 235.63 | 553.18 | |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (pl-PL, None, False)) | 2.17 | 8300.73 | 18011.26 | |
| System.Memory.ReadOnlySpan.IndexOfString(input: "Hello WorldbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbareallyreallylongHello World | 1.72 | 45.05 | 77.69 | |
| System.Memory.ReadOnlySpan.IndexOfString(input: "More Test's", value: "Tests", comparisonType: OrdinalIgnoreCase) | 1.61 | 53.91 | 86.82 | |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (pl-PL, None, False)) | 1.37 | 9237.59 | 12681.21 | |
| System.Memory.ReadOnlySpan.IndexOfString(input: "StrIng", value: "string", comparisonType: OrdinalIgnoreCase) | 1.30 | 37.58 | 48.82 | |
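
The Lin/Win figures appear to be the Linux median divided by the Windows median; the first entry, for example, gives 3675.43 / 107.43 ≈ 34.21. Below is a minimal sketch that recomputes the ratio for three of the reported results. The class and method names (RatioCheck, LinWinRatio) and the short row labels are made up for illustration and are not part of dotnet/performance.

```csharp
using System;

public static class RatioCheck
{
    // Linux median divided by Windows median, rounded to two decimals.
    // Assumption: this is how the Lin/Win figure in the issue was derived.
    public static double LinWinRatio(double winMedianNs, double linMedianNs)
        => Math.Round(linMedianNs / winMedianNs, 2);

    public static void Main()
    {
        // (label, Win median ns, Lin median ns) copied from three reported results.
        var rows = new (string Name, double Win, double Lin)[]
        {
            ("IndexOfString InvariantCulture", 107.43, 3675.43),
            ("LastIndexOf_Word_NotFound (, None, True)", 2952.92, 22631.17),
            ("LastIndexOf_Word_NotFound (en-US, Ordinal, False)", 131.66, 311.09),
        };

        foreach (var row in rows)
            Console.WriteLine($"{row.Name}: {LinWinRatio(row.Win, row.Lin)}x slower on Linux");
    }
}
```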

How to run the benchmarks:

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Globalization*IndexOf*'  '*IndexOfString'

The recommended profilers are PerfCollect and VTune, both of which work great on Linux.

Some of the differences come from ICU (on Windows we use WinAPI); others might come from the managed hot path. Whoever picks up this issue should profile with VTune or PerfCollect, identify the problem, and solve it. It might require tuning ICU itself ;)

@adamsitnik
Member Author

I used VTune to profile the case with the biggest difference, just to make sure we aren't doing anything stupid:

using System;
using System.Runtime.CompilerServices;

class Program
{
    static int Main(string[] args)
    {
        int result = 0;

        // Repeat the call many times so the profiler collects enough samples.
        // "\u3060" is "だ" and "\u305F" is "た".
        for (int i = 0; i < 100_000; i++)
            result ^= IndexOfString("\u3060", "\u305F", StringComparison.InvariantCulture);

        // Returning the accumulated value keeps the results in use.
        return result;
    }

    // NoInlining keeps this method as a separate frame in the profile.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int IndexOfString(string input, string value, StringComparison comparisonType)
    {
        ReadOnlySpan<char> inputSpan = input.AsSpan();
        ReadOnlySpan<char> valueSpan = value.AsSpan();

        int result = 0;

        // 16 manually unrolled calls to reduce the relative loop overhead.
        result ^= inputSpan.IndexOf(valueSpan, comparisonType); result ^= inputSpan.IndexOf(valueSpan, comparisonType);
        result ^= inputSpan.IndexOf(valueSpan, comparisonType); result ^= inputSpan.IndexOf(valueSpan, comparisonType);
        result ^= inputSpan.IndexOf(valueSpan, comparisonType); result ^= inputSpan.IndexOf(valueSpan, comparisonType);
        result ^= inputSpan.IndexOf(valueSpan, comparisonType); result ^= inputSpan.IndexOf(valueSpan, comparisonType);
        result ^= inputSpan.IndexOf(valueSpan, comparisonType); result ^= inputSpan.IndexOf(valueSpan, comparisonType);
        result ^= inputSpan.IndexOf(valueSpan, comparisonType); result ^= inputSpan.IndexOf(valueSpan, comparisonType);
        result ^= inputSpan.IndexOf(valueSpan, comparisonType); result ^= inputSpan.IndexOf(valueSpan, comparisonType);
        result ^= inputSpan.IndexOf(valueSpan, comparisonType); result ^= inputSpan.IndexOf(valueSpan, comparisonType);

        return result;
    }
}

According to the VTune profile, 80% of the time is spent in usearch_openFromCollator (ICU).
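
For a quick sanity check without a profiler, one option is to time the same span IndexOf call with Ordinal and with InvariantCulture and compare the per-call cost. This is a rough Stopwatch-based sketch, not a substitute for the BenchmarkDotNet runs above; the IndexOfTiming class and MeasureNsPerCall helper are hypothetical names introduced here, and the numbers it prints will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;

public static class IndexOfTiming
{
    // Returns the approximate average cost of one span IndexOf call in nanoseconds.
    public static double MeasureNsPerCall(string input, string value,
                                          StringComparison comparison, int iterations)
    {
        int sink = 0;

        // Warm-up so JIT compilation and one-time globalization setup are not timed.
        for (int i = 0; i < 1_000; i++)
            sink ^= input.AsSpan().IndexOf(value.AsSpan(), comparison);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sink ^= input.AsSpan().IndexOf(value.AsSpan(), comparison);
        sw.Stop();

        // Keep the accumulated result in use so the calls are not treated as dead code.
        GC.KeepAlive(sink);

        return sw.Elapsed.TotalMilliseconds * 1_000_000.0 / iterations;
    }

    public static void Main()
    {
        const int iterations = 100_000;

        // Same input as the profiled repro: "だ" (U+3060) searched for "た" (U+305F).
        double ordinal = MeasureNsPerCall("\u3060", "\u305F", StringComparison.Ordinal, iterations);
        double invariant = MeasureNsPerCall("\u3060", "\u305F", StringComparison.InvariantCulture, iterations);

        Console.WriteLine($"Ordinal:          {ordinal:F2} ns/call");
        Console.WriteLine($"InvariantCulture: {invariant:F2} ns/call");
    }
}
```

Running this on both Windows and Linux gives a coarse view of how much of the gap sits in the culture-sensitive path; attributing the time to a specific ICU function still needs VTune or PerfCollect.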

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 26, 2020
@tarekgh tarekgh removed the untriaged New issue has not been triaged by the area owner label Jun 17, 2020