Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Difference in codegen of string.IsNullOrEmpty() #63116

Closed
ShreyasJejurkar opened this issue Dec 24, 2021 · 16 comments
Closed

Difference in codegen of string.IsNullOrEmpty() #63116

ShreyasJejurkar opened this issue Dec 24, 2021 · 16 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@ShreyasJejurkar
Copy link
Contributor

I was playing with JIT a little bit recently, and found the below scenario where I am checking does give string is null or empty with two different styles,

  1. With raw null and length check.
  2. and with string.IsNullOrEmpty() built-in method, which does the same raw null and length check behind the scenes.

But the collagen is a little different for both cases!
For given a program below (corresponding sharplab link)

using System;
public class C {
    public void M() {
        var someString = "";
        
        if(someString == null || someString.Length == 0) {
            Console.Write("hello");
        }
    }
    
    public void T() {
        string someString = "";
        
        if(string.IsNullOrEmpty(someString)) {
            Console.Write("hello");
        }
    }
}

Following is the resulting codegen

C.M()
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: mov ecx, [0x8d72018]
    L0009: cmp dword ptr [ecx+4], 0
    L000d: jne short L001a
    L000f: mov ecx, [0x8dce1d8]
    L0015: call System.Console.Write(System.String)
    L001a: pop ebp
    L001b: ret

C.T()
    L0000: mov ecx, [0x8dce1d8]
    L0006: call System.Console.Write(System.String)
    L000b: ret

As we can see above, codegen for string.IsNullOrEmpty(string) is quite optimized, so my question is can we do the same optimization for the raw null check call as well (OR maybe fold that raw call with the built-in method by the compiler), given the fact that the built-in also does the same behind the scenes, but still codegen is quite different!

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Dec 24, 2021
@ghost
Copy link

ghost commented Dec 24, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

I was playing with JIT a little bit recently, and found the below scenario where I am checking does give string is null or empty with two different styles,

  1. With raw null and length check.
  2. and with string.IsNullOrEmpty() built-in method, which does the same raw null and length check behind the scenes.

But the collagen is a little different for both cases!
For given a program below (corresponding sharplab link)

using System;
public class C {
    public void M() {
        var someString = "";
        
        if(someString == null || someString.Length == 0) {
            Console.Write("hello");
        }
    }
    
    public void T() {
        string someString = "";
        
        if(string.IsNullOrEmpty(someString)) {
            Console.Write("hello");
        }
    }
}

Following is the resulting codegen

C.M()
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: mov ecx, [0x8d72018]
    L0009: cmp dword ptr [ecx+4], 0
    L000d: jne short L001a
    L000f: mov ecx, [0x8dce1d8]
    L0015: call System.Console.Write(System.String)
    L001a: pop ebp
    L001b: ret

C.T()
    L0000: mov ecx, [0x8dce1d8]
    L0006: call System.Console.Write(System.String)
    L000b: ret

As we can see above, codegen for string.IsNullOrEmpty(string) is quite optimized, so my question is can we do the same optimization for the raw null check call as well (OR maybe fold that raw call with the built-in method by the compiler), given the fact that the built-in also does the same behind the scenes, but still codegen is quite different!

Author: ShreyasJejurkar
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged

Milestone: -

@huoyaoyuan
Copy link
Member

See #63095 and #4207

@ShreyasJejurkar
Copy link
Contributor Author

I can see your PR linked to the issue, will that PR will fix this issue!?
I saw some regression in diff!

@EgorBo
Copy link
Member

EgorBo commented Dec 24, 2021

@ShreyasJejurkar in your T method jit sees that someString is a string literal which is never null so it folded everything

@ShreyasJejurkar
Copy link
Contributor Author

Thanks @EgorBo for the reply! Won't JIT see same thing for method M because it is the same thing there as well and can fold?

@EgorBo
Copy link
Member

EgorBo commented Dec 24, 2021

    public void M() {
        var someString = "";
        
        if(someString == null || someString.Length == 0) {
            Console.Write("hello");
        }
    }

Ah, I missed that part.
Yes, it's a known problem in jit - it needs "Forward Substitution" feature to handle cases like this one. Honestly, I thought Roslyn always propagated ldstr by value 🤔

@ShreyasJejurkar
Copy link
Contributor Author

Even more what got me into thinking that, the string.IsNullOrEmpty() under the hood implementation is the same as method M, but still it did not fold! https://source.dot.net/#System.Private.CoreLib/String.cs,486

@EgorBo
Copy link
Member

EgorBo commented Dec 24, 2021

Even more what got me into thinking that, the string.IsNullOrEmpty() under the hood implementation is the same as method M, but still it did not fold! https://source.dot.net/#System.Private.CoreLib/String.cs,486

It's because JIT actually supports "Forward substitution" but only when it inlines functions.

@ShreyasJejurkar
Copy link
Contributor Author

I created a new version of the same IsNullOrEmpty() with the same implementation and with aggressive inline, but still codegen is not optimized, not sure why JIT cannot inline that method as well!

using System;
using System.Runtime.CompilerServices;
public class C {
    
    public void M() {
        var someString = "";
        
        if(IsNullOrEmptyVer(someString)) {
            Console.Write("hello");
        }
    }
    
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public bool IsNullOrEmptyVer(string s) {
        return s == null || s.Length == 0;
    }
    
    
    public void T() {
        string someString = "";
        
        if(string.IsNullOrEmpty(someString)) {
            Console.Write("hello");
        }
    }
}

SharpLab link -> https://sharplab.io/#v2:EYLgtghglgdgNAFxBAzmAPgAQEwEYCwAUJgAwAEmuAdAEoCuMCUYAplQMID2YADlADYsATgGVhANygBjFigDcRTAGYK2MuzIBvImV1kde5RQAsZALIAKAJRaDeveIhCyKbixEIhsAOZkAvGQARIEKhPb2duFQAGYWAJIoAHJ0/PwA8kIAorwIAJ4AasIWrqweXjDeVjbaYeF1XDCuglQA6l4ILBaBABYsqZyBVqF1ugC+keO1upEA2mYsCN2cACZxvPwW84srazzpPEycjVQAgt7eQrIoUOIscTD8sD5WALqRRsCcnPxkCcmpGWyBwKRUo5BQ1Ui9kwAHYXP4AjAUj90OgXFQADIsCqLBFkEjDPSTCJTfSkoyYUwAFWstlJ0Nw4LcZR8/iCIShek5uhiFjBVD+yMBOVyxWZnmekPpIwaTTYbSgHS6vX6g0JdWJRKIoyAA===

@ShreyasJejurkar
Copy link
Contributor Author

I saw this issue about "Forward Substitution", but it's dead I guess! #6973

@EgorBo
Copy link
Member

EgorBo commented Dec 24, 2021

@ShreyasJejurkar it's Roslyn - it decided to save your "" into a temp. Make your IsNullOrEmptyVer static.

        IL_0000: ldstr ""
        IL_0005: stloc.0
        IL_0006: ldarg.0
        IL_0007: ldloc.0
        IL_0008: call instance bool C::IsNullOrEmptyVer(string)

but when you make it static:

        IL_0000: ldstr ""
        IL_0005: call bool C::IsNullOrEmptyVer(string)

@ShreyasJejurkar
Copy link
Contributor Author

Thanks @EgorBo , it did. SharpLab link

@EgorBo
Copy link
Member

EgorBo commented Dec 24, 2021

NOTE: I am not blaming Roslyn here 🙂 - it's a correct IL and it's jit's fault that it doesn't do forward sub.

@ShreyasJejurkar
Copy link
Contributor Author

@EgorBo ahh, it's ok. No one to blame here, just an opportunity.
Are there any plans to address this kind of issue!?

@EgorBo
Copy link
Member

EgorBo commented Dec 24, 2021

Are there any plans to address this kind of issue!?

Soon or later 🙂

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this issue Jan 13, 2022
Extend ref counting done by local morph so that we can determine
single-def single-use locals.

Add a phase that runs just after local morph that will attempt to
forward single-def single-use local defs to uses when they are in
adjacent statements.

Fix or work around issues uncovered elsewhere:
* `gtFoldExprCompare` might fold "identical" volatile subtrees
* `fgGetStubAddrArg` cannot handle complex trees
* some simd/hw operations can lose struct handles
* some calls cannot handle struct local args

Addresses dotnet#6973 and related issues. Still sorting through exactly
which ones are fixed, so list below may need revising.

Fixes dotnet#48605.
Fixes dotnet#51599.
Fixes dotnet#55472.

Improves some but not all cases in dotnet#12280 and dotnet#62064.

Does not fix dotnet#33002, dotnet#47082, or dotnet#63116; these require handling multiple
uses or bypassing statements.
@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Jan 25, 2022
@JulieLeeMSFT JulieLeeMSFT added this to the Future milestone Jan 25, 2022
AndyAyersMS added a commit that referenced this issue Feb 4, 2022
Extend ref counting done by local morph so that we can determine
single-def single-use locals.

Add a phase that runs just after local morph that will attempt to
forward single-def single-use local defs to uses when they are in
adjacent statements.

Fix or work around issues uncovered elsewhere:
* `gtFoldExprCompare` might fold "identical" volatile subtrees
* `fgGetStubAddrArg` cannot handle complex trees
* some simd/hw operations can lose struct handles
* some calls cannot handle struct local args
* morph expects args not to interfere
* fix arm; don't forward sub no return calls
* update debuginfo test (we may want to revisit this)
* handle subbing past normalize on store assignment
* clean up nullcheck of new helper

Addresses #6973 and related issues. Still sorting through exactly
which ones are fixed, so list below may need revising.

Fixes #48605.
Fixes #51599.
Fixes #55472.

Improves some but not all cases in #12280 and #62064.

Does not fix #33002, #47082, or #63116; these require handling multiple
uses or bypassing statements.
@EgorBo
Copy link
Member

EgorBo commented Dec 3, 2022

Closing since it has expected codegen now (apparently, via Fsub)

@EgorBo EgorBo closed this as completed Dec 3, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Jan 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

4 participants