Poor codegen for Vector128.Create() with constants sharing the same value #63432

Closed
dubiousconst282 opened this issue Jan 6, 2022 · 2 comments · Fixed by #63442
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), tenet-performance (Performance related issue)

Description

Calling Vector128.Create(float, float, float, float) with constants that share the same value results in poor codegen, as demonstrated below.

Sharplab

```csharp
using System;
using System.Runtime.Intrinsics;

public class C {
    static Vector128<float> M1() {
        return Vector128.Create(1f,2f,4f,4f);
    }
    static Vector128<float> M2() {
        return Vector128.Create(1f,2f,3f,4f);
    }
}
```

Output assembly (CoreCLR 6.0.21.52210 on amd64)

```asm
C.M1()
    L0000: vzeroupper
    L0003: vmovss xmm0, [0x7ffb22bb0480]
    L000b: vmovss xmm1, [0x7ffb22bb0484]
    L0013: vinsertps xmm0, xmm0, xmm1, 0x10
    L0019: vmovss xmm1, [0x7ffb22bb0488]
    L0021: vmovaps xmm2, xmm1
    L0025: vinsertps xmm0, xmm0, xmm2, 0x20
    L002b: vinsertps xmm0, xmm0, xmm1, 0x30
    L0031: vmovupd [rcx], xmm0
    L0035: mov rax, rcx
    L0038: ret

C.M2()
    L0000: vzeroupper
    L0003: vmovupd xmm0, [0x7ffb22bb04c0]
    L000b: vmovupd [rcx], xmm0
    L000f: mov rax, rcx
    L0012: ret
```

This seems to affect all other VectorXXX.Create() functions, but only for float and double.
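
To probe the claim above, further repro cases could look roughly like the following. This is only a sketch: the class and method names (`MoreRepros`, `RepeatedDouble256`, `RepeatedInt128`) are made up for illustration, and no disassembly is claimed for them here. The JIT output would need to be compared separately, for example on Sharplab.

```csharp
using System.Runtime.Intrinsics;

public static class MoreRepros {
    // Hypothetical case: double constants with a repeated value (4d appears twice).
    public static Vector256<double> RepeatedDouble256() {
        return Vector256.Create(1d, 2d, 4d, 4d);
    }

    // Hypothetical control: the same pattern with int constants, which the
    // report above describes as not affected.
    public static Vector128<int> RepeatedInt128() {
        return Vector128.Create(1, 2, 4, 4);
    }
}
```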

dubiousconst282 added the tenet-performance label Jan 6, 2022
dotnet-issue-labeler bot added the area-CodeGen-coreclr and untriaged labels Jan 6, 2022

ghost commented Jan 6, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

EgorBo (Member) commented Jan 6, 2022

Nice find!
It seems that when a constant feeds Vector.Create, it has to be marked as DONT_CSE (except in the cases where we mix variables and constants), or we have to use VNs (value numbers) when we put the vector into the data section. I think DONT_CSE is the better choice.
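
As a rough source-level analogy of what this comment seems to describe (an assumption, not the JIT's actual IR): once CSE replaces the two `4f` constants with a shared temp, `Vector128.Create` no longer sees four constant arguments, so it cannot be emitted as a single load from the data section. The names `CseAnalogy`, `M1AsWritten`, `M1AfterCse`, and `tmp` are hypothetical.

```csharp
using System.Runtime.Intrinsics;

public static class CseAnalogy {
    // The call as written in the issue: four constant arguments.
    public static Vector128<float> M1AsWritten() {
        return Vector128.Create(1f, 2f, 4f, 4f);
    }

    // Analogy for the shape after CSE: the repeated 4f is read through one
    // shared temp, so the last two arguments are no longer plain constants.
    // This only illustrates the idea; it is not expected to reproduce the
    // exact codegen shown in the issue.
    public static Vector128<float> M1AfterCse() {
        float tmp = 4f;
        return Vector128.Create(1f, 2f, tmp, tmp);
    }
}
```

Both methods return the same vector; the point is only that the second form hides the constants from the `Create` call, which is the situation marking the constant as DONT_CSE would avoid.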

ghost added the in-pr label (There is an active PR which will close this issue when it is merged) Jan 6, 2022
EgorBo modified the milestones: 6.0.x, 7.0.0 Jan 6, 2022
jeffschwMSFT removed the untriaged label Jan 11, 2022
ghost removed the in-pr label Jan 15, 2022
ghost locked as resolved and limited conversation to collaborators Feb 14, 2022