[JIT] [Issue: 61620] Optimizing ARM64 for *x = dblCns; #61847

Merged: 10 commits, Nov 30, 2021
34 changes: 34 additions & 0 deletions src/coreclr/jit/lower.cpp
@@ -6793,6 +6793,40 @@ void Lowering::LowerStoreIndirCommon(GenTreeStoreInd* ind)
if (!comp->codeGen->gcInfo.gcIsWriteBarrierStoreIndNode(ind))
{
LowerStoreIndir(ind);
SingleAccretion (Contributor) commented on Nov 19, 2021:

The pattern in lower.cpp is that LowerCommonX methods call LowerX methods. It may be that LowerStoreIndir has modified ind by the time we check for our optimization opportunity.

So I suggest a reorder: move the optimization before LowerStoreIndir. You should then be able to delete the #if for SetContained, as it would be dead code - LowerStoreIndir should check for the containment as appropriate.

Also: is containing all FP constants worth it (does it even work?) on ARM/ARM64? The codegen for things like *x = 1.1 should be checked.

SeanWoo (Contributor, Author) replied:

Thank you for your comments.

> So I suggest a reorder: move the optimization before LowerStoreIndir. You should then be able to delete the #if for SetContained, as it would be dead code - LowerStoreIndir should check for the containment as appropriate.

I'll make the call to LowerStoreIndir after the optimization.

> Also: is containing all FP constants worth it (does it even work?) on ARM/ARM64? The codegen for things like *x = 1.1 should be checked.

Good points. I will fix them and come back with new changes soon.


// Moved from lowerxarch.cpp. Issue: #61620
if (varTypeIsFloating(ind) && ind->Data()->IsCnsFltOrDbl())
{
// Optimize *x = DCON to *x = ICON which is slightly faster on xarch
GenTree* data = ind->Data();
double dblCns = data->AsDblCon()->gtDconVal;
ssize_t intCns = 0;
var_types type = TYP_UNKNOWN;

if (ind->TypeIs(TYP_FLOAT))
{
float fltCns = static_cast<float>(dblCns); // should be a safe round-trip
intCns = static_cast<ssize_t>(*reinterpret_cast<INT32*>(&fltCns));
type = TYP_INT;
}
#ifdef TARGET_AMD64
else
{
assert(ind->TypeIs(TYP_DOUBLE));
intCns = static_cast<ssize_t>(*reinterpret_cast<INT64*>(&dblCns));
type = TYP_LONG;
}
#endif

if (type != TYP_UNKNOWN)
{
data->BashToConst(intCns, type);
#if defined(TARGET_ARM64)
data->SetContained();
#endif
ind->ChangeType(type);
}
}
}
}

32 changes: 1 addition & 31 deletions src/coreclr/jit/lowerxarch.cpp
@@ -130,37 +130,7 @@ void Lowering::LowerStoreIndir(GenTreeStoreInd* node)
return;
}
}
else if (node->Data()->IsCnsFltOrDbl())
{
// Optimize *x = DCON to *x = ICON which is slightly faster on xarch
GenTree* data = node->Data();
double dblCns = data->AsDblCon()->gtDconVal;
ssize_t intCns = 0;
var_types type = TYP_UNKNOWN;

if (node->TypeIs(TYP_FLOAT))
{
float fltCns = static_cast<float>(dblCns); // should be a safe round-trip
intCns = static_cast<ssize_t>(*reinterpret_cast<INT32*>(&fltCns));
type = TYP_INT;
}
#ifdef TARGET_AMD64
else
{
assert(node->TypeIs(TYP_DOUBLE));
intCns = static_cast<ssize_t>(*reinterpret_cast<INT64*>(&dblCns));
type = TYP_LONG;
}
#endif

if (type != TYP_UNKNOWN)
{
data->SetContained();
data->BashToConst(intCns, type);
node->ChangeType(type);
}
}


// Optimization: do not unnecessarily zero-extend the result of setcc.
if (varTypeIsByte(node) && (node->Data()->OperIsCompare() || node->Data()->OperIs(GT_SETCC)))
{