Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[aarch64] Add hyperbolic and arc trig intrinsic lowering #98937

Merged
merged 1 commit into from
Jul 19, 2024

Conversation

farzonl
Copy link
Member

@farzonl farzonl commented Jul 15, 2024

The change(s)

  • VecFuncs.def: define intrinsic to sleef/armpl mapping
  • LegalizerHelper.cpp: add missing fewerElementsVector handling for the new trig intrinsics
  • AArch64ISelLowering.cpp: Add arch64 specializations for lowering like neon instructions
  • AArch64LegalizerInfo.cpp: Legalize the new trig intrinsics. aarch64 has specail legalization requirments in AArch64LegalizerInfo.cpp. If we redirect the clang builtin without handling this we will break the aarch64 compiler

History

This change is part of an implementation of #87367 investigation on supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for acos, asin, atan, cosh, sinh, and tanh.

#70079
#70080
#70081
#70083
#70084
#95966

Why is aarch64 needed

The last step is to redirect the acos, asin, atan, cosh, sinh, and tanh to emit the intrinsic. We can't emit the intrinsic without the intrinsics becoming legal for aarch64 in AArch64LegalizerInfo.cpp

- `VecFuncs.def`: define intrinsic to  sleef/armpl mapping
- `LegalizerHelper.cpp`: add missing `fewerElementsVector` handling for
  the new trig intrinsics
- `AArch64ISelLowering.cpp`: Add arch64 specializations for lowering
  like neon instructions
- `AArch64LegalizerInfo.cpp`: Legalize the new trig intrinsics.
  aarch64 has specail legalization requirments in
  `AArch64LegalizerInfo.cpp`. If we redirect the clang builtin without
  handling this we will break the aarch64 compiler
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 15, 2024

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-llvm-analysis

Author: Farzon Lotfi (farzonl)

Changes

The change(s)

  • VecFuncs.def: define intrinsic to sleef/armpl mapping
  • LegalizerHelper.cpp: add missing fewerElementsVector handling for the new trig intrinsics
  • AArch64ISelLowering.cpp: Add arch64 specializations for lowering like neon instructions
  • AArch64LegalizerInfo.cpp: Legalize the new trig intrinsics. aarch64 has specail legalization requirments in AArch64LegalizerInfo.cpp. If we redirect the clang builtin without handling this we will break the aarch64 compiler

History

This change is part of an implementation of #87367 investigation on supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for acos, asin, atan, cosh, sinh, and tanh.

#70079
#70080
#70081
#70083
#70084
#95966

Why is aarch64 needed

The last step is to redirect the acos, asin, atan, cosh, sinh, and tanh to emit the intrinsic. We can't emit the intrinsic without the intrinsics becoming legal for aarch64 in AArch64LegalizerInfo.cpp


Patch is 163.00 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/98937.diff

22 Files Affected:

  • (modified) llvm/include/llvm/Analysis/VecFuncs.def (+66)
  • (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+6)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+18)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (+3-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll (+48)
  • (added) llvm/test/CodeGen/AArch64/GlobalISel/legalize-acos.mir (+227)
  • (added) llvm/test/CodeGen/AArch64/GlobalISel/legalize-asin.mir (+227)
  • (added) llvm/test/CodeGen/AArch64/GlobalISel/legalize-atan.mir (+227)
  • (added) llvm/test/CodeGen/AArch64/GlobalISel/legalize-cosh.mir (+227)
  • (added) llvm/test/CodeGen/AArch64/GlobalISel/legalize-sinh.mir (+227)
  • (added) llvm/test/CodeGen/AArch64/GlobalISel/legalize-tanh.mir (+227)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir (+18-12)
  • (modified) llvm/test/CodeGen/AArch64/f16-instructions.ll (+156)
  • (modified) llvm/test/CodeGen/AArch64/fp-intrinsics-fp16.ll (+90)
  • (modified) llvm/test/CodeGen/AArch64/fp-intrinsics.ll (+186)
  • (modified) llvm/test/CodeGen/AArch64/illegal-float-ops.ll (+125)
  • (modified) llvm/test/CodeGen/AArch64/replace-with-veclib-armpl.ll (+272-1)
  • (modified) llvm/test/CodeGen/AArch64/replace-with-veclib-sleef-scalable.ll (+109-1)
  • (modified) llvm/test/CodeGen/AArch64/replace-with-veclib-sleef.ll (+109-1)
  • (modified) llvm/test/CodeGen/AArch64/vec-libcalls.ll (+191)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/veclib-calls-libsystem-darwin.ll (+289)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/veclib-intrinsic-calls.ll (+443-5)
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index ffdf8b8c3bc79..444cef613fb00 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -73,12 +73,18 @@ TLI_DEFINE_VECFUNC("llvm.exp.f32", "_simd_exp_f4", FIXED(4), "_ZGV_LLVM_N4v")
 
 // Trigonometric Functions
 TLI_DEFINE_VECFUNC("acos", "_simd_acos_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_simd_acos_d2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("acosf", "_simd_acos_f4", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_simd_acos_f4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("asin", "_simd_asin_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_simd_asin_d2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("asinf", "_simd_asin_f4", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_simd_asin_f4", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("atan", "_simd_atan_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_simd_atan_d2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("atanf", "_simd_atan_f4", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_simd_atan_f4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan2", "_simd_atan2_d2", FIXED(2), "_ZGV_LLVM_N2vv")
 TLI_DEFINE_VECFUNC("atan2f", "_simd_atan2_f4", FIXED(4), "_ZGV_LLVM_N4vv")
 
@@ -109,11 +115,17 @@ TLI_DEFINE_VECFUNC("llvm.pow.f32", "_simd_pow_f4", FIXED(4), "_ZGV_LLVM_N4vv")
 
 // Hyperbolic Functions
 TLI_DEFINE_VECFUNC("sinh", "_simd_sinh_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "_simd_sinh_d2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("sinhf", "_simd_sinh_f4", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_simd_sinh_f4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("cosh", "_simd_cosh_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_simd_cosh_d2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("coshf", "_simd_cosh_f4", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_simd_cosh_f4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("tanh", "_simd_tanh_d2", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "_simd_tanh_d2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("tanhf", "_simd_tanh_f4", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_simd_tanh_f4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("asinh", "_simd_asinh_d2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("asinhf", "_simd_asinh_f4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("acosh", "_simd_acosh_d2", FIXED(2), "_ZGV_LLVM_N2v")
@@ -500,14 +512,17 @@ TLI_DEFINE_VECFUNC("__exp2f_finite", "__svml_exp2f16", FIXED(16), "_ZGV_LLVM_N16
 #elif defined(TLI_DEFINE_SLEEFGNUABI_VF2_VECFUNCS)
 
 TLI_DEFINE_VECFUNC("acos", "_ZGVnN2v_acos", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVnN2v_acos", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("acosh", "_ZGVnN2v_acosh", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("asin", "_ZGVnN2v_asin", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVnN2v_asin", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("asinh", "_ZGVnN2v_asinh", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("atan", "_ZGVnN2v_atan", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVnN2v_atan", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("atan2", "_ZGVnN2vv_atan2", FIXED(2), "_ZGV_LLVM_N2vv")
 
@@ -521,6 +536,7 @@ TLI_DEFINE_VECFUNC("cos", "_ZGVnN2v_cos", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("llvm.cos.f64", "_ZGVnN2v_cos", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("cosh", "_ZGVnN2v_cosh", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVnN2v_cosh", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("cospi", "_ZGVnN2v_cospi", FIXED(2), "_ZGV_LLVM_N2v")
 
@@ -583,6 +599,7 @@ TLI_DEFINE_VECFUNC("sincos", "_ZGVnN2vl8l8_sincos", FIXED(2), "_ZGV_LLVM_N2vl8l8
 TLI_DEFINE_VECFUNC("sincospi", "_ZGVnN2vl8l8_sincospi", FIXED(2), "_ZGV_LLVM_N2vl8l8")
 
 TLI_DEFINE_VECFUNC("sinh", "_ZGVnN2v_sinh", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "_ZGVnN2v_sinh", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("sinpi", "_ZGVnN2v_sinpi", FIXED(2), "_ZGV_LLVM_N2v")
 
@@ -592,20 +609,24 @@ TLI_DEFINE_VECFUNC("tan", "_ZGVnN2v_tan", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("llvm.tan.f64", "_ZGVnN2v_tan", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("tanh", "_ZGVnN2v_tanh", FIXED(2), "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "_ZGVnN2v_tanh", FIXED(2), "_ZGV_LLVM_N2v")
 
 TLI_DEFINE_VECFUNC("tgamma", "_ZGVnN2v_tgamma", FIXED(2), "_ZGV_LLVM_N2v")
 
 #elif defined(TLI_DEFINE_SLEEFGNUABI_VF4_VECFUNCS)
 
 TLI_DEFINE_VECFUNC("acosf", "_ZGVnN4v_acosf", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVnN4v_acosf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("acoshf", "_ZGVnN4v_acoshf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("asinf", "_ZGVnN4v_asinf", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVnN4v_asinf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("asinhf", "_ZGVnN4v_asinhf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("atanf", "_ZGVnN4v_atanf", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVnN4v_atanf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("atan2f", "_ZGVnN4vv_atan2f", FIXED(4), "_ZGV_LLVM_N4vv")
 
@@ -619,6 +640,7 @@ TLI_DEFINE_VECFUNC("cosf", "_ZGVnN4v_cosf", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVnN4v_cosf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("coshf", "_ZGVnN4v_coshf", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVnN4v_coshf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("cospif", "_ZGVnN4v_cospif", FIXED(4), "_ZGV_LLVM_N4v")
 
@@ -681,6 +703,7 @@ TLI_DEFINE_VECFUNC("sincosf", "_ZGVnN4vl4l4_sincosf", FIXED(4), "_ZGV_LLVM_N4vl4
 TLI_DEFINE_VECFUNC("sincospif", "_ZGVnN4vl4l4_sincospif", FIXED(4), "_ZGV_LLVM_N4vl4l4")
 
 TLI_DEFINE_VECFUNC("sinhf", "_ZGVnN4v_sinhf", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVnN4v_sinhf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("sinpif", "_ZGVnN4v_sinpif", FIXED(4), "_ZGV_LLVM_N4v")
 
@@ -690,6 +713,7 @@ TLI_DEFINE_VECFUNC("tanf", "_ZGVnN4v_tanf", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVnN4v_tanf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("tanhf", "_ZGVnN4v_tanhf", FIXED(4), "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVnN4v_tanhf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("tgammaf", "_ZGVnN4v_tgammaf", FIXED(4), "_ZGV_LLVM_N4v")
 
@@ -697,18 +721,24 @@ TLI_DEFINE_VECFUNC("tgammaf", "_ZGVnN4v_tgammaf", FIXED(4), "_ZGV_LLVM_N4v")
 
 TLI_DEFINE_VECFUNC("acos", "_ZGVsMxv_acos",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("acosf", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "_ZGVsMxv_acos", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "_ZGVsMxv_acosf", SCALABLE(4), MASKED, "_ZGVsMxv")
 
 TLI_DEFINE_VECFUNC("acosh", "_ZGVsMxv_acosh",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("acoshf", "_ZGVsMxv_acoshf", SCALABLE(4), MASKED, "_ZGVsMxv")
 
 TLI_DEFINE_VECFUNC("asin", "_ZGVsMxv_asin",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("asinf", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "_ZGVsMxv_asin", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "_ZGVsMxv_asinf", SCALABLE(4), MASKED, "_ZGVsMxv")
 
 TLI_DEFINE_VECFUNC("asinh", "_ZGVsMxv_asinh",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("asinhf", "_ZGVsMxv_asinhf", SCALABLE(4), MASKED, "_ZGVsMxv")
 
 TLI_DEFINE_VECFUNC("atan", "_ZGVsMxv_atan",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("atanf", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "_ZGVsMxv_atan", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "_ZGVsMxv_atanf", SCALABLE(4), MASKED, "_ZGVsMxv")
 
 TLI_DEFINE_VECFUNC("atan2", "_ZGVsMxvv_atan2",  SCALABLE(2), MASKED, "_ZGVsMxvv")
 TLI_DEFINE_VECFUNC("atan2f", "_ZGVsMxvv_atan2f", SCALABLE(4), MASKED, "_ZGVsMxvv")
@@ -729,6 +759,8 @@ TLI_DEFINE_VECFUNC("llvm.cos.f32", "_ZGVsMxv_cosf", SCALABLE(4), MASKED, "_ZGVsM
 
 TLI_DEFINE_VECFUNC("cosh", "_ZGVsMxv_cosh",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("coshf", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "_ZGVsMxv_cosh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "_ZGVsMxv_coshf", SCALABLE(4), MASKED, "_ZGVsMxv")
 
 TLI_DEFINE_VECFUNC("cospi", "_ZGVsMxv_cospi",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("cospif", "_ZGVsMxv_cospif", SCALABLE(4), MASKED, "_ZGVsMxv")
@@ -826,6 +858,8 @@ TLI_DEFINE_VECFUNC("sincospif", "_ZGVsNxvl4l4_sincospif", SCALABLE(4), NOMASK, "
 
 TLI_DEFINE_VECFUNC("sinh", "_ZGVsMxv_sinh",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("sinhf", "_ZGVsMxv_sinhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "_ZGVsMxv_sinh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "_ZGVsMxv_sinhf", SCALABLE(4), MASKED, "_ZGVsMxv")
 
 TLI_DEFINE_VECFUNC("sinpi", "_ZGVsMxv_sinpi",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("sinpif", "_ZGVsMxv_sinpif", SCALABLE(4), MASKED, "_ZGVsMxv")
@@ -840,6 +874,8 @@ TLI_DEFINE_VECFUNC("llvm.tan.f32", "_ZGVsMxv_tanf", SCALABLE(4), MASKED, "_ZGVsM
 
 TLI_DEFINE_VECFUNC("tanh", "_ZGVsMxv_tanh",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("tanhf", "_ZGVsMxv_tanhf", SCALABLE(4), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "_ZGVsMxv_tanh", SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "_ZGVsMxv_tanhf", SCALABLE(4), MASKED, "_ZGVsMxv")
 
 TLI_DEFINE_VECFUNC("tgamma", "_ZGVsMxv_tgamma",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("tgammaf", "_ZGVsMxv_tgammaf", SCALABLE(4), MASKED, "_ZGVsMxv")
@@ -851,6 +887,11 @@ TLI_DEFINE_VECFUNC("acosf", "armpl_vacosq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v
 TLI_DEFINE_VECFUNC("acos", "armpl_svacos_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("acosf", "armpl_svacos_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
 
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "armpl_vacosq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "armpl_vacosq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.acos.f64", "armpl_svacos_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.acos.f32", "armpl_svacos_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
+
 TLI_DEFINE_VECFUNC("acosh", "armpl_vacoshq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("acoshf", "armpl_vacoshq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("acosh", "armpl_svacosh_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
@@ -861,6 +902,11 @@ TLI_DEFINE_VECFUNC("asinf", "armpl_vasinq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v
 TLI_DEFINE_VECFUNC("asin", "armpl_svasin_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("asinf", "armpl_svasin_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
 
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "armpl_vasinq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "armpl_vasinq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.asin.f64", "armpl_svasin_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.asin.f32", "armpl_svasin_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
+
 TLI_DEFINE_VECFUNC("asinh", "armpl_vasinhq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("asinhf", "armpl_vasinhq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("asinh", "armpl_svasinh_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
@@ -871,6 +917,11 @@ TLI_DEFINE_VECFUNC("atanf", "armpl_vatanq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v
 TLI_DEFINE_VECFUNC("atan", "armpl_svatan_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("atanf", "armpl_svatan_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
 
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "armpl_vatanq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "armpl_vatanq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.atan.f64", "armpl_svatan_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.atan.f32", "armpl_svatan_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
+
 TLI_DEFINE_VECFUNC("atan2", "armpl_vatan2q_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2vv")
 TLI_DEFINE_VECFUNC("atan2f", "armpl_vatan2q_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4vv")
 TLI_DEFINE_VECFUNC("atan2", "armpl_svatan2_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxvv")
@@ -906,6 +957,11 @@ TLI_DEFINE_VECFUNC("coshf", "armpl_vcoshq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v
 TLI_DEFINE_VECFUNC("cosh", "armpl_svcosh_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("coshf", "armpl_svcosh_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
 
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "armpl_vcoshq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "armpl_vcoshq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.cosh.f64", "armpl_svcosh_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.cosh.f32", "armpl_svcosh_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
+
 TLI_DEFINE_VECFUNC("cospi", "armpl_vcospiq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("cospif", "armpl_vcospiq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("cospi", "armpl_svcospi_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
@@ -1081,6 +1137,11 @@ TLI_DEFINE_VECFUNC("sinhf", "armpl_vsinhq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v
 TLI_DEFINE_VECFUNC("sinh", "armpl_svsinh_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("sinhf", "armpl_svsinh_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
 
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "armpl_vsinhq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "armpl_vsinhq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.sinh.f64", "armpl_svsinh_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.sinh.f32", "armpl_svsinh_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
+
 TLI_DEFINE_VECFUNC("sinpi", "armpl_vsinpiq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("sinpif", "armpl_vsinpiq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("sinpi", "armpl_svsinpi_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
@@ -1106,6 +1167,11 @@ TLI_DEFINE_VECFUNC("tanhf", "armpl_vtanhq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v
 TLI_DEFINE_VECFUNC("tanh", "armpl_svtanh_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
 TLI_DEFINE_VECFUNC("tanhf", "armpl_svtanh_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
 
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "armpl_vtanhq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "armpl_vtanhq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
+TLI_DEFINE_VECFUNC("llvm.tanh.f64", "armpl_svtanh_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
+TLI_DEFINE_VECFUNC("llvm.tanh.f32", "armpl_svtanh_f32_x", SCALABLE(4), MASKED, "_ZGVsMxv")
+
 TLI_DEFINE_VECFUNC("tgamma", "armpl_vtgammaq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("tgammaf", "armpl_vtgammaq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("tgamma", "armpl_svtgamma_f64_x",  SCALABLE(2), MASKED, "_ZGVsMxv")
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index f717849317ba7..4e3502dd2ca82 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -4698,6 +4698,12 @@ LegalizerHelper::fewerElementsVector(MachineInstr &MI, unsigned TypeIdx,
   case G_FCOS:
   case G_FSIN:
   case G_FTAN:
+  case G_FACOS:
+  case G_FASIN:
+  case G_FATAN:
+  case G_FCOSH:
+  case G_FSINH:
+  case G_FTANH:
   case G_FSQRT:
   case G_BSWAP:
   case G_BITREVERSE:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 7294da474c4bc..327b9b50eb35e 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -732,10 +732,14 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
 
   for (auto Op : {ISD::FREM,         ISD::FPOW,          ISD::FPOWI,
                   ISD::FCOS,         ISD::FSIN,          ISD::FSINCOS,
+                  ISD::FACOS,        ISD::FASIN,         ISD::FATAN,
+                  ISD::FCOSH,        ISD::FSINH,         ISD::FTANH,
                   ISD::FTAN,         ISD::FEXP,          ISD::FEXP2,
                   ISD::FEXP10,       ISD::FLOG,          ISD::FLOG2,
                   ISD::FLOG10,       ISD::STRICT_FREM,   ISD::STRICT_FPOW,
                   ISD::STRICT_FPOWI, ISD::STRICT_FCOS,   ISD::STRICT_FSIN,
+                  ISD::STRICT_FACOS, ISD::STRICT_FASIN,  ISD::STRICT_FATAN,
+                  ISD::STRICT_FCOSH, ISD::STRICT_FSINH,  ISD::STRICT_FTANH,
                   ISD::STRICT_FEXP,  ISD::STRICT_FEXP2,  ISD::STRICT_FLOG,
                   ISD::STRICT_FLOG2, ISD::STRICT_FLOG10, ISD::STRICT_FTAN}) {
     setOperationAction(Op, MVT::f16, Promote);
@@ -1176,6 +1180,8 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
           ISD::FNEG,              ISD::FABS,           ISD::FCEIL,
           ISD::FSQRT,             ISD::FFLOOR,         ISD::FNEARBYINT,
           ISD::FSIN,              ISD::FCOS,           ISD::FTAN,
+          ISD::FASIN,             ISD::FACOS,          ISD::FATAN,
+          ISD::FSINH,             ISD::FCOSH,          ISD::FTANH,
           ISD::FPOW,              ISD::FLOG,           ISD::FLOG2,          
           ISD::FLOG10,            ISD::FEXP,           ISD::FEXP2,
           ISD::FEXP10,            ISD::FRINT,          ISD::FROUND,
@@ -1615,6 +1621,12 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
       setOperationAction(ISD::FSIN, VT, Expand);
       setOperationAction(ISD::FSINCOS, VT, Expand);
       setOperationAction(ISD::FTAN, VT, Expand);
+      setOperationAction(ISD::FACOS, VT, Expand);
+      setOperationAction(ISD::FASIN, VT, Expand);
+      setOperationAction(ISD::FATAN, VT, Expand);
+      setOperationAction(ISD::FCOSH, VT, Expand);
+      setOperationAction(ISD::FSINH, VT, Expand);
+      setOperationAction(ISD::FTANH, VT, Expand);
       setOperationAction(ISD::FEXP, VT, Expand);
       setOperationAction(ISD::FEXP2, VT, Expand);
       setOperationAction(ISD::FEXP10, VT, Expand);
@@ -1822,6 +1834,12 @@ void AArch64TargetLowering::addTypeForNEON(MVT VT) {
     setOperationAction(ISD::FSIN, VT, Expand);
     setOperationAction(ISD::FCOS, VT, Expand);
     setOperationAction(ISD::FTAN, VT, Expand);
+    setOperationAction(ISD::FASIN, VT, Expand);
+    setOperationAction(ISD::FACOS, VT, Expand);
+    setOperationAction(ISD::FATAN, VT, Expand);
+    setOperationAction(ISD::FSINH, VT, Expand);
+    setOperationAction(ISD::FCOSH, VT, Expand);
+    setOperationAction(ISD::FTANH, VT, Expand);
     setOperationAction(ISD::FPOW, VT, Expand);
     setOperationAction(ISD::FLOG, VT, Expand);
     setOperationAction(ISD::FLOG2, VT, Expand);
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 7a29457f5442f..def1f4f44f335 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -267,8 +267,9 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
       .libcallFor({{s64, s128}})
       .minScalarOrElt(1, MinFPScalar);
 
-  getActionDefinitionsBuilder({G_FCOS, G_FSIN, G_FPOW, G_FLOG, G_FLOG2,
-                               G_FLOG10, G_FTAN, G_FEXP, G_FEXP2, G_FEXP10})
+  getActionDefinitionsBuilder(
+      {G_FCOS, G_FSIN, G_FPOW, G_FLOG, G_FLOG2, G_FLOG10, G_FTAN, G_FEXP,
+       G_FEXP2, G_FEXP10, G_FACOS, G_FASIN, G_FATAN, G_FCOSH, G_FSINH, G_FTANH})
       // We need a call for these, so we always need to scalarize.
       .scalarize(0)
       // Regardless of FP16 support, widen 16-bit elements to 32-bits.
diff --git a/llvm/test/CodeGen/A...
[truncated]

farzonl added a commit to farzonl/llvm-project that referenced this pull request Jul 15, 2024
…builtins

- `Builtins.td` - Add f16 support for libm arc and hyperbolic trig functions
- `CGBuiltin.cpp` - Emit constraint intrinsics for trig clang builtins

This change is part of an implementation of llvm#87367 investigation on supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`.

llvm#70079
llvm#70080
llvm#70081
llvm#70083
llvm#70084
llvm#95966

Note this PR needs Merge after:
- llvm#98937
- llvm#98755
@farzonl
Copy link
Member Author

farzonl commented Jul 17, 2024

@davemgreen @paschalis-mpeis If you have some time to review, I would appreciate it. Thanks in advance!

Copy link
Member

@paschalis-mpeis paschalis-mpeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @farzonl ,

Unfortunately I cannot have a strong opinion on Global ISel contributions.

Regarding sources I've modified in the past

The below look OK to me:

  • additions to VecFuncs.def
  • all replace-with-veclib-*
  • LoopVectorize veclib-intrinsic-calls.ll

I verified the existence of the symbols used in sources and test files on ArmPL and SLEEF libraries (see list at the bottom).


Regarding the rest of the sources and tests:

They also look reasonable and similar to existing sources, but I cannot vouch for them.

BTW in GlobalISel/arm64-irtranslator.ll other functions have variants for double as well for scalable vectors (float and double). Why this isn't the case for the newly added functions as well?


Symbos verified in libraries:

# ARMPL
armpl_vacosq_f64
armpl_vacosq_f32
armpl_svacos_f64_x
armpl_svacos_f32_x

armpl_vasinq_f64
armpl_vasinq_f32
armpl_svasin_f64_x
armpl_svasin_f32_x

armpl_vatanq_f64
armpl_vatanq_f32
armpl_svatan_f64_x
armpl_svatan_f32_x

armpl_vcoshq_f64
armpl_vcoshq_f32
armpl_svcosh_f64_x
armpl_svcosh_f32_x

armpl_vsinhq_f64
armpl_vsinhq_f32
armpl_svsinh_f64_x
armpl_svsinh_f32_x

armpl_vtanhq_f64
armpl_vtanhq_f32
armpl_svtanh_f64_x
armpl_svtanh_f32_x

# SLEEF
## Sleef scalable:
_ZGVsMxv_acos
_ZGVsMxv_acosf

_ZGVsMxv_asin
_ZGVsMxv_asinf

_ZGVsMxv_atan
_ZGVsMxv_atanf

_ZGVsMxv_cosh
_ZGVsMxv_coshf

_ZGVsMxv_sinh
_ZGVsMxv_sinhf

_ZGVsMxv_tanh
_ZGVsMxv_tanhf

## Sleef vector:
_ZGVnN2v_acos
_ZGVnN4v_acosf

_ZGVnN2v_asin
_ZGVnN4v_asinf

_ZGVnN2v_atan
_ZGVnN4v_atanf

_ZGVnN2v_cosh
_ZGVnN4v_coshf

_ZGVnN2v_sinh
_ZGVnN4v_sinhf

_ZGVnN2v_tanh
_ZGVnN4v_tanhf

@farzonl
Copy link
Member Author

farzonl commented Jul 18, 2024

Regarding the rest of the sources and tests:

They also look reasonable and similar to existing sources, but I cannot vouch for them.

BTW in GlobalISel/arm64-irtranslator.ll other functions have variants for double as well for scalable vectors (float and double). Why this isn't the case for the newly added functions as well?

Symbos verified in libraries:

In VecFuncs.def I only added ARMPL and SLEEF intrinsic apis if there was an equivalent libcall one that already existed. I'm going for parrity with what existed.

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GISel change LGTM. Thanks

@farzonl farzonl merged commit e2f463b into llvm:main Jul 19, 2024
12 checks passed
@farzonl
Copy link
Member Author

farzonl commented Jul 19, 2024

Thank you!

farzonl added a commit that referenced this pull request Jul 19, 2024
…builtins (#98949)

## Change(s)
- `Builtins.td` - Add f16 support for libm arc and hyperbolic trig
functions
- `CGBuiltin.cpp` - Emit constraint intrinsics for trig clang builtins

## History
This change is part of an implementation of
#87367 investigation on
supporting IEEE math operations as intrinsics. Which was discussed in
this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`,
`sinh`, and `tanh`.

#70079
#70080
#70081
#70083
#70084
#95966

## Precursor PR(s)

Note this PR needs Merge after:
- #98937
- #98755
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this pull request Jul 23, 2024
## The change(s)
- `VecFuncs.def`: define intrinsic to  sleef/armpl mapping
- `LegalizerHelper.cpp`: add missing `fewerElementsVector` handling for
the new trig intrinsics
- `AArch64ISelLowering.cpp`: Add arch64 specializations for lowering
like neon instructions
- `AArch64LegalizerInfo.cpp`: Legalize the new trig intrinsics. aarch64
has specail legalization requirments in `AArch64LegalizerInfo.cpp`. If
we redirect the clang builtin without handling this we will break the
aarch64 compiler

## History
This change is part of an implementation of
llvm#87367 investigation on
supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`,
`sinh`, and `tanh`.

llvm#70079
llvm#70080
llvm#70081
llvm#70083
llvm#70084
llvm#95966

## Why is aarch64 needed
The last step is to redirect the `acos`, `asin`, `atan`, `cosh`, `sinh`,
and `tanh` to emit the intrinsic. We can't emit the intrinsic without
the intrinsics becoming legal for aarch64 in `AArch64LegalizerInfo.cpp`
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this pull request Jul 23, 2024
…builtins (llvm#98949)

## Change(s)
- `Builtins.td` - Add f16 support for libm arc and hyperbolic trig
functions
- `CGBuiltin.cpp` - Emit constraint intrinsics for trig clang builtins

## History
This change is part of an implementation of
llvm#87367 investigation on
supporting IEEE math operations as intrinsics. Which was discussed in
this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`,
`sinh`, and `tanh`.

llvm#70079
llvm#70080
llvm#70081
llvm#70083
llvm#70084
llvm#95966

## Precursor PR(s)

Note this PR needs Merge after:
- llvm#98937
- llvm#98755
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
## The change(s)
- `VecFuncs.def`: define intrinsic to  sleef/armpl mapping
- `LegalizerHelper.cpp`: add missing `fewerElementsVector` handling for
the new trig intrinsics
- `AArch64ISelLowering.cpp`: Add arch64 specializations for lowering
like neon instructions
- `AArch64LegalizerInfo.cpp`: Legalize the new trig intrinsics. aarch64
has specail legalization requirments in `AArch64LegalizerInfo.cpp`. If
we redirect the clang builtin without handling this we will break the
aarch64 compiler

## History
This change is part of an implementation of
#87367 investigation on
supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`,
`sinh`, and `tanh`.

#70079
#70080
#70081
#70083
#70084
#95966

## Why is aarch64 needed
The last step is to redirect the `acos`, `asin`, `atan`, `cosh`, `sinh`,
and `tanh` to emit the intrinsic. We can't emit the intrinsic without
the intrinsics becoming legal for aarch64 in `AArch64LegalizerInfo.cpp`

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251359
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…builtins (#98949)

Summary:
## Change(s)
- `Builtins.td` - Add f16 support for libm arc and hyperbolic trig
functions
- `CGBuiltin.cpp` - Emit constraint intrinsics for trig clang builtins

## History
This change is part of an implementation of
#87367 investigation on
supporting IEEE math operations as intrinsics. Which was discussed in
this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`,
`sinh`, and `tanh`.

#70079
#70080
#70081
#70083
#70084
#95966

## Precursor PR(s)

Note this PR needs Merge after:
- #98937
- #98755

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251574
@farzonl farzonl deleted the add-arc-hyp-aarch64-lowering branch August 19, 2024 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

4 participants