rs6000: Generate _Decimal128 to _Decimal32 hardware conversion instructions

Message ID a1956d89-4226-ca0c-c24a-1f6ae75b1e29@linux.ibm.com
State New
Headers show
Series
  • rs6000: Generate _Decimal128 to _Decimal32 hardware conversion instructions
Related show

Commit Message

Jonathan Wakely via Gcc-patches July 17, 2020, 7:47 p.m.
PR92488 shows we do not generate hardware conversion instructions when
converting from _Decimal128 to _Decimal32.  There is no one instruction
that does the conversion, so we currently call the __dpd_trunctdsd2
function to do the conversion for us.  This is slow.  Paul Murphy
described a short sequence of dfp hardware instructions that would do
the conversion correctly.  The patch below implements that idea.

The convert-fp-128.c test case uses dg-require-effective-target dfp,
so its !dfp usages are basically disabling those tests completely.
What we really want is to know whether the compiler is generating
hardware instructions or calling the libcalls.  For that, we need
to test hard_dfp.

This patch bootstrapped and regtested with no regressions on powerpc64le-linux.
Segher, you pre-approved the pattern, but I thought I'd have you double
check the test case changes and new test case.

Still ok for trunk?

Peter


gcc/
	PR target/92488
	* config/rs6000/dfp.md (trunctdsd2): New define_insn.

gcc/testsuite/
	PR target/92488
	* gcc.target/powerpc/convert-fp-128.c (bl, drsp, drdpq): Update counts.
	(__dpd_trunctdsd2): Make conditional on !hard_dfp.
	(__dpd_extendsddd2, __dpd_extendsdtd2, __dpd_truncddsd2,
	__dpd_extendddtd2, __dpd_trunctddd2): Use !hard_dfp.
	* gcc.target/powerpc/pr92488.c: New test.

Comments

Segher Boessenkool July 17, 2020, 8:23 p.m. | #1
On Fri, Jul 17, 2020 at 02:47:50PM -0500, Peter Bergner wrote:
> PR92488 shows we do not generate hardware conversion instructions when

> converting from _Decimal128 to _Decimal32.  There is no one instruction

> that does the conversion, so we currently call the __dpd_trunctdsd2

> function to do the conversion for us.  This is slow.  Paul Murphy

> described a short sequence of dfp hardware instructions that would do

> the conversion correctly.  The patch below implements that idea.

> 

> The convert-fp-128.c test case uses dg-require-effective-target dfp,

> so its !dfp usages are basically disabling those tests completely.

> What we really want is to know whether the compiler is generating

> hardware instructions or calling the libcalls.  For that, we need

> to test hard_dfp.

> 

> This patch bootstrapped and regtested with no regressions on powerpc64le-linux.

> Segher, you pre-approved the pattern, but I thought I'd have you double

> check the test case changes and new test case.

> 

> Still ok for trunk?


Yes.  Thanks!

> gcc/

> 	PR target/92488

> 	* config/rs6000/dfp.md (trunctdsd2): New define_insn.

> 

> gcc/testsuite/

> 	PR target/92488

> 	* gcc.target/powerpc/convert-fp-128.c (bl, drsp, drdpq): Update counts.

> 	(__dpd_trunctdsd2): Make conditional on !hard_dfp.

> 	(__dpd_extendsddd2, __dpd_extendsdtd2, __dpd_truncddsd2,

> 	__dpd_extendddtd2, __dpd_trunctddd2): Use !hard_dfp.

> 	* gcc.target/powerpc/pr92488.c: New test.

> 

> diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md

> index e91d6f581ed..50bfad6beb7 100644

> --- a/gcc/config/rs6000/dfp.md

> +++ b/gcc/config/rs6000/dfp.md

> @@ -155,6 +155,16 @@

>    [(set_attr "type" "dfp")

>     (set_attr "length" "8")])

>  

> +(define_insn "trunctdsd2"

> +  [(set (match_operand:SD 0 "gpc_reg_operand" "=d")

> +	(float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d")))

> +   (clobber (match_scratch:TD 2 "=&d"))

> +   (clobber (match_scratch:DF 3 "=&d"))]

> +  "TARGET_DFP"

> +  "mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2"

> +  [(set_attr "type" "dfp")

> +   (set_attr "length" "20")])


Side note:

On ISA 3.0B and later you can do

	mffscdrni %3,7
	drdpq %2,%1
	mffscdrn %3,%3
	drsp %0,%2

(saving one insn, and using somewhat cheaper insns).  But that is only
on newer machines, so is this worth it at all?  :-)


Segher
Jonathan Wakely via Gcc-patches July 17, 2020, 9:18 p.m. | #2
On 7/17/20 3:23 PM, Segher Boessenkool wrote:
> On ISA 3.0B and later you can do

> 

> 	mffscdrni %3,7

> 	drdpq %2,%1

> 	mffscdrn %3,%3

> 	drsp %0,%2

> 

> (saving one insn, and using somewhat cheaper insns).  But that is only

> on newer machines, so is this worth it at all?  :-)


So something like the following?

Peter


(define_insn "trunctdsd2"
  [(set (match_operand:SD 0 "gpc_reg_operand" "=d")
        (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d")))
   (clobber (match_scratch:TD 2 "=&d"))
   (clobber (match_scratch:DF 3 "=&d"))]
  "TARGET_DFP"
{
  if (TARGET_MODULO)
    return "mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2";
  else
    return "mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2";
}
  [(set_attr "type" "dfp")
   (set (attr "length")
        (if_then_else
          (match_test "TARGET_MODULO")
          (const_string "16")
          (const_string "20")))])
Segher Boessenkool July 17, 2020, 9:32 p.m. | #3
On Fri, Jul 17, 2020 at 04:18:55PM -0500, Peter Bergner wrote:
> On 7/17/20 3:23 PM, Segher Boessenkool wrote:

> > On ISA 3.0B and later you can do

> > 

> > 	mffscdrni %3,7

> > 	drdpq %2,%1

> > 	mffscdrn %3,%3

> > 	drsp %0,%2

> > 

> > (saving one insn, and using somewhat cheaper insns).  But that is only

> > on newer machines, so is this worth it at all?  :-)

> 

> So something like the following?

> 

> Peter

> 

> 

> (define_insn "trunctdsd2"

>   [(set (match_operand:SD 0 "gpc_reg_operand" "=d")

>         (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d")))

>    (clobber (match_scratch:TD 2 "=&d"))

>    (clobber (match_scratch:DF 3 "=&d"))]

>   "TARGET_DFP"

> {

>   if (TARGET_MODULO)

>     return "mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2";

>   else

>     return "mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2";

> }

>   [(set_attr "type" "dfp")

>    (set (attr "length")

>         (if_then_else

>           (match_test "TARGET_MODULO")

>           (const_string "16")

>           (const_string "20")))])


Well, just make an "isa" value of "p9"?  Then you can just do

(define_insn "trunctdsd2"
  [(set (match_operand:SD 0 "gpc_reg_operand" "=d")
        (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d")))
   (clobber (match_scratch:TD 2 "=&d"))
   (clobber (match_scratch:DF 3 "=&d"))]
  "TARGET_DFP"
  "@
   mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2
   mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2"
  [(set_attr "type" "dfp")
   (set_attr "isa" "p9,*")
   (set_attr "length" "16,20")])


Segher
Jonathan Wakely via Gcc-patches July 17, 2020, 10:07 p.m. | #4
On 7/17/20 4:32 PM, Segher Boessenkool wrote:
> Well, just make an "isa" value of "p9"?  Then you can just do

> 

> (define_insn "trunctdsd2"

>   [(set (match_operand:SD 0 "gpc_reg_operand" "=d")

>         (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d")))

>    (clobber (match_scratch:TD 2 "=&d"))

>    (clobber (match_scratch:DF 3 "=&d"))]

>   "TARGET_DFP"

>   "@

>    mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2

>    mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2"

>   [(set_attr "type" "dfp")

>    (set_attr "isa" "p9,*")

>    (set_attr "length" "16,20")])


As we discussed offline, we need the duplicated constraint alternatives
like below.  I'll test this and commit this if clean with the following
additional ChangeLog entry for the p9 addition.  Thanks.

        * config/rs6000/rs6000.md (define_attr "isa"): Add p9.
        (define_attr "enabled"): Handle p9.


Peter



diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
index e91d6f581ed..8f822732bac 100644
--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
@@ -155,6 +155,19 @@
   [(set_attr "type" "dfp")
    (set_attr "length" "8")])
 
+(define_insn "trunctdsd2"
+  [(set (match_operand:SD 0 "gpc_reg_operand" "=d,d")
+       (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d,d")))
+   (clobber (match_scratch:TD 2 "=&d,&d"))
+   (clobber (match_scratch:DF 3 "=&d,&d"))]
+  "TARGET_DFP"
+  "@
+   mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2
+   mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2"
+  [(set_attr "type" "dfp")
+   (set_attr "isa" "p9,*")
+   (set_attr "length" "16,20")])
+
 (define_insn "add<mode>3"
   [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
        (plus:DDTD (match_operand:DDTD 1 "gpc_reg_operand" "%d")
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c0d9877c715..b3fcb845a38 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -322,7 +322,7 @@
   (const (symbol_ref "(enum attr_cpu) rs6000_tune")))
 
 ;; The ISA we implement.
-(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9v,p9kf,p9tf,p10"
+(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
   (const_string "any"))
 
 ;; Is this alternative enabled for the current CPU/ISA/etc.?
@@ -351,6 +351,10 @@
          (match_test "TARGET_P8_VECTOR"))
      (const_int 1)
 
+     (and (eq_attr "isa" "p9")
+         (match_test "TARGET_MODULO"))
+     (const_int 1)
+
      (and (eq_attr "isa" "p9v")
          (match_test "TARGET_P9_VECTOR"))
      (const_int 1)
Jonathan Wakely via Gcc-patches July 18, 2020, 2:13 a.m. | #5
On 7/17/20 5:07 PM, Peter Bergner wrote:
> As we discussed offline, we need the duplicated constraint alternatives

> like below.  I'll test this and commit this if clean with the following

> additional ChangeLog entry for the p9 addition.  Thanks.


Testing was clean, so I pushed the commit.  Thanks!

Peter

Patch

diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
index e91d6f581ed..50bfad6beb7 100644
--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
@@ -155,6 +155,16 @@ 
   [(set_attr "type" "dfp")
    (set_attr "length" "8")])
 
+(define_insn "trunctdsd2"
+  [(set (match_operand:SD 0 "gpc_reg_operand" "=d")
+	(float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d")))
+   (clobber (match_scratch:TD 2 "=&d"))
+   (clobber (match_scratch:DF 3 "=&d"))]
+  "TARGET_DFP"
+  "mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2"
+  [(set_attr "type" "dfp")
+   (set_attr "length" "20")])
+
 (define_insn "add<mode>3"
   [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
 	(plus:DDTD (match_operand:DDTD 1 "gpc_reg_operand" "%d")
diff --git a/gcc/testsuite/gcc.target/powerpc/convert-fp-128.c b/gcc/testsuite/gcc.target/powerpc/convert-fp-128.c
index 67896d92c86..5b0ef3b0d49 100644
--- a/gcc/testsuite/gcc.target/powerpc/convert-fp-128.c
+++ b/gcc/testsuite/gcc.target/powerpc/convert-fp-128.c
@@ -34,7 +34,7 @@  conv2
 
 /* { dg-final { scan-assembler-times {\mbl\M} 24 { target { ! hard_dfp } } } } */
 /* { dg-final { scan-assembler-times {\mbl\M} 19 { target { hard_dfp && { ! ppc_float128 } } } } } */
-/* { dg-final { scan-assembler-times {\mbl\M} 31 { target { hard_dfp && { ppc_float128 && { ! ppc_float128_insns } } } } } } */
+/* { dg-final { scan-assembler-times {\mbl\M} 30 { target { hard_dfp && { ppc_float128 && { ! ppc_float128_insns } } } } } } */
 /* { dg-final { scan-assembler-times {\mbl\M} 27 { target { hard_dfp && { ppc_float128 && { ppc_float128_insns } } } } } } */
 
 
@@ -60,20 +60,20 @@  conv2
 /* { dg-final { scan-assembler-times {\mbl __dpd_extendsddf\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_extendsdtf\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_extendsdkf\M} 1 { target { ppc_float128 } } } } */
-/* { dg-final { scan-assembler-times {\mbl __dpd_extendsddd2\M} 1 { target { ! dfp } } } } */
-/* { dg-final { scan-assembler-times {\mbl __dpd_extendsdtd2\M} 1 { target { ! dfp } } } } */
+/* { dg-final { scan-assembler-times {\mbl __dpd_extendsddd2\M} 1 { target { ! hard_dfp } } } } */
+/* { dg-final { scan-assembler-times {\mbl __dpd_extendsdtd2\M} 1 { target { ! hard_dfp } } } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_truncddsf\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_truncdddf\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_extendddtf\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_extendddkf\M} 1 { target { ppc_float128 } } } } */
-/* { dg-final { scan-assembler-times {\mbl __dpd_truncddsd2\M} 1 { target { ! dfp } } } } */
-/* { dg-final { scan-assembler-times {\mbl __dpd_extendddtd2\M} 1 { target { ! dfp } } } } */
+/* { dg-final { scan-assembler-times {\mbl __dpd_truncddsd2\M} 1 { target { ! hard_dfp } } } } */
+/* { dg-final { scan-assembler-times {\mbl __dpd_extendddtd2\M} 1 { target { ! hard_dfp } } } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_trunctdsf\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_trunctddf\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_trunctdtf\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mbl __dpd_trunctdkf\M} 1 { target { ppc_float128 } } } } */
-/* { dg-final { scan-assembler-times {\mbl __dpd_trunctdsd2\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mbl __dpd_trunctddd2\M} 1 { target { ! dfp } } } } */
+/* { dg-final { scan-assembler-times {\mbl __dpd_trunctdsd2\M} 1 { target { ! hard_dfp } } } } */
+/* { dg-final { scan-assembler-times {\mbl __dpd_trunctddd2\M} 1 { target { ! hard_dfp } } } } */
 
 
 /* { dg-final { scan-assembler-times {\mfrsp|xsrsp\M} 2 { target { ! ppc_float128_insns } } } } */
@@ -88,8 +88,8 @@  conv2
 /* { dg-final { scan-assembler-times {\mxxlor|xscpsgndp\M} 3 { target { ppc_float128_insns } } } } */
 
 
-/* { dg-final { scan-assembler-times {\mdrsp\M} 1 { target { hard_dfp } } } } */
-/* { dg-final { scan-assembler-times {\mdrdpq\M} 1 { target { hard_dfp } } } } */
+/* { dg-final { scan-assembler-times {\mdrsp\M} 2 { target { hard_dfp } } } } */
+/* { dg-final { scan-assembler-times {\mdrdpq\M} 2 { target { hard_dfp } } } } */
 /* { dg-final { scan-assembler-times {\mdctdp\M} 2 { target { hard_dfp } } } } */
 /* { dg-final { scan-assembler-times {\mdctqpq\M} 2 { target { hard_dfp } } } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92488.c b/gcc/testsuite/gcc.target/powerpc/pr92488.c
new file mode 100644
index 00000000000..3ca575531ca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr92488.c
@@ -0,0 +1,43 @@ 
+/* { dg-do run } */
+/* { dg-require-effective-target dfprt } */
+/* { dg-options "-O2" } */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+/* Runnable test case for testing _Decimal128 to _Decimal32 rounding.
+   The value below when rounded to _Decimal64 would result in the value
+   1.2345675e+00, which if it were rounded to _Decimal32 would result in
+   the value 1.234568e+00.  However, the correct value when rounding from
+   _Decimal128 directly to _Decimal32 is 1.234567e+00.  */
+
+_Decimal128 td = 1.23456749999999999999e+00dl;
+_Decimal32 sd_expected = 1.234567e+00df;
+
+_Decimal32 __attribute__((noinline))
+td2sd (_Decimal128 td)
+{
+  return td;
+}
+
+int
+main (void)
+{
+  _Decimal32 sd = td2sd (td);
+  if (sd != sd_expected)
+    {
+      union {
+	_Decimal32 sd;
+	unsigned int i;
+      } u;
+
+      printf ("cast to _Decimal32 failed:\n");
+      u.sd = sd;
+      printf ("  actual   = 0x%x\n", u.i);
+      u.sd = sd_expected;
+      printf ("  expected = 0x%x\n", u.i);
+      abort ();
+    }
+
+  return 0;
+}