, PR target/84914, Fix complex long double multiply/divide on PowerPC -mabi=ieeelongdouble

Message ID 20180321154201.GA13408@ibm-tiger.the-meissners.org
State New
Headers show
Series
  • , PR target/84914, Fix complex long double multiply/divide on PowerPC -mabi=ieeelongdouble
Related show

Commit Message

Michael Meissner March 21, 2018, 3:42 p.m.
Joseph Myers pointed out that we call the wrong function for complex long
double multiply and divide.  This patch changes the functions called from
__multc3/__divtc3 to __mulkc3/__divkc3.

I have built this on power8 systems, both a little endian system, and a big
endian system using --with-cpu=power8.  Both compilers built without error, and
had no regressions in the test suite.  Can I check this into the trunk?

[gcc]
2018-03-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/84914
	* config/rs6000/rs6000.c (create_complex_muldiv): New helper
	function to create the function decl for complex long double
	multiply and divide for -mabi=ieeelongdouble.
	(init_float128_ieee): Call it.

[gcc/testsuite]
2018-03-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/84914
	* gcc.target/powerpc/mulkc-2.c: New tests to make sure complex
	long double multiply/divide uses the correct function.
	* gcc.target/powerpc/mulkc-3.c: Likewise.
	* gcc.target/powerpc/divkc-2.c: Likewise.
	* gcc.target/powerpc/divkc-3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Comments

Segher Boessenkool March 22, 2018, 10:52 p.m. | #1
On Wed, Mar 21, 2018 at 11:42:01AM -0400, Michael Meissner wrote:
> +/* Create a decl for either complex long double multiply or complex long double

> +   divide when long double is IEEE 128-bit floating point.  We can't use

> +   __multc3 and __divtc3 because the original long double using IBM extended

> +   double used those names.  The complex multiply/divide functions are encoded

> +   as builtin functions with a complex result and 4 scalar inputs.  */


The function is much more generic than that (other than the name and the
debug printf).

> +static void

> +create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)

> +{

> +  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,

> +				      name, NULL_TREE);

> +

> +  set_builtin_decl (fncode, fndecl, true);

> +

> +  if (TARGET_DEBUG_BUILTIN)

> +    fprintf (stderr, "create complex %s, fncode: %d, fndecl: 0x%lx\n", name,

> +	     (int) fncode, (unsigned long)fndecl);


Space after cast.  Don't cast pointers to integer, just use %p.  Is this
address useful to print at all?

> @@ -18689,6 +18710,24 @@ init_float128_ieee (machine_mode mode)

>  {

>    if (FLOAT128_VECTOR_P (mode))

>      {

> +      /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  */

> +     if (mode == TFmode && TARGET_IEEEQUAD)

> +       {

> +	 built_in_function fncode_mul =

> +	   (built_in_function)(BUILT_IN_COMPLEX_MUL_MIN + TCmode - MIN_MODE_COMPLEX_FLOAT);

> +	 built_in_function fncode_div =

> +	   (built_in_function)(BUILT_IN_COMPLEX_DIV_MIN + TCmode - MIN_MODE_COMPLEX_FLOAT);


Space after cast, lines too long.  Put TCmode - MIN_MODE_COMPLEX_FLOAT in
a temp, maybe?

> --- gcc/testsuite/gcc.target/powerpc/mulkc3-3.c	(revision 0)

> +++ gcc/testsuite/gcc.target/powerpc/mulkc3-3.c	(revision 0)

> @@ -0,0 +1,16 @@

> +/* { dg-do compile { target { powerpc64*-*-* } } } */


Does powerpc*-*-* not work?

> +/* { dg-require-effective-target powerpc_p9vector_ok } */

> +/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */


powerpc_p9vector_ok but -mpower8-vector, that is strange.


Segher
Michael Meissner March 23, 2018, 7:19 p.m. | #2
On Thu, Mar 22, 2018 at 05:52:56PM -0500, Segher Boessenkool wrote:
> On Wed, Mar 21, 2018 at 11:42:01AM -0400, Michael Meissner wrote:

> > +/* Create a decl for either complex long double multiply or complex long double

> > +   divide when long double is IEEE 128-bit floating point.  We can't use

> > +   __multc3 and __divtc3 because the original long double using IBM extended

> > +   double used those names.  The complex multiply/divide functions are encoded

> > +   as builtin functions with a complex result and 4 scalar inputs.  */

> 

> The function is much more generic than that (other than the name and the

> debug printf).


Not really.  All of the other built-ins go through more standard paths of
specifying insns with pre-defined names.  There is no way to create the complex
multiply and divide builtins without doing the low level build of the function
type and assigning the function into the builtin table.

You can use the function to create other builtins, but it would be better to do
those in the typical fashion.  From the comments, the problem is the library
support does not support returning complex types, so they hacked in the complex
multiply/divide as a builtin.  Even so, the function should have two complex
arguments, instead of 4 scalars.

> > +static void

> > +create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)

> > +{

> > +  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,

> > +				      name, NULL_TREE);

> > +

> > +  set_builtin_decl (fncode, fndecl, true);

> > +

> > +  if (TARGET_DEBUG_BUILTIN)

> > +    fprintf (stderr, "create complex %s, fncode: %d, fndecl: 0x%lx\n", name,

> > +	     (int) fncode, (unsigned long)fndecl);

> 

> Space after cast.  Don't cast pointers to integer, just use %p.  Is this

> address useful to print at all?


When I was doing the debugging, it was useful (making sure the right function
got called).  I eliminated the debug_tree I was also doing, I'll eliminate the

> > @@ -18689,6 +18710,24 @@ init_float128_ieee (machine_mode mode)

> >  {

> >    if (FLOAT128_VECTOR_P (mode))

> >      {

> > +      /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  */

> > +     if (mode == TFmode && TARGET_IEEEQUAD)

> > +       {

> > +	 built_in_function fncode_mul =

> > +	   (built_in_function)(BUILT_IN_COMPLEX_MUL_MIN + TCmode - MIN_MODE_COMPLEX_FLOAT);

> > +	 built_in_function fncode_div =

> > +	   (built_in_function)(BUILT_IN_COMPLEX_DIV_MIN + TCmode - MIN_MODE_COMPLEX_FLOAT);

> 

> Space after cast, lines too long.  Put TCmode - MIN_MODE_COMPLEX_FLOAT in

> a temp, maybe?


Yeah, I reorganized things before the submission, and I didn't notice the line
being too long.

> > --- gcc/testsuite/gcc.target/powerpc/mulkc3-3.c	(revision 0)

> > +++ gcc/testsuite/gcc.target/powerpc/mulkc3-3.c	(revision 0)

> > @@ -0,0 +1,16 @@

> > +/* { dg-do compile { target { powerpc64*-*-* } } } */

> 

> Does powerpc*-*-* not work?

> 

> > +/* { dg-require-effective-target powerpc_p9vector_ok } */

> > +/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */

> 

> powerpc_p9vector_ok but -mpower8-vector, that is strange.


Both of these are cut+paste errors, where I took another test as the basis of
this test (that other test was testing the float128 h/w), and wasn't careful
about changing all of the conditions.

Assuming this patch passes the bootstrap and regression testing, is it ok to
check into trunk?

[gcc]
2018-03-23  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/84914
	* config/rs6000/rs6000.c (create_complex_muldiv): New helper
	function to create the function decl for complex long double
	multiply and divide for -mabi=ieeelongdouble.
	(init_float128_ieee): Call it.

[gcc/testsuite]
2018-03-23  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/84914
	* gcc.target/powerpc/mulkc-2.c: New tests to make sure complex
	long double multiply/divide uses the correct function.
	* gcc.target/powerpc/mulkc-3.c: Likewise.
	* gcc.target/powerpc/divkc-2.c: Likewise.
	* gcc.target/powerpc/divkc-3.c: Likewise.


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 258601)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -18678,6 +18678,26 @@ init_float128_ibm (machine_mode mode)
     }
 }
 
+/* Create a decl for either complex long double multiply or complex long double
+   divide when long double is IEEE 128-bit floating point.  We can't use
+   __multc3 and __divtc3 because the original long double using IBM extended
+   double used those names.  The complex multiply/divide functions are encoded
+   as builtin functions with a complex result and 4 scalar inputs.  */
+
+static void
+create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
+{
+  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
+				      name, NULL_TREE);
+
+  set_builtin_decl (fncode, fndecl, true);
+
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
+
+  return;
+}
+
 /* Set up IEEE 128-bit floating point routines.  Use different names if the
    arguments can be passed in a vector register.  The historical PowerPC
    implementation of IEEE 128-bit floating point used _q_<op> for the names, so
@@ -18689,6 +18709,27 @@ init_float128_ieee (machine_mode mode)
 {
   if (FLOAT128_VECTOR_P (mode))
     {
+      /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  */
+     if (mode == TFmode && TARGET_IEEEQUAD)
+       {
+	 built_in_function fncode_mul =
+	   (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
+				- MIN_MODE_COMPLEX_FLOAT);
+	 built_in_function fncode_div =
+	   (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + TCmode
+				- MIN_MODE_COMPLEX_FLOAT);
+
+	 tree fntype = build_function_type_list (complex_long_double_type_node,
+						 long_double_type_node,
+						 long_double_type_node,
+						 long_double_type_node,
+						 long_double_type_node,
+						 NULL_TREE);
+
+	 create_complex_muldiv ("__mulkc3", fncode_mul, fntype);
+	 create_complex_muldiv ("__divkc3", fncode_div, fntype);
+       }
+
       set_optab_libfunc (add_optab, mode, "__addkf3");
       set_optab_libfunc (sub_optab, mode, "__subkf3");
       set_optab_libfunc (neg_optab, mode, "__negkf2");
Index: gcc/testsuite/gcc.target/powerpc/mulkc3-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/mulkc3-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/mulkc3-3.c	(revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call when long double is
+   IBM extended double floating point.  */
+
+typedef _Complex long double cld_t;
+
+void
+multiply (cld_t *p, cld_t *q, cld_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __multc3" } } */
Index: gcc/testsuite/gcc.target/powerpc/divkc3-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/divkc3-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/divkc3-3.c	(revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call when long double is
+   IBM extended double floating point.  */
+
+typedef _Complex long double cld_t;
+
+void
+divide (cld_t *p, cld_t *q, cld_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divtc3" } } */
Index: gcc/testsuite/gcc.target/powerpc/mulkc3-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/mulkc3-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/mulkc3-2.c	(revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call when long double is
+   IEEE 128-bit floating point.  */
+
+typedef _Complex long double cld_t;
+
+void
+multiply (cld_t *p, cld_t *q, cld_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __mulkc3" } } */
Index: gcc/testsuite/gcc.target/powerpc/divkc3-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/divkc3-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/divkc3-2.c	(revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call when long double is
+   IEEE 128-bit floating point.  */
+
+typedef _Complex long double cld_t;
+
+void
+divide (cld_t *p, cld_t *q, cld_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divkc3" } } */
Segher Boessenkool March 27, 2018, 10:09 p.m. | #3
On Fri, Mar 23, 2018 at 03:19:03PM -0400, Michael Meissner wrote:
> 2018-03-23  Michael Meissner  <meissner@linux.vnet.ibm.com>

> 

> 	PR target/84914

> 	* config/rs6000/rs6000.c (create_complex_muldiv): New helper

> 	function to create the function decl for complex long double

> 	multiply and divide for -mabi=ieeelongdouble.

> 	(init_float128_ieee): Call it.

> 

> [gcc/testsuite]

> 2018-03-23  Michael Meissner  <meissner@linux.vnet.ibm.com>

> 

> 	PR target/84914

> 	* gcc.target/powerpc/mulkc-2.c: New tests to make sure complex

> 	long double multiply/divide uses the correct function.

> 	* gcc.target/powerpc/mulkc-3.c: Likewise.

> 	* gcc.target/powerpc/divkc-2.c: Likewise.

> 	* gcc.target/powerpc/divkc-3.c: Likewise.


Okay for trunk.  Thanks!


Segher

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 258601)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -18678,6 +18678,27 @@  init_float128_ibm (machine_mode mode)
     }
 }
 
+/* Create a decl for either complex long double multiply or complex long double
+   divide when long double is IEEE 128-bit floating point.  We can't use
+   __multc3 and __divtc3 because the original long double using IBM extended
+   double used those names.  The complex multiply/divide functions are encoded
+   as builtin functions with a complex result and 4 scalar inputs.  */
+
+static void
+create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
+{
+  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
+				      name, NULL_TREE);
+
+  set_builtin_decl (fncode, fndecl, true);
+
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr, "create complex %s, fncode: %d, fndecl: 0x%lx\n", name,
+	     (int) fncode, (unsigned long)fndecl);
+
+  return;
+}
+
 /* Set up IEEE 128-bit floating point routines.  Use different names if the
    arguments can be passed in a vector register.  The historical PowerPC
    implementation of IEEE 128-bit floating point used _q_<op> for the names, so
@@ -18689,6 +18710,24 @@  init_float128_ieee (machine_mode mode)
 {
   if (FLOAT128_VECTOR_P (mode))
     {
+      /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  */
+     if (mode == TFmode && TARGET_IEEEQUAD)
+       {
+	 built_in_function fncode_mul =
+	   (built_in_function)(BUILT_IN_COMPLEX_MUL_MIN + TCmode - MIN_MODE_COMPLEX_FLOAT);
+	 built_in_function fncode_div =
+	   (built_in_function)(BUILT_IN_COMPLEX_DIV_MIN + TCmode - MIN_MODE_COMPLEX_FLOAT);
+	 tree fntype = build_function_type_list (complex_long_double_type_node,
+						 long_double_type_node,
+						 long_double_type_node,
+						 long_double_type_node,
+						 long_double_type_node,
+						 NULL_TREE);
+
+	 create_complex_muldiv ("__mulkc3", fncode_mul, fntype);
+	 create_complex_muldiv ("__divkc3", fncode_div, fntype);
+       }
+
       set_optab_libfunc (add_optab, mode, "__addkf3");
       set_optab_libfunc (sub_optab, mode, "__subkf3");
       set_optab_libfunc (neg_optab, mode, "__negkf2");
Index: gcc/testsuite/gcc.target/powerpc/mulkc3-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/mulkc3-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/mulkc3-3.c	(revision 0)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target { powerpc64*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call when long double is
+   IBM extended double floating point.  */
+
+typedef _Complex long double cld_t;
+
+void
+multiply (cld_t *p, cld_t *q, cld_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __multc3" } } */
Index: gcc/testsuite/gcc.target/powerpc/divkc3-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/divkc3-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/divkc3-3.c	(revision 0)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target { powerpc64*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ibmlongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call when long double is
+   IBM extended double floating point.  */
+
+typedef _Complex long double cld_t;
+
+void
+divide (cld_t *p, cld_t *q, cld_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divtc3" } } */
Index: gcc/testsuite/gcc.target/powerpc/mulkc3-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/mulkc3-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/mulkc3-2.c	(revision 0)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target { powerpc64*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call when long double is
+   IEEE 128-bit floating point.  */
+
+typedef _Complex long double cld_t;
+
+void
+multiply (cld_t *p, cld_t *q, cld_t *r)
+{
+  *p = *q * *r;
+}
+
+/* { dg-final { scan-assembler "bl __mulkc3" } } */
Index: gcc/testsuite/gcc.target/powerpc/divkc3-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/divkc3-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/divkc3-2.c	(revision 0)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target { powerpc64*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Check that complex multiply generates the right call when long double is
+   IEEE 128-bit floating point.  */
+
+typedef _Complex long double cld_t;
+
+void
+divide (cld_t *p, cld_t *q, cld_t *r)
+{
+  *p = *q / *r;
+}
+
+/* { dg-final { scan-assembler "bl __divkc3" } } */