[rs6000] Add BE support to builtin vec_extract_fp32_from_shorth, vec_extract_fp32_from_shortl

Message ID 1528384196.4981.30.camel@us.ibm.com
State New

Commit Message

Carl Love June 7, 2018, 3:09 p.m.
GCC Maintainers:

The Power 9 builtins vec_extract_fp32_from_shorth and
vec_extract_fp32_from_shortl currently work only on Power 9 LE.  The
following patch adds BE support.  Since no Power 9 BE systems are
currently available, BE testing consisted of visually inspecting the
generated code to make sure the desired code sequence was produced.

The patch was tested, with no regressions, on:

    powerpc64le-unknown-linux-gnu (Power 8 LE)
    powerpc64le-unknown-linux-gnu (Power 9 LE)
    powerpc64-unknown-linux-gnu (Power 8 BE)

Please let me know if the patch looks OK for GCC mainline.  The patch
also needs to be backported to GCC 8.

                         Carl Love

---------------------------------------------------------------------

gcc/ChangeLog:

2018-05-31  Carl Love  <cel@us.ibm.com>

	* gcc/config/rs6000/vsx.md (vextract_fp_from_shorth
	vextract_fp_from_shortl): Add BE support.

gcc/testsuite/ChangeLog:

2018-05-31  Carl Love  <cel@us.ibm.com>

	* gcc.target/powerpc/builtins-3-p9-runnable.c: Add debug print statements
---
 gcc/config/rs6000/vsx.md                           | 23 +++++++++++-----
 .../gcc.target/powerpc/builtins-3-p9-runnable.c    | 31 ++++++++++++++++++++--
 2 files changed, 46 insertions(+), 8 deletions(-)

-- 
2.7.4

Comments

Segher Boessenkool June 7, 2018, 9:25 p.m. | #1
On Thu, Jun 07, 2018 at 08:09:56AM -0700, Carl Love wrote:
> 2018-05-31  Carl Love  <cel@us.ibm.com>
> 
> 	* gcc/config/rs6000/vsx.md (vextract_fp_from_shorth

,

> 	vextract_fp_from_shortl): Add BE support.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-05-31  Carl Love  <cel@us.ibm.com>
> 
> 	* gcc.target/powerpc/builtins-3-p9-runnable.c: Add debug print statements

Okay for trunk and backports.  Thanks!


Segher
Peter Bergner June 7, 2018, 10:04 p.m. | #2
On 6/7/18 10:09 AM, Carl Love wrote:
> 	* gcc/config/rs6000/vsx.md (vextract_fp_from_shorth
> 	vextract_fp_from_shortl): Add BE support.

Missing comma at the end of the first line I think.

Peter

Patch
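
Both vsx.md hunks below follow the same pattern: each expander now carries
two permute-index tables and picks one according to BYTES_BIG_ENDIAN, and
the table used by shorth on LE is the one used by shortl on BE (and vice
versa).  The following is a minimal plain-C sketch of that selection only,
outside GCC.  The names table_a, table_b, select_mask and check_symmetry,
and the two boolean parameters, are hypothetical and not part of the patch;
the sketch does not model what vperm/xxperm do with the mask.

```c
#include <assert.h>
#include <stdbool.h>

/* The two index tables that appear in the patch (vals_le / vals_be).  */
static const int table_a[16] = {15, 14, 0, 0, 13, 12, 0, 0,
                                11, 10, 0, 0, 9, 8, 0, 0};
static const int table_b[16] = {7, 6, 0, 0, 5, 4, 0, 0,
                                3, 2, 0, 0, 1, 0, 0, 0};

/* Hypothetical helper: return the table the patch would use for the
   given builtin (shorth or shortl) and target endianness.  */
static const int *
select_mask (bool is_shorth, bool big_endian)
{
  /* shorth/LE and shortl/BE share table_a;
     shorth/BE and shortl/LE share table_b.  */
  return (is_shorth != big_endian) ? table_a : table_b;
}

/* Hypothetical self-check of the four combinations in the patch.  */
static void
check_symmetry (void)
{
  assert (select_mask (true, false) == table_a);   /* shorth, LE */
  assert (select_mask (true, true) == table_b);    /* shorth, BE */
  assert (select_mask (false, false) == table_b);  /* shortl, LE */
  assert (select_mask (false, true) == table_a);   /* shortl, BE */
}
```

In the patch itself the choice is made per element inside the loop that
fills rvals[], which yields the same table contents.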

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0e016fe..a528ef2 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5214,8 +5214,9 @@ 
    UNSPEC_VSX_VEXTRACT_FP_FROM_SHORTH))]
   "TARGET_P9_VECTOR"
 {
-  int vals[16] = {15, 14, 0, 0, 13, 12, 0, 0, 11, 10, 0, 0, 9, 8, 0, 0};
   int i;
+  int vals_le[16] = {15, 14, 0, 0, 13, 12, 0, 0, 11, 10, 0, 0, 9, 8, 0, 0};
+  int vals_be[16] = {7, 6, 0, 0, 5, 4, 0, 0, 3, 2, 0, 0, 1, 0, 0, 0};
 
   rtx rvals[16];
   rtx mask = gen_reg_rtx (V16QImode);
@@ -5223,11 +5224,15 @@ 
   rtvec v;
 
   for (i = 0; i < 16; i++)
-    rvals[i] = GEN_INT (vals[i]);
+    if (!BYTES_BIG_ENDIAN)
+      rvals[i] = GEN_INT (vals_le[i]);
+    else
+      rvals[i] = GEN_INT (vals_be[i]);
 
   /* xvcvhpsp - vector convert F16 to vector F32 requires the four F16
      inputs in half words 1,3,5,7 (IBM numbering).  Use xxperm to move
-     src half words 0,1,2,3 for the conversion instruction.  */
+     src half words 0,1,2,3 (LE), src half words 4,5,6,7 (BE) for the
+     conversion instruction.  */
   v = gen_rtvec_v (16, rvals);
   emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
   emit_insn (gen_altivec_vperm_v8hiv16qi (tmp, operands[1],
@@ -5244,7 +5249,9 @@ 
 	UNSPEC_VSX_VEXTRACT_FP_FROM_SHORTL))]
   "TARGET_P9_VECTOR"
 {
-  int vals[16] = {7, 6, 0, 0, 5, 4, 0, 0, 3, 2, 0, 0, 1, 0, 0, 0};
+  int vals_le[16] = {7, 6, 0, 0, 5, 4, 0, 0, 3, 2, 0, 0, 1, 0, 0, 0};
+  int vals_be[16] = {15, 14, 0, 0, 13, 12, 0, 0, 11, 10, 0, 0, 9, 8, 0, 0};
+
   int i;
   rtx rvals[16];
   rtx mask = gen_reg_rtx (V16QImode);
@@ -5252,11 +5259,15 @@ 
   rtvec v;
 
   for (i = 0; i < 16; i++)
-    rvals[i] = GEN_INT (vals[i]);
+    if (!BYTES_BIG_ENDIAN)
+      rvals[i] = GEN_INT (vals_le[i]);
+    else
+      rvals[i] = GEN_INT (vals_be[i]);
 
   /* xvcvhpsp - vector convert F16 to vector F32 requires the four F16
      inputs in half words 1,3,5,7 (IBM numbering).  Use xxperm to move
-     src half words 4,5,6,7 for the conversion instruction.  */
+     src half words 4,5,6,7 (LE), src half words 0,1,2,3 (BE) for the
+     conversion instruction.  */
   v = gen_rtvec_v (16, rvals);
   emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
   emit_insn (gen_altivec_vperm_v8hiv16qi (tmp, operands[1],
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c
index 3b67e53..3197a50 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c
@@ -2,6 +2,10 @@ 
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
 /* { dg-options "-mcpu=power9 -O2" } */
 
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
 #include <altivec.h> // vector
 
 void abort (void);
@@ -16,10 +20,26 @@  int main() {
                                    0B000000000000000, 0B0100100001000000,
                                    0B011111000000000, 0B0011100000000000,
                                    0B011110100000000, 0B1011010000000000};
-   
+
+#ifdef DEBUG
+   printf ("Claim, source data is 8 16-bit floats:\n");
+   printf ("   {1.0, -2.0, 0.0, 8.5, 1.5, 0.5, 1.25, -0.25}\n");
+   printf ("vusha = (vector unsigned short){0B011110000000000, 0B1100000000000000,\n");
+   printf ("                                0B000000000000000, 0B0100100001000000,\n");
+   printf ("                                0B011111000000000, 0B0011100000000000,\n");
+   printf ("                                0B011110100000000, 0B1011010000000000};\n\n");
+#endif
+
    vfexpt = (vector float){1.0, -2.0, 0.0, 8.5};
    vfr = vec_extract_fp_from_shorth(vusha);
 
+#ifdef DEBUG
+   printf ("vec_extract_fp_from_shorth\n");
+   for (i=0; i<4; i++)
+     printf("result[%d] = %f; expected[%d] = %f\n",
+	    i, vfr[i], i, vfexpt[i]);
+#endif
+
    for (i=0; i<4; i++) {
       if (vfr[i] != vfexpt[i])
          abort();
@@ -28,7 +48,14 @@  int main() {
    vfexpt = (vector float){1.5, 0.5, 1.25, -0.25};
    vfr = vec_extract_fp_from_shortl(vusha);
 
-   for (i=0; i<4; i++) {
+#ifdef DEBUG
+   printf ("\nvec_extract_fp_from_shortl\n");
+   for (i=0; i<4; i++)
+     printf("result[%d] = %f; expected[%d] = %f\n",
+	    i, vfr[i], i, vfexpt[i]);
+#endif
+
+    for (i=0; i<4; i++) {
       if (vfr[i] != vfexpt[i])
          abort();
    }
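
The DEBUG output in the test claims that vusha holds the eight 16-bit
floats {1.0, -2.0, 0.0, 8.5, 1.5, 0.5, 1.25, -0.25}.  Since no Power 9 BE
system was available, that claim can at least be cross-checked in software
on any host.  The following is a minimal sketch of an IEEE 754 binary16
decoder; half_bits_to_float and check_claimed_values are hypothetical
helpers, not part of the patch, and the hex constants are the test's binary
literals rewritten in hex.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Decode one IEEE 754 binary16 bit pattern into a float in software:
   1 sign bit, 5 exponent bits (bias 15), 10 fraction bits.  */
static float
half_bits_to_float (uint16_t h)
{
  int sign = (h >> 15) & 0x1;
  int exp = (h >> 10) & 0x1f;
  int frac = h & 0x3ff;
  float mag;

  if (exp == 0)
    mag = ldexpf ((float) frac, -24);         /* zero and subnormals */
  else if (exp == 31)
    mag = frac ? NAN : INFINITY;              /* NaN or infinity */
  else
    /* Normal: (1024 + frac) * 2^(exp - 15 - 10).  */
    mag = ldexpf ((float) (frac | 0x400), exp - 25);

  return sign ? -mag : mag;
}

/* Check the eight source values listed in the test's DEBUG output.  */
static void
check_claimed_values (void)
{
  static const uint16_t bits[8] = {0x3C00, 0xC000, 0x0000, 0x4840,
                                   0x3E00, 0x3800, 0x3D00, 0xB400};
  static const float expected[8] = {1.0f, -2.0f, 0.0f, 8.5f,
                                    1.5f, 0.5f, 1.25f, -0.25f};
  int i;

  for (i = 0; i < 8; i++)
    assert (half_bits_to_float (bits[i]) == expected[i]);
}
```

This only confirms the source data; it says nothing about which four
elements each builtin selects on a given endianness.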