[Arm] : MVE: Fix constant load pattern

Message ID 99b0cde6-10bb-f644-7726-170b69c75c31@arm.com
State New
Headers show
Series
  • [Arm] : MVE: Fix constant load pattern
Related show

Commit Message

Andre Vieira (lists) April 7, 2020, 10:34 a.m.
Hi,

This patch fixes the constant load pattern for MVE, this was not 
accounting correctly for label + offset cases.

Added test that ICE'd before and removed the scan assemblers for the 
mve_vector* tests as they were too fragile.

Bootstrapped on arm-linux-gnueabihf and regression tested on arm-none-eabi.

Is this OK for trunk?

gcc/ChangeLog:
2020-04-07  Andre Vieira  <andre.simoesdiasvieira@arm.com>

         * config/arm/arm.c (output_move_neon): Deal with label + offset 
cases.
         * config/arm/mve.md (*mve_mov<mode>): Handle const vectors.

gcc/testsuite/ChangeLog:
2020-04-07  Andre Vieira  <andre.simoesdiasvieira@arm.com>

         * gcc.target/arm/mve/intrinsics/mve_load_from_array.c: New test.
         * gcc.target/arm/mve/intrinsics/mve_vector_float.c: Remove 
scan-assembler.
         * gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.
         * gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.
         * gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.

Comments

Kyrylo Tkachov April 7, 2020, 10:52 a.m. | #1
> -----Original Message-----

> From: Andre Vieira (lists) <andre.simoesdiasvieira@arm.com>

> Sent: 07 April 2020 11:35

> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>

> Subject: [PATCH][GCC][Arm]: MVE: Fix constant load pattern

> 

> Hi,

> 

> This patch fixes the constant load pattern for MVE, this was not

> accounting correctly for label + offset cases.

> 

> Added test that ICE'd before and removed the scan assemblers for the

> mve_vector* tests as they were too fragile.

> 

> Bootstrapped on arm-linux-gnueabihf and regression tested on arm-none-

> eabi.

> 

> Is this OK for trunk?


This makes me a bit nervous as it touches the common output_move_neon code ☹ but it looks like it mostly shuffles things around.
Ok for trunk but please watch out for fallout.
Thanks,
Kyrill

> 

> gcc/ChangeLog:

> 2020-04-07  Andre Vieira  <andre.simoesdiasvieira@arm.com>

> 

>          * config/arm/arm.c (output_move_neon): Deal with label + offset

> cases.

>          * config/arm/mve.md (*mve_mov<mode>): Handle const vectors.

> 

> gcc/testsuite/ChangeLog:

> 2020-04-07  Andre Vieira  <andre.simoesdiasvieira@arm.com>

> 

>          * gcc.target/arm/mve/intrinsics/mve_load_from_array.c: New test.

>          * gcc.target/arm/mve/intrinsics/mve_vector_float.c: Remove

> scan-assembler.

>          * gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.

>          * gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.

>          * gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.
Andre Vieira (lists) April 7, 2020, 12:21 p.m. | #2
The diff looks weird, but this only removes the first if 
(TARGET_HAVE_MVE... ) block and updates the variable 'addr' which is 
only used in the consecutive(TARGET_HAVE_MVE ..) blocks. So it doesn't 
change NEON codegen.

It's unfortunate the diff looks so complicated :(

Cheers,
Andre

On 07/04/2020 11:52, Kyrylo Tkachov wrote:
>

>> -----Original Message-----

>> From: Andre Vieira (lists) <andre.simoesdiasvieira@arm.com>

>> Sent: 07 April 2020 11:35

>> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>

>> Subject: [PATCH][GCC][Arm]: MVE: Fix constant load pattern

>>

>> Hi,

>>

>> This patch fixes the constant load pattern for MVE, this was not

>> accounting correctly for label + offset cases.

>>

>> Added test that ICE'd before and removed the scan assemblers for the

>> mve_vector* tests as they were too fragile.

>>

>> Bootstrapped on arm-linux-gnueabihf and regression tested on arm-none-

>> eabi.

>>

>> Is this OK for trunk?

> This makes me a bit nervous as it touches the common output_move_neon code ☹ but it looks like it mostly shuffles things around.

> Ok for trunk but please watch out for fallout.

> Thanks,

> Kyrill

>

>> gcc/ChangeLog:

>> 2020-04-07  Andre Vieira  <andre.simoesdiasvieira@arm.com>

>>

>>           * config/arm/arm.c (output_move_neon): Deal with label + offset

>> cases.

>>           * config/arm/mve.md (*mve_mov<mode>): Handle const vectors.

>>

>> gcc/testsuite/ChangeLog:

>> 2020-04-07  Andre Vieira  <andre.simoesdiasvieira@arm.com>

>>

>>           * gcc.target/arm/mve/intrinsics/mve_load_from_array.c: New test.

>>           * gcc.target/arm/mve/intrinsics/mve_vector_float.c: Remove

>> scan-assembler.

>>           * gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.

>>           * gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.

>>           * gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.

Patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d5207e0d8f07f9be5265fc6d175c148c6cdd53cb..1af9d5cf145f6d01e364a1afd7ceb3df5da86c9a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -20122,52 +20122,43 @@  output_move_neon (rtx *operands)
 	  break;
 	}
       /* Fall through.  */
-    case LABEL_REF:
     case PLUS:
+      addr = XEXP (addr, 0);
+      /* Fall through.  */
+    case LABEL_REF:
       {
 	int i;
 	int overlap = -1;
-	if (TARGET_HAVE_MVE && !BYTES_BIG_ENDIAN
-	    && GET_CODE (addr) != LABEL_REF)
+	for (i = 0; i < nregs; i++)
 	  {
-	    sprintf (buff, "v%srw.32\t%%q0, %%1", load ? "ld" : "st");
-	    ops[0] = reg;
-	    ops[1] = mem;
-	    output_asm_insn (buff, ops);
-	  }
-	else
-	  {
-	    for (i = 0; i < nregs; i++)
+	    /* We're only using DImode here because it's a convenient
+	       size.  */
+	    ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i);
+	    ops[1] = adjust_address (mem, DImode, 8 * i);
+	    if (reg_overlap_mentioned_p (ops[0], mem))
 	      {
-		/* We're only using DImode here because it's a convenient
-		   size.  */
-		ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i);
-		ops[1] = adjust_address (mem, DImode, 8 * i);
-		if (reg_overlap_mentioned_p (ops[0], mem))
-		  {
-		    gcc_assert (overlap == -1);
-		    overlap = i;
-		  }
-		else
-		  {
-		    if (TARGET_HAVE_MVE && GET_CODE (addr) == LABEL_REF)
-		      sprintf (buff, "v%sr.64\t%%P0, %%1", load ? "ld" : "st");
-		    else
-		      sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
-		    output_asm_insn (buff, ops);
-		  }
+		gcc_assert (overlap == -1);
+		overlap = i;
 	      }
-	    if (overlap != -1)
+	    else
 	      {
-		ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * overlap);
-		ops[1] = adjust_address (mem, SImode, 8 * overlap);
 		if (TARGET_HAVE_MVE && GET_CODE (addr) == LABEL_REF)
-		  sprintf (buff, "v%sr.32\t%%P0, %%1", load ? "ld" : "st");
+		  sprintf (buff, "v%sr.64\t%%P0, %%1", load ? "ld" : "st");
 		else
 		  sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
 		output_asm_insn (buff, ops);
 	      }
 	  }
+	if (overlap != -1)
+	  {
+	    ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * overlap);
+	    ops[1] = adjust_address (mem, SImode, 8 * overlap);
+	    if (TARGET_HAVE_MVE && GET_CODE (addr) == LABEL_REF)
+	      sprintf (buff, "v%sr.32\t%%P0, %%1", load ? "ld" : "st");
+	    else
+	      sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
+	    output_asm_insn (buff, ops);
+	  }
 
         return "";
       }
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index d1028f4542b4972b4080e46544c86d625d77383a..10abc3fae3709891346b63213afb1fe3754af41a 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -695,9 +695,9 @@  (define_insn "*mve_mov<mode>"
     case 2:
       return "vmov\t%Q0, %R0, %e1  @ <mode>\;vmov\t%J0, %K0, %f1";
     case 4:
-      if ((TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))
-	  || (MEM_P (operands[1])
-	      && GET_CODE (XEXP (operands[1], 0)) == LABEL_REF))
+      if (MEM_P (operands[1])
+	  && (GET_CODE (XEXP (operands[1], 0)) == LABEL_REF
+	      || GET_CODE (XEXP (operands[1], 0)) == CONST))
 	return output_move_neon (operands);
       else
 	return "vldrb.8 %q0, %E1";
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_load_from_array.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_load_from_array.c
new file mode 100644
index 0000000000000000000000000000000000000000..dcf6225a98f3aa7cbfe554b8946b067711e78c23
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_load_from_array.c
@@ -0,0 +1,19 @@ 
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint16x8_t
+foo (void)
+{
+  static uint16_t const a[] = {0, 1, 2, 3, 4, 5, 6, 7};
+  return vld1q (a);
+}
+
+uint16_t b[] = {0, 1, 2, 3, 4, 5, 6, 7};
+void
+bar (uint16x8_t value)
+{
+  vst1q (b, value);
+}
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float.c
index 9de47e6a1e0214fef26630b7959f11e58809d2c0..881157fc1be2e3ca33666eaf7544cd0d020d0d4b 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float.c
@@ -11,17 +11,9 @@  foo32 (float32x4_t value)
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldmia.*" }  } */
-
 float16x8_t
 foo16 (float16x8_t value)
 {
   float16x8_t b = value;
   return b;
 }
-
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldmia.*" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float1.c
index ba8fb6dd5da44444464dd8e58e837d84540acd1d..9515ed622ace8e51cf87f05aab85d0e501f8dd44 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float1.c
@@ -13,10 +13,6 @@  foo32 ()
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldmia.*" }  } */
-
 float16x8_t value1;
 
 float16x8_t
@@ -25,7 +21,3 @@  foo16 ()
   float16x8_t b = value1;
   return b;
 }
-
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldmia.*" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_int1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_int1.c
index 2d2fd116dfcfcdb04220e92090706c362350e8d2..e54516b25309a64355d6584d2d9a5aed6ee9611e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_int1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_int1.c
@@ -16,10 +16,6 @@  foo8 (void)
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldrb.8*" }  } */
-
 int16x8_t
 foo16 (void)
 {
@@ -27,10 +23,6 @@  foo16 (void)
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldrb.8*" }  } */
-
 int32x4_t
 foo32 (void)
 {
@@ -38,10 +30,6 @@  foo32 (void)
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldrb.8" }  } */
-
 int64x2_t
 foo64 (void)
 {
@@ -49,6 +37,3 @@  foo64 (void)
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldrb.8" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_int2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_int2.c
index 7ec85866993f9919fc6945d056a9cf8fcf361300..2bd9bdfed92e7342b8f87551ed83b6a45cc577eb 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_int2.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_int2.c
@@ -11,10 +11,6 @@  foo8 ()
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldr.64.*" }  } */
-
 int16x8_t
 foo16 (int16x8_t value)
 {
@@ -22,10 +18,6 @@  foo16 (int16x8_t value)
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldr.64.*" }  } */
-
 int32x4_t
 foo32 (int32x4_t value)
 {
@@ -33,17 +25,9 @@  foo32 (int32x4_t value)
   return b;
 }
 
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldr.64.*" }  } */
-
 int64x2_t
 foo64 (int64x2_t value)
 {
   int64x2_t b = {1};
   return b;
 }
-
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]"  }  } */
-/* { dg-final { scan-assembler "vstrb.*" }  } */
-/* { dg-final { scan-assembler "vldr.64.*" }  } */