RFA: Define vect_perm for variable-length SVE

Message ID 87lg8xtpfi.fsf@arm.com
State New

Commit Message

Richard Sandiford Aug. 23, 2018, 9:14 a.m.
Variable-length SVE now supports enough permutes to define vect_perm.

The change to vect_perm_supported is currently a no-op because the
function is only called with a count of 3.

Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
and x86_64-linux-gnu.  OK for the vect_perm_supported change?
I think the rest is covered by the SVE maintainership.

Richard


2018-08-23  Richard Sandiford  <richard.sandiford@arm.com>

gcc/testsuite/
	* lib/target-supports.exp (vect_perm_supported): Only return
	false for variable-length vectors if the permute size is not
	a power of 2.
	(check_effective_target_vect_perm)
	(check_effective_target_vect_perm_byte)
	(check_effective_target_vect_perm_short): Remove check for
	variable-length vectors.
	* gcc.dg/vect/slp-23.c: Add an XFAIL for variable-length SVE.
	* gcc.dg/vect/slp-perm-10.c: Likewise.
	* gcc.dg/vect/slp-perm-9.c: Add an XFAIL for variable-length vectors.

Comments

Richard Biener Aug. 23, 2018, 1:27 p.m. | #1
On Thu, Aug 23, 2018 at 11:15 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Variable-length SVE now supports enough permutes to define vect_perm.
>
> The change to vect_perm_supported is currently a no-op because the
> function is only called with a count of 3.
>
> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
> and x86_64-linux-gnu.  OK for the vect_perm_supported change?
> I think the rest is covered by the SVE maintainership.

OK


Patch

Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	2018-08-21 14:47:06.491178839 +0100
+++ gcc/testsuite/lib/target-supports.exp	2018-08-23 10:09:45.296442485 +0100
@@ -5758,8 +5758,7 @@  proc check_effective_target_vect_perm {
     } else {
 	set et_vect_perm_saved($et_index) 0
         if { [is-effective-target arm_neon]
-	     || ([istarget aarch64*-*-*]
-		 && ![check_effective_target_vect_variable_length])
+	     || [istarget aarch64*-*-*]
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*]
 	     || [istarget i?86-*-*] || [istarget x86_64-*-*]
@@ -5824,7 +5823,9 @@  proc check_effective_target_vect_perm {
 
 proc vect_perm_supported { count element_bits } {
     set vector_bits [lindex [available_vector_sizes] 0]
-    if { $vector_bits <= 0 } {
+    # The number of vectors has to be a power of 2 when permuting
+    # variable-length vectors.
+    if { $vector_bits <= 0 && ($count & -$count) != $count } {
 	return 0
     }
     set vf [expr { $vector_bits / $element_bits }]
@@ -5864,8 +5865,7 @@  proc check_effective_target_vect_perm_by
         if { ([is-effective-target arm_neon]
 	      && [is-effective-target arm_little_endian])
 	     || ([istarget aarch64*-*-*]
-		 && [is-effective-target aarch64_little_endian]
-		 && ![check_effective_target_vect_variable_length])
+		 && [is-effective-target aarch64_little_endian])
 	     || [istarget powerpc*-*-*]
 	     || [istarget spu-*-*]
 	     || ([istarget mips-*.*]
@@ -5904,8 +5904,7 @@  proc check_effective_target_vect_perm_sh
         if { ([is-effective-target arm_neon]
 	      && [is-effective-target arm_little_endian])
 	     || ([istarget aarch64*-*-*]
-		 && [is-effective-target aarch64_little_endian]
-		 && ![check_effective_target_vect_variable_length])
+		 && [is-effective-target aarch64_little_endian])
 	     || [istarget powerpc*-*-*]
 	     || [istarget spu-*-*]
 	     || (([istarget i?86-*-*] || [istarget x86_64-*-*])
Index: gcc/testsuite/gcc.dg/vect/slp-23.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-23.c	2018-05-02 08:37:48.985604715 +0100
+++ gcc/testsuite/gcc.dg/vect/slp-23.c	2018-08-23 10:09:45.296442485 +0100
@@ -107,8 +107,8 @@  int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target { vect_strided8 && { ! { vect_no_align} } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! { vect_strided8 || vect_no_align } } } } } */
-/* We fail to vectorize the second loop with variable-length SVE but
-   fall back to 128-bit vectors, which does use SLP.  */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_perm } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target vect_perm } } } */
+/* SLP fails for the second loop with variable-length SVE because
+   the load size is greater than the minimum vector size.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target vect_perm xfail { aarch64_sve && vect_variable_length } } } } */
   
Index: gcc/testsuite/gcc.dg/vect/slp-perm-10.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-perm-10.c	2016-11-11 17:07:36.516798781 +0000
+++ gcc/testsuite/gcc.dg/vect/slp-perm-10.c	2018-08-23 10:09:45.296442485 +0100
@@ -50,4 +50,6 @@  int main ()
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_perm } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm } } } */
+/* SLP fails for variable-length SVE because the load size is greater
+   than the minimum vector size.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm xfail { aarch64_sve && vect_variable_length } } } } */
Index: gcc/testsuite/gcc.dg/vect/slp-perm-9.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-perm-9.c	2018-05-02 08:37:49.013604450 +0100
+++ gcc/testsuite/gcc.dg/vect/slp-perm-9.c	2018-08-23 10:09:45.296442485 +0100
@@ -59,7 +59,9 @@  int main (int argc, const char* argv[])
 
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" { target { ! { vect_perm_short || vect_load_lanes } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_short || vect_load_lanes } } } } */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 1 "vect" { target { vect_perm_short && { ! vect_perm3_short } } } } } */
+/* We don't try permutes with a group size of 3 for variable-length
+   vectors.  */
+/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 1 "vect" { target { vect_perm_short && { ! vect_perm3_short } } xfail vect_variable_length } } } */
 /* { dg-final { scan-tree-dump-not "permutation requires at least three vectors" "vect" { target vect_perm3_short } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { { ! vect_perm3_short } || vect_load_lanes } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_perm3_short && { ! vect_load_lanes } } } } } */