PR91166 - Unfolded ZIPs of constants

Message ID CAAgBjM=yhvx93aAZE8Q9Y2fkY5AQrvujJKDoCBc3eKPYmROUoQ@mail.gmail.com
State New

Commit Message

Prathamesh Kulkarni July 17, 2019, 11:55 a.m.
Hi,
The attached patch tries to fix PR91166.
Does it look OK ?
Bootstrap+test in progress on aarch64-linux-gnu and x86_64-unknown-linux-gnu.

Thanks,
Prathamesh
2019-07-17  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	PR middle-end/91166
	* match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
	(define_predicates): Add entry for uniform_vector_p.

testsuite/
	* gcc.target/aarch64/sve/pr91166.c: New test.

Comments

Richard Sandiford July 19, 2019, 12:42 p.m. | #1
Not really my area, but FWIW...

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> Hi,
> The attached patch tries to fix PR91166.
> Does it look OK ?
> Bootstrap+test in progress on aarch64-linux-gnu and x86_64-unknown-linux-gnu.
>
> Thanks,
> Prathamesh
>
> 2019-07-17  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
>
> 	PR middle-end/91166
> 	* match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
> 	(define_predicates): Add entry for uniform_vector_p.
>
> testsuite/
> 	* gcc.target/aarch64/sve/pr91166.c: New test.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 4a7aa0185d8..2ad98c28fd8 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
>     integer_valued_real_p
>     integer_pow2p
>     uniform_integer_cst_p
> -   HONOR_NANS)
> +   HONOR_NANS
> +   uniform_vector_p)
>
>  /* Operator lists.  */
>  (define_operator_list tcc_comparison
> @@ -5568,3 +5569,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>           { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
>         (if (changed)
>          (vec_perm { op0; } { op1; } { op2; }))))))))))
> +
> +/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element.  */
> +(simplify
> + (vec_perm (vec_duplicate@0 @1) @0 @2)
> + { @0; })
> +
> +(simplify
> + (vec_perm uniform_vector_p@0 @0 @1)
> + { @0; })


No need for the curly braces here, can use "@0" as the target of
the simplification.

It'd probably be worth using (match ...) to define a new predicate
that handles (vec_duplicate ...), VECTOR_CST and CONSTRUCTOR,
calling into uniform_vector_p for the latter two.

Thanks,
Richard

> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
> new file mode 100644
> index 00000000000..42654be3b31
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
> +
> +void
> +f1 (double x[][4])
> +{
> +  for (int i = 0; i < 4; ++i)
> +    for (int j = 0; j < 4; ++j)
> +      x[i][j] = 0;
> +}
> +
> +void
> +f2 (double x[][4], double y)
> +{
> +  for (int i = 0; i < 4; ++i)
> +    for (int j = 0; j < 4; ++j)
> +      x[i][j] = y;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
Prathamesh Kulkarni July 23, 2019, 10:34 a.m. | #2
On Fri, 19 Jul 2019 at 18:12, Richard Sandiford
<richard.sandiford@arm.com> wrote:
> No need for the curly braces here, can use "@0" as the target of
> the simplification.
>
> It'd probably be worth using (match ...) to define a new predicate
> that handles (vec_duplicate ...), VECTOR_CST and CONSTRUCTOR,
> calling into uniform_vector_p for the latter two.

Hi,
Thanks for the suggestions.
Does this version look OK ?

Thanks,
Prathamesh

2019-07-23  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	PR middle-end/91166
	* match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
	(define_predicates): Add entry for uniform_vector_p.
	(vec_same_elem_p): New match pattern.

testsuite/
	* gcc.target/aarch64/sve/pr91166.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index 4a7aa0185d8..f14670a7982 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
    integer_valued_real_p
    integer_pow2p
    uniform_integer_cst_p
-   HONOR_NANS)
+   HONOR_NANS
+   uniform_vector_p)
 
 /* Operator lists.  */
 (define_operator_list tcc_comparison
@@ -5568,3 +5569,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
          { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
        (if (changed)
         (vec_perm { op0; } { op1; } { op2; }))))))))))
+
+/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element.  */
+
+(match (vec_same_elem_p @0)
+ uniform_vector_p@0)
+
+(match (vec_same_elem_p @0)
+ (vec_duplicate @0))
+
+(simplify
+ (vec_perm (vec_same_elem_p@0 @1) @0 @2)
+ @0)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
new file mode 100644
index 00000000000..42654be3b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+void
+f1 (double x[][4]) 
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = 0;
+}
+
+void
+f2 (double x[][4], double y)
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
Richard Biener July 23, 2019, 11:06 a.m. | #3
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:

> Hi,
> Thanks for the suggestions.
> Does this version look OK ?


Can you write

+(simplify
+ (vec_perm (vec_same_elem_p@0 @1) @0 @2)
+ @0)

as

 (vec_perm vec_same_elem_p@0 @0 @1)

?

Otherwise looks OK.

Thanks,
Richard.
 
-- 
Richard Biener <rguenther@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
Prathamesh Kulkarni July 23, 2019, 11:55 a.m. | #4
On Tue, 23 Jul 2019 at 16:36, Richard Biener <rguenther@suse.de> wrote:
> Can you write
>
> +(simplify
> + (vec_perm (vec_same_elem_p@0 @1) @0 @2)
> + @0)
>
> as
>
>  (vec_perm vec_same_elem_p@0 @0 @1)
>
> ?

(simplify
 (vec_perm vec_same_elem_p@0 @0 @1)
 @0)

results in:
gimple-match.c: In function ‘bool gimple_simplify_VEC_PERM_EXPR(gimple_match_op*, gimple**, tree_node* (*)(tree), code_helper, tree, tree, tree, tree)’:
gimple-match.c:103031:36: error: cannot convert ‘tree_node* (*)(tree)’ {aka ‘tree_node* (*)(tree_node*)’} to ‘tree_node**’
   if (gimple_vec_same_elem_p (op0, valueize))
                                    ^~~~~~~~

because gimple_vec_same_elem_p takes tree *res_ops as its 2nd parameter,
and the generated call passes valueize as the 2nd argument.

Thanks,
Prathamesh
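
For readers following the error above, here is a rough C model of the mismatch. It is not genmatch output: the node type, the enum values, and both function names are hypothetical stand-ins, and the assumption (consistent with the error message, but not confirmed elsewhere in this thread) is that a predicate declared with an operand gets an extra res_ops output parameter, while a predicate declared without one does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's tree and valueize callback types.  */
typedef struct node {
  int code;               /* which kind of expression this is */
  struct node *operand;   /* single operand, or NULL */
} node;
typedef node *(*valueize_fn) (node *);

enum { UNIFORM_CST, VEC_DUPLICATE, OTHER };

/* Shape A: predicate declared with an operand, in the style of
   (match (vec_same_elem_p @0) ...).  The caller has to pass an array
   that receives the captured operand as the 2nd argument.  */
bool
same_elem_with_operand (node *t, node **res_ops, valueize_fn valueize)
{
  (void) valueize;        /* unused in this model */
  assert (res_ops != NULL);
  if (t->code == VEC_DUPLICATE)
    {
      res_ops[0] = t->operand;
      return true;
    }
  if (t->code == UNIFORM_CST)
    {
      res_ops[0] = t;
      return true;
    }
  return false;
}

/* Shape B: predicate declared without an operand, in the style of
   (match vec_same_elem_p ...).  There is no res_ops parameter, so
   the 2nd argument is the valueize callback.  */
bool
same_elem_no_operand (node *t, valueize_fn valueize)
{
  (void) valueize;        /* unused in this model */
  return t->code == VEC_DUPLICATE || t->code == UNIFORM_CST;
}
```

Calling a shape-A function with shape-B arguments, i.e. passing the callback where the output array is expected, is the kind of type error the compiler reports above.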
Richard Biener July 23, 2019, 12:18 p.m. | #5
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:

> (simplify
>  (vec_perm vec_same_elem_p@0 @0 @1)
>  @0)
>
> results in:
> gimple-match.c: In function ‘bool gimple_simplify_VEC_PERM_EXPR(gimple_match_op*, gimple**, tree_node* (*)(tree), code_helper, tree, tree, tree, tree)’:
> gimple-match.c:103031:36: error: cannot convert ‘tree_node* (*)(tree)’ {aka ‘tree_node* (*)(tree_node*)’} to ‘tree_node**’
>    if (gimple_vec_same_elem_p (op0, valueize))
>                                     ^~~~~~~~
>
> because gimple_vec_same_elem_p has tree *res_ops as 2nd param and
> we're passing valueize as 2nd arg.


Ah, you need the

(match vec_same_elem_p
 @0
 (if (uniform_vector_p (@0))))

(match vec_same_elem_p
 (vec_duplicate @0))

form then.

-- 
Richard Biener <rguenther@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
Prathamesh Kulkarni July 23, 2019, 12:38 p.m. | #6
On Tue, 23 Jul 2019 at 17:48, Richard Biener <rguenther@suse.de> wrote:
> Ah, you need the
>
> (match vec_same_elem_p
>  @0
>  (if (uniform_vector_p (@0))))
>
> (match vec_same_elem_p
>  (vec_duplicate @0))
>
> form then.

Thanks, that worked.
Is the attached patch OK to commit ?

Thanks,
Prathamesh
diff --git a/gcc/match.pd b/gcc/match.pd
index 4a7aa0185d8..c5c6a041cfc 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
    integer_valued_real_p
    integer_pow2p
    uniform_integer_cst_p
-   HONOR_NANS)
+   HONOR_NANS
+   uniform_vector_p)
 
 /* Operator lists.  */
 (define_operator_list tcc_comparison
@@ -5568,3 +5569,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
          { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
        (if (changed)
         (vec_perm { op0; } { op1; } { op2; }))))))))))
+
+/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element.  */
+
+(match vec_same_elem_p
+ @0
+ (if (uniform_vector_p (@0))))
+
+(match vec_same_elem_p
+ (vec_duplicate @0))
+
+(simplify
+ (vec_perm vec_same_elem_p@0 @0 @1)
+ @0)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
new file mode 100644
index 00000000000..42654be3b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+void
+f1 (double x[][4]) 
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = 0;
+}
+
+void
+f2 (double x[][4], double y)
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
Richard Biener July 23, 2019, 12:44 p.m. | #7
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:

> Thanks, that worked.
> Is the attached patch OK to commit ?


Yes.

Thanks,
Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Patch

diff --git a/gcc/match.pd b/gcc/match.pd
index 4a7aa0185d8..2ad98c28fd8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -36,7 +36,8 @@  along with GCC; see the file COPYING3.  If not see
    integer_valued_real_p
    integer_pow2p
    uniform_integer_cst_p
-   HONOR_NANS)
+   HONOR_NANS
+   uniform_vector_p)
 
 /* Operator lists.  */
 (define_operator_list tcc_comparison
@@ -5568,3 +5569,12 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
          { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
        (if (changed)
         (vec_perm { op0; } { op1; } { op2; }))))))))))
+
+/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element.  */
+(simplify
+ (vec_perm (vec_duplicate@0 @1) @0 @2)
+ { @0; })
+
+(simplify
+ (vec_perm uniform_vector_p@0 @0 @1)
+ { @0; }) 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
new file mode 100644
index 00000000000..42654be3b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+void
+f1 (double x[][4]) 
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = 0;
+}
+
+void
+f2 (double x[][4], double y)
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
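
As a sanity check on the fold itself, the following is a minimal C sketch of why VEC_PERM_EXPR (v, v, mask) can be replaced by v when every element of v is the same. It models a permute on plain arrays of double, assuming the usual selection rule in which each mask element is taken modulo 2N and indexes the concatenation of the two inputs. The function names vec_perm_model, is_uniform and same_vector are made up for this illustration and are not GCC functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of a two-input permute on n-element vectors:
   out[i] = concat (v0, v1)[mask[i] mod 2n].  */
void
vec_perm_model (const double *v0, const double *v1,
                const unsigned *mask, size_t n, double *out)
{
  assert (n > 0);
  for (size_t i = 0; i < n; ++i)
    {
      size_t sel = mask[i] % (2 * n);
      out[i] = sel < n ? v0[sel] : v1[sel - n];
    }
}

/* True if all n elements of v compare equal to the first one.  */
bool
is_uniform (const double *v, size_t n)
{
  for (size_t i = 1; i < n; ++i)
    if (v[i] != v[0])
      return false;
  return true;
}

/* True if a and b hold the same n elements in the same order.  */
bool
same_vector (const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    if (a[i] != b[i])
      return false;
  return true;
}
```

With a uniform input such as {2.5, 2.5, 2.5, 2.5} used for both operands, every selected lane holds the same value, so the result equals the input whatever the mask is. With a non-uniform input such as {1, 2, 3, 4} and mask {7, 0, 5, 2} the result is {4, 1, 2, 3}, which is why the pattern needs the same-element condition.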