Fix PR94401 by considering reverse overrun

Message ID f8d6e4fd-b668-b63f-a737-dec7610af14d@linux.ibm.com
State New
Headers show
Series
  • Fix PR94401 by considering reverse overrun
Related show

Commit Message

Marek Polacek via Gcc-patches April 2, 2020, 7:15 a.m.
Hi, 

The commit r10-7415 brings scalar type consideration
to eliminate epilogue peeling for gaps, but it exposed
one problem that the current handling doesn't consider
the memory access type VMAT_CONTIGUOUS_REVERSE, for
which the overrun happens on low address side.  This
patch is to make the code take care of it by updating
the offset and construction element order accordingly.

Bootstrapped/regtested on powerpc64le-linux-gnu P8
and aarch64-linux-gnu.

BR,
Kewen
-----------
gcc/ChangeLog

2020-04-02  Kewen Lin  <linkw@gcc.gnu.org>

	PR tree-optimization/94401
	* tree-vect-loop.c (vectorizable_load): Handle VMAT_CONTIGUOUS_REVERSE
	access type when loading halves of vector to avoid peeling for gaps.

Comments

Marek Polacek via Gcc-patches April 2, 2020, 8:28 a.m. | #1
Hi!

On Thu, Apr 02, 2020 at 03:15:42PM +0800, Kewen.Lin via Gcc-patches wrote:

Just formatting nits, not commenting on what the actual patch does.

> --- a/gcc/tree-vect-stmts.c

> +++ b/gcc/tree-vect-stmts.c

> @@ -9590,11 +9590,20 @@ vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,

>  			    if (new_vtype != NULL_TREE)

>  			      ltype = half_vtype;

>  			  }

> +			tree offset = dataref_offset

> +					? dataref_offset

> +					: build_int_cst (ref_type, 0);


The above is misformatted.  The ? and : shouldn't be indented further than
the dataref_offset, but usually e.g. for the sake of emacs we add ()s around
the expression in this case.  So:
			tree offset = (dataref_offset
				       ? dataref_offset
				       : build_int_cst (ref_type, 0));
or
			tree offset
			  = (dataref_offset
			     ? dataref_offset : build_int_cst (ref_type, 0));

> +			if (ltype != vectype

> +			    && memory_access_type == VMAT_CONTIGUOUS_REVERSE)

> +			  offset = size_binop (

> +			    PLUS_EXPR,

> +			    build_int_cst (ref_type,

> +					   DR_GROUP_GAP (first_stmt_info)

> +					     * tree_to_uhwi (

> +					       TYPE_SIZE_UNIT (elem_type))),

> +			    offset);


Again, no reason to indent * by 2 columns from DR_GROUP_GAP.  But also all
the (s at the end of line and randomly indented arguments look ugly.
I'd recommend temporaries, e.g. like (perhaps with different names of
temporaries, so that they don't shadow anything):

			  {
			    unsigned HOST_WIDE_INT gap
			      = DR_GROUP_GAP (first_stmt_info);
			    gap *= tree_to_uhwi (TYPE_SIZE_UNIT (elem_type));
			    tree gapcst = build_int_cst (ref_type, gap);
			    offset = size_binop (PLUS_EXPR, offset, gapcst);
			  }

	Jakub
Marek Polacek via Gcc-patches April 2, 2020, 9:21 a.m. | #2
On Thu, Apr 2, 2020 at 9:15 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>

> Hi,

>

> The commit r10-7415 brings scalar type consideration

> to eliminate epilogue peeling for gaps, but it exposed

> one problem that the current handling doesn't consider

> the memory access type VMAT_CONTIGUOUS_REVERSE, for

> which the overrun happens on low address side.  This

> patch is to make the code take care of it by updating

> the offset and construction element order accordingly.

>

> Bootstrapped/regtested on powerpc64le-linux-gnu P8

> and aarch64-linux-gnu.


OK with the formatting changes suggested by Jakub.

Richard.

> BR,

> Kewen

> -----------

> gcc/ChangeLog

>

> 2020-04-02  Kewen Lin  <linkw@gcc.gnu.org>

>

>         PR tree-optimization/94401

>         * tree-vect-loop.c (vectorizable_load): Handle VMAT_CONTIGUOUS_REVERSE

>         access type when loading halves of vector to avoid peeling for gaps.
Marek Polacek via Gcc-patches April 2, 2020, 10:07 a.m. | #3
Hi,

on 2020/4/2 下午4:28, Jakub Jelinek wrote:
> Hi!

> 

> On Thu, Apr 02, 2020 at 03:15:42PM +0800, Kewen.Lin via Gcc-patches wrote:

> 

> Just formatting nits, not commenting on what the actual patch does.

> 

>> --- a/gcc/tree-vect-stmts.c

>> +++ b/gcc/tree-vect-stmts.c

>> @@ -9590,11 +9590,20 @@ vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,

>>  			    if (new_vtype != NULL_TREE)

>>  			      ltype = half_vtype;

>>  			  }

>> +			tree offset = dataref_offset

>> +					? dataref_offset

>> +					: build_int_cst (ref_type, 0);

> 

> The above is misformatted.  The ? and : shouldn't be indented further than

> the dataref_offset, but usually e.g. for the sake of emacs we add ()s around

> the expression in this case.  So:

> 			tree offset = (dataref_offset

> 				       ? dataref_offset

> 				       : build_int_cst (ref_type, 0));

> or

> 			tree offset

> 			  = (dataref_offset

> 			     ? dataref_offset : build_int_cst (ref_type, 0));

> 


Thanks Jakub!  I'll follow this by add () for ternary expression.
With manual added "()", clang-format can get below:

			tree offset
			  = (dataref_offset ? dataref_offset
					    : build_int_cst (ref_type, 0));

contrib/check_GNU_style.sh didn't complain this, I'm not sure whether
it's possible to add this kind of convention into contrib/clang-format.

>> +			if (ltype != vectype

>> +			    && memory_access_type == VMAT_CONTIGUOUS_REVERSE)

>> +			  offset = size_binop (

>> +			    PLUS_EXPR,

>> +			    build_int_cst (ref_type,

>> +					   DR_GROUP_GAP (first_stmt_info)

>> +					     * tree_to_uhwi (

>> +					       TYPE_SIZE_UNIT (elem_type))),

>> +			    offset);

> 

> Again, no reason to indent * by 2 columns from DR_GROUP_GAP.  But also all

> the (s at the end of line and randomly indented arguments look ugly.

> I'd recommend temporaries, e.g. like (perhaps with different names of

> temporaries, so that they don't shadow anything):

> 

> 			  {

> 			    unsigned HOST_WIDE_INT gap

> 			      = DR_GROUP_GAP (first_stmt_info);

> 			    gap *= tree_to_uhwi (TYPE_SIZE_UNIT (elem_type));

> 			    tree gapcst = build_int_cst (ref_type, gap);

> 			    offset = size_binop (PLUS_EXPR, offset, gapcst);

> 			  }

> 


Good suggestion, will update it.

BR,
Kewen
Marek Polacek via Gcc-patches April 2, 2020, 10:33 a.m. | #4
on 2020/4/2 下午5:21, Richard Biener wrote:
> On Thu, Apr 2, 2020 at 9:15 AM Kewen.Lin <linkw@linux.ibm.com> wrote:

>>

>> Hi,

>>

>> The commit r10-7415 brings scalar type consideration

>> to eliminate epilogue peeling for gaps, but it exposed

>> one problem that the current handling doesn't consider

>> the memory access type VMAT_CONTIGUOUS_REVERSE, for

>> which the overrun happens on low address side.  This

>> patch is to make the code take care of it by updating

>> the offset and construction element order accordingly.

>>

>> Bootstrapped/regtested on powerpc64le-linux-gnu P8

>> and aarch64-linux-gnu.

> 

> OK with the formatting changes suggested by Jakub.

> 


Thanks Richi, I'll push the formatted one as attached.

BR,
Kewen
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 12beef6978c..7730e71b94d 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -9590,11 +9590,20 @@ vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 			    if (new_vtype != NULL_TREE)
 			      ltype = half_vtype;
 			  }
+			tree offset
+			  = (dataref_offset ? dataref_offset
+					    : build_int_cst (ref_type, 0));
+			if (ltype != vectype
+			    && memory_access_type == VMAT_CONTIGUOUS_REVERSE)
+			  {
+			    unsigned HOST_WIDE_INT gap
+			      = DR_GROUP_GAP (first_stmt_info);
+			    gap *= tree_to_uhwi (TYPE_SIZE_UNIT (elem_type));
+			    tree gapcst = build_int_cst (ref_type, gap);
+			    offset = size_binop (PLUS_EXPR, offset, gapcst);
+			  }
 			data_ref
-			  = fold_build2 (MEM_REF, ltype, dataref_ptr,
-					 dataref_offset
-					 ? dataref_offset
-					 : build_int_cst (ref_type, 0));
+			  = fold_build2 (MEM_REF, ltype, dataref_ptr, offset);
 			if (alignment_support_scheme == dr_aligned)
 			  ;
 			else if (DR_MISALIGNMENT (first_dr_info) == -1)
@@ -9607,16 +9616,27 @@ vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 						  TYPE_ALIGN (elem_type));
 			if (ltype != vectype)
 			  {
-			    vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr));
+			    vect_copy_ref_info (data_ref,
+						DR_REF (first_dr_info->dr));
 			    tree tem = make_ssa_name (ltype);
 			    new_stmt = gimple_build_assign (tem, data_ref);
-			    vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
+			    vect_finish_stmt_generation (stmt_info, new_stmt,
+							 gsi);
 			    data_ref = NULL;
 			    vec<constructor_elt, va_gc> *v;
 			    vec_alloc (v, 2);
-			    CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, tem);
-			    CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
-						    build_zero_cst (ltype));
+			    if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
+			      {
+				CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
+							build_zero_cst (ltype));
+				CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, tem);
+			      }
+			    else
+			      {
+				CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, tem);
+				CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
+							build_zero_cst (ltype));
+			      }
 			    gcc_assert (new_vtype != NULL_TREE);
 			    if (new_vtype == vectype)
 			      new_stmt = gimple_build_assign (
Marek Polacek via Gcc-patches April 2, 2020, 10:39 a.m. | #5
On Thu, Apr 02, 2020 at 06:07:55PM +0800, Kewen.Lin wrote:
> > The above is misformatted.  The ? and : shouldn't be indented further than

> > the dataref_offset, but usually e.g. for the sake of emacs we add ()s around

> > the expression in this case.  So:

> > 			tree offset = (dataref_offset

> > 				       ? dataref_offset

> > 				       : build_int_cst (ref_type, 0));

> > or

> > 			tree offset

> > 			  = (dataref_offset

> > 			     ? dataref_offset : build_int_cst (ref_type, 0));

> > 

> 

> Thanks Jakub!  I'll follow this by add () for ternary expression.

> With manual added "()", clang-format can get below:


Note, the () isn't about ternary expressions, if everything fits on one
line, there is no reason to add ()s, so
  tree offset = dataref_offset ? dataref_offset : build_int_cst (ref_type, 0);
is just fine that way, on the other side
					int whatever = HOST_WIDE_INT_1U
						       + foobarbaz (qux);
should have them too, like:
					int whatever = (HOST_WIDE_INT_1U
							+ foobarbaz (qux));
or
					int whatever
					  = HOST_WIDE_INT_1U + foobarbaz (qux);
I don't use emacs, but was told that emacs without the ()s would misindent
it like (I think):
					int whatever = HOST_WIDE_INT_1U
					  + foobarbaz (qux);
which is what we do not want.

> 

> 			tree offset

> 			  = (dataref_offset ? dataref_offset

> 					    : build_int_cst (ref_type, 0));

> 

> contrib/check_GNU_style.sh didn't complain this, I'm not sure whether

> it's possible to add this kind of convention into contrib/clang-format.


clang-format is not our official indentation style; I have no problem with
the above formatting from readability POV, though unsure what emacs will do
with that (but if it moves that : right below the first dataref_offset,
no big deal, that is also fine and probably more appropriate if the
build_int_cst... is long and would need more wrapping).

	Jakub
Segher Boessenkool April 2, 2020, 12:05 p.m. | #6
Hi!

On Thu, Apr 02, 2020 at 10:28:34AM +0200, Jakub Jelinek wrote:
> > +			tree offset = dataref_offset

> > +					? dataref_offset

> > +					: build_int_cst (ref_type, 0);

> 

> The above is misformatted.  The ? and : shouldn't be indented further than

> the dataref_offset, but usually e.g. for the sake of emacs we add ()s around

> the expression in this case.  So:

> 			tree offset = (dataref_offset

> 				       ? dataref_offset

> 				       : build_int_cst (ref_type, 0));

> or

> 			tree offset

> 			  = (dataref_offset

> 			     ? dataref_offset : build_int_cst (ref_type, 0));


Or even just the (less obfuscated imnsho)

			tree offset = dataref_offset;
			if (!offset)
			  offset = build_int_cst (ref_type, 0);

which even is shorter!


Segher

Patch

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 12beef6978c..3d27f59ba22 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -9590,11 +9590,20 @@  vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 			    if (new_vtype != NULL_TREE)
 			      ltype = half_vtype;
 			  }
+			tree offset = dataref_offset
+					? dataref_offset
+					: build_int_cst (ref_type, 0);
+			if (ltype != vectype
+			    && memory_access_type == VMAT_CONTIGUOUS_REVERSE)
+			  offset = size_binop (
+			    PLUS_EXPR,
+			    build_int_cst (ref_type,
+					   DR_GROUP_GAP (first_stmt_info)
+					     * tree_to_uhwi (
+					       TYPE_SIZE_UNIT (elem_type))),
+			    offset);
 			data_ref
-			  = fold_build2 (MEM_REF, ltype, dataref_ptr,
-					 dataref_offset
-					 ? dataref_offset
-					 : build_int_cst (ref_type, 0));
+			  = fold_build2 (MEM_REF, ltype, dataref_ptr, offset);
 			if (alignment_support_scheme == dr_aligned)
 			  ;
 			else if (DR_MISALIGNMENT (first_dr_info) == -1)
@@ -9607,16 +9616,27 @@  vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 						  TYPE_ALIGN (elem_type));
 			if (ltype != vectype)
 			  {
-			    vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr));
+			    vect_copy_ref_info (data_ref,
+						DR_REF (first_dr_info->dr));
 			    tree tem = make_ssa_name (ltype);
 			    new_stmt = gimple_build_assign (tem, data_ref);
-			    vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
+			    vect_finish_stmt_generation (stmt_info, new_stmt,
+							 gsi);
 			    data_ref = NULL;
 			    vec<constructor_elt, va_gc> *v;
 			    vec_alloc (v, 2);
-			    CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, tem);
-			    CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
-						    build_zero_cst (ltype));
+			    if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
+			      {
+				CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
+							build_zero_cst (ltype));
+				CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, tem);
+			      }
+			    else
+			      {
+				CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, tem);
+				CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
+							build_zero_cst (ltype));
+			      }
 			    gcc_assert (new_vtype != NULL_TREE);
 			    if (new_vtype == vectype)
 			      new_stmt = gimple_build_assign (