RFA: Avoid versioning loop with unaligned step

Message ID 5C1B9C8E.70108@amylaar.uk
State New
Headers show
Series
  • RFA: Avoid versioning loop with unaligned step
Related show

Commit Message

Joern Wolfgang Rennecke Dec. 20, 2018, 1:43 p.m.
eSi-RISC has vector permute functionality, but no unaligned loads. We 
see execution failures on gcc.dg/vect/slp-perm-12.c because loop 
versioning is used to make the tptr aligned for the first loop 
iteration, and then with a step of originally 11, 22 after 
vectorization, and a vector alignment of 8 bytes, the second iteration 
causes an AlignmentError exception.
The attached patch to tree-vect-data-refs.c suppresses attempts to align 
data accesses where the
step alignment times the vectorization factor is insufficient to sustain 
the alignment during the loop.
Bootstrapped and regression tested on x86_64-pc-linux-gnu .

I have also attached a matching testsuite patch to not expect SLP 
vectorization for slp-perm-12 when
no unaligned loads are available, although in terms of testing, I can 
only say that it works for us.
2018-12-15  Joern Rennecke  <joern.rennecke@riscy-ip.com>

	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Don't do
	versioning for data accesses with misaligned step.
2018-12-15  Joern Rennecke  <joern.rennecke@riscy-ip.com>

	* testsuite/gcc.dg/vect/slp-perm-12.c (dg-final): Don't expect SLP
	vectorization for ! vect_no_align.

Index: testsuite/gcc.dg/vect/slp-perm-12.c
===================================================================
--- testsuite/gcc.dg/vect/slp-perm-12.c	(revision 5616)
+++ testsuite/gcc.dg/vect/slp-perm-12.c	(revision 5617)
@@ -49,4 +49,4 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_perm  && {! vect_no_align } } } } } */

Comments

Richard Biener Dec. 20, 2018, 4:18 p.m. | #1
On Thu, Dec 20, 2018 at 2:43 PM Joern Wolfgang Rennecke <gnu@amylaar.uk> wrote:
>

> eSi-RISC has vector permute functionality, but no unaligned loads. We

> see execution failures on gcc.dg/vect/slp-perm-12.c because loop

> versioning is used to make the tptr aligned for the first loop

> iteration, and then with a step of originally 11, 22 after

> vectorization, and a vector alignment of 8 bytes, the second iteration

> causes an AlignmentError exception.

> The attached patch to tree-vect-data-refs.c suppresses attempts to align

> data accesses where the

> step alignment times the vectorization factor is insufficient to sustain

> the alignment during the loop.

> Bootstrapped and regression tested on x86_64-pc-linux-gnu .

>

> I have also attached a matching testsuite patch to not expect SLP

> vectorization for slp-perm-12 when

> no unaligned loads are available, although in terms of testing, I can

> only say that it works for us.


vect_compute_data_ref_alignment uses DR_TARGET_ALIGNMENT
and DR_STEP_ALIGNMENT () % dr_target-alignment == 0 as check.

I think it's preferable to use the same or similar values for the desired
alignment.

Otherwise this looks OK to me.

Richard.
Richard Sandiford Dec. 20, 2018, 7:46 p.m. | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Thu, Dec 20, 2018 at 2:43 PM Joern Wolfgang Rennecke <gnu@amylaar.uk> wrote:

>>

>> eSi-RISC has vector permute functionality, but no unaligned loads. We

>> see execution failures on gcc.dg/vect/slp-perm-12.c because loop

>> versioning is used to make the tptr aligned for the first loop

>> iteration, and then with a step of originally 11, 22 after

>> vectorization, and a vector alignment of 8 bytes, the second iteration

>> causes an AlignmentError exception.

>> The attached patch to tree-vect-data-refs.c suppresses attempts to align

>> data accesses where the

>> step alignment times the vectorization factor is insufficient to sustain

>> the alignment during the loop.

>> Bootstrapped and regression tested on x86_64-pc-linux-gnu .

>>

>> I have also attached a matching testsuite patch to not expect SLP

>> vectorization for slp-perm-12 when

>> no unaligned loads are available, although in terms of testing, I can

>> only say that it works for us.

>

> vect_compute_data_ref_alignment uses DR_TARGET_ALIGNMENT

> and DR_STEP_ALIGNMENT () % dr_target-alignment == 0 as check.

>

> I think it's preferable to use the same or similar values for the desired

> alignment.


Yeah, I agree testing for a multiple is better than maybe_lt,
and that we should be using DR_TARGET_ALIGNMENT rather than
TYPE_ALIGN_UNIT.

(TYPE_ALIGN_UNIT is the ABI alignment, which might be higher or lower
than the alignment the vectoriser is aiming for.  And I think the
reasons for bailing out apply whenever the vectoriser can't reach
the alignment it's aiming for, even if the alignment isn't needed
for correctness.)

Thanks,
Richard
Jakub Jelinek Dec. 20, 2018, 11:46 p.m. | #3
On Thu, Dec 20, 2018 at 07:46:31PM +0000, Richard Sandiford wrote:
> > vect_compute_data_ref_alignment uses DR_TARGET_ALIGNMENT

> > and DR_STEP_ALIGNMENT () % dr_target-alignment == 0 as check.

> >

> > I think it's preferable to use the same or similar values for the desired

> > alignment.

> 

> Yeah, I agree testing for a multiple is better than maybe_lt,

> and that we should be using DR_TARGET_ALIGNMENT rather than

> TYPE_ALIGN_UNIT.

> 

> (TYPE_ALIGN_UNIT is the ABI alignment, which might be higher or lower

> than the alignment the vectoriser is aiming for.  And I think the

> reasons for bailing out apply whenever the vectoriser can't reach

> the alignment it's aiming for, even if the alignment isn't needed

> for correctness.)


r267314 seems to have broken build:
../../gcc/tree-vect-data-refs.c: In function ‘opt_result vect_enhance_data_refs_alignment(loop_vec_info)’:
../../gcc/tree-vectorizer.h:1255:40: error: ‘struct data_reference’ has no member named ‘target_alignment’
 #define DR_TARGET_ALIGNMENT(DR) ((DR)->target_alignment)
                                        ^
../../gcc/tree-vect-data-refs.c:2171:11: note: in expansion of macro ‘DR_TARGET_ALIGNMENT’
           DR_TARGET_ALIGNMENT (dr)))
           ^~~~~~~~~~~~~~~~~~~

The following patch makes it build again, will commit as obvious if it
passes bootstrap/regtest:

2018-12-21  Jakub Jelinek  <jakub@redhat.com>

	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Use
	DR_TARGET_ALIGNMENT on dr_info rather than dr.

--- gcc/tree-vect-data-refs.c.jj	2018-12-21 00:40:50.000000000 +0100
+++ gcc/tree-vect-data-refs.c	2018-12-21 00:43:35.786222062 +0100
@@ -2168,7 +2168,7 @@ vect_enhance_data_refs_alignment (loop_v
 		 done by doing some iterations of the non-vectorized loop.  */
 	      if (!multiple_p (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
 			       * DR_STEP_ALIGNMENT (dr),
-			       DR_TARGET_ALIGNMENT (dr)))
+			       DR_TARGET_ALIGNMENT (dr_info)))
 		{
 		  do_versioning = false;
 		  break;


	Jakub
Jakub Jelinek Dec. 21, 2018, 1:04 a.m. | #4
On Fri, Dec 21, 2018 at 12:46:41AM +0100, Jakub Jelinek wrote:
> The following patch makes it build again, will commit as obvious if it

> passes bootstrap/regtest:

> 

> 2018-12-21  Jakub Jelinek  <jakub@redhat.com>

> 

> 	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Use

> 	DR_TARGET_ALIGNMENT on dr_info rather than dr.

> 

> --- gcc/tree-vect-data-refs.c.jj	2018-12-21 00:40:50.000000000 +0100

> +++ gcc/tree-vect-data-refs.c	2018-12-21 00:43:35.786222062 +0100

> @@ -2168,7 +2168,7 @@ vect_enhance_data_refs_alignment (loop_v

>  		 done by doing some iterations of the non-vectorized loop.  */

>  	      if (!multiple_p (LOOP_VINFO_VECT_FACTOR (loop_vinfo)

>  			       * DR_STEP_ALIGNMENT (dr),

> -			       DR_TARGET_ALIGNMENT (dr)))

> +			       DR_TARGET_ALIGNMENT (dr_info)))

>  		{

>  		  do_versioning = false;

>  		  break;

> 


Here is what I've actually committed, some spelling errors fixed too:

2018-12-21  Jakub Jelinek  <jakub@redhat.com>

	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Use
	DR_TARGET_ALIGNMENT on dr_info rather than dr.  Spelling fixes.

--- gcc/tree-vect-data-refs.c.jj	2018-12-21 00:40:50.000000000 +0100
+++ gcc/tree-vect-data-refs.c	2018-12-21 00:43:35.786222062 +0100
@@ -2163,12 +2163,12 @@ vect_enhance_data_refs_alignment (loop_v
 	      /* Forcing alignment in the first iteration is no good if
 		 we don't keep it across iterations.  For now, just disable
 		 versioning in this case.
-		 ?? We could actually unroll the loop to archive the required
-		 overall step alignemnt, and forcing the alignment could be
+		 ?? We could actually unroll the loop to achieve the required
+		 overall step alignment, and forcing the alignment could be
 		 done by doing some iterations of the non-vectorized loop.  */
 	      if (!multiple_p (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
 			       * DR_STEP_ALIGNMENT (dr),
-			       DR_TARGET_ALIGNMENT (dr)))
+			       DR_TARGET_ALIGNMENT (dr_info)))
 		{
 		  do_versioning = false;
 		  break;


	Jakub

Patch

Index: tree-vect-data-refs.c
===================================================================
--- tree-vect-data-refs.c	(revision 267262)
+++ tree-vect-data-refs.c	(working copy)
@@ -2160,6 +2160,20 @@  vect_enhance_data_refs_alignment (loop_v
 		  break;
 		}
 
+	      /* Forcing alignment in the first iteration is no good if
+		 we don't keep it across iterations.  For now, just disable
+		 versioning in this case.
+		 ?? We could actually unroll the loop to archive the required
+		 overall step alignemnt, and forcing the alignment could be
+		 done by doing some iterations of the non-vectorized loop.  */
+	      if (maybe_lt (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
+			    * DR_STEP_ALIGNMENT (dr),
+			    TYPE_ALIGN_UNIT (vectype)))
+		{
+		  do_versioning = false;
+		  break;
+		}
+
               /* The rightmost bits of an aligned address must be zeros.
                  Construct the mask needed for this test.  For example,
                  GET_MODE_SIZE for the vector mode V4SI is 16 bytes so the