Fix vectorization of POINTER_DIFF_EXPR (PR tree-optimization/83338)

Message ID 20171209095012.GD2353@tucnak
State New

Commit Message

Jakub Jelinek Dec. 9, 2017, 9:50 a.m.
Hi!

When POINTER_PLUS_EXPR is vectorized, we vectorize it as a vector PLUS_EXPR.
This works (usually?  I'm not sure about targets where sizetype has different
precision from POINTER_SIZE; maybe those just don't have vector types)
because we vectorize pointer variables as vectors of unsigned pointer-sized
integers, and if sizetype has the same precision, then it is the same vector
type too, so both operands and the result are compatible vectors.
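
As a rough illustration only (not part of the patch), the lane-wise view of
this can be sketched with GNU C vector extensions.  The names vec_uptr and
pointer_plus_lanes are hypothetical, and the sketch assumes size_t and
uintptr_t have the same size, mirroring the sizetype/POINTER_SIZE caveat:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: offsets (size_t) and pointers have the
   same precision, so one unsigned vector type serves for both.  */
_Static_assert (sizeof (size_t) == sizeof (uintptr_t),
		"sketch assumes size_t and uintptr_t have equal size");

/* Hypothetical type: two lanes of unsigned pointer-sized integers.  */
typedef uintptr_t vec_uptr
  __attribute__ ((vector_size (2 * sizeof (uintptr_t))));

/* Models a vectorized POINTER_PLUS_EXPR: both operands and the result
   use the same unsigned vector type, so a plain vector addition is
   enough.  */
static vec_uptr
pointer_plus_lanes (vec_uptr ptrs, vec_uptr offsets)
{
  return ptrs + offsets;
}
```

Here the operand and result types coincide, so no conversion is involved.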

POINTER_DIFF_EXPR is different: the arguments are pointers, which get a
vectype of vectors of unsigned pointer-sized integers, but the result
is the corresponding signed type, so vectype_out is a vector of signed
pointer-sized integers; those aren't compatible.

So we can't just vectorize POINTER_DIFF_EXPR as a vector MINUS_EXPR; we need
to VIEW_CONVERT_EXPR (VCE) the result from vectype to vectype_out.
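
A minimal sketch of the same idea in user-level GNU C (not the vectorizer
code itself): subtract in the unsigned vector type, then reinterpret the
bits as the signed vector type.  The names vec_uptr, vec_sptr and
pointer_diff_lanes are hypothetical; the cast between same-sized vector
types is a GNU C extension that reinterprets the bits, which is roughly
what VIEW_CONVERT_EXPR does at the tree level:

```c
#include <stdint.h>

/* Hypothetical types: two lanes of unsigned and of signed
   pointer-sized integers.  */
typedef uintptr_t vec_uptr
  __attribute__ ((vector_size (2 * sizeof (uintptr_t))));
typedef intptr_t vec_sptr
  __attribute__ ((vector_size (2 * sizeof (intptr_t))));

/* Models a vectorized POINTER_DIFF_EXPR.  */
static vec_sptr
pointer_diff_lanes (vec_uptr a, vec_uptr b)
{
  /* MINUS_EXPR in the unsigned vector type; each lane wraps modulo
     2^N.  */
  vec_uptr diff = a - b;

  /* Same-size vector cast: reinterprets the bits and performs no
     value conversion.  */
  return (vec_sptr) diff;
}
```

For lanes where a is smaller than b, the unsigned subtraction wraps and the
reinterpreted lane reads as the expected negative difference.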

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2017-12-09  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/83338
	* tree-vect-stmts.c (vectorizable_operation): Handle POINTER_DIFF_EXPR
	vectorization as MINUS_EXPR with a subsequent VIEW_CONVERT_EXPR from
	vector of unsigned integers to vector of signed integers.

	* gcc.dg/vect/pr83338.c: New test.



	Jakub

Comments

Richard Biener Dec. 9, 2017, 11:31 a.m. | #1
On December 9, 2017 10:50:12 AM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote:
>Hi!
>
>When POINTER_PLUS_EXPR is vectorized, we vectorize it as a vector PLUS_EXPR.
>This (usually?  Not sure about targets where sizetype has different
>precision from POINTER_SIZE; maybe those just don't have vector types) works
>because we vectorize pointer variables as vectors of unsigned pointer sized
>integers and if sizetype has the same precision, then it is the same vector
>too, so both operands and result are compatible vectors.
>
>POINTER_DIFF_EXPR is different, the arguments are pointers which get
>vectype of vectors of unsigned pointer sized integers, but the result
>is a corresponding signed type, so vectype_out is vector of signed pointer
>sized integers; those aren't compatible.
>
>So, we can't just vectorize POINTER_DIFF_EXPR as vector MINUS_EXPR, we need
>to VCE the result from vectype to vectype_out.
>
>Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
>trunk?


OK. 

Richard. 



Patch

--- gcc/tree-vect-stmts.c.jj	2017-12-08 12:21:58.000000000 +0100
+++ gcc/tree-vect-stmts.c	2017-12-09 00:55:17.614147824 +0100
@@ -5226,7 +5226,7 @@  vectorizable_operation (gimple *stmt, gi
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   tree vectype;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
-  enum tree_code code;
+  enum tree_code code, orig_code;
   machine_mode vec_mode;
   tree new_temp;
   int op_type;
@@ -5264,7 +5264,7 @@  vectorizable_operation (gimple *stmt, gi
   if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
     return false;
 
-  code = gimple_assign_rhs_code (stmt);
+  orig_code = code = gimple_assign_rhs_code (stmt);
 
   /* For pointer addition and subtraction, we should use the normal
      plus and minus for the vector operation.  */
@@ -5455,6 +5455,14 @@  vectorizable_operation (gimple *stmt, gi
   /* Handle def.  */
   vec_dest = vect_create_destination_var (scalar_dest, vectype);
 
+  /* POINTER_DIFF_EXPR has pointer arguments which are vectorized as
+     vectors with unsigned elements, but the result is signed.  So, we
+     need to compute the MINUS_EXPR into vectype temporary and
+     VIEW_CONVERT_EXPR it into the final vectype_out result.  */
+  tree vec_cvt_dest = NULL_TREE;
+  if (orig_code == POINTER_DIFF_EXPR)
+    vec_cvt_dest = vect_create_destination_var (scalar_dest, vectype_out);
+
   /* In case the vectorization factor (VF) is bigger than the number
      of elements that we can fit in a vectype (nunits), we have to generate
      more than one vector stmt - i.e - we need to "unroll" the
@@ -5546,6 +5554,15 @@  vectorizable_operation (gimple *stmt, gi
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
 	  gimple_assign_set_lhs (new_stmt, new_temp);
 	  vect_finish_stmt_generation (stmt, new_stmt, gsi);
+	  if (vec_cvt_dest)
+	    {
+	      new_temp = build1 (VIEW_CONVERT_EXPR, vectype_out, new_temp);
+	      new_stmt = gimple_build_assign (vec_cvt_dest, VIEW_CONVERT_EXPR,
+					      new_temp);
+	      new_temp = make_ssa_name (vec_cvt_dest, new_stmt);
+	      gimple_assign_set_lhs (new_stmt, new_temp);
+	      vect_finish_stmt_generation (stmt, new_stmt, gsi);
+	    }
           if (slp_node)
 	    SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
         }
--- gcc/testsuite/gcc.dg/vect/pr83338.c.jj	2017-12-09 01:00:06.602565622 +0100
+++ gcc/testsuite/gcc.dg/vect/pr83338.c	2017-12-09 01:00:18.297422116 +0100
@@ -0,0 +1,10 @@ 
+/* PR tree-optimization/83338 */
+/* { dg-do compile } */
+
+void
+foo (char **p, char **q, __PTRDIFF_TYPE__ *r)
+{
+  int i;
+  for (i = 0; i < 1024; i++)
+    r[i] = p[i] - q[i];
+}
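
The new test is compile-only (dg-do compile).  Purely as an illustration,
and not part of the submitted patch, a run-time check of the same loop
could look roughly like the sketch below.  check_foo and N are hypothetical
names, and ptrdiff_t from <stddef.h> stands in for __PTRDIFF_TYPE__:

```c
#include <stddef.h>

#define N 1024

/* Same loop as in pr83338.c.  */
void
foo (char **p, char **q, ptrdiff_t *r)
{
  int i;
  for (i = 0; i < N; i++)
    r[i] = p[i] - q[i];
}

/* Hypothetical checker: returns the number of mismatching elements.
   All pointers point into one buffer, so each subtraction is well
   defined; the differences cover both negative and positive values.  */
int
check_foo (void)
{
  static char buf[N];
  static char *p[N], *q[N];
  static ptrdiff_t r[N];
  int i, bad = 0;

  for (i = 0; i < N; i++)
    {
      p[i] = buf + i;
      q[i] = buf + (N - 1 - i);
    }

  foo (p, q, r);

  /* p[i] - q[i] == i - (N - 1 - i) == 2 * i - (N - 1).  */
  for (i = 0; i < N; i++)
    if (r[i] != (ptrdiff_t) (2 * i - (N - 1)))
      bad++;

  return bad;
}
```

A return value of 0 from check_foo would mean every element matched the
expected signed difference.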