[00/11] Add a vec_basic_block of scalar statements

Message ID 874lghez1a.fsf@arm.com

Richard Sandiford July 30, 2018, 11:36 a.m.
This series puts the statements that need to be vectorised into a
"vec_basic_block" structure of linked stmt_vec_infos, and then puts
pattern statements into this block rather than hanging them off the
original scalar statement.

Partly this is clean-up, since making pattern statements more like
first-class statements removes a lot of indirection.  The diffstat
for the series is:

 7 files changed, 691 insertions(+), 978 deletions(-)

It also makes it easier to do something approaching proper DCE
on the scalar code (patch 10).  However, the main motivation is
to allow the result of an earlier pattern statement to be reused
as the STMT_VINFO_RELATED_STMT for a later (non-pattern) statement.
I have two current uses for this:

(1) The way overwidening detection works means that we can sometimes
    be left with sequences of the form:

      type1 narrowed = ... + ...;   // originally done in type2
      type2 extended = (type2) narrowed;
      type3 truncated = (type3) extended;

    which cast_forwprop can simplify to:

      type1 narrowed = ... + ...;   // originally done in type2
      type3 truncated = (type3) narrowed;

    But if type3 == type1, we really want to replace truncated
    directly with narrowed.  The current representation doesn't
    allow this.

(2) For SVE extending loads, we want to look for:

      type1 narrow = *ptr;
      type2 extended = (type2) narrow; // only use of narrow

    then replace narrow with:

      type2 tmp = .LOAD_EXT (ptr, ...);

    and replace extended directly with tmp.  (Deleting narrow and
    replacing tmp with a .LOAD_EXT would move the location of the
    load and so wouldn't be safe in general.)
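
As a source-level illustration of case (1) (hypothetical code, not part of the patch; the function name and types are made up, with type1 = type3 = uint8_t and type2 = int): the widen-then-truncate round trip leaves the narrowed value unchanged, which is why truncated could directly reuse narrowed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration of case (1): type1 = type3 = uint8_t,
// type2 = int.  Extending a uint8_t to int and truncating back is a
// no-op, so "truncated" always equals "narrowed".
uint8_t
narrow_then_roundtrip (uint8_t a, uint8_t b)
{
  uint8_t narrowed = a + b;                // originally done in int (type2)
  int extended = (int) narrowed;           // type2 extended = (type2) narrowed;
  uint8_t truncated = (uint8_t) extended;  // type3 truncated = (type3) extended;
  assert (truncated == narrowed);          // always holds when type3 == type1
  return truncated;
}
```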
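
Case (2) corresponds to a source loop shaped like the following sketch (again hypothetical, not from the patch or testsuite): each narrow element is loaded and immediately extended, with no other use of the narrow value, which is the shape an SVE extending load could cover in one operation.

```cpp
#include <cassert>

// Hypothetical source shape for case (2): each short is loaded and then
// sign-extended, and the narrow value has no other use.
void
widen_copy (int *dst, const short *src, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = src[i];   // type1 narrow = *ptr; type2 extended = (type2) narrow;
}
```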

The series doesn't do either of these things; it's just laying the
groundwork.  It applies on top of:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01308.html

I tested each individual patch on aarch64-linux-gnu and the series as a
whole on aarch64-linux-gnu with SVE, aarch64_be-elf and x86_64-linux-gnu.
OK to install?

Thanks,
Richard

Comments

Richard Sandiford July 30, 2018, 11:40 a.m. | #1
Invariant loads were handled as a variation on the code for contiguous
loads.  We detected whether they were invariant or not as a byproduct of
creating the vector pointer ivs: vect_create_data_ref_ptr passed back an
inv_p to say whether the pointer was invariant.

But vectorised invariant loads just keep the original scalar load,
so this meant that detecting invariant loads had the side-effect of
creating an unwanted vector pointer iv.  The placement of the code
also meant that we'd create a vector load and then not use the result.
In principle this is wrong code, since there's no guarantee that there's
a vector's worth of accessible data at that address, but we rely on DCE
to get rid of the load before any harm is done.

E.g., for an invariant load in an inner loop (which seems like the more
common use case for this code), we'd create:

   vectp_a.6_52 = &a + 4;

   # vectp_a.5_53 = PHI <vectp_a.5_54(9), vectp_a.6_52(2)>

   # vectp_a.5_55 = PHI <vectp_a.5_53(3), vectp_a.5_56(10)>

   vect_next_a_11.7_57 = MEM[(int *)vectp_a.5_55];
   next_a_11 = a[_1];
   vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};

   vectp_a.5_56 = vectp_a.5_55 + 4;

   vectp_a.5_54 = vectp_a.5_53 + 0;

whereas all we want is:

   next_a_11 = a[_1];
   vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};
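
The GIMPLE above corresponds to a source loop like this sketch (hypothetical code, not taken from the testsuite): a[j] is invariant in the loop, so only the scalar load and the splat are needed.

```cpp
#include <cassert>

// Hypothetical source shape for the invariant load above: a[j] does not
// change across the loop, so the vectoriser should keep the scalar load
// and splat its value, without creating a vector pointer iv.
void
splat_invariant (int *out, const int *a, int j, int n)
{
  for (int i = 0; i < n; ++i)
    out[i] = a[j];   // invariant load, like next_a_11 = a[_1] above
}
```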

This patch moves the handling to its own block and makes
vect_create_data_ref_ptr assert (when creating a full iv) that the
address isn't invariant.

The ncopies handling is unfortunate, but a preexisting issue.
Richi's suggestion of using a vector of vector statements would
let us reuse one statement for all copies.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vect_create_data_ref_ptr): Remove inv_p
	parameter.
	* tree-vect-data-refs.c (vect_create_data_ref_ptr): Likewise.
	When creating an iv, assert that the step is not known to be zero.
	(vect_setup_realignment): Update call accordingly.
	* tree-vect-stmts.c (vectorizable_store): Likewise.
	(vectorizable_load): Likewise.  Handle VMAT_INVARIANT separately.

Index: gcc/tree-vectorizer.h
===================================================================
*** gcc/tree-vectorizer.h	2018-07-30 12:32:29.586506669 +0100
--- gcc/tree-vectorizer.h	2018-07-30 12:40:13.000000000 +0100
*************** extern bool vect_analyze_data_refs (vec_
*** 1527,1533 ****
  extern void vect_record_base_alignments (vec_info *);
  extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,
  				      tree *, gimple_stmt_iterator *,
! 				      gimple **, bool, bool *,
  				      tree = NULL_TREE, tree = NULL_TREE);
  extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,
  			     stmt_vec_info, tree);
--- 1527,1533 ----
  extern void vect_record_base_alignments (vec_info *);
  extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,
  				      tree *, gimple_stmt_iterator *,
! 				      gimple **, bool,
  				      tree = NULL_TREE, tree = NULL_TREE);
  extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,
  			     stmt_vec_info, tree);
Index: gcc/tree-vect-data-refs.c
===================================================================
*** gcc/tree-vect-data-refs.c	2018-07-30 12:32:26.214536374 +0100
--- gcc/tree-vect-data-refs.c	2018-07-30 12:32:32.546480596 +0100
*************** vect_create_addr_base_for_vector_ref (st
*** 4674,4689 ****
  
        Return the increment stmt that updates the pointer in PTR_INCR.
  
!    3. Set INV_P to true if the access pattern of the data reference in the
!       vectorized loop is invariant.  Set it to false otherwise.
! 
!    4. Return the pointer.  */
  
  tree
  vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,
  			  struct loop *at_loop, tree offset,
  			  tree *initial_address, gimple_stmt_iterator *gsi,
! 			  gimple **ptr_incr, bool only_init, bool *inv_p,
  			  tree byte_offset, tree iv_step)
  {
    const char *base_name;
--- 4674,4686 ----
  
        Return the increment stmt that updates the pointer in PTR_INCR.
  
!    3. Return the pointer.  */
  
  tree
  vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,
  			  struct loop *at_loop, tree offset,
  			  tree *initial_address, gimple_stmt_iterator *gsi,
! 			  gimple **ptr_incr, bool only_init,
  			  tree byte_offset, tree iv_step)
  {
    const char *base_name;
*************** vect_create_data_ref_ptr (stmt_vec_info
*** 4705,4711 ****
    bool insert_after;
    tree indx_before_incr, indx_after_incr;
    gimple *incr;
-   tree step;
    bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
  
    gcc_assert (iv_step != NULL_TREE
--- 4702,4707 ----
*************** vect_create_data_ref_ptr (stmt_vec_info
*** 4726,4739 ****
        *ptr_incr = NULL;
      }
  
-   /* Check the step (evolution) of the load in LOOP, and record
-      whether it's invariant.  */
-   step = vect_dr_behavior (dr_info)->step;
-   if (integer_zerop (step))
-     *inv_p = true;
-   else
-     *inv_p = false;
- 
    /* Create an expression for the first address accessed by this load
       in LOOP.  */
    base_name = get_name (DR_BASE_ADDRESS (dr));
--- 4722,4727 ----
*************** vect_create_data_ref_ptr (stmt_vec_info
*** 4849,4863 ****
      aptr = aggr_ptr_init;
    else
      {
        if (iv_step == NULL_TREE)
  	{
! 	  /* The step of the aggregate pointer is the type size.  */
  	  iv_step = TYPE_SIZE_UNIT (aggr_type);
! 	  /* One exception to the above is when the scalar step of the load in
! 	     LOOP is zero. In this case the step here is also zero.  */
! 	  if (*inv_p)
! 	    iv_step = size_zero_node;
! 	  else if (tree_int_cst_sgn (step) == -1)
  	    iv_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (iv_step), iv_step);
  	}
  
--- 4837,4853 ----
      aptr = aggr_ptr_init;
    else
      {
+       /* Accesses to invariant addresses should be handled specially
+ 	 by the caller.  */
+       tree step = vect_dr_behavior (dr_info)->step;
+       gcc_assert (!integer_zerop (step));
+ 
        if (iv_step == NULL_TREE)
  	{
! 	  /* The step of the aggregate pointer is the type size,
! 	     negated for downward accesses.  */
  	  iv_step = TYPE_SIZE_UNIT (aggr_type);
! 	  if (tree_int_cst_sgn (step) == -1)
  	    iv_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (iv_step), iv_step);
  	}
  
*************** vect_setup_realignment (stmt_vec_info st
*** 5462,5468 ****
    gphi *phi_stmt;
    tree msq = NULL_TREE;
    gimple_seq stmts = NULL;
-   bool inv_p;
    bool compute_in_loop = false;
    bool nested_in_vect_loop = false;
    struct loop *containing_loop = (gimple_bb (stmt_info->stmt))->loop_father;
--- 5452,5457 ----
*************** vect_setup_realignment (stmt_vec_info st
*** 5556,5562 ****
        vec_dest = vect_create_destination_var (scalar_dest, vectype);
        ptr = vect_create_data_ref_ptr (stmt_info, vectype,
  				      loop_for_initial_load, NULL_TREE,
! 				      &init_addr, NULL, &inc, true, &inv_p);
        if (TREE_CODE (ptr) == SSA_NAME)
  	new_temp = copy_ssa_name (ptr);
        else
--- 5545,5551 ----
        vec_dest = vect_create_destination_var (scalar_dest, vectype);
        ptr = vect_create_data_ref_ptr (stmt_info, vectype,
  				      loop_for_initial_load, NULL_TREE,
! 				      &init_addr, NULL, &inc, true);
        if (TREE_CODE (ptr) == SSA_NAME)
  	new_temp = copy_ssa_name (ptr);
        else
Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	2018-07-30 12:32:29.586506669 +0100
--- gcc/tree-vect-stmts.c	2018-07-30 12:40:14.000000000 +0100
*************** vectorizable_store (stmt_vec_info stmt_i
*** 6254,6260 ****
    unsigned int group_size, i;
    vec<tree> oprnds = vNULL;
    vec<tree> result_chain = vNULL;
-   bool inv_p;
    tree offset = NULL_TREE;
    vec<tree> vec_oprnds = vNULL;
    bool slp = (slp_node != NULL);
--- 6254,6259 ----
*************** vectorizable_store (stmt_vec_info stmt_i
*** 7018,7039 ****
  	    {
  	      dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
  	      dataref_offset = build_int_cst (ref_type, 0);
- 	      inv_p = false;
  	    }
  	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
! 	    {
! 	      vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
! 					   &dataref_ptr, &vec_offset);
! 	      inv_p = false;
! 	    }
  	  else
  	    dataref_ptr
  	      = vect_create_data_ref_ptr (first_stmt_info, aggr_type,
  					  simd_lane_access_p ? loop : NULL,
  					  offset, &dummy, gsi, &ptr_incr,
! 					  simd_lane_access_p, &inv_p,
! 					  NULL_TREE, bump);
! 	  gcc_assert (bb_vinfo || !inv_p);
  	}
        else
  	{
--- 7017,7032 ----
  	    {
  	      dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
  	      dataref_offset = build_int_cst (ref_type, 0);
  	    }
  	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
! 	    vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
! 					 &dataref_ptr, &vec_offset);
  	  else
  	    dataref_ptr
  	      = vect_create_data_ref_ptr (first_stmt_info, aggr_type,
  					  simd_lane_access_p ? loop : NULL,
  					  offset, &dummy, gsi, &ptr_incr,
! 					  simd_lane_access_p, NULL_TREE, bump);
  	}
        else
  	{
*************** vectorizable_load (stmt_vec_info stmt_in
*** 7419,7425 ****
    bool grouped_load = false;
    stmt_vec_info first_stmt_info;
    stmt_vec_info first_stmt_info_for_drptr = NULL;
-   bool inv_p;
    bool compute_in_loop = false;
    struct loop *at_loop;
    int vec_num;
--- 7412,7417 ----
*************** vectorizable_load (stmt_vec_info stmt_in
*** 7669,7674 ****
--- 7661,7723 ----
        return true;
      }
  
+   if (memory_access_type == VMAT_INVARIANT)
+     {
+       gcc_assert (!grouped_load && !mask && !bb_vinfo);
+       /* If we have versioned for aliasing or the loop doesn't
+ 	 have any data dependencies that would preclude this,
+ 	 then we are sure this is a loop invariant load and
+ 	 thus we can insert it on the preheader edge.  */
+       bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
+ 		      && !nested_in_vect_loop
+ 		      && hoist_defs_of_uses (stmt_info, loop));
+       if (hoist_p)
+ 	{
+ 	  gassign *stmt = as_a <gassign *> (stmt_info->stmt);
+ 	  if (dump_enabled_p ())
+ 	    {
+ 	      dump_printf_loc (MSG_NOTE, vect_location,
+ 			       "hoisting out of the vectorized loop: ");
+ 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
+ 	    }
+ 	  scalar_dest = copy_ssa_name (scalar_dest);
+ 	  tree rhs = unshare_expr (gimple_assign_rhs1 (stmt));
+ 	  gsi_insert_on_edge_immediate
+ 	    (loop_preheader_edge (loop),
+ 	     gimple_build_assign (scalar_dest, rhs));
+ 	}
+       /* These copies are all equivalent, but currently the representation
+ 	 requires a separate STMT_VINFO_VEC_STMT for each one.  */
+       prev_stmt_info = NULL;
+       gimple_stmt_iterator gsi2 = *gsi;
+       gsi_next (&gsi2);
+       for (j = 0; j < ncopies; j++)
+ 	{
+ 	  stmt_vec_info new_stmt_info;
+ 	  if (hoist_p)
+ 	    {
+ 	      new_temp = vect_init_vector (stmt_info, scalar_dest,
+ 					   vectype, NULL);
+ 	      gimple *new_stmt = SSA_NAME_DEF_STMT (new_temp);
+ 	      new_stmt_info = vinfo->add_stmt (new_stmt);
+ 	    }
+ 	  else
+ 	    {
+ 	      new_temp = vect_init_vector (stmt_info, scalar_dest,
+ 					   vectype, &gsi2);
+ 	      new_stmt_info = vinfo->lookup_def (new_temp);
+ 	    }
+ 	  if (slp)
+ 	    SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt_info);
+ 	  else if (j == 0)
+ 	    STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt_info;
+ 	  else
+ 	    STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt_info;
+ 	  prev_stmt_info = new_stmt_info;
+ 	}
+       return true;
+     }
+ 
    if (memory_access_type == VMAT_ELEMENTWISE
        || memory_access_type == VMAT_STRIDED_SLP)
      {
*************** vectorizable_load (stmt_vec_info stmt_in
*** 8177,8183 ****
  	    {
  	      dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));
  	      dataref_offset = build_int_cst (ref_type, 0);
- 	      inv_p = false;
  	    }
  	  else if (first_stmt_info_for_drptr
  		   && first_stmt_info != first_stmt_info_for_drptr)
--- 8226,8231 ----
*************** vectorizable_load (stmt_vec_info stmt_in
*** 8186,8192 ****
  		= vect_create_data_ref_ptr (first_stmt_info_for_drptr,
  					    aggr_type, at_loop, offset, &dummy,
  					    gsi, &ptr_incr, simd_lane_access_p,
! 					    &inv_p, byte_offset, bump);
  	      /* Adjust the pointer by the difference to first_stmt.  */
  	      data_reference_p ptrdr
  		= STMT_VINFO_DATA_REF (first_stmt_info_for_drptr);
--- 8234,8240 ----
  		= vect_create_data_ref_ptr (first_stmt_info_for_drptr,
  					    aggr_type, at_loop, offset, &dummy,
  					    gsi, &ptr_incr, simd_lane_access_p,
! 					    byte_offset, bump);
  	      /* Adjust the pointer by the difference to first_stmt.  */
  	      data_reference_p ptrdr
  		= STMT_VINFO_DATA_REF (first_stmt_info_for_drptr);
*************** vectorizable_load (stmt_vec_info stmt_in
*** 8199,8214 ****
  					     stmt_info, diff);
  	    }
  	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
! 	    {
! 	      vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
! 					   &dataref_ptr, &vec_offset);
! 	      inv_p = false;
! 	    }
  	  else
  	    dataref_ptr
  	      = vect_create_data_ref_ptr (first_stmt_info, aggr_type, at_loop,
  					  offset, &dummy, gsi, &ptr_incr,
! 					  simd_lane_access_p, &inv_p,
  					  byte_offset, bump);
  	  if (mask)
  	    vec_mask = vect_get_vec_def_for_operand (mask, stmt_info,
--- 8247,8259 ----
  					     stmt_info, diff);
  	    }
  	  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
! 	    vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,
! 					 &dataref_ptr, &vec_offset);
  	  else
  	    dataref_ptr
  	      = vect_create_data_ref_ptr (first_stmt_info, aggr_type, at_loop,
  					  offset, &dummy, gsi, &ptr_incr,
! 					  simd_lane_access_p,
  					  byte_offset, bump);
  	  if (mask)
  	    vec_mask = vect_get_vec_def_for_operand (mask, stmt_info,
*************** vectorizable_load (stmt_vec_info stmt_in
*** 8492,8538 ****
  		    }
  		}
  
- 	      /* 4. Handle invariant-load.  */
- 	      if (inv_p && !bb_vinfo)
- 		{
- 		  gcc_assert (!grouped_load);
- 		  /* If we have versioned for aliasing or the loop doesn't
- 		     have any data dependencies that would preclude this,
- 		     then we are sure this is a loop invariant load and
- 		     thus we can insert it on the preheader edge.  */
- 		  if (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)
- 		      && !nested_in_vect_loop
- 		      && hoist_defs_of_uses (stmt_info, loop))
- 		    {
- 		      gassign *stmt = as_a <gassign *> (stmt_info->stmt);
- 		      if (dump_enabled_p ())
- 			{
- 			  dump_printf_loc (MSG_NOTE, vect_location,
- 					   "hoisting out of the vectorized "
- 					   "loop: ");
- 			  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
- 			}
- 		      tree tem = copy_ssa_name (scalar_dest);
- 		      gsi_insert_on_edge_immediate
- 			(loop_preheader_edge (loop),
- 			 gimple_build_assign (tem,
- 					      unshare_expr
- 					        (gimple_assign_rhs1 (stmt))));
- 		      new_temp = vect_init_vector (stmt_info, tem,
- 						   vectype, NULL);
- 		      new_stmt = SSA_NAME_DEF_STMT (new_temp);
- 		      new_stmt_info = vinfo->add_stmt (new_stmt);
- 		    }
- 		  else
- 		    {
- 		      gimple_stmt_iterator gsi2 = *gsi;
- 		      gsi_next (&gsi2);
- 		      new_temp = vect_init_vector (stmt_info, scalar_dest,
- 						   vectype, &gsi2);
- 		      new_stmt_info = vinfo->lookup_def (new_temp);
- 		    }
- 		}
- 
  	      if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)
  		{
  		  tree perm_mask = perm_mask_for_reverse (vectype);
--- 8537,8542 ----
Richard Sandiford July 30, 2018, 11:42 a.m. | #2
_loop_vec_info::_loop_vec_info used get_loop_body to get the
order of the blocks when creating stmt_vec_infos, but then used
dfs_enumerate_from to get the order of the blocks that the rest
of the vectoriser uses.  We should be able to use that order
for creating stmt_vec_infos too.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Use the
	result of dfs_enumerate_from when constructing stmt_vec_infos,
	instead of additionally calling get_loop_body.

Index: gcc/tree-vect-loop.c
===================================================================
*** gcc/tree-vect-loop.c	2018-07-30 12:40:59.366015643 +0100
--- gcc/tree-vect-loop.c	2018-07-30 12:40:59.362015678 +0100
*************** _loop_vec_info::_loop_vec_info (struct l
*** 834,844 ****
      scalar_loop (NULL),
      orig_loop_info (NULL)
  {
!   /* Create/Update stmt_info for all stmts in the loop.  */
!   basic_block *body = get_loop_body (loop);
!   for (unsigned int i = 0; i < loop->num_nodes; i++)
      {
!       basic_block bb = body[i];
        gimple_stmt_iterator si;
  
        for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
--- 834,851 ----
      scalar_loop (NULL),
      orig_loop_info (NULL)
  {
!   /* CHECKME: We want to visit all BBs before their successors (except for
!      latch blocks, for which this assertion wouldn't hold).  In the simple
!      case of the loop forms we allow, a dfs order of the BBs would be the
!      as reversed postorder traversal, so we are safe.  */
! 
!   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
! 					  bbs, loop->num_nodes, loop);
!   gcc_assert (nbbs == loop->num_nodes);
! 
!   for (unsigned int i = 0; i < nbbs; i++)
      {
!       basic_block bb = bbs[i];
        gimple_stmt_iterator si;
  
        for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
*************** _loop_vec_info::_loop_vec_info (struct l
*** 855,870 ****
  	  add_stmt (stmt);
  	}
      }
-   free (body);
- 
-   /* CHECKME: We want to visit all BBs before their successors (except for
-      latch blocks, for which this assertion wouldn't hold).  In the simple
-      case of the loop forms we allow, a dfs order of the BBs would the same
-      as reversed postorder traversal, so we are safe.  */
- 
-   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
- 					  bbs, loop->num_nodes, loop);
-   gcc_assert (nbbs == loop->num_nodes);
  }
  
  /* Free all levels of MASKS.  */
--- 862,867 ----
Richard Sandiford July 30, 2018, 11:43 a.m. | #3
This patch makes hoist_defs_of_uses use vec_info::lookup_def instead of:

      if (!gimple_nop_p (def_stmt)
	  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))

to test whether a feeding scalar statement needs to be hoisted out
of the vectorised loop.  It isn't worth doing in its own right,
but it's a prerequisite for the next patch, which needs to update
the stmt_vec_infos of the hoisted statements.


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-stmts.c (hoist_defs_of_uses): Use vec_info::lookup_def
	instead of gimple_nop_p and flow_bb_inside_loop_p to decide
	whether a statement needs to be hoisted.

Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	2018-07-30 12:42:35.633169005 +0100
--- gcc/tree-vect-stmts.c	2018-07-30 12:42:35.629169040 +0100
*************** permute_vec_elements (tree x, tree y, tr
*** 7322,7370 ****
  static bool
  hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
  {
    ssa_op_iter i;
    tree op;
    bool any = false;
  
    FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
!     {
!       gimple *def_stmt = SSA_NAME_DEF_STMT (op);
!       if (!gimple_nop_p (def_stmt)
! 	  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
! 	{
! 	  /* Make sure we don't need to recurse.  While we could do
! 	     so in simple cases when there are more complex use webs
! 	     we don't have an easy way to preserve stmt order to fulfil
! 	     dependencies within them.  */
! 	  tree op2;
! 	  ssa_op_iter i2;
! 	  if (gimple_code (def_stmt) == GIMPLE_PHI)
  	    return false;
! 	  FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt, i2, SSA_OP_USE)
! 	    {
! 	      gimple *def_stmt2 = SSA_NAME_DEF_STMT (op2);
! 	      if (!gimple_nop_p (def_stmt2)
! 		  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt2)))
! 		return false;
! 	    }
! 	  any = true;
! 	}
!     }
  
    if (!any)
      return true;
  
    FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
!     {
!       gimple *def_stmt = SSA_NAME_DEF_STMT (op);
!       if (!gimple_nop_p (def_stmt)
! 	  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
! 	{
! 	  gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt);
! 	  gsi_remove (&gsi, false);
! 	  gsi_insert_on_edge_immediate (loop_preheader_edge (loop), def_stmt);
! 	}
!     }
  
    return true;
  }
--- 7322,7360 ----
  static bool
  hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)
  {
+   vec_info *vinfo = stmt_info->vinfo;
    ssa_op_iter i;
    tree op;
    bool any = false;
  
    FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
!     if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))
!       {
! 	/* Make sure we don't need to recurse.  While we could do
! 	   so in simple cases when there are more complex use webs
! 	   we don't have an easy way to preserve stmt order to fulfil
! 	   dependencies within them.  */
! 	tree op2;
! 	ssa_op_iter i2;
! 	if (gimple_code (def_stmt_info->stmt) == GIMPLE_PHI)
! 	  return false;
! 	FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt_info->stmt, i2, SSA_OP_USE)
! 	  if (vinfo->lookup_def (op2))
  	    return false;
! 	any = true;
!       }
  
    if (!any)
      return true;
  
    FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)
!     if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))
!       {
! 	gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt_info->stmt);
! 	gsi_remove (&gsi, false);
! 	gsi_insert_on_edge_immediate (loop_preheader_edge (loop),
! 				      def_stmt_info->stmt);
!       }
  
    return true;
  }
Richard Sandiford July 30, 2018, 11:45 a.m. | #4
This patch adds a vec_basic_block that records the scalar phis and
scalar statements that we need to vectorise.  This is a slight
simplification in its own right, since it avoids unnecessary statement
lookups and shaves >50 LOC.  But the main reason for doing it is
to allow the final patch in the series to treat pattern statements
less specially.

Putting phis (which are logically parallel) and normal statements
(which are logically serial) into a single list might seem dangerous,
but I think in practice it should be fine.  Very little vectoriser
code needs to handle the parallel nature of phis specially, and code
that does can still do so.  Having a single list simplifies code that
wants to look at every scalar phi or stmt in isolation.
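
As a rough stand-alone model of the list discipline involved (a simplified sketch, not the actual vec_basic_block code; the member names mirror the patch below but the types are invented):

```cpp
#include <cassert>
#include <cstddef>

// Simplified model of an intrusive double-linked statement list with the
// same add_to_end/add_before/remove operations as vec_basic_block.
struct stmt_node
{
  stmt_node *prev = nullptr;
  stmt_node *next = nullptr;
};

struct stmt_list
{
  stmt_node *first = nullptr;
  stmt_node *last = nullptr;

  // Append S to the end of the list.
  void add_to_end (stmt_node *s)
  {
    s->prev = last;
    if (last)
      last->next = s;
    else
      first = s;
    last = s;
  }

  // Insert S immediately before NEXT_S, which must already be linked.
  void add_before (stmt_node *s, stmt_node *next_s)
  {
    s->prev = next_s->prev;
    s->next = next_s;
    if (next_s->prev)
      next_s->prev->next = s;
    else
      first = s;
    next_s->prev = s;
  }

  // Unlink S from the list.
  void remove (stmt_node *s)
  {
    if (s->prev)
      s->prev->next = s->next;
    else
      first = s->next;
    if (s->next)
      s->next->prev = s->prev;
    else
      last = s->prev;
    s->prev = s->next = nullptr;
  }
};
```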


2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vec_basic_block): New structure.
	(vec_info::blocks, _stmt_vec_info::block, _stmt_vec_info::prev)
	(_stmt_vec_info::next): New member variables.
	(FOR_EACH_VEC_BB_STMT, FOR_EACH_VEC_BB_STMT_REVERSE): New macros.
	(vec_basic_block::vec_basic_block): New function.
	* tree-vectorizer.c (vec_basic_block::add_to_end): Likewise.
	(vec_basic_block::add_before): Likewise.
	(vec_basic_block::remove): Likewise.
	(vec_info::~vec_info): Free the vec_basic_blocks.
	(vec_info::remove_stmt): Remove the statement from the containing
	vec_basic_block.
	* tree-vect-patterns.c (vect_determine_precisions)
	(vect_pattern_recog): Iterate over vec_basic_blocks.
	* tree-vect-loop.c (vect_determine_vectorization_factor)
	(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
	(vect_analyze_loop_operations, vect_transform_loop): Likewise.
	(_loop_vec_info::_loop_vec_info): Construct vec_basic_blocks.
	* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise.
	(vect_detect_hybrid_slp): Iterate over vec_basic_blocks.
	* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
	(vect_finish_replace_stmt, vectorizable_condition): Remove the original
	statement from the containing block.
	(hoist_defs_of_uses): Likewise the statement that we're hoisting.

Index: gcc/tree-vectorizer.h
===================================================================
*** gcc/tree-vectorizer.h	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vectorizer.h	2018-07-30 12:43:34.508651861 +0100
*************** #define SLP_TREE_LOAD_PERMUTATION(S)
*** 171,177 ****
--- 171,200 ----
  #define SLP_TREE_TWO_OPERATORS(S)		 (S)->two_operators
  #define SLP_TREE_DEF_TYPE(S)			 (S)->def_type
  
+ /* Information about the phis and statements in a block that we're trying
+    to vectorize, in their original order.  */
+ class vec_basic_block
+ {
+ public:
+   vec_basic_block (basic_block);
+ 
+   void add_to_end (stmt_vec_info);
+   void add_before (stmt_vec_info, stmt_vec_info);
+   void remove (stmt_vec_info);
+ 
+   basic_block bb () const { return m_bb; }
+   stmt_vec_info first () const { return m_first; }
+   stmt_vec_info last () const { return m_last; }
+ 
+ private:
+   /* The block itself.  */
+   basic_block m_bb;
  
+   /* The first and last statements in the block, forming a double-linked list.
+      The list includes both phis and true statements.  */
+   stmt_vec_info m_first;
+   stmt_vec_info m_last;
+ };
  
  /* Describes two objects whose addresses must be unequal for the vectorized
     loop to be valid.  */
*************** struct vec_info {
*** 249,254 ****
--- 272,280 ----
    /* Cost data used by the target cost model.  */
    void *target_cost_data;
  
+   /* The basic blocks in the vectorization region.  */
+   auto_vec<vec_basic_block *, 5> blocks;
+ 
  private:
    stmt_vec_info new_stmt_vec_info (gimple *stmt);
    void set_vinfo_for_stmt (gimple *, stmt_vec_info);
*************** struct dr_vec_info {
*** 776,781 ****
--- 802,812 ----
  typedef struct data_reference *dr_p;
  
  struct _stmt_vec_info {
+   /* The block to which the statement belongs, or null if none.  */
+   vec_basic_block *block;
+ 
+   /* Link chains for the previous and next statements in BLOCK.  */
+   stmt_vec_info prev, next;
  
    enum stmt_vec_info_type type;
  
*************** #define VECT_SCALAR_BOOLEAN_TYPE_P(TYPE)
*** 1072,1077 ****
--- 1103,1129 ----
         && TYPE_PRECISION (TYPE) == 1		\
         && TYPE_UNSIGNED (TYPE)))
  
+ /* Make STMT_INFO iterate over each statement in vec_basic_block VEC_BB
+    in forward order.  */
+ 
+ #define FOR_EACH_VEC_BB_STMT(VEC_BB, STMT_INFO) \
+   for (stmt_vec_info STMT_INFO = (VEC_BB)->first (); STMT_INFO; \
+        STMT_INFO = STMT_INFO->next)
+ 
+ /* Make STMT_INFO iterate over each statement in vec_basic_block VEC_BB
+    in backward order.  */
+ 
+ #define FOR_EACH_VEC_BB_STMT_REVERSE(VEC_BB, STMT_INFO) \
+   for (stmt_vec_info STMT_INFO = (VEC_BB)->last (); STMT_INFO; \
+        STMT_INFO = STMT_INFO->prev)
+ 
+ /* Construct a vec_basic_block for BB.  */
+ 
+ inline vec_basic_block::vec_basic_block (basic_block bb)
+   : m_bb (bb), m_first (NULL), m_last (NULL)
+ {
+ }
+ 
  static inline bool
  nested_in_vect_loop_p (struct loop *loop, stmt_vec_info stmt_info)
  {
Index: gcc/tree-vectorizer.c
===================================================================
*** gcc/tree-vectorizer.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vectorizer.c	2018-07-30 12:43:34.508651861 +0100
*************** note_simd_array_uses (hash_table<simd_ar
*** 444,449 ****
--- 444,504 ----
    delete simd_array_to_simduid_htab;
  }
  
+ /* Add STMT_INFO to the end of the block.  */
+ 
+ void
+ vec_basic_block::add_to_end (stmt_vec_info stmt_info)
+ {
+   gcc_checking_assert (!stmt_info->block
+ 		       && !stmt_info->prev
+ 		       && !stmt_info->next);
+   if (m_last)
+     m_last->next = stmt_info;
+   else
+     m_first = stmt_info;
+   stmt_info->block = this;
+   stmt_info->prev = m_last;
+   m_last = stmt_info;
+ }
+ 
+ /* Add STMT_INFO to the block, inserting it before NEXT_STMT_INFO.  */
+ 
+ void
+ vec_basic_block::add_before (stmt_vec_info stmt_info,
+ 			     stmt_vec_info next_stmt_info)
+ {
+   gcc_checking_assert (!stmt_info->block
+ 		       && !stmt_info->prev
+ 		       && !stmt_info->next
+ 		       && next_stmt_info->block == this);
+   if (next_stmt_info->prev)
+     next_stmt_info->prev->next = stmt_info;
+   else
+     m_first = stmt_info;
+   stmt_info->block = this;
+   stmt_info->prev = next_stmt_info->prev;
+   stmt_info->next = next_stmt_info;
+   next_stmt_info->prev = stmt_info;
+ }
+ 
+ /* Remove STMT_INFO from the block.  */
+ 
+ void
+ vec_basic_block::remove (stmt_vec_info stmt_info)
+ {
+   gcc_checking_assert (stmt_info->block == this);
+   if (stmt_info->prev)
+     stmt_info->prev->next = stmt_info->next;
+   else
+     m_first = stmt_info->next;
+   if (stmt_info->next)
+     stmt_info->next->prev = stmt_info->prev;
+   else
+     m_last = stmt_info->prev;
+   stmt_info->block = NULL;
+   stmt_info->prev = stmt_info->next = NULL;
+ }
+ 
  /* Initialize the vec_info with kind KIND_IN and target cost data
     TARGET_COST_DATA_IN.  */
  
*************** vec_info::vec_info (vec_info::vec_kind k
*** 459,466 ****
--- 514,525 ----
  vec_info::~vec_info ()
  {
    slp_instance instance;
+   vec_basic_block *vec_bb;
    unsigned int i;
  
+   FOR_EACH_VEC_ELT (blocks, i, vec_bb)
+     delete vec_bb;
+ 
    FOR_EACH_VEC_ELT (slp_instances, i, instance)
      vect_free_slp_instance (instance, true);
  
*************** vec_info::remove_stmt (stmt_vec_info stm
*** 596,601 ****
--- 655,661 ----
    unlink_stmt_vdef (stmt_info->stmt);
    gsi_remove (&si, true);
    release_defs (stmt_info->stmt);
+   stmt_info->block->remove (stmt_info);
    free_stmt_vec_info (stmt_info);
  }
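
The list manipulation in the hunks above is a plain doubly-linked list threaded through each stmt_vec_info. As a standalone sanity check, the same discipline can be mimicked outside GCC like this (field and function names mirror the patch, but this is an illustrative sketch, not GCC code):

```cpp
#include <cassert>

// Stand-ins for stmt_vec_info and vec_basic_block; only the list
// fields that the patch's add_to_end/add_before/remove touch.
struct stmt_info
{
  struct block *blk = nullptr;
  stmt_info *prev = nullptr;
  stmt_info *next = nullptr;
};

struct block
{
  stmt_info *first = nullptr;
  stmt_info *last = nullptr;

  // Append S at the end of the block (cf. vec_basic_block::add_to_end).
  void add_to_end (stmt_info *s)
  {
    assert (!s->blk && !s->prev && !s->next);
    if (last)
      last->next = s;
    else
      first = s;
    s->blk = this;
    s->prev = last;
    last = s;
  }

  // Insert S immediately before NEXT_S (cf. add_before).
  void add_before (stmt_info *s, stmt_info *next_s)
  {
    assert (!s->blk && !s->prev && !s->next && next_s->blk == this);
    if (next_s->prev)
      next_s->prev->next = s;
    else
      first = s;
    s->blk = this;
    s->prev = next_s->prev;
    s->next = next_s;
    next_s->prev = s;
  }

  // Unlink S and clear its list fields (cf. remove).
  void remove (stmt_info *s)
  {
    assert (s->blk == this);
    if (s->prev)
      s->prev->next = s->next;
    else
      first = s->next;
    if (s->next)
      s->next->prev = s->prev;
    else
      last = s->prev;
    s->blk = nullptr;
    s->prev = s->next = nullptr;
  }
};
```

Note that remove leaves the stmt_info with all three fields cleared, which is exactly what the checking asserts in add_to_end and add_before require, so a removed statement can be relinked into another block.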
  
Index: gcc/tree-vect-patterns.c
===================================================================
*** gcc/tree-vect-patterns.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vect-patterns.c	2018-07-30 12:43:34.504651897 +0100
*************** vect_determine_precisions (vec_info *vin
*** 4631,4669 ****
  {
    DUMP_VECT_SCOPE ("vect_determine_precisions");
  
!   if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
!     {
!       struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!       basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
!       unsigned int nbbs = loop->num_nodes;
! 
!       for (unsigned int i = 0; i < nbbs; i++)
! 	{
! 	  basic_block bb = bbs[nbbs - i - 1];
! 	  for (gimple_stmt_iterator si = gsi_last_bb (bb);
! 	       !gsi_end_p (si); gsi_prev (&si))
! 	    vect_determine_stmt_precisions
! 	      (vinfo->lookup_stmt (gsi_stmt (si)));
! 	}
!     }
!   else
!     {
!       bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
!       gimple_stmt_iterator si = bb_vinfo->region_end;
!       gimple *stmt;
!       do
! 	{
! 	  if (!gsi_stmt (si))
! 	    si = gsi_last_bb (bb_vinfo->bb);
! 	  else
! 	    gsi_prev (&si);
! 	  stmt = gsi_stmt (si);
! 	  stmt_vec_info stmt_info = vinfo->lookup_stmt (stmt);
! 	  if (stmt_info && STMT_VINFO_VECTORIZABLE (stmt_info))
! 	    vect_determine_stmt_precisions (stmt_info);
! 	}
!       while (stmt != gsi_stmt (bb_vinfo->region_begin));
!     }
  }
  
  typedef gimple *(*vect_recog_func_ptr) (stmt_vec_info, tree *);
--- 4631,4641 ----
  {
    DUMP_VECT_SCOPE ("vect_determine_precisions");
  
!   unsigned int i;
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT_REVERSE (vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT_REVERSE (vec_bb, stmt_info)
!       vect_determine_stmt_precisions (stmt_info);
  }
  
  typedef gimple *(*vect_recog_func_ptr) (stmt_vec_info, tree *);
*************** vect_pattern_recog_1 (vect_recog_func *r
*** 4923,4973 ****
  void
  vect_pattern_recog (vec_info *vinfo)
  {
-   struct loop *loop;
-   basic_block *bbs;
-   unsigned int nbbs;
-   gimple_stmt_iterator si;
-   unsigned int i, j;
- 
    vect_determine_precisions (vinfo);
  
    DUMP_VECT_SCOPE ("vect_pattern_recog");
  
!   if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
!     {
!       loop = LOOP_VINFO_LOOP (loop_vinfo);
!       bbs = LOOP_VINFO_BBS (loop_vinfo);
!       nbbs = loop->num_nodes;
! 
!       /* Scan through the loop stmts, applying the pattern recognition
! 	 functions starting at each stmt visited:  */
!       for (i = 0; i < nbbs; i++)
! 	{
! 	  basic_block bb = bbs[i];
! 	  for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
! 	    {
! 	      stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
! 	      /* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	      for (j = 0; j < NUM_PATTERNS; j++)
! 		vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j],
! 				      stmt_info);
! 	    }
! 	}
!     }
!   else
!     {
!       bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
!       for (si = bb_vinfo->region_begin;
! 	   gsi_stmt (si) != gsi_stmt (bb_vinfo->region_end); gsi_next (&si))
! 	{
! 	  gimple *stmt = gsi_stmt (si);
! 	  stmt_vec_info stmt_info = bb_vinfo->lookup_stmt (stmt);
! 	  if (stmt_info && !STMT_VINFO_VECTORIZABLE (stmt_info))
! 	    continue;
! 
! 	  /* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	  for (j = 0; j < NUM_PATTERNS; j++)
! 	    vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
! 	}
!     }
  }
--- 4895,4910 ----
  void
  vect_pattern_recog (vec_info *vinfo)
  {
    vect_determine_precisions (vinfo);
  
    DUMP_VECT_SCOPE ("vect_pattern_recog");
  
!   unsigned int i;
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (STMT_VINFO_VECTORIZABLE (stmt_info))
! 	/* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	for (unsigned int j = 0; j < NUM_PATTERNS; j++)
! 	  vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
  }
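
The rewritten walkers above lean on FOR_EACH_VEC_BB_STMT and FOR_EACH_VEC_BB_STMT_REVERSE. As a rough standalone reading (the macro shapes below are assumptions for illustration, not the patch's actual definitions), each per-block walker is just a forward or backward traversal of the stmt_vec_info chain:

```cpp
#include <cassert>

// Minimal stmt_info with the list fields the walkers need.
struct stmt_info { stmt_info *prev; stmt_info *next; int uid; };

// Hypothetical expansions of the patch's per-block iteration macros.
#define FOR_EACH_BB_STMT(FIRST, S) \
  for (stmt_info *S = (FIRST); S; S = S->next)
#define FOR_EACH_BB_STMT_REVERSE(LAST, S) \
  for (stmt_info *S = (LAST); S; S = S->prev)

// Fold uids into a decimal digit string to expose the visit order.
int
visit_forward (stmt_info *first)
{
  int order = 0;
  FOR_EACH_BB_STMT (first, s)
    order = order * 10 + s->uid;
  return order;
}

int
visit_reverse (stmt_info *last)
{
  int order = 0;
  FOR_EACH_BB_STMT_REVERSE (last, s)
    order = order * 10 + s->uid;
  return order;
}
```

This matches how vect_determine_precisions walks blocks and statements in reverse while vect_pattern_recog walks them forwards, without either caring whether a statement came from the original IL or from a pattern.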
Index: gcc/tree-vect-loop.c
===================================================================
*** gcc/tree-vect-loop.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vect-loop.c	2018-07-30 12:43:34.500651932 +0100
*************** vect_determine_vf_for_stmt (stmt_vec_inf
*** 286,321 ****
  static bool
  vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
  {
-   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-   unsigned nbbs = loop->num_nodes;
    poly_uint64 vectorization_factor = 1;
    tree scalar_type = NULL_TREE;
-   gphi *phi;
    tree vectype;
    stmt_vec_info stmt_info;
    unsigned i;
    auto_vec<stmt_vec_info> mask_producers;
  
    DUMP_VECT_SCOPE ("vect_determine_vectorization_factor");
  
!   for (i = 0; i < nbbs; i++)
!     {
!       basic_block bb = bbs[i];
! 
!       for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
  	{
- 	  phi = si.phi ();
- 	  stmt_info = loop_vinfo->lookup_stmt (phi);
  	  if (dump_enabled_p ())
  	    {
  	      dump_printf_loc (MSG_NOTE, vect_location, "==> examining phi: ");
  	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
  	    }
  
- 	  gcc_assert (stmt_info);
- 
  	  if (STMT_VINFO_RELEVANT_P (stmt_info)
  	      || STMT_VINFO_LIVE_P (stmt_info))
              {
--- 286,311 ----
  static bool
  vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
  {
    poly_uint64 vectorization_factor = 1;
    tree scalar_type = NULL_TREE;
    tree vectype;
    stmt_vec_info stmt_info;
    unsigned i;
    auto_vec<stmt_vec_info> mask_producers;
+   vec_basic_block *vec_bb;
  
    DUMP_VECT_SCOPE ("vect_determine_vectorization_factor");
  
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
  	{
  	  if (dump_enabled_p ())
  	    {
  	      dump_printf_loc (MSG_NOTE, vect_location, "==> examining phi: ");
  	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
  	    }
  
  	  if (STMT_VINFO_RELEVANT_P (stmt_info)
  	      || STMT_VINFO_LIVE_P (stmt_info))
              {
*************** vect_determine_vectorization_factor (loo
*** 363,378 ****
  	      vect_update_max_nunits (&vectorization_factor, vectype);
  	    }
  	}
! 
!       for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
! 	{
! 	  stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  if (!vect_determine_vf_for_stmt (stmt_info, &vectorization_factor,
! 					   &mask_producers))
! 	    return false;
!         }
!     }
  
    /* TODO: Analyze cost. Decide if worth while to vectorize.  */
    if (dump_enabled_p ())
--- 353,361 ----
  	      vect_update_max_nunits (&vectorization_factor, vectype);
  	    }
  	}
!       else if (!vect_determine_vf_for_stmt (stmt_info, &vectorization_factor,
! 					    &mask_producers))
! 	return false;
  
    /* TODO: Analyze cost. Decide if worth while to vectorize.  */
    if (dump_enabled_p ())
*************** _loop_vec_info::_loop_vec_info (struct l
*** 846,866 ****
    for (unsigned int i = 0; i < nbbs; i++)
      {
        basic_block bb = bbs[i];
        gimple_stmt_iterator si;
  
        for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
  	{
  	  gimple *phi = gsi_stmt (si);
  	  gimple_set_uid (phi, 0);
! 	  add_stmt (phi);
  	}
  
        for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
  	{
  	  gimple *stmt = gsi_stmt (si);
  	  gimple_set_uid (stmt, 0);
! 	  add_stmt (stmt);
  	}
      }
  }
  
--- 829,851 ----
    for (unsigned int i = 0; i < nbbs; i++)
      {
        basic_block bb = bbs[i];
+       vec_basic_block *vec_bb = new vec_basic_block (bb);
        gimple_stmt_iterator si;
  
        for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
  	{
  	  gimple *phi = gsi_stmt (si);
  	  gimple_set_uid (phi, 0);
! 	  vec_bb->add_to_end (add_stmt (phi));
  	}
  
        for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
  	{
  	  gimple *stmt = gsi_stmt (si);
  	  gimple_set_uid (stmt, 0);
! 	  vec_bb->add_to_end (add_stmt (stmt));
  	}
+       blocks.safe_push (vec_bb);
      }
  }
  
*************** vect_verify_full_masking (loop_vec_info
*** 1066,1074 ****
  vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
!   int nbbs = loop->num_nodes, factor;
!   int innerloop_iters, i;
  
    /* Gather costs for statements in the scalar loop.  */
  
--- 1051,1058 ----
  vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!   int factor, innerloop_iters;
!   unsigned int i;
  
    /* Gather costs for statements in the scalar loop.  */
  
*************** vect_compute_single_scalar_iteration_cos
*** 1077,1099 ****
    if (loop->inner)
      innerloop_iters = 50; /* FIXME */
  
!   for (i = 0; i < nbbs; i++)
      {
!       gimple_stmt_iterator si;
!       basic_block bb = bbs[i];
! 
!       if (bb->loop_father == loop->inner)
          factor = innerloop_iters;
        else
          factor = 1;
  
!       for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
!         {
! 	  gimple *stmt = gsi_stmt (si);
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (stmt);
! 
!           if (!is_gimple_assign (stmt) && !is_gimple_call (stmt))
!             continue;
  
            /* Skip stmts that are not vectorized inside the loop.  */
            if (stmt_info
--- 1061,1079 ----
    if (loop->inner)
      innerloop_iters = 50; /* FIXME */
  
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      {
!       if (vec_bb->bb ()->loop_father == loop->inner)
          factor = innerloop_iters;
        else
          factor = 1;
  
!       FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
! 	{
! 	  if (!is_gimple_assign (stmt_info->stmt)
! 	      && !is_gimple_call (stmt_info->stmt))
! 	    continue;
  
            /* Skip stmts that are not vectorized inside the loop.  */
            if (stmt_info
*************** vect_analyze_loop_form (struct loop *loo
*** 1397,1407 ****
  static void
  vect_update_vf_for_slp (loop_vec_info loop_vinfo)
  {
-   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-   int nbbs = loop->num_nodes;
    poly_uint64 vectorization_factor;
!   int i;
  
    DUMP_VECT_SCOPE ("vect_update_vf_for_slp");
  
--- 1377,1384 ----
  static void
  vect_update_vf_for_slp (loop_vec_info loop_vinfo)
  {
    poly_uint64 vectorization_factor;
!   unsigned int i;
  
    DUMP_VECT_SCOPE ("vect_update_vf_for_slp");
  
*************** vect_update_vf_for_slp (loop_vec_info lo
*** 1414,1434 ****
       perform pure SLP on loop - cross iteration parallelism is not
       exploited.  */
    bool only_slp_in_loop = true;
!   for (i = 0; i < nbbs; i++)
!     {
!       basic_block bb = bbs[i];
!       for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
! 	{
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  stmt_info = vect_stmt_to_vectorize (stmt_info);
! 	  if ((STMT_VINFO_RELEVANT_P (stmt_info)
! 	       || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
! 	      && !PURE_SLP_STMT (stmt_info))
! 	    /* STMT needs both SLP and loop-based vectorization.  */
! 	    only_slp_in_loop = false;
! 	}
!     }
  
    if (only_slp_in_loop)
      {
--- 1391,1407 ----
       perform pure SLP on loop - cross iteration parallelism is not
       exploited.  */
    bool only_slp_in_loop = true;
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	stmt_vec_info final_info = vect_stmt_to_vectorize (stmt_info);
! 	if ((STMT_VINFO_RELEVANT_P (final_info)
! 	     || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (final_info)))
! 	    && !PURE_SLP_STMT (final_info))
! 	  /* STMT needs both SLP and loop-based vectorization.  */
! 	  only_slp_in_loop = false;
!       }
  
    if (only_slp_in_loop)
      {
*************** vect_active_double_reduction_p (stmt_vec
*** 1491,1501 ****
  static bool
  vect_analyze_loop_operations (loop_vec_info loop_vinfo)
  {
!   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
!   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
!   int nbbs = loop->num_nodes;
!   int i;
!   stmt_vec_info stmt_info;
    bool need_to_vectorize = false;
    bool ok;
  
--- 1464,1470 ----
  static bool
  vect_analyze_loop_operations (loop_vec_info loop_vinfo)
  {
!   unsigned int i;
    bool need_to_vectorize = false;
    bool ok;
  
*************** vect_analyze_loop_operations (loop_vec_i
*** 1504,1520 ****
    stmt_vector_for_cost cost_vec;
    cost_vec.create (2);
  
!   for (i = 0; i < nbbs; i++)
!     {
!       basic_block bb = bbs[i];
! 
!       for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
          {
-           gphi *phi = si.phi ();
            ok = true;
- 
- 	  stmt_info = loop_vinfo->lookup_stmt (phi);
            if (dump_enabled_p ())
              {
                dump_printf_loc (MSG_NOTE, vect_location, "examining phi: ");
--- 1473,1484 ----
    stmt_vector_for_cost cost_vec;
    cost_vec.create (2);
  
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
          {
            ok = true;
            if (dump_enabled_p ())
              {
                dump_printf_loc (MSG_NOTE, vect_location, "examining phi: ");
*************** vect_analyze_loop_operations (loop_vec_i
*** 1525,1531 ****
  
            /* Inner-loop loop-closed exit phi in outer-loop vectorization
               (i.e., a phi in the tail of the outer-loop).  */
!           if (! is_loop_header_bb_p (bb))
              {
                /* FORNOW: we currently don't support the case that these phis
                   are not used in the outerloop (unless it is double reduction,
--- 1489,1495 ----
  
            /* Inner-loop loop-closed exit phi in outer-loop vectorization
               (i.e., a phi in the tail of the outer-loop).  */
!           if (! is_loop_header_bb_p (vec_bb->bb ()))
              {
                /* FORNOW: we currently don't support the case that these phis
                   are not used in the outerloop (unless it is double reduction,
*************** vect_analyze_loop_operations (loop_vec_i
*** 1564,1571 ****
                continue;
              }
  
-           gcc_assert (stmt_info);
- 
            if ((STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_scope
                 || STMT_VINFO_LIVE_P (stmt_info))
                && STMT_VINFO_DEF_TYPE (stmt_info) != vect_induction_def)
--- 1528,1533 ----
*************** vect_analyze_loop_operations (loop_vec_i
*** 1610,1627 ****
  	      return false;
              }
          }
! 
!       for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
!         {
! 	  gimple *stmt = gsi_stmt (si);
! 	  if (!gimple_clobber_p (stmt)
! 	      && !vect_analyze_stmt (loop_vinfo->lookup_stmt (stmt),
! 				     &need_to_vectorize,
! 				     NULL, NULL, &cost_vec))
! 	    return false;
!         }
!     } /* bbs */
  
    add_stmt_costs (loop_vinfo->target_cost_data, &cost_vec);
    cost_vec.release ();
--- 1572,1581 ----
  	      return false;
              }
          }
!       else if (!gimple_clobber_p (stmt_info->stmt)
! 	       && !vect_analyze_stmt (stmt_info, &need_to_vectorize,
! 				      NULL, NULL, &cost_vec))
! 	return false;
  
    add_stmt_costs (loop_vinfo->target_cost_data, &cost_vec);
    cost_vec.release ();
*************** vect_analyze_loop_2 (loop_vec_info loop_
*** 2207,2238 ****
      vect_free_slp_instance (instance, false);
    LOOP_VINFO_SLP_INSTANCES (loop_vinfo).release ();
    /* Reset SLP type to loop_vect on all stmts.  */
!   for (i = 0; i < LOOP_VINFO_LOOP (loop_vinfo)->num_nodes; ++i)
!     {
!       basic_block bb = LOOP_VINFO_BBS (loop_vinfo)[i];
!       for (gimple_stmt_iterator si = gsi_start_phis (bb);
! 	   !gsi_end_p (si); gsi_next (&si))
! 	{
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	}
!       for (gimple_stmt_iterator si = gsi_start_bb (bb);
! 	   !gsi_end_p (si); gsi_next (&si))
! 	{
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 	    {
! 	      gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 	      stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
! 	      STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	      for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
! 		   !gsi_end_p (pi); gsi_next (&pi))
! 		STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
! 		  = loop_vect;
! 	    }
! 	}
!     }
    /* Free optimized alias test DDRS.  */
    LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).release ();
--- 2161,2182 ----
      vect_free_slp_instance (instance, false);
    LOOP_VINFO_SLP_INSTANCES (loop_vinfo).release ();
    /* Reset SLP type to loop_vect on all stmts.  */
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 	  {
! 	    gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 	    STMT_SLP_TYPE (STMT_VINFO_RELATED_STMT (stmt_info)) = loop_vect;
! 	    for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
! 		 !gsi_end_p (pi); gsi_next (&pi))
! 	      STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
! 		= loop_vect;
! 	  }
!       }
! 
    /* Free optimized alias test DDRS.  */
    LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).release ();
*************** vect_transform_loop (loop_vec_info loop_
*** 8237,8251 ****
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    struct loop *epilogue = NULL;
!   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
!   int nbbs = loop->num_nodes;
!   int i;
    tree niters_vector = NULL_TREE;
    tree step_vector = NULL_TREE;
    tree niters_vector_mult_vf = NULL_TREE;
    poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
    unsigned int lowest_vf = constant_lower_bound (vf);
-   gimple *stmt;
    bool check_profitability = false;
    unsigned int th;
  
--- 8181,8192 ----
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    struct loop *epilogue = NULL;
!   unsigned int i;
    tree niters_vector = NULL_TREE;
    tree step_vector = NULL_TREE;
    tree niters_vector_mult_vf = NULL_TREE;
    poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
    unsigned int lowest_vf = constant_lower_bound (vf);
    bool check_profitability = false;
    unsigned int th;
  
*************** vect_transform_loop (loop_vec_info loop_
*** 8363,8452 ****
       support more involved loop forms, the order by which the BBs are
       traversed need to be reconsidered.  */
  
!   for (i = 0; i < nbbs; i++)
      {
!       basic_block bb = bbs[i];
!       stmt_vec_info stmt_info;
! 
!       for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si);
! 	   gsi_next (&si))
!         {
! 	  gphi *phi = si.phi ();
! 	  if (dump_enabled_p ())
  	    {
! 	      dump_printf_loc (MSG_NOTE, vect_location,
!                                "------>vectorizing phi: ");
! 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
! 	    }
! 	  stmt_info = loop_vinfo->lookup_stmt (phi);
! 	  if (!stmt_info)
! 	    continue;
  
! 	  if (MAY_HAVE_DEBUG_BIND_STMTS && !STMT_VINFO_LIVE_P (stmt_info))
! 	    vect_loop_kill_debug_uses (loop, stmt_info);
  
! 	  if (!STMT_VINFO_RELEVANT_P (stmt_info)
! 	      && !STMT_VINFO_LIVE_P (stmt_info))
! 	    continue;
! 
! 	  if (STMT_VINFO_VECTYPE (stmt_info)
! 	      && (maybe_ne
! 		  (TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)), vf))
! 	      && dump_enabled_p ())
! 	    dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
! 
! 	  if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def
! 	       || STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
! 	       || STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
! 	      && ! PURE_SLP_STMT (stmt_info))
! 	    {
! 	      if (dump_enabled_p ())
! 		dump_printf_loc (MSG_NOTE, vect_location, "transform phi.\n");
! 	      vect_transform_stmt (stmt_info, NULL, NULL, NULL);
  	    }
- 	}
- 
-       for (gimple_stmt_iterator si = gsi_start_bb (bb);
- 	   !gsi_end_p (si);)
- 	{
- 	  stmt = gsi_stmt (si);
  	  /* During vectorization remove existing clobber stmts.  */
! 	  if (gimple_clobber_p (stmt))
! 	    {
! 	      unlink_stmt_vdef (stmt);
! 	      gsi_remove (&si, true);
! 	      release_defs (stmt);
! 	    }
  	  else
  	    {
- 	      stmt_info = loop_vinfo->lookup_stmt (stmt);
- 
- 	      /* vector stmts created in the outer-loop during vectorization of
- 		 stmts in an inner-loop may not have a stmt_info, and do not
- 		 need to be vectorized.  */
  	      stmt_vec_info seen_store = NULL;
! 	      if (stmt_info)
  		{
! 		  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
  		    {
- 		      gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
- 		      for (gimple_stmt_iterator subsi = gsi_start (def_seq);
- 			   !gsi_end_p (subsi); gsi_next (&subsi))
- 			{
- 			  stmt_vec_info pat_stmt_info
- 			    = loop_vinfo->lookup_stmt (gsi_stmt (subsi));
- 			  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
- 						    &si, &seen_store);
- 			}
  		      stmt_vec_info pat_stmt_info
! 			= STMT_VINFO_RELATED_STMT (stmt_info);
! 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
! 						&seen_store);
  		    }
! 		  vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
  					    &seen_store);
  		}
! 	      gsi_next (&si);
  	      if (seen_store)
  		{
  		  if (STMT_VINFO_GROUPED_ACCESS (seen_store))
--- 8304,8376 ----
       support more involved loop forms, the order by which the BBs are
       traversed need to be reconsidered.  */
  
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      {
!       stmt_vec_info next_stmt_info;
!       for (stmt_vec_info stmt_info = vec_bb->first (); stmt_info;
! 	   stmt_info = next_stmt_info)
! 	{
! 	  next_stmt_info = stmt_info->next;
! 	  if (gphi *phi = dyn_cast <gphi *> (stmt_info->stmt))
  	    {
! 	      if (dump_enabled_p ())
! 		{
! 		  dump_printf_loc (MSG_NOTE, vect_location,
! 				   "------>vectorizing phi: ");
! 		  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);
! 		}
  
! 	      if (MAY_HAVE_DEBUG_BIND_STMTS && !STMT_VINFO_LIVE_P (stmt_info))
! 		vect_loop_kill_debug_uses (loop, stmt_info);
  
! 	      if (!STMT_VINFO_RELEVANT_P (stmt_info)
! 		  && !STMT_VINFO_LIVE_P (stmt_info))
! 		continue;
! 
! 	      if (STMT_VINFO_VECTYPE (stmt_info)
! 		  && (maybe_ne
! 		      (TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)),
! 		       vf))
! 		  && dump_enabled_p ())
! 		dump_printf_loc (MSG_NOTE, vect_location, "multiple-types.\n");
! 
! 	      if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def
! 		   || STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
! 		   || STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
! 		  && ! PURE_SLP_STMT (stmt_info))
! 		{
! 		  if (dump_enabled_p ())
! 		    dump_printf_loc (MSG_NOTE, vect_location,
! 				     "transform phi.\n");
! 		  vect_transform_stmt (stmt_info, NULL, NULL, NULL);
! 		}
  	    }
  	  /* During vectorization remove existing clobber stmts.  */
! 	  else if (gimple_clobber_p (stmt_info->stmt))
! 	    loop_vinfo->remove_stmt (stmt_info);
  	  else
  	    {
  	      stmt_vec_info seen_store = NULL;
! 	      gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
! 	      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
  		{
! 		  gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 		  for (gimple_stmt_iterator subsi = gsi_start (def_seq);
! 		       !gsi_end_p (subsi); gsi_next (&subsi))
  		    {
  		      stmt_vec_info pat_stmt_info
! 			= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
! 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
! 						&si, &seen_store);
  		    }
! 		  stmt_vec_info pat_stmt_info
! 		    = STMT_VINFO_RELATED_STMT (stmt_info);
! 		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si,
  					    &seen_store);
  		}
! 	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
! 					&seen_store);
  	      if (seen_store)
  		{
  		  if (STMT_VINFO_GROUPED_ACCESS (seen_store))
*************** vect_transform_loop (loop_vec_info loop_
*** 8464,8470 ****
        /* Stub out scalar statements that must not survive vectorization.
  	 Doing this here helps with grouped statements, or statements that
  	 are involved in patterns.  */
!       for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
  	   !gsi_end_p (gsi); gsi_next (&gsi))
  	{
  	  gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
--- 8388,8394 ----
        /* Stub out scalar statements that must not survive vectorization.
  	 Doing this here helps with grouped statements, or statements that
  	 are involved in patterns.  */
!       for (gimple_stmt_iterator gsi = gsi_start_bb (vec_bb->bb ());
  	   !gsi_end_p (gsi); gsi_next (&gsi))
  	{
  	  gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
Index: gcc/tree-vect-slp.c
===================================================================
*** gcc/tree-vect-slp.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vect-slp.c	2018-07-30 12:43:34.504651897 +0100
*************** vect_detect_hybrid_slp (loop_vec_info lo
*** 2408,2436 ****
  
    /* First walk all pattern stmt in the loop and mark defs of uses as
       hybrid because immediate uses in them are not recorded.  */
!   for (i = 0; i < LOOP_VINFO_LOOP (loop_vinfo)->num_nodes; ++i)
!     {
!       basic_block bb = LOOP_VINFO_BBS (loop_vinfo)[i];
!       for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
! 	   gsi_next (&gsi))
  	{
! 	  gimple *stmt = gsi_stmt (gsi);
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (stmt);
! 	  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 	    {
! 	      walk_stmt_info wi;
! 	      memset (&wi, 0, sizeof (wi));
! 	      wi.info = loop_vinfo;
! 	      gimple_stmt_iterator gsi2
! 		= gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
! 	      walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
! 				vect_detect_hybrid_slp_1, &wi);
! 	      walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
! 			       vect_detect_hybrid_slp_2,
! 			       vect_detect_hybrid_slp_1, &wi);
! 	    }
  	}
-     }
  
    /* Then walk the SLP instance trees marking stmts with uses in
       non-SLP stmts as hybrid, also propagating hybrid down the
--- 2408,2429 ----
  
    /* First walk all pattern stmt in the loop and mark defs of uses as
       hybrid because immediate uses in them are not recorded.  */
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (STMT_VINFO_IN_PATTERN_P (stmt_info))
  	{
! 	  walk_stmt_info wi;
! 	  memset (&wi, 0, sizeof (wi));
! 	  wi.info = loop_vinfo;
! 	  gimple_stmt_iterator gsi2
! 	    = gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
! 	  walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
! 			    vect_detect_hybrid_slp_1, &wi);
! 	  walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
! 			   vect_detect_hybrid_slp_2,
! 			   vect_detect_hybrid_slp_1, &wi);
  	}
  
    /* Then walk the SLP instance trees marking stmts with uses in
       non-SLP stmts as hybrid, also propagating hybrid down the
*************** _bb_vec_info::_bb_vec_info (gimple_stmt_
*** 2457,2469 ****
  {
    gimple_stmt_iterator gsi;
  
    for (gsi = region_begin; gsi_stmt (gsi) != gsi_stmt (region_end);
         gsi_next (&gsi))
      {
        gimple *stmt = gsi_stmt (gsi);
        gimple_set_uid (stmt, 0);
!       add_stmt (stmt);
      }
  
    bb->aux = this;
  }
--- 2450,2464 ----
  {
    gimple_stmt_iterator gsi;
  
+   vec_basic_block *vec_bb = new vec_basic_block (bb);
    for (gsi = region_begin; gsi_stmt (gsi) != gsi_stmt (region_end);
         gsi_next (&gsi))
      {
        gimple *stmt = gsi_stmt (gsi);
        gimple_set_uid (stmt, 0);
!       vec_bb->add_to_end (add_stmt (stmt));
      }
+   blocks.quick_push (vec_bb);
  
    bb->aux = this;
  }
Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	2018-07-30 12:43:34.512651826 +0100
--- gcc/tree-vect-stmts.c	2018-07-30 12:43:34.504651897 +0100
*************** process_use (stmt_vec_info stmt_vinfo, t
*** 612,623 ****
  bool
  vect_mark_stmts_to_be_vectorized (loop_vec_info loop_vinfo)
  {
-   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-   unsigned int nbbs = loop->num_nodes;
-   gimple_stmt_iterator si;
    unsigned int i;
-   basic_block bb;
    bool live_p;
    enum vect_relevant relevant;
  
--- 612,618 ----
*************** vect_mark_stmts_to_be_vectorized (loop_v
*** 626,659 ****
    auto_vec<stmt_vec_info, 64> worklist;
  
    /* 1. Init worklist.  */
!   for (i = 0; i < nbbs; i++)
!     {
!       bb = bbs[i];
!       for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
! 	{
! 	  stmt_vec_info phi_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  if (dump_enabled_p ())
! 	    {
! 	      dump_printf_loc (MSG_NOTE, vect_location, "init: phi relevant? ");
! 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi_info->stmt, 0);
! 	    }
! 
! 	  if (vect_stmt_relevant_p (phi_info, loop_vinfo, &relevant, &live_p))
! 	    vect_mark_relevant (&worklist, phi_info, relevant, live_p);
! 	}
!       for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
! 	{
! 	  stmt_vec_info stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
! 	  if (dump_enabled_p ())
! 	    {
! 	      dump_printf_loc (MSG_NOTE, vect_location, "init: stmt relevant? ");
! 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
! 	    }
! 
! 	  if (vect_stmt_relevant_p (stmt_info, loop_vinfo, &relevant, &live_p))
! 	    vect_mark_relevant (&worklist, stmt_info, relevant, live_p);
! 	}
!     }
  
    /* 2. Process_worklist */
    while (worklist.length () > 0)
--- 621,631 ----
    auto_vec<stmt_vec_info, 64> worklist;
  
    /* 1. Init worklist.  */
!   vec_basic_block *vec_bb;
!   FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
!     FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (vect_stmt_relevant_p (stmt_info, loop_vinfo, &relevant, &live_p))
! 	vect_mark_relevant (&worklist, stmt_info, relevant, live_p);
  
    /* 2. Process_worklist */
    while (worklist.length () > 0)
*************** vect_finish_replace_stmt (stmt_vec_info
*** 1753,1758 ****
--- 1725,1731 ----
  
    gimple_stmt_iterator gsi = gsi_for_stmt (stmt_info->stmt);
    gsi_replace (&gsi, vec_stmt, false);
+   stmt_info->block->remove (stmt_info);
  
    return vect_finish_stmt_generation_1 (stmt_info, vec_stmt);
  }
*************** hoist_defs_of_uses (stmt_vec_info stmt_i
*** 7352,7357 ****
--- 7325,7331 ----
        {
  	gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt_info->stmt);
  	gsi_remove (&gsi, false);
+ 	def_stmt_info->block->remove (def_stmt_info);
  	gsi_insert_on_edge_immediate (loop_preheader_edge (loop),
  				      def_stmt_info->stmt);
        }
*************** vectorizable_condition (stmt_vec_info st
*** 9066,9071 ****
--- 9040,9046 ----
  		  gimple_stmt_iterator old_gsi
  		    = gsi_for_stmt (stmt_info->stmt);
  		  gsi_remove (&old_gsi, true);
+ 		  stmt_info->block->remove (stmt_info);
  		  new_stmt_info
  		    = vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
  		}
Richard Sandiford July 30, 2018, 11:47 a.m.
The point of this patch is to put pattern statements in the same
vec_basic_block as the statements they replace, with the pattern
statements for a statement S coming between S and S's original
predecessor.  This removes the need to handle pattern statements
specially in various places.

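As a self-contained sketch of the intended ordering invariant (all names
here — stmt_rec, vec_block, etc. — are invented stand-ins for GCC's
stmt_vec_info and vec_basic_block, not the real types):

```cpp
#include <cassert>

/* Hypothetical stand-in for stmt_vec_info: just the linkage fields
   that matter for the ordering invariant.  */
struct stmt_rec
{
  stmt_rec *prev = nullptr;
  stmt_rec *next = nullptr;
  stmt_rec *orig = nullptr;	/* For a pattern stmt, the scalar stmt
				   it replaces.  */
  bool pattern_p = false;
};

/* Hypothetical stand-in for vec_basic_block: a doubly-linked list of
   stmt_recs in (adjusted) scalar order.  */
struct vec_block
{
  stmt_rec *head = nullptr;
  stmt_rec *tail = nullptr;

  void append (stmt_rec *s)
  {
    s->prev = tail;
    s->next = nullptr;
    (tail ? tail->next : head) = s;
    tail = s;
  }

  /* Insert NEW_S immediately before AT, mirroring the patch's use of
     add_before when attaching pattern statements.  */
  void add_before (stmt_rec *new_s, stmt_rec *at)
  {
    new_s->next = at;
    new_s->prev = at->prev;
    (at->prev ? at->prev->next : head) = new_s;
    at->prev = new_s;
  }

  /* Unlink S, as vec_basic_block::remove does when a statement is
     replaced or deleted.  */
  void remove (stmt_rec *s)
  {
    (s->prev ? s->prev->next : head) = s->next;
    (s->next ? s->next->prev : tail) = s->prev;
  }
};

/* The invariant: every pattern stmt P with vect_orig_stmt (P) == S
   forms a run that sits immediately before S in the block.  */
inline stmt_rec *
vect_orig_stmt (stmt_rec *s)
{
  return s->pattern_p ? s->orig : s;
}
```

For example, replacing S2 with pattern statements P1 and P2 yields the
block order S1, P1, P2, S2, so a walk over the block visits the pattern
statements in place with no STMT_VINFO_PATTERN_DEF_SEQ indirection.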

2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vec_basic_block): Expand comment.
	(_stmt_vec_info::pattern_def_seq): Delete.
	(STMT_VINFO_PATTERN_DEF_SEQ): Likewise.
	(is_main_pattern_stmt_p): New function.
	* tree-vect-loop.c (vect_determine_vf_for_stmt_1): Rename to...
	(vect_determine_vf_for_stmt): ...this, deleting the original
	function with this name.  Remove vectype_maybe_set_p argument
	and test is_pattern_stmt_p instead.  Retain the "examining..."
	message from the previous vect_determine_vf_for_stmt.
	(vect_compute_single_scalar_iteration_cost, vect_update_vf_for_slp)
	(vect_analyze_loop_2): Don't treat pattern statements specially.
	(vect_transform_loop): Likewise.  Use vect_orig_stmt to find the
	insertion point.
	* tree-vect-slp.c (vect_detect_hybrid_slp): Expect pattern statements
	to be in the statement list, without needing to follow
	STMT_VINFO_RELATED_STMT.  Remove PATTERN_DEF_SEQ handling.
	* tree-vect-stmts.c (vect_analyze_stmt): Don't handle pattern
	statements specially.
	(vect_remove_dead_scalar_stmts): Ignore pattern statements.
	* tree-vect-patterns.c (vect_set_pattern_stmt): Insert the pattern
	statement into the vec_basic_block immediately before the statement
	it replaces.
	(append_pattern_def_seq): Likewise.  If the original statement is
	itself a pattern statement, associate the new one with the original
	statement.
	(vect_split_statement): Use append_pattern_def_seq to insert the
	first pattern statement.
	(vect_recog_vector_vector_shift_pattern): Remove mention of
	STMT_VINFO_PATTERN_DEF_SEQ.
	(adjust_bool_stmts): Get the last pattern statement from the
	stmt_vec_info chain.
	(vect_mark_pattern_stmts): Rename to...
	(vect_replace_stmt_with_pattern): ...this.  Remove the
	PATTERN_DEF_SEQ handling and process only the pattern statement given.
	Use append_pattern_def_seq when replacing a pattern statement with
	another pattern statement, and use vec_basic_block::remove instead
	of gsi_remove to remove the old one.
	(vect_pattern_recog_1): Update accordingly.  Remove PATTERN_DEF_SEQ
	handling.  On failure, remove any half-formed pattern sequence from
	the vec_basic_block.  Install the vector type in pattern statements
	that don't yet have one.
	(vect_pattern_recog): Iterate over statements that are added
	by previous recognizers, but skipping those that have already
	been replaced, or the main pattern statement in such a replacement.

Index: gcc/tree-vectorizer.h
===================================================================
*** gcc/tree-vectorizer.h	2018-07-30 12:32:46.658356275 +0100
--- gcc/tree-vectorizer.h	2018-07-30 12:32:49.898327734 +0100
*************** #define SLP_TREE_TWO_OPERATORS(S)		 (S)-
*** 172,178 ****
  #define SLP_TREE_DEF_TYPE(S)			 (S)->def_type
  
  /* Information about the phis and statements in a block that we're trying
!    to vectorize, in their original order.  */
  class vec_basic_block
  {
  public:
--- 172,184 ----
  #define SLP_TREE_DEF_TYPE(S)			 (S)->def_type
  
  /* Information about the phis and statements in a block that we're trying
!    to vectorize.  This includes the phis and statements that were in the
!    original scalar code, in their original order.  It also includes any
!    pattern statements that the vectorizer has created to replace some
!    of the scalar ones.  Such pattern statements come immediately before
!    the statement that they replace; that is, all pattern statements P for
!    which vect_orig_stmt (P) == S form a sequence that comes immediately
!    before S.  */
  class vec_basic_block
  {
  public:
*************** struct _stmt_vec_info {
*** 870,880 ****
          pattern).  */
    stmt_vec_info related_stmt;
  
-   /* Used to keep a sequence of def stmts of a pattern stmt if such exists.
-      The sequence is attached to the original statement rather than the
-      pattern statement.  */
-   gimple_seq pattern_def_seq;
- 
    /* List of datarefs that are known to have the same alignment as the dataref
       of this stmt.  */
    vec<dr_p> same_align_refs;
--- 876,881 ----
*************** #define STMT_VINFO_DR_INFO(S) \
*** 1048,1054 ****
  
  #define STMT_VINFO_IN_PATTERN_P(S)         (S)->in_pattern_p
  #define STMT_VINFO_RELATED_STMT(S)         (S)->related_stmt
- #define STMT_VINFO_PATTERN_DEF_SEQ(S)      (S)->pattern_def_seq
  #define STMT_VINFO_SAME_ALIGN_REFS(S)      (S)->same_align_refs
  #define STMT_VINFO_SIMD_CLONE_INFO(S)	   (S)->simd_clone_info
  #define STMT_VINFO_DEF_TYPE(S)             (S)->def_type
--- 1049,1054 ----
*************** is_pattern_stmt_p (stmt_vec_info stmt_in
*** 1176,1181 ****
--- 1176,1192 ----
    return stmt_info->pattern_stmt_p;
  }
  
+ /* Return TRUE if a statement represented by STMT_INFO is the final
+    statement in a pattern.  */
+ 
+ static inline bool
+ is_main_pattern_stmt_p (stmt_vec_info stmt_info)
+ {
+   stmt_vec_info orig_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
+   return (is_pattern_stmt_p (stmt_info)
+ 	  && STMT_VINFO_RELATED_STMT (orig_stmt_info) == stmt_info);
+ }
+ 
  /* If STMT_INFO is a pattern statement, return the statement that it
     replaces, otherwise return STMT_INFO itself.  */
  
Index: gcc/tree-vect-loop.c
===================================================================
*** gcc/tree-vect-loop.c	2018-07-30 12:32:46.654356310 +0100
--- gcc/tree-vect-loop.c	2018-07-30 12:32:49.894327770 +0100
*************** Software Foundation; either version 3, o
*** 155,172 ****
  
  static void vect_estimate_min_profitable_iters (loop_vec_info, int *, int *);
  
! /* Subroutine of vect_determine_vf_for_stmt that handles only one
!    statement.  VECTYPE_MAYBE_SET_P is true if STMT_VINFO_VECTYPE
!    may already be set for general statements (not just data refs).  */
  
  static bool
! vect_determine_vf_for_stmt_1 (stmt_vec_info stmt_info,
! 			      bool vectype_maybe_set_p,
! 			      poly_uint64 *vf,
! 			      vec<stmt_vec_info > *mask_producers)
  {
    gimple *stmt = stmt_info->stmt;
  
    if ((!STMT_VINFO_RELEVANT_P (stmt_info)
         && !STMT_VINFO_LIVE_P (stmt_info))
        || gimple_clobber_p (stmt))
--- 155,178 ----
  
  static void vect_estimate_min_profitable_iters (loop_vec_info, int *, int *);
  
! /* Subroutine of vect_determine_vectorization_factor.  Set the vector
!    type of STMT_INFO and update the vectorization factor VF accordingly.
!    If the statement produces a mask result whose vector type can only be
!    calculated later, add it to MASK_PRODUCERS.  Return true on success
!    or false if something prevented vectorization.  */
  
  static bool
! vect_determine_vf_for_stmt (stmt_vec_info stmt_info, poly_uint64 *vf,
! 			    vec<stmt_vec_info > *mask_producers)
  {
    gimple *stmt = stmt_info->stmt;
  
+   if (dump_enabled_p ())
+     {
+       dump_printf_loc (MSG_NOTE, vect_location, "==> examining statement: ");
+       dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
+     }
+ 
    if ((!STMT_VINFO_RELEVANT_P (stmt_info)
         && !STMT_VINFO_LIVE_P (stmt_info))
        || gimple_clobber_p (stmt))
*************** vect_determine_vf_for_stmt_1 (stmt_vec_i
*** 188,194 ****
  	   that contain a data ref, or for "pattern-stmts" (stmts generated
  	   by the vectorizer to represent/replace a certain idiom).  */
  	gcc_assert ((STMT_VINFO_DATA_REF (stmt_info)
! 		     || vectype_maybe_set_p)
  		    && STMT_VINFO_VECTYPE (stmt_info) == stmt_vectype);
        else if (stmt_vectype == boolean_type_node)
  	mask_producers->safe_push (stmt_info);
--- 194,200 ----
  	   that contain a data ref, or for "pattern-stmts" (stmts generated
  	   by the vectorizer to represent/replace a certain idiom).  */
  	gcc_assert ((STMT_VINFO_DATA_REF (stmt_info)
! 		     || is_pattern_stmt_p (stmt_info))
  		    && STMT_VINFO_VECTYPE (stmt_info) == stmt_vectype);
        else if (stmt_vectype == boolean_type_node)
  	mask_producers->safe_push (stmt_info);
*************** vect_determine_vf_for_stmt_1 (stmt_vec_i
*** 202,263 ****
    return true;
  }
  
- /* Subroutine of vect_determine_vectorization_factor.  Set the vector
-    types of STMT_INFO and all attached pattern statements and update
-    the vectorization factor VF accordingly.  If some of the statements
-    produce a mask result whose vector type can only be calculated later,
-    add them to MASK_PRODUCERS.  Return true on success or false if
-    something prevented vectorization.  */
- 
- static bool
- vect_determine_vf_for_stmt (stmt_vec_info stmt_info, poly_uint64 *vf,
- 			    vec<stmt_vec_info > *mask_producers)
- {
-   vec_info *vinfo = stmt_info->vinfo;
-   if (dump_enabled_p ())
-     {
-       dump_printf_loc (MSG_NOTE, vect_location, "==> examining statement: ");
-       dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
-     }
-   if (!vect_determine_vf_for_stmt_1 (stmt_info, false, vf, mask_producers))
-     return false;
- 
-   if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-       && STMT_VINFO_RELATED_STMT (stmt_info))
-     {
-       gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
-       stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
- 
-       /* If a pattern statement has def stmts, analyze them too.  */
-       for (gimple_stmt_iterator si = gsi_start (pattern_def_seq);
- 	   !gsi_end_p (si); gsi_next (&si))
- 	{
- 	  stmt_vec_info def_stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
- 	  if (dump_enabled_p ())
- 	    {
- 	      dump_printf_loc (MSG_NOTE, vect_location,
- 			       "==> examining pattern def stmt: ");
- 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM,
- 				def_stmt_info->stmt, 0);
- 	    }
- 	  if (!vect_determine_vf_for_stmt_1 (def_stmt_info, true,
- 					     vf, mask_producers))
- 	    return false;
- 	}
- 
-       if (dump_enabled_p ())
- 	{
- 	  dump_printf_loc (MSG_NOTE, vect_location,
- 			   "==> examining pattern statement: ");
- 	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
- 	}
-       if (!vect_determine_vf_for_stmt_1 (stmt_info, true, vf, mask_producers))
- 	return false;
-     }
- 
-   return true;
- }
- 
  /* Function vect_determine_vectorization_factor
  
     Determine the vectorization factor (VF).  VF is the number of data elements
--- 208,213 ----
*************** vect_compute_single_scalar_iteration_cos
*** 1078,1086 ****
            /* Skip stmts that are not vectorized inside the loop.  */
            if (stmt_info
                && !STMT_VINFO_RELEVANT_P (stmt_info)
!               && (!STMT_VINFO_LIVE_P (stmt_info)
!                   || !VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
! 	      && !STMT_VINFO_IN_PATTERN_P (stmt_info))
              continue;
  
  	  vect_cost_for_stmt kind;
--- 1028,1035 ----
            /* Skip stmts that are not vectorized inside the loop.  */
            if (stmt_info
                && !STMT_VINFO_RELEVANT_P (stmt_info)
! 	      && (!VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info))
! 		  || !STMT_VINFO_LIVE_P (stmt_info)))
              continue;
  
  	  vect_cost_for_stmt kind;
*************** vect_update_vf_for_slp (loop_vec_info lo
*** 1394,1407 ****
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	stmt_vec_info final_info = vect_stmt_to_vectorize (stmt_info);
! 	if ((STMT_VINFO_RELEVANT_P (final_info)
! 	     || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (final_info)))
! 	    && !PURE_SLP_STMT (final_info))
! 	  /* STMT needs both SLP and loop-based vectorization.  */
! 	  only_slp_in_loop = false;
!       }
  
    if (only_slp_in_loop)
      {
--- 1343,1353 ----
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if ((STMT_VINFO_RELEVANT_P (stmt_info)
! 	   || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
! 	  && !PURE_SLP_STMT (stmt_info))
! 	/* STMT needs both SLP and loop-based vectorization.  */
! 	only_slp_in_loop = false;
  
    if (only_slp_in_loop)
      {
*************** vect_analyze_loop_2 (loop_vec_info loop_
*** 2164,2181 ****
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	STMT_SLP_TYPE (stmt_info) = loop_vect;
! 	if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 	  {
! 	    gimple *pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 	    STMT_SLP_TYPE (STMT_VINFO_RELATED_STMT (stmt_info)) = loop_vect;
! 	    for (gimple_stmt_iterator pi = gsi_start (pattern_def_seq);
! 		 !gsi_end_p (pi); gsi_next (&pi))
! 	      STMT_SLP_TYPE (loop_vinfo->lookup_stmt (gsi_stmt (pi)))
! 		= loop_vect;
! 	  }
!       }
  
    /* Free optimized alias test DDRS.  */
    LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
--- 2110,2116 ----
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       STMT_SLP_TYPE (stmt_info) = loop_vect;
  
    /* Free optimized alias test DDRS.  */
    LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).truncate (0);
*************** vect_transform_loop (loop_vec_info loop_
*** 8371,8392 ****
  	    loop_vinfo->remove_stmt (stmt_info);
  	  else
  	    {
! 	      gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
! 	      if (STMT_VINFO_IN_PATTERN_P (stmt_info))
! 		{
! 		  gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info);
! 		  for (gimple_stmt_iterator subsi = gsi_start (def_seq);
! 		       !gsi_end_p (subsi); gsi_next (&subsi))
! 		    {
! 		      stmt_vec_info pat_stmt_info
! 			= loop_vinfo->lookup_stmt (gsi_stmt (subsi));
! 		      vect_transform_loop_stmt (loop_vinfo, pat_stmt_info,
! 						&si);
! 		    }
! 		  stmt_vec_info pat_stmt_info
! 		    = STMT_VINFO_RELATED_STMT (stmt_info);
! 		  vect_transform_loop_stmt (loop_vinfo, pat_stmt_info, &si);
! 		}
  	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si);
  	    }
  	}
--- 8306,8313 ----
  	    loop_vinfo->remove_stmt (stmt_info);
  	  else
  	    {
! 	      stmt_vec_info place = vect_orig_stmt (stmt_info);
! 	      gimple_stmt_iterator si = gsi_for_stmt (place->stmt);
  	      vect_transform_loop_stmt (loop_vinfo, stmt_info, &si);
  	    }
  	}
Index: gcc/tree-vect-slp.c
===================================================================
*** gcc/tree-vect-slp.c	2018-07-30 12:32:46.654356310 +0100
--- gcc/tree-vect-slp.c	2018-07-30 12:32:49.894327770 +0100
*************** vect_detect_hybrid_slp (loop_vec_info lo
*** 2411,2428 ****
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (STMT_VINFO_IN_PATTERN_P (stmt_info))
  	{
  	  walk_stmt_info wi;
  	  memset (&wi, 0, sizeof (wi));
  	  wi.info = loop_vinfo;
! 	  gimple_stmt_iterator gsi2
! 	    = gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)->stmt);
! 	  walk_gimple_stmt (&gsi2, vect_detect_hybrid_slp_2,
  			    vect_detect_hybrid_slp_1, &wi);
- 	  walk_gimple_seq (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
- 			   vect_detect_hybrid_slp_2,
- 			   vect_detect_hybrid_slp_1, &wi);
  	}
  
    /* Then walk the SLP instance trees marking stmts with uses in
--- 2411,2424 ----
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (loop_vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (is_pattern_stmt_p (stmt_info))
  	{
  	  walk_stmt_info wi;
  	  memset (&wi, 0, sizeof (wi));
  	  wi.info = loop_vinfo;
! 	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt_info->stmt);
! 	  walk_gimple_stmt (&gsi, vect_detect_hybrid_slp_2,
  			    vect_detect_hybrid_slp_1, &wi);
  	}
  
    /* Then walk the SLP instance trees marking stmts with uses in
Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	2018-07-30 12:32:46.658356275 +0100
--- gcc/tree-vect-stmts.c	2018-07-30 12:32:49.898327734 +0100
*************** vect_analyze_stmt (stmt_vec_info stmt_in
*** 9384,9394 ****
  		   slp_tree node, slp_instance node_instance,
  		   stmt_vector_for_cost *cost_vec)
  {
-   vec_info *vinfo = stmt_info->vinfo;
    bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
    enum vect_relevant relevance = STMT_VINFO_RELEVANT (stmt_info);
    bool ok;
-   gimple_seq pattern_def_seq;
  
    if (dump_enabled_p ())
      {
--- 9384,9392 ----
*************** vect_analyze_stmt (stmt_vec_info stmt_in
*** 9405,9498 ****
        return false;
      }
  
-   if (STMT_VINFO_IN_PATTERN_P (stmt_info)
-       && node == NULL
-       && (pattern_def_seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info)))
-     {
-       gimple_stmt_iterator si;
- 
-       for (si = gsi_start (pattern_def_seq); !gsi_end_p (si); gsi_next (&si))
- 	{
- 	  stmt_vec_info pattern_def_stmt_info
- 	    = vinfo->lookup_stmt (gsi_stmt (si));
- 	  if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info)
- 	      || STMT_VINFO_LIVE_P (pattern_def_stmt_info))
- 	    {
- 	      /* Analyze def stmt of STMT if it's a pattern stmt.  */
- 	      if (dump_enabled_p ())
- 		{
- 		  dump_printf_loc (MSG_NOTE, vect_location,
- 				   "==> examining pattern def statement: ");
- 		  dump_gimple_stmt (MSG_NOTE, TDF_SLIM,
- 				    pattern_def_stmt_info->stmt, 0);
- 		}
- 
- 	      if (!vect_analyze_stmt (pattern_def_stmt_info,
- 				      need_to_vectorize, node, node_instance,
- 				      cost_vec))
- 		return false;
- 	    }
- 	}
-     }
- 
    /* Skip stmts that do not need to be vectorized. In loops this is expected
       to include:
       - the COND_EXPR which is the loop exit condition
       - any LABEL_EXPRs in the loop
       - computations that are used only for array indexing or loop control.
       In basic blocks we only analyze statements that are a part of some SLP
!      instance, therefore, all the statements are relevant.
! 
!      Pattern statement needs to be analyzed instead of the original statement
!      if the original statement is not relevant.  Otherwise, we analyze both
!      statements.  In basic blocks we are called from some SLP instance
!      traversal, don't analyze pattern stmts instead, the pattern stmts
!      already will be part of SLP instance.  */
  
-   stmt_vec_info pattern_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
    if (!STMT_VINFO_RELEVANT_P (stmt_info)
        && !STMT_VINFO_LIVE_P (stmt_info))
      {
-       if (STMT_VINFO_IN_PATTERN_P (stmt_info)
- 	  && pattern_stmt_info
- 	  && (STMT_VINFO_RELEVANT_P (pattern_stmt_info)
- 	      || STMT_VINFO_LIVE_P (pattern_stmt_info)))
-         {
-           /* Analyze PATTERN_STMT instead of the original stmt.  */
- 	  stmt_info = pattern_stmt_info;
-           if (dump_enabled_p ())
-             {
-               dump_printf_loc (MSG_NOTE, vect_location,
-                                "==> examining pattern statement: ");
- 	      dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
-             }
-         }
-       else
-         {
-           if (dump_enabled_p ())
-             dump_printf_loc (MSG_NOTE, vect_location, "irrelevant.\n");
- 
-           return true;
-         }
-     }
-   else if (STMT_VINFO_IN_PATTERN_P (stmt_info)
- 	   && node == NULL
- 	   && pattern_stmt_info
- 	   && (STMT_VINFO_RELEVANT_P (pattern_stmt_info)
- 	       || STMT_VINFO_LIVE_P (pattern_stmt_info)))
-     {
-       /* Analyze PATTERN_STMT too.  */
        if (dump_enabled_p ())
!         {
!           dump_printf_loc (MSG_NOTE, vect_location,
!                            "==> examining pattern statement: ");
! 	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt_info->stmt, 0);
!         }
! 
!       if (!vect_analyze_stmt (pattern_stmt_info, need_to_vectorize, node,
! 			      node_instance, cost_vec))
!         return false;
!    }
  
    switch (STMT_VINFO_DEF_TYPE (stmt_info))
      {
--- 9403,9423 ----
        return false;
      }
  
    /* Skip stmts that do not need to be vectorized. In loops this is expected
       to include:
       - the COND_EXPR which is the loop exit condition
       - any LABEL_EXPRs in the loop
       - computations that are used only for array indexing or loop control.
       In basic blocks we only analyze statements that are a part of some SLP
!      instance, therefore, all the statements are relevant.  */
  
    if (!STMT_VINFO_RELEVANT_P (stmt_info)
        && !STMT_VINFO_LIVE_P (stmt_info))
      {
        if (dump_enabled_p ())
! 	dump_printf_loc (MSG_NOTE, vect_location, "irrelevant.\n");
!       return true;
!     }
  
    switch (STMT_VINFO_DEF_TYPE (stmt_info))
      {
*************** vect_remove_dead_scalar_stmts (vec_info
*** 10915,10921 ****
  	   stmt_info = prev_stmt_info)
  	{
  	  prev_stmt_info = stmt_info->prev;
! 	  vect_maybe_remove_scalar_stmt (stmt_info);
  	}
      }
  }
--- 10840,10847 ----
  	   stmt_info = prev_stmt_info)
  	{
  	  prev_stmt_info = stmt_info->prev;
! 	  if (!is_pattern_stmt_p (stmt_info))
! 	    vect_maybe_remove_scalar_stmt (stmt_info);
  	}
      }
  }
Index: gcc/tree-vect-patterns.c
===================================================================
*** gcc/tree-vect-patterns.c	2018-07-30 12:32:42.786390386 +0100
--- gcc/tree-vect-patterns.c	2018-07-30 12:32:49.894327770 +0100
*************** vect_init_pattern_stmt (gimple *pattern_
*** 125,151 ****
  vect_set_pattern_stmt (gimple *pattern_stmt, stmt_vec_info orig_stmt_info,
  		       tree vectype)
  {
!   STMT_VINFO_IN_PATTERN_P (orig_stmt_info) = true;
!   STMT_VINFO_RELATED_STMT (orig_stmt_info)
      = vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, vectype);
  }
  
! /* Add NEW_STMT to STMT_INFO's pattern definition statements.  If VECTYPE
!    is nonnull, record that NEW_STMT's vector type is VECTYPE, which might
!    be different from the vector type of the final pattern statement.  */
  
  static inline void
  append_pattern_def_seq (stmt_vec_info stmt_info, gimple *new_stmt,
  			tree vectype = NULL_TREE)
  {
!   vec_info *vinfo = stmt_info->vinfo;
!   if (vectype)
!     {
!       stmt_vec_info new_stmt_info = vinfo->add_stmt (new_stmt);
!       STMT_VINFO_VECTYPE (new_stmt_info) = vectype;
!     }
!   gimple_seq_add_stmt_without_update (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
! 				      new_stmt);
  }
  
  /* The caller wants to perform new operations on vect_external variable
--- 125,150 ----
  vect_set_pattern_stmt (gimple *pattern_stmt, stmt_vec_info orig_stmt_info,
  		       tree vectype)
  {
!   stmt_vec_info pattern_stmt_info
      = vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, vectype);
+   orig_stmt_info->block->add_before (pattern_stmt_info, orig_stmt_info);
+   STMT_VINFO_IN_PATTERN_P (orig_stmt_info) = true;
+   STMT_VINFO_RELATED_STMT (orig_stmt_info) = pattern_stmt_info;
  }
  
! /* Add NEW_STMT to the pattern statements that replace STMT_INFO.
!    If VECTYPE is nonnull, record that NEW_STMT's vector type is VECTYPE,
!    which might be different from the vector type of the final pattern
!    statement.  */
  
  static inline void
  append_pattern_def_seq (stmt_vec_info stmt_info, gimple *new_stmt,
  			tree vectype = NULL_TREE)
  {
!   stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
!   stmt_vec_info new_stmt_info
!     = vect_init_pattern_stmt (new_stmt, orig_stmt_info, vectype);
!   stmt_info->block->add_before (new_stmt_info, stmt_info);
  }
  
  /* The caller wants to perform new operations on vect_external variable
*************** vect_split_statement (stmt_vec_info stmt
*** 633,643 ****
  {
    if (is_pattern_stmt_p (stmt2_info))
      {
-       /* STMT2_INFO is part of a pattern.  Get the statement to which
- 	 the pattern is attached.  */
-       stmt_vec_info orig_stmt2_info = STMT_VINFO_RELATED_STMT (stmt2_info);
-       vect_init_pattern_stmt (stmt1, orig_stmt2_info, vectype);
- 
        if (dump_enabled_p ())
  	{
  	  dump_printf_loc (MSG_NOTE, vect_location,
--- 632,637 ----
*************** vect_split_statement (stmt_vec_info stmt
*** 645,650 ****
--- 639,647 ----
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
  	}
  
+       /* Insert STMT1_INFO before STMT2_INFO.  */
+       append_pattern_def_seq (stmt2_info, stmt1, vectype);
+ 
        /* Since STMT2_INFO is a pattern statement, we can change it
  	 in-situ without worrying about changing the code for the
  	 containing block.  */
*************** vect_split_statement (stmt_vec_info stmt
*** 658,675 ****
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
  	}
  
-       gimple_seq *def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt2_info);
-       if (STMT_VINFO_RELATED_STMT (orig_stmt2_info) == stmt2_info)
- 	/* STMT2_INFO is the actual pattern statement.  Add STMT1
- 	   to the end of the definition sequence.  */
- 	gimple_seq_add_stmt_without_update (def_seq, stmt1);
-       else
- 	{
- 	  /* STMT2_INFO belongs to the definition sequence.  Insert STMT1
- 	     before it.  */
- 	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt2_info->stmt, def_seq);
- 	  gsi_insert_before_without_update (&gsi, stmt1, GSI_SAME_STMT);
- 	}
        return true;
      }
    else
--- 655,660 ----
*************** vect_split_statement (stmt_vec_info stmt
*** 689,698 ****
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
  	}
  
!       /* Add STMT1 as a singleton pattern definition sequence.  */
!       gimple_seq *def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (stmt2_info);
!       vect_init_pattern_stmt (stmt1, stmt2_info, vectype);
!       gimple_seq_add_stmt_without_update (def_seq, stmt1);
  
        /* Build the second of the two pattern statements.  */
        tree new_lhs = vect_recog_temp_ssa_var (lhs_type, NULL);
--- 674,681 ----
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt2_info->stmt, 0);
  	}
  
!       /* Insert STMT1_INFO before STMT2_INFO.  */
!       append_pattern_def_seq (stmt2_info, stmt1, vectype);
  
        /* Build the second of the two pattern statements.  */
        tree new_lhs = vect_recog_temp_ssa_var (lhs_type, NULL);
*************** vect_recog_rotate_pattern (stmt_vec_info
*** 2164,2170 ****
      i.e. the shift/rotate stmt.  The original stmt (S3) is replaced
      with a shift/rotate which has same type on both operands, in the
      second case just b_T op c_T, in the first case with added cast
!     from a_t to c_T in STMT_VINFO_PATTERN_DEF_SEQ.
  
    Output:
  
--- 2147,2153 ----
      i.e. the shift/rotate stmt.  The original stmt (S3) is replaced
      with a shift/rotate which has same type on both operands, in the
      second case just b_T op c_T, in the first case with added cast
!     from a_t to c_T beforehand.
  
    Output:
  
*************** adjust_bool_stmts (hash_set <gimple *> &
*** 3518,3526 ****
      adjust_bool_pattern (gimple_assign_lhs (bool_stmts[i]),
  			 out_type, stmt_info, defs);
  
!   /* Pop the last pattern seq stmt and install it as pattern root for STMT.  */
!   gimple *pattern_stmt
!     = gimple_seq_last_stmt (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
    return gimple_assign_lhs (pattern_stmt);
  }
  
--- 3501,3508 ----
      adjust_bool_pattern (gimple_assign_lhs (bool_stmts[i]),
  			 out_type, stmt_info, defs);
  
!   /* Return the result of the last statement we emitted.  */
!   gimple *pattern_stmt = stmt_info->prev->stmt;
    return gimple_assign_lhs (pattern_stmt);
  }
  
*************** static vect_recog_func vect_vect_recog_f
*** 4676,4689 ****
  
  const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
  
! /* Mark statements that are involved in a pattern.  */
  
  static inline void
! vect_mark_pattern_stmts (stmt_vec_info orig_stmt_info, gimple *pattern_stmt,
!                          tree pattern_vectype)
  {
-   gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
- 
    gimple *orig_pattern_stmt = NULL;
    if (is_pattern_stmt_p (orig_stmt_info))
      {
--- 4658,4671 ----
  
  const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
  
! /* Replace ORIG_STMT_INFO with PATTERN_STMT, using PATTERN_VECTYPE as
!    the vector type for PATTERN_STMT.  */
  
  static inline void
! vect_replace_stmt_with_pattern (stmt_vec_info orig_stmt_info,
! 				gimple *pattern_stmt,
! 				tree pattern_vectype)
  {
    gimple *orig_pattern_stmt = NULL;
    if (is_pattern_stmt_p (orig_stmt_info))
      {
*************** vect_mark_pattern_stmts (stmt_vec_info o
*** 4710,4741 ****
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
  	}
  
-       /* Switch to the statement that ORIG replaces.  */
-       orig_stmt_info = STMT_VINFO_RELATED_STMT (orig_stmt_info);
- 
        /* We shouldn't be replacing the main pattern statement.  */
!       gcc_assert (STMT_VINFO_RELATED_STMT (orig_stmt_info)->stmt
! 		  != orig_pattern_stmt);
!     }
  
!   if (def_seq)
!     for (gimple_stmt_iterator si = gsi_start (def_seq);
! 	 !gsi_end_p (si); gsi_next (&si))
!       vect_init_pattern_stmt (gsi_stmt (si), orig_stmt_info, pattern_vectype);
! 
!   if (orig_pattern_stmt)
!     {
!       vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
! 
!       /* Insert all the new pattern statements before the original one.  */
!       gimple_seq *orig_def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
!       gimple_stmt_iterator gsi = gsi_for_stmt (orig_pattern_stmt,
! 					       orig_def_seq);
!       gsi_insert_seq_before_without_update (&gsi, def_seq, GSI_SAME_STMT);
!       gsi_insert_before_without_update (&gsi, pattern_stmt, GSI_SAME_STMT);
  
        /* Remove the pattern statement that this new pattern replaces.  */
!       gsi_remove (&gsi, false);
      }
    else
      vect_set_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
--- 4692,4705 ----
  	  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
  	}
  
        /* We shouldn't be replacing the main pattern statement.  */
!       gcc_assert (!is_main_pattern_stmt_p (orig_stmt_info));
  
!       /* Insert the new pattern statement before the original one.  */
!       append_pattern_def_seq (orig_stmt_info, pattern_stmt, pattern_vectype);
  
        /* Remove the pattern statement that this new pattern replaces.  */
!       orig_stmt_info->block->remove (orig_stmt_info);
      }
    else
      vect_set_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
*************** vect_mark_pattern_stmts (stmt_vec_info o
*** 4762,4791 ****
  static void
  vect_pattern_recog_1 (vect_recog_func *recog_func, stmt_vec_info stmt_info)
  {
-   vec_info *vinfo = stmt_info->vinfo;
    gimple *pattern_stmt;
    loop_vec_info loop_vinfo;
    tree pattern_vectype;
  
!   /* If this statement has already been replaced with pattern statements,
!      leave the original statement alone, since the first match wins.
!      Instead try to match against the definition statements that feed
!      the main pattern statement.  */
!   if (STMT_VINFO_IN_PATTERN_P (stmt_info))
!     {
!       gimple_stmt_iterator gsi;
!       for (gsi = gsi_start (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
! 	   !gsi_end_p (gsi); gsi_next (&gsi))
! 	vect_pattern_recog_1 (recog_func, vinfo->lookup_stmt (gsi_stmt (gsi)));
!       return;
!     }
! 
!   gcc_assert (!STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
    pattern_stmt = recog_func->fn (stmt_info, &pattern_vectype);
    if (!pattern_stmt)
      {
!       /* Clear any half-formed pattern definition sequence.  */
!       STMT_VINFO_PATTERN_DEF_SEQ (stmt_info) = NULL;
        return;
      }
  
--- 4726,4742 ----
  static void
  vect_pattern_recog_1 (vect_recog_func *recog_func, stmt_vec_info stmt_info)
  {
    gimple *pattern_stmt;
    loop_vec_info loop_vinfo;
    tree pattern_vectype;
  
!   stmt_vec_info prev_stmt_info = stmt_info->prev;
    pattern_stmt = recog_func->fn (stmt_info, &pattern_vectype);
    if (!pattern_stmt)
      {
!       /* Delete any half-formed pattern sequence.  */
!       while (stmt_info->prev != prev_stmt_info)
! 	stmt_info->block->remove (prev_stmt_info);
        return;
      }
  
*************** vect_pattern_recog_1 (vect_recog_func *r
*** 4800,4807 ****
        dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
      }
  
    /* Mark the stmts that are involved in the pattern. */
!   vect_mark_pattern_stmts (stmt_info, pattern_stmt, pattern_vectype);
  
    /* Patterns cannot be vectorized using SLP, because they change the order of
       computation.  */
--- 4751,4765 ----
        dump_gimple_stmt (MSG_NOTE, TDF_SLIM, pattern_stmt, 0);
      }
  
+   /* Install the vector type in pattern definition statements that
+      don't yet have one.  */
+   for (stmt_vec_info pat_stmt_info = stmt_info->prev;
+        pat_stmt_info != prev_stmt_info; pat_stmt_info = pat_stmt_info->prev)
+     if (!STMT_VINFO_VECTYPE (pat_stmt_info))
+       STMT_VINFO_VECTYPE (pat_stmt_info) = pattern_vectype;
+ 
    /* Mark the stmts that are involved in the pattern. */
!   vect_replace_stmt_with_pattern (stmt_info, pattern_stmt, pattern_vectype);
  
    /* Patterns cannot be vectorized using SLP, because they change the order of
       computation.  */
*************** vect_pattern_recog (vec_info *vinfo)
*** 4903,4910 ****
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       if (STMT_VINFO_VECTORIZABLE (stmt_info))
! 	/* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	for (unsigned int j = 0; j < NUM_PATTERNS; j++)
! 	  vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j], stmt_info);
  }
--- 4861,4887 ----
    vec_basic_block *vec_bb;
    FOR_EACH_VEC_ELT (vinfo->blocks, i, vec_bb)
      FOR_EACH_VEC_BB_STMT (vec_bb, stmt_info)
!       {
! 	stmt_vec_info begin_prev = stmt_info->prev;
! 	if (STMT_VINFO_VECTORIZABLE (stmt_info))
! 	  /* Scan over all generic vect_recog_xxx_pattern functions.  */
! 	  for (unsigned int j = 0; j < NUM_PATTERNS; j++)
! 	    {
! 	      stmt_vec_info curr_prev;
! 	      /* Scan over STMT_INFO and any pattern definition statements
! 		 that were introduced by previous recognizers.  */
! 	      for (stmt_vec_info curr_info = stmt_info;
! 		   curr_info != begin_prev; curr_info = curr_prev)
! 		{
! 		  curr_prev = curr_info->prev;
! 		  /* The first match wins, so skip statements that have
! 		     already been replaced, and the final statement with
! 		     which they were replaced.  */
! 		  if (!STMT_VINFO_IN_PATTERN_P (curr_info)
! 		      && !is_main_pattern_stmt_p (curr_info))
! 		    vect_pattern_recog_1 (&vect_vect_recog_func_ptrs[j],
! 					  curr_info);
! 		}
! 	    }
!       }
  }
Richard Biener Aug. 1, 2018, 12:52 p.m. | #6
On Mon, Jul 30, 2018 at 1:41 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>

> Invariant loads were handled as a variation on the code for contiguous

> loads.  We detected whether they were invariant or not as a byproduct of

> creating the vector pointer ivs: vect_create_data_ref_ptr passed back an

> inv_p to say whether the pointer was invariant.

>

> But vectorised invariant loads just keep the original scalar load,

> so this meant that detecting invariant loads had the side-effect of

> creating an unwanted vector pointer iv.  The placement of the code

> also meant that we'd create a vector load and then not use the result.

> In principle this is wrong code, since there's no guarantee that there's

> a vector's worth of accessible data at that address, but we rely on DCE

> to get rid of the load before any harm is done.

>

> E.g., for an invariant load in an inner loop (which seems like the more

> common use case for this code), we'd create:

>

>    vectp_a.6_52 = &a + 4;

>

>    # vectp_a.5_53 = PHI <vectp_a.5_54(9), vectp_a.6_52(2)>

>

>    # vectp_a.5_55 = PHI <vectp_a.5_53(3), vectp_a.5_56(10)>

>

>    vect_next_a_11.7_57 = MEM[(int *)vectp_a.5_55];

>    next_a_11 = a[_1];

>    vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};

>

>    vectp_a.5_56 = vectp_a.5_55 + 4;

>

>    vectp_a.5_54 = vectp_a.5_53 + 0;

>

> whereas all we want is:

>

>    next_a_11 = a[_1];

>    vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};

>

> This patch moves the handling to its own block and makes

> vect_create_data_ref_ptr assert (when creating a full iv) that the

> address isn't invariant.

>

> The ncopies handling is unfortunate, but a preexisting issue.

> Richi's suggestion of using a vector of vector statements would

> let us reuse one statement for all copies.


OK.

Richard.

>

> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

>

> gcc/

>         * tree-vectorizer.h (vect_create_data_ref_ptr): Remove inv_p

>         parameter.

>         * tree-vect-data-refs.c (vect_create_data_ref_ptr): Likewise.

>         When creating an iv, assert that the step is not known to be zero.

>         (vect_setup_realignment): Update call accordingly.

>         * tree-vect-stmts.c (vectorizable_store): Likewise.

>         (vectorizable_load): Likewise.  Handle VMAT_INVARIANT separately.

>

> Index: gcc/tree-vectorizer.h

> ===================================================================

> *** gcc/tree-vectorizer.h       2018-07-30 12:32:29.586506669 +0100

> --- gcc/tree-vectorizer.h       2018-07-30 12:40:13.000000000 +0100

> *************** extern bool vect_analyze_data_refs (vec_

> *** 1527,1533 ****

>   extern void vect_record_base_alignments (vec_info *);

>   extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,

>                                       tree *, gimple_stmt_iterator *,

> !                                     gimple **, bool, bool *,

>                                       tree = NULL_TREE, tree = NULL_TREE);

>   extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,

>                              stmt_vec_info, tree);

> --- 1527,1533 ----

>   extern void vect_record_base_alignments (vec_info *);

>   extern tree vect_create_data_ref_ptr (stmt_vec_info, tree, struct loop *, tree,

>                                       tree *, gimple_stmt_iterator *,

> !                                     gimple **, bool,

>                                       tree = NULL_TREE, tree = NULL_TREE);

>   extern tree bump_vector_ptr (tree, gimple *, gimple_stmt_iterator *,

>                              stmt_vec_info, tree);

> Index: gcc/tree-vect-data-refs.c

> ===================================================================

> *** gcc/tree-vect-data-refs.c   2018-07-30 12:32:26.214536374 +0100

> --- gcc/tree-vect-data-refs.c   2018-07-30 12:32:32.546480596 +0100

> *************** vect_create_addr_base_for_vector_ref (st

> *** 4674,4689 ****

>

>         Return the increment stmt that updates the pointer in PTR_INCR.

>

> !    3. Set INV_P to true if the access pattern of the data reference in the

> !       vectorized loop is invariant.  Set it to false otherwise.

> !

> !    4. Return the pointer.  */

>

>   tree

>   vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,

>                           struct loop *at_loop, tree offset,

>                           tree *initial_address, gimple_stmt_iterator *gsi,

> !                         gimple **ptr_incr, bool only_init, bool *inv_p,

>                           tree byte_offset, tree iv_step)

>   {

>     const char *base_name;

> --- 4674,4686 ----

>

>         Return the increment stmt that updates the pointer in PTR_INCR.

>

> !    3. Return the pointer.  */

>

>   tree

>   vect_create_data_ref_ptr (stmt_vec_info stmt_info, tree aggr_type,

>                           struct loop *at_loop, tree offset,

>                           tree *initial_address, gimple_stmt_iterator *gsi,

> !                         gimple **ptr_incr, bool only_init,

>                           tree byte_offset, tree iv_step)

>   {

>     const char *base_name;

> *************** vect_create_data_ref_ptr (stmt_vec_info

> *** 4705,4711 ****

>     bool insert_after;

>     tree indx_before_incr, indx_after_incr;

>     gimple *incr;

> -   tree step;

>     bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);

>

>     gcc_assert (iv_step != NULL_TREE

> --- 4702,4707 ----

> *************** vect_create_data_ref_ptr (stmt_vec_info

> *** 4726,4739 ****

>         *ptr_incr = NULL;

>       }

>

> -   /* Check the step (evolution) of the load in LOOP, and record

> -      whether it's invariant.  */

> -   step = vect_dr_behavior (dr_info)->step;

> -   if (integer_zerop (step))

> -     *inv_p = true;

> -   else

> -     *inv_p = false;

> -

>     /* Create an expression for the first address accessed by this load

>        in LOOP.  */

>     base_name = get_name (DR_BASE_ADDRESS (dr));

> --- 4722,4727 ----

> *************** vect_create_data_ref_ptr (stmt_vec_info

> *** 4849,4863 ****

>       aptr = aggr_ptr_init;

>     else

>       {

>         if (iv_step == NULL_TREE)

>         {

> !         /* The step of the aggregate pointer is the type size.  */

>           iv_step = TYPE_SIZE_UNIT (aggr_type);

> !         /* One exception to the above is when the scalar step of the load in

> !            LOOP is zero. In this case the step here is also zero.  */

> !         if (*inv_p)

> !           iv_step = size_zero_node;

> !         else if (tree_int_cst_sgn (step) == -1)

>             iv_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (iv_step), iv_step);

>         }

>

> --- 4837,4853 ----

>       aptr = aggr_ptr_init;

>     else

>       {

> +       /* Accesses to invariant addresses should be handled specially

> +        by the caller.  */

> +       tree step = vect_dr_behavior (dr_info)->step;

> +       gcc_assert (!integer_zerop (step));

> +

>         if (iv_step == NULL_TREE)

>         {

> !         /* The step of the aggregate pointer is the type size,

> !            negated for downward accesses.  */

>           iv_step = TYPE_SIZE_UNIT (aggr_type);

> !         if (tree_int_cst_sgn (step) == -1)

>             iv_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (iv_step), iv_step);

>         }

>

> *************** vect_setup_realignment (stmt_vec_info st

> *** 5462,5468 ****

>     gphi *phi_stmt;

>     tree msq = NULL_TREE;

>     gimple_seq stmts = NULL;

> -   bool inv_p;

>     bool compute_in_loop = false;

>     bool nested_in_vect_loop = false;

>     struct loop *containing_loop = (gimple_bb (stmt_info->stmt))->loop_father;

> --- 5452,5457 ----

> *************** vect_setup_realignment (stmt_vec_info st

> *** 5556,5562 ****

>         vec_dest = vect_create_destination_var (scalar_dest, vectype);

>         ptr = vect_create_data_ref_ptr (stmt_info, vectype,

>                                       loop_for_initial_load, NULL_TREE,

> !                                     &init_addr, NULL, &inc, true, &inv_p);

>         if (TREE_CODE (ptr) == SSA_NAME)

>         new_temp = copy_ssa_name (ptr);

>         else

> --- 5545,5551 ----

>         vec_dest = vect_create_destination_var (scalar_dest, vectype);

>         ptr = vect_create_data_ref_ptr (stmt_info, vectype,

>                                       loop_for_initial_load, NULL_TREE,

> !                                     &init_addr, NULL, &inc, true);

>         if (TREE_CODE (ptr) == SSA_NAME)

>         new_temp = copy_ssa_name (ptr);

>         else

> Index: gcc/tree-vect-stmts.c

> ===================================================================

> *** gcc/tree-vect-stmts.c       2018-07-30 12:32:29.586506669 +0100

> --- gcc/tree-vect-stmts.c       2018-07-30 12:40:14.000000000 +0100

> *************** vectorizable_store (stmt_vec_info stmt_i

> *** 6254,6260 ****

>     unsigned int group_size, i;

>     vec<tree> oprnds = vNULL;

>     vec<tree> result_chain = vNULL;

> -   bool inv_p;

>     tree offset = NULL_TREE;

>     vec<tree> vec_oprnds = vNULL;

>     bool slp = (slp_node != NULL);

> --- 6254,6259 ----

> *************** vectorizable_store (stmt_vec_info stmt_i

> *** 7018,7039 ****

>             {

>               dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));

>               dataref_offset = build_int_cst (ref_type, 0);

> -             inv_p = false;

>             }

>           else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))

> !           {

> !             vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,

> !                                          &dataref_ptr, &vec_offset);

> !             inv_p = false;

> !           }

>           else

>             dataref_ptr

>               = vect_create_data_ref_ptr (first_stmt_info, aggr_type,

>                                           simd_lane_access_p ? loop : NULL,

>                                           offset, &dummy, gsi, &ptr_incr,

> !                                         simd_lane_access_p, &inv_p,

> !                                         NULL_TREE, bump);

> !         gcc_assert (bb_vinfo || !inv_p);

>         }

>         else

>         {

> --- 7017,7032 ----

>             {

>               dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));

>               dataref_offset = build_int_cst (ref_type, 0);

>             }

>           else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))

> !           vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,

> !                                        &dataref_ptr, &vec_offset);

>           else

>             dataref_ptr

>               = vect_create_data_ref_ptr (first_stmt_info, aggr_type,

>                                           simd_lane_access_p ? loop : NULL,

>                                           offset, &dummy, gsi, &ptr_incr,

> !                                         simd_lane_access_p, NULL_TREE, bump);

>         }

>         else

>         {

> *************** vectorizable_load (stmt_vec_info stmt_in

> *** 7419,7425 ****

>     bool grouped_load = false;

>     stmt_vec_info first_stmt_info;

>     stmt_vec_info first_stmt_info_for_drptr = NULL;

> -   bool inv_p;

>     bool compute_in_loop = false;

>     struct loop *at_loop;

>     int vec_num;

> --- 7412,7417 ----

> *************** vectorizable_load (stmt_vec_info stmt_in

> *** 7669,7674 ****

> --- 7661,7723 ----

>         return true;

>       }

>

> +   if (memory_access_type == VMAT_INVARIANT)

> +     {

> +       gcc_assert (!grouped_load && !mask && !bb_vinfo);

> +       /* If we have versioned for aliasing or the loop doesn't

> +        have any data dependencies that would preclude this,

> +        then we are sure this is a loop invariant load and

> +        thus we can insert it on the preheader edge.  */

> +       bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)

> +                     && !nested_in_vect_loop

> +                     && hoist_defs_of_uses (stmt_info, loop));

> +       if (hoist_p)

> +       {

> +         gassign *stmt = as_a <gassign *> (stmt_info->stmt);

> +         if (dump_enabled_p ())

> +           {

> +             dump_printf_loc (MSG_NOTE, vect_location,

> +                              "hoisting out of the vectorized loop: ");

> +             dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);

> +           }

> +         scalar_dest = copy_ssa_name (scalar_dest);

> +         tree rhs = unshare_expr (gimple_assign_rhs1 (stmt));

> +         gsi_insert_on_edge_immediate

> +           (loop_preheader_edge (loop),

> +            gimple_build_assign (scalar_dest, rhs));

> +       }

> +       /* These copies are all equivalent, but currently the representation

> +        requires a separate STMT_VINFO_VEC_STMT for each one.  */

> +       prev_stmt_info = NULL;

> +       gimple_stmt_iterator gsi2 = *gsi;

> +       gsi_next (&gsi2);

> +       for (j = 0; j < ncopies; j++)

> +       {

> +         stmt_vec_info new_stmt_info;

> +         if (hoist_p)

> +           {

> +             new_temp = vect_init_vector (stmt_info, scalar_dest,

> +                                          vectype, NULL);

> +             gimple *new_stmt = SSA_NAME_DEF_STMT (new_temp);

> +             new_stmt_info = vinfo->add_stmt (new_stmt);

> +           }

> +         else

> +           {

> +             new_temp = vect_init_vector (stmt_info, scalar_dest,

> +                                          vectype, &gsi2);

> +             new_stmt_info = vinfo->lookup_def (new_temp);

> +           }

> +         if (slp)

> +           SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt_info);

> +         else if (j == 0)

> +           STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt_info;

> +         else

> +           STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt_info;

> +         prev_stmt_info = new_stmt_info;

> +       }

> +       return true;

> +     }

> +

>     if (memory_access_type == VMAT_ELEMENTWISE

>         || memory_access_type == VMAT_STRIDED_SLP)

>       {

> *************** vectorizable_load (stmt_vec_info stmt_in

> *** 8177,8183 ****

>             {

>               dataref_ptr = unshare_expr (DR_BASE_ADDRESS (first_dr_info->dr));

>               dataref_offset = build_int_cst (ref_type, 0);

> -             inv_p = false;

>             }

>           else if (first_stmt_info_for_drptr

>                    && first_stmt_info != first_stmt_info_for_drptr)

> --- 8226,8231 ----

> *************** vectorizable_load (stmt_vec_info stmt_in

> *** 8186,8192 ****

>                 = vect_create_data_ref_ptr (first_stmt_info_for_drptr,

>                                             aggr_type, at_loop, offset, &dummy,

>                                             gsi, &ptr_incr, simd_lane_access_p,

> !                                           &inv_p, byte_offset, bump);

>               /* Adjust the pointer by the difference to first_stmt.  */

>               data_reference_p ptrdr

>                 = STMT_VINFO_DATA_REF (first_stmt_info_for_drptr);

> --- 8234,8240 ----

>                 = vect_create_data_ref_ptr (first_stmt_info_for_drptr,

>                                             aggr_type, at_loop, offset, &dummy,

>                                             gsi, &ptr_incr, simd_lane_access_p,

> !                                           byte_offset, bump);

>               /* Adjust the pointer by the difference to first_stmt.  */

>               data_reference_p ptrdr

>                 = STMT_VINFO_DATA_REF (first_stmt_info_for_drptr);

> *************** vectorizable_load (stmt_vec_info stmt_in

> *** 8199,8214 ****

>                                              stmt_info, diff);

>             }

>           else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))

> !           {

> !             vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,

> !                                          &dataref_ptr, &vec_offset);

> !             inv_p = false;

> !           }

>           else

>             dataref_ptr

>               = vect_create_data_ref_ptr (first_stmt_info, aggr_type, at_loop,

>                                           offset, &dummy, gsi, &ptr_incr,

> !                                         simd_lane_access_p, &inv_p,

>                                           byte_offset, bump);

>           if (mask)

>             vec_mask = vect_get_vec_def_for_operand (mask, stmt_info,

> --- 8247,8259 ----

>                                              stmt_info, diff);

>             }

>           else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))

> !           vect_get_gather_scatter_ops (loop, stmt_info, &gs_info,

> !                                        &dataref_ptr, &vec_offset);

>           else

>             dataref_ptr

>               = vect_create_data_ref_ptr (first_stmt_info, aggr_type, at_loop,

>                                           offset, &dummy, gsi, &ptr_incr,

> !                                         simd_lane_access_p,

>                                           byte_offset, bump);

>           if (mask)

>             vec_mask = vect_get_vec_def_for_operand (mask, stmt_info,

> *************** vectorizable_load (stmt_vec_info stmt_in

> *** 8492,8538 ****

>                     }

>                 }

>

> -             /* 4. Handle invariant-load.  */

> -             if (inv_p && !bb_vinfo)

> -               {

> -                 gcc_assert (!grouped_load);

> -                 /* If we have versioned for aliasing or the loop doesn't

> -                    have any data dependencies that would preclude this,

> -                    then we are sure this is a loop invariant load and

> -                    thus we can insert it on the preheader edge.  */

> -                 if (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo)

> -                     && !nested_in_vect_loop

> -                     && hoist_defs_of_uses (stmt_info, loop))

> -                   {

> -                     gassign *stmt = as_a <gassign *> (stmt_info->stmt);

> -                     if (dump_enabled_p ())

> -                       {

> -                         dump_printf_loc (MSG_NOTE, vect_location,

> -                                          "hoisting out of the vectorized "

> -                                          "loop: ");

> -                         dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);

> -                       }

> -                     tree tem = copy_ssa_name (scalar_dest);

> -                     gsi_insert_on_edge_immediate

> -                       (loop_preheader_edge (loop),

> -                        gimple_build_assign (tem,

> -                                             unshare_expr

> -                                               (gimple_assign_rhs1 (stmt))));

> -                     new_temp = vect_init_vector (stmt_info, tem,

> -                                                  vectype, NULL);

> -                     new_stmt = SSA_NAME_DEF_STMT (new_temp);

> -                     new_stmt_info = vinfo->add_stmt (new_stmt);

> -                   }

> -                 else

> -                   {

> -                     gimple_stmt_iterator gsi2 = *gsi;

> -                     gsi_next (&gsi2);

> -                     new_temp = vect_init_vector (stmt_info, scalar_dest,

> -                                                  vectype, &gsi2);

> -                     new_stmt_info = vinfo->lookup_def (new_temp);

> -                   }

> -               }

> -

>               if (memory_access_type == VMAT_CONTIGUOUS_REVERSE)

>                 {

>                   tree perm_mask = perm_mask_for_reverse (vectype);

> --- 8537,8542 ----
Richard Biener Aug. 1, 2018, 12:57 p.m. | #7
On Mon, Jul 30, 2018 at 1:42 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>

> _loop_vec_info::_loop_vec_info used get_loop_array to get the

> order of the blocks when creating stmt_vec_infos, but then used

> dfs_enumerate_from to get the order of the blocks that the rest

> of the vectoriser uses.  We should be able to use that order

> for creating stmt_vec_infos too.


OK.  Note I have rev_post_order_and_mark_dfs_back_seme for a patch I'm working
on (RPO order on a single-entry multiple-exit region).  I'll try to
remember that "fixme".

Richard.

>

> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

>

> gcc/

>         * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Use the

>         result of dfs_enumerate_from when constructing stmt_vec_infos,

>         instead of additionally calling get_loop_body.

>

> Index: gcc/tree-vect-loop.c

> ===================================================================

> *** gcc/tree-vect-loop.c        2018-07-30 12:40:59.366015643 +0100

> --- gcc/tree-vect-loop.c        2018-07-30 12:40:59.362015678 +0100

> *************** _loop_vec_info::_loop_vec_info (struct l

> *** 834,844 ****

>       scalar_loop (NULL),

>       orig_loop_info (NULL)

>   {

> !   /* Create/Update stmt_info for all stmts in the loop.  */

> !   basic_block *body = get_loop_body (loop);

> !   for (unsigned int i = 0; i < loop->num_nodes; i++)

>       {

> !       basic_block bb = body[i];

>         gimple_stmt_iterator si;

>

>         for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))

> --- 834,851 ----

>       scalar_loop (NULL),

>       orig_loop_info (NULL)

>   {

> !   /* CHECKME: We want to visit all BBs before their successors (except for

> !      latch blocks, for which this assertion wouldn't hold).  In the simple

> !      case of the loop forms we allow, a dfs order of the BBs would the same

> !      as reversed postorder traversal, so we are safe.  */

> !

> !   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,

> !                                         bbs, loop->num_nodes, loop);

> !   gcc_assert (nbbs == loop->num_nodes);

> !

> !   for (unsigned int i = 0; i < nbbs; i++)

>       {

> !       basic_block bb = bbs[i];

>         gimple_stmt_iterator si;

>

>         for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))

> *************** _loop_vec_info::_loop_vec_info (struct l

> *** 855,870 ****

>           add_stmt (stmt);

>         }

>       }

> -   free (body);

> -

> -   /* CHECKME: We want to visit all BBs before their successors (except for

> -      latch blocks, for which this assertion wouldn't hold).  In the simple

> -      case of the loop forms we allow, a dfs order of the BBs would the same

> -      as reversed postorder traversal, so we are safe.  */

> -

> -   unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,

> -                                         bbs, loop->num_nodes, loop);

> -   gcc_assert (nbbs == loop->num_nodes);

>   }

>

>   /* Free all levels of MASKS.  */

> --- 862,867 ----
Richard Biener Aug. 1, 2018, 1 p.m. | #8
On Mon, Jul 30, 2018 at 1:43 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>

> This patch makes hoist_defs_of_uses use vec_info::lookup_def instead of:

>

>       if (!gimple_nop_p (def_stmt)

>           && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))

>

> to test whether a feeding scalar statement needs to be hoisted out

> of the vectorised loop.  It isn't worth doing in its own right,

> but it's a prerequisite for the next patch, which needs to update

> the stmt_vec_infos of the hoisted statements.


OK.

>

> 2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>

>

> gcc/

>         * tree-vect-stmts.c (hoist_defs_of_uses): Use vec_info::lookup_def

>         instead of gimple_nop_p and flow_bb_inside_loop_p to decide

>         whether a statement needs to be hoisted.

>

> Index: gcc/tree-vect-stmts.c

> ===================================================================

> *** gcc/tree-vect-stmts.c       2018-07-30 12:42:35.633169005 +0100

> --- gcc/tree-vect-stmts.c       2018-07-30 12:42:35.629169040 +0100

> *************** permute_vec_elements (tree x, tree y, tr

> *** 7322,7370 ****

>   static bool

>   hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)

>   {

>     ssa_op_iter i;

>     tree op;

>     bool any = false;

>

>     FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)

> !     {

> !       gimple *def_stmt = SSA_NAME_DEF_STMT (op);

> !       if (!gimple_nop_p (def_stmt)

> !         && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))

> !       {

> !         /* Make sure we don't need to recurse.  While we could do

> !            so in simple cases when there are more complex use webs

> !            we don't have an easy way to preserve stmt order to fulfil

> !            dependencies within them.  */

> !         tree op2;

> !         ssa_op_iter i2;

> !         if (gimple_code (def_stmt) == GIMPLE_PHI)

>             return false;

> !         FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt, i2, SSA_OP_USE)

> !           {

> !             gimple *def_stmt2 = SSA_NAME_DEF_STMT (op2);

> !             if (!gimple_nop_p (def_stmt2)

> !                 && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt2)))

> !               return false;

> !           }

> !         any = true;

> !       }

> !     }

>

>     if (!any)

>       return true;

>

>     FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)

> !     {

> !       gimple *def_stmt = SSA_NAME_DEF_STMT (op);

> !       if (!gimple_nop_p (def_stmt)

> !         && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))

> !       {

> !         gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt);

> !         gsi_remove (&gsi, false);

> !         gsi_insert_on_edge_immediate (loop_preheader_edge (loop), def_stmt);

> !       }

> !     }

>

>     return true;

>   }

> --- 7322,7360 ----

>   static bool

>   hoist_defs_of_uses (stmt_vec_info stmt_info, struct loop *loop)

>   {

> +   vec_info *vinfo = stmt_info->vinfo;

>     ssa_op_iter i;

>     tree op;

>     bool any = false;

>

>     FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)

> !     if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))

> !       {

> !       /* Make sure we don't need to recurse.  While we could do

> !          so in simple cases when there are more complex use webs

> !          we don't have an easy way to preserve stmt order to fulfil

> !          dependencies within them.  */

> !       tree op2;

> !       ssa_op_iter i2;

> !       if (gimple_code (def_stmt_info->stmt) == GIMPLE_PHI)

> !         return false;

> !       FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt_info->stmt, i2, SSA_OP_USE)

> !         if (vinfo->lookup_def (op2))

>             return false;

> !       any = true;

> !       }

>

>     if (!any)

>       return true;

>

>     FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE)

> !     if (stmt_vec_info def_stmt_info = vinfo->lookup_def (op))

> !       {

> !       gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt_info->stmt);

> !       gsi_remove (&gsi, false);

> !       gsi_insert_on_edge_immediate (loop_preheader_edge (loop),

> !                                     def_stmt_info->stmt);

> !       }

>

>     return true;

>   }