tree-optimization/95761 - fix vector insertion place compute

Message ID nycvar.YFH.7.76.2006191336140.4397@zhemvz.fhfr.qr
State New

Commit Message

Richard Biener June 19, 2020, 11:36 a.m.
I missed that SLP permutation code generation can, as an optimization,
end up referring to a non-last vectorized stmt in the last
SLP_TREE_VEC_STMTS element.  So walk all of them to find the last one.
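
To illustrate the difference, here is a minimal standalone sketch, not
the GCC code itself.  The stmt type, its pos field and the helpers
dominates, last_by_tail and last_by_walk are hypothetical stand-ins:
pos models the position of a stmt within one basic block, and dominates
approximates what vect_stmt_dominates_stmt_p answers in that simple
case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vectorized stmt: only its position in
   the instruction stream matters for this sketch.  */
typedef struct { int pos; } stmt;

/* Assumed simplification of vect_stmt_dominates_stmt_p for stmts in
   one basic block: A dominates B when A does not come after B.  */
static int dominates (const stmt *a, const stmt *b)
{
  return a->pos <= b->pos;
}

/* Old approach: trust that the final vector element is the latest
   stmt.  */
static const stmt *last_by_tail (const stmt *const *v, size_t n)
{
  assert (n > 0);
  return v[n - 1];
}

/* New approach: walk every element and keep the one that all others
   dominate.  */
static const stmt *last_by_walk (const stmt *const *v, size_t n)
{
  const stmt *last = NULL;
  for (size_t i = 0; i < n; i++)
    if (!last || dominates (last, v[i]))
      last = v[i];
  return last;
}
```

With a vector whose final element re-uses an earlier generated stmt,
last_by_tail returns that earlier stmt while last_by_walk still returns
the latest one, which is the insertion point we want.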

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

2020-06-19  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/95761
	* tree-vect-slp.c (vect_schedule_slp_instance): Walk all
	vectorized stmts for finding the last one.

	* gcc.dg/torture/pr95761.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr95761.c | 25 +++++++++++++++++++++++++
 gcc/tree-vect-slp.c                    | 15 +++++++++------
 2 files changed, 34 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr95761.c

-- 
2.26.2

Patch

diff --git a/gcc/testsuite/gcc.dg/torture/pr95761.c b/gcc/testsuite/gcc.dg/torture/pr95761.c
new file mode 100644
index 00000000000..65ee0fc1c11
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr95761.c
@@ -0,0 +1,25 @@ 
+/* { dg-do compile } */
+
+typedef int a[10];
+typedef struct {
+  a b;
+  a c;
+  a d;
+  a e;
+} f;
+f g;
+int *j;
+void k() {
+  for (;;) {
+    a l;
+    j[0] = g.b[0];
+    int *h = g.d;
+    int i = 0;
+    for (; i < 10; i++)
+      h[i] = l[0] - g.e[0];
+    h = g.e;
+    i = 0;
+    for (; i < 10; i++)
+      h[i] = l[1] + g.e[i];
+  }
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index e33b42fbc68..c9ec77b4a97 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4207,7 +4207,6 @@  vect_schedule_slp_instance (vec_info *vinfo,
     {
       /* Or if we do not have 1:1 matching scalar stmts emit after the
 	 children vectorized defs.  */
-      gimple *last_in_child;
       gimple *last_stmt = NULL;
       FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
 	/* ???  With only external defs the following breaks.  Note
@@ -4216,11 +4215,15 @@  vect_schedule_slp_instance (vec_info *vinfo,
 	if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)
 	  {
 	    /* We are emitting all vectorized stmts in the same place and
-	       the last one is the last.  */
-	    last_in_child = SLP_TREE_VEC_STMTS (child).last ();
-	    if (!last_stmt
-		|| vect_stmt_dominates_stmt_p (last_stmt, last_in_child))
-	      last_stmt = last_in_child;
+	       the last one is the last.
+	       ???  Unless we have a load permutation applied and that
+	       figures to re-use an earlier generated load.  */
+	    unsigned j;
+	    gimple *vstmt;
+	    FOR_EACH_VEC_ELT (SLP_TREE_VEC_STMTS (child), j, vstmt)
+	      if (!last_stmt
+		  || vect_stmt_dominates_stmt_p (last_stmt, vstmt))
+		last_stmt = vstmt;
 	  }
       if (is_a <gphi *> (last_stmt))
 	si = gsi_after_labels (gimple_bb (last_stmt));