[v2,target]: Fix PR 88070, ICE in create_pre_exit, at mode-switching.c:438

Message ID CAFULd4Z6isa86vtLy30or-EF7KhJ7m6jO5ihhj3ABqCHYFSZ7A@mail.gmail.com
State New

Commit Message

Uros Bizjak Nov. 20, 2018, 7:40 p.m.
Hello!

The attached patch takes a different approach to the problem of split
return copies in create_pre_exit. It turns out that for the vzeroupper
insertion pass we don't actually need to insert a mode switch before the
return copy; it is enough to split the edge to the exit block, so we can
emit vzeroupper at the function exit edge.

Since x86 is the only target that uses optimize mode switching after
reload, I took the liberty of using reload_completed as the condition
under which we don't need to search for the return copy (the search is
now guarded by !reload_completed). This comes with a big comment, as is
evident from the patch.

2018-11-20  Uros Bizjak  <ubizjak@gmail.com>

    PR target/88070
    * mode-switching.c (create_pre_exit): After reload, always split the
    fallthrough edge to the exit block.

testsuite/ChangeLog:

2018-11-20  Uros Bizjak  <ubizjak@gmail.com>

    PR target/88070
    * gcc.target/i386/pr88070.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

Comments

Jeff Law Nov. 21, 2018, 12:14 a.m. | #1
On 11/20/18 12:40 PM, Uros Bizjak wrote:
> Hello!
>
> The attached patch takes a different approach to the problem of split
> return copies in create_pre_exit. It turns out that for the vzeroupper
> insertion pass we don't actually need to insert a mode switch before the
> return copy; it is enough to split the edge to the exit block, so we can
> emit vzeroupper at the function exit edge.
>
> Since x86 is the only target that uses optimize mode switching after
> reload, I took the liberty of using reload_completed as the condition
> under which we don't need to search for the return copy (the search is
> now guarded by !reload_completed). This comes with a big comment, as is
> evident from the patch.
>
> 2018-11-20  Uros Bizjak  <ubizjak@gmail.com>
>
>     PR target/88070
>     * mode-switching.c (create_pre_exit): After reload, always split the
>     fallthrough edge to the exit block.
>
> testsuite/ChangeLog:
>
> 2018-11-20  Uros Bizjak  <ubizjak@gmail.com>
>
>     PR target/88070
>     * gcc.target/i386/pr88070.c: New test.
>
> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> Committed to mainline SVN.
>
> Uros.

OK.  But note this may have to be revisited for the GCN port.

jeff

Patch

Index: mode-switching.c
===================================================================
--- mode-switching.c	(revision 266278)
+++ mode-switching.c	(working copy)
@@ -248,8 +248,22 @@  create_pre_exit (int n_entities, int *entity_map,
 	gcc_assert (!pre_exit);
 	/* If this function returns a value at the end, we have to
 	   insert the final mode switch before the return value copy
-	   to its hard register.  */
-	if (EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) == 1
+	   to its hard register.
+
+	   x86 targets use mode-switching infrastructure to
+	   conditionally insert vzeroupper instruction at the exit
+	   from the function where there is no need to switch the
+	   mode before the return value copy.  The vzeroupper insertion
+	   pass runs after reload, so use !reload_completed as a stand-in
+	   for x86 to skip the search for the return value copy insn.
+
+	   N.b.: the code below assumes that the return copy insn
+	   immediately precedes its corresponding use insn.  This
+	   assumption does not hold after reload, since sched1 pass
+	   can schedule the return copy insn away from its
+	   corresponding use insn.  */
+	if (!reload_completed
+	    && EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) == 1
 	    && NONJUMP_INSN_P ((last_insn = BB_END (src_bb)))
 	    && GET_CODE (PATTERN (last_insn)) == USE
 	    && GET_CODE ((ret_reg = XEXP (PATTERN (last_insn), 0))) == REG)
Index: testsuite/gcc.target/i386/pr88070.c
===================================================================
--- testsuite/gcc.target/i386/pr88070.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr88070.c	(working copy)
@@ -0,0 +1,12 @@ 
+/* PR target/88070 */
+/* { dg-do compile } */
+/* { dg-options "-O -fexpensive-optimizations -fnon-call-exceptions -fschedule-insns -fno-dce -fno-dse -mavx" } */
+
+typedef float vfloat2 __attribute__ ((__vector_size__ (2 * sizeof (float))));
+
+vfloat2
+test1float2 (float c)
+{
+  vfloat2 v = { c, c };
+  return v;
+}