Make ifcvt clean up dead comparisons

Message ID mpty313hedd.fsf@arm.com
State New
Headers show
Series
  • Make ifcvt clean up dead comparisons
Related show

Commit Message

Richard Sandiford July 12, 2019, 7:59 a.m.
This change is needed to avoid a regression in gcc.dg/ifcvt-3.c
for a later patch.  Without it, we enter CSE with a dead comparison left
by if-conversion and then eliminate the second (live) comparison in
favour of the dead one.  That's functionally correct in itself, but it
meant that we'd combine the subtraction and comparison into a SUBS
before we have a chance to fold away the subtraction.

Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
OK to install?

Richard


2019-07-11  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* ifcvt.c: Include dce.h.
	(if_convert): Call run_fast_dce if the pass changed something.

Comments

Richard Biener July 12, 2019, 9:43 a.m. | #1
On Fri, Jul 12, 2019 at 10:00 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>

> This change is needed to avoid a regression in gcc.dg/ifcvt-3.c

> for a later patch.  Without it, we enter CSE with a dead comparison left

> by if-conversion and then eliminate the second (live) comparison in

> favour of the dead one.  That's functionally correct in itself, but it

> meant that we'd combine the subtraction and comparison into a SUBS

> before we have a chance to fold away the subtraction.

>

> Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.

> OK to install?


cfg_cleanup if it does, runs fast-dce after cleaning up the CFG so does
it make sense to do this in the caller instead?  (and after removing the
live problem just in case dce tries to keep that updated?)

Otherwise this is of course ok.

Richard.

> Richard

>

>

> 2019-07-11  Richard Sandiford  <richard.sandiford@arm.com>

>

> gcc/

>         * ifcvt.c: Include dce.h.

>         (if_convert): Call run_fast_dce if the pass changed something.

>

> Index: gcc/ifcvt.c

> ===================================================================

> --- gcc/ifcvt.c 2019-03-08 18:15:26.828777872 +0000

> +++ gcc/ifcvt.c 2019-07-12 08:58:21.787403345 +0100

> @@ -46,6 +46,7 @@

>  #include "rtl-iter.h"

>  #include "ifcvt.h"

>  #include "params.h"

> +#include "dce.h"

>

>  #ifndef MAX_CONDITIONAL_EXECUTE

>  #define MAX_CONDITIONAL_EXECUTE \

> @@ -5443,6 +5444,10 @@ if_convert (bool after_combine)

>                num_true_changes);

>      }

>

> +  if (num_updated_if_blocks)

> +    /* Get rid of any dead CC-related instructions.  */

> +    run_fast_dce ();

> +

>    if (optimize == 1)

>      df_remove_problem (df_live);

>
Richard Sandiford July 12, 2019, 10:28 a.m. | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Jul 12, 2019 at 10:00 AM Richard Sandiford

> <richard.sandiford@arm.com> wrote:

>>

>> This change is needed to avoid a regression in gcc.dg/ifcvt-3.c

>> for a later patch.  Without it, we enter CSE with a dead comparison left

>> by if-conversion and then eliminate the second (live) comparison in

>> favour of the dead one.  That's functionally correct in itself, but it

>> meant that we'd combine the subtraction and comparison into a SUBS

>> before we have a chance to fold away the subtraction.

>>

>> Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.

>> OK to install?

>

> cfg_cleanup if it does, runs fast-dce after cleaning up the CFG so does

> it make sense to do this in the caller instead?  (and after removing the

> live problem just in case dce tries to keep that updated?)


I think that only happens with CLEANUP_CROSSJUMP, which we don't use
here as things stand.  But that could easily change in future, and yeah,
doing it after removing the live problem would be better.

So something like the attached?  (Only lightly tested, but it does
still fix the original problem.)

I don't have any evidence that the post-combine pass needs this,
so I've kept it specific to the main pass this time.

Thanks,
Richard


2019-07-12  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* basic-block.h (CLEANUP_FORCE_FAST_DCE): New macro.
	* cfgcleanup.c (cleanup_cfg): Call run_fast_dce if
	CLEANUP_FORCE_FAST_DCE is set.
	* ifcvt.c (rest_of_handle_if_conversion): Pass
	CLEANUP_FORCE_FAST_DCE to the final cleanup_cfg call if
	if-conversion succeeded.

Index: gcc/basic-block.h
===================================================================
--- gcc/basic-block.h	2019-07-10 19:41:27.171891815 +0100
+++ gcc/basic-block.h	2019-07-12 11:20:59.326008338 +0100
@@ -508,6 +508,8 @@ #define CLEANUP_NO_INSN_DEL	16	/* Do not
 #define CLEANUP_CFGLAYOUT	32	/* Do cleanup in cfglayout mode.  */
 #define CLEANUP_CFG_CHANGED	64      /* The caller changed the CFG.  */
 #define CLEANUP_NO_PARTITIONING	128     /* Do not try to fix partitions.  */
+#define CLEANUP_FORCE_FAST_DCE	0x100	/* Force run_fast_dce to be called
+					   at least once.  */
 
 /* Return true if BB is in a transaction.  */
 
Index: gcc/cfgcleanup.c
===================================================================
--- gcc/cfgcleanup.c	2019-07-10 19:41:26.395898027 +0100
+++ gcc/cfgcleanup.c	2019-07-12 11:20:59.326008338 +0100
@@ -3193,7 +3193,10 @@ cleanup_cfg (int mode)
 	      && !delete_trivially_dead_insns (get_insns (), max_reg_num ()))
 	    break;
 	  if ((mode & CLEANUP_CROSSJUMP) && crossjumps_occurred)
-	    run_fast_dce ();
+	    {
+	      run_fast_dce ();
+	      mode &= ~CLEANUP_FORCE_FAST_DCE;
+	    }
 	}
       else
 	break;
@@ -3202,6 +3205,9 @@ cleanup_cfg (int mode)
   if (mode & CLEANUP_CROSSJUMP)
     remove_fake_exit_edges ();
 
+  if (mode & CLEANUP_FORCE_FAST_DCE)
+    run_fast_dce ();
+
   /* Don't call delete_dead_jumptables in cfglayout mode, because
      that function assumes that jump tables are in the insns stream.
      But we also don't _have_ to delete dead jumptables in cfglayout
Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	2019-07-12 11:20:55.000000000 +0100
+++ gcc/ifcvt.c	2019-07-12 11:20:59.330008307 +0100
@@ -5457,6 +5457,8 @@ if_convert (bool after_combine)
 static unsigned int
 rest_of_handle_if_conversion (void)
 {
+  int flags = 0;
+
   if (flag_if_conversion)
     {
       if (dump_file)
@@ -5466,9 +5468,12 @@ rest_of_handle_if_conversion (void)
 	}
       cleanup_cfg (CLEANUP_EXPENSIVE);
       if_convert (false);
+      if (num_updated_if_blocks)
+	/* Get rid of any dead CC-related instructions.  */
+	flags |= CLEANUP_FORCE_FAST_DCE;
     }
 
-  cleanup_cfg (0);
+  cleanup_cfg (flags);
   return 0;
 }
Richard Biener July 12, 2019, 10:31 a.m. | #3
On Fri, Jul 12, 2019 at 12:28 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>

> Richard Biener <richard.guenther@gmail.com> writes:

> > On Fri, Jul 12, 2019 at 10:00 AM Richard Sandiford

> > <richard.sandiford@arm.com> wrote:

> >>

> >> This change is needed to avoid a regression in gcc.dg/ifcvt-3.c

> >> for a later patch.  Without it, we enter CSE with a dead comparison left

> >> by if-conversion and then eliminate the second (live) comparison in

> >> favour of the dead one.  That's functionally correct in itself, but it

> >> meant that we'd combine the subtraction and comparison into a SUBS

> >> before we have a chance to fold away the subtraction.

> >>

> >> Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.

> >> OK to install?

> >

> > cfg_cleanup if it does, runs fast-dce after cleaning up the CFG so does

> > it make sense to do this in the caller instead?  (and after removing the

> > live problem just in case dce tries to keep that updated?)

>

> I think that only happens with CLEANUP_CROSSJUMP, which we don't use

> here as things stand.  But that could easily change in future, and yeah,

> doing it after removing the live problem would be better.

>

> So something like the attached?  (Only lightly tested, but it does

> still fix the original problem.)


Yeah, that works, too (I was originally thinking to simply put it after
the cleanup_cfg (0) call ...).

Any reason you use 0x100 instead of 256?

If nobody objects this is OK.

Thanks,
Richard.

> I don't have any evidence that the post-combine pass needs this,

> so I've kept it specific to the main pass this time.

>

> Thanks,

> Richard

>

>

> 2019-07-12  Richard Sandiford  <richard.sandiford@arm.com>

>

> gcc/

>         * basic-block.h (CLEANUP_FORCE_FAST_DCE): New macro.

>         * cfgcleanup.c (cleanup_cfg): Call run_fast_dce if

>         CLEANUP_FORCE_FAST_DCE is set.

>         * ifcvt.c (rest_of_handle_if_conversion): Pass

>         CLEANUP_FORCE_FAST_DCE to the final cleanup_cfg call if

>         if-conversion succeeded.

>

> Index: gcc/basic-block.h

> ===================================================================

> --- gcc/basic-block.h   2019-07-10 19:41:27.171891815 +0100

> +++ gcc/basic-block.h   2019-07-12 11:20:59.326008338 +0100

> @@ -508,6 +508,8 @@ #define CLEANUP_NO_INSN_DEL 16      /* Do not

>  #define CLEANUP_CFGLAYOUT      32      /* Do cleanup in cfglayout mode.  */

>  #define CLEANUP_CFG_CHANGED    64      /* The caller changed the CFG.  */

>  #define CLEANUP_NO_PARTITIONING        128     /* Do not try to fix partitions.  */

> +#define CLEANUP_FORCE_FAST_DCE 0x100   /* Force run_fast_dce to be called

> +                                          at least once.  */

>

>  /* Return true if BB is in a transaction.  */

>

> Index: gcc/cfgcleanup.c

> ===================================================================

> --- gcc/cfgcleanup.c    2019-07-10 19:41:26.395898027 +0100

> +++ gcc/cfgcleanup.c    2019-07-12 11:20:59.326008338 +0100

> @@ -3193,7 +3193,10 @@ cleanup_cfg (int mode)

>               && !delete_trivially_dead_insns (get_insns (), max_reg_num ()))

>             break;

>           if ((mode & CLEANUP_CROSSJUMP) && crossjumps_occurred)

> -           run_fast_dce ();

> +           {

> +             run_fast_dce ();

> +             mode &= ~CLEANUP_FORCE_FAST_DCE;

> +           }

>         }

>        else

>         break;

> @@ -3202,6 +3205,9 @@ cleanup_cfg (int mode)

>    if (mode & CLEANUP_CROSSJUMP)

>      remove_fake_exit_edges ();

>

> +  if (mode & CLEANUP_FORCE_FAST_DCE)

> +    run_fast_dce ();

> +

>    /* Don't call delete_dead_jumptables in cfglayout mode, because

>       that function assumes that jump tables are in the insns stream.

>       But we also don't _have_ to delete dead jumptables in cfglayout

> Index: gcc/ifcvt.c

> ===================================================================

> --- gcc/ifcvt.c 2019-07-12 11:20:55.000000000 +0100

> +++ gcc/ifcvt.c 2019-07-12 11:20:59.330008307 +0100

> @@ -5457,6 +5457,8 @@ if_convert (bool after_combine)

>  static unsigned int

>  rest_of_handle_if_conversion (void)

>  {

> +  int flags = 0;

> +

>    if (flag_if_conversion)

>      {

>        if (dump_file)

> @@ -5466,9 +5468,12 @@ rest_of_handle_if_conversion (void)

>         }

>        cleanup_cfg (CLEANUP_EXPENSIVE);

>        if_convert (false);

> +      if (num_updated_if_blocks)

> +       /* Get rid of any dead CC-related instructions.  */

> +       flags |= CLEANUP_FORCE_FAST_DCE;

>      }

>

> -  cleanup_cfg (0);

> +  cleanup_cfg (flags);

>    return 0;

>  }

>
Richard Sandiford July 12, 2019, 10:36 a.m. | #4
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Jul 12, 2019 at 12:28 PM Richard Sandiford

> <richard.sandiford@arm.com> wrote:

>>

>> Richard Biener <richard.guenther@gmail.com> writes:

>> > On Fri, Jul 12, 2019 at 10:00 AM Richard Sandiford

>> > <richard.sandiford@arm.com> wrote:

>> >>

>> >> This change is needed to avoid a regression in gcc.dg/ifcvt-3.c

>> >> for a later patch.  Without it, we enter CSE with a dead comparison left

>> >> by if-conversion and then eliminate the second (live) comparison in

>> >> favour of the dead one.  That's functionally correct in itself, but it

>> >> meant that we'd combine the subtraction and comparison into a SUBS

>> >> before we have a chance to fold away the subtraction.

>> >>

>> >> Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.

>> >> OK to install?

>> >

>> > cfg_cleanup if it does, runs fast-dce after cleaning up the CFG so does

>> > it make sense to do this in the caller instead?  (and after removing the

>> > live problem just in case dce tries to keep that updated?)

>>

>> I think that only happens with CLEANUP_CROSSJUMP, which we don't use

>> here as things stand.  But that could easily change in future, and yeah,

>> doing it after removing the live problem would be better.

>>

>> So something like the attached?  (Only lightly tested, but it does

>> still fix the original problem.)

>

> Yeah, that works, too (I was originally thinking to simply put it after

> the cleanup_cfg (0) call ...).

>

> Any reason you use 0x100 instead of 256?


Just trying to set a precedent for future flags. :-)  Beyond 65536 I have
to look it up.

> If nobody objects this is OK.


Thanks.

Richard

Patch

Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	2019-03-08 18:15:26.828777872 +0000
+++ gcc/ifcvt.c	2019-07-12 08:58:21.787403345 +0100
@@ -46,6 +46,7 @@ 
 #include "rtl-iter.h"
 #include "ifcvt.h"
 #include "params.h"
+#include "dce.h"
 
 #ifndef MAX_CONDITIONAL_EXECUTE
 #define MAX_CONDITIONAL_EXECUTE \
@@ -5443,6 +5444,10 @@  if_convert (bool after_combine)
 	       num_true_changes);
     }
 
+  if (num_updated_if_blocks)
+    /* Get rid of any dead CC-related instructions.  */
+    run_fast_dce ();
+
   if (optimize == 1)
     df_remove_problem (df_live);