Fix PR84003

Message ID alpine.LSU.2.20.1801251212170.18265@zhemvz.fhfr.qr
State New
Headers show
Series
  • Fix PR84003
Related show

Commit Message

Richard Biener Jan. 25, 2018, 11:18 a.m.
The following patch fixes PR84003 where RTL DSE removes a redundant
store (a store storing the same value as an earlier store) but in
doing this changing program semantics because the later store changes
the effective type of the memory location.  This in turn allows
sched2 to do an originally invalid transform.

The fix is to harden DSE so that while removing the later store
the earlier stores alias-sets are changed to reflect both dynamic
types - which means using alias-set zero.

On GIMPLE we simply avoid removing such type-changing stores because
we cannot easily get hold on the earlier store.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Disclaimer: this is a "local" fix, I didn't try to understand much
of the DSE data structures but only learned from +-10 lines around
the affected transform which I found by searching for the dumped
messages ...  Richard, you contributed the pass so please review.

Ok for trunk and branches?

Thanks,
Richard.

2018-01-25  Richard Biener  <rguenther@suse.de>

	PR rtl-optimization/84003
	* dse.c (dse_step1): When removing redundant stores make sure
	to adjust the earlier stores alias-set to match semantics of
	the removed one and its own.
	(dse_step6): Likewise.

	* g++.dg/torture/pr77745.C: Mark foo noinline to trigger
	latent bug in DSE.

Comments

Richard Sandiford Jan. 25, 2018, 7:50 p.m. | #1
Richard Biener <rguenther@suse.de> writes:
> The following patch fixes PR84003 where RTL DSE removes a redundant

> store (a store storing the same value as an earlier store) but in

> doing this changing program semantics because the later store changes

> the effective type of the memory location.  This in turn allows

> sched2 to do an originally invalid transform.

>

> The fix is to harden DSE so that while removing the later store

> the earlier stores alias-sets are changed to reflect both dynamic

> types - which means using alias-set zero.

>

> On GIMPLE we simply avoid removing such type-changing stores because

> we cannot easily get hold on the earlier store.

>

> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

>

> Disclaimer: this is a "local" fix, I didn't try to understand much

> of the DSE data structures but only learned from +-10 lines around

> the affected transform which I found by searching for the dumped

> messages ...  Richard, you contributed the pass so please review.


So the file says.  In reality I only wrote an early version, and what
was committed contains very little of that.  So TBH this pass is almost
a complete unknown to me.  That said...

Might it be worth keeping the store instead?  Giving things alias set
zero seems like a pretty big hammer and would turn the remaining store
into something close to a scheduling barrier.

IOW, how about checking alias_set_subset_of in:

	  /* Even if PTR won't be eliminated as unneeded, if both
	     PTR and this insn store the same constant value, we might
	     eliminate this insn instead.  */
	  if (s_info->const_rhs
	      && const_rhs
	      && known_subrange_p (offset, width,
				   s_info->offset, s_info->width)
	      && all_positions_needed_p (s_info, offset - s_info->offset,
					 width))

instead?

Failing that:

> 2018-01-25  Richard Biener  <rguenther@suse.de>

>

> 	PR rtl-optimization/84003

> 	* dse.c (dse_step1): When removing redundant stores make sure

> 	to adjust the earlier stores alias-set to match semantics of

> 	the removed one and its own.

> 	(dse_step6): Likewise.

>

> 	* g++.dg/torture/pr77745.C: Mark foo noinline to trigger

> 	latent bug in DSE.

>

> Index: gcc/dse.c

> ===================================================================

> --- gcc/dse.c	(revision 257043)

> +++ gcc/dse.c	(working copy)

> @@ -2725,6 +2725,19 @@ dse_step1 (void)

>  					    "eliminated\n",

>  				 INSN_UID (ptr->insn),

>  				 INSN_UID (s_info->redundant_reason->insn));

> +		      /* If the later store we delete could have changed the

> +		         dynamic type of the memory make sure the one we

> +			 preserve serves as a store for all loads that could

> +			 validly have accessed the one we delete.  */

> +		      store_info *r_info = s_info->redundant_reason->store_rec;

> +		      while (r_info)

> +			{

> +			  if (r_info->is_set

> +			      && (MEM_ALIAS_SET (s_info->mem)

> +				  != MEM_ALIAS_SET (r_info->mem)))

> +			    set_mem_alias_set (r_info->mem, 0);

> +			  r_info = r_info->next;

> +			}

>  		      delete_dead_store_insn (ptr);

>  		    }

>  		  free_store_info (ptr);

> @@ -3512,6 +3525,19 @@ dse_step6 (void)

>  					"eliminated\n",

>  					INSN_UID (insn_info->insn),

>  					INSN_UID (rinsn));

> +		  /* If the later store we delete could have changed the

> +		     dynamic type of the memory make sure the one we

> +		     preserve serves as a store for all loads that could

> +		     validly have accessed the one we delete.  */

> +		  store_info *r_info = s_info->redundant_reason->store_rec;

> +		  while (r_info)

> +		    {

> +		      if (r_info->is_set

> +			  && (MEM_ALIAS_SET (s_info->mem)

> +			      != MEM_ALIAS_SET (r_info->mem)))

> +			set_mem_alias_set (r_info->mem, 0);

> +		      r_info = r_info->next;

> +		    }


I think this is subtle enough to be worth factoring out.  How about
checking alias_set_subset_of, rather than checking for equality?

Thanks,
Richard

>  		  delete_dead_store_insn (insn_info);

>  		}

>  	    }

> Index: gcc/testsuite/g++.dg/torture/pr77745.C

> ===================================================================

> --- gcc/testsuite/g++.dg/torture/pr77745.C	(revision 257043)

> +++ gcc/testsuite/g++.dg/torture/pr77745.C	(working copy)

> @@ -2,7 +2,7 @@

>  

>  inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }

>  

> -long foo(char *c1, char *c2)

> +long __attribute__((noinline)) foo(char *c1, char *c2)

>  {

>    long *p1 = new (c1) long;

>    *p1 = 100;
Richard Sandiford Jan. 25, 2018, 7:55 p.m. | #2
Richard Sandiford <richard.sandiford@linaro.org> writes:
> Richard Biener <rguenther@suse.de> writes:

>> The following patch fixes PR84003 where RTL DSE removes a redundant

>> store (a store storing the same value as an earlier store) but in

>> doing this changing program semantics because the later store changes

>> the effective type of the memory location.  This in turn allows

>> sched2 to do an originally invalid transform.

>>

>> The fix is to harden DSE so that while removing the later store

>> the earlier stores alias-sets are changed to reflect both dynamic

>> types - which means using alias-set zero.

>>

>> On GIMPLE we simply avoid removing such type-changing stores because

>> we cannot easily get hold on the earlier store.

>>

>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

>>

>> Disclaimer: this is a "local" fix, I didn't try to understand much

>> of the DSE data structures but only learned from +-10 lines around

>> the affected transform which I found by searching for the dumped

>> messages ...  Richard, you contributed the pass so please review.

>

> So the file says.  In reality I only wrote an early version, and what

> was committed contains very little of that.  So TBH this pass is almost

> a complete unknown to me.  That said...

>

> Might it be worth keeping the store instead?


...that is, like gimple. :-)  Sorry, I spent so long thinking about this
that I forgot you'd said that gimple already does the same thing.

Richard

> Giving things alias set

> zero seems like a pretty big hammer and would turn the remaining store

> into something close to a scheduling barrier.

>

> IOW, how about checking alias_set_subset_of in:

>

> 	  /* Even if PTR won't be eliminated as unneeded, if both

> 	     PTR and this insn store the same constant value, we might

> 	     eliminate this insn instead.  */

> 	  if (s_info->const_rhs

> 	      && const_rhs

> 	      && known_subrange_p (offset, width,

> 				   s_info->offset, s_info->width)

> 	      && all_positions_needed_p (s_info, offset - s_info->offset,

> 					 width))

>

> instead?

>

> Failing that:

>

>> 2018-01-25  Richard Biener  <rguenther@suse.de>

>>

>> 	PR rtl-optimization/84003

>> 	* dse.c (dse_step1): When removing redundant stores make sure

>> 	to adjust the earlier stores alias-set to match semantics of

>> 	the removed one and its own.

>> 	(dse_step6): Likewise.

>>

>> 	* g++.dg/torture/pr77745.C: Mark foo noinline to trigger

>> 	latent bug in DSE.

>>

>> Index: gcc/dse.c

>> ===================================================================

>> --- gcc/dse.c	(revision 257043)

>> +++ gcc/dse.c	(working copy)

>> @@ -2725,6 +2725,19 @@ dse_step1 (void)

>>  					    "eliminated\n",

>>  				 INSN_UID (ptr->insn),

>>  				 INSN_UID (s_info->redundant_reason->insn));

>> +		      /* If the later store we delete could have changed the

>> +		         dynamic type of the memory make sure the one we

>> +			 preserve serves as a store for all loads that could

>> +			 validly have accessed the one we delete.  */

>> +		      store_info *r_info = s_info->redundant_reason->store_rec;

>> +		      while (r_info)

>> +			{

>> +			  if (r_info->is_set

>> +			      && (MEM_ALIAS_SET (s_info->mem)

>> +				  != MEM_ALIAS_SET (r_info->mem)))

>> +			    set_mem_alias_set (r_info->mem, 0);

>> +			  r_info = r_info->next;

>> +			}

>>  		      delete_dead_store_insn (ptr);

>>  		    }

>>  		  free_store_info (ptr);

>> @@ -3512,6 +3525,19 @@ dse_step6 (void)

>>  					"eliminated\n",

>>  					INSN_UID (insn_info->insn),

>>  					INSN_UID (rinsn));

>> +		  /* If the later store we delete could have changed the

>> +		     dynamic type of the memory make sure the one we

>> +		     preserve serves as a store for all loads that could

>> +		     validly have accessed the one we delete.  */

>> +		  store_info *r_info = s_info->redundant_reason->store_rec;

>> +		  while (r_info)

>> +		    {

>> +		      if (r_info->is_set

>> +			  && (MEM_ALIAS_SET (s_info->mem)

>> +			      != MEM_ALIAS_SET (r_info->mem)))

>> +			set_mem_alias_set (r_info->mem, 0);

>> +		      r_info = r_info->next;

>> +		    }

>

> I think this is subtle enough to be worth factoring out.  How about

> checking alias_set_subset_of, rather than checking for equality?

>

> Thanks,

> Richard

>

>>  		  delete_dead_store_insn (insn_info);

>>  		}

>>  	    }

>> Index: gcc/testsuite/g++.dg/torture/pr77745.C

>> ===================================================================

>> --- gcc/testsuite/g++.dg/torture/pr77745.C	(revision 257043)

>> +++ gcc/testsuite/g++.dg/torture/pr77745.C	(working copy)

>> @@ -2,7 +2,7 @@

>>  

>>  inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }

>>  

>> -long foo(char *c1, char *c2)

>> +long __attribute__((noinline)) foo(char *c1, char *c2)

>>  {

>>    long *p1 = new (c1) long;

>>    *p1 = 100;
Jakub Jelinek Jan. 26, 2018, 10:46 a.m. | #3
On Thu, Jan 25, 2018 at 12:18:21PM +0100, Richard Biener wrote:
> 2018-01-25  Richard Biener  <rguenther@suse.de>

> 

> 	PR rtl-optimization/84003

> 	* dse.c (dse_step1): When removing redundant stores make sure

> 	to adjust the earlier stores alias-set to match semantics of

> 	the removed one and its own.

> 	(dse_step6): Likewise.

> 

> 	* g++.dg/torture/pr77745.C: Mark foo noinline to trigger

> 	latent bug in DSE.

> 

> Index: gcc/dse.c

> ===================================================================

> --- gcc/dse.c	(revision 257043)

> +++ gcc/dse.c	(working copy)

> @@ -2725,6 +2725,19 @@ dse_step1 (void)

>  					    "eliminated\n",

>  				 INSN_UID (ptr->insn),

>  				 INSN_UID (s_info->redundant_reason->insn));

> +		      /* If the later store we delete could have changed the

> +		         dynamic type of the memory make sure the one we

> +			 preserve serves as a store for all loads that could

> +			 validly have accessed the one we delete.  */

> +		      store_info *r_info = s_info->redundant_reason->store_rec;

> +		      while (r_info)

> +			{

> +			  if (r_info->is_set

> +			      && (MEM_ALIAS_SET (s_info->mem)

> +				  != MEM_ALIAS_SET (r_info->mem)))

> +			    set_mem_alias_set (r_info->mem, 0);


Is alias set 0 the only easily computable choice if there is a mismatch?
Also, isn't it fine if one of the alias set is a subset of the other one?

More importantly, I think set_mem_alias_set (r_info->mem, 0) can't work,
r_info->mem is a result of canon_rtx (SET_DEST (body)), so sometimes could
be the MEM that appears in the instruction, but at other times could be a
different pointer and changing that wouldn't change anything in the actual
instruction.  So, in order to do this you'd need to add probably another
field and record the original SET_DEST (body) record_store is called with.
Even that doesn't need to be something that really appears in the
instruction, but the exception in that case is the memset call and that
semantically has alias set of 0.

> --- gcc/testsuite/g++.dg/torture/pr77745.C	(revision 257043)

> +++ gcc/testsuite/g++.dg/torture/pr77745.C	(working copy)

> @@ -2,7 +2,7 @@

>  

>  inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }

>  

> -long foo(char *c1, char *c2)

> +long __attribute__((noinline)) foo(char *c1, char *c2)

>  {

>    long *p1 = new (c1) long;

>    *p1 = 100;


Is it desirable to modify an existing test, rather than say macroize this
and add pr77745-2.C that will define some macro and include this pr77745.C,
such that we cover both noinline and without noinline?

	Jakub
Richard Biener Jan. 26, 2018, 11:14 a.m. | #4
On Thu, 25 Jan 2018, Richard Sandiford wrote:

> Richard Sandiford <richard.sandiford@linaro.org> writes:

> > Richard Biener <rguenther@suse.de> writes:

> >> The following patch fixes PR84003 where RTL DSE removes a redundant

> >> store (a store storing the same value as an earlier store) but in

> >> doing this changing program semantics because the later store changes

> >> the effective type of the memory location.  This in turn allows

> >> sched2 to do an originally invalid transform.

> >>

> >> The fix is to harden DSE so that while removing the later store

> >> the earlier stores alias-sets are changed to reflect both dynamic

> >> types - which means using alias-set zero.

> >>

> >> On GIMPLE we simply avoid removing such type-changing stores because

> >> we cannot easily get hold on the earlier store.

> >>

> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

> >>

> >> Disclaimer: this is a "local" fix, I didn't try to understand much

> >> of the DSE data structures but only learned from +-10 lines around

> >> the affected transform which I found by searching for the dumped

> >> messages ...  Richard, you contributed the pass so please review.

> >

> > So the file says.  In reality I only wrote an early version, and what

> > was committed contains very little of that.  So TBH this pass is almost

> > a complete unknown to me.  That said...

> >

> > Might it be worth keeping the store instead?

> 

> ...that is, like gimple. :-)  Sorry, I spent so long thinking about this

> that I forgot you'd said that gimple already does the same thing.


Yeah, and it still runs into PR82224 ... which means it is not 
conservative enough.

> Richard

> 

> > Giving things alias set

> > zero seems like a pretty big hammer and would turn the remaining store

> > into something close to a scheduling barrier.

> >

> > IOW, how about checking alias_set_subset_of in:

> >

> > 	  /* Even if PTR won't be eliminated as unneeded, if both

> > 	     PTR and this insn store the same constant value, we might

> > 	     eliminate this insn instead.  */

> > 	  if (s_info->const_rhs

> > 	      && const_rhs

> > 	      && known_subrange_p (offset, width,

> > 				   s_info->offset, s_info->width)

> > 	      && all_positions_needed_p (s_info, offset - s_info->offset,

> > 					 width))

> >

> > instead?


That's what GIMPLE does (use alias_set_subset_of), but it arguably
doesn't work for unions (ok, there even the equality case doesn't 
work...).

I guess I thought it's worth trying sth new and adjust the earlier
store ;)  Sth that I can't easily do on GIMPLE but everything seemed
to be in place in RTL DSE.

So, when looking at the above code it seems like there are exactly
two insns we look at?  s_info and 'mem' I guess.  And s_info is
the earlier one.

So sth like

Index: gcc/dse.c
===================================================================
--- gcc/dse.c   (revision 257077)
+++ gcc/dse.c   (working copy)
@@ -1532,7 +1532,12 @@ record_store (rtx body, bb_info_t bb_inf
              && known_subrange_p (offset, width,
                                   s_info->offset, s_info->width)
              && all_positions_needed_p (s_info, offset - s_info->offset,
-                                        width))
+                                        width)
+             /* We can only remove the later store if the earlier aliases
+                at least all accesses the later one.  */
+             && (MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)
+                 || alias_set_subset_of (MEM_ALIAS_SET (mem),
+                                         MEM_ALIAS_SET (s_info->mem))))
            {
              if (GET_MODE (mem) == BLKmode)
                {

?  I can confirm that this patch works as well.  Is that what we prefer?
It would be consistent with what we do on GIMPLE currently (irrespective
of still existing bugs in this area...).

Note that with the above instead of dse1 removing the later store
dse2 now decides to remove the earlier one...  (which is fine!).

So I'm testing the above now, ok if it succeeds?

Thanks,
Richard.

> > Failing that:

> >

> >> 2018-01-25  Richard Biener  <rguenther@suse.de>

> >>

> >> 	PR rtl-optimization/84003

> >> 	* dse.c (dse_step1): When removing redundant stores make sure

> >> 	to adjust the earlier stores alias-set to match semantics of

> >> 	the removed one and its own.

> >> 	(dse_step6): Likewise.

> >>

> >> 	* g++.dg/torture/pr77745.C: Mark foo noinline to trigger

> >> 	latent bug in DSE.

> >>

> >> Index: gcc/dse.c

> >> ===================================================================

> >> --- gcc/dse.c	(revision 257043)

> >> +++ gcc/dse.c	(working copy)

> >> @@ -2725,6 +2725,19 @@ dse_step1 (void)

> >>  					    "eliminated\n",

> >>  				 INSN_UID (ptr->insn),

> >>  				 INSN_UID (s_info->redundant_reason->insn));

> >> +		      /* If the later store we delete could have changed the

> >> +		         dynamic type of the memory make sure the one we

> >> +			 preserve serves as a store for all loads that could

> >> +			 validly have accessed the one we delete.  */

> >> +		      store_info *r_info = s_info->redundant_reason->store_rec;

> >> +		      while (r_info)

> >> +			{

> >> +			  if (r_info->is_set

> >> +			      && (MEM_ALIAS_SET (s_info->mem)

> >> +				  != MEM_ALIAS_SET (r_info->mem)))

> >> +			    set_mem_alias_set (r_info->mem, 0);

> >> +			  r_info = r_info->next;

> >> +			}

> >>  		      delete_dead_store_insn (ptr);

> >>  		    }

> >>  		  free_store_info (ptr);

> >> @@ -3512,6 +3525,19 @@ dse_step6 (void)

> >>  					"eliminated\n",

> >>  					INSN_UID (insn_info->insn),

> >>  					INSN_UID (rinsn));

> >> +		  /* If the later store we delete could have changed the

> >> +		     dynamic type of the memory make sure the one we

> >> +		     preserve serves as a store for all loads that could

> >> +		     validly have accessed the one we delete.  */

> >> +		  store_info *r_info = s_info->redundant_reason->store_rec;

> >> +		  while (r_info)

> >> +		    {

> >> +		      if (r_info->is_set

> >> +			  && (MEM_ALIAS_SET (s_info->mem)

> >> +			      != MEM_ALIAS_SET (r_info->mem)))

> >> +			set_mem_alias_set (r_info->mem, 0);

> >> +		      r_info = r_info->next;

> >> +		    }

> >

> > I think this is subtle enough to be worth factoring out.  How about

> > checking alias_set_subset_of, rather than checking for equality?


The whole block is probably worth factoring out (I just duplicated into
some already duplicated code parts...).  Will do if we decide to go
with this variant.

Richard.

> > Thanks,

> > Richard

> >

> >>  		  delete_dead_store_insn (insn_info);

> >>  		}

> >>  	    }

> >> Index: gcc/testsuite/g++.dg/torture/pr77745.C

> >> ===================================================================

> >> --- gcc/testsuite/g++.dg/torture/pr77745.C	(revision 257043)

> >> +++ gcc/testsuite/g++.dg/torture/pr77745.C	(working copy)

> >> @@ -2,7 +2,7 @@

> >>  

> >>  inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }

> >>  

> >> -long foo(char *c1, char *c2)

> >> +long __attribute__((noinline)) foo(char *c1, char *c2)

> >>  {

> >>    long *p1 = new (c1) long;

> >>    *p1 = 100;

> 

> 


-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Richard Biener Jan. 26, 2018, 11:18 a.m. | #5
On Fri, 26 Jan 2018, Jakub Jelinek wrote:

> On Thu, Jan 25, 2018 at 12:18:21PM +0100, Richard Biener wrote:

> > 2018-01-25  Richard Biener  <rguenther@suse.de>

> > 

> > 	PR rtl-optimization/84003

> > 	* dse.c (dse_step1): When removing redundant stores make sure

> > 	to adjust the earlier stores alias-set to match semantics of

> > 	the removed one and its own.

> > 	(dse_step6): Likewise.

> > 

> > 	* g++.dg/torture/pr77745.C: Mark foo noinline to trigger

> > 	latent bug in DSE.

> > 

> > Index: gcc/dse.c

> > ===================================================================

> > --- gcc/dse.c	(revision 257043)

> > +++ gcc/dse.c	(working copy)

> > @@ -2725,6 +2725,19 @@ dse_step1 (void)

> >  					    "eliminated\n",

> >  				 INSN_UID (ptr->insn),

> >  				 INSN_UID (s_info->redundant_reason->insn));

> > +		      /* If the later store we delete could have changed the

> > +		         dynamic type of the memory make sure the one we

> > +			 preserve serves as a store for all loads that could

> > +			 validly have accessed the one we delete.  */

> > +		      store_info *r_info = s_info->redundant_reason->store_rec;

> > +		      while (r_info)

> > +			{

> > +			  if (r_info->is_set

> > +			      && (MEM_ALIAS_SET (s_info->mem)

> > +				  != MEM_ALIAS_SET (r_info->mem)))

> > +			    set_mem_alias_set (r_info->mem, 0);

> 

> Is alias set 0 the only easily computable choice if there is a mismatch?

> Also, isn't it fine if one of the alias set is a subset of the other one?

> 

> More importantly, I think set_mem_alias_set (r_info->mem, 0) can't work,

> r_info->mem is a result of canon_rtx (SET_DEST (body)), so sometimes could

> be the MEM that appears in the instruction, but at other times could be a

> different pointer and changing that wouldn't change anything in the actual

> instruction.  So, in order to do this you'd need to add probably another

> field and record the original SET_DEST (body) record_store is called with.


Uh, indeed.  See my mail in response to Richard which comes up with
an alternate patch avoiding this issue.

> Even that doesn't need to be something that really appears in the

> instruction, but the exception in that case is the memset call and that

> semantically has alias set of 0.

>

> > --- gcc/testsuite/g++.dg/torture/pr77745.C	(revision 257043)

> > +++ gcc/testsuite/g++.dg/torture/pr77745.C	(working copy)

> > @@ -2,7 +2,7 @@

> >  

> >  inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }

> >  

> > -long foo(char *c1, char *c2)

> > +long __attribute__((noinline)) foo(char *c1, char *c2)

> >  {

> >    long *p1 = new (c1) long;

> >    *p1 = 100;

> 

> Is it desirable to modify an existing test, rather than say macroize this

> and add pr77745-2.C that will define some macro and include this pr77745.C,

> such that we cover both noinline and without noinline?


Yeah, I'll do that.

Richard.
Richard Sandiford Jan. 26, 2018, 11:35 a.m. | #6
Richard Biener <rguenther@suse.de> writes:
> On Thu, 25 Jan 2018, Richard Sandiford wrote:

>

>> Richard Sandiford <richard.sandiford@linaro.org> writes:

>> > Richard Biener <rguenther@suse.de> writes:

>> >> The following patch fixes PR84003 where RTL DSE removes a redundant

>> >> store (a store storing the same value as an earlier store) but in

>> >> doing this changing program semantics because the later store changes

>> >> the effective type of the memory location.  This in turn allows

>> >> sched2 to do an originally invalid transform.

>> >>

>> >> The fix is to harden DSE so that while removing the later store

>> >> the earlier stores alias-sets are changed to reflect both dynamic

>> >> types - which means using alias-set zero.

>> >>

>> >> On GIMPLE we simply avoid removing such type-changing stores because

>> >> we cannot easily get hold on the earlier store.

>> >>

>> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

>> >>

>> >> Disclaimer: this is a "local" fix, I didn't try to understand much

>> >> of the DSE data structures but only learned from +-10 lines around

>> >> the affected transform which I found by searching for the dumped

>> >> messages ...  Richard, you contributed the pass so please review.

>> >

>> > So the file says.  In reality I only wrote an early version, and what

>> > was committed contains very little of that.  So TBH this pass is almost

>> > a complete unknown to me.  That said...

>> >

>> > Might it be worth keeping the store instead?

>> 

>> ...that is, like gimple. :-)  Sorry, I spent so long thinking about this

>> that I forgot you'd said that gimple already does the same thing.

>

> Yeah, and it still runs into PR82224 ... which means it is not 

> conservative enough.

>

>> Richard

>> 

>> > Giving things alias set

>> > zero seems like a pretty big hammer and would turn the remaining store

>> > into something close to a scheduling barrier.

>> >

>> > IOW, how about checking alias_set_subset_of in:

>> >

>> > 	  /* Even if PTR won't be eliminated as unneeded, if both

>> > 	     PTR and this insn store the same constant value, we might

>> > 	     eliminate this insn instead.  */

>> > 	  if (s_info->const_rhs

>> > 	      && const_rhs

>> > 	      && known_subrange_p (offset, width,

>> > 				   s_info->offset, s_info->width)

>> > 	      && all_positions_needed_p (s_info, offset - s_info->offset,

>> > 					 width))

>> >

>> > instead?

>

> That's what GIMPLE does (use alias_set_subset_of), but it arguably

> doesn't work for unions (ok, there even the equality case doesn't 

> work...).

>

> I guess I thought it's worth trying sth new and adjust the earlier

> store ;)  Sth that I can't easily do on GIMPLE but everything seemed

> to be in place in RTL DSE.

>

> So, when looking at the above code it seems like there are exactly

> two insns we look at?  s_info and 'mem' I guess.  And s_info is

> the earlier one.


Yeah.

> So sth like

>

> Index: gcc/dse.c

> ===================================================================

> --- gcc/dse.c   (revision 257077)

> +++ gcc/dse.c   (working copy)

> @@ -1532,7 +1532,12 @@ record_store (rtx body, bb_info_t bb_inf

>               && known_subrange_p (offset, width,

>                                    s_info->offset, s_info->width)

>               && all_positions_needed_p (s_info, offset - s_info->offset,

> -                                        width))

> +                                        width)

> +             /* We can only remove the later store if the earlier aliases

> +                at least all accesses the later one.  */

> +             && (MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)

> +                 || alias_set_subset_of (MEM_ALIAS_SET (mem),

> +                                         MEM_ALIAS_SET (s_info->mem))))

>             {

>               if (GET_MODE (mem) == BLKmode)

>                 {

>

> ?  I can confirm that this patch works as well.  Is that what we prefer?


Not sure I can call that one, but it seems safer (especially for
backports).

> It would be consistent with what we do on GIMPLE currently (irrespective

> of still existing bugs in this area...).

>

> Note that with the above instead of dse1 removing the later store

> dse2 now decides to remove the earlier one...  (which is fine!).

>

> So I'm testing the above now, ok if it succeeds?


LGTM (but I can't approve it).

I was wondering later... if the problem is that we have:

   I1: store X to Y with alias set S1
   I2: load from Y with alias set S1
   I3: store X to Y with alias set S2
   I4: load from Y with alias set S2

and removing I3 allows I4 to be scheduled before I1, what would happen
if I1 and I3 store different values (with no dse)?  Would something stop
the scheduler from ordering it as:

   I1 I3 I2 I4

where I2 would load the wrong value?  If so, why didn't that same thing
work here?

Thanks,
Richard
Richard Biener Jan. 26, 2018, 12:21 p.m. | #7
On Fri, 26 Jan 2018, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:

> > On Thu, 25 Jan 2018, Richard Sandiford wrote:

> >

> >> Richard Sandiford <richard.sandiford@linaro.org> writes:

> >> > Richard Biener <rguenther@suse.de> writes:

> >> >> The following patch fixes PR84003 where RTL DSE removes a redundant

> >> >> store (a store storing the same value as an earlier store) but in

> >> >> doing this changing program semantics because the later store changes

> >> >> the effective type of the memory location.  This in turn allows

> >> >> sched2 to do an originally invalid transform.

> >> >>

> >> >> The fix is to harden DSE so that while removing the later store

> >> >> the earlier stores alias-sets are changed to reflect both dynamic

> >> >> types - which means using alias-set zero.

> >> >>

> >> >> On GIMPLE we simply avoid removing such type-changing stores because

> >> >> we cannot easily get hold on the earlier store.

> >> >>

> >> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

> >> >>

> >> >> Disclaimer: this is a "local" fix, I didn't try to understand much

> >> >> of the DSE data structures but only learned from +-10 lines around

> >> >> the affected transform which I found by searching for the dumped

> >> >> messages ...  Richard, you contributed the pass so please review.

> >> >

> >> > So the file says.  In reality I only wrote an early version, and what

> >> > was committed contains very little of that.  So TBH this pass is almost

> >> > a complete unknown to me.  That said...

> >> >

> >> > Might it be worth keeping the store instead?

> >> 

> >> ...that is, like gimple. :-)  Sorry, I spent so long thinking about this

> >> that I forgot you'd said that gimple already does the same thing.

> >

> > Yeah, and it still runs into PR82224 ... which means it is not 

> > conservative enough.

> >

> >> Richard

> >> 

> >> > Giving things alias set

> >> > zero seems like a pretty big hammer and would turn the remaining store

> >> > into something close to a scheduling barrier.

> >> >

> >> > IOW, how about checking alias_set_subset_of in:

> >> >

> >> > 	  /* Even if PTR won't be eliminated as unneeded, if both

> >> > 	     PTR and this insn store the same constant value, we might

> >> > 	     eliminate this insn instead.  */

> >> > 	  if (s_info->const_rhs

> >> > 	      && const_rhs

> >> > 	      && known_subrange_p (offset, width,

> >> > 				   s_info->offset, s_info->width)

> >> > 	      && all_positions_needed_p (s_info, offset - s_info->offset,

> >> > 					 width))

> >> >

> >> > instead?

> >

> > That's what GIMPLE does (use alias_set_subset_of), but it arguably

> > doesn't work for unions (ok, there even the equality case doesn't 

> > work...).

> >

> > I guess I thought it's worth trying sth new and adjust the earlier

> > store ;)  Sth that I can't easily do on GIMPLE but everything seemed

> > to be in place in RTL DSE.

> >

> > So, when looking at the above code it seems like there are exactly

> > two insns we look at?  s_info and 'mem' I guess.  And s_info is

> > the earlier one.

> 

> Yeah.

> 

> > So sth like

> >

> > Index: gcc/dse.c

> > ===================================================================

> > --- gcc/dse.c   (revision 257077)

> > +++ gcc/dse.c   (working copy)

> > @@ -1532,7 +1532,12 @@ record_store (rtx body, bb_info_t bb_inf

> >               && known_subrange_p (offset, width,

> >                                    s_info->offset, s_info->width)

> >               && all_positions_needed_p (s_info, offset - s_info->offset,

> > -                                        width))

> > +                                        width)

> > +             /* We can only remove the later store if the earlier aliases

> > +                at least all accesses the later one.  */

> > +             && (MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)

> > +                 || alias_set_subset_of (MEM_ALIAS_SET (mem),

> > +                                         MEM_ALIAS_SET (s_info->mem))))

> >             {

> >               if (GET_MODE (mem) == BLKmode)

> >                 {

> >

> > ?  I can confirm that this patch works as well.  Is that what we prefer?

> 

> Not sure I can call that one, but it seems safer (especially for

> backports).

> 

> > It would be consistent with what we do on GIMPLE currently (irrespective

> > of still existing bugs in this area...).

> >

> > Note that with the above instead of dse1 removing the later store

> > dse2 now decides to remove the earlier one...  (which is fine!).

> >

> > So I'm testing the above now, ok if it succeeds?

> 

> LGTM (but I can't approve it).


I can self-approve ;)

> I was wondering later... if the problem is that we have:

> 

>    I1: store X to Y with alias set S1

>    I2: load from Y with alias set S1

>    I3: store X to Y with alias set S2

>    I4: load from Y with alias set S2

> 

> and removing I3 allows I4 to be scheduled before I1, what would happen

> if I1 and I3 store different values (with no dse)?  Would something stop

> the scheduler from ordering it as:

> 

>    I1 I3 I2 I4

> 

> where I2 would load the wrong value?  If so, why didn't that same thing

> work here?


The scheduler cannot use TBAA when moving a load down across a store so
it has to prove the addresses are different.

Richard.
Richard Biener Jan. 26, 2018, 12:23 p.m. | #8
On Fri, 26 Jan 2018, Richard Biener wrote:

> On Fri, 26 Jan 2018, Richard Sandiford wrote:

> 

> > Richard Biener <rguenther@suse.de> writes:

> > > On Thu, 25 Jan 2018, Richard Sandiford wrote:

> > >

> > >> Richard Sandiford <richard.sandiford@linaro.org> writes:

> > >> > Richard Biener <rguenther@suse.de> writes:

> > >> >> The following patch fixes PR84003 where RTL DSE removes a redundant

> > >> >> store (a store storing the same value as an earlier store) but in

> > >> >> doing this changing program semantics because the later store changes

> > >> >> the effective type of the memory location.  This in turn allows

> > >> >> sched2 to do an originally invalid transform.

> > >> >>

> > >> >> The fix is to harden DSE so that while removing the later store

> > >> >> the earlier stores alias-sets are changed to reflect both dynamic

> > >> >> types - which means using alias-set zero.

> > >> >>

> > >> >> On GIMPLE we simply avoid removing such type-changing stores because

> > >> >> we cannot easily get hold on the earlier store.

> > >> >>

> > >> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

> > >> >>

> > >> >> Disclaimer: this is a "local" fix, I didn't try to understand much

> > >> >> of the DSE data structures but only learned from +-10 lines around

> > >> >> the affected transform which I found by searching for the dumped

> > >> >> messages ...  Richard, you contributed the pass so please review.

> > >> >

> > >> > So the file says.  In reality I only wrote an early version, and what

> > >> > was committed contains very little of that.  So TBH this pass is almost

> > >> > a complete unknown to me.  That said...

> > >> >

> > >> > Might it be worth keeping the store instead?

> > >> 

> > >> ...that is, like gimple. :-)  Sorry, I spent so long thinking about this

> > >> that I forgot you'd said that gimple already does the same thing.

> > >

> > > Yeah, and it still runs into PR82224 ... which means it is not 

> > > conservative enough.

> > >

> > >> Richard

> > >> 

> > >> > Giving things alias set

> > >> > zero seems like a pretty big hammer and would turn the remaining store

> > >> > into something close to a scheduling barrier.

> > >> >

> > >> > IOW, how about checking alias_set_subset_of in:

> > >> >

> > >> > 	  /* Even if PTR won't be eliminated as unneeded, if both

> > >> > 	     PTR and this insn store the same constant value, we might

> > >> > 	     eliminate this insn instead.  */

> > >> > 	  if (s_info->const_rhs

> > >> > 	      && const_rhs

> > >> > 	      && known_subrange_p (offset, width,

> > >> > 				   s_info->offset, s_info->width)

> > >> > 	      && all_positions_needed_p (s_info, offset - s_info->offset,

> > >> > 					 width))

> > >> >

> > >> > instead?

> > >

> > > That's what GIMPLE does (use alias_set_subset_of), but it arguably

> > > doesn't work for unions (ok, there even the equality case doesn't 

> > > work...).

> > >

> > > I guess I thought it's worth trying sth new and adjust the earlier

> > > store ;)  Sth that I can't easily do on GIMPLE but everything seemed

> > > to be in place in RTL DSE.

> > >

> > > So, when looking at the above code it seems like there are exactly

> > > two insns we look at?  s_info and 'mem' I guess.  And s_info is

> > > the earlier one.

> > 

> > Yeah.

> > 

> > > So sth like

> > >

> > > Index: gcc/dse.c

> > > ===================================================================

> > > --- gcc/dse.c   (revision 257077)

> > > +++ gcc/dse.c   (working copy)

> > > @@ -1532,7 +1532,12 @@ record_store (rtx body, bb_info_t bb_inf

> > >               && known_subrange_p (offset, width,

> > >                                    s_info->offset, s_info->width)

> > >               && all_positions_needed_p (s_info, offset - s_info->offset,

> > > -                                        width))

> > > +                                        width)

> > > +             /* We can only remove the later store if the earlier aliases

> > > +                at least all accesses the later one.  */

> > > +             && (MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)

> > > +                 || alias_set_subset_of (MEM_ALIAS_SET (mem),

> > > +                                         MEM_ALIAS_SET (s_info->mem))))

> > >             {

> > >               if (GET_MODE (mem) == BLKmode)

> > >                 {

> > >

> > > ?  I can confirm that this patch works as well.  Is that what we prefer?

> > 

> > Not sure I can call that one, but it seems safer (especially for

> > backports).

> > 

> > > It would be consistent with what we do on GIMPLE currently (irrespective

> > > of still existing bugs in this area...).

> > >

> > > Note that with the above instead of dse1 removing the later store

> > > dse2 now decides to remove the earlier one...  (which is fine!).

> > >

> > > So I'm testing the above now, ok if it succeeds?

> > 

> > LGTM (but I can't approve it).

> 

> I can self-approve ;)

> 

> > I was wondering later... if the problem is that we have:

> > 

> >    I1: store X to Y with alias set S1

> >    I2: load from Y with alias set S1

> >    I3: store X to Y with alias set S2

> >    I4: load from Y with alias set S2

> > 

> > and removing I3 allows I4 to be scheduled before I1, what would happen

> > if I1 and I3 store different values (with no dse)?  Would something stop

> > the scheduler from ordering it as:

> > 

> >    I1 I3 I2 I4

> > 

> > where I2 would load the wrong value?  If so, why didn't that same thing

> > work here?

> 

> The scheduler cannot use TBAA when moving a load down across a store so

> it has to prove the addresses are different.


So only RAW dependences can be disambiguated using TBAA, not
WAW or WAR ones.  In GCC terms only true_dependence uses TBAA.

Richard.
Richard Biener Jan. 26, 2018, 2:49 p.m. | #9
On Fri, 26 Jan 2018, Richard Biener wrote:

> On Fri, 26 Jan 2018, Jakub Jelinek wrote:

> 

> > On Thu, Jan 25, 2018 at 12:18:21PM +0100, Richard Biener wrote:

> > > 2018-01-25  Richard Biener  <rguenther@suse.de>

> > > 

> > > 	PR rtl-optimization/84003

> > > 	* dse.c (dse_step1): When removing redundant stores make sure

> > > 	to adjust the earlier stores alias-set to match semantics of

> > > 	the removed one and its own.

> > > 	(dse_step6): Likewise.

> > > 

> > > 	* g++.dg/torture/pr77745.C: Mark foo noinline to trigger

> > > 	latent bug in DSE.

> > > 

> > > Index: gcc/dse.c

> > > ===================================================================

> > > --- gcc/dse.c	(revision 257043)

> > > +++ gcc/dse.c	(working copy)

> > > @@ -2725,6 +2725,19 @@ dse_step1 (void)

> > >  					    "eliminated\n",

> > >  				 INSN_UID (ptr->insn),

> > >  				 INSN_UID (s_info->redundant_reason->insn));

> > > +		      /* If the later store we delete could have changed the

> > > +		         dynamic type of the memory make sure the one we

> > > +			 preserve serves as a store for all loads that could

> > > +			 validly have accessed the one we delete.  */

> > > +		      store_info *r_info = s_info->redundant_reason->store_rec;

> > > +		      while (r_info)

> > > +			{

> > > +			  if (r_info->is_set

> > > +			      && (MEM_ALIAS_SET (s_info->mem)

> > > +				  != MEM_ALIAS_SET (r_info->mem)))

> > > +			    set_mem_alias_set (r_info->mem, 0);

> > 

> > Is alias set 0 the only easily computable choice if there is a mismatch?

> > Also, isn't it fine if one of the alias set is a subset of the other one?

> > 

> > More importantly, I think set_mem_alias_set (r_info->mem, 0) can't work,

> > r_info->mem is a result of canon_rtx (SET_DEST (body)), so sometimes could

> > be the MEM that appears in the instruction, but at other times could be a

> > different pointer and changing that wouldn't change anything in the actual

> > instruction.  So, in order to do this you'd need to add probably another

> > field and record the original SET_DEST (body) record_store is called with.

> 

> Uh, indeed.  See my mail in response to Richard which comes up with

> an alternate patch avoiding this issue.

> 

> > Even that doesn't need to be something that really appears in the

> > instruction, but the exception in that case is the memset call and that

> > semantically has alias set of 0.

> >

> > > --- gcc/testsuite/g++.dg/torture/pr77745.C	(revision 257043)

> > > +++ gcc/testsuite/g++.dg/torture/pr77745.C	(working copy)

> > > @@ -2,7 +2,7 @@

> > >  

> > >  inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }

> > >  

> > > -long foo(char *c1, char *c2)

> > > +long __attribute__((noinline)) foo(char *c1, char *c2)

> > >  {

> > >    long *p1 = new (c1) long;

> > >    *p1 = 100;

> > 

> > Is it desirable to modify an existing test, rather than say macroize this

> > and add pr77745-2.C that will define some macro and include this pr77745.C,

> > such that we cover both noinline and without noinline?

> 

> Yeah, I'll do that.


This is what I have applied, bootstrapped and tested on 
x86_64-unknown-linux-gnu.

Richard.

2018-01-26  Richard Biener  <rguenther@suse.de>

	PR rtl-optimization/84003
	* dse.c (record_store): Only record redundant stores when
	the earlier store aliases at least all accesses the later one does.

	* g++.dg/torture/pr77745.C: Mark foo noinline to trigger
	latent bug in DSE if NOINLINE is appropriately defined.
	* g++.dg/torture/pr77745-2.C: New testcase including pr77745.C
	and defining NOINLINE.

Index: gcc/dse.c
===================================================================
--- gcc/dse.c	(revision 257077)
+++ gcc/dse.c	(working copy)
@@ -1532,7 +1532,12 @@ record_store (rtx body, bb_info_t bb_inf
 	      && known_subrange_p (offset, width,
 				   s_info->offset, s_info->width)
 	      && all_positions_needed_p (s_info, offset - s_info->offset,
-					 width))
+					 width)
+	      /* We can only remove the later store if the earlier aliases
+		 at least all accesses the later one.  */
+	      && (MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)
+		  || alias_set_subset_of (MEM_ALIAS_SET (mem),
+					  MEM_ALIAS_SET (s_info->mem))))
 	    {
 	      if (GET_MODE (mem) == BLKmode)
 		{
Index: gcc/testsuite/g++.dg/torture/pr77745.C
===================================================================
--- gcc/testsuite/g++.dg/torture/pr77745.C	(revision 257080)
+++ gcc/testsuite/g++.dg/torture/pr77745.C	(working copy)
@@ -1,8 +1,12 @@
 // { dg-do run }
 
+#ifndef NOINLINE
+#define NOINLINE /* */
+#endif
+
 inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }
 
-long foo(char *c1, char *c2)
+long NOINLINE foo(char *c1, char *c2)
 {
   long *p1 = new (c1) long;
   *p1 = 100;
Index: gcc/testsuite/g++.dg/torture/pr77745-2.C
===================================================================
--- gcc/testsuite/g++.dg/torture/pr77745-2.C	(nonexistent)
+++ gcc/testsuite/g++.dg/torture/pr77745-2.C	(working copy)
@@ -0,0 +1,4 @@
+// { dg-do run }
+
+#define NOINLINE __attribute__((noinline))
+#include "pr77745.C"

Patch

Index: gcc/dse.c
===================================================================
--- gcc/dse.c	(revision 257043)
+++ gcc/dse.c	(working copy)
@@ -2725,6 +2725,19 @@  dse_step1 (void)
 					    "eliminated\n",
 				 INSN_UID (ptr->insn),
 				 INSN_UID (s_info->redundant_reason->insn));
+		      /* If the later store we delete could have changed the
+		         dynamic type of the memory make sure the one we
+			 preserve serves as a store for all loads that could
+			 validly have accessed the one we delete.  */
+		      store_info *r_info = s_info->redundant_reason->store_rec;
+		      while (r_info)
+			{
+			  if (r_info->is_set
+			      && (MEM_ALIAS_SET (s_info->mem)
+				  != MEM_ALIAS_SET (r_info->mem)))
+			    set_mem_alias_set (r_info->mem, 0);
+			  r_info = r_info->next;
+			}
 		      delete_dead_store_insn (ptr);
 		    }
 		  free_store_info (ptr);
@@ -3512,6 +3525,19 @@  dse_step6 (void)
 					"eliminated\n",
 					INSN_UID (insn_info->insn),
 					INSN_UID (rinsn));
+		  /* If the later store we delete could have changed the
+		     dynamic type of the memory make sure the one we
+		     preserve serves as a store for all loads that could
+		     validly have accessed the one we delete.  */
+		  store_info *r_info = s_info->redundant_reason->store_rec;
+		  while (r_info)
+		    {
+		      if (r_info->is_set
+			  && (MEM_ALIAS_SET (s_info->mem)
+			      != MEM_ALIAS_SET (r_info->mem)))
+			set_mem_alias_set (r_info->mem, 0);
+		      r_info = r_info->next;
+		    }
 		  delete_dead_store_insn (insn_info);
 		}
 	    }
Index: gcc/testsuite/g++.dg/torture/pr77745.C
===================================================================
--- gcc/testsuite/g++.dg/torture/pr77745.C	(revision 257043)
+++ gcc/testsuite/g++.dg/torture/pr77745.C	(working copy)
@@ -2,7 +2,7 @@ 
 
 inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }
 
-long foo(char *c1, char *c2)
+long __attribute__((noinline)) foo(char *c1, char *c2)
 {
   long *p1 = new (c1) long;
   *p1 = 100;