Check DECL_CONTEXT of new/delete operators.

Message ID af29444d-118a-4818-26ca-bbe468a587dd@suse.cz
State New
Headers show
Series
  • Check DECL_CONTEXT of new/delete operators.
Related show

Commit Message

Martin Liška March 30, 2020, 8:40 a.m.
Hi.

The patch ensures that a deleted new/delete pair has a same context.
That will fix the issue presented in the PR.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2020-03-30  Martin Liska  <mliska@suse.cz>

	PR c++/94314
	* tree-ssa-dce.c (propagate_necessity): Verify that
	DECL_CONTEXT of a new/delete operators do match.

gcc/testsuite/ChangeLog:

2020-03-30  Martin Liska  <mliska@suse.cz>

	PR c++/94314
	* g++.dg/pr94314.C: New test.
---
  gcc/testsuite/g++.dg/pr94314.C | 84 ++++++++++++++++++++++++++++++++++
  gcc/tree-ssa-dce.c             | 31 +++++++++----
  2 files changed, 105 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/pr94314.C

Comments

Iain Buclaw via Gcc-patches March 30, 2020, 8:53 a.m. | #1
On Mon, Mar 30, 2020 at 10:41 AM Martin Liška <mliska@suse.cz> wrote:
>

> Hi.

>

> The patch ensures that a deleted new/delete pair has a same context.

> That will fix the issue presented in the PR.

>

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.


I think this will break the DCE with LTO since we strip quite some
DECL_CONTEXT in free-lang-data.

Honza - do you remember why we strip DECL_CONTEXT for methods
which will most likely keep their containing record types live through
their this
arguments?  Quoting:

  /* We need to keep field decls associated with their trees. Otherwise tree
     merging may merge some fileds and keep others disjoint wich in turn will
     not do well with TREE_CHAIN pointers linking them.

     Also do not drop containing types for virtual methods and tables because
     these are needed by devirtualization.
     C++ destructors are special because C++ frontends sometimes produces
     virtual destructor as an alias of non-virtual destructor.  In
     devirutalization code we always walk through aliases and we need
     context to be preserved too.  See PR89335  */
  if (TREE_CODE (decl) != FIELD_DECL
      && ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL)
          || (!DECL_VIRTUAL_P (decl)
              && (TREE_CODE (decl) != FUNCTION_DECL
                  || !DECL_CXX_DESTRUCTOR_P (decl)))))
    DECL_CONTEXT (decl) = fld_decl_context (DECL_CONTEXT (decl));

and fld_decl_context stripping up to the next non-TYPE context (unless VLA).

Richard.

> Ready to be installed?

> Thanks,

> Martin

>

> gcc/ChangeLog:

>

> 2020-03-30  Martin Liska  <mliska@suse.cz>

>

>         PR c++/94314

>         * tree-ssa-dce.c (propagate_necessity): Verify that

>         DECL_CONTEXT of a new/delete operators do match.

>

> gcc/testsuite/ChangeLog:

>

> 2020-03-30  Martin Liska  <mliska@suse.cz>

>

>         PR c++/94314

>         * g++.dg/pr94314.C: New test.

> ---

>   gcc/testsuite/g++.dg/pr94314.C | 84 ++++++++++++++++++++++++++++++++++

>   gcc/tree-ssa-dce.c             | 31 +++++++++----

>   2 files changed, 105 insertions(+), 10 deletions(-)

>   create mode 100644 gcc/testsuite/g++.dg/pr94314.C

>

>
Marc Glisse March 30, 2020, 9:29 a.m. | #2
On Mon, 30 Mar 2020, Martin Liška wrote:

> The patch ensures that a deleted new/delete pair has a same context.

> That will fix the issue presented in the PR.


DECL_CONTEXT seems good for that example (assuming it is still available 
in the middle-end), but shouldn't we also check if both are array 
versions? I can build a similar example by implementing the global 
operator new[] in terms of operator new and operator delete[] in terms of 
operator delete. Checking if both are aligned versions could be nice as 
well. I don't know what others can be relevant, I think we are allowed to 
mix throw and nothrow, or sized and non-sized versions.

For DECL_CONTEXT, if we use new on a derived class and delete on a base 
class, I guess it is sensible not to optimize unless devirt / inlining 
manage to show a matched pair of operators, so your patch looks good here.

I understand this feature is getting rather painful, sorry...

-- 
Marc Glisse
Jan Hubicka March 31, 2020, 12:29 p.m. | #3
> On Mon, Mar 30, 2020 at 10:41 AM Martin Liška <mliska@suse.cz> wrote:

> >

> > Hi.

> >

> > The patch ensures that a deleted new/delete pair has a same context.

> > That will fix the issue presented in the PR.

> >

> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

> 

> I think this will break the DCE with LTO since we strip quite some

> DECL_CONTEXT in free-lang-data.

> 

> Honza - do you remember why we strip DECL_CONTEXT for methods

> which will most likely keep their containing record types live through

> their this

> arguments?  Quoting:

> 

>   /* We need to keep field decls associated with their trees. Otherwise tree

>      merging may merge some fileds and keep others disjoint wich in turn will

>      not do well with TREE_CHAIN pointers linking them.

> 

>      Also do not drop containing types for virtual methods and tables because

>      these are needed by devirtualization.

>      C++ destructors are special because C++ frontends sometimes produces

>      virtual destructor as an alias of non-virtual destructor.  In

>      devirutalization code we always walk through aliases and we need

>      context to be preserved too.  See PR89335  */

>   if (TREE_CODE (decl) != FIELD_DECL

>       && ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL)

>           || (!DECL_VIRTUAL_P (decl)

>               && (TREE_CODE (decl) != FUNCTION_DECL

>                   || !DECL_CXX_DESTRUCTOR_P (decl)))))

>     DECL_CONTEXT (decl) = fld_decl_context (DECL_CONTEXT (decl));

> 

> and fld_decl_context stripping up to the next non-TYPE context (unless VLA).


Well, I basically went through all pointers and tried to get rid of as
many of them as possible.  CONTEXT pointers do increase size of SCCs
that increases chance they will not get merged and also processing time
of merging algorithm.  I guess if we need to stream more contexts we
could do that (and check the effect on merging and average SCC size)

Honza
> 

> Richard.

> 

> > Ready to be installed?

> > Thanks,

> > Martin

> >

> > gcc/ChangeLog:

> >

> > 2020-03-30  Martin Liska  <mliska@suse.cz>

> >

> >         PR c++/94314

> >         * tree-ssa-dce.c (propagate_necessity): Verify that

> >         DECL_CONTEXT of a new/delete operators do match.

> >

> > gcc/testsuite/ChangeLog:

> >

> > 2020-03-30  Martin Liska  <mliska@suse.cz>

> >

> >         PR c++/94314

> >         * g++.dg/pr94314.C: New test.

> > ---

> >   gcc/testsuite/g++.dg/pr94314.C | 84 ++++++++++++++++++++++++++++++++++

> >   gcc/tree-ssa-dce.c             | 31 +++++++++----

> >   2 files changed, 105 insertions(+), 10 deletions(-)

> >   create mode 100644 gcc/testsuite/g++.dg/pr94314.C

> >

> >
Martin Liška March 31, 2020, 12:38 p.m. | #4
On 3/31/20 2:29 PM, Jan Hubicka wrote:
> Well, I basically went through all pointers and tried to get rid of as

> many of them as possible.  CONTEXT pointers do increase size of SCCs

> that increases chance they will not get merged and also processing time

> of merging algorithm.  I guess if we need to stream more contexts we

> could do that (and check the effect on merging and average SCC size)


Ok, do we want to stream contexts just for the new/delete operators?
Can you please prepare a streaming patch?

Thank you,
Martin
Jan Hubicka April 3, 2020, 3:26 p.m. | #5
> On 3/31/20 2:29 PM, Jan Hubicka wrote:

> > Well, I basically went through all pointers and tried to get rid of as

> > many of them as possible.  CONTEXT pointers do increase size of SCCs

> > that increases chance they will not get merged and also processing time

> > of merging algorithm.  I guess if we need to stream more contexts we

> > could do that (and check the effect on merging and average SCC size)

> 

> Ok, do we want to stream contexts just for the new/delete operators?

> Can you please prepare a streaming patch?

Hi,
I am still not convinced that context is useful here.
It took me a while to understand what the code does and why it fails,
but here is a testcase.
It fails for me with your patch and -O2 --param early-inlining-insns=100

The invalid transform is to remove pair
base:new and B:delete
B:new gets inlined and we get count out of sync.

Honza

#include <stdio.h>
volatile int idx;
struct base
{
  __attribute__((malloc,noinline))
  static void* operator new(unsigned long sz)
  {
    return ::operator new(sz);
  }

  __attribute__((malloc,noinline))
  static void operator delete(void* ptr)
  {
    --count[idx];
    ::operator delete(ptr);
  }
  volatile static int count[2];
};
volatile int base::count[2] = {0,0};
struct B:base
{
  static void* operator new(unsigned long sz)
  {
    ++count[idx];
    return base::operator new(sz);
  }

};


volatile int c=1;

int main(){
  for (int i; i<c;i++)
    {
      idx=0;
      delete new B;
      if (B::count[0] != 0)
	__builtin_abort ();
    }


  return 0;
}
Jan Hubicka April 3, 2020, 3:42 p.m. | #6
Hi,
and this is the streaming fix

diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index e4077b58890..dd9645723c1 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -646,6 +647,19 @@ degenerate_phi_p (gimple *phi)
   return true;
 }
 
+/* Return true if C1 and C2 are matching contexts (both translation unit decls
+   or both types.  */
+
+bool
+matching_contexts_p (tree c1, tree c2)
+{
+  if (TREE_CODE (c1) == TRANSLATION_UNIT_DECL)
+    return TREE_CODE (c2) == TRANSLATION_UNIT_DECL;
+  if (TREE_CODE (c2) == TRANSLATION_UNIT_DECL)
+    return TREE_CODE (c1) == TRANSLATION_UNIT_DECL;
+  return types_same_for_odr (c1, c2);
+}
+
 /* Propagate necessity using the operands of necessary statements.
    Process the uses on each statement in the worklist, and add all
    feeding statements which contribute to the calculation of this
@@ -824,16 +838,28 @@ propagate_necessity (bool aggressive)
 			   || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC))
 		      || DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee)))
 		{
-		  /* Delete operators can have alignment and (or) size as next
-		     arguments.  When being a SSA_NAME, they must be marked
-		     as necessary.  */
-		  if (is_delete_operator && gimple_call_num_args (stmt) >= 2)
-		    for (unsigned i = 1; i < gimple_call_num_args (stmt); i++)
-		      {
-			tree arg = gimple_call_arg (stmt, i);
-			if (TREE_CODE (arg) == SSA_NAME)
-			  mark_operand_necessary (arg);
-		      }
+		  if (is_delete_operator)
+		    {
+		      /* Verify that new and delete operators have the same
+			 context.  */
+		      if (DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee)
+			  && !matching_contexts_p
+				 (DECL_CONTEXT (def_callee),
+				  DECL_CONTEXT (gimple_call_fndecl (stmt))))
+			mark_operand_necessary (gimple_call_arg (stmt, 0));
+
+		      /* Delete operators can have alignment and (or) size
+			 as next arguments.  When being a SSA_NAME, they
+			 must be marked as necessary.  */
+		      if (gimple_call_num_args (stmt) >= 2)
+			for (unsigned i = 1; i < gimple_call_num_args (stmt);
+			     i++)
+			  {
+			    tree arg = gimple_call_arg (stmt, i);
+			    if (TREE_CODE (arg) == SSA_NAME)
+			      mark_operand_necessary (arg);
+			  }
+		    }
 
 		  continue;
 		}
diff --git a/gcc/tree.c b/gcc/tree.c
index 63dc6730b2b..cd608a9c878 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -5879,6 +5879,9 @@ free_lang_data_in_decl (tree decl, class free_lang_data_d *fld)
      merging may merge some fileds and keep others disjoint wich in turn will
      not do well with TREE_CHAIN pointers linking them.
 
+     tree-ssa-dce is using context to match new and delete operators (which may
+     be static functions).
+
      Also do not drop containing types for virtual methods and tables because
      these are needed by devirtualization.
      C++ destructors are special because C++ frontends sometimes produces
@@ -5886,6 +5889,9 @@ free_lang_data_in_decl (tree decl, class free_lang_data_d *fld)
      devirutalization code we always walk through aliases and we need
      context to be preserved too.  See PR89335  */
   if (TREE_CODE (decl) != FIELD_DECL
+      && (TREE_CODE (decl) != FUNCTION_DECL
+	  || (!DECL_IS_REPLACEABLE_OPERATOR_NEW_P (decl)
+	      && !DECL_IS_OPERATOR_DELETE_P (decl)))
       && ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL)
           || (!DECL_VIRTUAL_P (decl)
 	      && (TREE_CODE (decl) != FUNCTION_DECL
Jan Hubicka April 4, 2020, 11:53 a.m. | #7
Hi,
thinking a bit of the problem, I guess we could match in addition to
DECL_CONTEXT the whole inline stack of both statements and see if there
are inlined new/delete operators and if so if they are always in
matching pairs.

The inline stack is available as
for (tree block = gimple_block (call); block && TREE_CODE (block) == BLOCK; block = BLOCK_SUPERCONTEXT (block))
  {
    tree fn = block_ultimate_origin (block);
    if (fn != NULL && TREE_CODE (fn) == FUNCTION_DECL)
      do the checking htere.
  }

But I do not understand what C++ promises here and in what conditions
the new/delete pair can be removed.
Honza
Martin Liška April 6, 2020, 8:34 a.m. | #8
On 4/3/20 5:26 PM, Jan Hubicka wrote:
>> On 3/31/20 2:29 PM, Jan Hubicka wrote:

>>> Well, I basically went through all pointers and tried to get rid of as

>>> many of them as possible.  CONTEXT pointers do increase size of SCCs

>>> that increases chance they will not get merged and also processing time

>>> of merging algorithm.  I guess if we need to stream more contexts we

>>> could do that (and check the effect on merging and average SCC size)

>>

>> Ok, do we want to stream contexts just for the new/delete operators?

>> Can you please prepare a streaming patch?

> Hi,

> I am still not convinced that context is useful here.

> It took me a while to understand what the code does and why it fails,

> but here is a testcase.

> It fails for me with your patch and -O2 --param early-inlining-insns=100

> 

> The invalid transform is to remove pair

> base:new and B:delete

> B:new gets inlined and we get count out of sync.

> 

> Honza

> 

> #include <stdio.h>

> volatile int idx;

> struct base

> {

>    __attribute__((malloc,noinline))

>    static void* operator new(unsigned long sz)

>    {

>      return ::operator new(sz);

>    }

> 

>    __attribute__((malloc,noinline))

>    static void operator delete(void* ptr)

>    {

>      --count[idx];

>      ::operator delete(ptr);

>    }

>    volatile static int count[2];

> };

> volatile int base::count[2] = {0,0};

> struct B:base

> {

>    static void* operator new(unsigned long sz)

>    {

>      ++count[idx];

>      return base::operator new(sz);

>    }

> 

> };

> 

> 

> volatile int c=1;

> 

> int main(){

>    for (int i; i<c;i++)

>      {

>        idx=0;

>        delete new B;

>        if (B::count[0] != 0)

> 	__builtin_abort ();

>      }

> 

> 

>    return 0;

> }

> 


May I please ping Jason, Nathan and Jonathan who can help us here?

Thanks,
Martin
Iain Buclaw via Gcc-patches April 6, 2020, 9:27 a.m. | #9
On Sat, Apr 4, 2020 at 1:53 PM Jan Hubicka <hubicka@ucw.cz> wrote:
>

> Hi,

> thinking a bit of the problem, I guess we could match in addition to

> DECL_CONTEXT the whole inline stack of both statements and see if there

> are inlined new/delete operators and if so if they are always in

> matching pairs.

>

> The inline stack is available as

> for (tree block = gimple_block (call); block && TREE_CODE (block) == BLOCK; block = BLOCK_SUPERCONTEXT (block))

>   {

>     tree fn = block_ultimate_origin (block);

>     if (fn != NULL && TREE_CODE (fn) == FUNCTION_DECL)

>       do the checking htere.

>   }

>

> But I do not understand what C++ promises here and in what conditions

> the new/delete pair can be removed.


But if the inline stack matches in pairs then the ultimate new/delete
call should also
match, no?  When there's a mismatch in inlining we can't DCE since we
can't remove
the extra inlined stmts.

Your failing testcase suggests we never can remove new/delete pairs though
unless the DECL_CONTEXT is 'final'.  Also the user could have chosen to
"inline" the side-effect of the new operation manually but not the
delete one, so

operator delete() { count-- }

  ptr = new A;
  count++;
  delete ptr;

is it valid to elide the new/delete pair here?

Richard.

> Honza
Nathan Sidwell April 6, 2020, 12:45 p.m. | #10
On 4/6/20 4:34 AM, Martin Liška wrote:

> 

> May I please ping Jason, Nathan and Jonathan who can help us here?


On IRC Martin clarified the question as essentially 'how do you pair up 
operator new and operator delete calls?' (so you may delete them if the 
object is not used).

I am not sure you're permitted to remove those calls in general.  All I 
can find is [expr.new]/12
'An implementation is allowed to omit a call to a replaceable global 
allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage 
is instead provided by the implementation or provided by extending the 
allocation of another new-expression.'

But, I suspect the optimization is safe, in that either no one counts 
objects by their allocation, or if they do, they don't actually care 
that the std-conforming number of allocations happen.

The both operator new and operator delete are looked up in the same 
manner.  The std does not require a 'matching pair' be found.  but it is 
extremely poor form for a class to declare exactly one of operator 
{new,delete}.

The following is well formed:

struct PoorForm {
   void *operator new (size_t s) {count++; return ::operator new (s)};
   static unsigned count;
};

Have you considered throwing ctors?

struct Obj {
   Obj (); // might throw
};

'obj = new Obj (); ... delete obj;' sequence expand to something like ...

// obj = new Obj ();
void *p = ::operator new (sizeof (Obj));
try {
   Obj::ctor(p);
}
catch (...) // cleanup code
{
   ::operator delete (p); // #1
   throw;
}

obj = (Obj*)p;

.... user code

// delete obj;
Obj::dtor (obj);
::operator delete (obj); // #2

calls 1 & 2 matchup to the operator new call

Notice that para I quoted allows one to coalesce allocations using the 
global operator new/deletes.  The rules are pretty much as you can guess 
-- one lifetime must be entirely within the other.  If inner one's ctor 
throws, the exception path must destroy the outer.

does that help?

nathan

-- 
Nathan Sidwell
Iain Buclaw via Gcc-patches April 6, 2020, 3:10 p.m. | #11
On Mon, Apr 6, 2020 at 5:27 AM Richard Biener via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Sat, Apr 4, 2020 at 1:53 PM Jan Hubicka <hubicka@ucw.cz> wrote:

> >

> > Hi,

> > thinking a bit of the problem, I guess we could match in addition to

> > DECL_CONTEXT the whole inline stack of both statements and see if there

> > are inlined new/delete operators and if so if they are always in

> > matching pairs.

> >

> > The inline stack is available as

> > for (tree block = gimple_block (call); block && TREE_CODE (block) ==

> BLOCK; block = BLOCK_SUPERCONTEXT (block))

> >   {

> >     tree fn = block_ultimate_origin (block);

> >     if (fn != NULL && TREE_CODE (fn) == FUNCTION_DECL)

> >       do the checking htere.

> >   }

> >

> > But I do not understand what C++ promises here and in what conditions

> > the new/delete pair can be removed.

>

> But if the inline stack matches in pairs then the ultimate new/delete

> call should also

> match, no?  When there's a mismatch in inlining we can't DCE since we

> can't remove

> the extra inlined stmts.

>

> Your failing testcase suggests we never can remove new/delete pairs though

> unless the DECL_CONTEXT is 'final'.  Also the user could have chosen to

> "inline" the side-effect of the new operation manually but not the

> delete one, so

>

> operator delete() { count-- }

>

>   ptr = new A;

>   count++;

>   delete ptr;

>

> is it valid to elide the new/delete pair here?

>


The C++ constraints are (deliberately, I think) vague.  As Nathan quotes,
it just says that a call to a replaceable operator new can be omitted, and
that if it is, the matching delete-expression should not call operator
delete.  This example seems to demonstrate that we should also only
consider the replaceable delete operators, not any operator delete.

Jason
Iain Buclaw via Gcc-patches April 7, 2020, 8:26 a.m. | #12
On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell <nathan@acm.org> wrote:
>

> On 4/6/20 4:34 AM, Martin Liška wrote:

>

> >

> > May I please ping Jason, Nathan and Jonathan who can help us here?

>

> On IRC Martin clarified the question as essentially 'how do you pair up

> operator new and operator delete calls?' (so you may delete them if the

> object is not used).

>

> I am not sure you're permitted to remove those calls in general.  All I

> can find is [expr.new]/12

> 'An implementation is allowed to omit a call to a replaceable global

> allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage

> is instead provided by the implementation or provided by extending the

> allocation of another new-expression.'


At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith
summarised the rules as "new-expressions, like std::allocator, may
obtain storage by calling 'operator new', but it's unspecified how
often it's called and with what arguments."

So if our optimisation is removing the calls to base::operator new and
base::operator delete, but not the B::operator new call, then it seems
to be working at the wrong level. It should be eliding any calls to
operator new and operator delete at the point of the new-expression
and delete-expression, not leaving one call to operator new present
and then removing another one (which leaves the call "partially
removed").
Iain Buclaw via Gcc-patches April 7, 2020, 9:29 a.m. | #13
On Tue, Apr 7, 2020 at 10:26 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>

> On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell <nathan@acm.org> wrote:

> >

> > On 4/6/20 4:34 AM, Martin Liška wrote:

> >

> > >

> > > May I please ping Jason, Nathan and Jonathan who can help us here?

> >

> > On IRC Martin clarified the question as essentially 'how do you pair up

> > operator new and operator delete calls?' (so you may delete them if the

> > object is not used).

> >

> > I am not sure you're permitted to remove those calls in general.  All I

> > can find is [expr.new]/12

> > 'An implementation is allowed to omit a call to a replaceable global

> > allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage

> > is instead provided by the implementation or provided by extending the

> > allocation of another new-expression.'

>

> At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith

> summarised the rules as "new-expressions, like std::allocator, may

> obtain storage by calling 'operator new', but it's unspecified how

> often it's called and with what arguments."

>

> So if our optimisation is removing the calls to base::operator new and

> base::operator delete, but not the B::operator new call, then it seems

> to be working at the wrong level. It should be eliding any calls to

> operator new and operator delete at the point of the new-expression

> and delete-expression, not leaving one call to operator new present

> and then removing another one (which leaves the call "partially

> removed").


Well, the question is how to identify "operator new and operator delete at the
point of the new-expression and delete-expression".  Currently we're
just matching up "is this a new/delete operator" and the dataflow of the
pointer.  In the PR it was suggested the actual called methods should have
the same DECL_CONTEXT.  Honza suggested the context should have the
"same" ODR type (or be global).

You make it sound it's much harder and the parser needs to build the
relation?  You also suggest the "validness" is only OK in the context
of std::allocator and based on the unspecifiedness of the number of
allocations from the standard library.  That would further suggest that
we need to mark the allocation/deallocation points somehow and _not_
base the optimization on the called new/delete "function" (maybe with
an exception of the global ::new and ::delete).

Richard.
Jan Hubicka April 7, 2020, 9:49 a.m. | #14
> 

> Well, the question is how to identify "operator new and operator delete at the

> point of the new-expression and delete-expression".  Currently we're

> just matching up "is this a new/delete operator" and the dataflow of the

> pointer.  In the PR it was suggested the actual called methods should have

> the same DECL_CONTEXT.  Honza suggested the context should have the

> "same" ODR type (or be global).

I was just arguing that comparing pointers to types does not make much
sence in LTO where types can get unshared :)
Since the classes have ODR name at least this problem can be solved by
using ODR name compare.

However the testcase I sent abuses the fact that if you inherit the
class you can rewrite only new operator.  In the inerited class
DECL_CONTEXT of delete will still point to the oriignal class.  This
means that you can mix new/delete pair from the original and inherited
class.

Honza
> 

> You make it sound it's much harder and the parser needs to build the

> relation?  You also suggest the "validness" is only OK in the context

> of std::allocator and based on the unspecifiedness of the number of

> allocations from the standard library.  That would further suggest that

> we need to mark the allocation/deallocation points somehow and _not_

> base the optimization on the called new/delete "function" (maybe with

> an exception of the global ::new and ::delete).

> 

> Richard.
Iain Buclaw via Gcc-patches April 7, 2020, 10:22 a.m. | #15
On Tue, Apr 7, 2020 at 11:49 AM Jan Hubicka <hubicka@ucw.cz> wrote:
>

> >

> > Well, the question is how to identify "operator new and operator delete at the

> > point of the new-expression and delete-expression".  Currently we're

> > just matching up "is this a new/delete operator" and the dataflow of the

> > pointer.  In the PR it was suggested the actual called methods should have

> > the same DECL_CONTEXT.  Honza suggested the context should have the

> > "same" ODR type (or be global).

> I was just arguing that comparing pointers to types does not make much

> sence in LTO where types can get unshared :)

> Since the classes have ODR name at least this problem can be solved by

> using ODR name compare.


Sure.

> However the testcase I sent abuses the fact that if you inherit the

> class you can rewrite only new operator.  In the inerited class

> DECL_CONTEXT of delete will still point to the oriignal class.  This

> means that you can mix new/delete pair from the original and inherited

> class.


Yeah, but we're searching for a correctness fix not for an optimality one ;)

It sounds matching up the pairs in the middle-end is impossible and thus
the FE has to do this match-up (or refrain from marking new/delete when
a matchup according to to be defined methods is not going to be enough).
And maybe that tracking has to be done on a per call level rather than
on a called function level.

Richard.

> Honza

> >

> > You make it sound it's much harder and the parser needs to build the

> > relation?  You also suggest the "validness" is only OK in the context

> > of std::allocator and based on the unspecifiedness of the number of

> > allocations from the standard library.  That would further suggest that

> > we need to mark the allocation/deallocation points somehow and _not_

> > base the optimization on the called new/delete "function" (maybe with

> > an exception of the global ::new and ::delete).

> >

> > Richard.
Martin Liška April 7, 2020, 10:42 a.m. | #16
On 4/7/20 12:22 PM, Richard Biener wrote:
> On Tue, Apr 7, 2020 at 11:49 AM Jan Hubicka <hubicka@ucw.cz> wrote:

>>

>>>

>>> Well, the question is how to identify "operator new and operator delete at the

>>> point of the new-expression and delete-expression".  Currently we're

>>> just matching up "is this a new/delete operator" and the dataflow of the

>>> pointer.  In the PR it was suggested the actual called methods should have

>>> the same DECL_CONTEXT.  Honza suggested the context should have the

>>> "same" ODR type (or be global).

>> I was just arguing that comparing pointers to types does not make much

>> sence in LTO where types can get unshared :)

>> Since the classes have ODR name at least this problem can be solved by

>> using ODR name compare.

> 

> Sure.

> 

>> However the testcase I sent abuses the fact that if you inherit the

>> class you can rewrite only new operator.  In the inerited class

>> DECL_CONTEXT of delete will still point to the oriignal class.  This

>> means that you can mix new/delete pair from the original and inherited

>> class.

> 

> Yeah, but we're searching for a correctness fix not for an optimality one ;)

> 

> It sounds matching up the pairs in the middle-end is impossible and thus

> the FE has to do this match-up (or refrain from marking new/delete when

> a matchup according to to be defined methods is not going to be enough).

> And maybe that tracking has to be done on a per call level rather than

> on a called function level.


Based on what was said here, I see a proper matching implementation quite expensive
from implementation point of point. Moreover, the optimization itself has quite
low impact and so I suggest to only remove matching replaceable operators (1-12)
from [1], which are the top-level ones.

I'll come up with DECL_IS_REPLACEABLE_OPERATOR_DELETE_P and fix

#define DECL_IS_REPLACEABLE_OPERATOR_NEW_P(NODE) \
   (DECL_IS_OPERATOR_NEW_P (NODE) && DECL_IS_MALLOC (NODE))

which is not correct. It also matches

struct A
{
   __attribute__((noinline))
   static void* operator new(unsigned long sz)
   {
     ++count;
     return ::operator new(sz);
   }

where DECL_IS_MALLOC is discovered by local IPA passes.

Hope it's viable solution?
Thanks,
Martin

[1] https://en.cppreference.com/w/cpp/memory/new/operator_delete

> 

> Richard.

> 

>> Honza

>>>

>>> You make it sound it's much harder and the parser needs to build the

>>> relation?  You also suggest the "validness" is only OK in the context

>>> of std::allocator and based on the unspecifiedness of the number of

>>> allocations from the standard library.  That would further suggest that

>>> we need to mark the allocation/deallocation points somehow and _not_

>>> base the optimization on the called new/delete "function" (maybe with

>>> an exception of the global ::new and ::delete).

>>>

>>> Richard.
Martin Liška April 7, 2020, 10:46 a.m. | #17
On 4/6/20 2:45 PM, Nathan Sidwell wrote:
> On 4/6/20 4:34 AM, Martin Liška wrote:

> 

>>

>> May I please ping Jason, Nathan and Jonathan who can help us here?

> 

> On IRC Martin clarified the question as essentially 'how do you pair up operator new and operator delete calls?' (so you may delete them if the object is not used).

> 

> I am not sure you're permitted to remove those calls in general.  All I can find is [expr.new]/12

> 'An implementation is allowed to omit a call to a replaceable global allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression.'

> 

> But, I suspect the optimization is safe, in that either no one counts objects by their allocation, or if they do, they don't actually care that the std-conforming number of allocations happen.

> 

> The both operator new and operator delete are looked up in the same manner.  The std does not require a 'matching pair' be found.  but it is extremely poor form for a class to declare exactly one of operator {new,delete}.


Hi.

Thank you for the information.

> 

> The following is well formed:

> 

> struct PoorForm {

>    void *operator new (size_t s) {count++; return ::operator new (s)};

>    static unsigned count;

> };

> 

> Have you considered throwing ctors?

> 

> struct Obj {

>    Obj (); // might throw

> };

> 

> 'obj = new Obj (); ... delete obj;' sequence expand to something like ...

> 

> // obj = new Obj ();

> void *p = ::operator new (sizeof (Obj));

> try {

>    Obj::ctor(p);

> }

> catch (...) // cleanup code

> {

>    ::operator delete (p); // #1

>    throw;

> }

> 

> obj = (Obj*)p;

> 

> .... user code

> 

> // delete obj;

> Obj::dtor (obj);

> ::operator delete (obj); // #2

> 

> calls 1 & 2 matchup to the operator new call


Looking at the throwing ctors:

$ cat new-delete2.C
#include <stdio.h>

struct A
{
   __attribute__((always_inline)) A(int x)
   {
     if (x == 123)
       throw x;
   }
};

int main(int argc, char **argv) {
   A *a = new A (argc);
   delete a;

   return 0;
}

$ g++-9 new-delete2.C -O2 -c -fdump-tree-optimized=/dev/stdout
...
   <bb 2> [local count: 1073741824]:
   _3 = operator new (1);
   if (argc_4(D) == 123)
     goto <bb 3>; [0.00%]
   else
     goto <bb 4>; [100.00%]

   <bb 3> [count: 0]:
   _8 = __cxa_allocate_exception (4);
   MEM[(int *)_8] = 123;
   __cxa_throw (_8, &_ZTIi, 0B);

   <bb 4> [local count: 1073741824]:
   operator delete (_3, 1);
   return 0;
...

As seen cddce can still optimize out
   _3 = operator new (1);
...
   operator delete (_3, 1);

Martin

> 

> Notice that para I quoted allows one to coalesce allocations using the global operator new/deletes.  The rules are pretty much as you can guess -- one lifetime must be entirely within the other.  If inner one's ctor throws, the exception path must destroy the outer.

> 

> does that help?

> 

> nathan

>
Iain Buclaw via Gcc-patches April 7, 2020, 11:29 a.m. | #18
On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:
> The both operator new and operator delete are looked up in the same

> manner.  The std does not require a 'matching pair' be found.  but it is

> extremely poor form for a class to declare exactly one of operator

> {new,delete}.


There are unfortunately several such example in the standard!

I wonder how much benefit we will really get from trying to make this
optimisation too general.

Just eliding (or coalescing) implicit calls to ::operator new(size_t)
and ::operator delete(void*, size_t) (and the [] and align_val_t forms
of those) probably covers 99% of real cases.
Iain Buclaw via Gcc-patches April 7, 2020, 11:40 a.m. | #19
On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>

> On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:

> > The both operator new and operator delete are looked up in the same

> > manner.  The std does not require a 'matching pair' be found.  but it is

> > extremely poor form for a class to declare exactly one of operator

> > {new,delete}.

>

> There are unfortunately several such example in the standard!

>

> I wonder how much benefit we will really get from trying to make this

> optimisation too general.

>

> Just eliding (or coalescing) implicit calls to ::operator new(size_t)

> and ::operator delete(void*, size_t) (and the [] and align_val_t forms

> of those) probably covers 99% of real cases.


IIRC the only reason to have the optimization was to fully elide
STL containers when possible.  That brings in allocators and
thus non ::new/::delete allocations.  Which then suggests that
our standard library implementation could annotate allocation
points somehow.

Richard.
Iain Buclaw via Gcc-patches April 7, 2020, 11:41 a.m. | #20
On Tue, 7 Apr 2020 at 10:29, Richard Biener <richard.guenther@gmail.com> wrote:
>

> On Tue, Apr 7, 2020 at 10:26 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> >

> > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell <nathan@acm.org> wrote:

> > >

> > > On 4/6/20 4:34 AM, Martin Liška wrote:

> > >

> > > >

> > > > May I please ping Jason, Nathan and Jonathan who can help us here?

> > >

> > > On IRC Martin clarified the question as essentially 'how do you pair up

> > > operator new and operator delete calls?' (so you may delete them if the

> > > object is not used).

> > >

> > > I am not sure you're permitted to remove those calls in general.  All I

> > > can find is [expr.new]/12

> > > 'An implementation is allowed to omit a call to a replaceable global

> > > allocation function (17.6.2.1, 17.6.2.2). When it does so, the storage

> > > is instead provided by the implementation or provided by extending the

> > > allocation of another new-expression.'

> >

> > At https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94295#c6 Richard Smith

> > summarised the rules as "new-expressions, like std::allocator, may

> > obtain storage by calling 'operator new', but it's unspecified how

> > often it's called and with what arguments."

> >

> > So if our optimisation is removing the calls to base::operator new and

> > base::operator delete, but not the B::operator new call, then it seems

> > to be working at the wrong level. It should be eliding any calls to

> > operator new and operator delete at the point of the new-expression

> > and delete-expression, not leaving one call to operator new present

> > and then removing another one (which leaves the call "partially

> > removed").

>

> Well, the question is how to identify "operator new and operator delete at the

> point of the new-expression and delete-expression".  Currently we're

> just matching up "is this a new/delete operator" and the dataflow of the

> pointer.  In the PR it was suggested the actual called methods should have

> the same DECL_CONTEXT.  Honza suggested the context should have the

> "same" ODR type (or be global).

>

> You make it sound it's much harder and the parser needs to build the

> relation?  You also suggest the "validness" is only OK in the context

> of std::allocator and based on the unspecifiedness of the number of

> allocations from the standard library.


I don't think Richard's summary or my paraphrasing intends to say it
only applies to std::allocator.
Iain Buclaw via Gcc-patches April 7, 2020, 11:46 a.m. | #21
On Tue, 7 Apr 2020 at 12:40, Richard Biener <richard.guenther@gmail.com> wrote:
>

> On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> >

> > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:

> > > The both operator new and operator delete are looked up in the same

> > > manner.  The std does not require a 'matching pair' be found.  but it is

> > > extremely poor form for a class to declare exactly one of operator

> > > {new,delete}.

> >

> > There are unfortunately several such example in the standard!

> >

> > I wonder how much benefit we will really get from trying to make this

> > optimisation too general.

> >

> > Just eliding (or coalescing) implicit calls to ::operator new(size_t)

> > and ::operator delete(void*, size_t) (and the [] and align_val_t forms

> > of those) probably covers 99% of real cases.

>

> IIRC the only reason to have the optimization was to fully elide

> STL containers when possible.  That brings in allocators and

> thus non ::new/::delete allocations.


But the vast majority of containers are used with std::allocator, and
we control that.

Wouldn't it be simpler to add __builtin_operator_new and
__builtin_operator_delete like clang has, then make std::allocator use
those, and limit the optimizations to uses of those built-ins?
Iain Buclaw via Gcc-patches April 7, 2020, 11:57 a.m. | #22
On Tue, Apr 7, 2020 at 1:46 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>

> On Tue, 7 Apr 2020 at 12:40, Richard Biener <richard.guenther@gmail.com> wrote:

> >

> > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> > >

> > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:

> > > > The both operator new and operator delete are looked up in the same

> > > > manner.  The std does not require a 'matching pair' be found.  but it is

> > > > extremely poor form for a class to declare exactly one of operator

> > > > {new,delete}.

> > >

> > > There are unfortunately several such example in the standard!

> > >

> > > I wonder how much benefit we will really get from trying to make this

> > > optimisation too general.

> > >

> > > Just eliding (or coalescing) implicit calls to ::operator new(size_t)

> > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms

> > > of those) probably covers 99% of real cases.

> >

> > IIRC the only reason to have the optimization was to fully elide

> > STL containers when possible.  That brings in allocators and

> > thus non ::new/::delete allocations.

>

> But the vast majority of containers are used with std::allocator, and

> we control that.

>

> Wouldn't it be simpler to add __builtin_operator_new and

> __builtin_operator_delete like clang has, then make std::allocator use

> those, and limit the optimizations to uses of those built-ins?


Sure, that's a viable implementation strathegy.  Another might be

void delete (void *) __attribute__((matching_new(somewhere::new)));

and thus allow the user to relate a new/delete pair (here the FE would
do lookup for 'new' and record for example the mangled assembler name).

That is, we usually try to design a mechanism that's more broadly usable.
But yes, we match malloc/free so matching ::new/::delete by aliasing them
to __builtin_operator_new and __builtin_operator_delete is fair enough.

Not easily usable by other languages with custom allocation though
(fortran allocate/deallocate anyone? that's currently inlined to expose
underlying malloc/free calls for similar reasons)

Richard.
Nathan Sidwell April 7, 2020, 2:11 p.m. | #23
On 4/7/20 7:29 AM, Jonathan Wakely wrote:
> On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:

>> The both operator new and operator delete are looked up in the same

>> manner.  The std does not require a 'matching pair' be found.  but it is

>> extremely poor form for a class to declare exactly one of operator

>> {new,delete}.

> 

> There are unfortunately several such example in the standard!


Pah!

I also realized one could write:

struct Derived : Base {
   void *operator new (size_t t) { ... }
   using Base::operator delete;
};

which is not such poor code, but the FE will generate calls that 
completely lose the usingness of the operator delete.

As RichardB comments, I think the FE needs to mark sets of new/delete 
calls for the middle end to safely optimize. but ...

> 

> I wonder how much benefit we will really get from trying to make this

> optimisation too general.

> 

> Just eliding (or coalescing) implicit calls to ::operator new(size_t)

> and ::operator delete(void*, size_t) (and the [] and align_val_t forms

> of those) probably covers 99% of real cases.


I agree.  It's certainly a simpler problem and the major component of 
any win we'll get.

nathan

-- 
Nathan Sidwell
Iain Buclaw via Gcc-patches April 7, 2020, 3:16 p.m. | #24
On Tue, 7 Apr 2020 at 12:57, Richard Biener <richard.guenther@gmail.com> wrote:
>

> On Tue, Apr 7, 2020 at 1:46 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> >

> > On Tue, 7 Apr 2020 at 12:40, Richard Biener <richard.guenther@gmail.com> wrote:

> > >

> > > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> > > >

> > > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:

> > > > > The both operator new and operator delete are looked up in the same

> > > > > manner.  The std does not require a 'matching pair' be found.  but it is

> > > > > extremely poor form for a class to declare exactly one of operator

> > > > > {new,delete}.

> > > >

> > > > There are unfortunately several such example in the standard!

> > > >

> > > > I wonder how much benefit we will really get from trying to make this

> > > > optimisation too general.

> > > >

> > > > Just eliding (or coalescing) implicit calls to ::operator new(size_t)

> > > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms

> > > > of those) probably covers 99% of real cases.

> > >

> > > IIRC the only reason to have the optimization was to fully elide

> > > STL containers when possible.  That brings in allocators and

> > > thus non ::new/::delete allocations.

> >

> > But the vast majority of containers are used with std::allocator, and

> > we control that.

> >

> > Wouldn't it be simpler to add __builtin_operator_new and

> > __builtin_operator_delete like clang has, then make std::allocator use

> > those, and limit the optimizations to uses of those built-ins?

>

> Sure, that's a viable implementation strathegy.  Another might be

>

> void delete (void *) __attribute__((matching_new(somewhere::new)));

>

> and thus allow the user to relate a new/delete pair (here the FE would

> do lookup for 'new' and record for example the mangled assembler name).


Does that attribute tell us anything useful?

Given a pointer obtained from new and passed to delete, can't we
assume they are a matching pair? If not, the behaviour would be
undefined anyway.

> That is, we usually try to design a mechanism that's more broadly usable.

> But yes, we match malloc/free so matching ::new/::delete by aliasing them

> to __builtin_operator_new and __builtin_operator_delete is fair enough.


Would it make sense to annotate the actual calls to the
allocation/deallocation functions, instead of the declarations of
those functions?

According to Richard Smith, we don't want to optimise away 'operator
delete(operator new(1), 1)' because that's an explicit call, and the
user might have replaced those functions and be relying on the side
effects. But we can optimise away 'delete new char' and we can
optimise away std::allocator<char>().deallocate(std::allocator<char>().allocate(1),
1). So what matters is not whether the functions being called match up
(they have to anyway) or which functions are being called. What
matters is whether they are called implicitly (by a new-expression or
by std::allocator).

So when the compiler expands 'new T' into a call to operator new
followed by constructing a T (plus exception handling) the call to
operator new would be marked as optimisable by the FE, and likewise
when the compiler expands 'delete p' into a destructor followed by
calling operator delete, the call to operator delete would be marked
as optimisable. If a pointer is allocated by a call marked optimisable
and deallocated by a call marked optimisable, it can be optimised away
(or coalesced with another optimisation).

Then for std::allocator we just want to be able to mark the explicit
calls to operator new and operator delete as "optimisable", as the FE
does for the implicit calls it generates. So if we want a general
purpose utility, it would be an attribute to mark /calls/ as
optimisable, not functions.

Adding __builtin_operator_new would just be a different syntax for
"call operator new but mark it optimisable".

N.B. I am not a compiler dev and might be talking nonsense :-)
Iain Buclaw via Gcc-patches April 8, 2020, 7:34 a.m. | #25
On Tue, Apr 7, 2020 at 5:17 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>

> On Tue, 7 Apr 2020 at 12:57, Richard Biener <richard.guenther@gmail.com> wrote:

> >

> > On Tue, Apr 7, 2020 at 1:46 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> > >

> > > On Tue, 7 Apr 2020 at 12:40, Richard Biener <richard.guenther@gmail.com> wrote:

> > > >

> > > > On Tue, Apr 7, 2020 at 1:30 PM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:

> > > > >

> > > > > On Mon, 6 Apr 2020 at 13:45, Nathan Sidwell wrote:

> > > > > > The both operator new and operator delete are looked up in the same

> > > > > > manner.  The std does not require a 'matching pair' be found.  but it is

> > > > > > extremely poor form for a class to declare exactly one of operator

> > > > > > {new,delete}.

> > > > >

> > > > > There are unfortunately several such example in the standard!

> > > > >

> > > > > I wonder how much benefit we will really get from trying to make this

> > > > > optimisation too general.

> > > > >

> > > > > Just eliding (or coalescing) implicit calls to ::operator new(size_t)

> > > > > and ::operator delete(void*, size_t) (and the [] and align_val_t forms

> > > > > of those) probably covers 99% of real cases.

> > > >

> > > > IIRC the only reason to have the optimization was to fully elide

> > > > STL containers when possible.  That brings in allocators and

> > > > thus non ::new/::delete allocations.

> > >

> > > But the vast majority of containers are used with std::allocator, and

> > > we control that.

> > >

> > > Wouldn't it be simpler to add __builtin_operator_new and

> > > __builtin_operator_delete like clang has, then make std::allocator use

> > > those, and limit the optimizations to uses of those built-ins?

> >

> > Sure, that's a viable implementation strathegy.  Another might be

> >

> > void delete (void *) __attribute__((matching_new(somewhere::new)));

> >

> > and thus allow the user to relate a new/delete pair (here the FE would

> > do lookup for 'new' and record for example the mangled assembler name).

>

> Does that attribute tell us anything useful?

>

> Given a pointer obtained from new and passed to delete, can't we

> assume they are a matching pair? If not, the behaviour would be

> undefined anyway.


Further matching of new/delete came up in the context of inlining where
we might not be able to elide side-effects of new/delete "appropriately".
That's actually the case in the referenced PR.

> > That is, we usually try to design a mechanism that's more broadly usable.

> > But yes, we match malloc/free so matching ::new/::delete by aliasing them

> > to __builtin_operator_new and __builtin_operator_delete is fair enough.

>

> Would it make sense to annotate the actual calls to the

> allocation/deallocation functions, instead of the declarations of

> those functions?


Sure - I think that's ultimatively the way to go.

> According to Richard Smith, we don't want to optimise away 'operator

> delete(operator new(1), 1)' because that's an explicit call, and the

> user might have replaced those functions and be relying on the side

> effects. But we can optimise away 'delete new char' and we can

> optimise away std::allocator<char>().deallocate(std::allocator<char>().allocate(1),

> 1). So what matters is not whether the functions being called match up

> (they have to anyway) or which functions are being called. What

> matters is whether they are called implicitly (by a new-expression or

> by std::allocator).


OK, I see.  So for the inline case we'd for example have

operator new() { return ::new ...; }

and

 delete new T;

where the generated and marked operator new is inlined we'd no longer
elide the pair because the operator new implementation had an explicit
call to ::new and this is what the optimization sees now.  Only when
the operator new implementation uses a new expression again we'd
see it as candidate pair and then run into the inline issue again
(the issue of the new operator implementation incrementing some counter
and the delete one decrementing it, us inlining one of both and then
eliding the pair, only eliding either the increment or the decrement
and thus retaining half of the overall side-effect).

> So when the compiler expands 'new T' into a call to operator new

> followed by constructing a T (plus exception handling) the call to

> operator new would be marked as optimisable by the FE, and likewise

> when the compiler expands 'delete p' into a destructor followed by

> calling operator delete, the call to operator delete would be marked

> as optimisable. If a pointer is allocated by a call marked optimisable

> and deallocated by a call marked optimisable, it can be optimised away

> (or coalesced with another optimisation).

>

> Then for std::allocator we just want to be able to mark the explicit

> calls to operator new and operator delete as "optimisable", as the FE

> does for the implicit calls it generates. So if we want a general

> purpose utility, it would be an attribute to mark /calls/ as

> optimisable, not functions.

>

> Adding __builtin_operator_new would just be a different syntax for

> "call operator new but mark it optimisable".


Ah, so __builtin_operator_new isn't a function but an alternate
new expression syntax?

Richard.

> N.B. I am not a compiler dev and might be talking nonsense :-)
Iain Buclaw via Gcc-patches April 8, 2020, 8:11 a.m. | #26
On Wed, 8 Apr 2020 at 08:35, Richard Biener wrote:
> Ah, so __builtin_operator_new isn't a function but an alternate

> new expression syntax?


No, not a new-expression, because a new-expression calls operator new
to obtain storage *and* begins the lifetime of one or more objects in
that storage. __builtin_operator_new is just the first part, i.e. the
operator new(...) call. But because explicit calls to operator
new(...) are not supposed to be optimized, __builtin_operator_new is a
way of calling operator new(...) that tells the compiler "this isn't
an explicit call, you can optimise it". So a new-expression can use it
(because that needs to call operator new(...), but the call should be
optimisable) and std::allocator<T>::allocate(n) can use it (because
that call is also optimisable).

Patch

diff --git a/gcc/testsuite/g++.dg/pr94314.C b/gcc/testsuite/g++.dg/pr94314.C
new file mode 100644
index 00000000000..76cd9d8d2a4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr94314.C
@@ -0,0 +1,84 @@ 
+/* PR c++/94314.  */
+/* { dg-options "-O2 -fdump-tree-cddce-details" } */
+/* { dg-additional-options "-fdelete-null-pointer-checks" } */
+
+#include <stdio.h>
+
+struct A
+{
+  __attribute__((malloc,noinline))
+  static void* operator new(unsigned long sz)
+  {
+    ++count;
+    return ::operator new(sz);
+  }
+
+  static void operator delete(void* ptr)
+  {
+    --count;
+    ::operator delete(ptr);
+  }
+
+  static int count;
+};
+
+int A::count = 0;
+
+struct B
+{
+  __attribute__((malloc,noinline))
+  static void* operator new(unsigned long sz)
+  {
+    ++count;
+    return ::operator new(sz);
+  }
+
+  __attribute__((noinline))
+  static void operator delete(void* ptr)
+  {
+    --count;
+    ::operator delete(ptr);
+  }
+
+  static int count;
+};
+
+int B::count = 0;
+
+struct C
+{
+  static void* operator new(unsigned long sz)
+  {
+    ++count;
+    return ::operator new(sz);
+  }
+
+  static void operator delete(void* ptr)
+  {
+    --count;
+    ::operator delete(ptr);
+  }
+
+  static int count;
+};
+
+int C::count = 0;
+
+int main(){
+  delete new A;
+  if (A::count != 0)
+    __builtin_abort ();
+
+  delete new B;
+  if (B::count != 0)
+    __builtin_abort ();
+
+  delete new C;
+  if (C::count != 0)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Deleting : operator delete" 1 "cddce1"} } */
+/* { dg-final { scan-tree-dump-times "Deleting : B::operator delete" 1 "cddce1"} } */
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index e4077b58890..d86234ead23 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -824,16 +824,27 @@  propagate_necessity (bool aggressive)
 			   || DECL_FUNCTION_CODE (def_callee) == BUILT_IN_CALLOC))
 		      || DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee)))
 		{
-		  /* Delete operators can have alignment and (or) size as next
-		     arguments.  When being a SSA_NAME, they must be marked
-		     as necessary.  */
-		  if (is_delete_operator && gimple_call_num_args (stmt) >= 2)
-		    for (unsigned i = 1; i < gimple_call_num_args (stmt); i++)
-		      {
-			tree arg = gimple_call_arg (stmt, i);
-			if (TREE_CODE (arg) == SSA_NAME)
-			  mark_operand_necessary (arg);
-		      }
+		  if (is_delete_operator)
+		    {
+		      /* Verify that new and delete operators have the same
+			 context.  */
+		      if (DECL_IS_REPLACEABLE_OPERATOR_NEW_P (def_callee)
+			  && (DECL_CONTEXT (def_callee)
+			      != DECL_CONTEXT (gimple_call_fndecl (stmt))))
+			mark_operand_necessary (gimple_call_arg (stmt, 0));
+
+		      /* Delete operators can have alignment and (or) size
+			 as next arguments.  When being a SSA_NAME, they
+			 must be marked as necessary.  */
+		      if (gimple_call_num_args (stmt) >= 2)
+			for (unsigned i = 1; i < gimple_call_num_args (stmt);
+			     i++)
+			  {
+			    tree arg = gimple_call_arg (stmt, i);
+			    if (TREE_CODE (arg) == SSA_NAME)
+			      mark_operand_necessary (arg);
+			  }
+		    }
 
 		  continue;
 		}