[RFA/RFC] Improve DSE/aliasing around __builtin_object_size (PR89689, P2 regression)

Message ID 340e993bb94e3f8d42b3eaaa4fda63851db73638.camel@redhat.com
State New
Headers show
Series
  • [RFA/RFC] Improve DSE/aliasing around __builtin_object_size (PR89689, P2 regression)
Related show

Commit Message

Jeff Law Jan. 29, 2020, 2:54 a.m.
Here's two approaches to fix the regression in 89689.  Note this only
fixes the regression in the originally reported testcase.  THe deeper
issue Martin raises in C#1 is not addressed.  Thus if we go forward
with either patch ISTM that we would drop the regression markers, but
keep the BZ open.

So the key blocks in question as we enter DSE1:

> 


> ;;   basic block 2, loop depth 0, maybe hot

> ;;    prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)

> ;;    pred:       ENTRY (FALLTHRU,EXECUTABLE)

>   h = l;

>   h$data_33 = l.data;

>   if (h$data_33 != &__sb_slop)

>     goto <bb 3>; [70.00%]

>   else

>     goto <bb 4>; [30.00%]

> ;;    succ:       3 [70.0% (guessed)]  (TRUE_VALUE,EXECUTABLE)

> ;;                4 [30.0% (guessed)]  (FALSE_VALUE,EXECUTABLE)

> 

> ;;   basic block 3, loop depth 0, maybe hot

> ;;    prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)

> ;;    pred:       2 [70.0% (guessed)]  (TRUE_VALUE,EXECUTABLE)

>   *h$data_33 = 0;

> ;;    succ:       4 [always]  (FALLTHRU,EXECUTABLE)

> 

> ;;   basic block 4, loop depth 0, maybe hot

> ;;    prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)

> ;;    pred:       3 [always]  (FALLTHRU,EXECUTABLE)

> ;;                2 [30.0% (guessed)]  (FALSE_VALUE,EXECUTABLE)

>   _23 = __builtin_object_size (h$data_33, 0);

>   _25 = __builtin___memcpy_chk (h$data_33, "abcd", 4, _23);

>   __builtin_puts (h$data_33);

>   _6 = h$data_33 == &__sb_slop;

>   _7 = (int) _6;

>   printf ("%d\n", _7);

>   _9 = h$data_33 == &buf;

>   _10 = (int) _9;

>   printf ("%d\n", _10);

>   if (h$data_33 != &__sb_slop)

>     goto <bb 5>; [70.00%]

>   else

>     goto <bb 6>; [30.00%]

> 


So vrp1 is going to want to thread the jump at the end of bb4 which
requires duplicating bb4.  One version of bb4 will only be reachable
via the false edge from bb2.  That in turn exposes a cprop opportunity 
to replace h$data_33 with &__sbloop and the type of &__sbloop is a
char[1] array thus triggering the false positive warning.

We can avoid all that by realizing the store in bb3 is dead.  That in
turn makes the conditional at the end of bb2 pointless as bb3 is empty
and thus both arms ultimately transfer control to bb4 without any other
side effects meaning we can eliminate that conditional which in turn
eliminates the need for jump threading.

Again, this only fixes the original testcase and if there were other
statements in bb3 it wouldn't work and it won't work for the test in
c#1.  But the proposed changes are a general improvement.

DSE misses this case because ref_maybe_used_by_stmt_p returns true for
the virtual operand of the __builtin_object_size call.  Opps!

There's two easy ways to fix this, I've bootstrapped and regression
tested both.

First, the most conservative fix and perhaps the most appropriate for
gcc-10 -- special casing __builtin_object_size in tree-ssa-dse.c.

The second approach is more general and marks __builtin_object_size as
const rather than just pure.  Jakub and I kicked this around in IRC a
bit.  We've always marked __builtin_object_size as pure.  It was never
const.  THere is some concern that code motion might cause _b_o_s to
get different results, which would be particularly concerning for
_b_o_s types #1 and #3 which do use some contextual information. 
However, ISTM that motion is still going to be limited by the SSA
graph, though somewhat less because we wouldn't have a dependency on
the virtual operands.  So it *should* be safe, but there's some degree
of hesitation to make this change at this point in gcc-10's lifecycle.

Thoughts?  If we go forward obviously I'd add a testcase and ChangeLog
entries.

Jeff
commit f4e5d3f4755b6a5846ac20b53008b90131a8bb7c
Author: Jeff Law <law@torsion.usersys.redhat.com>
Date:   Tue Jan 28 18:16:09 2020 -0500

    Mark _b_o_s as constant

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 5ab842c34c2..fa8b0641ab1 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -972,7 +972,7 @@ DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
 DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq")
 
 /* Object size checking builtins.  */
-DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_MEMCPY_CHK, "__memcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_MEMMOVE_CHK, "__memmove_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_MEMPCPY_CHK, "__mempcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RETNONNULL_NOTHROW_LEAF)

Comments

Richard Biener Jan. 29, 2020, 9 a.m. | #1
On Wed, Jan 29, 2020 at 3:55 AM Jeff Law <law@redhat.com> wrote:
>

>

> Here's two approaches to fix the regression in 89689.  Note this only

> fixes the regression in the originally reported testcase.  THe deeper

> issue Martin raises in C#1 is not addressed.  Thus if we go forward

> with either patch ISTM that we would drop the regression markers, but

> keep the BZ open.

>

> So the key blocks in question as we enter DSE1:

>

> >

>

> > ;;   basic block 2, loop depth 0, maybe hot

> > ;;    prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)

> > ;;    pred:       ENTRY (FALLTHRU,EXECUTABLE)

> >   h = l;

> >   h$data_33 = l.data;

> >   if (h$data_33 != &__sb_slop)

> >     goto <bb 3>; [70.00%]

> >   else

> >     goto <bb 4>; [30.00%]

> > ;;    succ:       3 [70.0% (guessed)]  (TRUE_VALUE,EXECUTABLE)

> > ;;                4 [30.0% (guessed)]  (FALSE_VALUE,EXECUTABLE)

> >

> > ;;   basic block 3, loop depth 0, maybe hot

> > ;;    prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)

> > ;;    pred:       2 [70.0% (guessed)]  (TRUE_VALUE,EXECUTABLE)

> >   *h$data_33 = 0;

> > ;;    succ:       4 [always]  (FALLTHRU,EXECUTABLE)

> >

> > ;;   basic block 4, loop depth 0, maybe hot

> > ;;    prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)

> > ;;    pred:       3 [always]  (FALLTHRU,EXECUTABLE)

> > ;;                2 [30.0% (guessed)]  (FALSE_VALUE,EXECUTABLE)

> >   _23 = __builtin_object_size (h$data_33, 0);

> >   _25 = __builtin___memcpy_chk (h$data_33, "abcd", 4, _23);

> >   __builtin_puts (h$data_33);

> >   _6 = h$data_33 == &__sb_slop;

> >   _7 = (int) _6;

> >   printf ("%d\n", _7);

> >   _9 = h$data_33 == &buf;

> >   _10 = (int) _9;

> >   printf ("%d\n", _10);

> >   if (h$data_33 != &__sb_slop)

> >     goto <bb 5>; [70.00%]

> >   else

> >     goto <bb 6>; [30.00%]

> >

>

> So vrp1 is going to want to thread the jump at the end of bb4 which

> requires duplicating bb4.  One version of bb4 will only be reachable

> via the false edge from bb2.  That in turn exposes a cprop opportunity

> to replace h$data_33 with &__sbloop and the type of &__sbloop is a

> char[1] array thus triggering the false positive warning.

>

> We can avoid all that by realizing the store in bb3 is dead.  That in

> turn makes the conditional at the end of bb2 pointless as bb3 is empty

> and thus both arms ultimately transfer control to bb4 without any other

> side effects meaning we can eliminate that conditional which in turn

> eliminates the need for jump threading.

>

> Again, this only fixes the original testcase and if there were other

> statements in bb3 it wouldn't work and it won't work for the test in

> c#1.  But the proposed changes are a general improvement.

>

> DSE misses this case because ref_maybe_used_by_stmt_p returns true for

> the virtual operand of the __builtin_object_size call.  Opps!

>

> There's two easy ways to fix this, I've bootstrapped and regression

> tested both.

>

> First, the most conservative fix and perhaps the most appropriate for

> gcc-10 -- special casing __builtin_object_size in tree-ssa-dse.c.

>

> The second approach is more general and marks __builtin_object_size as

> const rather than just pure.  Jakub and I kicked this around in IRC a

> bit.  We've always marked __builtin_object_size as pure.  It was never

> const.  THere is some concern that code motion might cause _b_o_s to

> get different results, which would be particularly concerning for

> _b_o_s types #1 and #3 which do use some contextual information.

> However, ISTM that motion is still going to be limited by the SSA

> graph, though somewhat less because we wouldn't have a dependency on

> the virtual operands.  So it *should* be safe, but there's some degree

> of hesitation to make this change at this point in gcc-10's lifecycle.

>

> Thoughts?  If we go forward obviously I'd add a testcase and ChangeLog

> entries.


P2 please (make _b_o_s const).  No idea why it wasn't that way in the first
place.  As you say the value-dependence on the result keeps everything
live and stores or loads are completely irrelevant to the size of objects.

So OK for P2.

Thanks,
Richard.

> Jeff
Jeff Law Jan. 29, 2020, 7:28 p.m. | #2
On Wed, 2020-01-29 at 10:00 +0100, Richard Biener wrote:
> 

> P2 please (make _b_o_s const).  No idea why it wasn't that way in the first

> place.  As you say the value-dependence on the result keeps everything

> live and stores or loads are completely irrelevant to the size of objects.

> 

> So OK for P2.

Here's the patch I committed for the historical record.  It just adds
the test and ChangeLog entries relative to the RFA/RFC originally
posted.

Jeff
commit 786083766459723b790405c9ba22f974f84f637e
Author: Jeff Law <law@redhat.com>
Date:   Wed Jan 29 12:23:53 2020 -0700

    Improve DSE which in turn eliminates the need for jump threading and block
    duplication for the original testcase in pr89689 which in turn eliminates
    the false positive -Warray-bounds warning for the original testcase.
    
            PR tree-optimization/89689
            * builtins.def (BUILT_IN_OBJECT_SIZE): Make it const rather than pure.
    
            PR tree-optimization/89689
            * gcc.dg/pr89689.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e2838c053a3..567dff632f5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2020-01-24  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/89689
+	* builtins.def (BUILT_IN_OBJECT_SIZE): Make it const rather than pure.
+
 2020-01-29  Richard Sandiford  <richard.sandiford@arm.com>
 
 	Revert:
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 5ab842c34c2..fa8b0641ab1 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -972,7 +972,7 @@ DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
 DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq")
 
 /* Object size checking builtins.  */
-DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_MEMCPY_CHK, "__memcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_MEMMOVE_CHK, "__memmove_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_MEMPCPY_CHK, "__mempcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RETNONNULL_NOTHROW_LEAF)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index e38c4da8d72..b62e7effb59 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2020-01-29  Jeff Law  <law@redhat.com
+
+	PR tree-optimization/89689
+	* gcc.dg/pr89689.c: New test.
+
 2020-01-29  Marek Polacek  <polacek@redhat.com>
 
 	PR c++/91754 - Fix template arguments comparison with class NTTP.
diff --git a/gcc/testsuite/gcc.dg/pr89689.c b/gcc/testsuite/gcc.dg/pr89689.c
new file mode 100644
index 00000000000..ee81274d3c4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr89689.c
@@ -0,0 +1,43 @@
+/* { dg-do-compile } */
+/* { dg-options "-O2 -Warray-bounds" } */
+
+#include <string.h>
+#include <assert.h>
+#include <stdio.h>
+
+static inline __attribute__((__artificial__)) void *a(char *c, const char *d, long n)
+{
+    return __builtin___memcpy_chk(c, d, n, __builtin_object_size(c, 0));
+}
+typedef struct {
+    char *data;
+    int len;
+} sb_t;
+const char __sb_slop[1];
+static void inline set0(sb_t *c)
+{
+    if (c->data != __sb_slop)
+        c->data[0] = 0;
+    else
+        assert (c->data[0] == 0);
+}
+char buf[5];
+sb_t l = {
+    .data = buf,
+    .len = 0
+};
+void o()
+{
+    char *data = "abcd";
+    sb_t h = l;
+    set0(&h);
+    a(h.data, data, strlen(data));
+    printf("%s\n", h.data);
+    printf("%d\n", h.data == __sb_slop);
+    printf("%d\n", h.data == buf);
+    set0(&h);
+}
+int main(void) {
+    o();
+    return 0;
+}

Patch

diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 374143e5785..e88342f1b68 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -789,8 +789,13 @@  dse_classify_store (ao_ref *ref, gimple *stmt,
 		  phi_def = use_stmt;
 		}
 	    }
-	  /* If the statement is a use the store is not dead.  */
-	  else if (ref_maybe_used_by_stmt_p (use_stmt, ref))
+	  /* If the statement is a use the store is not dead.   Do not
+	     consider calls to __builtin_object_size as a use.  For gcc-11
+	     we plan mark __builtin_object_size as const.  */
+	  else if ((!is_gimple_call (use_stmt)
+		    || !gimple_call_builtin_p (as_a <gcall *> (use_stmt),
+					       BUILT_IN_OBJECT_SIZE))
+		   && ref_maybe_used_by_stmt_p (use_stmt, ref))
 	    {
 	      /* Handle common cases where we can easily build an ao_ref
 		 structure for USE_STMT and in doing so we find that the