fix ICE in generic_overlap (PR 84526)

Message ID d6101f48-539d-2596-f85f-5281a7c15020@gmail.com
State New
Headers show
Series
  • fix ICE in generic_overlap (PR 84526)
Related show

Commit Message

Martin Sebor Feb. 23, 2018, 7:57 p.m.
r257860 introduced an unsafe assumption under some conditions
that the base object referenced by a memcpy strcpy necessarily
has a pointer or array type.  The attached patch removes this
assumption.

In addition, as discussed in the bug and in IRC, the patch
also removes a test for the result of get_inner_reference()
being non-null (a vestige of handling the result of a call
to get_addr_base_and_unit_offset() made previously here),
and a test for the bit offset computed by get_inner_reference()
not being convertible to HOST_WIDE_INT.  As far as we can tell
poly64_int is always convertible to HOST_WIDE_INT (i.e.,
poly64_int::is_constant() always returns true).

The conversion from poly64_int to HOST_WIDE_INT is necessary
even though the result is stored in an offset_int because there
is apparently no way to convert from the former to the latter
directly, even though it's wider.

Martin

Comments

Jakub Jelinek Feb. 23, 2018, 8:13 p.m. | #1
On Fri, Feb 23, 2018 at 12:57:14PM -0700, Martin Sebor wrote:
> +  /* get_inner_reference is not expected to return null.  */

> +  gcc_assert (base != NULL);

> +

>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>  

> -  HOST_WIDE_INT const_off;

> -  if (!base || !bytepos.is_constant (&const_off))

> -    {

> -      base = get_base_address (TREE_OPERAND (expr, 0));

> -      return;

> -    }

> -

> +  /* There is no conversion from poly_int64 to offset_int even

> +     though the latter is wider, so go through HOST_WIDE_INT.

> +     The offset is expected to always be constant.  */

> +  HOST_WIDE_INT const_off = bytepos.to_constant ();


The assert is ok, but removing the bytepos.is_constant (&const_off)
is wrong, I'm sure Richard S. can come up with some SVE testcase
where it will not be constant.  If it is not constant, you can handle
it like var_off (which as I said on IRC or in the PR also seems to be
incorrect, because if the base is not a decl the variable offset could be
negative).

>    offrange[0] += const_off;

>    offrange[1] += const_off;

>  

> @@ -923,7 +923,11 @@ builtin_access::generic_overlap ()

>        /* There's no way to distinguish an access to the same member

>  	 of a structure from one to two distinct members of the same

>  	 structure.  Give up to avoid excessive false positives.  */

> -      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));

> +      tree basetype = TREE_TYPE (dstref->base);

> +      if (POINTER_TYPE_P (basetype)

> +	  || TREE_CODE (basetype) == ARRAY_TYPE)

> +	basetype = TREE_TYPE (basetype);


This doesn't address any of my concerns that it is completely random
what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer
(e.g. the argument of the function), sometimes the ADDR_EXPR operand,
sometimes the base of the reference, sometimes again address (if the
base of the reference is MEM_REF).  By the lack of consistency in what
it is, just deciding on its type whether you take TREE_TYPE or
TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g
call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has pointer
type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)
would be true.

	Jakub
Martin Sebor Feb. 23, 2018, 9:46 p.m. | #2
On 02/23/2018 01:13 PM, Jakub Jelinek wrote:
> On Fri, Feb 23, 2018 at 12:57:14PM -0700, Martin Sebor wrote:

>> +  /* get_inner_reference is not expected to return null.  */

>> +  gcc_assert (base != NULL);

>> +

>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>

>> -  HOST_WIDE_INT const_off;

>> -  if (!base || !bytepos.is_constant (&const_off))

>> -    {

>> -      base = get_base_address (TREE_OPERAND (expr, 0));

>> -      return;

>> -    }

>> -

>> +  /* There is no conversion from poly_int64 to offset_int even

>> +     though the latter is wider, so go through HOST_WIDE_INT.

>> +     The offset is expected to always be constant.  */

>> +  HOST_WIDE_INT const_off = bytepos.to_constant ();

>

> The assert is ok, but removing the bytepos.is_constant (&const_off)

> is wrong, I'm sure Richard S. can come up with some SVE testcase

> where it will not be constant.  If it is not constant, you can handle

> it like var_off (which as I said on IRC or in the PR also seems to be

> incorrect, because if the base is not a decl the variable offset could be

> negative).


That's what I did at first.  I took it out because it sounded
to me like you were saying it was a hypothetical possibility,
not something that would ever happen in practice, and because
I have no way of testing that code.  I have put it back in
the attached update.

Since you added Richard: can you please explain how to convert
a poly64_int to offset_int without converting it to HOST_WIDE_INT,
or if it can't be done, why not?  It's cumbersome and error-prone
to have to go through HWI, and doing so defeats the main goal of
using offset_int in the gimple-ssa-warn-restrict pass.  Should
the pass (and others like it) be changed to store offsets and
sizes in some flavor of poly_int instead of offset_int?

>>    offrange[0] += const_off;

>>    offrange[1] += const_off;

>>

>> @@ -923,7 +923,11 @@ builtin_access::generic_overlap ()

>>        /* There's no way to distinguish an access to the same member

>>  	 of a structure from one to two distinct members of the same

>>  	 structure.  Give up to avoid excessive false positives.  */

>> -      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));

>> +      tree basetype = TREE_TYPE (dstref->base);

>> +      if (POINTER_TYPE_P (basetype)

>> +	  || TREE_CODE (basetype) == ARRAY_TYPE)

>> +	basetype = TREE_TYPE (basetype);

>

> This doesn't address any of my concerns that it is completely random

> what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer

> (e.g. the argument of the function), sometimes the ADDR_EXPR operand,

> sometimes the base of the reference, sometimes again address (if the

> base of the reference is MEM_REF).  By the lack of consistency in what

> it is, just deciding on its type whether you take TREE_TYPE or

> TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g

> call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has pointer

> type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)

> would be true.


I think I understand what you're saying but this block is only
used for string functions (not for memcpy), and only as a stopgap
to avoid false positives.  Being limited to (a subset of) string
functions the case I think you're concerned about, namely calling
strcpy with a pointer to a pointer, shouldn't come up in valid
code.  It's not bullet-proof but I don't think there is
a fundamental problem, either with this patch or with the one
that introduced it.  The fundamental problem is that MEM_REF
can be ambiguous and that's what this code is trying to combat.
See the example I gave here:
https://gcc.gnu.org/ml/gcc/2018-02/msg00136.html

If you think I'm missing something please put together a small
test case to help me see the problem.  I'm not nearly as
comfortable thinking in terms of the internal representation
to visualize all the possibilities here.

Martin
PR tree-optimization/84526 - ICE in generic_overlap at gcc/gimple-ssa-warn-restrict.c:927 since r257860

gcc/ChangeLog:

	PR tree-optimization/84526
	* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
	Remove dead code.
	(builtin_access::generic_overlap): Be prepared to handle non-array
	base objects.

gcc/testsuite/ChangeLog:

	PR tree-optimization/84526
	* gcc.dg/Wrestrict-10.c: New test.

Index: gcc/gimple-ssa-warn-restrict.c
===================================================================
--- gcc/gimple-ssa-warn-restrict.c	(revision 257933)
+++ gcc/gimple-ssa-warn-restrict.c	(working copy)
@@ -43,6 +43,8 @@
 #include "cfgloop.h"
 #include "intl.h"
 
+location_t gloc;
+
 namespace {
 
 const pass_data pass_data_wrestrict = {
@@ -409,18 +411,24 @@ builtin_memref::set_base_and_offset (tree expr)
   base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,
 			      &mode, &sign, &reverse, &vol);
 
+  /* get_inner_reference is not expected to return null.  */
+  gcc_assert (base != NULL);
+
   poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
 
+  /* There is no conversion from poly_int64 to offset_int even
+     though the latter is wider, so go through HOST_WIDE_INT.
+     The offset should be constant but be prepared for it not
+     to be just in case.  */
   HOST_WIDE_INT const_off;
-  if (!base || !bytepos.is_constant (&const_off))
+  if (bytepos.is_constant (&const_off))
     {
-      base = get_base_address (TREE_OPERAND (expr, 0));
-      return;
+      offrange[0] += const_off;
+      offrange[1] += const_off;
     }
+  else
+    offrange[1] += maxobjsize;
 
-  offrange[0] += const_off;
-  offrange[1] += const_off;
-
   if (var_off)
     {
       if (TREE_CODE (var_off) == INTEGER_CST)
@@ -918,12 +926,18 @@ builtin_access::generic_overlap ()
   if (!overlap_certain)
     {
       if (!dstref->strbounded_p && !depends_p)
+	/* Memcpy only considers certain overlap.  */
 	return false;
 
       /* There's no way to distinguish an access to the same member
 	 of a structure from one to two distinct members of the same
 	 structure.  Give up to avoid excessive false positives.  */
-      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));
+      tree basetype = TREE_TYPE (dstref->base);
+
+      if (POINTER_TYPE_P (basetype)
+	  || TREE_CODE (basetype) == ARRAY_TYPE)
+	basetype = TREE_TYPE (basetype);
+
       if (RECORD_OR_UNION_TYPE_P (basetype))
 	return false;
     }
@@ -1845,6 +1859,7 @@ check_bounds_or_overlap (gcall *call, tree dst, tr
       loc = *pbloc;
 
   loc = expansion_point_location_if_in_system_header (loc);
+  gloc = loc;
 
   tree func = gimple_call_fndecl (call);
 
Index: gcc/testsuite/gcc.dg/Wrestrict-10.c
===================================================================
--- gcc/testsuite/gcc.dg/Wrestrict-10.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wrestrict-10.c	(working copy)
@@ -0,0 +1,121 @@
+/* PR tree-optimization/84526 - ICE in generic_overlap
+   { dg-do compile }
+   { dg-options "-O2 -Wrestrict" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void* restrict, const void* restrict, size_t);
+extern char* strcat (char* restrict, const char* restrict);
+extern char* strcpy (char* restrict, const char* restrict);
+extern char* strncat (char* restrict, const char* restrict, size_t);
+extern char* strncpy (char* restrict, const char* restrict, size_t);
+
+struct
+{
+  char a[1];
+} b;
+
+int i;
+size_t n;
+
+void __attribute__ ((noclone, noinline))
+test_arr_memcpy_1 (void)
+{
+  memcpy (&b.a[i], b.a, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_memcpy_2 (void)
+{
+  memcpy (b.a, &b.a[i], n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcat_1 (void)
+{
+  strcat (&b.a[i], b.a);            /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcat_2 (void)
+{
+  /* This probably deserves a warning.  */
+  strcpy (b.a, &b.a[i]);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strncat_1 (void)
+{
+  strncat (&b.a[i], b.a, n);        /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strncat_2 (void)
+{
+  strncat (b.a, &b.a[i], n);        /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcpy_1 (void)
+{
+  strcpy (&b.a[i], b.a);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcpy_2 (void)
+{
+  strcpy (b.a, &b.a[i]);
+}
+
+
+struct S {
+  int a;
+  char b[10];
+} d;
+
+void __attribute__ ((noclone, noinline))
+test_obj_memcpy_1 (void)
+{
+  memcpy (d.b, (char *) &d, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_memcpy_2 (void)
+{
+  memcpy ((char *) &d, d.b, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strcpy_1 (void)
+{
+  strcpy (d.b, (char *) &d);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strcpy_2 (void)
+{
+  strcpy ((char *) &d, d.b);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncat_1 (void)
+{
+  strncat (d.b, (char *) &d, n);    /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncat_2 (void)
+{
+  strncat ((char *) &d, d.b, n);    /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncpy_1 (void)
+{
+  strncpy (d.b, (char *) &d, n);
+}
+
+void test_obj_strncpy_2 (void)
+{
+  strncpy ((char *) &d, d.b, n);
+}
Jakub Jelinek Feb. 23, 2018, 9:54 p.m. | #3
On Fri, Feb 23, 2018 at 02:46:28PM -0700, Martin Sebor wrote:
> > This doesn't address any of my concerns that it is completely random

> > what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer

> > (e.g. the argument of the function), sometimes the ADDR_EXPR operand,

> > sometimes the base of the reference, sometimes again address (if the

> > base of the reference is MEM_REF).  By the lack of consistency in what

> > it is, just deciding on its type whether you take TREE_TYPE or

> > TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g

> > call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has pointer

> > type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)

> > would be true.

> 

> I think I understand what you're saying but this block is only

> used for string functions (not for memcpy), and only as a stopgap

> to avoid false positives.  Being limited to (a subset of) string

> functions the case I think you're concerned about, namely calling

> strcpy with a pointer to a pointer, shouldn't come up in valid

> code.  It's not bullet-proof but I don't think there is


Can you explain what is invalid on:
char *p;

void
foo (void)
{
  if (sizeof (p) < 8)
    return;
  memcpy (&p, "abcdefg");
  strcpy ((char *) &p, (char *) &p + 5);
}

and similar code?  Both memcpy and strcpy are defined as char accesses
that can alias anything.  If needed tweak it so that you run into this code.

	Jakub
Martin Sebor Feb. 23, 2018, 11:25 p.m. | #4
On 02/23/2018 02:54 PM, Jakub Jelinek wrote:
> On Fri, Feb 23, 2018 at 02:46:28PM -0700, Martin Sebor wrote:

>>> This doesn't address any of my concerns that it is completely random

>>> what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer

>>> (e.g. the argument of the function), sometimes the ADDR_EXPR operand,

>>> sometimes the base of the reference, sometimes again address (if the

>>> base of the reference is MEM_REF).  By the lack of consistency in what

>>> it is, just deciding on its type whether you take TREE_TYPE or

>>> TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g

>>> call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has pointer

>>> type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)

>>> would be true.

>>

>> I think I understand what you're saying but this block is only

>> used for string functions (not for memcpy), and only as a stopgap

>> to avoid false positives.  Being limited to (a subset of) string

>> functions the case I think you're concerned about, namely calling

>> strcpy with a pointer to a pointer, shouldn't come up in valid

>> code.  It's not bullet-proof but I don't think there is

>

> Can you explain what is invalid on:

> char *p;

>

> void

> foo (void)

> {

>   if (sizeof (p) < 8)

>     return;

>   memcpy (&p, "abcdefg");

>   strcpy ((char *) &p, (char *) &p + 5);

> }

>

> and similar code?  Both memcpy and strcpy are defined as char accesses

> that can alias anything.  If needed tweak it so that you run into this code.


This is arguably valid but (I hope you will agree) questionable
at best.  I haven't seen any warnings for code like this (I've
looked at all those in Marek's Fedora rebuild), nor have I seen
such code in practice.  But if I did come across it I wouldn't
be too concerned about warnings (in fact, I'd view warning for
it as helpful since it's most likely a bug).

That said, the test case as is doesn't trigger -Wrestrict because
there is no overlap.  Even with overlap I could find no way to
exercise the code because the strlen pass doesn't handle the
MEM_REF that the call to memcpy is folded into early on and so
-Wrestrict doesn't know how long the string is.

Without a way to test it I don't feel comfortable making changes
here at this stage.

This isn't meant to say there aren't bugs here.  Just that I don't
know about any (none has been reported) and that I have known bugs
I do need to get to.  If/when we find more here as I'm sure we
will, I'll deal with them as I always do.

Martin
Richard Sandiford Feb. 24, 2018, 9:32 a.m. | #5
Martin Sebor <msebor@gmail.com> writes:
> On 02/23/2018 01:13 PM, Jakub Jelinek wrote:

>> On Fri, Feb 23, 2018 at 12:57:14PM -0700, Martin Sebor wrote:

>>> +  /* get_inner_reference is not expected to return null.  */

>>> +  gcc_assert (base != NULL);

>>> +

>>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>>

>>> -  HOST_WIDE_INT const_off;

>>> -  if (!base || !bytepos.is_constant (&const_off))

>>> -    {

>>> -      base = get_base_address (TREE_OPERAND (expr, 0));

>>> -      return;

>>> -    }

>>> -

>>> +  /* There is no conversion from poly_int64 to offset_int even

>>> +     though the latter is wider, so go through HOST_WIDE_INT.

>>> +     The offset is expected to always be constant.  */

>>> +  HOST_WIDE_INT const_off = bytepos.to_constant ();

>>

>> The assert is ok, but removing the bytepos.is_constant (&const_off)

>> is wrong, I'm sure Richard S. can come up with some SVE testcase

>> where it will not be constant.  If it is not constant, you can handle

>> it like var_off (which as I said on IRC or in the PR also seems to be

>> incorrect, because if the base is not a decl the variable offset could be

>> negative).

>

> That's what I did at first.  I took it out because it sounded

> to me like you were saying it was a hypothetical possibility,

> not something that would ever happen in practice, and because

> I have no way of testing that code.  I have put it back in

> the attached update.

>

> Since you added Richard: can you please explain how to convert

> a poly64_int to offset_int without converting it to HOST_WIDE_INT,

> or if it can't be done, why not?  It's cumbersome and error-prone

> to have to go through HWI, and doing so defeats the main goal of

> using offset_int in the gimple-ssa-warn-restrict pass.  Should

> the pass (and others like it) be changed to store offsets and

> sizes in some flavor of poly_int instead of offset_int?


Not sure if this is what you mean, but:

bool f (poly_int64 x, offset_int *y) { return x.is_constant (y); }

does work.  "y" doesn't have to be "HOST_WIDE_INT *".

Richard
Martin Sebor Feb. 24, 2018, 2:45 p.m. | #6
On 02/24/2018 02:32 AM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:

>> On 02/23/2018 01:13 PM, Jakub Jelinek wrote:

>>> On Fri, Feb 23, 2018 at 12:57:14PM -0700, Martin Sebor wrote:

>>>> +  /* get_inner_reference is not expected to return null.  */

>>>> +  gcc_assert (base != NULL);

>>>> +

>>>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>>>

>>>> -  HOST_WIDE_INT const_off;

>>>> -  if (!base || !bytepos.is_constant (&const_off))

>>>> -    {

>>>> -      base = get_base_address (TREE_OPERAND (expr, 0));

>>>> -      return;

>>>> -    }

>>>> -

>>>> +  /* There is no conversion from poly_int64 to offset_int even

>>>> +     though the latter is wider, so go through HOST_WIDE_INT.

>>>> +     The offset is expected to always be constant.  */

>>>> +  HOST_WIDE_INT const_off = bytepos.to_constant ();

>>>

>>> The assert is ok, but removing the bytepos.is_constant (&const_off)

>>> is wrong, I'm sure Richard S. can come up with some SVE testcase

>>> where it will not be constant.  If it is not constant, you can handle

>>> it like var_off (which as I said on IRC or in the PR also seems to be

>>> incorrect, because if the base is not a decl the variable offset could be

>>> negative).

>>

>> That's what I did at first.  I took it out because it sounded

>> to me like you were saying it was a hypothetical possibility,

>> not something that would ever happen in practice, and because

>> I have no way of testing that code.  I have put it back in

>> the attached update.

>>

>> Since you added Richard: can you please explain how to convert

>> a poly64_int to offset_int without converting it to HOST_WIDE_INT,

>> or if it can't be done, why not?  It's cumbersome and error-prone

>> to have to go through HWI, and doing so defeats the main goal of

>> using offset_int in the gimple-ssa-warn-restrict pass.  Should

>> the pass (and others like it) be changed to store offsets and

>> sizes in some flavor of poly_int instead of offset_int?

>

> Not sure if this is what you mean, but:

>

> bool f (poly_int64 x, offset_int *y) { return x.is_constant (y); }

>

> does work.  "y" doesn't have to be "HOST_WIDE_INT *".


Great, that works, thank you!

Can you also comment on whether the poly64_int offset returned
by get_inner_reference() can be non-constant?  O'm trying to
understand why/how that could happen and if there is a way to
test it in the -Wrestrict pass (i.e., come up with a test case
involving memcpy or similar built-in).

Martin
Richard Sandiford Feb. 25, 2018, 10:03 a.m. | #7
Martin Sebor <msebor@gmail.com> writes:
> On 02/24/2018 02:32 AM, Richard Sandiford wrote:

>> Martin Sebor <msebor@gmail.com> writes:

>>> On 02/23/2018 01:13 PM, Jakub Jelinek wrote:

>>>> On Fri, Feb 23, 2018 at 12:57:14PM -0700, Martin Sebor wrote:

>>>>> +  /* get_inner_reference is not expected to return null.  */

>>>>> +  gcc_assert (base != NULL);

>>>>> +

>>>>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>>>>

>>>>> -  HOST_WIDE_INT const_off;

>>>>> -  if (!base || !bytepos.is_constant (&const_off))

>>>>> -    {

>>>>> -      base = get_base_address (TREE_OPERAND (expr, 0));

>>>>> -      return;

>>>>> -    }

>>>>> -

>>>>> +  /* There is no conversion from poly_int64 to offset_int even

>>>>> +     though the latter is wider, so go through HOST_WIDE_INT.

>>>>> +     The offset is expected to always be constant.  */

>>>>> +  HOST_WIDE_INT const_off = bytepos.to_constant ();

>>>>

>>>> The assert is ok, but removing the bytepos.is_constant (&const_off)

>>>> is wrong, I'm sure Richard S. can come up with some SVE testcase

>>>> where it will not be constant.  If it is not constant, you can handle

>>>> it like var_off (which as I said on IRC or in the PR also seems to be

>>>> incorrect, because if the base is not a decl the variable offset could be

>>>> negative).

>>>

>>> That's what I did at first.  I took it out because it sounded

>>> to me like you were saying it was a hypothetical possibility,

>>> not something that would ever happen in practice, and because

>>> I have no way of testing that code.  I have put it back in

>>> the attached update.

>>>

>>> Since you added Richard: can you please explain how to convert

>>> a poly64_int to offset_int without converting it to HOST_WIDE_INT,

>>> or if it can't be done, why not?  It's cumbersome and error-prone

>>> to have to go through HWI, and doing so defeats the main goal of

>>> using offset_int in the gimple-ssa-warn-restrict pass.  Should

>>> the pass (and others like it) be changed to store offsets and

>>> sizes in some flavor of poly_int instead of offset_int?

>>

>> Not sure if this is what you mean, but:

>>

>> bool f (poly_int64 x, offset_int *y) { return x.is_constant (y); }

>>

>> does work.  "y" doesn't have to be "HOST_WIDE_INT *".

>

> Great, that works, thank you!

>

> Can you also comment on whether the poly64_int offset returned

> by get_inner_reference() can be non-constant?  O'm trying to

> understand why/how that could happen and if there is a way to

> test it in the -Wrestrict pass (i.e., come up with a test case

> involving memcpy or similar built-in).


Yeah, get_inner_reference can return non-constant offsets (but so far
only on SVE of course).  E.g. the SVE versions of IFN_LOAD/STORE_LANES
use arrays of variable-length vectors, so vectors 1 and above will be at
non-constant offsets.  Also, the OpenMP support creates vector-sized
variables, and we can have references to the last element of such
a vector.

I don't think it can yet happen for cases that are interesting to the
-Wrestrict warning though, so dropping back to conservatively correct
behaviour (via is_constant) is fine.

And that's the general principle: if one of the inputs has poly type,
try to use poly types for the whole operation if it's easy to do,
otherwise drop to conservatively correct behaviour using is_constant.
Only use to_constant if there is a known reason for it to be correct
(rather than no known reason for it to be incorrect).

Thanks,
Richard
Martin Sebor Feb. 26, 2018, 5:32 p.m. | #8
That attached revision updates the pass to directly convert
the poly64_int into offset_int using the API suggested by
Richard (thanks again).  As discussed below, I've made no
other changes.

Jakub, if you still have concerns that the false positive
suppression logic isn't sufficiently effective please post
a test case (or open a new bug).  I will look into it when
I get a chance.

Until then, I'd like to get the ICE fix committed. Please
confirm that the attached patch is good to go in.

Martin

On 02/23/2018 02:46 PM, Martin Sebor wrote:
> On 02/23/2018 01:13 PM, Jakub Jelinek wrote:

>> On Fri, Feb 23, 2018 at 12:57:14PM -0700, Martin Sebor wrote:

>>> +  /* get_inner_reference is not expected to return null.  */

>>> +  gcc_assert (base != NULL);

>>> +

>>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>>

>>> -  HOST_WIDE_INT const_off;

>>> -  if (!base || !bytepos.is_constant (&const_off))

>>> -    {

>>> -      base = get_base_address (TREE_OPERAND (expr, 0));

>>> -      return;

>>> -    }

>>> -

>>> +  /* There is no conversion from poly_int64 to offset_int even

>>> +     though the latter is wider, so go through HOST_WIDE_INT.

>>> +     The offset is expected to always be constant.  */

>>> +  HOST_WIDE_INT const_off = bytepos.to_constant ();

>>

>> The assert is ok, but removing the bytepos.is_constant (&const_off)

>> is wrong, I'm sure Richard S. can come up with some SVE testcase

>> where it will not be constant.  If it is not constant, you can handle

>> it like var_off (which as I said on IRC or in the PR also seems to be

>> incorrect, because if the base is not a decl the variable offset could be

>> negative).

>

> That's what I did at first.  I took it out because it sounded

> to me like you were saying it was a hypothetical possibility,

> not something that would ever happen in practice, and because

> I have no way of testing that code.  I have put it back in

> the attached update.

>

> Since you added Richard: can you please explain how to convert

> a poly64_int to offset_int without converting it to HOST_WIDE_INT,

> or if it can't be done, why not?  It's cumbersome and error-prone

> to have to go through HWI, and doing so defeats the main goal of

> using offset_int in the gimple-ssa-warn-restrict pass.  Should

> the pass (and others like it) be changed to store offsets and

> sizes in some flavor of poly_int instead of offset_int?

>

>>>    offrange[0] += const_off;

>>>    offrange[1] += const_off;

>>>

>>> @@ -923,7 +923,11 @@ builtin_access::generic_overlap ()

>>>        /* There's no way to distinguish an access to the same member

>>>       of a structure from one to two distinct members of the same

>>>       structure.  Give up to avoid excessive false positives.  */

>>> -      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));

>>> +      tree basetype = TREE_TYPE (dstref->base);

>>> +      if (POINTER_TYPE_P (basetype)

>>> +      || TREE_CODE (basetype) == ARRAY_TYPE)

>>> +    basetype = TREE_TYPE (basetype);

>>

>> This doesn't address any of my concerns that it is completely random

>> what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer

>> (e.g. the argument of the function), sometimes the ADDR_EXPR operand,

>> sometimes the base of the reference, sometimes again address (if the

>> base of the reference is MEM_REF).  By the lack of consistency in what

>> it is, just deciding on its type whether you take TREE_TYPE or

>> TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g

>> call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has

>> pointer

>> type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)

>> would be true.

>

> I think I understand what you're saying but this block is only

> used for string functions (not for memcpy), and only as a stopgap

> to avoid false positives.  Being limited to (a subset of) string

> functions the case I think you're concerned about, namely calling

> strcpy with a pointer to a pointer, shouldn't come up in valid

> code.  It's not bullet-proof but I don't think there is

> a fundamental problem, either with this patch or with the one

> that introduced it.  The fundamental problem is that MEM_REF

> can be ambiguous and that's what this code is trying to combat.

> See the example I gave here:

> https://gcc.gnu.org/ml/gcc/2018-02/msg00136.html

>

> If you think I'm missing something please put together a small

> test case to help me see the problem.  I'm not nearly as

> comfortable thinking in terms of the internal representation

> to visualize all the possibilities here.

>

> Martin
PR tree-optimization/84526 - ICE in generic_overlap at gcc/gimple-ssa-warn-restrict.c:927 since r257860

gcc/ChangeLog:

	PR tree-optimization/84526
	* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
	Remove dead code.
	(builtin_access::generic_overlap): Be prepared to handle non-array
	base objects.

gcc/testsuite/ChangeLog:

	PR tree-optimization/84526
	* gcc.dg/Wrestrict-10.c: New test.

Index: gcc/gimple-ssa-warn-restrict.c
===================================================================
--- gcc/gimple-ssa-warn-restrict.c	(revision 257963)
+++ gcc/gimple-ssa-warn-restrict.c	(working copy)
@@ -396,6 +396,9 @@ builtin_memref::set_base_and_offset (tree expr)
   if (TREE_CODE (expr) == ADDR_EXPR)
     expr = TREE_OPERAND (expr, 0);
 
+  /* Stash the reference for offset validation.  */
+  ref = expr;
+
   poly_int64 bitsize, bitpos;
   tree var_off;
   machine_mode mode;
@@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)
   base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,
 			      &mode, &sign, &reverse, &vol);
 
+  /* get_inner_reference is not expected to return null.  */
+  gcc_assert (base != NULL);
+
   poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
 
-  HOST_WIDE_INT const_off;
-  if (!base || !bytepos.is_constant (&const_off))
+  /* Convert the poly_int64 offset to to offset_int.  The offset
+     should be constant but be prepared for it not to be just in
+     case.  */
+  offset_int cstoff;
+  if (bytepos.is_constant (&cstoff))
     {
-      base = get_base_address (TREE_OPERAND (expr, 0));
-      return;
+      offrange[0] += cstoff;
+      offrange[1] += cstoff;
+
+      /* Besides the reference saved above, also stash the offset
+	 for validation.  */
+      if (TREE_CODE (expr) == COMPONENT_REF)
+	refoff = cstoff;
     }
+  else
+    offrange[1] += maxobjsize;
 
-  offrange[0] += const_off;
-  offrange[1] += const_off;
-
   if (var_off)
     {
       if (TREE_CODE (var_off) == INTEGER_CST)
 	{
-	  offset_int cstoff = wi::to_offset (var_off);
+	  cstoff = wi::to_offset (var_off);
 	  offrange[0] += cstoff;
 	  offrange[1] += cstoff;
 	}
@@ -433,13 +446,6 @@ builtin_memref::set_base_and_offset (tree expr)
 	offrange[1] += maxobjsize;
     }
 
-  /* Stash the reference for offset validation.  */
-  ref = expr;
-
-  /* Also stash the constant offset for offset validation.  */
-  if (TREE_CODE (expr) == COMPONENT_REF)
-    refoff = const_off;
-
   if (TREE_CODE (base) == MEM_REF)
     {
       tree memrefoff = TREE_OPERAND (base, 1);
@@ -918,12 +924,18 @@ builtin_access::generic_overlap ()
   if (!overlap_certain)
     {
       if (!dstref->strbounded_p && !depends_p)
+	/* Memcpy only considers certain overlap.  */
 	return false;
 
       /* There's no way to distinguish an access to the same member
 	 of a structure from one to two distinct members of the same
 	 structure.  Give up to avoid excessive false positives.  */
-      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));
+      tree basetype = TREE_TYPE (dstref->base);
+
+      if (POINTER_TYPE_P (basetype)
+	  || TREE_CODE (basetype) == ARRAY_TYPE)
+	basetype = TREE_TYPE (basetype);
+
       if (RECORD_OR_UNION_TYPE_P (basetype))
 	return false;
     }
Index: gcc/testsuite/gcc.dg/Wrestrict-10.c
===================================================================
--- gcc/testsuite/gcc.dg/Wrestrict-10.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wrestrict-10.c	(working copy)
@@ -0,0 +1,121 @@
+/* PR tree-optimization/84526 - ICE in generic_overlap
+   { dg-do compile }
+   { dg-options "-O2 -Wrestrict" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void* restrict, const void* restrict, size_t);
+extern char* strcat (char* restrict, const char* restrict);
+extern char* strcpy (char* restrict, const char* restrict);
+extern char* strncat (char* restrict, const char* restrict, size_t);
+extern char* strncpy (char* restrict, const char* restrict, size_t);
+
+struct
+{
+  char a[1];
+} b;
+
+int i;
+size_t n;
+
+void __attribute__ ((noclone, noinline))
+test_arr_memcpy_1 (void)
+{
+  memcpy (&b.a[i], b.a, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_memcpy_2 (void)
+{
+  memcpy (b.a, &b.a[i], n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcat_1 (void)
+{
+  strcat (&b.a[i], b.a);            /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcat_2 (void)
+{
+  /* This probably deserves a warning.  */
+  strcpy (b.a, &b.a[i]);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strncat_1 (void)
+{
+  strncat (&b.a[i], b.a, n);        /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strncat_2 (void)
+{
+  strncat (b.a, &b.a[i], n);        /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcpy_1 (void)
+{
+  strcpy (&b.a[i], b.a);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcpy_2 (void)
+{
+  strcpy (b.a, &b.a[i]);
+}
+
+
+struct S {
+  int a;
+  char b[10];
+} d;
+
+void __attribute__ ((noclone, noinline))
+test_obj_memcpy_1 (void)
+{
+  memcpy (d.b, (char *) &d, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_memcpy_2 (void)
+{
+  memcpy ((char *) &d, d.b, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strcpy_1 (void)
+{
+  strcpy (d.b, (char *) &d);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strcpy_2 (void)
+{
+  strcpy ((char *) &d, d.b);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncat_1 (void)
+{
+  strncat (d.b, (char *) &d, n);    /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncat_2 (void)
+{
+  strncat ((char *) &d, d.b, n);    /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncpy_1 (void)
+{
+  strncpy (d.b, (char *) &d, n);
+}
+
+void test_obj_strncpy_2 (void)
+{
+  strncpy ((char *) &d, d.b, n);
+}
Jakub Jelinek Feb. 26, 2018, 5:39 p.m. | #9
On Mon, Feb 26, 2018 at 10:32:27AM -0700, Martin Sebor wrote:
> That attached revision updates the pass to directly convert

> the poly64_int into offset_int using the API suggested by

> Richard (thanks again).  As discussed below, I've made no

> other changes.

> 

> Jakub, if you still have concerns that the false positive

> suppression logic isn't sufficiently effective please post

> a test case (or open a new bug).  I will look into it when

> I get a chance.


Yes, I still have major concerns, I explained what I'd like to see
(differentiate clearly between what base is, either by adding a flag or
having separate base and base_addr, then in the users you can easily find
out what is what and can decide based on that).

> Until then, I'd like to get the ICE fix committed. Please

> confirm that the attached patch is good to go in.


Due to the above concerns, I don't think the patch is good to go in.
If you find somebody else who thinks the patch is ok, I won't fight against
it though.

	Jakub
Martin Sebor Feb. 26, 2018, 6:13 p.m. | #10
On 02/26/2018 10:39 AM, Jakub Jelinek wrote:
> On Mon, Feb 26, 2018 at 10:32:27AM -0700, Martin Sebor wrote:

>> That attached revision updates the pass to directly convert

>> the poly64_int into offset_int using the API suggested by

>> Richard (thanks again).  As discussed below, I've made no

>> other changes.

>>

>> Jakub, if you still have concerns that the false positive

>> suppression logic isn't sufficiently effective please post

>> a test case (or open a new bug).  I will look into it when

>> I get a chance.

>

> Yes, I still have major concerns, I explained what I'd like to see

> (differentiate clearly between what base is, either by adding a flag or

> having separate base and base_addr, then in the users you can easily find

> out what is what and can decide based on that).


I'm open to making these improvements but without a test case
or a way to validate them I'm not comfortable making intrusive
changes in this area at this late stage.  As I explained, the
code cannot be reached under the conditions you described.
In any event, the improvements you suggest are independent of
the fix for the ICE.

>> Until then, I'd like to get the ICE fix committed. Please

>> confirm that the attached patch is good to go in.

>

> Due to the above concerns, I don't think the patch is good to go in.

> If you find somebody else who thinks the patch is ok, I won't fight against

> it though.


Jeff, when you have a chance, can you please review/approve
the patch?

https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01441.html

Martin
Richard Sandiford Feb. 27, 2018, 12:32 p.m. | #11
Martin Sebor <msebor@gmail.com> writes:
> +  /* Convert the poly_int64 offset to to offset_int.  The offset

> +     should be constant but be prepared for it not to be just in

> +     case.  */


This comment seems redundant and could easily get out of date once
ACLE support is added.

> +  offset_int cstoff;

> +  if (bytepos.is_constant (&cstoff))

>      {


> -      base = get_base_address (TREE_OPERAND (expr, 0));

> -      return;

> +      offrange[0] += cstoff;

> +      offrange[1] += cstoff;


Although this looks right, there's no actual need for cstoff to be
offset_int here, and I think the original HOST_WIDE_INT was more
efficient.

I realise you reuse the cstoff variable later with wi::to_offset,
but it doesn't seem necessary to use the same variable, since it's
logically a separate value.

Either way's fine, just thought I'd mention it :-)

Thanks,
Richard
Martin Sebor March 7, 2018, 3:17 p.m. | #12
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01441.html

Jeff, I know this is in your queue.  I ping it because a bug
with the same root cause was reported in RHBZ #1552639.

On 02/26/2018 11:13 AM, Martin Sebor wrote:
> On 02/26/2018 10:39 AM, Jakub Jelinek wrote:

>> On Mon, Feb 26, 2018 at 10:32:27AM -0700, Martin Sebor wrote:

>>> That attached revision updates the pass to directly convert

>>> the poly64_int into offset_int using the API suggested by

>>> Richard (thanks again).  As discussed below, I've made no

>>> other changes.

>>>

>>> Jakub, if you still have concerns that the false positive

>>> suppression logic isn't sufficiently effective please post

>>> a test case (or open a new bug).  I will look into it when

>>> I get a chance.

>>

>> Yes, I still have major concerns, I explained what I'd like to see

>> (differentiate clearly between what base is, either by adding a flag or

>> having separate base and base_addr, then in the users you can easily find

>> out what is what and can decide based on that).

>

> I'm open to making these improvements but without a test case

> or a way to validate them I'm not comfortable making intrusive

> changes in this area at this late stage.  As I explained, the

> code cannot be reached under the conditions you described.

> In any event, the improvements you suggest are independent of

> the fix for the ICE.

>

>>> Until then, I'd like to get the ICE fix committed. Please

>>> confirm that the attached patch is good to go in.

>>

>> Due to the above concerns, I don't think the patch is good to go in.

>> If you find somebody else who thinks the patch is ok, I won't fight

>> against

>> it though.

>

> Jeff, when you have a chance, can you please review/approve

> the patch?

>

> https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01441.html

>

> Martin
Jeff Law March 7, 2018, 7:14 p.m. | #13
On 02/23/2018 01:13 PM, Jakub Jelinek wrote:
> On Fri, Feb 23, 2018 at 12:57:14PM -0700, Martin Sebor wrote:

>> +  /* get_inner_reference is not expected to return null.  */

>> +  gcc_assert (base != NULL);

>> +

>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>  

>> -  HOST_WIDE_INT const_off;

>> -  if (!base || !bytepos.is_constant (&const_off))

>> -    {

>> -      base = get_base_address (TREE_OPERAND (expr, 0));

>> -      return;

>> -    }

>> -

>> +  /* There is no conversion from poly_int64 to offset_int even

>> +     though the latter is wider, so go through HOST_WIDE_INT.

>> +     The offset is expected to always be constant.  */

>> +  HOST_WIDE_INT const_off = bytepos.to_constant ();

> 

> The assert is ok, but removing the bytepos.is_constant (&const_off)

> is wrong, I'm sure Richard S. can come up with some SVE testcase

> where it will not be constant.  If it is not constant, you can handle

> it like var_off (which as I said on IRC or in the PR also seems to be

> incorrect, because if the base is not a decl the variable offset could be

> negative).

> 

>>    offrange[0] += const_off;

>>    offrange[1] += const_off;

>>  

>> @@ -923,7 +923,11 @@ builtin_access::generic_overlap ()

>>        /* There's no way to distinguish an access to the same member

>>  	 of a structure from one to two distinct members of the same

>>  	 structure.  Give up to avoid excessive false positives.  */

>> -      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));

>> +      tree basetype = TREE_TYPE (dstref->base);

>> +      if (POINTER_TYPE_P (basetype)

>> +	  || TREE_CODE (basetype) == ARRAY_TYPE)

>> +	basetype = TREE_TYPE (basetype);

> 

> This doesn't address any of my concerns that it is completely random

> what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer

> (e.g. the argument of the function), sometimes the ADDR_EXPR operand,

> sometimes the base of the reference, sometimes again address (if the

> base of the reference is MEM_REF).  By the lack of consistency in what

> it is, just deciding on its type whether you take TREE_TYPE or

> TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g

> call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has pointer

> type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)

> would be true.

The intent of the code is that BASE is a pointer to a memory location
that can be compared to another BASE for the purposes of trying to
determine if there's an overlap between two arguments.

Note that BASE may have a different type than what actually appeared in
the call.  Martin's code is presented with the actual argument, but then
tries to walk backwards to refine that value.  In particular it might
look through pointer typecasts or pointer arithmetic.  The latter
results in a memory address plus an offset.

It may also look at an ADDR_EXPR and strip away its various _REF nodes.
The result in this case is similar to pointer arithmetic in that we will
have a memory address plus an offset.

Note that the memory address computed in here and stored into BASE may
differ in its type from what is actually passed to the function for
which we are checking for invalid argument overlap.  In fact, I think

The key here is boil arguments down to a memory location plus an
optional offset, which in turn allows us to compare them against each
other for overlap.  In some ways it's like what we do in alias.c for RTL
(which isn't necessary a shining example of good code).

I don't think the lack of consistency is really a major issue inside
builtin_memref::set_base_and_offset -- as long as clients use the
information just for warnings and are aware that the underlying types
may differ from what appeared in the call itself.

Removal of the undesirable base = get_base_address (TREE_OPERAND (expr,
0)) definitely helps clarify IMHO.

I'll have more comments as a follow-up to one of Martin's messages.

Jeff
Jeff Law March 7, 2018, 7:18 p.m. | #14
On 02/23/2018 02:46 PM, Martin Sebor wrote:
>> This doesn't address any of my concerns that it is completely random

>> what {dst,src}ref->base is, apples and oranges; sometimes it is a pointer

>> (e.g. the argument of the function), sometimes the ADDR_EXPR operand,

>> sometimes the base of the reference, sometimes again address (if the

>> base of the reference is MEM_REF).  By the lack of consistency in what

>> it is, just deciding on its type whether you take TREE_TYPE or

>> TREE_TYPE (TREE_TYPE ()) of it also gives useless result.  You could e.g

>> call the memcpy etc. function with ADDR_EXPR of a VAR_DECL that has

>> pointer

>> type, then if dstref->base is that VAR_DECL, POINTER_TYPE_P (basetype)

>> would be true.

> 

> I think I understand what you're saying but this block is only

> used for string functions (not for memcpy), and only as a stopgap

> to avoid false positives.  Being limited to (a subset of) string

> functions the case I think you're concerned about, namely calling

> strcpy with a pointer to a pointer, shouldn't come up in valid

> code.  It's not bullet-proof but I don't think there is

> a fundamental problem, either with this patch or with the one

> that introduced it.  The fundamental problem is that MEM_REF

> can be ambiguous and that's what this code is trying to combat.

> See the example I gave here:

> https://gcc.gnu.org/ml/gcc/2018-02/msg00136.html

> 

> If you think I'm missing something please put together a small

> test case to help me see the problem.  I'm not nearly as

> comfortable thinking in terms of the internal representation

> to visualize all the possibilities here.

I think at least part of the issue here was that previously you had this
code in set_base_and_offset:

 HOST_WIDE_INT const_off;
  if (!base || !bytepos.is_constant (&const_off))
    {
      base = get_base_address (TREE_OPERAND (expr, 0));
      return;
    }

That is stripping away part of EXPR in a way I don't think was safe.
With this gone in the most recent patches I'm a lot less concerned.

Jeff
Jeff Law March 7, 2018, 7:37 p.m. | #15
On 02/26/2018 10:32 AM, Martin Sebor wrote:
> 

> PR tree-optimization/84526 - ICE in generic_overlap at gcc/gimple-ssa-warn-restrict.c:927 since r257860

> 

> gcc/ChangeLog:

> 

> 	PR tree-optimization/84526

> 	* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):

> 	Remove dead code.

> 	(builtin_access::generic_overlap): Be prepared to handle non-array

> 	base objects.

> 

> gcc/testsuite/ChangeLog:

> 

> 	PR tree-optimization/84526

> 	* gcc.dg/Wrestrict-10.c: New test.

> 

> Index: gcc/gimple-ssa-warn-restrict.c

> ===================================================================

> --- gcc/gimple-ssa-warn-restrict.c	(revision 257963)

> +++ gcc/gimple-ssa-warn-restrict.c	(working copy)

> @@ -396,6 +396,9 @@ builtin_memref::set_base_and_offset (tree expr)

>    if (TREE_CODE (expr) == ADDR_EXPR)

>      expr = TREE_OPERAND (expr, 0);

>  

> +  /* Stash the reference for offset validation.  */

> +  ref = expr;

> +

>    poly_int64 bitsize, bitpos;

>    tree var_off;

>    machine_mode mode;

> @@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)

>    base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,

>  			      &mode, &sign, &reverse, &vol);

>  

> +  /* get_inner_reference is not expected to return null.  */

> +  gcc_assert (base != NULL);

> +

>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>  

> -  HOST_WIDE_INT const_off;

> -  if (!base || !bytepos.is_constant (&const_off))

> +  /* Convert the poly_int64 offset to to offset_int.  The offset

> +     should be constant but be prepared for it not to be just in

> +     case.  */

> +  offset_int cstoff;

> +  if (bytepos.is_constant (&cstoff))

>      {

> -      base = get_base_address (TREE_OPERAND (expr, 0));

> -      return;

> +      offrange[0] += cstoff;

> +      offrange[1] += cstoff;

> +

> +      /* Besides the reference saved above, also stash the offset

> +	 for validation.  */

> +      if (TREE_CODE (expr) == COMPONENT_REF)

> +	refoff = cstoff;

>      }

> +  else

> +    offrange[1] += maxobjsize;

So is this assignment to offrange[1] really correct?

ISTM that it potentially overflows (relative to MAXOBJSIZE) and that
you'd likely be better off just assigning offrange[1] to MAXOBJSIZE.

Or alternately signal to the callers that we can't really analyze the
memory address and inhibit further analysis of the potential overlap of
the objects.

> @@ -918,12 +924,18 @@ builtin_access::generic_overlap ()

>    if (!overlap_certain)

>      {

>        if (!dstref->strbounded_p && !depends_p)

> +	/* Memcpy only considers certain overlap.  */

>  	return false;

>  

>        /* There's no way to distinguish an access to the same member

>  	 of a structure from one to two distinct members of the same

>  	 structure.  Give up to avoid excessive false positives.  */

> -      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));

> +      tree basetype = TREE_TYPE (dstref->base);

> +

> +      if (POINTER_TYPE_P (basetype)

> +	  || TREE_CODE (basetype) == ARRAY_TYPE)

> +	basetype = TREE_TYPE (basetype);

> +

>        if (RECORD_OR_UNION_TYPE_P (basetype))

>  	return false;

>      }

This is probably fairly reasonable.  My only concern would be that we
handle multi-dimensional arrays correctly.  Do you need to handle
ARRAY_TYPEs with a loop?

I do have a more general concern.  Given that we walk backwards through
pointer casts and ADDR_EXPRs we potentially lose information.  For
example we might walk backwards through a cast from a small array to a
larger array type.

Thus later we use the smaller array type when computing the bounds and
subsequent offset from BASE.  This could lead to a missed warning as the
computed offset would be too small.

I'm inclined to ack after fixing offrange[1] computation noted above as
I don't think the patch makes things any worse.  As I noted in a prior
message, I don't see concerns with consistency of BASE that Jakub sees here.

jeff
Martin Sebor March 7, 2018, 10:52 p.m. | #16
On 03/07/2018 12:37 PM, Jeff Law wrote:
> On 02/26/2018 10:32 AM, Martin Sebor wrote:

>>

>> PR tree-optimization/84526 - ICE in generic_overlap at gcc/gimple-ssa-warn-restrict.c:927 since r257860

>>

>> gcc/ChangeLog:

>>

>> 	PR tree-optimization/84526

>> 	* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):

>> 	Remove dead code.

>> 	(builtin_access::generic_overlap): Be prepared to handle non-array

>> 	base objects.

>>

>> gcc/testsuite/ChangeLog:

>>

>> 	PR tree-optimization/84526

>> 	* gcc.dg/Wrestrict-10.c: New test.

>>

>> Index: gcc/gimple-ssa-warn-restrict.c

>> ===================================================================

>> --- gcc/gimple-ssa-warn-restrict.c	(revision 257963)

>> +++ gcc/gimple-ssa-warn-restrict.c	(working copy)

>> @@ -396,6 +396,9 @@ builtin_memref::set_base_and_offset (tree expr)

>>    if (TREE_CODE (expr) == ADDR_EXPR)

>>      expr = TREE_OPERAND (expr, 0);

>>

>> +  /* Stash the reference for offset validation.  */

>> +  ref = expr;

>> +

>>    poly_int64 bitsize, bitpos;

>>    tree var_off;

>>    machine_mode mode;

>> @@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)

>>    base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,

>>  			      &mode, &sign, &reverse, &vol);

>>

>> +  /* get_inner_reference is not expected to return null.  */

>> +  gcc_assert (base != NULL);

>> +

>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>

>> -  HOST_WIDE_INT const_off;

>> -  if (!base || !bytepos.is_constant (&const_off))

>> +  /* Convert the poly_int64 offset to to offset_int.  The offset

>> +     should be constant but be prepared for it not to be just in

>> +     case.  */

>> +  offset_int cstoff;

>> +  if (bytepos.is_constant (&cstoff))

>>      {

>> -      base = get_base_address (TREE_OPERAND (expr, 0));

>> -      return;

>> +      offrange[0] += cstoff;

>> +      offrange[1] += cstoff;

>> +

>> +      /* Besides the reference saved above, also stash the offset

>> +	 for validation.  */

>> +      if (TREE_CODE (expr) == COMPONENT_REF)

>> +	refoff = cstoff;

>>      }

>> +  else

>> +    offrange[1] += maxobjsize;

> So is this assignment to offrange[1] really correct?

>

> ISTM that it potentially overflows (relative to MAXOBJSIZE) and that

> you'd likely be better off just assigning offrange[1] to MAXOBJSIZE.

>

> Or alternately signal to the callers that we can't really analyze the

> memory address and inhibit further analysis of the potential overlap of

> the objects.


The offsets are stored in offset_int and there is code to deal
with values outside the [-MAXOBJSIZE, MAXOBJSIZE] range, either
by capping them to the bounds of the array/object being accessed
if it's known, or to the MAXOBJSIZE range.  There are a number of
places where offsets with unknown values are being added together
and where their sum might end up exceeding that range (e.g., when
dealing with multiple offsets, each in some range) but as long
as the offsets are only added and not multiplied they should never
overflow offset_int.  It would be okay to assign MAXOBJSIZE here
instead of adding it, but there are other places with the same
addition (see the bottom of builtin_memref::extend_offset_range)
so doing it here alone would be inconsistent.

>> @@ -918,12 +924,18 @@ builtin_access::generic_overlap ()

>>    if (!overlap_certain)

>>      {

>>        if (!dstref->strbounded_p && !depends_p)

>> +	/* Memcpy only considers certain overlap.  */

>>  	return false;

>>

>>        /* There's no way to distinguish an access to the same member

>>  	 of a structure from one to two distinct members of the same

>>  	 structure.  Give up to avoid excessive false positives.  */

>> -      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));

>> +      tree basetype = TREE_TYPE (dstref->base);

>> +

>> +      if (POINTER_TYPE_P (basetype)

>> +	  || TREE_CODE (basetype) == ARRAY_TYPE)

>> +	basetype = TREE_TYPE (basetype);

>> +

>>        if (RECORD_OR_UNION_TYPE_P (basetype))

>>  	return false;

>>      }

> This is probably fairly reasonable.  My only concern would be that we

> handle multi-dimensional arrays correctly.  Do you need to handle

> ARRAY_TYPEs with a loop?


Yes, good catch, thanks!  It's unrelated to this bug (the ICE) but
it seems simple enough to handle at the same time so I've added
the loop and a test to verify it works as expected.

> I do have a more general concern.  Given that we walk backwards through

> pointer casts and ADDR_EXPRs we potentially lose information.  For

> example we might walk backwards through a cast from a small array to a

> larger array type.

>

> Thus later we use the smaller array type when computing the bounds and

> subsequent offset from BASE.  This could lead to a missed warning as the

> computed offset would be too small.


There are many false negatives here.  A lot of those I noticed
during testing are because of limitations in the strlen pass
(I must have raised a dozen bugs to track and, hopefully, fix
in stage 1).  There are also deficiencies in the restrict pass
itself.  For instance, using builtin_object_size to compute
the size of member arrays leads to many because bos computes
the size of the larger object.  It's on my to do list to open
bugs for all these false negatives irrespective of the strlen
limitations as a reminder to add test cases for the former.

> I'm inclined to ack after fixing offrange[1] computation noted above as

> I don't think the patch makes things any worse.  As I noted in a prior

> message, I don't see concerns with consistency of BASE that Jakub sees here.


Attached is a revision with the loop.  I didn't make any
changes to the offrange[1] computation.  If you want me to
change it wholesale for all the uses let me know, but I would
prefer to do it in a separate change, after fixing the ICE.

Martin
PR tree-optimization/84526 - ICE in generic_overlap at gcc/gimple-ssa-warn-restrict.c:927 since r257860

gcc/ChangeLog:

	PR tree-optimization/84526
	* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
	Remove dead code.
	(builtin_access::generic_overlap): Be prepared to handle non-array
	base objects.

gcc/testsuite/ChangeLog:

	PR tree-optimization/84526
	* gcc.dg/Wrestrict-10.c: New test.
	* gcc.dg/Wrestrict-11.c: New test.

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 258339)
+++ gcc/ChangeLog	(working copy)
@@ -1,5 +1,13 @@
 2018-03-07  Martin Sebor  <msebor@redhat.com>
 
+	PR tree-optimization/84526
+	* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
+	Remove dead code.
+	(builtin_access::generic_overlap): Be prepared to handle non-array
+	base objects.
+
+2018-03-07  Martin Sebor  <msebor@redhat.com>
+
 	PR tree-optimization/84468
 	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Consider successor
 	basic block when looking for nul assignment.
Index: gcc/gimple-ssa-warn-restrict.c
===================================================================
--- gcc/gimple-ssa-warn-restrict.c	(revision 258339)
+++ gcc/gimple-ssa-warn-restrict.c	(working copy)
@@ -396,6 +396,9 @@ builtin_memref::set_base_and_offset (tree expr)
   if (TREE_CODE (expr) == ADDR_EXPR)
     expr = TREE_OPERAND (expr, 0);
 
+  /* Stash the reference for offset validation.  */
+  ref = expr;
+
   poly_int64 bitsize, bitpos;
   tree var_off;
   machine_mode mode;
@@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)
   base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,
 			      &mode, &sign, &reverse, &vol);
 
+  /* get_inner_reference is not expected to return null.  */
+  gcc_assert (base != NULL);
+
   poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
 
-  HOST_WIDE_INT const_off;
-  if (!base || !bytepos.is_constant (&const_off))
+  /* Convert the poly_int64 offset to to offset_int.  The offset
+     should be constant but be prepared for it not to be just in
+     case.  */
+  offset_int cstoff;
+  if (bytepos.is_constant (&cstoff))
     {
-      base = get_base_address (TREE_OPERAND (expr, 0));
-      return;
+      offrange[0] += cstoff;
+      offrange[1] += cstoff;
+
+      /* Besides the reference saved above, also stash the offset
+	 for validation.  */
+      if (TREE_CODE (expr) == COMPONENT_REF)
+	refoff = cstoff;
     }
+  else
+    offrange[1] += maxobjsize;
 
-  offrange[0] += const_off;
-  offrange[1] += const_off;
-
   if (var_off)
     {
       if (TREE_CODE (var_off) == INTEGER_CST)
 	{
-	  offset_int cstoff = wi::to_offset (var_off);
+	  cstoff = wi::to_offset (var_off);
 	  offrange[0] += cstoff;
 	  offrange[1] += cstoff;
 	}
@@ -433,13 +446,6 @@ builtin_memref::set_base_and_offset (tree expr)
 	offrange[1] += maxobjsize;
     }
 
-  /* Stash the reference for offset validation.  */
-  ref = expr;
-
-  /* Also stash the constant offset for offset validation.  */
-  if (TREE_CODE (expr) == COMPONENT_REF)
-    refoff = const_off;
-
   if (TREE_CODE (base) == MEM_REF)
     {
       tree memrefoff = TREE_OPERAND (base, 1);
@@ -918,12 +924,20 @@ builtin_access::generic_overlap ()
   if (!overlap_certain)
     {
       if (!dstref->strbounded_p && !depends_p)
+	/* Memcpy only considers certain overlap.  */
 	return false;
 
       /* There's no way to distinguish an access to the same member
 	 of a structure from one to two distinct members of the same
 	 structure.  Give up to avoid excessive false positives.  */
-      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));
+      tree basetype = TREE_TYPE (dstref->base);
+
+      if (POINTER_TYPE_P (basetype))
+	basetype = TREE_TYPE (basetype);
+      else
+	while (TREE_CODE (basetype) == ARRAY_TYPE)
+	  basetype = TREE_TYPE (basetype);
+
       if (RECORD_OR_UNION_TYPE_P (basetype))
 	return false;
     }
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog	(revision 258339)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,5 +1,10 @@
 2018-03-07  Martin Sebor  <msebor@redhat.com>
 
+	PR tree-optimization/84526
+	* gcc.dg/Wrestrict-10.c: New test.
+
+2018-03-07  Martin Sebor  <msebor@redhat.com>
+
 	PR tree-optimization/84468
 	* g++.dg/warn/Wstringop-truncation-2.C: New test.
 	* gcc.dg/Wstringop-truncation.c: New test.
Index: gcc/testsuite/gcc.dg/Wrestrict-10.c
===================================================================
--- gcc/testsuite/gcc.dg/Wrestrict-10.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wrestrict-10.c	(working copy)
@@ -0,0 +1,121 @@
+/* PR tree-optimization/84526 - ICE in generic_overlap
+   { dg-do compile }
+   { dg-options "-O2 -Wrestrict" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void* restrict, const void* restrict, size_t);
+extern char* strcat (char* restrict, const char* restrict);
+extern char* strcpy (char* restrict, const char* restrict);
+extern char* strncat (char* restrict, const char* restrict, size_t);
+extern char* strncpy (char* restrict, const char* restrict, size_t);
+
+struct
+{
+  char a[1];
+} b;
+
+int i;
+size_t n;
+
+void __attribute__ ((noclone, noinline))
+test_arr_memcpy_1 (void)
+{
+  memcpy (&b.a[i], b.a, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_memcpy_2 (void)
+{
+  memcpy (b.a, &b.a[i], n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcat_1 (void)
+{
+  strcat (&b.a[i], b.a);            /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcat_2 (void)
+{
+  /* This probably deserves a warning.  */
+  strcpy (b.a, &b.a[i]);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strncat_1 (void)
+{
+  strncat (&b.a[i], b.a, n);        /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strncat_2 (void)
+{
+  strncat (b.a, &b.a[i], n);        /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcpy_1 (void)
+{
+  strcpy (&b.a[i], b.a);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcpy_2 (void)
+{
+  strcpy (b.a, &b.a[i]);
+}
+
+
+struct S {
+  int a;
+  char b[10];
+} d;
+
+void __attribute__ ((noclone, noinline))
+test_obj_memcpy_1 (void)
+{
+  memcpy (d.b, (char *) &d, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_memcpy_2 (void)
+{
+  memcpy ((char *) &d, d.b, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strcpy_1 (void)
+{
+  strcpy (d.b, (char *) &d);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strcpy_2 (void)
+{
+  strcpy ((char *) &d, d.b);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncat_1 (void)
+{
+  strncat (d.b, (char *) &d, n);    /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncat_2 (void)
+{
+  strncat ((char *) &d, d.b, n);    /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncpy_1 (void)
+{
+  strncpy (d.b, (char *) &d, n);
+}
+
+void test_obj_strncpy_2 (void)
+{
+  strncpy ((char *) &d, d.b, n);
+}
Index: gcc/testsuite/gcc.dg/Wrestrict-11.c
===================================================================
--- gcc/testsuite/gcc.dg/Wrestrict-11.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wrestrict-11.c	(working copy)
@@ -0,0 +1,205 @@
+/* PR tree-optimization/84526 - ICE in generic_overlap
+   Unrelated to the ICE but rather to PR 84095 that introduced it, verify
+   that calls to strncpy involving multidimensional arrays of structs don't
+   trigger false positive -Wrestrict warnings.
+   { dg-do compile }
+   { dg-options "-O2 -Wrestrict -ftrack-macro-expansion=0" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern char* strcpy (char*, const char*);
+
+struct MemArrays
+{
+  char a1[4];
+  char a2[4][4];
+  char a3[4][4][4];
+} ma1[4], ma2[4][4], ma3[4][4][4];
+
+#define T(dst, src) do {				\
+    strcpy (src, "123");				\
+    strcpy (dst, src);					\
+  } while (0)
+
+
+void test_ma1_cst (const char *s)
+{
+  T (ma1[0].a1, ma1[0].a1);           /* { dg-warning "\\\[-Wrestrict]" } */
+  T (ma1[0].a1, ma1[1].a1);
+  T (ma1[0].a1, ma1[2].a1);
+  T (ma1[0].a1, ma1[3].a1);
+
+  T (ma1[0].a1, ma1[0].a1);           /* { dg-warning "\\\[-Wrestrict]" } */
+  T (ma1[1].a1, ma1[0].a1);
+  T (ma1[2].a1, ma1[0].a1);
+  T (ma1[3].a1, ma1[0].a1);
+}
+
+
+void test_ma1_var_cst (const char *s, int i)
+{
+  T (ma1[i].a1, ma1[0].a1);
+  T (ma1[i].a1, ma1[1].a1);
+  T (ma1[i].a1, ma1[2].a1);
+  T (ma1[i].a1, ma1[3].a1);
+
+  T (ma1[0].a1, ma1[i].a1);
+  T (ma1[1].a1, ma1[i].a1);
+  T (ma1[2].a1, ma1[i].a1);
+  T (ma1[3].a1, ma1[i].a1);
+}
+
+
+void test_ma1_var_var (const char *s, int i, int j)
+{
+  T (ma1[i].a1, ma1[j].a1);
+  T (ma1[i].a1, ma1[j].a1);
+  T (ma1[i].a1, ma1[j].a1);
+  T (ma1[i].a1, ma1[j].a1);
+
+  T (ma1[i].a1, ma1[j].a1);
+  T (ma1[i].a1, ma1[j].a1);
+  T (ma1[i].a1, ma1[j].a1);
+  T (ma1[i].a1, ma1[j].a1);
+}
+
+
+void test_ma2_cst (const char *s)
+{
+  T (ma2[0][0].a1, ma2[0][0].a1);     /* { dg-warning "\\\[-Wrestrict]" } */
+  T (ma2[0][0].a1, ma2[0][1].a1);
+  T (ma2[0][0].a1, ma2[0][2].a1);
+  T (ma2[0][0].a1, ma2[0][3].a1);
+
+  T (ma2[0][0].a1, ma2[1][0].a1);
+  T (ma2[0][0].a1, ma2[1][1].a1);
+  T (ma2[0][0].a1, ma2[1][2].a1);
+  T (ma2[0][0].a1, ma2[1][3].a1);
+
+  T (ma2[0][0].a1, ma2[2][0].a1);
+  T (ma2[0][0].a1, ma2[2][1].a1);
+  T (ma2[0][0].a1, ma2[2][2].a1);
+  T (ma2[0][0].a1, ma2[2][3].a1);
+
+  T (ma2[0][0].a1, ma2[3][0].a1);
+  T (ma2[0][0].a1, ma2[3][1].a1);
+  T (ma2[0][0].a1, ma2[3][2].a1);
+  T (ma2[0][0].a1, ma2[3][3].a1);
+
+
+  T (ma2[0][1].a1, ma2[0][0].a1);
+  T (ma2[0][1].a1, ma2[0][1].a1);     /* { dg-warning "\\\[-Wrestrict]" } */
+  T (ma2[0][1].a1, ma2[0][2].a1);
+  T (ma2[0][1].a1, ma2[0][3].a1);
+
+  T (ma2[0][1].a1, ma2[1][0].a1);
+  T (ma2[0][1].a1, ma2[1][1].a1);
+  T (ma2[0][1].a1, ma2[1][2].a1);
+  T (ma2[0][1].a1, ma2[1][3].a1);
+
+  T (ma2[0][1].a1, ma2[2][0].a1);
+  T (ma2[0][1].a1, ma2[2][1].a1);
+  T (ma2[0][1].a1, ma2[2][2].a1);
+  T (ma2[0][1].a1, ma2[2][3].a1);
+
+  T (ma2[0][1].a1, ma2[3][0].a1);
+  T (ma2[0][1].a1, ma2[3][1].a1);
+  T (ma2[0][1].a1, ma2[3][2].a1);
+  T (ma2[0][1].a1, ma2[3][3].a1);
+
+
+  T (ma2[0][2].a1, ma2[0][0].a1);
+  T (ma2[0][2].a1, ma2[0][1].a1);
+  T (ma2[0][2].a1, ma2[0][2].a1);     /* { dg-warning "\\\[-Wrestrict]" } */
+  T (ma2[0][2].a1, ma2[0][3].a1);
+
+  T (ma2[0][2].a1, ma2[1][0].a1);
+  T (ma2[0][2].a1, ma2[1][1].a1);
+  T (ma2[0][2].a1, ma2[1][2].a1);
+  T (ma2[0][2].a1, ma2[1][3].a1);
+
+  T (ma2[0][2].a1, ma2[2][0].a1);
+  T (ma2[0][2].a1, ma2[2][1].a1);
+  T (ma2[0][2].a1, ma2[2][2].a1);
+  T (ma2[0][2].a1, ma2[2][3].a1);
+
+  T (ma2[0][2].a1, ma2[3][0].a1);
+  T (ma2[0][2].a1, ma2[3][1].a1);
+  T (ma2[0][2].a1, ma2[3][2].a1);
+  T (ma2[0][2].a1, ma2[3][3].a1);
+
+
+  T (ma2[0][3].a1, ma2[0][0].a1);
+  T (ma2[0][3].a1, ma2[0][1].a1);
+  T (ma2[0][3].a1, ma2[0][2].a1);
+  T (ma2[0][3].a1, ma2[0][3].a1);     /* { dg-warning "\\\[-Wrestrict]" } */
+
+  T (ma2[0][3].a1, ma2[1][0].a1);
+  T (ma2[0][3].a1, ma2[1][1].a1);
+  T (ma2[0][3].a1, ma2[1][2].a1);
+  T (ma2[0][3].a1, ma2[1][3].a1);
+
+  T (ma2[0][3].a1, ma2[2][0].a1);
+  T (ma2[0][3].a1, ma2[2][1].a1);
+  T (ma2[0][3].a1, ma2[2][2].a1);
+  T (ma2[0][3].a1, ma2[2][3].a1);
+
+  T (ma2[0][3].a1, ma2[3][0].a1);
+  T (ma2[0][3].a1, ma2[3][1].a1);
+  T (ma2[0][3].a1, ma2[3][2].a1);
+  T (ma2[0][3].a1, ma2[3][3].a1);
+}
+
+
+void test_ma2_var (int i0, int j0, int i1, int j1)
+{
+  T (ma2[i0][j0].a1, ma2[i0][j0].a1);       /* { dg-warning "\\\[-Wrestrict]" } */
+
+  T (ma2[i0][j0].a1, ma2[i0][j1].a1);       /* { dg-bogus "\\\[-Wrestrict]" } */
+  T (ma2[i0][j0].a1, ma2[i1][j1].a1);       /* { dg-bogus "\\\[-Wrestrict]" } */
+
+  T (ma2[0][0].a2[i0], ma2[0][0].a2[j0]);   /* { dg-bogus "\\\[-Wrestrict]" } */
+  T (ma2[0][i0].a2[0], ma2[0][i1].a2[0]);   /* { dg-bogus "\\\[-Wrestrict]" } */
+  T (ma2[i0][0].a2[0], ma2[i1][0].a2[0]);   /* { dg-bogus "\\\[-Wrestrict]" } */
+  T (ma2[i0][j0].a2[0], ma2[i1][j1].a2[0]); /* { dg-bogus "\\\[-Wrestrict]" } */
+}
+
+
+void test_p2_var (struct MemArrays **p2, int i0, int j0, int i1, int j1)
+{
+  T (p2[i0][j0].a1, p2[i0][j0].a1);         /* { dg-warning "\\\[-Wrestrict]" } */
+
+  T (p2[i0][j0].a1, p2[i0][j1].a1);
+  T (p2[i0][j0].a1, p2[i1][j1].a1);
+
+  T (p2[0][0].a2[i0], p2[0][0].a2[j0]);
+  T (p2[0][i0].a2[0], p2[0][i1].a2[0]);
+  T (p2[i0][0].a2[0], p2[i1][0].a2[0]);
+  T (p2[i0][j0].a2[0], p2[i1][j1].a2[0]);
+}
+
+
+void test_ma3_cst (const char *s)
+{
+  T (ma3[0][0][0].a1, ma3[0][0][0].a1); /* { dg-warning "\\\[-Wrestrict]" } */
+  T (ma3[0][0][0].a1, ma3[0][0][3].a1);
+
+  T (ma3[0][0][0].a1, ma3[0][1][0].a1);
+  T (ma3[0][0][0].a1, ma3[0][1][3].a1);
+  T (ma3[0][0][0].a1, ma3[1][0][0].a1);
+  T (ma3[0][0][0].a1, ma3[1][0][3].a1);
+  T (ma3[0][0][0].a1, ma3[3][0][3].a1);
+  T (ma3[0][0][0].a1, ma3[3][3][3].a1);
+}
+
+
+void test_ma3_var (const char *s,
+		   int i0, int j0, int k0,
+		   int i1, int j1, int k1)
+{
+  T (ma3[i0][j0][k0].a1, ma3[i0][j0][k0].a1);   /* { dg-warning "\\\[-Wrestrict]" } */
+
+  T (ma3[i0][j0][k0].a1, ma3[i0][j0][k1].a1);   /* { dg-bogus "\\\[-Wrestrict]" } */
+  T (ma3[i0][j0][k0].a1, ma3[i0][j1][k1].a1);   /* { dg-bogus "\\\[-Wrestrict]" } */
+  T (ma3[i0][j0][k0].a1, ma3[i1][j1][k1].a1);   /* { dg-bogus "\\\[-Wrestrict]" } */
+}
Richard Sandiford March 7, 2018, 11:04 p.m. | #17
Martin Sebor <msebor@gmail.com> writes:
> @@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)

>    base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,

>  			      &mode, &sign, &reverse, &vol);

>  

> +  /* get_inner_reference is not expected to return null.  */

> +  gcc_assert (base != NULL);

> +

>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>  

> -  HOST_WIDE_INT const_off;

> -  if (!base || !bytepos.is_constant (&const_off))

> +  /* Convert the poly_int64 offset to to offset_int.  The offset

> +     should be constant but be prepared for it not to be just in

> +     case.  */


I know it's just repeating what I said in the other reply, but I think
we should drop this comment.  It doesn't add anything and it will quickly
become out of date once the frontend supports variable-length types.

> +  offset_int cstoff;

> +  if (bytepos.is_constant (&cstoff))


Also, same comment about this being less efficient than using the
natural type of HOST_WIDE_INT.  I think this and...

>      {

> -      base = get_base_address (TREE_OPERAND (expr, 0));

> -      return;

> +      offrange[0] += cstoff;

> +      offrange[1] += cstoff;

> +

> +      /* Besides the reference saved above, also stash the offset

> +	 for validation.  */

> +      if (TREE_CODE (expr) == COMPONENT_REF)

> +	refoff = cstoff;

>      }

> +  else

> +    offrange[1] += maxobjsize;

>  

> -  offrange[0] += const_off;

> -  offrange[1] += const_off;

> -

>    if (var_off)

>      {

>        if (TREE_CODE (var_off) == INTEGER_CST)

>  	{

> -	  offset_int cstoff = wi::to_offset (var_off);

> +	  cstoff = wi::to_offset (var_off);

>  	  offrange[0] += cstoff;

>  	  offrange[1] += cstoff;


...this are logically separate variables, so there's no need for
them to have the same type and name.

Richard
Martin Sebor March 7, 2018, 11:59 p.m. | #18
On 03/07/2018 04:04 PM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:

>> @@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)

>>    base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,

>>  			      &mode, &sign, &reverse, &vol);

>>

>> +  /* get_inner_reference is not expected to return null.  */

>> +  gcc_assert (base != NULL);

>> +

>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>

>> -  HOST_WIDE_INT const_off;

>> -  if (!base || !bytepos.is_constant (&const_off))

>> +  /* Convert the poly_int64 offset to to offset_int.  The offset

>> +     should be constant but be prepared for it not to be just in

>> +     case.  */

>

> I know it's just repeating what I said in the other reply, but I think

> we should drop this comment.  It doesn't add anything and it will quickly

> become out of date once the frontend supports variable-length types.


For some reason I never got the email you're referring to (I just
found it in the archives, after some head scratching as to what
you meant above).

I added the comment based on your feedback that it can't happen
yet (as an explanation for not calling to_constant() instead),
but I'll remove it before I commit the patch.

Otherwise, I only made these changes here because Jakub asked
me to.  They are unrelated to the fix for the ICE and so they
should have been made independently of it.  But since they are
correct and any difference in efficiency between HOST_WIDE_INT
and offset_int here is surely insignificant (and since you
also said you were fine either way) I'll leave the offset_int
in place.

Thanks
Martin
Jeff Law March 9, 2018, 7:07 p.m. | #19
On 03/07/2018 03:52 PM, Martin Sebor wrote:
> On 03/07/2018 12:37 PM, Jeff Law wrote:

>> On 02/26/2018 10:32 AM, Martin Sebor wrote:

>>>

>>> PR tree-optimization/84526 - ICE in generic_overlap at gcc/gimple-ssa-warn-restrict.c:927 since r257860

>>>

>>>

>>> gcc/ChangeLog:

>>>

>>>     PR tree-optimization/84526

>>>     * gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):

>>>     Remove dead code.

>>>     (builtin_access::generic_overlap): Be prepared to handle non-array

>>>     base objects.

>>>

>>> gcc/testsuite/ChangeLog:

>>>

>>>     PR tree-optimization/84526

>>>     * gcc.dg/Wrestrict-10.c: New test.

>>>

>>> Index: gcc/gimple-ssa-warn-restrict.c

>>> ===================================================================

>>> --- gcc/gimple-ssa-warn-restrict.c    (revision 257963)

>>> +++ gcc/gimple-ssa-warn-restrict.c    (working copy)

>>> @@ -396,6 +396,9 @@ builtin_memref::set_base_and_offset (tree expr)

>>>    if (TREE_CODE (expr) == ADDR_EXPR)

>>>      expr = TREE_OPERAND (expr, 0);

>>>

>>> +  /* Stash the reference for offset validation.  */

>>> +  ref = expr;

>>> +

>>>    poly_int64 bitsize, bitpos;

>>>    tree var_off;

>>>    machine_mode mode;

>>> @@ -409,23 +412,33 @@ builtin_memref::set_base_and_offset (tree expr)

>>>    base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,

>>>                    &mode, &sign, &reverse, &vol);

>>>

>>> +  /* get_inner_reference is not expected to return null.  */

>>> +  gcc_assert (base != NULL);

>>> +

>>>    poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);

>>>

>>> -  HOST_WIDE_INT const_off;

>>> -  if (!base || !bytepos.is_constant (&const_off))

>>> +  /* Convert the poly_int64 offset to to offset_int.  The offset

>>> +     should be constant but be prepared for it not to be just in

>>> +     case.  */

>>> +  offset_int cstoff;

>>> +  if (bytepos.is_constant (&cstoff))

>>>      {

>>> -      base = get_base_address (TREE_OPERAND (expr, 0));

>>> -      return;

>>> +      offrange[0] += cstoff;

>>> +      offrange[1] += cstoff;

>>> +

>>> +      /* Besides the reference saved above, also stash the offset

>>> +     for validation.  */

>>> +      if (TREE_CODE (expr) == COMPONENT_REF)

>>> +    refoff = cstoff;

>>>      }

>>> +  else

>>> +    offrange[1] += maxobjsize;

>> So is this assignment to offrange[1] really correct?

>>

>> ISTM that it potentially overflows (relative to MAXOBJSIZE) and that

>> you'd likely be better off just assigning offrange[1] to MAXOBJSIZE.

>>

>> Or alternately signal to the callers that we can't really analyze the

>> memory address and inhibit further analysis of the potential overlap of

>> the objects.

> 

> The offsets are stored in offset_int and there is code to deal

> with values outside the [-MAXOBJSIZE, MAXOBJSIZE] range, either

> by capping them to the bounds of the array/object being accessed

> if it's known, or to the MAXOBJSIZE range.  There are a number of

> places where offsets with unknown values are being added together

> and where their sum might end up exceeding that range (e.g., when

> dealing with multiple offsets, each in some range) but as long

> as the offsets are only added and not multiplied they should never

> overflow offset_int.  It would be okay to assign MAXOBJSIZE here

> instead of adding it, but there are other places with the same

> addition (see the bottom of builtin_memref::extend_offset_range)

> so doing it here alone would be inconsistent.

OK.  Thanks for explaining.

> 

>>> @@ -918,12 +924,18 @@ builtin_access::generic_overlap ()

>>>    if (!overlap_certain)

>>>      {

>>>        if (!dstref->strbounded_p && !depends_p)

>>> +    /* Memcpy only considers certain overlap.  */

>>>      return false;

>>>

>>>        /* There's no way to distinguish an access to the same member

>>>       of a structure from one to two distinct members of the same

>>>       structure.  Give up to avoid excessive false positives.  */

>>> -      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));

>>> +      tree basetype = TREE_TYPE (dstref->base);

>>> +

>>> +      if (POINTER_TYPE_P (basetype)

>>> +      || TREE_CODE (basetype) == ARRAY_TYPE)

>>> +    basetype = TREE_TYPE (basetype);

>>> +

>>>        if (RECORD_OR_UNION_TYPE_P (basetype))

>>>      return false;

>>>      }

>> This is probably fairly reasonable.  My only concern would be that we

>> handle multi-dimensional arrays correctly.  Do you need to handle

>> ARRAY_TYPEs with a loop?

> 

> Yes, good catch, thanks!  It's unrelated to this bug (the ICE) but

> it seems simple enough to handle at the same time so I've added

> the loop and a test to verify it works as expected.

> 

>> I do have a more general concern.  Given that we walk backwards through

>> pointer casts and ADDR_EXPRs we potentially lose information.  For

>> example we might walk backwards through a cast from a small array to a

>> larger array type.

Agreed.  Thanks for fixing this as well.  Agreed it seems simple enough
to go ahead and handle now.


>>

>> Thus later we use the smaller array type when computing the bounds and

>> subsequent offset from BASE.  This could lead to a missed warning as the

>> computed offset would be too small.

> 

> There are many false negatives here.  A lot of those I noticed

> during testing are because of limitations in the strlen pass

> (I must have raised a dozen bugs to track and, hopefully, fix

> in stage 1).  There are also deficiencies in the restrict pass

> itself.  For instance, using builtin_object_size to compute

> the size of member arrays leads to many because bos computes

> the size of the larger object.  It's on my to do list to open

> bugs for all these false negatives irrespective of the strlen

> limitations as a reminder to add test cases for the former.

Understood.  I should have been clearer that I don't think we
necessarily have to address the false negatives right now.  Just that
this is a potential source of them.


> 

>> I'm inclined to ack after fixing offrange[1] computation noted above as

>> I don't think the patch makes things any worse.  As I noted in a prior

>> message, I don't see concerns with consistency of BASE that Jakub sees here.

>>

> 

> Attached is a revision with the loop.  I didn't make any

> changes to the offrange[1] computation.  If you want me to

> change it wholesale for all the uses let me know, but I would

> prefer to do it in a separate change, after fixing the ICE.

> 

> Martin

> 

> 

> gcc-84526.diff

> 

> 

> PR tree-optimization/84526 - ICE in generic_overlap at gcc/gimple-ssa-warn-restrict.c:927 since r257860

> 

> gcc/ChangeLog:

> 

> 	PR tree-optimization/84526

> 	* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):

> 	Remove dead code.

> 	(builtin_access::generic_overlap): Be prepared to handle non-array

> 	base objects.

> 

> gcc/testsuite/ChangeLog:

> 

> 	PR tree-optimization/84526

> 	* gcc.dg/Wrestrict-10.c: New test.

> 	* gcc.dg/Wrestrict-11.c: New test.

OK.  I think Richard S. had some follow-up issues that didn't affect
correctness/behavior.  Those clean ups are pre-approved.


Given we've seen a few instances pop up in Fedora, I'm going to go ahead
and commit now.  At the least it'll stop duplicates in our BZ and if
Jakub is spinning a new Fedora release over the weekend, it'd get picked
up for Fedora as well.

Jeff

Patch

PR tree-optimization/84526 - ICE in generic_overlap at gcc/gimple-ssa-warn-restrict.c:927 since r257860

gcc/ChangeLog:

	PR tree-optimization/84526
	* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
	Remove dead code.
	(builtin_access::generic_overlap): Be prepared to handle non-array
	base objects.

gcc/testsuite/ChangeLog:

	PR tree-optimization/84526
	* gcc.dg/Wrestrict-10.c: New test.

Index: gcc/gimple-ssa-warn-restrict.c
===================================================================
--- gcc/gimple-ssa-warn-restrict.c	(revision 257933)
+++ gcc/gimple-ssa-warn-restrict.c	(working copy)
@@ -409,15 +409,15 @@  builtin_memref::set_base_and_offset (tree expr)
   base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,
 			      &mode, &sign, &reverse, &vol);
 
+  /* get_inner_reference is not expected to return null.  */
+  gcc_assert (base != NULL);
+
   poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
 
-  HOST_WIDE_INT const_off;
-  if (!base || !bytepos.is_constant (&const_off))
-    {
-      base = get_base_address (TREE_OPERAND (expr, 0));
-      return;
-    }
-
+  /* There is no conversion from poly_int64 to offset_int even
+     though the latter is wider, so go through HOST_WIDE_INT.
+     The offset is expected to always be constant.  */
+  HOST_WIDE_INT const_off = bytepos.to_constant ();
   offrange[0] += const_off;
   offrange[1] += const_off;
 
@@ -923,7 +923,11 @@  builtin_access::generic_overlap ()
       /* There's no way to distinguish an access to the same member
 	 of a structure from one to two distinct members of the same
 	 structure.  Give up to avoid excessive false positives.  */
-      tree basetype = TREE_TYPE (TREE_TYPE (dstref->base));
+      tree basetype = TREE_TYPE (dstref->base);
+      if (POINTER_TYPE_P (basetype)
+	  || TREE_CODE (basetype) == ARRAY_TYPE)
+	basetype = TREE_TYPE (basetype);
+
       if (RECORD_OR_UNION_TYPE_P (basetype))
 	return false;
     }
Index: gcc/testsuite/gcc.dg/Wrestrict-10.c
===================================================================
--- gcc/testsuite/gcc.dg/Wrestrict-10.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wrestrict-10.c	(working copy)
@@ -0,0 +1,121 @@ 
+/* PR tree-optimization/84526 - ICE in generic_overlap
+   { dg-do compile }
+   { dg-options "-O2 -Wrestrict" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* memcpy (void* restrict, const void* restrict, size_t);
+extern char* strcat (char* restrict, const char* restrict);
+extern char* strcpy (char* restrict, const char* restrict);
+extern char* strncat (char* restrict, const char* restrict, size_t);
+extern char* strncpy (char* restrict, const char* restrict, size_t);
+
+struct
+{
+  char a[1];
+} b;
+
+int i;
+size_t n;
+
+void __attribute__ ((noclone, noinline))
+test_arr_memcpy_1 (void)
+{
+  memcpy (&b.a[i], b.a, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_memcpy_2 (void)
+{
+  memcpy (b.a, &b.a[i], n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcat_1 (void)
+{
+  strcat (&b.a[i], b.a);            /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcat_2 (void)
+{
+  /* This probably deserves a warning.  */
+  strcpy (b.a, &b.a[i]);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strncat_1 (void)
+{
+  strncat (&b.a[i], b.a, n);        /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strncat_2 (void)
+{
+  strncat (b.a, &b.a[i], n);        /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcpy_1 (void)
+{
+  strcpy (&b.a[i], b.a);
+}
+
+void __attribute__ ((noclone, noinline))
+test_arr_strcpy_2 (void)
+{
+  strcpy (b.a, &b.a[i]);
+}
+
+
+struct S {
+  int a;
+  char b[10];
+} d;
+
+void __attribute__ ((noclone, noinline))
+test_obj_memcpy_1 (void)
+{
+  memcpy (d.b, (char *) &d, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_memcpy_2 (void)
+{
+  memcpy ((char *) &d, d.b, n);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strcpy_1 (void)
+{
+  strcpy (d.b, (char *) &d);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strcpy_2 (void)
+{
+  strcpy ((char *) &d, d.b);
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncat_1 (void)
+{
+  strncat (d.b, (char *) &d, n);    /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncat_2 (void)
+{
+  strncat ((char *) &d, d.b, n);    /* { dg-warning "\\\[-Wrestrict" } */
+}
+
+void __attribute__ ((noclone, noinline))
+test_obj_strncpy_1 (void)
+{
+  strncpy (d.b, (char *) &d, n);
+}
+
+void test_obj_strncpy_2 (void)
+{
+  strncpy ((char *) &d, d.b, n);
+}