adjust object size computation for union accesses and PHIs (PR 92765)

Message ID 64a504e5-8e1f-682a-0443-e74e62d3aac1@gmail.com
State New
Headers show
Series
  • adjust object size computation for union accesses and PHIs (PR 92765)
Related show

Commit Message

Martin Sebor Jan. 15, 2020, 1:18 p.m.
The strcmp optimization newly introduced in GCC 10 relies on
the size of the smallest referenced array object to determine
whether the function can return zero.  When the size of
the object is smaller than the length of the other string
argument the optimization folds the equality to false.

The bug report has identified a couple of problems here:
1) when the access to the array object is via a pointer to
a (possibly indirect) member of a union, in GIMPLE the pointer
may actually point to a different member than the one in
the original source code.  Thus the size of the array may
appear to be smaller than in the source code which can then
result in the optimization being invalid.
2) when the pointer in the access may point to two or more
arrays of different size (i.e., it's the result of a PHI),
assuming it points to the smallest of them can also lead
to an incorrect result when the optimization is applied.

The attached patch adjusts the optimization to 1) avoid making
any assumptions about the sizes of objects accessed via union
types, and b) use the size of the largest object in PHI nodes.

Tested on x86_64-linux.

Martin

Comments

Jeff Law Jan. 15, 2020, 8:29 p.m. | #1
On Wed, 2020-01-15 at 13:18 +0000, Martin Sebor wrote:
> The strcmp optimization newly introduced in GCC 10 relies on

> the size of the smallest referenced array object to determine

> whether the function can return zero.  When the size of

> the object is smaller than the length of the other string

> argument the optimization folds the equality to false.

> 

> The bug report has identified a couple of problems here:

> 1) when the access to the array object is via a pointer to

> a (possibly indirect) member of a union, in GIMPLE the pointer

> may actually point to a different member than the one in

> the original source code.  Thus the size of the array may

> appear to be smaller than in the source code which can then

> result in the optimization being invalid.

> 2) when the pointer in the access may point to two or more

> arrays of different size (i.e., it's the result of a PHI),

> assuming it points to the smallest of them can also lead

> to an incorrect result when the optimization is applied.

> 

> The attached patch adjusts the optimization to 1) avoid making

> any assumptions about the sizes of objects accessed via union

> types, and b) use the size of the largest object in PHI nodes.

> 

> Tested on x86_64-linux.

> 

> Martin

> PR tree-optimization/92765 - wrong code for strcmp of a union member

> 

> gcc/testsuite/ChangeLog:

> 

> 	PR tree-optimization/92765

> 	* gcc.dg/strlenopt-92.c: New test.

> 

> gcc/ChangeLog:

> 

> 	PR tree-optimization/92765

> 	* tree-ssa-strlen.c (component_ref_via_union_p): New function.

> 	(determine_min_objsize): Call it.  Use the maximum object size

> 	for PHI arguments.

OK
jeff
Jakub Jelinek Jan. 15, 2020, 8:52 p.m. | #2
On Wed, Jan 15, 2020 at 01:18:54PM +0000, Martin Sebor wrote:
> @@ -4099,14 +4122,18 @@ determine_min_objsize (tree dest)

>  

>    init_object_sizes ();

>  

> -  if (compute_builtin_object_size (dest, 2, &size))

> -    return size;

> -

>    /* Try to determine the size of the object through the RHS

>       of the assign statement.  */

>    if (TREE_CODE (dest) == SSA_NAME)

>      {

>        gimple *stmt = SSA_NAME_DEF_STMT (dest);

> +

> +      /* Determine the size of the largest object when DEST refers

> +	 to two or more via a PHI, otherwise the smallest.  */

> +      int ostype = gimple_code (stmt) == GIMPLE_PHI ? 0 : 2;

> +      if (compute_builtin_object_size (dest, ostype, &size))

> +	return size;

> +

>        if (!is_gimple_assign (stmt))

>  	return HOST_WIDE_INT_M1U;

>  

> @@ -4118,6 +4145,10 @@ determine_min_objsize (tree dest)

>        return determine_min_objsize (dest);

>      }

>  

> +  /* Try to determine the size of the referenced object itself.  */

> +  if (compute_builtin_object_size (dest, 2, &size))

> +    return size;

> +


This looks wrong.  For one, this function is used for two purposes now and
you tweak it for one, but more importantly, whether he initial stmt
you see is a PHI or not can't make a difference, how is that case e.g.
different from _1 = PHI <_3, _4>; _2 = _1 + 1; and asking about _2?
For _1, you'd use (correctly) the maximum, but if called on _2, you'd ask
(wrongly) for minimum instead of maximum.

>    /* The size of a flexible array cannot be determined.  Otherwise,

> -     for arrays with more than one element, return the size of its

> -     type.  GCC itself misuses arrays of both zero and one elements

> -     as flexible array members so they are excluded as well.  */

> +     unless the reference involves a union, for arrays with more than

> +     one element, return the size of its type.  GCC itself misuses

> +     arrays of both zero and one elements as flexible array members

> +     so they are excluded as well.  */

>    if (TREE_CODE (type) != ARRAY_TYPE

> -      || !array_at_struct_end_p (dest))

> +      || (!component_ref_via_union_p (dest)

> +	  && !array_at_struct_end_p (dest)))

>      {

>        tree type_size = TYPE_SIZE_UNIT (type);

>        if (type_size && TREE_CODE (type_size) == INTEGER_CST


This also looks like a hack to shut up the particular testcases instead of
really playing with what the IL provides.  Instead of the unions, consider
e.g. C++ placement new, have a pointer to a buffer into which you placement
new one structure, take address of some member in it, pass it to something,
if it doesn't have a destructor do a C++ placement new into the same buffer
but with different structure, take address of a different member with the
same address as the first member, do the str*cmp on it that invokes this
stuff.  SCCVN will (likely) find out that the values of those two pointers
are the same and just use the former pointer in the latter case.

	Jakub
Jakub Jelinek Jan. 16, 2020, 12:23 p.m. | #3
On Wed, Jan 15, 2020 at 09:52:57PM +0100, Jakub Jelinek wrote:
> This looks wrong.  For one, this function is used for two purposes now and

> you tweak it for one, but more importantly, whether he initial stmt

> you see is a PHI or not can't make a difference, how is that case e.g.

> different from _1 = PHI <_3, _4>; _2 = _1 + 1; and asking about _2?

> For _1, you'd use (correctly) the maximum, but if called on _2, you'd ask

> (wrongly) for minimum instead of maximum.


And now with testcases.  strlenopt-95.c shows the above.

> This also looks like a hack to shut up the particular testcases instead of

> really playing with what the IL provides.  Instead of the unions, consider

> e.g. C++ placement new, have a pointer to a buffer into which you placement

> new one structure, take address of some member in it, pass it to something,

> if it doesn't have a destructor do a C++ placement new into the same buffer

> but with different structure, take address of a different member with the

> same address as the first member, do the str*cmp on it that invokes this

> stuff.  SCCVN will (likely) find out that the values of those two pointers

> are the same and just use the former pointer in the latter case.


And strlenopt-93.C shows the latter.  strlenopt-94.C is similar, just to
show that it breaks equally badly with non-PODs that will be constructed by
placement new and destructed later.

	Jakub
__attribute__((noipa)) int
barrier_copy (char *x, int y)
{
  asm volatile ("" : : "g" (x), "g" (y) : "memory");
  if (y == 0)
    __builtin_strcpy (x, "abcd");
  return y;
}

__attribute__((noipa)) char *
test_2 (int x)
{
  char *p;
  if (x)
    p = __builtin_malloc (4);
  else
    p = __builtin_calloc (16, 1);
  char *q = p + 2;
  if (barrier_copy (q, x))
    return p;
  if (__builtin_strcmp (q, "abcd") != 0)
    __builtin_abort ();
  return p;
}

int
main ()
{
  __builtin_free (test_2 (0));
  __builtin_free (test_2 (1));
  return 0;
}
#include <new>

struct S1 { char a[2]; char b[2]; char c[2]; };
struct S2 { char d[6]; };

__attribute__((noipa)) void
foo (char *b)
{
  b[0] = 1;
  b[1] = 2;
  asm volatile ("" : : "g" (b) : "memory");
}

__attribute__((noipa)) void
bar (char *d)
{
  __builtin_memcpy (d, "cde", 4);
  asm volatile ("" : : "g" (d) : "memory");
}

__attribute__((noipa)) void
baz (char *buf)
{
  S1 *s1 = new (buf) S1;
  char *p = (char *) &s1->b;
  foo (p);
  S2 *s2 = new (buf) S2;
  char *q = (char *) &s2->d[2];
  bar (q);
  if (__builtin_strcmp (q, "cde"))
    __builtin_abort ();
}

int
main ()
{
  union U { S1 s1; S2 s2; char buf[sizeof (S1) > sizeof (S2) ? sizeof (S1) : sizeof (S2)]; } u;
  baz (u.buf);
  return 0;
}
#include <new>

struct S1 { char a[2]; char b[2]; char c[2]; S1 () { a[0] = 0; b[0] = 0; c[0] = 0; }; ~S1 () {} };
struct S2 { char d[6]; S2 () { d[0] = 0; d[2] = 0; } ~S2 () {} };

__attribute__((noipa)) void
foo (char *b)
{
  b[0] = 1;
  b[1] = 2;
  asm volatile ("" : : "g" (b) : "memory");
}

__attribute__((noipa)) void
bar (char *d)
{
  __builtin_memcpy (d, "cde", 4);
  asm volatile ("" : : "g" (d) : "memory");
}

__attribute__((noipa)) void
baz (char *buf)
{
  S1 *s1 = new (buf) S1 ();
  char *p = (char *) &s1->b;
  foo (p);
  s1->~S1 ();
  S2 *s2 = new (buf) S2 ();
  char *q = (char *) &s2->d[2];
  bar (q);
  if (__builtin_strcmp (q, "cde"))
    __builtin_abort ();
  s2->~S2 ();
}

int
main ()
{
  char buf[sizeof (S1) > sizeof (S2) ? sizeof (S1) : sizeof (S2)];
  baz (buf);
  return 0;
}
Martin Sebor Jan. 31, 2020, 7:04 p.m. | #4
Attached is a reworked patch since the first one didn't go far
enough to solve the major problems.  The new solution relies on
get_range_strlen_dynamic the same way as the sprintf optimization,
and does away with the determine_min_objsize function and calling
compute_builtin_object_size.

To minimize testsuite fallout I extended get_range_strlen to handle
a couple more kinds expressions(*), but I still had to xfail and
disable a few tests that were relying on being able to use the type
of the destination object as the upper bound on the string length.

Tested on x86_64-linux.

Martin

[*] With all the issues around MEM_REFs and types this change needs
extra scrutiny.  I'm still not sure I fully understand what can and
what cannot be safely relied on at this level.

On 1/15/20 6:18 AM, Martin Sebor wrote:
> The strcmp optimization newly introduced in GCC 10 relies on

> the size of the smallest referenced array object to determine

> whether the function can return zero.  When the size of

> the object is smaller than the length of the other string

> argument the optimization folds the equality to false.

> 

> The bug report has identified a couple of problems here:

> 1) when the access to the array object is via a pointer to

> a (possibly indirect) member of a union, in GIMPLE the pointer

> may actually point to a different member than the one in

> the original source code.  Thus the size of the array may

> appear to be smaller than in the source code which can then

> result in the optimization being invalid.

> 2) when the pointer in the access may point to two or more

> arrays of different size (i.e., it's the result of a PHI),

> assuming it points to the smallest of them can also lead

> to an incorrect result when the optimization is applied.

> 

> The attached patch adjusts the optimization to 1) avoid making

> any assumptions about the sizes of objects accessed via union

> types, and b) use the size of the largest object in PHI nodes.

> 

> Tested on x86_64-linux.

> 

> Martin
PR tree-optimization/92765 - wrong code for strcmp of a union member

gcc/ChangeLog:

	PR tree-optimization/92765
	* gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
	* tree-ssa-strlen.c (compute_string_length): Remove.
	(determine_min_objsize): Remove.
	(get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
	Avoid using type size as the upper bound on string length.
	(handle_builtin_string_cmp): Add an argument.  Adjust.
	(strlen_check_and_optimize_call): Pass additional argument to
	handle_builtin_string_cmp.

gcc/testsuite/ChangeLog:

	PR tree-optimization/92765
	* g++.dg/tree-ssa/strlenopt-1.C: New test.
	* g++.dg/tree-ssa/strlenopt-2.C: New test.
	* gcc.dg/Warray-bounds-58.c: New test.
	* gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
	* gcc.dg/Wstring-compare.c: Xfail a test.
	* gcc.dg/strcmpopt_2.c: Disable tests.
	* gcc.dg/strcmpopt_4.c: Adjust tests.
	* gcc.dg/strcmpopt_10.c: New test.
	* gcc.dg/strlenopt-69.c: Disable tests.
	* gcc.dg/strlenopt-92.c: New test.
	* gcc.dg/strlenopt-93.c: New test.
	* gcc.dg/strlenopt.h: Declare calloc.
	* gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index ed225922269..d70ac67e1ca 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 		       c_strlen_data *pdata, unsigned eltsize)
 {
   gcc_assert (TREE_CODE (arg) != SSA_NAME);
- 
+
   /* The length computed by this invocation of the function.  */
   tree val = NULL_TREE;
 
@@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	     type about the length here.  */
 	  tight_bound = true;
 	}
-      else if (VAR_P (arg))
+      else if (TREE_CODE (arg) == MEM_REF
+	       && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
+	       && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
+	       && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
+	{
+	  /* Handle a MEM_REF into a DECL accessing an array of integers,
+	     being conservative about references to extern structures with
+	     flexible array members that can be initialized to arbitrary
+	     numbers of elements as an extension (static structs are okay).
+	     FIXME: Make this less conservative -- see
+	     component_ref_size in tree.c.  */
+	  tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
+	  tree off = TREE_OPERAND (arg, 1);
+	  if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))
+	      && (!DECL_EXTERNAL (ref)
+		  || !array_at_struct_end_p (arg)))
+	    {
+	      /* Fail if the offset is out of bounds.  Such accesses
+		 should be diagnosed at some point.  */
+	      val = DECL_SIZE_UNIT (ref);
+	      if (!val
+		  || integer_zerop (val)
+		  || tree_int_cst_le (val, off))
+		return false;
+
+	      pdata->minlen = ssize_int (0);
+
+	      /* Subtract the offset and one for the terminating nul.  */
+	      val = fold_build2 (MINUS_EXPR, TREE_TYPE (val), val, off);
+	      val = fold_build2 (MINUS_EXPR, TREE_TYPE (val), val,
+				 integer_one_node);
+	      /* Since VAL reflects the size of a declared object
+		 rather the type of the access it is not a tight bound.  */
+	    }
+	}
+      else if (TREE_CODE (arg) == PARM_DECL || VAR_P (arg))
 	{
 	  /* Avoid handling pointers to arrays.  GCC might misuse
 	     a pointer to an array of one bound to point to an array
@@ -1500,7 +1535,8 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	     the referenced subobject minus 1 (for the terminating nul).  */
 	  tree type = TREE_TYPE (base);
 	  if (TREE_CODE (type) == POINTER_TYPE
-	      || !VAR_P (base) || !(val = DECL_SIZE_UNIT (base)))
+	      || (TREE_CODE (base) != PARM_DECL && !VAR_P (base))
+	      || !(val = DECL_SIZE_UNIT (base)))
 	    val = build_all_ones_cst (size_type_node);
 	  else
 	    {
diff --git a/gcc/testsuite/g++.dg/tree-ssa/strlenopt-1.C b/gcc/testsuite/g++.dg/tree-ssa/strlenopt-1.C
new file mode 100644
index 00000000000..b6adfac7136
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/strlenopt-1.C
@@ -0,0 +1,42 @@
+/* PR tree-optimization/92765 - wrong code for strcmp of a union member
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+inline void* operator new (size_t, void *p)
+{
+  return p;
+}
+
+struct A { char a2[2]; };
+struct B { char a4[4]; };
+
+__attribute__((noipa)) void
+sink (void*) { }
+
+__attribute__((noipa)) void
+copy (char *d, const char *s)
+{
+  while ((*d++ = *s++));
+}
+
+__attribute__((noipa)) void
+store_and_compare (void *p)
+{
+  A *a = new (p) A;
+  sink (a->a2);
+
+  B *b = new (p) B;
+  char *q = (char *) b->a4;
+  copy (q, "abc");
+
+  if (__builtin_strcmp (q, "abc"))
+    __builtin_abort ();
+}
+
+int main ()
+{
+  char a [sizeof (A) > sizeof (B) ? sizeof (A) : sizeof (B)];
+  store_and_compare (a);
+}
diff --git a/gcc/testsuite/g++.dg/tree-ssa/strlenopt-2.C b/gcc/testsuite/g++.dg/tree-ssa/strlenopt-2.C
new file mode 100644
index 00000000000..60f8205c2ab
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/strlenopt-2.C
@@ -0,0 +1,56 @@
+/* PR tree-optimization/92765 - wrong code for strcmp of a union member
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+inline void* operator new (size_t, void *p)
+{
+  return p;
+}
+
+struct A
+{
+  char a[2]; char b[2]; char c[2];
+  A () { a[0] = 0; b[0] = 0; c[0] = 0; };
+  ~A () { }
+};
+
+struct B
+{
+  char d[6];
+  B () { d[0] = 0; d[2] = 0; }
+  ~B () { }
+};
+
+__attribute__((noipa)) void
+sink (void *) { }
+
+__attribute__((noipa)) void
+copy (char *d, const char *s)
+{
+  while ((*d++ = *s++));
+}
+
+__attribute__((noipa)) void
+store_and_compare (void *p)
+{
+  A *a = new (p) A ();
+  sink (&a->b);
+  a->~A ();
+
+  B *b = new (p) B ();
+  char *q = &b->d[2];
+  copy (q, "abc");
+
+  if (__builtin_strcmp (q, "abc"))
+    __builtin_abort ();
+  b->~B ();
+}
+
+int main ()
+{
+  char a [sizeof (A) > sizeof (B) ? sizeof (A) : sizeof (B)];
+  store_and_compare (a);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-58.c b/gcc/testsuite/gcc.dg/Warray-bounds-58.c
new file mode 100644
index 00000000000..7bd6df2bf2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Warray-bounds-58.c
@@ -0,0 +1,81 @@
+/* { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern size_t strlen (const char*);
+
+void sink (size_t);
+
+struct A0 { char i, a[0]; };
+
+extern struct A0 ea0;
+
+void fa0_extern (void)
+{
+  sink (strlen (ea0.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ea0.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ea0.a));        // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ea0.a + 1));    // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static struct A0 sa0 = { 0 };
+
+void fa0_static (void)
+{
+  sink (strlen (sa0.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (sa0.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (sa0.a));        // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (sa0.a + 1));    // { dg-warning "\\\[-Warray-bounds" }
+}
+
+
+struct Ax { char i, a[]; };
+
+extern struct Ax ax;
+
+void fax_extern (void)
+{
+  sink (strlen (ax.a - 2));     // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax.a - 1));     // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax.a));
+  sink (strlen (ax.a + 123));
+}
+
+static struct Ax ax0 = { 0, { 0 } };
+static struct Ax ax1 = { 1, { 1, 0 } };
+static struct Ax ax2 = { 2, { 2, 1, 0 } };
+static struct Ax ax3 = { 3, { 3, 2, 1, 0 } };
+
+void fax_static (void)
+{
+  sink (strlen (ax0.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ax0.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax0.a));
+  sink (strlen (ax0.a + 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax0.a + 2));    // { dg-warning "\\\[-Warray-bounds" }
+
+  sink (strlen (ax1.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ax1.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax1.a));
+  sink (strlen (ax1.a + 1));
+  sink (strlen (ax1.a + 2));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax1.a + 3));    // { dg-warning "\\\[-Warray-bounds" }
+
+  sink (strlen (ax2.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ax2.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax2.a));
+  sink (strlen (ax2.a + 1));
+  sink (strlen (ax2.a + 2));
+  sink (strlen (ax2.a + 3));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax2.a + 4));    // { dg-warning "\\\[-Warray-bounds" }
+
+  sink (strlen (ax3.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ax3.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax3.a));
+  sink (strlen (ax3.a + 1));
+  sink (strlen (ax3.a + 2));
+  sink (strlen (ax3.a + 3));
+  sink (strlen (ax3.a + 4));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax3.a + 5));    // { dg-warning "\\\[-Warray-bounds" }
+}
diff --git a/gcc/testsuite/gcc.dg/Wrestrict-20.c b/gcc/testsuite/gcc.dg/Wrestrict-20.c
index 9826e7f4503..a2d29887c37 100644
--- a/gcc/testsuite/gcc.dg/Wrestrict-20.c
+++ b/gcc/testsuite/gcc.dg/Wrestrict-20.c
@@ -15,7 +15,7 @@ void test_warn (char *p)
   sprintf (a, "a=%s", a);     /* { dg-warning "-Wrestrict" } */
 
   p = a;
-  char *q = p + 1;
+  char *q = p + 3;
   sprintf (p, "a=%s", q);     /* { dg-warning "-Wrestrict" } */
 }
 
@@ -31,7 +31,7 @@ void test_nowarn_front_end (char *d)
 void test_nowarn_sprintf_pass (char *d)
 {
   char *q = d;
-  
+
   sprintf (d, "p=%p", q);
   snprintf (d, 32, "p=%p", q);
 
diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c
index 0ca492db0ab..d1534bf7555 100644
--- a/gcc/testsuite/gcc.dg/Wstring-compare.c
+++ b/gcc/testsuite/gcc.dg/Wstring-compare.c
@@ -120,7 +120,8 @@ void strcmp_array_copy (void)
 
 void strcmp_member_array_lit (const struct S *p)
 {
-  T (p->a4, "1234");        // { dg-warning "length 4 and an array of size 4 " }
+  // Not handled due to the fix for PR 92756.
+  T (p->a4, "1234");        // { dg-warning "length 4 and an array of size 4 " "pr92765" { xfail *-*-* } }
 }
 
 
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_10.c b/gcc/testsuite/gcc.dg/strcmpopt_10.c
new file mode 100644
index 00000000000..94fda596901
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_10.c
@@ -0,0 +1,130 @@
+/* Verify that strncmp equalities aren't eliminated when the trailing array
+   type referenced by a member pointer is smaller than the string in cases
+   when the pointer pointed to by the enclosing object references an object
+   sufficiently large to store a string of equal length.
+  { dg-do compile }
+  { dg-options "-O2 -Wall -Wextra -fdump-tree-optimized" } */
+
+void init (void*);
+
+struct A1 { char i, a[1]; };
+
+void f1_arr (void)
+{
+  char a[9];
+  init (a);
+
+  struct A1 *p = (struct A1*)a;
+
+  if (__builtin_strncmp (p->a, "01234567", 8) == 0)
+    {
+      extern void array_test (void);
+      array_test ();
+    }
+}
+
+void f1_ptr (void)
+{
+  void *p;
+  init (&p);
+
+  struct A1 *q = (struct A1*)p;
+
+  if (__builtin_strncmp (q->a, "0123456789", 10) == 0)
+    {
+      extern void pointer_test (void);
+      pointer_test ();
+    }
+}
+
+void f1_struct (void)
+{
+  struct { char a[9]; } b;
+  init (&b);
+
+  struct A1 *p = (struct A1*)&b;
+
+  if (__builtin_strncmp (p->a, "01234567", 8) == 0)
+    {
+      extern void struct_test (void);
+      struct_test ();
+    }
+}
+
+void f1_memptr (void)
+{
+  struct { void *p; } b;
+  init (&b);
+
+  struct A1 *p = (struct A1*)b.p;
+
+  if (__builtin_strncmp (p->a, "0123456789", 10) == 0)
+    {
+      extern void memptr_test (void);
+      memptr_test ();
+    }
+}
+
+
+struct A2 { char i, a[2]; };
+
+void f2_arr (void)
+{
+  char a[8];
+  init (a);
+
+  struct A2 *p = (struct A2*)a;
+
+  if (__builtin_strncmp (p->a, "0123456", 7) == 0)
+    {
+      extern void array_test (void);
+      array_test ();
+    }
+}
+
+void f2_ptr (void)
+{
+  void *p;
+  init (&p);
+
+  struct A2 *q = (struct A2*)p;
+
+  if (__builtin_strncmp (q->a, "0123456789", 10) == 0)
+    {
+      extern void pointer_test (void);
+      pointer_test ();
+    }
+}
+
+void f2_struct (void)
+{
+  struct { char a[8]; } b;
+  init (&b);
+
+  struct A2 *p = (struct A2*)&b;
+
+  if (__builtin_strncmp (p->a, "0123456", 7) == 0)
+    {
+      extern void struct_test (void);
+      struct_test ();
+    }
+}
+
+void f2_memptr (void)
+{
+  struct { void *p; } b;
+  init (&b);
+
+  struct A2 *p = (struct A2*)b.p;
+
+  if (__builtin_strncmp (p->a, "0123456789", 10) == 0)
+    {
+      extern void memptr_test (void);
+      memptr_test ();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "array_test" 2 "optimized" } }
+   { dg-final { scan-tree-dump-times "pointer_test" 2 "optimized" } }
+   { dg-final { scan-tree-dump-times "struct_test" 2 "optimized" } }
+   { dg-final { scan-tree-dump-times "memptr_test" 2 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
index 57d8f651c28..f31761be173 100644
--- a/gcc/testsuite/gcc.dg/strcmpopt_2.c
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -4,52 +4,53 @@
 char s[100] = {'a','b','c','d'};
 typedef struct { char s[8]; int x; } S;
 
-__attribute__ ((noinline)) int 
-f1 (S *s) 
-{ 
-  return __builtin_strcmp (s->s, "abc") != 0; 
+__attribute__ ((noinline)) int
+f1 (S *s)
+{
+  /* Member arrays not handled due to the fix for PR 92765.  */
+  return 0; // __builtin_strcmp (s->s, "abc") != 0;
 }
 
-__attribute__ ((noinline)) int 
-f2 (void) 
-{ 
-  return __builtin_strcmp (s, "abc") != 0; 
+__attribute__ ((noinline)) int
+f2 (void)
+{
+  return __builtin_strcmp (s, "abc") != 0;
 }
 
-__attribute__ ((noinline)) int 
-f3 (S *s) 
-{ 
-  return __builtin_strcmp ("abc", s->s) != 0; 
+__attribute__ ((noinline)) int
+f3 (S *s)
+{
+  return 0; // __builtin_strcmp ("abc", s->s) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f4 (void) 
-{ 
-  return __builtin_strcmp ("abc", s) != 0; 
+__attribute__ ((noinline)) int
+f4 (void)
+{
+  return __builtin_strcmp ("abc", s) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f5 (S *s) 
-{ 
-  return __builtin_strncmp (s->s, "abc", 3) != 0; 
+__attribute__ ((noinline)) int
+f5 (S *s)
+{
+  return 0; // __builtin_strncmp (s->s, "abc", 3) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f6 (void) 
-{ 
-  return __builtin_strncmp (s, "abc", 2) != 0; 
+__attribute__ ((noinline)) int
+f6 (void)
+{
+  return __builtin_strncmp (s, "abc", 2) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f7 (S *s) 
-{ 
-  return __builtin_strncmp ("abc", s->s, 3) != 0; 
+__attribute__ ((noinline)) int
+f7 (S *s)
+{
+  return 0; // __builtin_strncmp ("abc", s->s, 3) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f8 (void) 
-{ 
-  return __builtin_strncmp ("abc", s, 2) != 0; 
+__attribute__ ((noinline)) int
+f8 (void)
+{
+  return __builtin_strncmp ("abc", s, 2) != 0;
 }
 
 int main (void)
@@ -64,4 +65,4 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 8 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 4 "strlen1" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c
index 4e26522eed1..b07fbb6b7b0 100644
--- a/gcc/testsuite/gcc.dg/strcmpopt_4.c
+++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
@@ -2,15 +2,26 @@
 /* { dg-options "-O2 -fdump-tree-strlen" } */
 
 typedef struct { char s[8]; int x; } S;
+
 extern int max_i;
 
-int
-f1 (S * s)
-{ 
-  int result, i;
-  for (i = 0; i < max_i; i++)
-    result += __builtin_strcmp (s->s, "abc") != 0 ? 2 : 1;
+int f_param (S s)
+{
+  int result = 0;
+  for (int i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s.s, "abc") != 0 ? 2 : 1;
+  return result;
+}
+
+
+S s;
+
+int f_object (void)
+{
+  int result = 0;
+  for (int i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s.s, "abc") != 0 ? 2 : 1;
   return result;
 }
 
-/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 1 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 2 "strlen1" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c b/gcc/testsuite/gcc.dg/strlenopt-69.c
index 9ad8e2e8aac..9df6eeccb97 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-69.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-69.c
@@ -35,6 +35,8 @@ void test_array_lit (void)
 
 void test_memarray_lit (struct S *p)
 {
+#if 0
+  /* Member arrays not handled due to the fix for PR 92765.  */
   A (strcmp (p->a4, "1234"));
   A (strcmp (p->a4, "12345"));
   A (strcmp (p->a4, "123456"));
@@ -42,6 +44,7 @@ void test_memarray_lit (struct S *p)
   A (strcmp ("1234", p->a4));
   A (strcmp ("12345", p->a4));
   A (strcmp ("123456", p->a4));
+#endif
 }
 
 /* Verify that the equality of empty strings is folded.  */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-92.c b/gcc/testsuite/gcc.dg/strlenopt-92.c
new file mode 100644
index 00000000000..a9383e65588
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-92.c
@@ -0,0 +1,58 @@
+/* PR tree-optimization/92765 - wrong code for strcmp of a union member
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+#include "strlenopt.h"
+
+__attribute__((noipa)) int
+copy (char *x, int y)
+{
+  if (y == 0)
+    strcpy (x, "abcd");
+  return y;
+}
+
+__attribute__((noipa)) char *
+alloc_2_copy_compare (int x)
+{
+  char *p;
+  if (x)
+    p = malloc (4);
+  else
+    p = calloc (16, 1);
+
+  char *q = p + 2;
+  if (copy (q, x))
+    return p;
+
+  if (strcmp (q, "abcd") != 0)
+    abort ();
+
+  return p;
+}
+
+char a5[5], a6[6], a7[7];
+
+__attribute__((noipa)) char *
+decl_3_copy_compare (int x)
+{
+  char *p = x < 0 ? a5 : 0 < x ? a6 : a7;
+  char *q = p + 1;
+  if (copy (q, x))
+    return p;
+
+  if (strcmp (q, "abcd") != 0)
+    abort ();
+
+  return p;
+}
+
+int main ()
+{
+  free (alloc_2_copy_compare (0));
+  free (alloc_2_copy_compare (1));
+
+  decl_3_copy_compare (-1);
+  decl_3_copy_compare (0);
+  decl_3_copy_compare (1);
+}
diff --git a/gcc/testsuite/gcc.dg/strlenopt-93.c b/gcc/testsuite/gcc.dg/strlenopt-93.c
new file mode 100644
index 00000000000..b72b2ed7939
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-93.c
@@ -0,0 +1,71 @@
+/* Verify that strlen doesn't (inadvertently) use the size of an array
+   of char pointers to put an upper bound on the length of the strings
+   they point to.
+   { dg-do compile }
+   { dg-options "-O2 -Wall -fdump-tree-optimized" } */
+
+void eaa_test (void)
+{
+  extern char eaa[4][4];
+
+  char (*p)[4] = eaa;
+
+  if (!*p)
+    return;
+
+  /* The longest string stored in EAA is 15 characters.  */
+  if (__builtin_strlen (*p) > 14)
+    {
+      extern void eaa_ok (void);
+      eaa_ok ();
+    }
+
+  if (__builtin_strlen (*p) > 15)
+    {
+      extern void eaa_fail (void);
+      eaa_fail ();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "eaa_ok" "optimized" } }
+   { dg-final { scan-tree-dump-not "eaa_fail" "optimized" } } */
+
+
+void epa_test (void)
+{
+  extern char* epa[4];
+  char **p = epa;
+
+  if (*p && __builtin_strlen (*p) > 123)
+    {
+      extern void epa_ok (void);
+      epa_ok ();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "epa_ok" 1 "optimized" } } */
+
+
+static char* spa[4];
+
+void spa_test (void)
+{
+  char **p = spa;
+
+  if (*p && __builtin_strlen (*p) > 123)
+    {
+      extern void spa_ok ();
+      spa_ok ();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "spa_ok" 1 "optimized" } } */
+
+
+void sink (void*, ...);
+
+void init (void)
+{
+  /* Make believe even the static array SA may be non-zero.  */
+  sink (spa);
+}
diff --git a/gcc/testsuite/gcc.dg/strlenopt.h b/gcc/testsuite/gcc.dg/strlenopt.h
index 518d0cf08b2..a3ca951ddc5 100644
--- a/gcc/testsuite/gcc.dg/strlenopt.h
+++ b/gcc/testsuite/gcc.dg/strlenopt.h
@@ -5,6 +5,7 @@
 #define NULL ((void *) 0)
 typedef __SIZE_TYPE__ size_t;
 extern void abort (void);
+void *calloc (size_t, size_t);
 void *malloc (size_t);
 void free (void *);
 char *strdup (const char *);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
index 98dfa1d3966..7fb96514f1d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
@@ -642,10 +642,22 @@ void test_multiple_overlap (int i)
   }
 
   {
-    char a[4];                /* { dg-message "declared here" } */
+    char a[4];
+
+    /* There is no overlap here because the length of a3 is at most 1
+       and a4 is necessarily the empty string.  */
+    char *d = a;
+    char *a3 = a + 2;
+    char *a4 = a + 3;
+
+    T (d, "%s%s", a3, a4);
+  }
+
+  {
+    char a[5];                /* { dg-message "declared here" } */
 
     /* a3 and a4 may overlap the output.  They will only not overlap
-       it when a3 is empty, and a4 is at most chaeracter byte long.  */
+       it when a3 is empty, and a4 is at most 1 character long.  */
     char *d = a;
     char *a3 = a + 2;
     char *a4 = a + 3;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr92056.c b/gcc/testsuite/gcc.dg/tree-ssa/pr92056.c
index e43d0c06e2f..25a2c11b23a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr92056.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr92056.c
@@ -1,8 +1,8 @@
 /* PR tree-optimization/92056 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-optimized" } */
-/* { dg-final { scan-tree-dump-times "return 1;" 2 "optimized" } } */
-/* { dg-final { scan-tree-dump-not "strcmp \\(" "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 1;" 2 "optimized" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-not "strcmp \\(" "optimized" { xfail *-*-* } } } */
 
 void bar (int, char *);
 
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index ad9e98973b1..b9972c16e18 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -4061,105 +4061,20 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   return true;
 }
 
-/* Given an index to the strinfo vector, compute the string length
-   for the corresponding string. Return -1 when unknown.  */
-
-static HOST_WIDE_INT
-compute_string_length (int idx)
-{
-  HOST_WIDE_INT string_leni = -1;
-  gcc_assert (idx != 0);
-
-  if (idx < 0)
-   return ~idx;
-
-  strinfo *si = get_strinfo (idx);
-  if (si)
-    {
-      tree const_string_len = get_string_length (si);
-      if (const_string_len && tree_fits_shwi_p (const_string_len))
-	string_leni = tree_to_shwi (const_string_len);
-    }
-
-  if (string_leni < 0)
-    return -1;
-
-  return string_leni;
-}
-
-/* Determine the minimum size of the object referenced by DEST expression
-   which must have a pointer type.
-   Return the minimum size of the object if successful or HWI_M1U when
-   the size cannot be determined.  */
-
-static unsigned HOST_WIDE_INT
-determine_min_objsize (tree dest)
-{
-  unsigned HOST_WIDE_INT size = 0;
-
-  init_object_sizes ();
-
-  if (compute_builtin_object_size (dest, 2, &size))
-    return size;
-
-  /* Try to determine the size of the object through the RHS
-     of the assign statement.  */
-  if (TREE_CODE (dest) == SSA_NAME)
-    {
-      gimple *stmt = SSA_NAME_DEF_STMT (dest);
-      if (!is_gimple_assign (stmt))
-	return HOST_WIDE_INT_M1U;
-
-      if (!gimple_assign_single_p (stmt)
-	  && !gimple_assign_unary_nop_p (stmt))
-	return HOST_WIDE_INT_M1U;
-
-      dest = gimple_assign_rhs1 (stmt);
-      return determine_min_objsize (dest);
-    }
-
-  /* Try to determine the size of the object from its type.  */
-  if (TREE_CODE (dest) != ADDR_EXPR)
-    return HOST_WIDE_INT_M1U;
-
-  tree type = TREE_TYPE (dest);
-  if (TREE_CODE (type) == POINTER_TYPE)
-    type = TREE_TYPE (type);
-
-  type = TYPE_MAIN_VARIANT (type);
-
-  /* The size of a flexible array cannot be determined.  Otherwise,
-     for arrays with more than one element, return the size of its
-     type.  GCC itself misuses arrays of both zero and one elements
-     as flexible array members so they are excluded as well.  */
-  if (TREE_CODE (type) != ARRAY_TYPE
-      || !array_at_struct_end_p (dest))
-    {
-      tree type_size = TYPE_SIZE_UNIT (type);
-      if (type_size && TREE_CODE (type_size) == INTEGER_CST
-	  && !integer_onep (type_size)
-	  && !integer_zerop (type_size))
-        return tree_to_uhwi (type_size);
-    }
-
-  return HOST_WIDE_INT_M1U;
-}
-
-/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths
-   of  the string(s) referenced by ARG if it can be determined.
-   If the length cannot be determined, set *SIZE to the size of
+/* Given strinfo IDX for ARG, sets LENRNG[] to the range of lengths
+   of the string(s) referenced by ARG if it can be determined.
+   If the length cannot be determined, sets *SIZE to the size of
    the array the string is stored in, if any.  If no such array is
-   known, set *SIZE to -1.  When the strings are nul-terminated set
-   *NULTERM to true, otherwise to false.  Return true on success.  */
+   known, sets *SIZE to -1.  When the strings are nul-terminated sets
+   *NULTERM to true, otherwise to false.  When nonnull uses RVALS to
+   determine range information. Returns true on success.  */
 
 static bool
 get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
-		 unsigned HOST_WIDE_INT *size, bool *nulterm)
+		 unsigned HOST_WIDE_INT *size, bool *nulterm,
+		 const vr_values *rvals)
 {
-  /* Set so that both LEN and ~LEN are invalid lengths, i.e.,
-     maximum possible length + 1.  */
-  lenrng[0] = lenrng[1] = HOST_WIDE_INT_MAX;
-
+  /* Invalidate.  */
   *size = HOST_WIDE_INT_M1U;
 
   if (idx < 0)
@@ -4168,13 +4083,18 @@ get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
       lenrng[0] = ~idx;
       lenrng[1] = lenrng[0];
       *nulterm = true;
+      return true;
     }
-  else if (idx == 0)
-    ; /* Handled below.  */
-  else if (strinfo *si = get_strinfo (idx))
+
+  /* Set so that both LEN and ~LEN are invalid lengths, i.e., maximum
+     possible length + 1.  */
+  lenrng[0] = lenrng[1] = HOST_WIDE_INT_MAX;
+
+  if (strinfo *si = idx ? get_strinfo (idx) : NULL)
     {
+      /* FIXME: Handle all this in_range_strlen_dynamic.  */
       if (!si->nonzero_chars)
-	arg = si->ptr;
+	;
       else if (tree_fits_uhwi_p (si->nonzero_chars))
 	{
 	  lenrng[0] = tree_to_uhwi (si->nonzero_chars);
@@ -4195,42 +4115,62 @@ get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
 	      *nulterm = si->full_string_p;
 	    }
 	}
-      else if (si->ptr)
-	arg = si->ptr;
     }
 
-  if (lenrng[0] == HOST_WIDE_INT_MAX)
+  if (lenrng[0] != HOST_WIDE_INT_MAX)
+    return true;
+
+  /* Compute the minimum and maximum real or possible lengths.  */
+  c_strlen_data lendata = { };
+  /* Set MAXBOUND to an arbitrary non-null non-integer node as a request
+     to have it set to the length of the longest string in a PHI.  */
+  lendata.maxbound = arg;
+  get_range_strlen_dynamic (arg, &lendata, rvals);
+
+  unsigned HOST_WIDE_INT maxbound = HOST_WIDE_INT_M1U;
+  if (tree_fits_uhwi_p (lendata.maxbound)
+      && !integer_all_onesp (lendata.maxbound))
+    maxbound = tree_to_uhwi (lendata.maxbound);
+
+  if (tree_fits_uhwi_p (lendata.minlen) && tree_fits_uhwi_p (lendata.maxlen))
     {
-      /* Compute the minimum and maximum real or possible lengths.  */
-      c_strlen_data lendata = { };
-      if (get_range_strlen (arg, &lendata, /* eltsize = */1))
+      unsigned HOST_WIDE_INT minlen = tree_to_uhwi (lendata.minlen);
+      unsigned HOST_WIDE_INT maxlen = tree_to_uhwi (lendata.maxlen);
+
+      /* The longest string in this data model.  */
+      const unsigned HOST_WIDE_INT lenmax
+	= tree_to_uhwi (max_object_size ()) - 2;
+
+      if (maxbound == HOST_WIDE_INT_M1U)
 	{
-	  if (tree_fits_shwi_p (lendata.maxlen) && !lendata.maxbound)
-	    {
-	      lenrng[0] = tree_to_shwi (lendata.minlen);
-	      lenrng[1] = tree_to_shwi (lendata.maxlen);
-	      *nulterm = true;
-	    }
-	  else if (lendata.maxbound && tree_fits_shwi_p (lendata.maxbound))
-	    {
-	      /* Set *SIZE to the conservative LENDATA.MAXBOUND which
-		 is a conservative estimate of the longest string based
-		 on the sizes of the arrays referenced by ARG.  */
-	      *size = tree_to_uhwi (lendata.maxbound) + 1;
-	      *nulterm = false;
-	    }
+	  lenrng[0] = minlen;
+	  lenrng[1] = maxlen;
+	  *nulterm = minlen == maxlen;
 	}
-      else
+      else if (maxlen < lenmax)
 	{
-	  /* Set *SIZE to the size of the smallest object referenced
-	     by ARG if ARG denotes a single object, or to HWI_M1U
-	     otherwise.  */
-	  *size = determine_min_objsize (arg);
+	  *size = maxbound + 1;
 	  *nulterm = false;
 	}
+      else
+	return false;
+
+      return true;
     }
 
-  return lenrng[0] != HOST_WIDE_INT_MAX || *size != HOST_WIDE_INT_M1U;
+  if (maxbound != HOST_WIDE_INT_M1U
+      && lendata.maxlen
+      && !integer_all_onesp (lendata.maxlen))
+    {
+      /* Set *SIZE to LENDATA.MAXBOUND which is a conservative estimate
+	 of the longest string based on the sizes of the arrays referenced
+	 by ARG.  */
+      *size = maxbound + 1;
+      *nulterm = false;
+      return true;
+    }
+
+  return false;
 }
 
 /* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
@@ -4245,15 +4185,15 @@ get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
 static tree
 strxcmp_eqz_result (tree arg1, int idx1, tree arg2, int idx2,
 		    unsigned HOST_WIDE_INT bound, unsigned HOST_WIDE_INT len[2],
-		    unsigned HOST_WIDE_INT *psize)
+		    unsigned HOST_WIDE_INT *psize, const vr_values *rvals)
 {
   /* Determine the range the length of each string is in and whether it's
      known to be nul-terminated, or the size of the array it's stored in.  */
   bool nul1, nul2;
   unsigned HOST_WIDE_INT siz1, siz2;
   unsigned HOST_WIDE_INT len1rng[2], len2rng[2];
-  if (!get_len_or_size (arg1, idx1, len1rng, &siz1, &nul1)
-      || !get_len_or_size (arg2, idx2, len2rng, &siz2, &nul2))
+  if (!get_len_or_size (arg1, idx1, len1rng, &siz1, &nul1, rvals)
+      || !get_len_or_size (arg2, idx2, len2rng, &siz2, &nul2, rvals))
     return NULL_TREE;
 
   /* BOUND is set to HWI_M1U for strcmp and less to strncmp, and LENiRNG
@@ -4405,7 +4345,7 @@ maybe_warn_pointless_strcmp (gimple *stmt, HOST_WIDE_INT bound,
    another and false otherwise.  */
 
 static bool
-handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi, const vr_values *rvals)
 {
   gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
   tree lhs = gimple_call_lhs (stmt);
@@ -4451,7 +4391,7 @@ handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
        or definitely unequal and if so, either fold the result to zero
        (when equal) or set the range of the result to ~[0, 0] otherwise.  */
     if (tree eqz = strxcmp_eqz_result (arg1, idx1, arg2, idx2, bound,
-				       len, &siz))
+				       len, &siz, rvals))
       {
 	if (integer_zerop (eqz))
 	  {
@@ -4482,26 +4422,31 @@ handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
   HOST_WIDE_INT cstlen1 = -1, cstlen2 = -1;
   HOST_WIDE_INT arysiz1 = -1, arysiz2 = -1;
 
-  if (idx1)
-    cstlen1 = compute_string_length (idx1);
-  else
-    arysiz1 = determine_min_objsize (arg1);
+  {
+    unsigned HOST_WIDE_INT len1rng[2], len2rng[2];
+    unsigned HOST_WIDE_INT arsz1, arsz2;
+    bool nulterm[2];
+
+    if (!get_len_or_size (arg1, idx1, len1rng, &arsz1, nulterm, rvals)
+	|| !get_len_or_size (arg2, idx2, len2rng, &arsz2, nulterm + 1, rvals))
+      return false;
+
+    if (len1rng[0] == len1rng[1] && len1rng[0] < HOST_WIDE_INT_MAX)
+      cstlen1 = len1rng[0];
+    else if (arsz1 < HOST_WIDE_INT_M1U)
+      arysiz1 = arsz1;
+
+    if (len2rng[0] == len2rng[1] && len2rng[0] < HOST_WIDE_INT_MAX)
+      cstlen2 = len2rng[0];
+    else if (arsz2 < HOST_WIDE_INT_M1U)
+      arysiz2 = arsz2;
+  }
 
   /* Bail if neither the string length nor the size of the array
      it is stored in can be determined.  */
-  if (cstlen1 < 0 && arysiz1 < 0)
-    return false;
-
-  /* Repeat for the second argument.  */
-  if (idx2)
-    cstlen2 = compute_string_length (idx2);
-  else
-    arysiz2 = determine_min_objsize (arg2);
-
-  if (cstlen2 < 0 && arysiz2 < 0)
-    return false;
-
-  if (cstlen1 < 0 && cstlen2 < 0)
+  if ((cstlen1 < 0 && arysiz1 < 0)
+      || (cstlen2 < 0 && arysiz2 < 0)
+      || (cstlen1 < 0 && cstlen2 < 0))
     return false;
 
   if (cstlen1 >= 0)
@@ -5435,7 +5380,7 @@ strlen_check_and_optimize_call (gimple_stmt_iterator *gsi, bool *zero_write,
       break;
     case BUILT_IN_STRCMP:
     case BUILT_IN_STRNCMP:
-      if (handle_builtin_string_cmp (gsi))
+      if (handle_builtin_string_cmp (gsi, rvals))
 	return false;
       break;
     default:
Jeff Law Feb. 3, 2020, 6:44 p.m. | #5
On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:
> Attached is a reworked patch since the first one didn't go far

> enough to solve the major problems.  The new solution relies on

> get_range_strlen_dynamic the same way as the sprintf optimization,

> and does away with the determine_min_objsize function and calling

> compute_builtin_object_size.

> 

> To minimize testsuite fallout I extended get_range_strlen to handle

> a couple more kinds expressions(*), but I still had to xfail and

> disable a few tests that were relying on being able to use the type

> of the destination object as the upper bound on the string length.

> 

> Tested on x86_64-linux.

> 

> Martin

> 

> [*] With all the issues around MEM_REFs and types this change needs

> extra scrutiny.  I'm still not sure I fully understand what can and

> what cannot be safely relied on at this level.

> 

> On 1/15/20 6:18 AM, Martin Sebor wrote:

> > The strcmp optimization newly introduced in GCC 10 relies on

> > the size of the smallest referenced array object to determine

> > whether the function can return zero.  When the size of

> > the object is smaller than the length of the other string

> > argument the optimization folds the equality to false.

> > 

> > The bug report has identified a couple of problems here:

> > 1) when the access to the array object is via a pointer to

> > a (possibly indirect) member of a union, in GIMPLE the pointer

> > may actually point to a different member than the one in

> > the original source code.  Thus the size of the array may

> > appear to be smaller than in the source code which can then

> > result in the optimization being invalid.

> > 2) when the pointer in the access may point to two or more

> > arrays of different size (i.e., it's the result of a PHI),

> > assuming it points to the smallest of them can also lead

> > to an incorrect result when the optimization is applied.

> > 

> > The attached patch adjusts the optimization to 1) avoid making

> > any assumptions about the sizes of objects accessed via union

> > types, and b) use the size of the largest object in PHI nodes.

> > 

> > Tested on x86_64-linux.

> > 

> > Martin

> 

> 

> PR tree-optimization/92765 - wrong code for strcmp of a union member

> 

> gcc/ChangeLog:

> 

>         PR tree-optimization/92765

>         * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.

>         * tree-ssa-strlen.c (compute_string_length): Remove.

>         (determine_min_objsize): Remove.

>         (get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.

>         Avoid using type size as the upper bound on string length.

>         (handle_builtin_string_cmp): Add an argument.  Adjust.

>         (strlen_check_and_optimize_call): Pass additional argument to

>         handle_builtin_string_cmp.

> 

> gcc/testsuite/ChangeLog:

> 

>         PR tree-optimization/92765

>         * g++.dg/tree-ssa/strlenopt-1.C: New test.

>         * g++.dg/tree-ssa/strlenopt-2.C: New test.

>         * gcc.dg/Warray-bounds-58.c: New test.

>         * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.

>         * gcc.dg/Wstring-compare.c: Xfail a test.

>         * gcc.dg/strcmpopt_2.c: Disable tests.

>         * gcc.dg/strcmpopt_4.c: Adjust tests.

>         * gcc.dg/strcmpopt_10.c: New test.

>         * gcc.dg/strlenopt-69.c: Disable tests.

>         * gcc.dg/strlenopt-92.c: New test.

>         * gcc.dg/strlenopt-93.c: New test.

>         * gcc.dg/strlenopt.h: Declare calloc.

>         * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.

>         * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).

> 

> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c

> index ed225922269..d70ac67e1ca 100644

> --- a/gcc/gimple-fold.c

> +++ b/gcc/gimple-fold.c

> @@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,

>                        c_strlen_data *pdata, unsigned eltsize)

>  {

>    gcc_assert (TREE_CODE (arg) != SSA_NAME);

> - 

> +

>    /* The length computed by this invocation of the function.  */

>    tree val = NULL_TREE;

>  

> @@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,

>              type about the length here.  */

>           tight_bound = true;

>         }

> -      else if (VAR_P (arg))

> +      else if (TREE_CODE (arg) == MEM_REF

> +              && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE

> +              && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE

> +              && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)

> +       {

> +         /* Handle a MEM_REF into a DECL accessing an array of integers,

> +            being conservative about references to extern structures with

> +            flexible array members that can be initialized to arbitrary

> +            numbers of elements as an extension (static structs are okay).

> +            FIXME: Make this less conservative -- see

> +            component_ref_size in tree.c.  */

I think it's generally been agreed that we can look at sizes of _DECL
nodes and this code doesn't look like this walks backwards through
casts or anything like that.  So the worry would be if we forward
propagated through a cast into the MEM_REF node.

It looks like forwprop only propagates through "compatible" pointer
conversions.  It makes me a bit nervous.

Jakub/Richi, comments on this hunk?



> diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c

> index 0ca492db0ab..d1534bf7555 100644

> --- a/gcc/testsuite/gcc.dg/Wstring-compare.c

> +++ b/gcc/testsuite/gcc.dg/Wstring-compare.c

In general I have a slight preference for pulling these into new files
when we need to xfail them.  Why?  Because a test which previously
passed, but now fails (even an xfail) causes the tester to flag the
build as failing due to a testsuite regression.

But I don't think that preference is significant enough to ask you to
redo the work.  Just something to ponder in the future.

 
> diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c

> index 57d8f651c28..f31761be173 100644

> --- a/gcc/testsuite/gcc.dg/strcmpopt_2.c

> +++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c

So I'd pulled f1, f3, f5, f7 into a new file.  But disabling them like
you've done is reasonable as well.


> diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c

> index 4e26522eed1..b07fbb6b7b0 100644

> --- a/gcc/testsuite/gcc.dg/strcmpopt_4.c

> +++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c

THanks for creating the new test.  I'd done the exact same thing in my
local tree.  I'm a little surprised that f_param passes.  Can you
double check that?



diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c
b/gcc/testsuite/gcc.dg/strlenopt-69.c
> index 9ad8e2e8aac..9df6eeccb97 100644

> --- a/gcc/testsuite/gcc.dg/strlenopt-69.c

> +++ b/gcc/testsuite/gcc.dg/strlenopt-69.c

I'd pulled the offending tests into a new file, but your approach is
fine too.

> 

> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c

> index ad9e98973b1..b9972c16e18 100644

> --- a/gcc/tree-ssa-strlen.c

> +++ b/gcc/tree-ssa-strlen.c

> -

> -/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths

> -   of  the string(s) referenced by ARG if it can be determined.

> -   If the length cannot be determined, set *SIZE to the size of

> +/* Given strinfo IDX for ARG, sets LENRNG[] to the range of lengths

> +   of the string(s) referenced by ARG if it can be determined.

> +   If the length cannot be determined, sets *SIZE to the size of

>     the array the string is stored in, if any.  If no such array is

> -   known, set *SIZE to -1.  When the strings are nul-terminated set

> -   *NULTERM to true, otherwise to false.  Return true on success.  */

> +   known, sets *SIZE to -1.  When the strings are nul-terminated sets

> +   *NULTERM to true, otherwise to false.  When nonnull uses RVALS to

> +   determine range information. Returns true on success.  */

"When nonnull uses RVALS to detemrine range information."  That isn't a
sentence and just doesn't seem to make sense.  Please review for
comment clarity.

This looks OK to me with the comment fixed.  I like that we drop the
whole determine_min_objsize and replace it it the standard range bits
that we're using elsewhere.

Please give Richi and Jakub time to chime in on the gimple-fold.c
changes.


Jeff
Richard Biener Feb. 4, 2020, 2:35 p.m. | #6
On Mon, Feb 3, 2020 at 7:45 PM Jeff Law <law@redhat.com> wrote:
>

> On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:

> > Attached is a reworked patch since the first one didn't go far

> > enough to solve the major problems.  The new solution relies on

> > get_range_strlen_dynamic the same way as the sprintf optimization,

> > and does away with the determine_min_objsize function and calling

> > compute_builtin_object_size.

> >

> > To minimize testsuite fallout I extended get_range_strlen to handle

> > a couple more kinds expressions(*), but I still had to xfail and

> > disable a few tests that were relying on being able to use the type

> > of the destination object as the upper bound on the string length.

> >

> > Tested on x86_64-linux.

> >

> > Martin

> >

> > [*] With all the issues around MEM_REFs and types this change needs

> > extra scrutiny.  I'm still not sure I fully understand what can and

> > what cannot be safely relied on at this level.

> >

> > On 1/15/20 6:18 AM, Martin Sebor wrote:

> > > The strcmp optimization newly introduced in GCC 10 relies on

> > > the size of the smallest referenced array object to determine

> > > whether the function can return zero.  When the size of

> > > the object is smaller than the length of the other string

> > > argument the optimization folds the equality to false.

> > >

> > > The bug report has identified a couple of problems here:

> > > 1) when the access to the array object is via a pointer to

> > > a (possibly indirect) member of a union, in GIMPLE the pointer

> > > may actually point to a different member than the one in

> > > the original source code.  Thus the size of the array may

> > > appear to be smaller than in the source code which can then

> > > result in the optimization being invalid.

> > > 2) when the pointer in the access may point to two or more

> > > arrays of different size (i.e., it's the result of a PHI),

> > > assuming it points to the smallest of them can also lead

> > > to an incorrect result when the optimization is applied.

> > >

> > > The attached patch adjusts the optimization to 1) avoid making

> > > any assumptions about the sizes of objects accessed via union

> > > types, and b) use the size of the largest object in PHI nodes.

> > >

> > > Tested on x86_64-linux.

> > >

> > > Martin

> >

> >

> > PR tree-optimization/92765 - wrong code for strcmp of a union member

> >

> > gcc/ChangeLog:

> >

> >         PR tree-optimization/92765

> >         * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.

> >         * tree-ssa-strlen.c (compute_string_length): Remove.

> >         (determine_min_objsize): Remove.

> >         (get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.

> >         Avoid using type size as the upper bound on string length.

> >         (handle_builtin_string_cmp): Add an argument.  Adjust.

> >         (strlen_check_and_optimize_call): Pass additional argument to

> >         handle_builtin_string_cmp.

> >

> > gcc/testsuite/ChangeLog:

> >

> >         PR tree-optimization/92765

> >         * g++.dg/tree-ssa/strlenopt-1.C: New test.

> >         * g++.dg/tree-ssa/strlenopt-2.C: New test.

> >         * gcc.dg/Warray-bounds-58.c: New test.

> >         * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.

> >         * gcc.dg/Wstring-compare.c: Xfail a test.

> >         * gcc.dg/strcmpopt_2.c: Disable tests.

> >         * gcc.dg/strcmpopt_4.c: Adjust tests.

> >         * gcc.dg/strcmpopt_10.c: New test.

> >         * gcc.dg/strlenopt-69.c: Disable tests.

> >         * gcc.dg/strlenopt-92.c: New test.

> >         * gcc.dg/strlenopt-93.c: New test.

> >         * gcc.dg/strlenopt.h: Declare calloc.

> >         * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.

> >         * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).

> >

> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c

> > index ed225922269..d70ac67e1ca 100644

> > --- a/gcc/gimple-fold.c

> > +++ b/gcc/gimple-fold.c

> > @@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,

> >                        c_strlen_data *pdata, unsigned eltsize)

> >  {

> >    gcc_assert (TREE_CODE (arg) != SSA_NAME);

> > -

> > +

> >    /* The length computed by this invocation of the function.  */

> >    tree val = NULL_TREE;

> >

> > @@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,

> >              type about the length here.  */

> >           tight_bound = true;

> >         }

> > -      else if (VAR_P (arg))

> > +      else if (TREE_CODE (arg) == MEM_REF

> > +              && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE

> > +              && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE

> > +              && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)

> > +       {

> > +         /* Handle a MEM_REF into a DECL accessing an array of integers,

> > +            being conservative about references to extern structures with

> > +            flexible array members that can be initialized to arbitrary

> > +            numbers of elements as an extension (static structs are okay).

> > +            FIXME: Make this less conservative -- see

> > +            component_ref_size in tree.c.  */

> I think it's generally been agreed that we can look at sizes of _DECL

> nodes and this code doesn't look like this walks backwards through

> casts or anything like that.  So the worry would be if we forward

> propagated through a cast into the MEM_REF node.

>

> It looks like forwprop only propagates through "compatible" pointer

> conversions.  It makes me a bit nervous.

>

> Jakub/Richi, comments on this hunk?


+         tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
+         tree off = TREE_OPERAND (arg, 1);
+         if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))
+             && (!DECL_EXTERNAL (ref)
+                 || !array_at_struct_end_p (arg)))
+           {

I think you'd want decl_binds_to_current_def_p (ref) instead of !DECL_EXTERNAL.
Since 'arg' is originally a pointer array_at_struct_end_p is
meaningless here since
that's about the structure of a reference while the pointer is just a
value.  So if
you're concerned the objects size might not be as it looks like then you have to
rely on decl_binds_to_current_def_p only.  You also shouldn't use 'off' natively
in the code below but use mem_ref_offset to access the embedded offset
which is to be interpreted as signed integer (it's a pointer as you use it).
You compare it against an unsigned size...

>

>

> > diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c

> > index 0ca492db0ab..d1534bf7555 100644

> > --- a/gcc/testsuite/gcc.dg/Wstring-compare.c

> > +++ b/gcc/testsuite/gcc.dg/Wstring-compare.c

> In general I have a slight preference for pulling these into new files

> when we need to xfail them.  Why?  Because a test which previously

> passed, but now fails (even an xfail) causes the tester to flag the

> build as failing due to a testsuite regression.

>

> But I don't think that preference is significant enough to ask you to

> redo the work.  Just something to ponder in the future.

>

>

> > diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c

> > index 57d8f651c28..f31761be173 100644

> > --- a/gcc/testsuite/gcc.dg/strcmpopt_2.c

> > +++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c

> So I'd pulled f1, f3, f5, f7 into a new file.  But disabling them like

> you've done is reasonable as well.

>

>

> > diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c

> > index 4e26522eed1..b07fbb6b7b0 100644

> > --- a/gcc/testsuite/gcc.dg/strcmpopt_4.c

> > +++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c

> THanks for creating the new test.  I'd done the exact same thing in my

> local tree.  I'm a little surprised that f_param passes.  Can you

> double check that?

>

>

>

> diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c

> b/gcc/testsuite/gcc.dg/strlenopt-69.c

> > index 9ad8e2e8aac..9df6eeccb97 100644

> > --- a/gcc/testsuite/gcc.dg/strlenopt-69.c

> > +++ b/gcc/testsuite/gcc.dg/strlenopt-69.c

> I'd pulled the offending tests into a new file, but your approach is

> fine too.

>

> >

> > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c

> > index ad9e98973b1..b9972c16e18 100644

> > --- a/gcc/tree-ssa-strlen.c

> > +++ b/gcc/tree-ssa-strlen.c

> > -

> > -/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths

> > -   of  the string(s) referenced by ARG if it can be determined.

> > -   If the length cannot be determined, set *SIZE to the size of

> > +/* Given strinfo IDX for ARG, sets LENRNG[] to the range of lengths

> > +   of the string(s) referenced by ARG if it can be determined.

> > +   If the length cannot be determined, sets *SIZE to the size of

> >     the array the string is stored in, if any.  If no such array is

> > -   known, set *SIZE to -1.  When the strings are nul-terminated set

> > -   *NULTERM to true, otherwise to false.  Return true on success.  */

> > +   known, sets *SIZE to -1.  When the strings are nul-terminated sets

> > +   *NULTERM to true, otherwise to false.  When nonnull uses RVALS to

> > +   determine range information. Returns true on success.  */

> "When nonnull uses RVALS to detemrine range information."  That isn't a

> sentence and just doesn't seem to make sense.  Please review for

> comment clarity.

>

> This looks OK to me with the comment fixed.  I like that we drop the

> whole determine_min_objsize and replace it it the standard range bits

> that we're using elsewhere.

>

> Please give Richi and Jakub time to chime in on the gimple-fold.c

> changes.

>

>

> Jeff

>
Martin Sebor Feb. 5, 2020, 11:57 p.m. | #7
On 2/3/20 11:44 AM, Jeff Law wrote:
> On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:

>> Attached is a reworked patch since the first one didn't go far

>> enough to solve the major problems.  The new solution relies on

>> get_range_strlen_dynamic the same way as the sprintf optimization,

>> and does away with the determine_min_objsize function and calling

>> compute_builtin_object_size.

>>

>> To minimize testsuite fallout I extended get_range_strlen to handle

>> a couple more kinds expressions(*), but I still had to xfail and

>> disable a few tests that were relying on being able to use the type

>> of the destination object as the upper bound on the string length.

>>

>> Tested on x86_64-linux.

>>

>> Martin

>>

>> [*] With all the issues around MEM_REFs and types this change needs

>> extra scrutiny.  I'm still not sure I fully understand what can and

>> what cannot be safely relied on at this level.

>>

>> On 1/15/20 6:18 AM, Martin Sebor wrote:

>>> The strcmp optimization newly introduced in GCC 10 relies on

>>> the size of the smallest referenced array object to determine

>>> whether the function can return zero.  When the size of

>>> the object is smaller than the length of the other string

>>> argument the optimization folds the equality to false.

>>>

>>> The bug report has identified a couple of problems here:

>>> 1) when the access to the array object is via a pointer to

>>> a (possibly indirect) member of a union, in GIMPLE the pointer

>>> may actually point to a different member than the one in

>>> the original source code.  Thus the size of the array may

>>> appear to be smaller than in the source code which can then

>>> result in the optimization being invalid.

>>> 2) when the pointer in the access may point to two or more

>>> arrays of different size (i.e., it's the result of a PHI),

>>> assuming it points to the smallest of them can also lead

>>> to an incorrect result when the optimization is applied.

>>>

>>> The attached patch adjusts the optimization to 1) avoid making

>>> any assumptions about the sizes of objects accessed via union

>>> types, and b) use the size of the largest object in PHI nodes.

>>>

>>> Tested on x86_64-linux.

>>>

>>> Martin

>>

>>

>> PR tree-optimization/92765 - wrong code for strcmp of a union member

>>

>> gcc/ChangeLog:

>>

>>          PR tree-optimization/92765

>>          * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.

>>          * tree-ssa-strlen.c (compute_string_length): Remove.

>>          (determine_min_objsize): Remove.

>>          (get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.

>>          Avoid using type size as the upper bound on string length.

>>          (handle_builtin_string_cmp): Add an argument.  Adjust.

>>          (strlen_check_and_optimize_call): Pass additional argument to

>>          handle_builtin_string_cmp.

>>

>> gcc/testsuite/ChangeLog:

>>

>>          PR tree-optimization/92765

>>          * g++.dg/tree-ssa/strlenopt-1.C: New test.

>>          * g++.dg/tree-ssa/strlenopt-2.C: New test.

>>          * gcc.dg/Warray-bounds-58.c: New test.

>>          * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.

>>          * gcc.dg/Wstring-compare.c: Xfail a test.

>>          * gcc.dg/strcmpopt_2.c: Disable tests.

>>          * gcc.dg/strcmpopt_4.c: Adjust tests.

>>          * gcc.dg/strcmpopt_10.c: New test.

>>          * gcc.dg/strlenopt-69.c: Disable tests.

>>          * gcc.dg/strlenopt-92.c: New test.

>>          * gcc.dg/strlenopt-93.c: New test.

>>          * gcc.dg/strlenopt.h: Declare calloc.

>>          * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.

>>          * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).

>>

>> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c

>> index ed225922269..d70ac67e1ca 100644

>> --- a/gcc/gimple-fold.c

>> +++ b/gcc/gimple-fold.c

>> @@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,

>>                         c_strlen_data *pdata, unsigned eltsize)

>>   {

>>     gcc_assert (TREE_CODE (arg) != SSA_NAME);

>> -

>> +

>>     /* The length computed by this invocation of the function.  */

>>     tree val = NULL_TREE;

>>   

>> @@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,

>>               type about the length here.  */

>>            tight_bound = true;

>>          }

>> -      else if (VAR_P (arg))

>> +      else if (TREE_CODE (arg) == MEM_REF

>> +              && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE

>> +              && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE

>> +              && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)

>> +       {

>> +         /* Handle a MEM_REF into a DECL accessing an array of integers,

>> +            being conservative about references to extern structures with

>> +            flexible array members that can be initialized to arbitrary

>> +            numbers of elements as an extension (static structs are okay).

>> +            FIXME: Make this less conservative -- see

>> +            component_ref_size in tree.c.  */

> I think it's generally been agreed that we can look at sizes of _DECL

> nodes and this code doesn't look like this walks backwards through

> casts or anything like that.  So the worry would be if we forward

> propagated through a cast into the MEM_REF node.

> 

> It looks like forwprop only propagates through "compatible" pointer

> conversions.  It makes me a bit nervous.

> 

> Jakub/Richi, comments on this hunk?


I have updated the function to call decl_binds_to_current_def_p and
added a test to verify that it's used.

>> diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c

>> index 0ca492db0ab..d1534bf7555 100644

>> --- a/gcc/testsuite/gcc.dg/Wstring-compare.c

>> +++ b/gcc/testsuite/gcc.dg/Wstring-compare.c

> In general I have a slight preference for pulling these into new files

> when we need to xfail them.  Why?  Because a test which previously

> passed, but now fails (even an xfail) causes the tester to flag the

> build as failing due to a testsuite regression.

> 

> But I don't think that preference is significant enough to ask you to

> redo the work.  Just something to ponder in the future.

> 

>   

>> diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c

>> index 57d8f651c28..f31761be173 100644

>> --- a/gcc/testsuite/gcc.dg/strcmpopt_2.c

>> +++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c

> So I'd pulled f1, f3, f5, f7 into a new file.  But disabling them like

> you've done is reasonable as well.

> 

> 

>> diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c

>> index 4e26522eed1..b07fbb6b7b0 100644

>> --- a/gcc/testsuite/gcc.dg/strcmpopt_4.c

>> +++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c

> THanks for creating the new test.  I'd done the exact same thing in my

> local tree.  I'm a little surprised that f_param passes.  Can you

> double check that?


It passes thanks to the TREE_CODE (arg) == PARM_DECL test added
in the patch to get_range_strlen (the test was missing before
and so while it handled ordinary objects (local or global) it
unnecessarily excluded function arguments.

> diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c

> b/gcc/testsuite/gcc.dg/strlenopt-69.c

>> index 9ad8e2e8aac..9df6eeccb97 100644

>> --- a/gcc/testsuite/gcc.dg/strlenopt-69.c

>> +++ b/gcc/testsuite/gcc.dg/strlenopt-69.c

> I'd pulled the offending tests into a new file, but your approach is

> fine too.

> 

>>

>> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c

>> index ad9e98973b1..b9972c16e18 100644

>> --- a/gcc/tree-ssa-strlen.c

>> +++ b/gcc/tree-ssa-strlen.c

>> -

>> -/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths

>> -   of  the string(s) referenced by ARG if it can be determined.

>> -   If the length cannot be determined, set *SIZE to the size of

>> +/* Given strinfo IDX for ARG, sets LENRNG[] to the range of lengths

>> +   of the string(s) referenced by ARG if it can be determined.

>> +   If the length cannot be determined, sets *SIZE to the size of

>>      the array the string is stored in, if any.  If no such array is

>> -   known, set *SIZE to -1.  When the strings are nul-terminated set

>> -   *NULTERM to true, otherwise to false.  Return true on success.  */

>> +   known, sets *SIZE to -1.  When the strings are nul-terminated sets

>> +   *NULTERM to true, otherwise to false.  When nonnull uses RVALS to

>> +   determine range information. Returns true on success.  */

> "When nonnull uses RVALS to detemrine range information."  That isn't a

> sentence and just doesn't seem to make sense.  Please review for

> comment clarity.


I changed it to:

   When RVALS is nonnull uses it to determine range information.

> 

> This looks OK to me with the comment fixed.  I like that we drop the

> whole determine_min_objsize and replace it it the standard range bits

> that we're using elsewhere.

> 

> Please give Richi and Jakub time to chime in on the gimple-fold.c

> changes.


Attached is an updated patch that I have just committed.

Martin
PR tree-optimization/92765 - wrong code for strcmp of a union member

gcc/ChangeLog:

	PR tree-optimization/92765
	* gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
	* tree-ssa-strlen.c (compute_string_length): Remove.
	(determine_min_objsize): Remove.
	(get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
	Avoid using type size as the upper bound on string length.
	(handle_builtin_string_cmp): Add an argument.  Adjust.
	(strlen_check_and_optimize_call): Pass additional argument to
	handle_builtin_string_cmp.

gcc/testsuite/ChangeLog:

	PR tree-optimization/92765
	* g++.dg/tree-ssa/strlenopt-1.C: New test.
	* g++.dg/tree-ssa/strlenopt-2.C: New test.
	* gcc.dg/Warray-bounds-58.c: New test.
	* gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
	* gcc.dg/Wstring-compare.c: Xfail a test.
	* gcc.dg/strcmpopt_2.c: Disable tests.
	* gcc.dg/strcmpopt_4.c: Adjust tests.
	* gcc.dg/strcmpopt_10.c: New test.
	* gcc.dg/strcmpopt_11.c: New test.
	* gcc.dg/strlenopt-69.c: Disable tests.
	* gcc.dg/strlenopt-92.c: New test.
	* gcc.dg/strlenopt-93.c: New test.
	* gcc.dg/strlenopt.h: Declare calloc.
	* gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index ed225922269..fa7a396a361 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "calls.h"
 #include "tree-vector-builder.h"
 #include "tree-ssa-strlen.h"
+#include "varasm.h"
 
 enum strlen_range_kind {
   /* Compute the exact constant string length.  */
@@ -1280,7 +1281,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 		       c_strlen_data *pdata, unsigned eltsize)
 {
   gcc_assert (TREE_CODE (arg) != SSA_NAME);
- 
+
   /* The length computed by this invocation of the function.  */
   tree val = NULL_TREE;
 
@@ -1422,7 +1423,44 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	     type about the length here.  */
 	  tight_bound = true;
 	}
-      else if (VAR_P (arg))
+      else if (TREE_CODE (arg) == MEM_REF
+	       && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
+	       && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
+	       && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
+	{
+	  /* Handle a MEM_REF into a DECL accessing an array of integers,
+	     being conservative about references to extern structures with
+	     flexible array members that can be initialized to arbitrary
+	     numbers of elements as an extension (static structs are okay).
+	     FIXME: Make this less conservative -- see
+	     component_ref_size in tree.c.  */
+	  tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
+	  if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))
+	      && (decl_binds_to_current_def_p (ref)
+		  || !array_at_struct_end_p (arg)))
+	    {
+	      /* Fail if the offset is out of bounds.  Such accesses
+		 should be diagnosed at some point.  */
+	      val = DECL_SIZE_UNIT (ref);
+	      if (!val || integer_zerop (val))
+		return false;
+
+	      poly_offset_int psiz = wi::to_offset (val);
+	      poly_offset_int poff = mem_ref_offset (arg);
+	      if (known_le (psiz, poff))
+		return false;
+
+	      pdata->minlen = ssize_int (0);
+
+	      /* Subtract the offset and one for the terminating nul.  */
+	      psiz -= poff;
+	      psiz -= 1;
+	      val = wide_int_to_tree (TREE_TYPE (val), psiz);
+	      /* Since VAL reflects the size of a declared object
+		 rather the type of the access it is not a tight bound.  */
+	    }
+	}
+      else if (TREE_CODE (arg) == PARM_DECL || VAR_P (arg))
 	{
 	  /* Avoid handling pointers to arrays.  GCC might misuse
 	     a pointer to an array of one bound to point to an array
@@ -1500,7 +1538,8 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
 	     the referenced subobject minus 1 (for the terminating nul).  */
 	  tree type = TREE_TYPE (base);
 	  if (TREE_CODE (type) == POINTER_TYPE
-	      || !VAR_P (base) || !(val = DECL_SIZE_UNIT (base)))
+	      || (TREE_CODE (base) != PARM_DECL && !VAR_P (base))
+	      || !(val = DECL_SIZE_UNIT (base)))
 	    val = build_all_ones_cst (size_type_node);
 	  else
 	    {
diff --git a/gcc/testsuite/g++.dg/tree-ssa/strlenopt-1.C b/gcc/testsuite/g++.dg/tree-ssa/strlenopt-1.C
new file mode 100644
index 00000000000..b6adfac7136
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/strlenopt-1.C
@@ -0,0 +1,42 @@
+/* PR tree-optimization/92765 - wrong code for strcmp of a union member
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+inline void* operator new (size_t, void *p)
+{
+  return p;
+}
+
+struct A { char a2[2]; };
+struct B { char a4[4]; };
+
+__attribute__((noipa)) void
+sink (void*) { }
+
+__attribute__((noipa)) void
+copy (char *d, const char *s)
+{
+  while ((*d++ = *s++));
+}
+
+__attribute__((noipa)) void
+store_and_compare (void *p)
+{
+  A *a = new (p) A;
+  sink (a->a2);
+
+  B *b = new (p) B;
+  char *q = (char *) b->a4;
+  copy (q, "abc");
+
+  if (__builtin_strcmp (q, "abc"))
+    __builtin_abort ();
+}
+
+int main ()
+{
+  char a [sizeof (A) > sizeof (B) ? sizeof (A) : sizeof (B)];
+  store_and_compare (a);
+}
diff --git a/gcc/testsuite/g++.dg/tree-ssa/strlenopt-2.C b/gcc/testsuite/g++.dg/tree-ssa/strlenopt-2.C
new file mode 100644
index 00000000000..60f8205c2ab
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/strlenopt-2.C
@@ -0,0 +1,56 @@
+/* PR tree-optimization/92765 - wrong code for strcmp of a union member
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+inline void* operator new (size_t, void *p)
+{
+  return p;
+}
+
+struct A
+{
+  char a[2]; char b[2]; char c[2];
+  A () { a[0] = 0; b[0] = 0; c[0] = 0; };
+  ~A () { }
+};
+
+struct B
+{
+  char d[6];
+  B () { d[0] = 0; d[2] = 0; }
+  ~B () { }
+};
+
+__attribute__((noipa)) void
+sink (void *) { }
+
+__attribute__((noipa)) void
+copy (char *d, const char *s)
+{
+  while ((*d++ = *s++));
+}
+
+__attribute__((noipa)) void
+store_and_compare (void *p)
+{
+  A *a = new (p) A ();
+  sink (&a->b);
+  a->~A ();
+
+  B *b = new (p) B ();
+  char *q = &b->d[2];
+  copy (q, "abc");
+
+  if (__builtin_strcmp (q, "abc"))
+    __builtin_abort ();
+  b->~B ();
+}
+
+int main ()
+{
+  char a [sizeof (A) > sizeof (B) ? sizeof (A) : sizeof (B)];
+  store_and_compare (a);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-58.c b/gcc/testsuite/gcc.dg/Warray-bounds-58.c
new file mode 100644
index 00000000000..7bd6df2bf2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Warray-bounds-58.c
@@ -0,0 +1,81 @@
+/* { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+extern size_t strlen (const char*);
+
+void sink (size_t);
+
+struct A0 { char i, a[0]; };
+
+extern struct A0 ea0;
+
+void fa0_extern (void)
+{
+  sink (strlen (ea0.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ea0.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ea0.a));        // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ea0.a + 1));    // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static struct A0 sa0 = { 0 };
+
+void fa0_static (void)
+{
+  sink (strlen (sa0.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (sa0.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (sa0.a));        // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (sa0.a + 1));    // { dg-warning "\\\[-Warray-bounds" }
+}
+
+
+struct Ax { char i, a[]; };
+
+extern struct Ax ax;
+
+void fax_extern (void)
+{
+  sink (strlen (ax.a - 2));     // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax.a - 1));     // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax.a));
+  sink (strlen (ax.a + 123));
+}
+
+static struct Ax ax0 = { 0, { 0 } };
+static struct Ax ax1 = { 1, { 1, 0 } };
+static struct Ax ax2 = { 2, { 2, 1, 0 } };
+static struct Ax ax3 = { 3, { 3, 2, 1, 0 } };
+
+void fax_static (void)
+{
+  sink (strlen (ax0.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ax0.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax0.a));
+  sink (strlen (ax0.a + 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax0.a + 2));    // { dg-warning "\\\[-Warray-bounds" }
+
+  sink (strlen (ax1.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ax1.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax1.a));
+  sink (strlen (ax1.a + 1));
+  sink (strlen (ax1.a + 2));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax1.a + 3));    // { dg-warning "\\\[-Warray-bounds" }
+
+  sink (strlen (ax2.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ax2.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax2.a));
+  sink (strlen (ax2.a + 1));
+  sink (strlen (ax2.a + 2));
+  sink (strlen (ax2.a + 3));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax2.a + 4));    // { dg-warning "\\\[-Warray-bounds" }
+
+  sink (strlen (ax3.a - 2));    // { dg-warning "\\\[-Warray-bounds" }
+  sink (strlen (ax3.a - 1));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax3.a));
+  sink (strlen (ax3.a + 1));
+  sink (strlen (ax3.a + 2));
+  sink (strlen (ax3.a + 3));
+  sink (strlen (ax3.a + 4));    // { dg-warning "\\\[-Warray-bounds" "pr93514" { xfail *-*-* } }
+  sink (strlen (ax3.a + 5));    // { dg-warning "\\\[-Warray-bounds" }
+}
diff --git a/gcc/testsuite/gcc.dg/Wrestrict-20.c b/gcc/testsuite/gcc.dg/Wrestrict-20.c
index 9826e7f4503..a2d29887c37 100644
--- a/gcc/testsuite/gcc.dg/Wrestrict-20.c
+++ b/gcc/testsuite/gcc.dg/Wrestrict-20.c
@@ -15,7 +15,7 @@ void test_warn (char *p)
   sprintf (a, "a=%s", a);     /* { dg-warning "-Wrestrict" } */
 
   p = a;
-  char *q = p + 1;
+  char *q = p + 3;
   sprintf (p, "a=%s", q);     /* { dg-warning "-Wrestrict" } */
 }
 
@@ -31,7 +31,7 @@ void test_nowarn_front_end (char *d)
 void test_nowarn_sprintf_pass (char *d)
 {
   char *q = d;
-  
+
   sprintf (d, "p=%p", q);
   snprintf (d, 32, "p=%p", q);
 
diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c
index 0ca492db0ab..d1534bf7555 100644
--- a/gcc/testsuite/gcc.dg/Wstring-compare.c
+++ b/gcc/testsuite/gcc.dg/Wstring-compare.c
@@ -120,7 +120,8 @@ void strcmp_array_copy (void)
 
 void strcmp_member_array_lit (const struct S *p)
 {
-  T (p->a4, "1234");        // { dg-warning "length 4 and an array of size 4 " }
+  // Not handled due to the fix for PR 92756.
+  T (p->a4, "1234");        // { dg-warning "length 4 and an array of size 4 " "pr92765" { xfail *-*-* } }
 }
 
 
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_10.c b/gcc/testsuite/gcc.dg/strcmpopt_10.c
new file mode 100644
index 00000000000..94fda596901
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_10.c
@@ -0,0 +1,130 @@
+/* Verify that strncmp equalities aren't eliminated when the trailing array
+   type referenced by a member pointer is smaller than the string in cases
+   when the pointer pointed to by the enclosing object references an object
+   sufficiently large to store a string of equal length.
+  { dg-do compile }
+  { dg-options "-O2 -Wall -Wextra -fdump-tree-optimized" } */
+
+void init (void*);
+
+struct A1 { char i, a[1]; };
+
+void f1_arr (void)
+{
+  char a[9];
+  init (a);
+
+  struct A1 *p = (struct A1*)a;
+
+  if (__builtin_strncmp (p->a, "01234567", 8) == 0)
+    {
+      extern void array_test (void);
+      array_test ();
+    }
+}
+
+void f1_ptr (void)
+{
+  void *p;
+  init (&p);
+
+  struct A1 *q = (struct A1*)p;
+
+  if (__builtin_strncmp (q->a, "0123456789", 10) == 0)
+    {
+      extern void pointer_test (void);
+      pointer_test ();
+    }
+}
+
+void f1_struct (void)
+{
+  struct { char a[9]; } b;
+  init (&b);
+
+  struct A1 *p = (struct A1*)&b;
+
+  if (__builtin_strncmp (p->a, "01234567", 8) == 0)
+    {
+      extern void struct_test (void);
+      struct_test ();
+    }
+}
+
+void f1_memptr (void)
+{
+  struct { void *p; } b;
+  init (&b);
+
+  struct A1 *p = (struct A1*)b.p;
+
+  if (__builtin_strncmp (p->a, "0123456789", 10) == 0)
+    {
+      extern void memptr_test (void);
+      memptr_test ();
+    }
+}
+
+
+struct A2 { char i, a[2]; };
+
+void f2_arr (void)
+{
+  char a[8];
+  init (a);
+
+  struct A2 *p = (struct A2*)a;
+
+  if (__builtin_strncmp (p->a, "0123456", 7) == 0)
+    {
+      extern void array_test (void);
+      array_test ();
+    }
+}
+
+void f2_ptr (void)
+{
+  void *p;
+  init (&p);
+
+  struct A2 *q = (struct A2*)p;
+
+  if (__builtin_strncmp (q->a, "0123456789", 10) == 0)
+    {
+      extern void pointer_test (void);
+      pointer_test ();
+    }
+}
+
+void f2_struct (void)
+{
+  struct { char a[8]; } b;
+  init (&b);
+
+  struct A2 *p = (struct A2*)&b;
+
+  if (__builtin_strncmp (p->a, "0123456", 7) == 0)
+    {
+      extern void struct_test (void);
+      struct_test ();
+    }
+}
+
+void f2_memptr (void)
+{
+  struct { void *p; } b;
+  init (&b);
+
+  struct A2 *p = (struct A2*)b.p;
+
+  if (__builtin_strncmp (p->a, "0123456789", 10) == 0)
+    {
+      extern void memptr_test (void);
+      memptr_test ();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "array_test" 2 "optimized" } }
+   { dg-final { scan-tree-dump-times "pointer_test" 2 "optimized" } }
+   { dg-final { scan-tree-dump-times "struct_test" 2 "optimized" } }
+   { dg-final { scan-tree-dump-times "memptr_test" 2 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_11.c b/gcc/testsuite/gcc.dg/strcmpopt_11.c
new file mode 100644
index 00000000000..945e0831996
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_11.c
@@ -0,0 +1,16 @@
+/* Verify that strcmp doesn't make assumptions about the size of a weak
+   symbol.
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+/* An ordinary definition of A with more elements might be provided
+   in another translation unit.  Even though that would be undefined
+   (the type of the actual definition must be the same as the type
+   of the weak declaration) this test verifies that GCC doesn't rely
+   on the size of this A for optimization (as a matter of QoI).  */
+__attribute__ ((weak)) char a[3];
+
+int cmp_a3_x (void)
+{
+  return __builtin_strcmp (a, "1234567") == 0;
+}
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
index 57d8f651c28..f31761be173 100644
--- a/gcc/testsuite/gcc.dg/strcmpopt_2.c
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -4,52 +4,53 @@
 char s[100] = {'a','b','c','d'};
 typedef struct { char s[8]; int x; } S;
 
-__attribute__ ((noinline)) int 
-f1 (S *s) 
-{ 
-  return __builtin_strcmp (s->s, "abc") != 0; 
+__attribute__ ((noinline)) int
+f1 (S *s)
+{
+  /* Member arrays not handled due to the fix for PR 92765.  */
+  return 0; // __builtin_strcmp (s->s, "abc") != 0;
 }
 
-__attribute__ ((noinline)) int 
-f2 (void) 
-{ 
-  return __builtin_strcmp (s, "abc") != 0; 
+__attribute__ ((noinline)) int
+f2 (void)
+{
+  return __builtin_strcmp (s, "abc") != 0;
 }
 
-__attribute__ ((noinline)) int 
-f3 (S *s) 
-{ 
-  return __builtin_strcmp ("abc", s->s) != 0; 
+__attribute__ ((noinline)) int
+f3 (S *s)
+{
+  return 0; // __builtin_strcmp ("abc", s->s) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f4 (void) 
-{ 
-  return __builtin_strcmp ("abc", s) != 0; 
+__attribute__ ((noinline)) int
+f4 (void)
+{
+  return __builtin_strcmp ("abc", s) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f5 (S *s) 
-{ 
-  return __builtin_strncmp (s->s, "abc", 3) != 0; 
+__attribute__ ((noinline)) int
+f5 (S *s)
+{
+  return 0; // __builtin_strncmp (s->s, "abc", 3) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f6 (void) 
-{ 
-  return __builtin_strncmp (s, "abc", 2) != 0; 
+__attribute__ ((noinline)) int
+f6 (void)
+{
+  return __builtin_strncmp (s, "abc", 2) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f7 (S *s) 
-{ 
-  return __builtin_strncmp ("abc", s->s, 3) != 0; 
+__attribute__ ((noinline)) int
+f7 (S *s)
+{
+  return 0; // __builtin_strncmp ("abc", s->s, 3) != 0;
 }
 
-__attribute__ ((noinline)) int 
-f8 (void) 
-{ 
-  return __builtin_strncmp ("abc", s, 2) != 0; 
+__attribute__ ((noinline)) int
+f8 (void)
+{
+  return __builtin_strncmp ("abc", s, 2) != 0;
 }
 
 int main (void)
@@ -64,4 +65,4 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 8 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 4 "strlen1" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c
index 4e26522eed1..b07fbb6b7b0 100644
--- a/gcc/testsuite/gcc.dg/strcmpopt_4.c
+++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
@@ -2,15 +2,26 @@
 /* { dg-options "-O2 -fdump-tree-strlen" } */
 
 typedef struct { char s[8]; int x; } S;
+
 extern int max_i;
 
-int
-f1 (S * s)
-{ 
-  int result, i;
-  for (i = 0; i < max_i; i++)
-    result += __builtin_strcmp (s->s, "abc") != 0 ? 2 : 1;
+int f_param (S s)
+{
+  int result = 0;
+  for (int i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s.s, "abc") != 0 ? 2 : 1;
+  return result;
+}
+
+
+S s;
+
+int f_object (void)
+{
+  int result = 0;
+  for (int i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s.s, "abc") != 0 ? 2 : 1;
   return result;
 }
 
-/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 1 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 2 "strlen1" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c b/gcc/testsuite/gcc.dg/strlenopt-69.c
index 9ad8e2e8aac..9df6eeccb97 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-69.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-69.c
@@ -35,6 +35,8 @@ void test_array_lit (void)
 
 void test_memarray_lit (struct S *p)
 {
+#if 0
+  /* Member arrays not handled due to the fix for PR 92765.  */
   A (strcmp (p->a4, "1234"));
   A (strcmp (p->a4, "12345"));
   A (strcmp (p->a4, "123456"));
@@ -42,6 +44,7 @@ void test_memarray_lit (struct S *p)
   A (strcmp ("1234", p->a4));
   A (strcmp ("12345", p->a4));
   A (strcmp ("123456", p->a4));
+#endif
 }
 
 /* Verify that the equality of empty strings is folded.  */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-92.c b/gcc/testsuite/gcc.dg/strlenopt-92.c
new file mode 100644
index 00000000000..a9383e65588
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-92.c
@@ -0,0 +1,58 @@
+/* PR tree-optimization/92765 - wrong code for strcmp of a union member
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+#include "strlenopt.h"
+
+__attribute__((noipa)) int
+copy (char *x, int y)
+{
+  if (y == 0)
+    strcpy (x, "abcd");
+  return y;
+}
+
+__attribute__((noipa)) char *
+alloc_2_copy_compare (int x)
+{
+  char *p;
+  if (x)
+    p = malloc (4);
+  else
+    p = calloc (16, 1);
+
+  char *q = p + 2;
+  if (copy (q, x))
+    return p;
+
+  if (strcmp (q, "abcd") != 0)
+    abort ();
+
+  return p;
+}
+
+char a5[5], a6[6], a7[7];
+
+__attribute__((noipa)) char *
+decl_3_copy_compare (int x)
+{
+  char *p = x < 0 ? a5 : 0 < x ? a6 : a7;
+  char *q = p + 1;
+  if (copy (q, x))
+    return p;
+
+  if (strcmp (q, "abcd") != 0)
+    abort ();
+
+  return p;
+}
+
+int main ()
+{
+  free (alloc_2_copy_compare (0));
+  free (alloc_2_copy_compare (1));
+
+  decl_3_copy_compare (-1);
+  decl_3_copy_compare (0);
+  decl_3_copy_compare (1);
+}
diff --git a/gcc/testsuite/gcc.dg/strlenopt-93.c b/gcc/testsuite/gcc.dg/strlenopt-93.c
new file mode 100644
index 00000000000..dc5f12d2ce3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-93.c
@@ -0,0 +1,71 @@
+/* Verify that strlen doesn't (inadvertently) use the size of an array
+   of char pointers to put an upper bound on the length of the strings
+   they point to.
+   { dg-do compile }
+   { dg-options "-O2 -Wall -fdump-tree-optimized" } */
+
+void eaa_test (void)
+{
+  extern char eaa[4][4];
+
+  char (*p)[4] = eaa;
+
+  if (!*p)
+    return;
+
+  /* The longest string stored in EAA is 15 characters.  */
+  if (__builtin_strlen (*p) > 14)
+    {
+      extern void eaa_ok (void);
+      eaa_ok ();
+    }
+
+  if (__builtin_strlen (*p) > 15)
+    {
+      extern void eaa_fail (void);
+      eaa_fail ();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "eaa_ok" 1 "optimized" } }
+   { dg-final { scan-tree-dump-not "eaa_fail" "optimized" } } */
+
+
+void epa_test (void)
+{
+  extern char* epa[4];
+  char **p = epa;
+
+  if (*p && __builtin_strlen (*p) > 123)
+    {
+      extern void epa_ok (void);
+      epa_ok ();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "epa_ok" 1 "optimized" } } */
+
+
+static char* spa[4];
+
+void spa_test (void)
+{
+  char **p = spa;
+
+  if (*p && __builtin_strlen (*p) > 123)
+    {
+      extern void spa_ok ();
+      spa_ok ();
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "spa_ok" 1 "optimized" } } */
+
+
+void sink (void*, ...);
+
+void init (void)
+{
+  /* Make believe even the static array SA may be non-zero.  */
+  sink (spa);
+}
diff --git a/gcc/testsuite/gcc.dg/strlenopt.h b/gcc/testsuite/gcc.dg/strlenopt.h
index 518d0cf08b2..a3ca951ddc5 100644
--- a/gcc/testsuite/gcc.dg/strlenopt.h
+++ b/gcc/testsuite/gcc.dg/strlenopt.h
@@ -5,6 +5,7 @@
 #define NULL ((void *) 0)
 typedef __SIZE_TYPE__ size_t;
 extern void abort (void);
+void *calloc (size_t, size_t);
 void *malloc (size_t);
 void free (void *);
 char *strdup (const char *);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
index 98dfa1d3966..7fb96514f1d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
@@ -642,10 +642,22 @@ void test_multiple_overlap (int i)
   }
 
   {
-    char a[4];                /* { dg-message "declared here" } */
+    char a[4];
+
+    /* There is no overlap here because the length of a3 is at most 1
+       and a4 is necessarily the empty string.  */
+    char *d = a;
+    char *a3 = a + 2;
+    char *a4 = a + 3;
+
+    T (d, "%s%s", a3, a4);
+  }
+
+  {
+    char a[5];                /* { dg-message "declared here" } */
 
     /* a3 and a4 may overlap the output.  They will only not overlap
-       it when a3 is empty, and a4 is at most chaeracter byte long.  */
+       it when a3 is empty, and a4 is at most 1 character long.  */
     char *d = a;
     char *a3 = a + 2;
     char *a4 = a + 3;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr92056.c b/gcc/testsuite/gcc.dg/tree-ssa/pr92056.c
index e43d0c06e2f..73b1f2ae52f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr92056.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr92056.c
@@ -1,8 +1,9 @@
-/* PR tree-optimization/92056 */
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
-/* { dg-final { scan-tree-dump-times "return 1;" 2 "optimized" } } */
-/* { dg-final { scan-tree-dump-not "strcmp \\(" "optimized" } } */
+/* PR tree-optimization/92056
+   { dg-do compile }
+   { dg-options "-O2 -fdump-tree-optimized" }
+   Xfailed until pr93518 is resolved.
+   { dg-final { scan-tree-dump-times "return 1;" 2 "optimized" { xfail *-*-* } } }
+   { dg-final { scan-tree-dump-not "strcmp \\(" "optimized" { xfail *-*-* } } } */
 
 void bar (int, char *);
 
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index ad9e98973b1..1cd64302d8b 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -327,7 +327,8 @@ get_next_strinfo (strinfo *si)
 /* Helper function for get_stridx.  Return the strinfo index of the address
    of EXP, which is available in PTR if nonnull.  If OFFSET_OUT, it is
    OK to return the index for some X <= &EXP and store &EXP - X in
-   *OFFSET_OUT.  When nonnull uses RVALS to determine range information.  */
+   *OFFSET_OUT.  When RVALS is nonnull uses it to determine range
+   information.  */
 
 static int
 get_addr_stridx (tree exp, tree ptr, unsigned HOST_WIDE_INT *offset_out,
@@ -4061,105 +4062,20 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   return true;
 }
 
-/* Given an index to the strinfo vector, compute the string length
-   for the corresponding string. Return -1 when unknown.  */
-
-static HOST_WIDE_INT
-compute_string_length (int idx)
-{
-  HOST_WIDE_INT string_leni = -1;
-  gcc_assert (idx != 0);
-
-  if (idx < 0)
-   return ~idx;
-
-  strinfo *si = get_strinfo (idx);
-  if (si)
-    {
-      tree const_string_len = get_string_length (si);
-      if (const_string_len && tree_fits_shwi_p (const_string_len))
-	string_leni = tree_to_shwi (const_string_len);
-    }
-
-  if (string_leni < 0)
-    return -1;
-
-  return string_leni;
-}
-
-/* Determine the minimum size of the object referenced by DEST expression
-   which must have a pointer type.
-   Return the minimum size of the object if successful or HWI_M1U when
-   the size cannot be determined.  */
-
-static unsigned HOST_WIDE_INT
-determine_min_objsize (tree dest)
-{
-  unsigned HOST_WIDE_INT size = 0;
-
-  init_object_sizes ();
-
-  if (compute_builtin_object_size (dest, 2, &size))
-    return size;
-
-  /* Try to determine the size of the object through the RHS
-     of the assign statement.  */
-  if (TREE_CODE (dest) == SSA_NAME)
-    {
-      gimple *stmt = SSA_NAME_DEF_STMT (dest);
-      if (!is_gimple_assign (stmt))
-	return HOST_WIDE_INT_M1U;
-
-      if (!gimple_assign_single_p (stmt)
-	  && !gimple_assign_unary_nop_p (stmt))
-	return HOST_WIDE_INT_M1U;
-
-      dest = gimple_assign_rhs1 (stmt);
-      return determine_min_objsize (dest);
-    }
-
-  /* Try to determine the size of the object from its type.  */
-  if (TREE_CODE (dest) != ADDR_EXPR)
-    return HOST_WIDE_INT_M1U;
-
-  tree type = TREE_TYPE (dest);
-  if (TREE_CODE (type) == POINTER_TYPE)
-    type = TREE_TYPE (type);
-
-  type = TYPE_MAIN_VARIANT (type);
-
-  /* The size of a flexible array cannot be determined.  Otherwise,
-     for arrays with more than one element, return the size of its
-     type.  GCC itself misuses arrays of both zero and one elements
-     as flexible array members so they are excluded as well.  */
-  if (TREE_CODE (type) != ARRAY_TYPE
-      || !array_at_struct_end_p (dest))
-    {
-      tree type_size = TYPE_SIZE_UNIT (type);
-      if (type_size && TREE_CODE (type_size) == INTEGER_CST
-	  && !integer_onep (type_size)
-	  && !integer_zerop (type_size))
-        return tree_to_uhwi (type_size);
-    }
-
-  return HOST_WIDE_INT_M1U;
-}
-
-/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths
-   of  the string(s) referenced by ARG if it can be determined.
-   If the length cannot be determined, set *SIZE to the size of
+/* Given strinfo IDX for ARG, sets LENRNG[] to the range of lengths
+   of the string(s) referenced by ARG if it can be determined.
+   If the length cannot be determined, sets *SIZE to the size of
    the array the string is stored in, if any.  If no such array is
-   known, set *SIZE to -1.  When the strings are nul-terminated set
-   *NULTERM to true, otherwise to false.  Return true on success.  */
+   known, sets *SIZE to -1.  When the strings are nul-terminated sets
+   *NULTERM to true, otherwise to false.  When nonnull uses RVALS to
+   determine range information. Returns true on success.  */
 
 static bool
 get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
-		 unsigned HOST_WIDE_INT *size, bool *nulterm)
+		 unsigned HOST_WIDE_INT *size, bool *nulterm,
+		 const vr_values *rvals)
 {
-  /* Set so that both LEN and ~LEN are invalid lengths, i.e.,
-     maximum possible length + 1.  */
-  lenrng[0] = lenrng[1] = HOST_WIDE_INT_MAX;
-
+  /* Invalidate.  */
   *size = HOST_WIDE_INT_M1U;
 
   if (idx < 0)
@@ -4168,13 +4084,18 @@ get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
       lenrng[0] = ~idx;
       lenrng[1] = lenrng[0];
       *nulterm = true;
+      return true;
     }
-  else if (idx == 0)
-    ; /* Handled below.  */
-  else if (strinfo *si = get_strinfo (idx))
+
+  /* Set so that both LEN and ~LEN are invalid lengths, i.e., maximum
+     possible length + 1.  */
+  lenrng[0] = lenrng[1] = HOST_WIDE_INT_MAX;
+
+  if (strinfo *si = idx ? get_strinfo (idx) : NULL)
     {
+      /* FIXME: Handle all this in_range_strlen_dynamic.  */
       if (!si->nonzero_chars)
-	arg = si->ptr;
+	;
       else if (tree_fits_uhwi_p (si->nonzero_chars))
 	{
 	  lenrng[0] = tree_to_uhwi (si->nonzero_chars);
@@ -4195,42 +4116,62 @@ get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
 	      *nulterm = si->full_string_p;
 	    }
 	}
-      else if (si->ptr)
-	arg = si->ptr;
     }
 
-  if (lenrng[0] == HOST_WIDE_INT_MAX)
+  if (lenrng[0] != HOST_WIDE_INT_MAX)
+    return true;
+
+  /* Compute the minimum and maximum real or possible lengths.  */
+  c_strlen_data lendata = { };
+  /* Set MAXBOUND to an arbitrary non-null non-integer node as a request
+     to have it set to the length of the longest string in a PHI.  */
+  lendata.maxbound = arg;
+  get_range_strlen_dynamic (arg, &lendata, rvals);
+
+  unsigned HOST_WIDE_INT maxbound = HOST_WIDE_INT_M1U;
+  if (tree_fits_uhwi_p (lendata.maxbound)
+      && !integer_all_onesp (lendata.maxbound))
+    maxbound = tree_to_uhwi (lendata.maxbound);
+
+  if (tree_fits_uhwi_p (lendata.minlen) && tree_fits_uhwi_p (lendata.maxlen))
     {
-      /* Compute the minimum and maximum real or possible lengths.  */
-      c_strlen_data lendata = { };
-      if (get_range_strlen (arg, &lendata, /* eltsize = */1))
+      unsigned HOST_WIDE_INT minlen = tree_to_uhwi (lendata.minlen);
+      unsigned HOST_WIDE_INT maxlen = tree_to_uhwi (lendata.maxlen);
+
+      /* The longest string in this data model.  */
+      const unsigned HOST_WIDE_INT lenmax
+	= tree_to_uhwi (max_object_size ()) - 2;
+
+      if (maxbound == HOST_WIDE_INT_M1U)
 	{
-	  if (tree_fits_shwi_p (lendata.maxlen) && !lendata.maxbound)
-	    {
-	      lenrng[0] = tree_to_shwi (lendata.minlen);
-	      lenrng[1] = tree_to_shwi (lendata.maxlen);
-	      *nulterm = true;
-	    }
-	  else if (lendata.maxbound && tree_fits_shwi_p (lendata.maxbound))
-	    {
-	      /* Set *SIZE to the conservative LENDATA.MAXBOUND which
-		 is a conservative estimate of the longest string based
-		 on the sizes of the arrays referenced by ARG.  */
-	      *size = tree_to_uhwi (lendata.maxbound) + 1;
-	      *nulterm = false;
-	    }
+	  lenrng[0] = minlen;
+	  lenrng[1] = maxlen;
+	  *nulterm = minlen == maxlen;
 	}
-      else
+      else if (maxlen < lenmax)
 	{
-	  /* Set *SIZE to the size of the smallest object referenced
-	     by ARG if ARG denotes a single object, or to HWI_M1U
-	     otherwise.  */
-	  *size = determine_min_objsize (arg);
+	  *size = maxbound + 1;
 	  *nulterm = false;
 	}
+      else
+	return false;
+
+      return true;
     }
 
-  return lenrng[0] != HOST_WIDE_INT_MAX || *size != HOST_WIDE_INT_M1U;
+  if (maxbound != HOST_WIDE_INT_M1U
+      && lendata.maxlen
+      && !integer_all_onesp (lendata.maxlen))
+    {
+      /* Set *SIZE to LENDATA.MAXBOUND which is a conservative estimate
+	 of the longest string based on the sizes of the arrays referenced
+	 by ARG.  */
+      *size = maxbound + 1;
+      *nulterm = false;
+      return true;
+    }
+
+  return false;
 }
 
 /* If IDX1 and IDX2 refer to strings A and B of unequal lengths, return
@@ -4245,15 +4186,15 @@ get_len_or_size (tree arg, int idx, unsigned HOST_WIDE_INT lenrng[2],
 static tree
 strxcmp_eqz_result (tree arg1, int idx1, tree arg2, int idx2,
 		    unsigned HOST_WIDE_INT bound, unsigned HOST_WIDE_INT len[2],
-		    unsigned HOST_WIDE_INT *psize)
+		    unsigned HOST_WIDE_INT *psize, const vr_values *rvals)
 {
   /* Determine the range the length of each string is in and whether it's
      known to be nul-terminated, or the size of the array it's stored in.  */
   bool nul1, nul2;
   unsigned HOST_WIDE_INT siz1, siz2;
   unsigned HOST_WIDE_INT len1rng[2], len2rng[2];
-  if (!get_len_or_size (arg1, idx1, len1rng, &siz1, &nul1)
-      || !get_len_or_size (arg2, idx2, len2rng, &siz2, &nul2))
+  if (!get_len_or_size (arg1, idx1, len1rng, &siz1, &nul1, rvals)
+      || !get_len_or_size (arg2, idx2, len2rng, &siz2, &nul2, rvals))
     return NULL_TREE;
 
   /* BOUND is set to HWI_M1U for strcmp and less to strncmp, and LENiRNG
@@ -4405,7 +4346,7 @@ maybe_warn_pointless_strcmp (gimple *stmt, HOST_WIDE_INT bound,
    another and false otherwise.  */
 
 static bool
-handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi, const vr_values *rvals)
 {
   gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
   tree lhs = gimple_call_lhs (stmt);
@@ -4451,7 +4392,7 @@ handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
        or definitely unequal and if so, either fold the result to zero
        (when equal) or set the range of the result to ~[0, 0] otherwise.  */
     if (tree eqz = strxcmp_eqz_result (arg1, idx1, arg2, idx2, bound,
-				       len, &siz))
+				       len, &siz, rvals))
       {
 	if (integer_zerop (eqz))
 	  {
@@ -4482,26 +4423,31 @@ handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
   HOST_WIDE_INT cstlen1 = -1, cstlen2 = -1;
   HOST_WIDE_INT arysiz1 = -1, arysiz2 = -1;
 
-  if (idx1)
-    cstlen1 = compute_string_length (idx1);
-  else
-    arysiz1 = determine_min_objsize (arg1);
+  {
+    unsigned HOST_WIDE_INT len1rng[2], len2rng[2];
+    unsigned HOST_WIDE_INT arsz1, arsz2;
+    bool nulterm[2];
+
+    if (!get_len_or_size (arg1, idx1, len1rng, &arsz1, nulterm, rvals)
+	|| !get_len_or_size (arg2, idx2, len2rng, &arsz2, nulterm + 1, rvals))
+      return false;
+
+    if (len1rng[0] == len1rng[1] && len1rng[0] < HOST_WIDE_INT_MAX)
+      cstlen1 = len1rng[0];
+    else if (arsz1 < HOST_WIDE_INT_M1U)
+      arysiz1 = arsz1;
+
+    if (len2rng[0] == len2rng[1] && len2rng[0] < HOST_WIDE_INT_MAX)
+      cstlen2 = len2rng[0];
+    else if (arsz2 < HOST_WIDE_INT_M1U)
+      arysiz2 = arsz2;
+  }
 
   /* Bail if neither the string length nor the size of the array
      it is stored in can be determined.  */
-  if (cstlen1 < 0 && arysiz1 < 0)
-    return false;
-
-  /* Repeat for the second argument.  */
-  if (idx2)
-    cstlen2 = compute_string_length (idx2);
-  else
-    arysiz2 = determine_min_objsize (arg2);
-
-  if (cstlen2 < 0 && arysiz2 < 0)
-    return false;
-
-  if (cstlen1 < 0 && cstlen2 < 0)
+  if ((cstlen1 < 0 && arysiz1 < 0)
+      || (cstlen2 < 0 && arysiz2 < 0)
+      || (cstlen1 < 0 && cstlen2 < 0))
     return false;
 
   if (cstlen1 >= 0)
@@ -5435,7 +5381,7 @@ strlen_check_and_optimize_call (gimple_stmt_iterator *gsi, bool *zero_write,
       break;
     case BUILT_IN_STRCMP:
     case BUILT_IN_STRNCMP:
-      if (handle_builtin_string_cmp (gsi))
+      if (handle_builtin_string_cmp (gsi, rvals))
 	return false;
       break;
     default:
Martin Sebor Feb. 5, 2020, 11:58 p.m. | #8
On 2/4/20 7:35 AM, Richard Biener wrote:
>>

...
>> Jakub/Richi, comments on this hunk?

> 

> +         tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);

> +         tree off = TREE_OPERAND (arg, 1);

> +         if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))

> +             && (!DECL_EXTERNAL (ref)

> +                 || !array_at_struct_end_p (arg)))

> +           {

> 

> I think you'd want decl_binds_to_current_def_p (ref) instead of !DECL_EXTERNAL.


I've made the change.

> Since 'arg' is originally a pointer array_at_struct_end_p is

> meaningless here since

> that's about the structure of a reference while the pointer is just a

> value.


array_at_struct_end_p handles MEM_REF by looking at the argument
(i.e., at the DECL when it is one), so its use here avoids DECLs
with flexible arrays but allows others.  In other words, I don't
want to exclude MEM_REFs to a in:

   extern char a[4];

just because a is extern.

> So if

> you're concerned the objects size might not be as it looks like then you have to

> rely on decl_binds_to_current_def_p only.


I'm only concerned about sizes of extern objects of struct types
with flexible array members.  I believe others are handled fine.

> You also shouldn't use 'off' natively

> in the code below but use mem_ref_offset to access the embedded offset

> which is to be interpreted as signed integer (it's a pointer as you use it).

> You compare it against an unsigned size...


I've changed it in the latest revision of the patch.

Martin
Jeff Law Feb. 6, 2020, 1:38 a.m. | #9
On Wed, 2020-02-05 at 16:57 -0700, Martin Sebor wrote:
> 

> It passes thanks to the TREE_CODE (arg) == PARM_DECL test added

> in the patch to get_range_strlen (the test was missing before

> and so while it handled ordinary objects (local or global) it

> unnecessarily excluded function arguments.

Oh yea, duh.  I recall noting you added the PARM_DECL handling and
thinking it might allow us to salvage some of the tests.  THen promptly
forgot.

jeff
>

Patch

PR tree-optimization/92765 - wrong code for strcmp of a union member

gcc/testsuite/ChangeLog:

	PR tree-optimization/92765
	* gcc.dg/strlenopt-92.c: New test.

gcc/ChangeLog:

	PR tree-optimization/92765
	* tree-ssa-strlen.c (component_ref_via_union_p): New function.
	(determine_min_objsize): Call it.  Use the maximum object size
	for PHI arguments.

diff --git a/gcc/testsuite/gcc.dg/strlenopt-92.c b/gcc/testsuite/gcc.dg/strlenopt-92.c
new file mode 100644
index 00000000000..44ad5b88854
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-92.c
@@ -0,0 +1,94 @@ 
+/* PR tree-optimization/92765 - wrong code for strcmp of a union member
+   { dg-do run }
+   { dg-options "-O2 -Wall" } */
+
+#include "strlenopt.h"
+
+extern void free (void*);
+extern void* calloc (size_t, size_t);
+extern void* malloc (size_t);
+
+
+/* Test from comment #0.  */
+struct S0 {
+  char a[2];
+};
+
+union U0 {
+  char b[4];
+  struct S0 s;
+};
+
+union U0 u0;
+union U0 *p0 = &u0;
+
+int test_0 ()
+{
+  u0.b[0] = 'a';
+  u0.b[1] = 'b';
+  u0.b[2] = '\0';
+
+  int x = memcmp (p0->s.a, "x", 2);
+
+  if (strcmp (p0->b, "ab"))
+    abort ();
+
+  return x;
+}
+
+
+/* Test from comment #6.  */
+union U1 { struct S1 { char a[2]; char b[2]; char c[2]; } s; char d[6]; } u1;
+
+__attribute__((noipa)) void
+barrier (char *p)
+{
+  asm volatile ("" : : "g" (p) : "memory");
+}
+
+__attribute__((noipa)) void
+test_1 (union U1 *x)
+{
+  char *p = (char *) &x->s.b;
+  barrier (p);
+  if (strcmp (&x->d[2], "cde"))
+    abort ();
+}
+
+/* Test from comment #7.  */
+
+__attribute__((noipa)) int
+barrier_copy (char *x, int y)
+{
+  asm volatile ("" : : "g" (x), "g" (y) : "memory");
+  if (y == 0)
+    strcpy (x, "abcd");
+  return y;
+}
+
+__attribute__((noipa)) char *
+test_2 (int x)
+{
+  char *p;
+  if (x)
+    p = malloc (2);
+  else
+    p = calloc (16, 1);
+  if (barrier_copy (p, x))
+    return p;
+  if (strcmp (p, "abcd") != 0)
+    abort ();
+  return p;
+}
+
+
+int main (void)
+{
+  test_0 ();
+
+  strcpy (u1.d, "abcde");
+  test_1 (&u1);
+
+  free (test_2 (0));
+  free (test_2 (1));
+}
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index ad9e98973b1..f4b6aadae47 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -4087,6 +4087,29 @@  compute_string_length (int idx)
   return string_leni;
 }
 
+/* Returns true if REF is a reference to a member of a union type,
+   or a member of such a type (traversing all references along
+   the path).  Used to avoid making assumptions about accesses
+   to members that could also be accessed by other members of
+   incompatible types.  */
+
+static bool
+component_ref_via_union_p (tree ref)
+{
+  if (TREE_CODE (ref) == ADDR_EXPR)
+    ref = TREE_OPERAND (ref, 0);
+
+  while (TREE_CODE (ref) == MEM_REF || handled_component_p (ref))
+    {
+      tree type = TREE_TYPE (ref);
+      if (TREE_CODE (type) == UNION_TYPE)
+	return true;
+      ref = TREE_OPERAND (ref, 0);
+    }
+
+  return false;
+}
+
 /* Determine the minimum size of the object referenced by DEST expression
    which must have a pointer type.
    Return the minimum size of the object if successful or HWI_M1U when
@@ -4099,14 +4122,18 @@  determine_min_objsize (tree dest)
 
   init_object_sizes ();
 
-  if (compute_builtin_object_size (dest, 2, &size))
-    return size;
-
   /* Try to determine the size of the object through the RHS
      of the assign statement.  */
   if (TREE_CODE (dest) == SSA_NAME)
     {
       gimple *stmt = SSA_NAME_DEF_STMT (dest);
+
+      /* Determine the size of the largest object when DEST refers
+	 to two or more via a PHI, otherwise the smallest.  */
+      int ostype = gimple_code (stmt) == GIMPLE_PHI ? 0 : 2;
+      if (compute_builtin_object_size (dest, ostype, &size))
+	return size;
+
       if (!is_gimple_assign (stmt))
 	return HOST_WIDE_INT_M1U;
 
@@ -4118,6 +4145,10 @@  determine_min_objsize (tree dest)
       return determine_min_objsize (dest);
     }
 
+  /* Try to determine the size of the referenced object itself.  */
+  if (compute_builtin_object_size (dest, 2, &size))
+    return size;
+
   /* Try to determine the size of the object from its type.  */
   if (TREE_CODE (dest) != ADDR_EXPR)
     return HOST_WIDE_INT_M1U;
@@ -4129,11 +4160,13 @@  determine_min_objsize (tree dest)
   type = TYPE_MAIN_VARIANT (type);
 
   /* The size of a flexible array cannot be determined.  Otherwise,
-     for arrays with more than one element, return the size of its
-     type.  GCC itself misuses arrays of both zero and one elements
-     as flexible array members so they are excluded as well.  */
+     unless the reference involves a union, for arrays with more than
+     one element, return the size of its type.  GCC itself misuses
+     arrays of both zero and one elements as flexible array members
+     so they are excluded as well.  */
   if (TREE_CODE (type) != ARRAY_TYPE
-      || !array_at_struct_end_p (dest))
+      || (!component_ref_via_union_p (dest)
+	  && !array_at_struct_end_p (dest)))
     {
       tree type_size = TYPE_SIZE_UNIT (type);
       if (type_size && TREE_CODE (type_size) == INTEGER_CST