[RFA/RFC,tree-optimization/91890,P1,Regression] Avoid clobbering useful location in Wrestrict code

Message ID 1f238a3d24c6b5ca99a154b4e924484ae2515f96.camel@redhat.com
State New
Headers show
Series
  • [RFA/RFC,tree-optimization/91890,P1,Regression] Avoid clobbering useful location in Wrestrict code
Related show

Commit Message

Jeff Law March 4, 2020, 3:54 p.m.
Martin, I'd like your thoughts here.

As noted in the BZ our #pragmas aren't working to suppress a warning.

I did some debugging and ultimately found that the location passed down to the
diagnostic code is indeed outside the scope of the pragmas.

Further digging uncovered that we had a reasonable location passed to
maybe_diag_access_bounds, but rather than using it, we extracted a location
attached to the builtin_memref structure.

So if we look at the test:


char one[50];
char two[50];

void
test_strncat (void)
{
  (void) __builtin_strcpy (one, "gh");
  (void) __builtin_strcpy (two, "ef");
 
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-overflow="
#pragma GCC diagnostic ignored "-Warray-bounds"
  (void) __builtin_strncat (one, two, 99); 
#pragma GCC diagnostic pop
}

The location we end up passing to the diagnostic code comes from the destination
of the call (via DSTREF which is passed to maybe_diag_access_bounds and becomes
REF in that context).

> (gdb) p debug_tree (ref.ptr)

>  <addr_expr 0x7fffea93d0a0

>     type <pointer_type 0x7fffea937540

>         type <array_type 0x7fffea9372a0 type <integer_type 0x7fffea8173f0 char>

>             BLK

>             size <integer_cst 0x7fffea81c3a8 constant 400>

>             unit-size <integer_cst 0x7fffea933f90 constant 50>

>             align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type

> 0x7fffea9372a0 domain <integer_type 0x7fffea9371f8>

>             pointer_to_this <pointer_type 0x7fffea937540>>

>         unsigned DI

>         size <integer_cst 0x7fffea7fecd8 constant 64>

>         unit-size <integer_cst 0x7fffea7fecf0 constant 8>

>         align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type

> 0x7fffea937540>

>     constant

>     arg:0 <var_decl 0x7ffff7ffbb40 one type <array_type 0x7fffea9372a0>

>         addressable used public static read BLK j.c:2:6 size <integer_cst

> 0x7fffea81c3a8 400> unit-size <integer_cst 0x7fffea933f90 50>

>         align:256 warn_if_not_align:0 context <translation_unit_decl

> 0x7fffea80bac8 j.c>

>         chain <var_decl 0x7ffff7ffbbd0 two type <array_type 0x7fffea9372a0>

>             addressable used public static read BLK j.c:3:6 size <integer_cst

> 0x7fffea81c3a8 400> unit-size <integer_cst 0x7fffea933f90 50>

>             align:256 warn_if_not_align:0 context <translation_unit_decl

> 0x7fffea80bac8 j.c> chain <function_decl 0x7fffea92f500 test_strncat>>>

>     j.c:8:28 start: j.c:8:28 finish: j.c:8:30>

> 


Note the location.  j.c line 8 column 28-30.  THat location makes absolutely no
sense (it's the first call to strcpy) given we're trying to diagnose the strncat.

I have no idea why we're using the location from ref.ptr rather than the location
of the statement we're analyzing (which is passed in as LOC).  It's no surprise
the pragma didn't work.

Anyway, this patch seems to fix the problem and doesn't regress anything.  But
I'm hesitant to go forward with it given I don't know the motivation behind
extracting the location from ref.ptr.

Another thought would be to just remove the code that pulls the location from
ref.ptr.  I haven't tested that, but certainly could easily.


Martin, thoughts?
PR tree-optimization/91890
	* gimple-ssa-warn-restrict.c (maybe_diag_access_bounds): Only
	extract a location from the reference if there is currently
	no valid location information.

	* gcc.dg/pragma-diag-8.c: New test.

Comments

Martin Sebor March 4, 2020, 4:22 p.m. | #1
On 3/4/20 8:54 AM, Jeff Law wrote:
> 

> Martin, I'd like your thoughts here.

> 

> As noted in the BZ our #pragmas aren't working to suppress a warning.

> 

> I did some debugging and ultimately found that the location passed down to the

> diagnostic code is indeed outside the scope of the pragmas.

> 

> Further digging uncovered that we had a reasonable location passed to

> maybe_diag_access_bounds, but rather than using it, we extracted a location

> attached to the builtin_memref structure.

> 

> So if we look at the test:

> 

> 

> char one[50];

> char two[50];

> 

> void

> test_strncat (void)

> {

>    (void) __builtin_strcpy (one, "gh");

>    (void) __builtin_strcpy (two, "ef");

>   

> #pragma GCC diagnostic push

> #pragma GCC diagnostic ignored "-Wstringop-overflow="

> #pragma GCC diagnostic ignored "-Warray-bounds"

>    (void) __builtin_strncat (one, two, 99);

> #pragma GCC diagnostic pop

> }

> 

> The location we end up passing to the diagnostic code comes from the destination

> of the call (via DSTREF which is passed to maybe_diag_access_bounds and becomes

> REF in that context).

> 

>> (gdb) p debug_tree (ref.ptr)

>>   <addr_expr 0x7fffea93d0a0

>>      type <pointer_type 0x7fffea937540

>>          type <array_type 0x7fffea9372a0 type <integer_type 0x7fffea8173f0 char>

>>              BLK

>>              size <integer_cst 0x7fffea81c3a8 constant 400>

>>              unit-size <integer_cst 0x7fffea933f90 constant 50>

>>              align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type

>> 0x7fffea9372a0 domain <integer_type 0x7fffea9371f8>

>>              pointer_to_this <pointer_type 0x7fffea937540>>

>>          unsigned DI

>>          size <integer_cst 0x7fffea7fecd8 constant 64>

>>          unit-size <integer_cst 0x7fffea7fecf0 constant 8>

>>          align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type

>> 0x7fffea937540>

>>      constant

>>      arg:0 <var_decl 0x7ffff7ffbb40 one type <array_type 0x7fffea9372a0>

>>          addressable used public static read BLK j.c:2:6 size <integer_cst

>> 0x7fffea81c3a8 400> unit-size <integer_cst 0x7fffea933f90 50>

>>          align:256 warn_if_not_align:0 context <translation_unit_decl

>> 0x7fffea80bac8 j.c>

>>          chain <var_decl 0x7ffff7ffbbd0 two type <array_type 0x7fffea9372a0>

>>              addressable used public static read BLK j.c:3:6 size <integer_cst

>> 0x7fffea81c3a8 400> unit-size <integer_cst 0x7fffea933f90 50>

>>              align:256 warn_if_not_align:0 context <translation_unit_decl

>> 0x7fffea80bac8 j.c> chain <function_decl 0x7fffea92f500 test_strncat>>>

>>      j.c:8:28 start: j.c:8:28 finish: j.c:8:30>

>>

> 

> Note the location.  j.c line 8 column 28-30.  THat location makes absolutely no

> sense (it's the first call to strcpy) given we're trying to diagnose the strncat.

> 

> I have no idea why we're using the location from ref.ptr rather than the location

> of the statement we're analyzing (which is passed in as LOC).  It's no surprise

> the pragma didn't work.

> 

> Anyway, this patch seems to fix the problem and doesn't regress anything.  But

> I'm hesitant to go forward with it given I don't know the motivation behind

> extracting the location from ref.ptr.

> 

> Another thought would be to just remove the code that pulls the location from

> ref.ptr.  I haven't tested that, but certainly could easily.

> 

> 

> Martin, thoughts?


I don't remember why the code in -Wrestrict unconditionally overwrites
the statement location rather than only when it's not available, but
I do remember adding conditional code like in your patch in r277076
to deal with missing location on the statement.  So either your fix
or something like the hunk below might be the right solution (if we
go with the code below, abstracting it into a utility function might
be nice).

Thanks for looking into it!

Martin


@@ -3642,7 +3670,11 @@ maybe_warn_pointless_strcmp (gimple *stmt, 
HOST_WIDE_INT bound,

    /* FIXME: Include a note pointing to the declaration of the smaller
       array.  */
-  location_t stmt_loc = gimple_location (stmt);
+  location_t stmt_loc = gimple_nonartificial_location (stmt);
+  if (stmt_loc == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (lhs))
+    stmt_loc = tree_nonartificial_location (lhs);
+  stmt_loc = expansion_point_location_if_in_system_header (stmt_loc);
+
    tree callee = gimple_call_fndecl (stmt);
    bool warned = false;
    if (siz <= minlen && bound == -1)
Jeff Law March 4, 2020, 11:49 p.m. | #2
On Wed, 2020-03-04 at 09:22 -0700, Martin Sebor wrote:
> 

> I don't remember why the code in -Wrestrict unconditionally overwrites

> the statement location rather than only when it's not available, but

> I do remember adding conditional code like in your patch in r277076

> to deal with missing location on the statement.  So either your fix

> or something like the hunk below might be the right solution (if we

> go with the code below, abstracting it into a utility function might

> be nice).

So there's several chunks that are fairly similar to what you referenced in
maybe_warn_pointless_strcmp.  Factoring all of them into a single location is
pretty easy.

That also gives us a nice place where we can experiment with "does extraction of
location information from the expression ever help".  The answer is, it doesn't,
at least not within our testsuite when run on x86_64.

I'm hesitant to remove the code that extracts the location out of the expression,
but could be convinced to do so.

Thoughts?

Jeff
Fix location maybe_diag_overlap passes to diagnostics so that
diagnostic pragmas work better.

	PR tree-optimization/91890
	* gimple-ssa-warn-restrict.c (maybe_diag_overlap): Remove LOC argument.
	Use gimple_or_expr_nonartificial_location.
	(check_bounds_overlap): Drop LOC argument to maybe_diag_access_bounds.
	Use gimple_or_expr_nonartificial_location.
	* gimple.c (gimple_or_expr_nonartificial_location): New function.
	* gimple.h (gimple_or_expr_nonartificial_location): Declare it.
	* tree-ssa-strlen.c (maybe_warn_overflow): Use
	gimple_or_expr_nonartificial_location.
	(maybe_diag_stxncpy_trunc, handle_builtin_stxncpy_strncat): Likewise.
	(maybe_warn_pointless_strcmp): Likewise.

	* gcc.dg/pragma-diag-8.c: New test.
diff --git a/gcc/gimple-ssa-warn-restrict.c b/gcc/gimple-ssa-warn-restrict.c
index 2c582a670eb..5e7e5d41dbb 100644
--- a/gcc/gimple-ssa-warn-restrict.c
+++ b/gcc/gimple-ssa-warn-restrict.c
@@ -1692,10 +1692,11 @@ maybe_diag_overlap (location_t loc, gimple *call, builtin_access &acs)
    has been issued, or would have been issued if DO_WARN had been true.  */
 
 static bool
-maybe_diag_access_bounds (location_t loc, gimple *call, tree func, int strict,
+maybe_diag_access_bounds (gimple *call, tree func, int strict,
 			  const builtin_memref &ref, offset_int wroff,
 			  bool do_warn)
 {
+  location_t loc = gimple_or_expr_nonartificial_location (call, ref.ptr);
   const offset_int maxobjsize = ref.maxobjsize;
 
   /* Check for excessive size first and regardless of warning options
@@ -1711,11 +1712,6 @@ maybe_diag_access_bounds (location_t loc, gimple *call, tree func, int strict,
 
       if (warn_stringop_overflow)
 	{
-	  if (EXPR_HAS_LOCATION (ref.ptr))
-	    loc = EXPR_LOCATION (ref.ptr);
-
-	  loc = expansion_point_location_if_in_system_header (loc);
-
 	  if (ref.sizrange[0] == ref.sizrange[1])
 	    return warning_at (loc, OPT_Wstringop_overflow_,
 			       "%G%qD specified bound %wu "
@@ -1754,11 +1750,6 @@ maybe_diag_access_bounds (location_t loc, gimple *call, tree func, int strict,
       || (ref.ref && TREE_NO_WARNING (ref.ref)))
     return false;
 
-  if (EXPR_HAS_LOCATION (ref.ptr))
-    loc = EXPR_LOCATION (ref.ptr);
-
-  loc = expansion_point_location_if_in_system_header (loc);
-
   char rangestr[2][64];
   if (ooboff[0] == ooboff[1]
       || (ooboff[0] != ref.offrange[0]
@@ -2018,9 +2009,6 @@ check_bounds_or_overlap (gimple *call, tree dst, tree src, tree dstsize,
 			 tree srcsize, bool bounds_only /* = false */,
 			 bool do_warn /* = true */)
 {
-  location_t loc = gimple_nonartificial_location (call);
-  loc = expansion_point_location_if_in_system_header (loc);
-
   tree func = gimple_call_fndecl (call);
 
   builtin_memref dstref (dst, dstsize);
@@ -2041,8 +2029,8 @@ check_bounds_or_overlap (gimple *call, tree dst, tree src, tree dstsize,
   /* Validate offsets to each reference before the access first to make
      sure they are within the bounds of the destination object if its
      size is known, or PTRDIFF_MAX otherwise.  */
-  if (maybe_diag_access_bounds (loc, call, func, strict, dstref, wroff, do_warn)
-      || maybe_diag_access_bounds (loc, call, func, strict, srcref, 0, do_warn))
+  if (maybe_diag_access_bounds (call, func, strict, dstref, wroff, do_warn)
+      || maybe_diag_access_bounds (call, func, strict, srcref, 0, do_warn))
     {
       if (do_warn)
 	gimple_set_no_warning (call, true);
@@ -2066,6 +2054,7 @@ check_bounds_or_overlap (gimple *call, tree dst, tree src, tree dstsize,
 	}
     }
 
+  location_t loc = gimple_or_expr_nonartificial_location (call, dst);
   if (operand_equal_p (dst, src, 0))
     {
       /* Issue -Wrestrict unless the pointers are null (those do
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 324e706d508..92c6e642589 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -3285,6 +3285,19 @@ gimple_inexpensive_call_p (gcall *stmt)
   return false;
 }
 
+/* Return a non-artificial location for STMT.  If STMT does not have
+   location information, get the location from EXPR.  */
+
+location_t
+gimple_or_expr_nonartificial_location (gimple *stmt, tree)
+{
+  location_t loc = gimple_nonartificial_location (stmt);
+  if (loc == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (expr))
+    loc = tree_nonartificial_location (expr);
+  return expansion_point_location_if_in_system_header (loc);
+}
+
+
 #if CHECKING_P
 
 namespace selftest {
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 192f19ad23a..0420d6d2251 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1633,6 +1633,8 @@ extern void gimple_seq_discard (gimple_seq);
 extern void maybe_remove_unused_call_args (struct function *, gimple *);
 extern bool gimple_inexpensive_call_p (gcall *);
 extern bool stmt_can_terminate_bb_p (gimple *);
+extern location_t gimple_or_expr_nonartificial_location (gimple *, tree);
+
 
 /* Formal (expression) temporary table handling: multiple occurrences of
    the same scalar expression are evaluated into the same temporary.  */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index b76b54efbd8..7cc576b333e 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2106,11 +2106,7 @@ maybe_warn_overflow (gimple *stmt, tree len,
 	  || !si || !is_strlen_related_p (si->ptr, len)))
     return;
 
-  location_t loc = gimple_nonartificial_location (stmt);
-  if (loc == UNKNOWN_LOCATION && dest && EXPR_HAS_LOCATION (dest))
-    loc = tree_nonartificial_location (dest);
-  loc = expansion_point_location_if_in_system_header (loc);
-
+  location_t loc = gimple_or_expr_nonartificial_location (stmt, dest);
   bool warned = false;
   if (wi::leu_p (lenrng[0], spcrng[1]))
     {
@@ -3159,9 +3155,7 @@ maybe_diag_stxncpy_trunc (gimple_stmt_iterator gsi, tree src, tree cnt)
 	}
     }
 
-  location_t callloc = gimple_nonartificial_location (stmt);
-  callloc = expansion_point_location_if_in_system_header (callloc);
-
+  location_t callloc = gimple_or_expr_nonartificial_location (stmt, dst);
   tree func = gimple_call_fndecl (stmt);
 
   if (lenrange[0] != 0 || !wi::neg_p (lenrange[1]))
@@ -3373,8 +3367,7 @@ handle_builtin_stxncpy_strncat (bool append_p, gimple_stmt_iterator *gsi)
      to strlen(S)).  */
   strinfo *silen = get_strinfo (pss->first);
 
-  location_t callloc = gimple_nonartificial_location (stmt);
-  callloc = expansion_point_location_if_in_system_header (callloc);
+  location_t callloc = gimple_or_expr_nonartificial_location (stmt, dst);
 
   tree func = gimple_call_fndecl (stmt);
 
@@ -4301,10 +4294,7 @@ maybe_warn_pointless_strcmp (gimple *stmt, HOST_WIDE_INT bound,
 
   /* FIXME: Include a note pointing to the declaration of the smaller
      array.  */
-  location_t stmt_loc = gimple_nonartificial_location (stmt);
-  if (stmt_loc == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (lhs))
-    stmt_loc = tree_nonartificial_location (lhs);
-  stmt_loc = expansion_point_location_if_in_system_header (stmt_loc);
+  location_t stmt_loc = gimple_or_expr_nonartificial_location (stmt, lhs);
 
   tree callee = gimple_call_fndecl (stmt);
   bool warned = false;
diff --git a/gcc/testsuite/gcc.dg/pragma-diag-8.c b/gcc/testsuite/gcc.dg/pragma-diag-8.c
new file mode 100644
index 00000000000..00780606e9b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pragma-diag-8.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wall" } */
+
+
+char one[50];
+char two[50];
+
+void
+test_strncat (void)
+{
+  (void) __builtin_strcpy (one, "gh");
+  (void) __builtin_strcpy (two, "ef");
+ 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wstringop-overflow="
+#pragma GCC diagnostic ignored "-Warray-bounds"
+  (void) __builtin_strncat (one, two, 99); 
+#pragma GCC diagnostic pop
+}
+
Martin Sebor March 5, 2020, 1 a.m. | #3
On 3/4/20 4:49 PM, Jeff Law wrote:
> On Wed, 2020-03-04 at 09:22 -0700, Martin Sebor wrote:

>>

>> I don't remember why the code in -Wrestrict unconditionally overwrites

>> the statement location rather than only when it's not available, but

>> I do remember adding conditional code like in your patch in r277076

>> to deal with missing location on the statement.  So either your fix

>> or something like the hunk below might be the right solution (if we

>> go with the code below, abstracting it into a utility function might

>> be nice).

> So there's several chunks that are fairly similar to what you referenced in

> maybe_warn_pointless_strcmp.  Factoring all of them into a single location is

> pretty easy.

> 

> That also gives us a nice place where we can experiment with "does extraction of

> location information from the expression ever help".  The answer is, it doesn't,

> at least not within our testsuite when run on x86_64.

> 

> I'm hesitant to remove the code that extracts the location out of the expression,

> but could be convinced to do so.

> 

> Thoughts?


I tried disabling the two UNKNOWN_LOCATION workarounds in the strlen
pass and that didn't make a difference to the test results I ran either
(I only ran warning tests) so it looks like whatever reason it was added
for doesn't exist anymore.

I usually open bugs when I see something not working but the only bug
I've raised that I can find that has to do with location is pr90735,
for -Wreturn-local-addr.  It has the background into the problem there.
Grepping for gimple_set_location and UNKNOWN_LOCATION turns up just four 
instances in the middle-end:

grep gimple_set_location gcc/*.c | grep UNKNOWN_)
gcc/gimple-low.c:	  gimple_set_location (t.stmt, UNKNOWN_LOCATION);
gcc/gimple-low.c:	  gimple_set_location (tmp_rs.stmt, UNKNOWN_LOCATION);
gcc/tree-eh.c:	gimple_set_location (stmt, UNKNOWN_LOCATION);
gcc/tree-inline.c:		  gimple_set_location (stmt, UNKNOWN_LOCATION);

Maybe something will jump out at you there.

For GCC 10 I agree it's safest to leave it in place.  For GCC 11, if
we can't come up with a test case that shows it's needed, I'll look
into removing it in stage 1.

Martin
Richard Biener March 5, 2020, 7:51 a.m. | #4
On Thu, Mar 5, 2020 at 12:49 AM Jeff Law <law@redhat.com> wrote:
>

> On Wed, 2020-03-04 at 09:22 -0700, Martin Sebor wrote:

> >

> > I don't remember why the code in -Wrestrict unconditionally overwrites

> > the statement location rather than only when it's not available, but

> > I do remember adding conditional code like in your patch in r277076

> > to deal with missing location on the statement.  So either your fix

> > or something like the hunk below might be the right solution (if we

> > go with the code below, abstracting it into a utility function might

> > be nice).

> So there's several chunks that are fairly similar to what you referenced in

> maybe_warn_pointless_strcmp.  Factoring all of them into a single location is

> pretty easy.

>

> That also gives us a nice place where we can experiment with "does extraction of

> location information from the expression ever help".  The answer is, it doesn't,

> at least not within our testsuite when run on x86_64.

>

> I'm hesitant to remove the code that extracts the location out of the expression,

> but could be convinced to do so.

>

> Thoughts?


Using anything but the actual stmt location is prone to end up at random places
due to tree sharing issues, CSE and copy propagation.  Simply consider

char one[50];
char two[50];

void
test_strncat (void)
{
  char *p = one;
  (void) __builtin_strcpy (p, "gh");
  (void) __builtin_strcpy (two, "ef");

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-overflow="
#pragma GCC diagnostic ignored "-Warray-bounds"
  (void) __builtin_strncat (p, two, 99);
#pragma GCC diagnostic pop
}

where we happily forward p = &one[0] to both uses injecting
a "faulty" location.  Well, it's actually the correct location
computing &one[0] but irrelevant for the actual call.

So the question is why we end up with UNKNOWN_LOCATION
for such call and if why we need to bother emit a diagnostic
at all (and why emitting it for another possibly random location is a good idea
instead of maybe simply emitting it without location).

Richard.

> Jeff
Jeff Law March 5, 2020, 2:55 p.m. | #5
On Thu, 2020-03-05 at 08:51 +0100, Richard Biener wrote:
> On Thu, Mar 5, 2020 at 12:49 AM Jeff Law <law@redhat.com> wrote:

> > On Wed, 2020-03-04 at 09:22 -0700, Martin Sebor wrote:

> > > I don't remember why the code in -Wrestrict unconditionally overwrites

> > > the statement location rather than only when it's not available, but

> > > I do remember adding conditional code like in your patch in r277076

> > > to deal with missing location on the statement.  So either your fix

> > > or something like the hunk below might be the right solution (if we

> > > go with the code below, abstracting it into a utility function might

> > > be nice).

> > So there's several chunks that are fairly similar to what you referenced in

> > maybe_warn_pointless_strcmp.  Factoring all of them into a single location is

> > pretty easy.

> > 

> > That also gives us a nice place where we can experiment with "does extraction

> > of

> > location information from the expression ever help".  The answer is, it

> > doesn't,

> > at least not within our testsuite when run on x86_64.

> > 

> > I'm hesitant to remove the code that extracts the location out of the

> > expression,

> > but could be convinced to do so.

> > 

> > Thoughts?

> 

> Using anything but the actual stmt location is prone to end up at random places

> due to tree sharing issues, CSE and copy propagation.  Simply consider

I'd tend to agree.  My conservatism is due to being in stage4 and not knowing
precisely why we have code to extract the location from the operand to begin
with.


> where we happily forward p = &one[0] to both uses injecting

> a "faulty" location.  Well, it's actually the correct location

> computing &one[0] but irrelevant for the actual call.

Exactly.

> 

> So the question is why we end up with UNKNOWN_LOCATION

> for such call and if why we need to bother emit a diagnostic

> at all (and why emitting it for another possibly random location is a good idea

> instead of maybe simply emitting it without location).

One might argue that scenario should be a gcc_unreachable rather than extracting
a likely bogus location.  I'm even more hesitant to do that for gcc-10, but it
might sense for gcc-11.

My first inclination would be do do the refactor, but leave in the code that
extracts a location from the expression.  We'd close out the regression BZ and
open a new one to remove the expression handling bits for gcc-11 (or turn them
into a gcc_unreachable)

Does that work for  you Richi?

jeff
Martin Sebor March 5, 2020, 3:53 p.m. | #6
On 3/5/20 12:51 AM, Richard Biener wrote:
> On Thu, Mar 5, 2020 at 12:49 AM Jeff Law <law@redhat.com> wrote:

>>

>> On Wed, 2020-03-04 at 09:22 -0700, Martin Sebor wrote:

>>>

>>> I don't remember why the code in -Wrestrict unconditionally overwrites

>>> the statement location rather than only when it's not available, but

>>> I do remember adding conditional code like in your patch in r277076

>>> to deal with missing location on the statement.  So either your fix

>>> or something like the hunk below might be the right solution (if we

>>> go with the code below, abstracting it into a utility function might

>>> be nice).

>> So there's several chunks that are fairly similar to what you referenced in

>> maybe_warn_pointless_strcmp.  Factoring all of them into a single location is

>> pretty easy.

>>

>> That also gives us a nice place where we can experiment with "does extraction of

>> location information from the expression ever help".  The answer is, it doesn't,

>> at least not within our testsuite when run on x86_64.

>>

>> I'm hesitant to remove the code that extracts the location out of the expression,

>> but could be convinced to do so.

>>

>> Thoughts?

> 

> Using anything but the actual stmt location is prone to end up at random places

> due to tree sharing issues, CSE and copy propagation.  Simply consider

> 

> char one[50];

> char two[50];

> 

> void

> test_strncat (void)

> {

>    char *p = one;

>    (void) __builtin_strcpy (p, "gh");

>    (void) __builtin_strcpy (two, "ef");

> 

> #pragma GCC diagnostic push

> #pragma GCC diagnostic ignored "-Wstringop-overflow="

> #pragma GCC diagnostic ignored "-Warray-bounds"

>    (void) __builtin_strncat (p, two, 99);


Interestingly, while the expression location points to p, the warning
points to the statement:

warning: ‘__builtin_strncat’ forming offset [50, 98] is out of the 
bounds [0, 50] of object ‘one’ with type ‘char[50]’ [-Warray-bounds]
    14 |   (void) __builtin_strncat (p, two, 99);
       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As it happens, the %G directive in the warning_at() call replaces
the location passed to it with that of the Gimple call argument to
the %G directive.  Removing the %G directive turns the warning into:

warning: ‘__builtin_strncat’ forming offset [50, 98] is out of the 
bounds [0, 50] of object ‘one’ with type ‘char[50]’ [-Warray-bounds]
     7 |   char *p = one;
       |             ^~~

But the code that checks the scope of #pragma GCC diagnostic uses
the original location passed to warning_at, not the location set
subsequently by the %G directive, and so the two are out of synch.

We've discussed removing the %G/%K directives before and having
the diagnostic machinery always print the inlining context instead.
Let me look into it for GCC 11.

Martin

> #pragma GCC diagnostic pop

> }

> 

> where we happily forward p = &one[0] to both uses injecting

> a "faulty" location.  Well, it's actually the correct location

> computing &one[0] but irrelevant for the actual call.

> 

> So the question is why we end up with UNKNOWN_LOCATION

> for such call and if why we need to bother emit a diagnostic

> at all (and why emitting it for another possibly random location is a good idea

> instead of maybe simply emitting it without location).

> 

> Richard.

> 

>> Jeff
Richard Biener March 5, 2020, 6:49 p.m. | #7
On March 5, 2020 4:53:43 PM GMT+01:00, Martin Sebor <msebor@gmail.com> wrote:
>On 3/5/20 12:51 AM, Richard Biener wrote:

>> On Thu, Mar 5, 2020 at 12:49 AM Jeff Law <law@redhat.com> wrote:

>>>

>>> On Wed, 2020-03-04 at 09:22 -0700, Martin Sebor wrote:

>>>>

>>>> I don't remember why the code in -Wrestrict unconditionally

>overwrites

>>>> the statement location rather than only when it's not available,

>but

>>>> I do remember adding conditional code like in your patch in r277076

>>>> to deal with missing location on the statement.  So either your fix

>>>> or something like the hunk below might be the right solution (if we

>>>> go with the code below, abstracting it into a utility function

>might

>>>> be nice).

>>> So there's several chunks that are fairly similar to what you

>referenced in

>>> maybe_warn_pointless_strcmp.  Factoring all of them into a single

>location is

>>> pretty easy.

>>>

>>> That also gives us a nice place where we can experiment with "does

>extraction of

>>> location information from the expression ever help".  The answer is,

>it doesn't,

>>> at least not within our testsuite when run on x86_64.

>>>

>>> I'm hesitant to remove the code that extracts the location out of

>the expression,

>>> but could be convinced to do so.

>>>

>>> Thoughts?

>> 

>> Using anything but the actual stmt location is prone to end up at

>random places

>> due to tree sharing issues, CSE and copy propagation.  Simply

>consider

>> 

>> char one[50];

>> char two[50];

>> 

>> void

>> test_strncat (void)

>> {

>>    char *p = one;

>>    (void) __builtin_strcpy (p, "gh");

>>    (void) __builtin_strcpy (two, "ef");

>> 

>> #pragma GCC diagnostic push

>> #pragma GCC diagnostic ignored "-Wstringop-overflow="

>> #pragma GCC diagnostic ignored "-Warray-bounds"

>>    (void) __builtin_strncat (p, two, 99);

>

>Interestingly, while the expression location points to p, the warning

>points to the statement:

>

>warning: ‘__builtin_strncat’ forming offset [50, 98] is out of the 

>bounds [0, 50] of object ‘one’ with type ‘char[50]’ [-Warray-bounds]

>    14 |   (void) __builtin_strncat (p, two, 99);

>       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

>

>As it happens, the %G directive in the warning_at() call replaces

>the location passed to it with that of the Gimple call argument to

>the %G directive.  Removing the %G directive turns the warning into:

>

>warning: ‘__builtin_strncat’ forming offset [50, 98] is out of the 

>bounds [0, 50] of object ‘one’ with type ‘char[50]’ [-Warray-bounds]

>     7 |   char *p = one;

>       |             ^~~

>

>But the code that checks the scope of #pragma GCC diagnostic uses

>the original location passed to warning_at, not the location set

>subsequently by the %G directive, and so the two are out of synch.


Ah, I see. 

>We've discussed removing the %G/%K directives before and having

>the diagnostic machinery always print the inlining context instead.

>Let me look into it for GCC 11.

>

>Martin

>

>> #pragma GCC diagnostic pop

>> }

>> 

>> where we happily forward p = &one[0] to both uses injecting

>> a "faulty" location.  Well, it's actually the correct location

>> computing &one[0] but irrelevant for the actual call.

>> 

>> So the question is why we end up with UNKNOWN_LOCATION

>> for such call and if why we need to bother emit a diagnostic

>> at all (and why emitting it for another possibly random location is a

>good idea

>> instead of maybe simply emitting it without location).

>> 

>> Richard.

>> 

>>> Jeff
Richard Biener March 5, 2020, 6:51 p.m. | #8
On March 5, 2020 3:55:57 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>On Thu, 2020-03-05 at 08:51 +0100, Richard Biener wrote:

>> On Thu, Mar 5, 2020 at 12:49 AM Jeff Law <law@redhat.com> wrote:

>> > On Wed, 2020-03-04 at 09:22 -0700, Martin Sebor wrote:

>> > > I don't remember why the code in -Wrestrict unconditionally

>overwrites

>> > > the statement location rather than only when it's not available,

>but

>> > > I do remember adding conditional code like in your patch in

>r277076

>> > > to deal with missing location on the statement.  So either your

>fix

>> > > or something like the hunk below might be the right solution (if

>we

>> > > go with the code below, abstracting it into a utility function

>might

>> > > be nice).

>> > So there's several chunks that are fairly similar to what you

>referenced in

>> > maybe_warn_pointless_strcmp.  Factoring all of them into a single

>location is

>> > pretty easy.

>> > 

>> > That also gives us a nice place where we can experiment with "does

>extraction

>> > of

>> > location information from the expression ever help".  The answer

>is, it

>> > doesn't,

>> > at least not within our testsuite when run on x86_64.

>> > 

>> > I'm hesitant to remove the code that extracts the location out of

>the

>> > expression,

>> > but could be convinced to do so.

>> > 

>> > Thoughts?

>> 

>> Using anything but the actual stmt location is prone to end up at

>random places

>> due to tree sharing issues, CSE and copy propagation.  Simply

>consider

>I'd tend to agree.  My conservatism is due to being in stage4 and not

>knowing

>precisely why we have code to extract the location from the operand to

>begin

>with.

>

>

>> where we happily forward p = &one[0] to both uses injecting

>> a "faulty" location.  Well, it's actually the correct location

>> computing &one[0] but irrelevant for the actual call.

>Exactly.

>

>> 

>> So the question is why we end up with UNKNOWN_LOCATION

>> for such call and if why we need to bother emit a diagnostic

>> at all (and why emitting it for another possibly random location is a

>good idea

>> instead of maybe simply emitting it without location).

>One might argue that scenario should be a gcc_unreachable rather than

>extracting

>a likely bogus location.  I'm even more hesitant to do that for gcc-10,

>but it

>might sense for gcc-11.

>

>My first inclination would be do do the refactor, but leave in the code

>that

>extracts a location from the expression.  We'd close out the regression

>BZ and

>open a new one to remove the expression handling bits for gcc-11 (or

>turn them

>into a gcc_unreachable)

>

>Does that work for  you Richi?


We should avoid regressing in other ways of course. Given Martin's followup I'm not sure what to do but eventually stop using ‰G and remove the odd expression location code? 

Richard. 

>jeff
Jeff Law March 5, 2020, 7:33 p.m. | #9
On Thu, 2020-03-05 at 19:51 +0100, Richard Biener wrote:
> On March 5, 2020 3:55:57 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:

> > On Thu, 2020-03-05 at 08:51 +0100, Richard Biener wrote:

> > > On Thu, Mar 5, 2020 at 12:49 AM Jeff Law <law@redhat.com> wrote:

> > > > On Wed, 2020-03-04 at 09:22 -0700, Martin Sebor wrote:

> > > > > I don't remember why the code in -Wrestrict unconditionally

> > overwrites

> > > > > the statement location rather than only when it's not available,

> > but

> > > > > I do remember adding conditional code like in your patch in

> > r277076

> > > > > to deal with missing location on the statement.  So either your

> > fix

> > > > > or something like the hunk below might be the right solution (if

> > we

> > > > > go with the code below, abstracting it into a utility function

> > might

> > > > > be nice).

> > > > So there's several chunks that are fairly similar to what you

> > referenced in

> > > > maybe_warn_pointless_strcmp.  Factoring all of them into a single

> > location is

> > > > pretty easy.

> > > > 

> > > > That also gives us a nice place where we can experiment with "does

> > extraction

> > > > of

> > > > location information from the expression ever help".  The answer

> > is, it

> > > > doesn't,

> > > > at least not within our testsuite when run on x86_64.

> > > > 

> > > > I'm hesitant to remove the code that extracts the location out of

> > the

> > > > expression,

> > > > but could be convinced to do so.

> > > > 

> > > > Thoughts?

> > > 

> > > Using anything but the actual stmt location is prone to end up at

> > random places

> > > due to tree sharing issues, CSE and copy propagation.  Simply

> > consider

> > I'd tend to agree.  My conservatism is due to being in stage4 and not

> > knowing

> > precisely why we have code to extract the location from the operand to

> > begin

> > with.

> > 

> > 

> > > where we happily forward p = &one[0] to both uses injecting

> > > a "faulty" location.  Well, it's actually the correct location

> > > computing &one[0] but irrelevant for the actual call.

> > Exactly.

> > 

> > > So the question is why we end up with UNKNOWN_LOCATION

> > > for such call and if why we need to bother emit a diagnostic

> > > at all (and why emitting it for another possibly random location is a

> > good idea

> > > instead of maybe simply emitting it without location).

> > One might argue that scenario should be a gcc_unreachable rather than

> > extracting

> > a likely bogus location.  I'm even more hesitant to do that for gcc-10,

> > but it

> > might sense for gcc-11.

> > 

> > My first inclination would be do do the refactor, but leave in the code

> > that

> > extracts a location from the expression.  We'd close out the regression

> > BZ and

> > open a new one to remove the expression handling bits for gcc-11 (or

> > turn them

> > into a gcc_unreachable)

> > 

> > Does that work for  you Richi?

> 

> We should avoid regressing in other ways of course. Given Martin's followup I'm

> not sure what to do but eventually stop using ‰G and remove the odd expression

> location code? 

Yea, I think that's the conclusion Martin and I came to today as well.  Do the
refactoring now, but leaving in the expression location bits, then remove the
expression location bits in gcc-11 after further testing.

jeff

Patch

diff --git a/gcc/gimple-ssa-warn-restrict.c b/gcc/gimple-ssa-warn-restrict.c
index 2c582a670eb..9421d3d6899 100644
--- a/gcc/gimple-ssa-warn-restrict.c
+++ b/gcc/gimple-ssa-warn-restrict.c
@@ -1711,7 +1711,7 @@  maybe_diag_access_bounds (location_t loc, gimple *call, tree func, int strict,
 
       if (warn_stringop_overflow)
 	{
-	  if (EXPR_HAS_LOCATION (ref.ptr))
+	  if (loc == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (ref.ptr))
 	    loc = EXPR_LOCATION (ref.ptr);
 
 	  loc = expansion_point_location_if_in_system_header (loc);
@@ -1754,7 +1754,7 @@  maybe_diag_access_bounds (location_t loc, gimple *call, tree func, int strict,
       || (ref.ref && TREE_NO_WARNING (ref.ref)))
     return false;
 
-  if (EXPR_HAS_LOCATION (ref.ptr))
+  if (loc == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (ref.ptr))
     loc = EXPR_LOCATION (ref.ptr);
 
   loc = expansion_point_location_if_in_system_header (loc);
diff --git a/gcc/testsuite/gcc.dg/pragma-diag-8.c b/gcc/testsuite/gcc.dg/pragma-diag-8.c
new file mode 100644
index 00000000000..00780606e9b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pragma-diag-8.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wall" } */
+
+
+char one[50];
+char two[50];
+
+void
+test_strncat (void)
+{
+  (void) __builtin_strcpy (one, "gh");
+  (void) __builtin_strcpy (two, "ef");
+ 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wstringop-overflow="
+#pragma GCC diagnostic ignored "-Warray-bounds"
+  (void) __builtin_strncat (one, two, 99); 
+#pragma GCC diagnostic pop
+}
+