combine: Fix find_split_point handling of constant store into ZERO_EXTRACT [PR93908]

Message ID 20200225091008.GJ2155@tucnak
State New
Headers show
Series
  • combine: Fix find_split_point handling of constant store into ZERO_EXTRACT [PR93908]
Related show

Commit Message

Jakub Jelinek Feb. 25, 2020, 9:10 a.m.
Hi!

git is miscompiled on s390x-linux with -O2 -march=zEC12 -mtune=z13.
I've managed to reduce it into the following testcase.  The problem is that
during combine we see the s->k = -1; bitfield store and change the SET_SRC
from a pseudo into a constant:
(set (zero_extract:DI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ])
                (const_int 10 [0xa])) [0 +0 S2 A16])
        (const_int 2 [0x2])
        (const_int 7 [0x7]))
    (const_int -1 [0xffffffffffffffff]))
This on s390x with the above option isn't recognized as valid instruction,
so find_split_point decides to handle it as IOR or IOR/AND.
src is -1, mask is 3 and pos is 7.
src != mask (this is also incorrect, we want to set all (both) bits in the
bitfield), so we go for IOR/AND, but instead of trying
mem = (mem & ~0x180) | ((-1 << 7) & 0x180)
we actually try
mem = (mem & ~0x180) | (-1 << 7)
and that is further simplified into:
mem = mem | (-1 << 7)
aka
mem = mem | 0xff80
which doesn't set just the 2-bit bitfield, but also many other bitfields
that shouldn't be touched.
We really should do:
mem = mem | 0x180
instead.
The problem is that we assume that no bits but those low len (2 here) will
be set in the SET_SRC, but there is nothing that can prevent that, we just
should ignore the other bits.

The following patch fixes it by masking src with mask, this way already
the src == mask test will DTRT, and as the code for or_mask uses
gen_int_mode, if the most significant bit is set after shifting it left by
pos, it will be properly sign-extended.

Bootstrapped/regtested on {x86_64,i686,s390x}-linux, ok for trunk and
release branches?

2020-02-25  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/93908
	* combine.c (find_split_point): For store into ZERO_EXTRACT, and src
	with mask.

	* gcc.c-torture/execute/pr93908.c: New test.


	Jakub

Comments

Segher Boessenkool Feb. 25, 2020, 12:43 p.m. | #1
Hi!

On Tue, Feb 25, 2020 at 10:10:08AM +0100, Jakub Jelinek wrote:
> git is miscompiled on s390x-linux with -O2 -march=zEC12 -mtune=z13.

> I've managed to reduce it into the following testcase.  The problem is that

> during combine we see the s->k = -1; bitfield store and change the SET_SRC

> from a pseudo into a constant:

> (set (zero_extract:DI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ])

>                 (const_int 10 [0xa])) [0 +0 S2 A16])

>         (const_int 2 [0x2])

>         (const_int 7 [0x7]))

>     (const_int -1 [0xffffffffffffffff]))


Extracts on the LHS are a monstrosity.

> The following patch fixes it by masking src with mask, this way already

> the src == mask test will DTRT, and as the code for or_mask uses

> gen_int_mode, if the most significant bit is set after shifting it left by

> pos, it will be properly sign-extended.


Yup.  Did you see anything else nearby forgetting to do this as well?

> Bootstrapped/regtested on {x86_64,i686,s390x}-linux, ok for trunk and

> release branches?


Okay everywhere (8 and 9 immediately, if your schedule requires).
Thanks!

> 2020-02-25  Jakub Jelinek  <jakub@redhat.com>

> 

> 	PR rtl-optimization/93908

> 	* combine.c (find_split_point): For store into ZERO_EXTRACT, and src

> 	with mask.

> 

> 	* gcc.c-torture/execute/pr93908.c: New test.


> --- gcc/testsuite/gcc.target/s390/pr93908.c.jj	2020-02-24 18:39:26.767378641 +0100

> +++ gcc/testsuite/gcc.target/s390/pr93908.c	2020-02-24 18:39:18.928495337 +0100

> @@ -0,0 +1,5 @@

> +/* PR rtl-optimization/93908 */

> +/* { dg-do run { target s390_zEC12_hw } } */

> +/* { dg-options "-O2 -march=zEC12 -mtune=z13" } */

> +

> +#include "../../gcc.c-torture/execute/pr93908.c"


This file isn't in the changelog, please fix that?


Segher

Patch

--- gcc/combine.c.jj	2020-01-30 21:27:48.059362833 +0100
+++ gcc/combine.c	2020-02-24 18:31:44.482260431 +0100
@@ -5130,10 +5130,9 @@  find_split_point (rtx *loc, rtx_insn *in
 	{
 	  HOST_WIDE_INT pos = INTVAL (XEXP (SET_DEST (x), 2));
 	  unsigned HOST_WIDE_INT len = INTVAL (XEXP (SET_DEST (x), 1));
-	  unsigned HOST_WIDE_INT src = INTVAL (SET_SRC (x));
 	  rtx dest = XEXP (SET_DEST (x), 0);
-	  unsigned HOST_WIDE_INT mask
-	    = (HOST_WIDE_INT_1U << len) - 1;
+	  unsigned HOST_WIDE_INT mask = (HOST_WIDE_INT_1U << len) - 1;
+	  unsigned HOST_WIDE_INT src = INTVAL (SET_SRC (x)) & mask;
 	  rtx or_mask;
 
 	  if (BITS_BIG_ENDIAN)
--- gcc/testsuite/gcc.c-torture/execute/pr93908.c.jj	2020-02-24 18:37:57.207711869 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr93908.c	2020-02-24 18:37:40.698957622 +0100
@@ -0,0 +1,54 @@ 
+/* PR rtl-optimization/93908 */
+
+struct T
+{
+  int b;
+  int c;
+  unsigned short d;
+  unsigned e:1, f:1, g:1, h:2, i:1, j:1;
+  signed int k:2;
+};
+
+struct S
+{
+  struct T s;
+  char c[64];
+} buf[2];
+
+__attribute__ ((noipa)) void *
+baz (void)
+{
+  static int cnt;
+  return (void *) &buf[cnt++];
+}
+
+static inline __attribute__ ((always_inline)) struct T *
+bar (const char *a)
+{
+  struct T *s;
+  s = baz ();
+  s->b = 1;
+  s->k = -1;
+  return s;
+}
+
+__attribute__ ((noipa)) void
+foo (const char *x, struct T **y)
+{
+  struct T *l = bar (x);
+  struct T *m = bar (x);
+  y[0] = l;
+  y[1] = m;
+}
+
+int
+main ()
+{
+  struct T *r[2];
+  foo ("foo", r);
+  if (r[0]->e || r[0]->f || r[0]->g || r[0]->h || r[0]->i || r[0]->j || r[0]->k != -1)
+    __builtin_abort ();
+  if (r[1]->e || r[1]->f || r[1]->g || r[1]->h || r[1]->i || r[1]->j || r[1]->k != -1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/s390/pr93908.c.jj	2020-02-24 18:39:26.767378641 +0100
+++ gcc/testsuite/gcc.target/s390/pr93908.c	2020-02-24 18:39:18.928495337 +0100
@@ -0,0 +1,5 @@ 
+/* PR rtl-optimization/93908 */
+/* { dg-do run { target s390_zEC12_hw } } */
+/* { dg-options "-O2 -march=zEC12 -mtune=z13" } */
+
+#include "../../gcc.c-torture/execute/pr93908.c"