[AArch64] PR target/84164: Relax predicate in *aarch64_<optab>_reg_di3_mask2

Message ID 5A747F56.7020403@foss.arm.com
State New
Headers show
Series
  • [AArch64] PR target/84164: Relax predicate in *aarch64_<optab>_reg_di3_mask2
Related show

Commit Message

Kyrill Tkachov Feb. 2, 2018, 3:10 p.m.
Hi all,

In this [8 Regression] PR we ICE because we can't recognise the insn:
(insn 59 58 43 7 (set (reg:DI 124)
         (rotatert:DI (reg:DI 125 [ c ])
             (subreg:QI (and:SI (reg:SI 128)
                     (const_int 65535 [0xffff])) 0)))

This was created by the *aarch64_reg_<mode>3_neg_mask2 splitter.
The point of these splitters and patterns is to eliminate redundant
masking of the shift (or rotate in this case) amount when shifting
by a variable amount.  For example in AND x3, x3, 0xffff ; ROR x1, x2, x3
we can eliminate the AND because the ROR instruction implicitly "MOD"s
the shift amount in X3 by 64. So this pattern is supposed to match the following:

(define_insn "*aarch64_<optab>_reg_di3_mask2"
   [(set (match_operand:DI 0 "register_operand" "=r")
     (SHIFT:DI
       (match_operand:DI 1 "register_operand" "r")
       (match_operator 4 "subreg_lowpart_operator"
        [(and:SI (match_operand:SI 2 "register_operand" "r")
             (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]

The rotation amount mask is supposed to be operand 3 but the predicate for it here
is aarch64_shift_imm_di which only allows values in [0, 63], whereas we want to allow
any value that doesn't touch the bottom GET_MODE_BITSIZE (DImode) bits, which is what
the pattern predicate tests. So the predicate on operands 3 is too strict.

This patch just makes it const_int_operand since the pattern predicates has the correct
validation for its values. This is in line with what the *aarch64_reg_<mode>3_neg_mask2
and *aarch64_reg_<mode>3_minus_mask splitters accept (and they are the ones that generate
this insn pattern).

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?
Thanks,
Kyrill

2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/84164
     * config/aarch64/aarch64.md (*aarch64_<optab>_reg_di3_mask2):
     Use const_int_operand predicate for operand 3.

2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/84164
     * gcc.c-torture/compile/pr84164.c: New test.

Comments

Richard Earnshaw (lists) Feb. 2, 2018, 3:25 p.m. | #1
On 02/02/18 15:10, Kyrill Tkachov wrote:
> Hi all,

> 

> In this [8 Regression] PR we ICE because we can't recognise the insn:

> (insn 59 58 43 7 (set (reg:DI 124)

>         (rotatert:DI (reg:DI 125 [ c ])

>             (subreg:QI (and:SI (reg:SI 128)

>                     (const_int 65535 [0xffff])) 0)))



Aren't we heading off down the wrong path here?

(subreg:QI (and:SI (reg:SI 128) (const_int 65535 [0xffff])) 0))

can be simplified to

(subreg:QI (reg:SI 128) 0)

since the AND operation is redundant as we're only looking at the bottom
8 bits.

R.

> 

> This was created by the *aarch64_reg_<mode>3_neg_mask2 splitter.

> The point of these splitters and patterns is to eliminate redundant

> masking of the shift (or rotate in this case) amount when shifting

> by a variable amount.  For example in AND x3, x3, 0xffff ; ROR x1, x2, x3

> we can eliminate the AND because the ROR instruction implicitly "MOD"s

> the shift amount in X3 by 64. So this pattern is supposed to match the

> following:

> 

> (define_insn "*aarch64_<optab>_reg_di3_mask2"

>   [(set (match_operand:DI 0 "register_operand" "=r")

>     (SHIFT:DI

>       (match_operand:DI 1 "register_operand" "r")

>       (match_operator 4 "subreg_lowpart_operator"

>        [(and:SI (match_operand:SI 2 "register_operand" "r")

>             (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]

> 

> The rotation amount mask is supposed to be operand 3 but the predicate

> for it here

> is aarch64_shift_imm_di which only allows values in [0, 63], whereas we

> want to allow

> any value that doesn't touch the bottom GET_MODE_BITSIZE (DImode) bits,

> which is what

> the pattern predicate tests. So the predicate on operands 3 is too strict.

> 

> This patch just makes it const_int_operand since the pattern predicates

> has the correct

> validation for its values. This is in line with what the

> *aarch64_reg_<mode>3_neg_mask2

> and *aarch64_reg_<mode>3_minus_mask splitters accept (and they are the

> ones that generate

> this insn pattern).

> 

> Bootstrapped and tested on aarch64-none-linux-gnu.

> 

> Ok for trunk?

> Thanks,

> Kyrill

> 

> 2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

> 

>     PR target/84164

>     * config/aarch64/aarch64.md (*aarch64_<optab>_reg_di3_mask2):

>     Use const_int_operand predicate for operand 3.

> 

> 2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

> 

>     PR target/84164

>     * gcc.c-torture/compile/pr84164.c: New test.

> 

> aarch64-mask.patch

> 

> 

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md

> index 04b2d203fa168bebcf6f93a13e3bd67a5998935a..eef0d1a780dd1c886e1bebb9992c552fb9d5b57c 100644

> --- a/gcc/config/aarch64/aarch64.md

> +++ b/gcc/config/aarch64/aarch64.md

> @@ -4358,8 +4358,8 @@ (define_insn "*aarch64_<optab>_reg_di3_mask2"

>  	  (match_operand:DI 1 "register_operand" "r")

>  	  (match_operator 4 "subreg_lowpart_operator"

>  	   [(and:SI (match_operand:SI 2 "register_operand" "r")

> -		     (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]

> -  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode)-1)) == 0)"

> +		    (match_operand 3 "const_int_operand" "n"))])))]

> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode) - 1)) == 0)"

>  {

>    rtx xop[3];

>    xop[0] = operands[0];

> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr84164.c b/gcc/testsuite/gcc.c-torture/compile/pr84164.c

> new file mode 100644

> index 0000000000000000000000000000000000000000..d663fe5d6649e495f3e956e6a3bc938236a4bf91

> --- /dev/null

> +++ b/gcc/testsuite/gcc.c-torture/compile/pr84164.c

> @@ -0,0 +1,17 @@

> +/* PR target/84164.  */

> +

> +int b, d;

> +unsigned long c;

> +

> +static inline __attribute__ ((always_inline)) void

> +bar (int e)

> +{

> +  e += __builtin_sub_overflow_p (b, d, c);

> +  c = c << ((short) ~e & 3) | c >> (-((short) ~e & 3) & 63);

> +}

> +

> +int

> +foo (void)

> +{

> +  bar (~1);

> +}

>
Kyrill Tkachov Feb. 2, 2018, 3:43 p.m. | #2
Hi Richard,

On 02/02/18 15:25, Richard Earnshaw (lists) wrote:
> On 02/02/18 15:10, Kyrill Tkachov wrote:

>> Hi all,

>>

>> In this [8 Regression] PR we ICE because we can't recognise the insn:

>> (insn 59 58 43 7 (set (reg:DI 124)

>>          (rotatert:DI (reg:DI 125 [ c ])

>>              (subreg:QI (and:SI (reg:SI 128)

>>                      (const_int 65535 [0xffff])) 0)))

>

> Aren't we heading off down the wrong path here?

>

> (subreg:QI (and:SI (reg:SI 128) (const_int 65535 [0xffff])) 0))

>

> can be simplified to

>

> (subreg:QI (reg:SI 128) 0)

>

> since the AND operation is redundant as we're only looking at the bottom

> 8 bits.


I have tried implementing such a transformation in combine [1]
but it was not clear that it was universally beneficial.
See the linked thread and PR 70119 for the discussion (the thread
continues into the next month).

This patch fixes the existing patterns, such as they are,
with the explicit subreg matching and associated warts.

We can resurrect the combine simplification discussion at some point
but for the GCC 8 it would be safer to fix the existing pattern.

Thanks,
Kyrill

[1] https://gcc.gnu.org/ml/gcc/2016-02/msg00357.html

> R.

>

>> This was created by the *aarch64_reg_<mode>3_neg_mask2 splitter.

>> The point of these splitters and patterns is to eliminate redundant

>> masking of the shift (or rotate in this case) amount when shifting

>> by a variable amount.  For example in AND x3, x3, 0xffff ; ROR x1, x2, x3

>> we can eliminate the AND because the ROR instruction implicitly "MOD"s

>> the shift amount in X3 by 64. So this pattern is supposed to match the

>> following:

>>

>> (define_insn "*aarch64_<optab>_reg_di3_mask2"

>>    [(set (match_operand:DI 0 "register_operand" "=r")

>>      (SHIFT:DI

>>        (match_operand:DI 1 "register_operand" "r")

>>        (match_operator 4 "subreg_lowpart_operator"

>>         [(and:SI (match_operand:SI 2 "register_operand" "r")

>>              (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]

>>

>> The rotation amount mask is supposed to be operand 3 but the predicate

>> for it here

>> is aarch64_shift_imm_di which only allows values in [0, 63], whereas we

>> want to allow

>> any value that doesn't touch the bottom GET_MODE_BITSIZE (DImode) bits,

>> which is what

>> the pattern predicate tests. So the predicate on operands 3 is too strict.

>>

>> This patch just makes it const_int_operand since the pattern predicates

>> has the correct

>> validation for its values. This is in line with what the

>> *aarch64_reg_<mode>3_neg_mask2

>> and *aarch64_reg_<mode>3_minus_mask splitters accept (and they are the

>> ones that generate

>> this insn pattern).

>>

>> Bootstrapped and tested on aarch64-none-linux-gnu.

>>

>> Ok for trunk?

>> Thanks,

>> Kyrill

>>

>> 2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>

>>      PR target/84164

>>      * config/aarch64/aarch64.md (*aarch64_<optab>_reg_di3_mask2):

>>      Use const_int_operand predicate for operand 3.

>>

>> 2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>

>>      PR target/84164

>>      * gcc.c-torture/compile/pr84164.c: New test.

>>

>> aarch64-mask.patch

>>

>>

>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md

>> index 04b2d203fa168bebcf6f93a13e3bd67a5998935a..eef0d1a780dd1c886e1bebb9992c552fb9d5b57c 100644

>> --- a/gcc/config/aarch64/aarch64.md

>> +++ b/gcc/config/aarch64/aarch64.md

>> @@ -4358,8 +4358,8 @@ (define_insn "*aarch64_<optab>_reg_di3_mask2"

>>   	  (match_operand:DI 1 "register_operand" "r")

>>   	  (match_operator 4 "subreg_lowpart_operator"

>>   	   [(and:SI (match_operand:SI 2 "register_operand" "r")

>> -		     (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]

>> -  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode)-1)) == 0)"

>> +		    (match_operand 3 "const_int_operand" "n"))])))]

>> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode) - 1)) == 0)"

>>   {

>>     rtx xop[3];

>>     xop[0] = operands[0];

>> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr84164.c b/gcc/testsuite/gcc.c-torture/compile/pr84164.c

>> new file mode 100644

>> index 0000000000000000000000000000000000000000..d663fe5d6649e495f3e956e6a3bc938236a4bf91

>> --- /dev/null

>> +++ b/gcc/testsuite/gcc.c-torture/compile/pr84164.c

>> @@ -0,0 +1,17 @@

>> +/* PR target/84164.  */

>> +

>> +int b, d;

>> +unsigned long c;

>> +

>> +static inline __attribute__ ((always_inline)) void

>> +bar (int e)

>> +{

>> +  e += __builtin_sub_overflow_p (b, d, c);

>> +  c = c << ((short) ~e & 3) | c >> (-((short) ~e & 3) & 63);

>> +}

>> +

>> +int

>> +foo (void)

>> +{

>> +  bar (~1);

>> +}

>>
Richard Earnshaw (lists) Feb. 6, 2018, 11:32 a.m. | #3
On 02/02/18 15:43, Kyrill Tkachov wrote:
> Hi Richard,

> 

> On 02/02/18 15:25, Richard Earnshaw (lists) wrote:

>> On 02/02/18 15:10, Kyrill Tkachov wrote:

>>> Hi all,

>>>

>>> In this [8 Regression] PR we ICE because we can't recognise the insn:

>>> (insn 59 58 43 7 (set (reg:DI 124)

>>>          (rotatert:DI (reg:DI 125 [ c ])

>>>              (subreg:QI (and:SI (reg:SI 128)

>>>                      (const_int 65535 [0xffff])) 0)))

>>

>> Aren't we heading off down the wrong path here?

>>

>> (subreg:QI (and:SI (reg:SI 128) (const_int 65535 [0xffff])) 0))

>>

>> can be simplified to

>>

>> (subreg:QI (reg:SI 128) 0)

>>

>> since the AND operation is redundant as we're only looking at the bottom

>> 8 bits.

> 

> I have tried implementing such a transformation in combine [1]

> but it was not clear that it was universally beneficial.

> See the linked thread and PR 70119 for the discussion (the thread

> continues into the next month).


Is that really the same thing?  The example there was using a mask that
was narrower than the subreg and thus not redundant.  The case here is
where the mask is completely redundant because we are only looking at
the bottom 8 bits of the result (which are not changed by the AND
operation).

R.

> 

> This patch fixes the existing patterns, such as they are,

> with the explicit subreg matching and associated warts.

> 

> We can resurrect the combine simplification discussion at some point

> but for the GCC 8 it would be safer to fix the existing pattern.

> 

> Thanks,

> Kyrill

> 

> [1] https://gcc.gnu.org/ml/gcc/2016-02/msg00357.html

> 

>> R.

>>

>>> This was created by the *aarch64_reg_<mode>3_neg_mask2 splitter.

>>> The point of these splitters and patterns is to eliminate redundant

>>> masking of the shift (or rotate in this case) amount when shifting

>>> by a variable amount.  For example in AND x3, x3, 0xffff ; ROR x1,

>>> x2, x3

>>> we can eliminate the AND because the ROR instruction implicitly "MOD"s

>>> the shift amount in X3 by 64. So this pattern is supposed to match the

>>> following:

>>>

>>> (define_insn "*aarch64_<optab>_reg_di3_mask2"

>>>    [(set (match_operand:DI 0 "register_operand" "=r")

>>>      (SHIFT:DI

>>>        (match_operand:DI 1 "register_operand" "r")

>>>        (match_operator 4 "subreg_lowpart_operator"

>>>         [(and:SI (match_operand:SI 2 "register_operand" "r")

>>>              (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]

>>>

>>> The rotation amount mask is supposed to be operand 3 but the predicate

>>> for it here

>>> is aarch64_shift_imm_di which only allows values in [0, 63], whereas we

>>> want to allow

>>> any value that doesn't touch the bottom GET_MODE_BITSIZE (DImode) bits,

>>> which is what

>>> the pattern predicate tests. So the predicate on operands 3 is too

>>> strict.

>>>

>>> This patch just makes it const_int_operand since the pattern predicates

>>> has the correct

>>> validation for its values. This is in line with what the

>>> *aarch64_reg_<mode>3_neg_mask2

>>> and *aarch64_reg_<mode>3_minus_mask splitters accept (and they are the

>>> ones that generate

>>> this insn pattern).

>>>

>>> Bootstrapped and tested on aarch64-none-linux-gnu.

>>>

>>> Ok for trunk?

>>> Thanks,

>>> Kyrill

>>>

>>> 2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>>

>>>      PR target/84164

>>>      * config/aarch64/aarch64.md (*aarch64_<optab>_reg_di3_mask2):

>>>      Use const_int_operand predicate for operand 3.

>>>

>>> 2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>>

>>>      PR target/84164

>>>      * gcc.c-torture/compile/pr84164.c: New test.

>>>

>>> aarch64-mask.patch

>>>

>>>

>>> diff --git a/gcc/config/aarch64/aarch64.md

>>> b/gcc/config/aarch64/aarch64.md

>>> index

>>> 04b2d203fa168bebcf6f93a13e3bd67a5998935a..eef0d1a780dd1c886e1bebb9992c552fb9d5b57c

>>> 100644

>>> --- a/gcc/config/aarch64/aarch64.md

>>> +++ b/gcc/config/aarch64/aarch64.md

>>> @@ -4358,8 +4358,8 @@ (define_insn "*aarch64_<optab>_reg_di3_mask2"

>>>         (match_operand:DI 1 "register_operand" "r")

>>>         (match_operator 4 "subreg_lowpart_operator"

>>>          [(and:SI (match_operand:SI 2 "register_operand" "r")

>>> -             (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]

>>> -  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode)-1)) == 0)"

>>> +            (match_operand 3 "const_int_operand" "n"))])))]

>>> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode) - 1)) == 0)"

>>>   {

>>>     rtx xop[3];

>>>     xop[0] = operands[0];

>>> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr84164.c

>>> b/gcc/testsuite/gcc.c-torture/compile/pr84164.c

>>> new file mode 100644

>>> index

>>> 0000000000000000000000000000000000000000..d663fe5d6649e495f3e956e6a3bc938236a4bf91

>>>

>>> --- /dev/null

>>> +++ b/gcc/testsuite/gcc.c-torture/compile/pr84164.c

>>> @@ -0,0 +1,17 @@

>>> +/* PR target/84164.  */

>>> +

>>> +int b, d;

>>> +unsigned long c;

>>> +

>>> +static inline __attribute__ ((always_inline)) void

>>> +bar (int e)

>>> +{

>>> +  e += __builtin_sub_overflow_p (b, d, c);

>>> +  c = c << ((short) ~e & 3) | c >> (-((short) ~e & 3) & 63);

>>> +}

>>> +

>>> +int

>>> +foo (void)

>>> +{

>>> +  bar (~1);

>>> +}

>>>

>

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 04b2d203fa168bebcf6f93a13e3bd67a5998935a..eef0d1a780dd1c886e1bebb9992c552fb9d5b57c 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4358,8 +4358,8 @@  (define_insn "*aarch64_<optab>_reg_di3_mask2"
 	  (match_operand:DI 1 "register_operand" "r")
 	  (match_operator 4 "subreg_lowpart_operator"
 	   [(and:SI (match_operand:SI 2 "register_operand" "r")
-		     (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]
-  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode)-1)) == 0)"
+		    (match_operand 3 "const_int_operand" "n"))])))]
+  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode) - 1)) == 0)"
 {
   rtx xop[3];
   xop[0] = operands[0];
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr84164.c b/gcc/testsuite/gcc.c-torture/compile/pr84164.c
new file mode 100644
index 0000000000000000000000000000000000000000..d663fe5d6649e495f3e956e6a3bc938236a4bf91
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr84164.c
@@ -0,0 +1,17 @@ 
+/* PR target/84164.  */
+
+int b, d;
+unsigned long c;
+
+static inline __attribute__ ((always_inline)) void
+bar (int e)
+{
+  e += __builtin_sub_overflow_p (b, d, c);
+  c = c << ((short) ~e & 3) | c >> (-((short) ~e & 3) & 63);
+}
+
+int
+foo (void)
+{
+  bar (~1);
+}