[AArch64] Fix gcc.target/aarch64/subs_compare_[12].c

Message ID 5A674B5F.2000305@foss.arm.com
State New
Headers show
Series
  • [AArch64] Fix gcc.target/aarch64/subs_compare_[12].c
Related show

Commit Message

Kyrill Tkachov Jan. 23, 2018, 2:49 p.m.
Hi all,

This patch fixes the testsuite failures gcc.target/aarch64/subs_compare_1.c and subs_compare_2.c
The tests check that we combine a sequence like:
         sub     w2, w0, w1
         cmp     w0, w1

into
         subs    w2, w0, w1

This is done by a couple of peepholes in aarch64.md.

Unfortunately due to scheduling and other optimisations the SUB and CMP
can come in a different order:
         cmp     w0, w1
         sub     w0, w0, w1

And the existing peepholes cannot catch that and we fail to combine the two.
This patch adds a peephole that matches the CMP as the first insn and the SUB as the second
and outputs a SUBS.  This is almost equivalent to the existing peephole that matches SUB first and CMP second
except that it doesn't have the restriction that the output register of the SUB has to not be one of the input registers.
Remember "sub w0, w0, w1 ; cmp w0, w1" is *not* equivalent to: "subs  w0, w0, w1"
but "cmp w0, w1 ; sub w0, w0, w1" is.

So this is what this patch does. It adds a peephole for the case above and one for the SUB-immediate variant
(because the SUB-immediate is represented as PLUS-of-negated-immediate and thus has different RTL structure).

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?
Thanks,
Kyrill

2018-01-23  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/aarch64.md: Add peepholes for CMP + SUB -> SUBS
     and CMP + SUB-immediate -> SUBS.

Comments

James Greenhalgh Jan. 26, 2018, 5:09 p.m. | #1
On Tue, Jan 23, 2018 at 02:49:03PM +0000, Kyrill Tkachov wrote:
> Hi all,

> 

> This patch fixes the testsuite failures gcc.target/aarch64/subs_compare_1.c

> and subs_compare_2.c The tests check that we combine a sequence like:

>          sub     w2, w0, w1

>          cmp     w0, w1

> 

> into

>          subs    w2, w0, w1

> 

> This is done by a couple of peepholes in aarch64.md.

> 

> Unfortunately due to scheduling and other optimisations the SUB and CMP

> can come in a different order:

>          cmp     w0, w1

>          sub     w0, w0, w1

> 

> And the existing peepholes cannot catch that and we fail to combine the two.

> This patch adds a peephole that matches the CMP as the first insn and the SUB

> as the second and outputs a SUBS.  This is almost equivalent to the existing

> peephole that matches SUB first and CMP second except that it doesn't have

> the restriction that the output register of the SUB has to not be one of the

> input registers.  Remember "sub w0, w0, w1 ; cmp w0, w1" is *not* equivalent

> to: "subs  w0, w0, w1" but "cmp w0, w1 ; sub w0, w0, w1" is.

> 

> So this is what this patch does. It adds a peephole for the case above and

> one for the SUB-immediate variant (because the SUB-immediate is represented

> as PLUS-of-negated-immediate and thus has different RTL structure).

> 

> Bootstrapped and tested on aarch64-none-linux-gnu.

> 

> Ok for trunk?


OK.

Thanks,
James

> Thanks,

> Kyrill

> 

> 2018-01-23  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

> 

>      * config/aarch64/aarch64.md: Add peepholes for CMP + SUB -> SUBS

>      and CMP + SUB-immediate -> SUBS.

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index a6ecb391309494087416913c11f339cf10357977..49095f8f3d995907903c11b68cae25a919204a76 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2430,6 +2430,26 @@  (define_peephole2
   }
 )
 
+;; Same as the above peephole but with the compare and minus in
+;; swapped order.  The restriction on overlap between operand 0
+;; and operands 1 and 2 doesn't apply here.
+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (match_operand:GPI 1 "aarch64_reg_or_zero")
+	  (match_operand:GPI 2 "aarch64_reg_or_zero")))
+   (set (match_operand:GPI 0 "register_operand")
+	(minus:GPI (match_dup 1)
+		   (match_dup 2)))]
+  ""
+  [(const_int 0)]
+  {
+    emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1],
+					 operands[2]));
+    DONE;
+  }
+)
+
 (define_peephole2
   [(set (match_operand:GPI 0 "register_operand")
 	(plus:GPI (match_operand:GPI 1 "register_operand")
@@ -2448,6 +2468,26 @@  (define_peephole2
   }
 )
 
+;; Same as the above peephole but with the compare and minus in
+;; swapped order.  The restriction on overlap between operand 0
+;; and operands 1 doesn't apply here.
+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (match_operand:GPI 1 "register_operand")
+	  (match_operand:GPI 3 "const_int_operand")))
+   (set (match_operand:GPI 0 "register_operand")
+	(plus:GPI (match_dup 1)
+		  (match_operand:GPI 2 "aarch64_sub_immediate")))]
+  "INTVAL (operands[3]) == -INTVAL (operands[2])"
+  [(const_int 0)]
+  {
+    emit_insn (gen_sub<mode>3_compare1_imm (operands[0], operands[1],
+					 operands[2], operands[3]));
+    DONE;
+  }
+)
+
 (define_insn "*sub_<shift>_<mode>"
   [(set (match_operand:GPI 0 "register_operand" "=r")
 	(minus:GPI (match_operand:GPI 3 "register_operand" "r")