[committed] i386: Additional peephole2 to use flags from CMPXCHG more [PR96189]

Message ID CAFULd4bF1iOF-yJ_Qs_uj7awuA2BRy=a5UesCW7xbdmKN7Lpzg@mail.gmail.com
State New

Commit Message

Uros Bizjak via Gcc-patches July 16, 2020, 6:17 p.m.
The CMPXCHG instruction sets the ZF flag if the values in the destination
operand and the EAX register are equal; otherwise the ZF flag is cleared and
the value from the destination operand is loaded into EAX. The following assembly:

        xorl    %eax, %eax
        lock cmpxchgl   %edx, (%rdi)
        testl   %eax, %eax
        sete    %al

can be optimized by removing the unneeded comparison, since a set ZF flag
signals that EAX was not updated.  This patch adds a peephole2 pattern that
also handles XOR zeroing and a load of -1 by OR.
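
For illustration, here is a minimal sketch of the kind of C source that could
lead to a sequence like the one above when compiled with optimization.  It is
not the test added by this patch, and the function name cas_from_zero is
hypothetical.  It relies only on the documented
__atomic_compare_exchange_n builtin; the exact code a given compiler emits
may differ.

```c
#include <stdbool.h>

/* Hypothetical example, not taken from the patch or its test case.
   Try to replace *ptr with DESIRED if *ptr currently holds 0.  */
bool
cas_from_zero (unsigned int *ptr, unsigned int desired)
{
  unsigned int expected = 0;	/* Typically materialized as xorl %eax, %eax.  */

  /* Strong compare-and-exchange.  On failure the builtin stores the
     current value of *ptr into EXPECTED; on success EXPECTED is left
     unchanged.  */
  __atomic_compare_exchange_n (ptr, &expected, desired, false,
			       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);

  /* EXPECTED is still 0 exactly when the exchange succeeded, which is
     the condition CMPXCHG already reports in ZF.  Comparing the
     register against the original constant is therefore redundant.  */
  return expected == 0;
}
```

Here the result of the comparison against the original constant is the same
as the success status of the exchange, which is why the flags produced by
CMPXCHG can be used directly.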

2020-07-16  Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog:
    PR target/96189
    * config/i386/sync.md
    (peephole2 to remove unneeded compare after CMPXCHG):
    New pattern, also handle XOR zeroing and load of -1 by OR.

gcc/testsuite/ChangeLog:
    PR target/96189
    * gcc.target/i386/pr96189-1.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

Patch

diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index d203e9d1ecb..e22109039c1 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -629,6 +629,40 @@ 
 	      (set (reg:CCZ FLAGS_REG)
 		   (unspec_volatile:CCZ [(const_int 0)] UNSPECV_CMPXCHG))])])
 
+(define_peephole2
+  [(parallel [(set (match_operand:SWI48 0 "register_operand")
+		   (match_operand:SWI48 1 "const_int_operand"))
+	      (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_operand:SWI 2 "register_operand")
+		   (unspec_volatile:SWI
+		     [(match_operand:SWI 3 "memory_operand")
+		      (match_dup 2)
+		      (match_operand:SWI 4 "register_operand")
+		      (match_operand:SI 5 "const_int_operand")]
+		     UNSPECV_CMPXCHG))
+	      (set (match_dup 3)
+		   (unspec_volatile:SWI [(const_int 0)] UNSPECV_CMPXCHG))
+	      (set (reg:CCZ FLAGS_REG)
+		   (unspec_volatile:CCZ [(const_int 0)] UNSPECV_CMPXCHG))])
+   (set (reg:CCZ FLAGS_REG)
+	(compare:CCZ (match_dup 2)
+		     (match_dup 1)))]
+  "REGNO (operands[0]) == REGNO (operands[2])"
+  [(parallel [(set (match_dup 0)
+		   (match_dup 1))
+	      (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 2)
+		   (unspec_volatile:SWI
+		     [(match_dup 3)
+		      (match_dup 2)
+		      (match_dup 4)
+		      (match_dup 5)]
+		     UNSPECV_CMPXCHG))
+	      (set (match_dup 3)
+		   (unspec_volatile:SWI [(const_int 0)] UNSPECV_CMPXCHG))
+	      (set (reg:CCZ FLAGS_REG)
+		   (unspec_volatile:CCZ [(const_int 0)] UNSPECV_CMPXCHG))])])
+
 ;; For operand 2 nonmemory_operand predicate is used instead of
 ;; register_operand to allow combiner to better optimize atomic
 ;; additions of constants.