[04/43] i386: Emulate MMX plusminus/sat_plusminus with SSE

Message ID 20190209132352.1828-5-hjl.tools@gmail.com
State New
Series
  • V2: Emulate MMX intrinsics with SSE

Commit Message

H.J. Lu Feb. 9, 2019, 1:23 p.m.
Emulate MMX plusminus/sat_plusminus with SSE.  Only SSE register source
operand is allowed.

2019-02-08  H.J. Lu  <hongjiu.lu@intel.com>
	    Uros Bizjak  <ubizjak@gmail.com>

	PR target/89021
	* config/i386/mmx.md (MMXMODEI8): Require TARGET_SSE2 for V1DI.
	(<plusminus_insn><mode>3): New.
	(*mmx_<plusminus_insn><mode>3): Changed to define_insn_and_split
	to support SSE emulation.
	(*mmx_<plusminus_insn><mode>3): Likewise.
	(mmx_<plusminus_insn><mode>3): Also allow TARGET_MMX_WITH_SSE.
---
 gcc/config/i386/mmx.md | 51 +++++++++++++++++++++++++++++-------------
 1 file changed, 35 insertions(+), 16 deletions(-)

-- 
2.20.1

Comments

Uros Bizjak Feb. 9, 2019, 2:19 p.m. | #1
On 2/9/19, H.J. Lu <hjl.tools@gmail.com> wrote:
> Emulate MMX plusminus/sat_plusminus with SSE.  Only SSE register source
> operand is allowed.
>
> 2019-02-08  H.J. Lu  <hongjiu.lu@intel.com>
> 	    Uros Bizjak  <ubizjak@gmail.com>
>
> 	PR target/89021
> 	* config/i386/mmx.md (MMXMODEI8): Require TARGET_SSE2 for V1DI.
> 	(<plusminus_insn><mode>3): New.
> 	(*mmx_<plusminus_insn><mode>3): Changed to define_insn_and_split
> 	to support SSE emulation.
> 	(*mmx_<plusminus_insn><mode>3): Likewise.
> 	(mmx_<plusminus_insn><mode>3): Also allow TARGET_MMX_WITH_SSE.
> ---
>  gcc/config/i386/mmx.md | 51 +++++++++++++++++++++++++++++-------------
>  1 file changed, 35 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 1d5ed83e7b2..01a71aa128b 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -45,7 +45,7 @@
>
>  ;; 8 byte integral modes handled by MMX (and by extension, SSE)
>  (define_mode_iterator MMXMODEI [V8QI V4HI V2SI])
> -(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI V1DI])
> +(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI (V1DI "TARGET_SSE2")])
>
>  ;; All 8-byte vector modes handled by MMX
>  (define_mode_iterator MMXMODE [V8QI V4HI V2SI V1DI V2SF])
> @@ -698,34 +698,53 @@
>    "TARGET_MMX || (TARGET_SSE2 && <MODE>mode == V1DImode)"
>    "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
>
> +(define_expand "<plusminus_insn><mode>3"
> +  [(set (match_operand:MMXMODEI 0 "register_operand")
> +	(plusminus:MMXMODEI
> +	  (match_operand:MMXMODEI 1 "nonimmediate_operand")
> +	  (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
> +  "TARGET_MMX_WITH_SSE"
> +  "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
> +
>  (define_insn "*mmx_<plusminus_insn><mode>3"
> -  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y,Yx,Yy")
>          (plusminus:MMXMODEI8
> -	  (match_operand:MMXMODEI8 1 "nonimmediate_operand" "<comm>0")
> -	  (match_operand:MMXMODEI8 2 "nonimmediate_operand" "ym")))]
> -  "(TARGET_MMX || (TARGET_SSE2 && <MODE>mode == V1DImode))
> +	  (match_operand:MMXMODEI8 1 "nonimmediate_operand" "<comm>0,0,Yy")
> +	  (match_operand:MMXMODEI8 2 "nonimmediate_operand" "ym,Yx,Yy")))]
> +  "(TARGET_MMX
> +    || TARGET_MMX_WITH_SSE
> +    || (TARGET_SSE2 && <MODE>mode == V1DImode))

You don't need the V1DImode bypass. It was wrong even before the patch and
would break for -msse2 -mno-mmx, since the pattern uses MMX registers.

On a related note, all SSE2 MMX patterns (also in sse.md) should
depend on TARGET_MMX, since they currently use MMX registers. Before
your patch series, this didn't trigger problems since 8-byte vector
modes were rarely used, but with the new autovectorizer opportunities,
some of these problems can and will trigger. Also note that we
currently enable MMX for SSE2 builtins to mitigate this problem.

Uros.

>     && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
> -  "p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxadd")
> -   (set_attr "mode" "DI")])
> +  "@
> +   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
> +   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
> +   vp<plusminus_mnemonic><mmxvecsize>\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxadd,sseadd,sseadd")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_<plusminus_insn><mode>3"
>    [(set (match_operand:MMXMODE12 0 "register_operand")
>  	(sat_plusminus:MMXMODE12
>  	  (match_operand:MMXMODE12 1 "nonimmediate_operand")
>  	  (match_operand:MMXMODE12 2 "nonimmediate_operand")))]
> -  "TARGET_MMX"
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
>    "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
>
>  (define_insn "*mmx_<plusminus_insn><mode>3"
> -  [(set (match_operand:MMXMODE12 0 "register_operand" "=y")
> +  [(set (match_operand:MMXMODE12 0 "register_operand" "=y,Yx,Yy")
>          (sat_plusminus:MMXMODE12
> -	  (match_operand:MMXMODE12 1 "nonimmediate_operand" "<comm>0")
> -	  (match_operand:MMXMODE12 2 "nonimmediate_operand" "ym")))]
> -  "TARGET_MMX && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
> -  "p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "mmxadd")
> -   (set_attr "mode" "DI")])
> +	  (match_operand:MMXMODE12 1 "nonimmediate_operand" "<comm>0,0,Yy")
> +	  (match_operand:MMXMODE12 2 "nonimmediate_operand" "ym,Yx,Yy")))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
> +  "@
> +   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
> +   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
> +   vp<plusminus_mnemonic><mmxvecsize>\t{%2, %1, %0|%0, %1, %2}"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxadd,sseadd,sseadd")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_expand "mmx_mulv4hi3"
>    [(set (match_operand:V4HI 0 "register_operand")
> --
> 2.20.1
>
>
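
The review comment above mentions that 8-byte vector modes can now be reached
by ordinary code and by the autovectorizer, not only through MMX intrinsics.
A minimal C sketch of such code follows, assuming the GCC vector extension;
the names v4hi, add_v4hi, sub_v4hi and add4 are hypothetical and do not come
from the patch. Which instruction the compiler actually emits for these
functions depends on the GCC version and on options such as -msse2 and
-mno-mmx, so no particular code generation is claimed here.

```c
#include <stdint.h>

/* Hypothetical 8-byte vector type with four 16-bit lanes.  Unsigned lanes
   are used so that wraparound is well defined in C.  */
typedef uint16_t v4hi __attribute__ ((vector_size (8)));

/* Lane-wise wrapping add and subtract through the GCC vector extension.
   These are the kind of operations a standard-named add/sub expander for
   an 8-byte integer vector mode is meant to serve.  */
v4hi
add_v4hi (v4hi a, v4hi b)
{
  return a + b;
}

v4hi
sub_v4hi (v4hi a, v4hi b)
{
  return a - b;
}

/* A four-element loop that an autovectorizer may treat as one 8-byte
   vector add (assumption: this depends on compiler and options).  */
void
add4 (uint16_t *restrict dst, const uint16_t *restrict a,
      const uint16_t *restrict b)
{
  for (int i = 0; i < 4; i++)
    dst[i] = (uint16_t) (a[i] + b[i]);
}
```

Comparing the assembly for this file with and without -mno-mmx is one way to
check which register class a given compiler build picks for the 8-byte
operations.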

Patch

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 1d5ed83e7b2..01a71aa128b 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -45,7 +45,7 @@ 
 
 ;; 8 byte integral modes handled by MMX (and by extension, SSE)
 (define_mode_iterator MMXMODEI [V8QI V4HI V2SI])
-(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI V1DI])
+(define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI (V1DI "TARGET_SSE2")])
 
 ;; All 8-byte vector modes handled by MMX
 (define_mode_iterator MMXMODE [V8QI V4HI V2SI V1DI V2SF])
@@ -698,34 +698,53 @@ 
   "TARGET_MMX || (TARGET_SSE2 && <MODE>mode == V1DImode)"
   "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
 
+(define_expand "<plusminus_insn><mode>3"
+  [(set (match_operand:MMXMODEI 0 "register_operand")
+	(plusminus:MMXMODEI
+	  (match_operand:MMXMODEI 1 "nonimmediate_operand")
+	  (match_operand:MMXMODEI 2 "nonimmediate_operand")))]
+  "TARGET_MMX_WITH_SSE"
+  "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
+
 (define_insn "*mmx_<plusminus_insn><mode>3"
-  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODEI8 0 "register_operand" "=y,Yx,Yy")
         (plusminus:MMXMODEI8
-	  (match_operand:MMXMODEI8 1 "nonimmediate_operand" "<comm>0")
-	  (match_operand:MMXMODEI8 2 "nonimmediate_operand" "ym")))]
-  "(TARGET_MMX || (TARGET_SSE2 && <MODE>mode == V1DImode))
+	  (match_operand:MMXMODEI8 1 "nonimmediate_operand" "<comm>0,0,Yy")
+	  (match_operand:MMXMODEI8 2 "nonimmediate_operand" "ym,Yx,Yy")))]
+  "(TARGET_MMX
+    || TARGET_MMX_WITH_SSE
+    || (TARGET_SSE2 && <MODE>mode == V1DImode))
    && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
-  "p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "mode" "DI")])
+  "@
+   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
+   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
+   vp<plusminus_mnemonic><mmxvecsize>\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxadd,sseadd,sseadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_<plusminus_insn><mode>3"
   [(set (match_operand:MMXMODE12 0 "register_operand")
 	(sat_plusminus:MMXMODE12
 	  (match_operand:MMXMODE12 1 "nonimmediate_operand")
 	  (match_operand:MMXMODE12 2 "nonimmediate_operand")))]
-  "TARGET_MMX"
+  "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
 
 (define_insn "*mmx_<plusminus_insn><mode>3"
-  [(set (match_operand:MMXMODE12 0 "register_operand" "=y")
+  [(set (match_operand:MMXMODE12 0 "register_operand" "=y,Yx,Yy")
         (sat_plusminus:MMXMODE12
-	  (match_operand:MMXMODE12 1 "nonimmediate_operand" "<comm>0")
-	  (match_operand:MMXMODE12 2 "nonimmediate_operand" "ym")))]
-  "TARGET_MMX && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
-  "p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}"
-  [(set_attr "type" "mmxadd")
-   (set_attr "mode" "DI")])
+	  (match_operand:MMXMODE12 1 "nonimmediate_operand" "<comm>0,0,Yy")
+	  (match_operand:MMXMODE12 2 "nonimmediate_operand" "ym,Yx,Yy")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
+  "@
+   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
+   p<plusminus_mnemonic><mmxvecsize>\t{%2, %0|%0, %2}
+   vp<plusminus_mnemonic><mmxvecsize>\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
+   (set_attr "type" "mmxadd,sseadd,sseadd")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_mulv4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
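
The two insn patterns touched by this patch differ only in the arithmetic
they describe: plusminus covers wrapping add and subtract, while
sat_plusminus covers the signed and unsigned saturating variants. The scalar
C sketch below models one 16-bit lane of each, as a reference for what the
emitted padd/psub instructions compute per element. The function names are
hypothetical and are not part of GCC or of the patch.

```c
#include <stdint.h>

/* Wrapping 16-bit add: the per-lane arithmetic of paddw.  */
uint16_t
lane_add_wrap (uint16_t a, uint16_t b)
{
  return (uint16_t) (a + b);
}

/* Signed saturating 16-bit add: the per-lane arithmetic of paddsw.
   The sum is computed in 32 bits and clamped to the int16_t range.  */
int16_t
lane_add_ss (int16_t a, int16_t b)
{
  int32_t s = (int32_t) a + (int32_t) b;
  if (s > INT16_MAX)
    return INT16_MAX;
  if (s < INT16_MIN)
    return INT16_MIN;
  return (int16_t) s;
}

/* Unsigned saturating 16-bit add: the per-lane arithmetic of paddusw.  */
uint16_t
lane_add_us (uint16_t a, uint16_t b)
{
  uint32_t s = (uint32_t) a + (uint32_t) b;
  return s > UINT16_MAX ? UINT16_MAX : (uint16_t) s;
}

/* Unsigned saturating 16-bit subtract: the per-lane arithmetic of
   psubusw.  Results below zero clamp to zero.  */
uint16_t
lane_sub_us (uint16_t a, uint16_t b)
{
  return a > b ? (uint16_t) (a - b) : 0;
}
```

Since the SSE forms of these instructions compute the same per-lane results
as the MMX forms, only on wider registers, a scalar model like this can serve
as an oracle when testing either register alternative.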