x86: Correctly optimize EVEX to 128-bit VEX/EVEX

Message ID 20190316224846.7527-1-hjl.tools@gmail.com
State New
Headers show
Series
  • x86: Correctly optimize EVEX to 128-bit VEX/EVEX
Related show

Commit Message

H.J. Lu March 16, 2019, 10:48 p.m.
We can optimize 512-bit EVEX to 128-bit EVEX encoding for upper 16
vector registers only when AVX512VL is enabled.  We can't optimize
EVEX to 128-bit VEX encoding when AVX isn't enabled.

	PR gas/24352
	* config/tc-i386.c (optimize_encoding): Encode 512-bit EVEX
	with 128-bit VEX encoding only when AVX is enabled and with
	128-bit EVEX encoding only when AVX512VL is enabled.
	* testsuite/gas/i386/i386.exp: Run PR gas/24352 tests.
	* testsuite/gas/i386/optimize-6.s: New file.
	* testsuite/gas/i386/optimize-6a.d: Likewise.
	* testsuite/gas/i386/optimize-6b.d: Likewise.
	* testsuite/gas/i386/optimize-6c.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-7.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-7a.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-7b.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-7c.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-2.d: Updated.
---
 gas/config/tc-i386.c                        | 18 ++++--
 gas/testsuite/gas/i386/i386.exp             |  6 ++
 gas/testsuite/gas/i386/optimize-6.s         | 46 +++++++++++++++
 gas/testsuite/gas/i386/optimize-6a.d        | 40 +++++++++++++
 gas/testsuite/gas/i386/optimize-6b.d        | 40 +++++++++++++
 gas/testsuite/gas/i386/optimize-6c.d        | 40 +++++++++++++
 gas/testsuite/gas/i386/x86-64-optimize-2.d  | 48 ++++++++--------
 gas/testsuite/gas/i386/x86-64-optimize-7.s  | 64 +++++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-optimize-7a.d | 60 +++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-optimize-7b.d | 60 +++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-optimize-7c.d | 60 +++++++++++++++++++
 11 files changed, 453 insertions(+), 29 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/optimize-6.s
 create mode 100644 gas/testsuite/gas/i386/optimize-6a.d
 create mode 100644 gas/testsuite/gas/i386/optimize-6b.d
 create mode 100644 gas/testsuite/gas/i386/optimize-6c.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-7.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-7a.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-7b.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-7c.d

-- 
2.20.1

Comments

Jan Beulich March 18, 2019, 11:31 a.m. | #1
>>> On 16.03.19 at 23:48, <hjl.tools@gmail.com> wrote:

> We can optimize 512-bit EVEX to 128-bit EVEX encoding for upper 16

> vector registers only when AVX512VL is enabled.  We can't optimize

> EVEX to 128-bit VEX encoding when AVX isn't enabled.


I don't understand the last sentence: AVX is a prereq to anything
that's EVEX-encoded, at least as of now. "-march=+noavx" should
really result in all of AVX512 to also get disabled.

> --- a/gas/config/tc-i386.c

> +++ b/gas/config/tc-i386.c

> @@ -3975,10 +3975,13 @@ optimize_encoding (void)

>  		   && !i.rounding

>  		   && is_evex_encoding (&i.tm)

>  		   && (i.vec_encoding != vex_encoding_evex

> +		       || cpu_arch_flags.bitfield.cpuavx

> +		       || cpu_arch_isa_flags.bitfield.cpuavx

> +		       || cpu_arch_flags.bitfield.cpuavx512vl

> +		       || cpu_arch_isa_flags.bitfield.cpuavx512vl


cpu_arch_flags starts out with (almost) all bits set. It was for that
reason that ...

>  		       || i.tm.cpu_flags.bitfield.cpuavx512vl

>  		       || (i.tm.operand_types[2].bitfield.zmmword

> -			   && i.types[2].bitfield.ymmword)

> -		       || cpu_arch_isa_flags.bitfield.cpuavx512vl)))

> +			   && i.types[2].bitfield.ymmword))))


... originally only cpu_arch_isa_flags got checked here.

Jan
H.J. Lu March 19, 2019, 2:51 a.m. | #2
On Mon, Mar 18, 2019 at 7:31 PM Jan Beulich <JBeulich@suse.com> wrote:
>

> >>> On 16.03.19 at 23:48, <hjl.tools@gmail.com> wrote:

> > We can optimize 512-bit EVEX to 128-bit EVEX encoding for upper 16

> > vector registers only when AVX512VL is enabled.  We can't optimize

> > EVEX to 128-bit VEX encoding when AVX isn't enabled.

>

> I don't understand the last sentence: AVX is a prereq to anything

> that's EVEX-encoded, at least as of now. "-march=+noavx" should

> really result in all of AVX512 to also get disabled.


I opened:

https://sourceware.org/bugzilla/show_bug.cgi?id=24359

and enclosed a patch here.

> > --- a/gas/config/tc-i386.c

> > +++ b/gas/config/tc-i386.c

> > @@ -3975,10 +3975,13 @@ optimize_encoding (void)

> >                  && !i.rounding

> >                  && is_evex_encoding (&i.tm)

> >                  && (i.vec_encoding != vex_encoding_evex

> > +                    || cpu_arch_flags.bitfield.cpuavx

> > +                    || cpu_arch_isa_flags.bitfield.cpuavx

> > +                    || cpu_arch_flags.bitfield.cpuavx512vl

> > +                    || cpu_arch_isa_flags.bitfield.cpuavx512vl

>

> cpu_arch_flags starts out with (almost) all bits set. It was for that

> reason that ...

>

> >                      || i.tm.cpu_flags.bitfield.cpuavx512vl

> >                      || (i.tm.operand_types[2].bitfield.zmmword

> > -                        && i.types[2].bitfield.ymmword)

> > -                    || cpu_arch_isa_flags.bitfield.cpuavx512vl)))

> > +                        && i.types[2].bitfield.ymmword))))

>

> ... originally only cpu_arch_isa_flags got checked here.

>


I am looking into it.

--
H.J.
Jan Beulich March 19, 2019, 8:16 a.m. | #3
>>> On 19.03.19 at 03:51, <hjl.tools@gmail.com> wrote:

> On Mon, Mar 18, 2019 at 7:31 PM Jan Beulich <JBeulich@suse.com> wrote:

>>

>> >>> On 16.03.19 at 23:48, <hjl.tools@gmail.com> wrote:

>> > We can optimize 512-bit EVEX to 128-bit EVEX encoding for upper 16

>> > vector registers only when AVX512VL is enabled.  We can't optimize

>> > EVEX to 128-bit VEX encoding when AVX isn't enabled.

>>

>> I don't understand the last sentence: AVX is a prereq to anything

>> that's EVEX-encoded, at least as of now. "-march=+noavx" should

>> really result in all of AVX512 to also get disabled.

> 

> I opened:

> 

> https://sourceware.org/bugzilla/show_bug.cgi?id=24359 

> 

> and enclosed a patch here.


Looks good for the specific purpose, but I think there's further
cleanup potential then since you had committed the earlier
patch already: The cpuavx conditionals become pointless when
we already know the encoding is vex_encoding_evex or
is_evex_encoding() has returned true.

Jan

Patch

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 1b1b0a95da..959fda2b84 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -3975,10 +3975,13 @@  optimize_encoding (void)
 		   && !i.rounding
 		   && is_evex_encoding (&i.tm)
 		   && (i.vec_encoding != vex_encoding_evex
+		       || cpu_arch_flags.bitfield.cpuavx
+		       || cpu_arch_isa_flags.bitfield.cpuavx
+		       || cpu_arch_flags.bitfield.cpuavx512vl
+		       || cpu_arch_isa_flags.bitfield.cpuavx512vl
 		       || i.tm.cpu_flags.bitfield.cpuavx512vl
 		       || (i.tm.operand_types[2].bitfield.zmmword
-			   && i.types[2].bitfield.ymmword)
-		       || cpu_arch_isa_flags.bitfield.cpuavx512vl)))
+			   && i.types[2].bitfield.ymmword))))
 	   && ((i.tm.base_opcode == 0x55
 		|| i.tm.base_opcode == 0x6655
 		|| i.tm.base_opcode == 0x66df
@@ -4032,14 +4035,19 @@  optimize_encoding (void)
        */
       if (is_evex_encoding (&i.tm))
 	{
-	  if (i.vec_encoding == vex_encoding_evex)
-	    i.tm.opcode_modifier.evex = EVEX128;
-	  else
+	  if (i.vec_encoding != vex_encoding_evex
+	      && (cpu_arch_flags.bitfield.cpuavx
+		  || cpu_arch_isa_flags.bitfield.cpuavx))
 	    {
 	      i.tm.opcode_modifier.vex = VEX128;
 	      i.tm.opcode_modifier.vexw = VEXW0;
 	      i.tm.opcode_modifier.evex = 0;
 	    }
+	  else if (cpu_arch_flags.bitfield.cpuavx512vl
+		   || cpu_arch_isa_flags.bitfield.cpuavx512vl)
+	    i.tm.opcode_modifier.evex = EVEX128;
+	  else
+	    return;
 	}
       else if (i.tm.operand_types[0].bitfield.regmask)
 	{
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 22ee134fae..61a13eaf9e 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -472,6 +472,9 @@  if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]
     run_dump_test "optimize-3"
     run_dump_test "optimize-4"
     run_dump_test "optimize-5"
+    run_dump_test "optimize-6a"
+    run_dump_test "optimize-6b"
+    run_dump_test "optimize-6c"
 
     # These tests require support for 8 and 16 bit relocs,
     # so we only run them for ELF and COFF targets.
@@ -982,6 +985,9 @@  if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
     run_dump_test "x86-64-optimize-4"
     run_dump_test "x86-64-optimize-5"
     run_dump_test "x86-64-optimize-6"
+    run_dump_test "x86-64-optimize-7a"
+    run_dump_test "x86-64-optimize-7b"
+    run_dump_test "x86-64-optimize-7c"
 
     if { ![istarget "*-*-aix*"]
       && ![istarget "*-*-beos*"]
diff --git a/gas/testsuite/gas/i386/optimize-6.s b/gas/testsuite/gas/i386/optimize-6.s
new file mode 100644
index 0000000000..103200aa98
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-6.s
@@ -0,0 +1,46 @@ 
+# Check instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	vandnpd %zmm1, %zmm1, %zmm5{%k7}
+	vandnpd %zmm1, %zmm1, %zmm5
+
+	vandnps %zmm1, %zmm1, %zmm5{%k7}
+	vandnps %zmm1, %zmm1, %zmm5
+
+	vpandnd %zmm1, %zmm1, %zmm5{%k7}
+	vpandnd %zmm1, %zmm1, %zmm5
+
+	vpandnq %zmm1, %zmm1, %zmm5{%k7}
+	vpandnq %zmm1, %zmm1, %zmm5
+
+	vxorpd %zmm1, %zmm1, %zmm5{%k7}
+	vxorpd %zmm1, %zmm1, %zmm5
+
+	vxorps %zmm1, %zmm1, %zmm5{%k7}
+	vxorps %zmm1, %zmm1, %zmm5
+
+	vpxord %zmm1, %zmm1, %zmm5{%k7}
+	vpxord %zmm1, %zmm1, %zmm5
+
+	vpxorq %zmm1, %zmm1, %zmm5{%k7}
+	vpxorq %zmm1, %zmm1, %zmm5
+
+	vpsubb %zmm1, %zmm1, %zmm5{%k7}
+	vpsubb %zmm1, %zmm1, %zmm5
+
+	vpsubw %zmm1, %zmm1, %zmm5{%k7}
+	vpsubw %zmm1, %zmm1, %zmm5
+
+	vpsubd %zmm1, %zmm1, %zmm5{%k7}
+	vpsubd %zmm1, %zmm1, %zmm5
+
+	vpsubq %zmm1, %zmm1, %zmm5{%k7}
+	vpsubq %zmm1, %zmm1, %zmm5
+
+	kxord %k1, %k1, %k5
+	kxorq %k1, %k1, %k5
+
+	kandnd %k1, %k1, %k5
+	kandnq %k1, %k1, %k5
diff --git a/gas/testsuite/gas/i386/optimize-6a.d b/gas/testsuite/gas/i386/optimize-6a.d
new file mode 100644
index 0000000000..d852f8663a
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-6a.d
@@ -0,0 +1,40 @@ 
+#source: optimize-6.s
+#as: -O2 -march=+noavx
+#objdump: -drw
+#name: optimized encoding 6a with -O2
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 f1 f5 4f 55 e9    	vandnpd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 08 55 e9    	vandnpd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 74 4f 55 e9    	vandnps %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 74 08 55 e9    	vandnps %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f df e9    	vpandnd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 08 df e9    	vpandnd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f df e9    	vpandnq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 08 df e9    	vpandnq %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f 57 e9    	vxorpd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 08 57 e9    	vxorpd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 74 4f 57 e9    	vxorps %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 74 08 57 e9    	vxorps %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f ef e9    	vpxord %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 08 ef e9    	vpxord %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f ef e9    	vpxorq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 08 ef e9    	vpxorq %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f f8 e9    	vpsubb %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 08 f8 e9    	vpsubb %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f f9 e9    	vpsubw %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 08 f9 e9    	vpsubw %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f fa e9    	vpsubd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 08 fa e9    	vpsubd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f fb e9    	vpsubq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 08 fb e9    	vpsubq %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f4 47 e9          	kxorw  %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 47 e9          	kxorw  %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 42 e9          	kandnw %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 42 e9          	kandnw %k1,%k1,%k5
+#pass
diff --git a/gas/testsuite/gas/i386/optimize-6b.d b/gas/testsuite/gas/i386/optimize-6b.d
new file mode 100644
index 0000000000..0ad37b1ac5
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-6b.d
@@ -0,0 +1,40 @@ 
+#source: optimize-6.s
+#as: -O2 -march=+noavx512vl
+#objdump: -drw
+#name: optimized encoding 6b with -O2
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 f1 f5 4f 55 e9    	vandnpd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 55 e9          	vandnpd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 74 4f 55 e9    	vandnps %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f0 55 e9          	vandnps %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f df e9    	vpandnd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 df e9          	vpandn %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f df e9    	vpandnq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 df e9          	vpandn %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f 57 e9    	vxorpd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 57 e9          	vxorpd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 74 4f 57 e9    	vxorps %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f0 57 e9          	vxorps %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f ef e9    	vpxord %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 ef e9          	vpxor  %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f ef e9    	vpxorq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 ef e9          	vpxor  %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f f8 e9    	vpsubb %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 f8 e9          	vpsubb %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f f9 e9    	vpsubw %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 f9 e9          	vpsubw %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f fa e9    	vpsubd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 fa e9          	vpsubd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f fb e9    	vpsubq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	c5 f1 fb e9          	vpsubq %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f4 47 e9          	kxorw  %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 47 e9          	kxorw  %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 42 e9          	kandnw %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 42 e9          	kandnw %k1,%k1,%k5
+#pass
diff --git a/gas/testsuite/gas/i386/optimize-6c.d b/gas/testsuite/gas/i386/optimize-6c.d
new file mode 100644
index 0000000000..4c70f03d1f
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-6c.d
@@ -0,0 +1,40 @@ 
+#source: optimize-6.s
+#as: -O2 -march=+noavx+noavx512vl
+#objdump: -drw
+#name: optimized encoding 6c with -O2
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 f1 f5 4f 55 e9    	vandnpd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 48 55 e9    	vandnpd %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 74 4f 55 e9    	vandnps %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 74 48 55 e9    	vandnps %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 75 4f df e9    	vpandnd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 48 df e9    	vpandnd %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 f5 4f df e9    	vpandnq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 48 df e9    	vpandnq %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 f5 4f 57 e9    	vxorpd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 48 57 e9    	vxorpd %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 74 4f 57 e9    	vxorps %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 74 48 57 e9    	vxorps %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 75 4f ef e9    	vpxord %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 48 ef e9    	vpxord %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 f5 4f ef e9    	vpxorq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 48 ef e9    	vpxorq %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 75 4f f8 e9    	vpsubb %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 48 f8 e9    	vpsubb %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 75 4f f9 e9    	vpsubw %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 48 f9 e9    	vpsubw %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 75 4f fa e9    	vpsubd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 48 fa e9    	vpsubd %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	62 f1 f5 4f fb e9    	vpsubq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 48 fb e9    	vpsubq %zmm1,%zmm1,%zmm5
+ +[a-f0-9]+:	c5 f4 47 e9          	kxorw  %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 47 e9          	kxorw  %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 42 e9          	kandnw %k1,%k1,%k5
+ +[a-f0-9]+:	c5 f4 42 e9          	kandnw %k1,%k1,%k5
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-2.d b/gas/testsuite/gas/i386/x86-64-optimize-2.d
index f374619d4a..fa031e8893 100644
--- a/gas/testsuite/gas/i386/x86-64-optimize-2.d
+++ b/gas/testsuite/gas/i386/x86-64-optimize-2.d
@@ -12,98 +12,98 @@  Disassembly of section .text:
  +[a-f0-9]+:	c5 71 55 f9          	vandnpd %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 55 f9          	vandnpd %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 55 f9          	vandnpd %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 f5 48 55 c1    	vandnpd %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 f5 08 55 c1    	vandnpd %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 f5 40 55 c9    	vandnpd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 f5 08 55 c1    	vandnpd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 55 c9    	vandnpd %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 f5 00 55 c9    	vandnpd %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 74 4f 55 f9    	vandnps %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 70 55 f9          	vandnps %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 70 55 f9          	vandnps %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 70 55 f9          	vandnps %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 74 48 55 c1    	vandnps %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 74 08 55 c1    	vandnps %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 74 40 55 c9    	vandnps %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 74 08 55 c1    	vandnps %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 74 00 55 c9    	vandnps %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 74 00 55 c9    	vandnps %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	62 71 75 4f df f9    	vpandnd %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 75 48 df c1    	vpandnd %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 75 08 df c1    	vpandnd %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 75 40 df c9    	vpandnd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 75 08 df c1    	vpandnd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 df c9    	vpandnd %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 75 00 df c9    	vpandnd %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 f5 4f df f9    	vpandnq %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 f5 48 df c1    	vpandnq %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 f5 08 df c1    	vpandnq %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 f5 40 df c9    	vpandnq %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 f5 08 df c1    	vpandnq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 df c9    	vpandnq %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 f5 00 df c9    	vpandnq %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 f5 4f 57 f9    	vxorpd %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 57 f9          	vxorpd %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 57 f9          	vxorpd %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 57 f9          	vxorpd %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 f5 48 57 c1    	vxorpd %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 f5 08 57 c1    	vxorpd %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 f5 40 57 c9    	vxorpd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 f5 08 57 c1    	vxorpd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 57 c9    	vxorpd %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 f5 00 57 c9    	vxorpd %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 74 4f 57 f9    	vxorps %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 70 57 f9          	vxorps %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 70 57 f9          	vxorps %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 70 57 f9          	vxorps %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 74 48 57 c1    	vxorps %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 74 08 57 c1    	vxorps %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 74 40 57 c9    	vxorps %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 74 08 57 c1    	vxorps %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 74 00 57 c9    	vxorps %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 74 00 57 c9    	vxorps %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	62 71 75 4f ef f9    	vpxord %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 75 48 ef c1    	vpxord %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 75 08 ef c1    	vpxord %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 75 40 ef c9    	vpxord %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 75 08 ef c1    	vpxord %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 ef c9    	vpxord %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 75 00 ef c9    	vpxord %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 f5 4f ef f9    	vpxorq %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 f5 48 ef c1    	vpxorq %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 f5 08 ef c1    	vpxorq %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 f5 40 ef c9    	vpxorq %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 f5 08 ef c1    	vpxorq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 ef c9    	vpxorq %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 f5 00 ef c9    	vpxorq %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 75 4f f8 f9    	vpsubb %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 f8 f9          	vpsubb %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 f8 f9          	vpsubb %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 f8 f9          	vpsubb %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 75 48 f8 c1    	vpsubb %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 75 08 f8 c1    	vpsubb %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 75 40 f8 c9    	vpsubb %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 75 08 f8 c1    	vpsubb %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 f8 c9    	vpsubb %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 75 00 f8 c9    	vpsubb %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 75 4f f9 f9    	vpsubw %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 f9 f9          	vpsubw %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 f9 f9          	vpsubw %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 f9 f9          	vpsubw %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 75 48 f9 c1    	vpsubw %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 75 08 f9 c1    	vpsubw %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 75 40 f9 c9    	vpsubw %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 75 08 f9 c1    	vpsubw %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 f9 c9    	vpsubw %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 75 00 f9 c9    	vpsubw %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 75 4f fa f9    	vpsubd %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 fa f9          	vpsubd %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 fa f9          	vpsubd %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 fa f9          	vpsubd %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 75 48 fa c1    	vpsubd %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 75 08 fa c1    	vpsubd %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 75 40 fa c9    	vpsubd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 75 08 fa c1    	vpsubd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 fa c9    	vpsubd %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 75 00 fa c9    	vpsubd %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 71 f5 4f fb f9    	vpsubq %zmm1,%zmm1,%zmm15\{%k7\}
  +[a-f0-9]+:	c5 71 fb f9          	vpsubq %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 fb f9          	vpsubq %xmm1,%xmm1,%xmm15
  +[a-f0-9]+:	c5 71 fb f9          	vpsubq %xmm1,%xmm1,%xmm15
- +[a-f0-9]+:	62 e1 f5 48 fb c1    	vpsubq %zmm1,%zmm1,%zmm16
  +[a-f0-9]+:	62 e1 f5 08 fb c1    	vpsubq %xmm1,%xmm1,%xmm16
- +[a-f0-9]+:	62 b1 f5 40 fb c9    	vpsubq %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 e1 f5 08 fb c1    	vpsubq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 fb c9    	vpsubq %xmm17,%xmm17,%xmm1
  +[a-f0-9]+:	62 b1 f5 00 fb c9    	vpsubq %xmm17,%xmm17,%xmm1
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-7.s b/gas/testsuite/gas/i386/x86-64-optimize-7.s
new file mode 100644
index 0000000000..0112f47b03
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-7.s
@@ -0,0 +1,64 @@ 
+# Check 64bit instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	vandnpd %zmm1, %zmm1, %zmm15{%k7}
+	vandnpd %zmm1, %zmm1, %zmm15
+	vandnpd %zmm1, %zmm1, %zmm16
+	vandnpd %zmm17, %zmm17, %zmm1
+
+	vandnps %zmm1, %zmm1, %zmm15{%k7}
+	vandnps %zmm1, %zmm1, %zmm15
+	vandnps %zmm1, %zmm1, %zmm16
+	vandnps %zmm17, %zmm17, %zmm1
+
+	vpandnd %zmm1, %zmm1, %zmm15{%k7}
+	vpandnd %zmm1, %zmm1, %zmm15
+	vpandnd %zmm1, %zmm1, %zmm16
+	vpandnd %zmm17, %zmm17, %zmm1
+
+	vpandnq %zmm1, %zmm1, %zmm15{%k7}
+	vpandnq %zmm1, %zmm1, %zmm15
+	vpandnq %zmm1, %zmm1, %zmm16
+	vpandnq %zmm17, %zmm17, %zmm1
+
+	vxorpd %zmm1, %zmm1, %zmm15{%k7}
+	vxorpd %zmm1, %zmm1, %zmm15
+	vxorpd %zmm1, %zmm1, %zmm16
+	vxorpd %zmm17, %zmm17, %zmm1
+
+	vxorps %zmm1, %zmm1, %zmm15{%k7}
+	vxorps %zmm1, %zmm1, %zmm15
+	vxorps %zmm1, %zmm1, %zmm16
+	vxorps %zmm17, %zmm17, %zmm1
+
+	vpxord %zmm1, %zmm1, %zmm15{%k7}
+	vpxord %zmm1, %zmm1, %zmm15
+	vpxord %zmm1, %zmm1, %zmm16
+	vpxord %zmm17, %zmm17, %zmm1
+
+	vpxorq %zmm1, %zmm1, %zmm15{%k7}
+	vpxorq %zmm1, %zmm1, %zmm15
+	vpxorq %zmm1, %zmm1, %zmm16
+	vpxorq %zmm17, %zmm17, %zmm1
+
+	vpsubb %zmm1, %zmm1, %zmm15{%k7}
+	vpsubb %zmm1, %zmm1, %zmm15
+	vpsubb %zmm1, %zmm1, %zmm16
+	vpsubb %zmm17, %zmm17, %zmm1
+
+	vpsubw %zmm1, %zmm1, %zmm15{%k7}
+	vpsubw %zmm1, %zmm1, %zmm15
+	vpsubw %zmm1, %zmm1, %zmm16
+	vpsubw %zmm17, %zmm17, %zmm1
+
+	vpsubd %zmm1, %zmm1, %zmm15{%k7}
+	vpsubd %zmm1, %zmm1, %zmm15
+	vpsubd %zmm1, %zmm1, %zmm16
+	vpsubd %zmm17, %zmm17, %zmm1
+
+	vpsubq %zmm1, %zmm1, %zmm15{%k7}
+	vpsubq %zmm1, %zmm1, %zmm15
+	vpsubq %zmm1, %zmm1, %zmm16
+	vpsubq %zmm17, %zmm17, %zmm1
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-7a.d b/gas/testsuite/gas/i386/x86-64-optimize-7a.d
new file mode 100644
index 0000000000..4e48c061c1
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-7a.d
@@ -0,0 +1,60 @@ 
+#source: x86-64-optimize-7.s
+#as: -O2 -march=+noavx
+#objdump: -drw
+#name: x86-64 optimized encoding 7a with -O2
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 71 f5 4f 55 f9    	vandnpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 08 55 f9    	vandnpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 55 c1    	vandnpd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 55 c9    	vandnpd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 74 4f 55 f9    	vandnps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 74 08 55 f9    	vandnps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 74 08 55 c1    	vandnps %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 74 00 55 c9    	vandnps %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 75 4f df f9    	vpandnd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 08 df f9    	vpandnd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 08 df c1    	vpandnd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 df c9    	vpandnd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 f5 4f df f9    	vpandnq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 08 df f9    	vpandnq %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 df c1    	vpandnq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 df c9    	vpandnq %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 f5 4f 57 f9    	vxorpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 08 57 f9    	vxorpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 57 c1    	vxorpd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 57 c9    	vxorpd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 74 4f 57 f9    	vxorps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 74 08 57 f9    	vxorps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 74 08 57 c1    	vxorps %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 74 00 57 c9    	vxorps %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 75 4f ef f9    	vpxord %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 08 ef f9    	vpxord %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 08 ef c1    	vpxord %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 ef c9    	vpxord %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 f5 4f ef f9    	vpxorq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 08 ef f9    	vpxorq %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 ef c1    	vpxorq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 ef c9    	vpxorq %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 75 4f f8 f9    	vpsubb %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 08 f8 f9    	vpsubb %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 08 f8 c1    	vpsubb %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 f8 c9    	vpsubb %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 75 4f f9 f9    	vpsubw %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 08 f9 f9    	vpsubw %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 08 f9 c1    	vpsubw %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 f9 c9    	vpsubw %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 75 4f fa f9    	vpsubd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 08 fa f9    	vpsubd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 08 fa c1    	vpsubd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 fa c9    	vpsubd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 f5 4f fb f9    	vpsubq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 08 fb f9    	vpsubq %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 fb c1    	vpsubq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 fb c9    	vpsubq %xmm17,%xmm17,%xmm1
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-7b.d b/gas/testsuite/gas/i386/x86-64-optimize-7b.d
new file mode 100644
index 0000000000..f68711da8b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-7b.d
@@ -0,0 +1,60 @@ 
+#source: x86-64-optimize-7.s
+#as: -O2 -march=+noavx512vl
+#objdump: -drw
+#name: x86-64 optimized encoding 7b with -O2
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 71 f5 4f 55 f9    	vandnpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 55 f9          	vandnpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 48 55 c1    	vandnpd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 55 c9    	vandnpd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 74 4f 55 f9    	vandnps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 70 55 f9          	vandnps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 74 48 55 c1    	vandnps %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 74 40 55 c9    	vandnps %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f df f9    	vpandnd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 48 df c1    	vpandnd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 df c9    	vpandnd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 f5 4f df f9    	vpandnq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 48 df c1    	vpandnq %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 df c9    	vpandnq %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 f5 4f 57 f9    	vxorpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 57 f9          	vxorpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 48 57 c1    	vxorpd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 57 c9    	vxorpd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 74 4f 57 f9    	vxorps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 70 57 f9          	vxorps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 74 48 57 c1    	vxorps %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 74 40 57 c9    	vxorps %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f ef f9    	vpxord %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 48 ef c1    	vpxord %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 ef c9    	vpxord %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 f5 4f ef f9    	vpxorq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 48 ef c1    	vpxorq %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 ef c9    	vpxorq %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f f8 f9    	vpsubb %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 f8 f9          	vpsubb %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 48 f8 c1    	vpsubb %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 f8 c9    	vpsubb %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f f9 f9    	vpsubw %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 f9 f9          	vpsubw %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 48 f9 c1    	vpsubw %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 f9 c9    	vpsubw %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f fa f9    	vpsubd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 fa f9          	vpsubd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 48 fa c1    	vpsubd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 fa c9    	vpsubd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 f5 4f fb f9    	vpsubq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	c5 71 fb f9          	vpsubq %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 48 fb c1    	vpsubq %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 fb c9    	vpsubq %zmm17,%zmm17,%zmm1
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-7c.d b/gas/testsuite/gas/i386/x86-64-optimize-7c.d
new file mode 100644
index 0000000000..c22d264a57
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-7c.d
@@ -0,0 +1,60 @@ 
+#source: x86-64-optimize-7.s
+#as: -O2 -march=+noavx+noavx512vl
+#objdump: -drw
+#name: x86-64 optimized encoding 7c with -O2
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 71 f5 4f 55 f9    	vandnpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 48 55 f9    	vandnpd %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 f5 48 55 c1    	vandnpd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 55 c9    	vandnpd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 74 4f 55 f9    	vandnps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 74 48 55 f9    	vandnps %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 74 48 55 c1    	vandnps %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 74 40 55 c9    	vandnps %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f df f9    	vpandnd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 48 df f9    	vpandnd %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 75 48 df c1    	vpandnd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 df c9    	vpandnd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 f5 4f df f9    	vpandnq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 48 df f9    	vpandnq %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 f5 48 df c1    	vpandnq %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 df c9    	vpandnq %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 f5 4f 57 f9    	vxorpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 48 57 f9    	vxorpd %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 f5 48 57 c1    	vxorpd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 57 c9    	vxorpd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 74 4f 57 f9    	vxorps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 74 48 57 f9    	vxorps %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 74 48 57 c1    	vxorps %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 74 40 57 c9    	vxorps %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f ef f9    	vpxord %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 48 ef f9    	vpxord %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 75 48 ef c1    	vpxord %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 ef c9    	vpxord %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 f5 4f ef f9    	vpxorq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 48 ef f9    	vpxorq %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 f5 48 ef c1    	vpxorq %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 ef c9    	vpxorq %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f f8 f9    	vpsubb %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 48 f8 f9    	vpsubb %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 75 48 f8 c1    	vpsubb %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 f8 c9    	vpsubb %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f f9 f9    	vpsubw %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 48 f9 f9    	vpsubw %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 75 48 f9 c1    	vpsubw %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 f9 c9    	vpsubw %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 75 4f fa f9    	vpsubd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 48 fa f9    	vpsubd %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 75 48 fa c1    	vpsubd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 75 40 fa c9    	vpsubd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+:	62 71 f5 4f fb f9    	vpsubq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 48 fb f9    	vpsubq %zmm1,%zmm1,%zmm15
+ +[a-f0-9]+:	62 e1 f5 48 fb c1    	vpsubq %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+:	62 b1 f5 40 fb c9    	vpsubq %zmm17,%zmm17,%zmm1
+#pass