x86: drop SSE4a from SSE check again

Message ID 072a61bd-32a6-93a4-4663-90d3807ab89a@suse.com
State New
Headers show
Series
  • x86: drop SSE4a from SSE check again
Related show

Commit Message

Jan Beulich June 9, 2020, 12:46 p.m.
Upon re-consideration in commit 569d50f1c611 ("x86: further refine SSE
check (SSE4a, SHA, GFNI)") I went too far: Mixing of SSE and AVX insns
doesn't suffer as bad a penalty on AMD CPUs as on Intel ones. SSE4a
being an AMD-only extension, it shouldn't be part of the ISA extensions
set for which the diagnostic may get issued. Undo that part.

gas/
2020-06-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (md_assemble): Drop SSE4a from SSE check
	conditional.
	* testsuite/gas/i386/sse-check.s: Adjust comment.
	* testsuite/gas/i386/sse-check-error.l,
	testsuite/gas/i386/sse-check-warn.e,
	testsuite/gas/i386/x86-64-sse-check-error.l: Adjust
	expectations.

Comments

David Faust via Binutils June 9, 2020, 12:48 p.m. | #1
On Tue, Jun 9, 2020 at 5:46 AM Jan Beulich <jbeulich@suse.com> wrote:
>

> Upon re-consideration in commit 569d50f1c611 ("x86: further refine SSE

> check (SSE4a, SHA, GFNI)") I went too far: Mixing of SSE and AVX insns

> doesn't suffer as bad a penalty on AMD CPUs as on Intel ones. SSE4a

> being an AMD-only extension, it shouldn't be part of the ISA extensions

> set for which the diagnostic may get issued. Undo that part.

>

> gas/

> 2020-06-XX  Jan Beulich  <jbeulich@suse.com>

>

>         * config/tc-i386.c (md_assemble): Drop SSE4a from SSE check

>         conditional.

>         * testsuite/gas/i386/sse-check.s: Adjust comment.

>         * testsuite/gas/i386/sse-check-error.l,

>         testsuite/gas/i386/sse-check-warn.e,

>         testsuite/gas/i386/x86-64-sse-check-error.l: Adjust

>         expectations.

>


OK.

Thanks.


-- 
H.J.
Jan Beulich June 9, 2020, 12:52 p.m. | #2
On 09.06.2020 14:46, Jan Beulich wrote:
> Upon re-consideration in commit 569d50f1c611 ("x86: further refine SSE

> check (SSE4a, SHA, GFNI)") I went too far: Mixing of SSE and AVX insns

> doesn't suffer as bad a penalty on AMD CPUs as on Intel ones. SSE4a

> being an AMD-only extension, it shouldn't be part of the ISA extensions

> set for which the diagnostic may get issued. Undo that part.


I'd additionally like to note that documentation of -msse-check and
implemention are not consistent: The text says "for any SSE
instruction", yet NoAVX is also present on e.g. MOVQ2DQ and CVTPI2PS,
which clearly are SSE instructions. The question is which of the two
is wrong here.

Jan
David Faust via Binutils June 9, 2020, 12:56 p.m. | #3
On Tue, Jun 9, 2020 at 5:52 AM Jan Beulich <jbeulich@suse.com> wrote:
>

> On 09.06.2020 14:46, Jan Beulich wrote:

> > Upon re-consideration in commit 569d50f1c611 ("x86: further refine SSE

> > check (SSE4a, SHA, GFNI)") I went too far: Mixing of SSE and AVX insns

> > doesn't suffer as bad a penalty on AMD CPUs as on Intel ones. SSE4a

> > being an AMD-only extension, it shouldn't be part of the ISA extensions

> > set for which the diagnostic may get issued. Undo that part.

>

> I'd additionally like to note that documentation of -msse-check and

> implemention are not consistent: The text says "for any SSE

> instruction", yet NoAVX is also present on e.g. MOVQ2DQ and CVTPI2PS,

> which clearly are SSE instructions. The question is which of the two

> is wrong here.


-msse-check is intended to check for SSE instructions which can be encoded with
VEX.  We can update its documentation.

-- 
H.J.
Jan Beulich June 9, 2020, 1:03 p.m. | #4
On 09.06.2020 14:56, H.J. Lu wrote:
> On Tue, Jun 9, 2020 at 5:52 AM Jan Beulich <jbeulich@suse.com> wrote:

>> On 09.06.2020 14:46, Jan Beulich wrote:

>>> Upon re-consideration in commit 569d50f1c611 ("x86: further refine SSE

>>> check (SSE4a, SHA, GFNI)") I went too far: Mixing of SSE and AVX insns

>>> doesn't suffer as bad a penalty on AMD CPUs as on Intel ones. SSE4a

>>> being an AMD-only extension, it shouldn't be part of the ISA extensions

>>> set for which the diagnostic may get issued. Undo that part.

>>

>> I'd additionally like to note that documentation of -msse-check and

>> implemention are not consistent: The text says "for any SSE

>> instruction", yet NoAVX is also present on e.g. MOVQ2DQ and CVTPI2PS,

>> which clearly are SSE instructions. The question is which of the two

>> is wrong here.

> 

> -msse-check is intended to check for SSE instructions which can be encoded with

> VEX.  We can update its documentation.


IOW it is _not_ intended to help spot insns that would better be
replaced, and in particular ones that gas can't be told to replace
via -msse2avx without the programmer needing to take action?

Jan
David Faust via Binutils June 9, 2020, 1:07 p.m. | #5
On Tue, Jun 9, 2020 at 6:03 AM Jan Beulich <jbeulich@suse.com> wrote:
>

> On 09.06.2020 14:56, H.J. Lu wrote:

> > On Tue, Jun 9, 2020 at 5:52 AM Jan Beulich <jbeulich@suse.com> wrote:

> >> On 09.06.2020 14:46, Jan Beulich wrote:

> >>> Upon re-consideration in commit 569d50f1c611 ("x86: further refine SSE

> >>> check (SSE4a, SHA, GFNI)") I went too far: Mixing of SSE and AVX insns

> >>> doesn't suffer as bad a penalty on AMD CPUs as on Intel ones. SSE4a

> >>> being an AMD-only extension, it shouldn't be part of the ISA extensions

> >>> set for which the diagnostic may get issued. Undo that part.

> >>

> >> I'd additionally like to note that documentation of -msse-check and

> >> implemention are not consistent: The text says "for any SSE

> >> instruction", yet NoAVX is also present on e.g. MOVQ2DQ and CVTPI2PS,

> >> which clearly are SSE instructions. The question is which of the two

> >> is wrong here.

> >

> > -msse-check is intended to check for SSE instructions which can be encoded with

> > VEX.  We can update its documentation.

>

> IOW it is _not_ intended to help spot insns that would better be

> replaced, and in particular ones that gas can't be told to replace

> via -msse2avx without the programmer needing to take action?

>


The intended usage is to verify that the assembly sources don't
contain any VEX encodable SSE operations.


-- 
H.J.
Jan Beulich June 9, 2020, 1:53 p.m. | #6
On 09.06.2020 15:07, H.J. Lu wrote:
> On Tue, Jun 9, 2020 at 6:03 AM Jan Beulich <jbeulich@suse.com> wrote:

>>

>> On 09.06.2020 14:56, H.J. Lu wrote:

>>> On Tue, Jun 9, 2020 at 5:52 AM Jan Beulich <jbeulich@suse.com> wrote:

>>>> On 09.06.2020 14:46, Jan Beulich wrote:

>>>>> Upon re-consideration in commit 569d50f1c611 ("x86: further refine SSE

>>>>> check (SSE4a, SHA, GFNI)") I went too far: Mixing of SSE and AVX insns

>>>>> doesn't suffer as bad a penalty on AMD CPUs as on Intel ones. SSE4a

>>>>> being an AMD-only extension, it shouldn't be part of the ISA extensions

>>>>> set for which the diagnostic may get issued. Undo that part.

>>>>

>>>> I'd additionally like to note that documentation of -msse-check and

>>>> implemention are not consistent: The text says "for any SSE

>>>> instruction", yet NoAVX is also present on e.g. MOVQ2DQ and CVTPI2PS,

>>>> which clearly are SSE instructions. The question is which of the two

>>>> is wrong here.

>>>

>>> -msse-check is intended to check for SSE instructions which can be encoded with

>>> VEX.  We can update its documentation.

>>

>> IOW it is _not_ intended to help spot insns that would better be

>> replaced, and in particular ones that gas can't be told to replace

>> via -msse2avx without the programmer needing to take action?

>>

> 

> The intended usage is to verify that the assembly sources don't

> contain any VEX encodable SSE operations.


I.e. the combination of -msse2avx and -msse-check= is of no use at
all?

What about CVTPI2PD with a memory operand then? This can be re-encoded
as CVTDQ2PD and hence as VCVTDQ2PD afaict.

Jan
David Faust via Binutils June 9, 2020, 2:09 p.m. | #7
On Tue, Jun 9, 2020 at 6:53 AM Jan Beulich <jbeulich@suse.com> wrote:
>

> On 09.06.2020 15:07, H.J. Lu wrote:

> > On Tue, Jun 9, 2020 at 6:03 AM Jan Beulich <jbeulich@suse.com> wrote:

> >>

> >> On 09.06.2020 14:56, H.J. Lu wrote:

> >>> On Tue, Jun 9, 2020 at 5:52 AM Jan Beulich <jbeulich@suse.com> wrote:

> >>>> On 09.06.2020 14:46, Jan Beulich wrote:

> >>>>> Upon re-consideration in commit 569d50f1c611 ("x86: further refine SSE

> >>>>> check (SSE4a, SHA, GFNI)") I went too far: Mixing of SSE and AVX insns

> >>>>> doesn't suffer as bad a penalty on AMD CPUs as on Intel ones. SSE4a

> >>>>> being an AMD-only extension, it shouldn't be part of the ISA extensions

> >>>>> set for which the diagnostic may get issued. Undo that part.

> >>>>

> >>>> I'd additionally like to note that documentation of -msse-check and

> >>>> implemention are not consistent: The text says "for any SSE

> >>>> instruction", yet NoAVX is also present on e.g. MOVQ2DQ and CVTPI2PS,

> >>>> which clearly are SSE instructions. The question is which of the two

> >>>> is wrong here.

> >>>

> >>> -msse-check is intended to check for SSE instructions which can be encoded with

> >>> VEX.  We can update its documentation.

> >>

> >> IOW it is _not_ intended to help spot insns that would better be

> >> replaced, and in particular ones that gas can't be told to replace

> >> via -msse2avx without the programmer needing to take action?

> >>

> >

> > The intended usage is to verify that the assembly sources don't

> > contain any VEX encodable SSE operations.

>

> I.e. the combination of -msse2avx and -msse-check= is of no use at

> all?


Yes.

> What about CVTPI2PD with a memory operand then? This can be re-encoded

> as CVTDQ2PD and hence as VCVTDQ2PD afaict.

>


Good question.   I don't know if we need to bother with it.


-- 
H.J.

Patch

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4753,7 +4753,6 @@  md_assemble (char *line)
 	  || i.tm.cpu_flags.bitfield.cpussse3
 	  || i.tm.cpu_flags.bitfield.cpusse4_1
 	  || i.tm.cpu_flags.bitfield.cpusse4_2
-	  || i.tm.cpu_flags.bitfield.cpusse4a
 	  || i.tm.cpu_flags.bitfield.cpupclmul
 	  || i.tm.cpu_flags.bitfield.cpuaes
 	  || i.tm.cpu_flags.bitfield.cpusha
--- a/gas/testsuite/gas/i386/sse-check-error.l
+++ b/gas/testsuite/gas/i386/sse-check-error.l
@@ -5,7 +5,6 @@ 
 .*:16: Error: .*
 .*:19: Error: .*
 .*:20: Error: .*
-.*:23: Error: .*
 .*:26: Error: .*
 .*:29: Error: .*
 .*:32: Error: .*
@@ -44,9 +43,8 @@  GAS LISTING .*
 .*  Error: SSE instruction `pcmpgtq' is used
 [ 	]*20[ 	]+C1
 [ 	]*21[ 	]+
-[ 	]*22[ 	]+\# SSE4a instruction
+[ 	]*22[ 	]+\# SSE4a instruction.*
 [ 	]*23[ 	]+\?\?\?\? 660F78C0 		extrq \$0, \$0, %xmm0
-.*  Error: SSE instruction `extrq' is used
 [ 	]*23[ 	]+0000
 [ 	]*24[ 	]+
 [ 	]*25[ 	]+\# PCMUL instruction
@@ -70,7 +68,7 @@  GAS LISTING .*
 [ 	]*36[ 	]+\?\?\?\? 62F27D09 		vgf2p8mulb %xmm0, %xmm0, %xmm0\{%k1\}
 [ 	]*36[ 	]+CFC0
 [ 	]*37[ 	]+\?\?\?\? 62F27D48 		vgf2p8mulb %zmm0, %zmm0, %zmm0
+[ 	]*37[ 	]+CFC0
 GAS LISTING .*
 
 
-[ 	]*37[ 	]+CFC0
--- a/gas/testsuite/gas/i386/sse-check-warn.e
+++ b/gas/testsuite/gas/i386/sse-check-warn.e
@@ -5,7 +5,6 @@ 
 .*:16: Warning: SSE instruction `phaddw' is used
 .*:19: Warning: SSE instruction `blendvpd' is used
 .*:20: Warning: SSE instruction `pcmpgtq' is used
-.*:23: Warning: SSE instruction `extrq' is used
 .*:26: Warning: SSE instruction `pclmulqdq' is used
 .*:29: Warning: SSE instruction `aesdec' is used
 .*:32: Warning: SSE instruction `sha1nexte' is used
--- a/gas/testsuite/gas/i386/sse-check.s
+++ b/gas/testsuite/gas/i386/sse-check.s
@@ -19,7 +19,7 @@  _start:
 	blendvpd %xmm0,%xmm1,%xmm0
 	pcmpgtq %xmm1,%xmm0
 
-# SSE4a instruction
+# SSE4a instruction (no diagnostic)
 	extrq $0, $0, %xmm0
 
 # PCMUL instruction
--- a/gas/testsuite/gas/i386/x86-64-sse-check-error.l
+++ b/gas/testsuite/gas/i386/x86-64-sse-check-error.l
@@ -5,7 +5,6 @@ 
 .*:16: Error: .*
 .*:19: Error: .*
 .*:20: Error: .*
-.*:23: Error: .*
 .*:26: Error: .*
 .*:29: Error: .*
 .*:32: Error: .*
@@ -44,9 +43,8 @@  GAS LISTING .*
 .*  Error: SSE instruction `pcmpgtq' is used
 [ 	]*20[ 	]+C1
 [ 	]*21[ 	]+
-[ 	]*22[ 	]+\# SSE4a instruction
+[ 	]*22[ 	]+\# SSE4a instruction.*
 [ 	]*23[ 	]+\?\?\?\? 660F78C0 		extrq \$0, \$0, %xmm0
-.*  Error: SSE instruction `extrq' is used
 [ 	]*23[ 	]+0000
 [ 	]*24[ 	]+
 [ 	]*25[ 	]+\# PCMUL instruction
@@ -70,7 +68,7 @@  GAS LISTING .*
 [ 	]*36[ 	]+\?\?\?\? 62F27D09 		vgf2p8mulb %xmm0, %xmm0, %xmm0\{%k1\}
 [ 	]*36[ 	]+CFC0
 [ 	]*37[ 	]+\?\?\?\? 62F27D48 		vgf2p8mulb %zmm0, %zmm0, %zmm0
+[ 	]*37[ 	]+CFC0
 GAS LISTING .*
 
 
-[ 	]*37[ 	]+CFC0