x86: mark "k" and "Yk" constraints as non-internal

Message ID 5D149658020000780023B7A9@prv1-mh.provo.novell.com
State New
Headers show
Series
  • x86: mark "k" and "Yk" constraints as non-internal
Related show

Commit Message

Jan Beulich June 27, 2019, 10:11 a.m.
Without these constraints asm() can't make use of mask registers.

gcc/
2019-06-27  Jan Beulich  <jbeulich@suse.com>

	* config/i386/constraints.md: Remove @internal from "k" and
	"Yk".

Comments

Jan Beulich June 27, 2019, 10:17 a.m. | #1
>>> On 27.06.19 at 12:11,  wrote:

> Without these constraints asm() can't make use of mask registers.


Similarly it is entirely unclear to me how to use e.g. v4fmaddps or
vp2intersectd in asm(): For the former the respective "Yh"
constraint was dropped (oddly enough leaving its comment line in
place), and there's nothing I can spot for a mask register pair.

Using (in an attempt to find alternative options)

typedef float __attribute__((vector_size(256))) v64sf_t;

results in "impossible constraint" when using an operand of this
type in asm(), while

typedef unsigned __attribute__((mode(P2HI))) kpair_t;

gives "no data type for mode" even without any actual use.

Jan
Uros Bizjak June 27, 2019, 11:09 a.m. | #2
On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich <JBeulich@suse.com> wrote:
>

> Without these constraints asm() can't make use of mask registers.


asm should be deprecated. We have intrinsics for this purpose.

Uros.

> gcc/

> 2019-06-27  Jan Beulich  <jbeulich@suse.com>

>

>         * config/i386/constraints.md: Remove @internal from "k" and

>         "Yk".

>

> --- a/gcc/config/i386/constraints.md

> +++ b/gcc/config/i386/constraints.md

> @@ -79,10 +79,10 @@

>   "Second from top of 80387 floating-point stack (@code{%st(1)}).")

>

>  (define_register_constraint "Yk" "TARGET_AVX512F ? MASK_REGS : NO_REGS"

> -"@internal Any mask register that can be used as predicate, i.e. k1-k7.")

> +"Any mask register that can be used as predicate, i.e. k1-k7.")

>

>  (define_register_constraint "k" "TARGET_AVX512F ? ALL_MASK_REGS : NO_REGS"

> -"@internal Any mask register.")

> +"Any mask register.")

>

>  ;; Vector registers (also used for plain floating point nowadays).

>  (define_register_constraint "y" "TARGET_MMX ? MMX_REGS : NO_REGS"

>

>
Jan Beulich June 27, 2019, 11:46 a.m. | #3
>>> On 27.06.19 at 13:09, <ubizjak@gmail.com> wrote:

> On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich <JBeulich@suse.com> wrote:

>>

>> Without these constraints asm() can't make use of mask registers.

> 

> asm should be deprecated. We have intrinsics for this purpose.


While maybe not explicitly applicable here, the intrinsics aren't
(afaict) providing full flexibility. In particular (just as example)
I haven't found a way to use embedded broadcast with the
intrinsics, but I can easily do so with asm().

Furthermore there are other reasons to use asm() - things like
the Linux kernel are full of it for a reason. And once one has
to use asm(), the resulting code typically is easier to follow if
one doesn't further intermix it with uses of builtins.

And finally, if asm() was indeed meant to be deprecated, how
come it pretty recently got extended to allow for "inline"?

Jan
Uros Bizjak June 27, 2019, noon | #4
On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich <JBeulich@suse.com> wrote:
>

> >>> On 27.06.19 at 13:09, <ubizjak@gmail.com> wrote:

> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich <JBeulich@suse.com> wrote:

> >>

> >> Without these constraints asm() can't make use of mask registers.

> >

> > asm should be deprecated. We have intrinsics for this purpose.

>

> While maybe not explicitly applicable here, the intrinsics aren't

> (afaict) providing full flexibility. In particular (just as example)

> I haven't found a way to use embedded broadcast with the

> intrinsics, but I can easily do so with asm().

>

> Furthermore there are other reasons to use asm() - things like

> the Linux kernel are full of it for a reason. And once one has

> to use asm(), the resulting code typically is easier to follow if

> one doesn't further intermix it with uses of builtins.

>

> And finally, if asm() was indeed meant to be deprecated, how

> come it pretty recently got extended to allow for "inline"?


I didn't mean that asm() in general should be deprecated, but for SSE
and other vector extensions, where intrinsics are available,
intrinsics should be used instead. There was exactly zero requests to
use new asm constraints, it looks that people are satisfied with
intrinsics approach (which is also future-proof, etc).

Uros.
Jan Beulich June 27, 2019, 12:23 p.m. | #5
>>> On 27.06.19 at 14:00, <ubizjak@gmail.com> wrote:

> On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich <JBeulich@suse.com> wrote:

>>

>> >>> On 27.06.19 at 13:09, <ubizjak@gmail.com> wrote:

>> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich <JBeulich@suse.com> wrote:

>> >>

>> >> Without these constraints asm() can't make use of mask registers.

>> >

>> > asm should be deprecated. We have intrinsics for this purpose.

>>

>> While maybe not explicitly applicable here, the intrinsics aren't

>> (afaict) providing full flexibility. In particular (just as example)

>> I haven't found a way to use embedded broadcast with the

>> intrinsics, but I can easily do so with asm().

>>

>> Furthermore there are other reasons to use asm() - things like

>> the Linux kernel are full of it for a reason. And once one has

>> to use asm(), the resulting code typically is easier to follow if

>> one doesn't further intermix it with uses of builtins.

>>

>> And finally, if asm() was indeed meant to be deprecated, how

>> come it pretty recently got extended to allow for "inline"?

> 

> I didn't mean that asm() in general should be deprecated, but for SSE

> and other vector extensions, where intrinsics are available,

> intrinsics should be used instead. There was exactly zero requests to

> use new asm constraints, it looks that people are satisfied with

> intrinsics approach (which is also future-proof, etc).


So what about my embedded broadcast example then? "Zero
requests" is clearly not exactly right. It simply didn't occur to me
(until I noticed the @internal here) that I should raise such a
request, rather than just using asm(). Subsequently I did then
notice "Yh" going away, complicating things further ...

I'd also like to note that the choice of types on some of the builtins
makes it rather cumbersome to use them. Especially for scalar
operations I've found myself better resorting to asm(). See
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/tests/x86_emulator/simd.c
(most of the changes submitted (not so) recently have been
coming from the work of putting together this and its sibling
tests for the Xen Project instruction emulator).

Jan
Uros Bizjak June 27, 2019, 12:26 p.m. | #6
On Thu, Jun 27, 2019 at 2:23 PM Jan Beulich <JBeulich@suse.com> wrote:
>

> >>> On 27.06.19 at 14:00, <ubizjak@gmail.com> wrote:

> > On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich <JBeulich@suse.com> wrote:

> >>

> >> >>> On 27.06.19 at 13:09, <ubizjak@gmail.com> wrote:

> >> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich <JBeulich@suse.com> wrote:

> >> >>

> >> >> Without these constraints asm() can't make use of mask registers.

> >> >

> >> > asm should be deprecated. We have intrinsics for this purpose.

> >>

> >> While maybe not explicitly applicable here, the intrinsics aren't

> >> (afaict) providing full flexibility. In particular (just as example)

> >> I haven't found a way to use embedded broadcast with the

> >> intrinsics, but I can easily do so with asm().

> >>

> >> Furthermore there are other reasons to use asm() - things like

> >> the Linux kernel are full of it for a reason. And once one has

> >> to use asm(), the resulting code typically is easier to follow if

> >> one doesn't further intermix it with uses of builtins.

> >>

> >> And finally, if asm() was indeed meant to be deprecated, how

> >> come it pretty recently got extended to allow for "inline"?

> >

> > I didn't mean that asm() in general should be deprecated, but for SSE

> > and other vector extensions, where intrinsics are available,

> > intrinsics should be used instead. There was exactly zero requests to

> > use new asm constraints, it looks that people are satisfied with

> > intrinsics approach (which is also future-proof, etc).

>

> So what about my embedded broadcast example then? "Zero

> requests" is clearly not exactly right. It simply didn't occur to me

> (until I noticed the @internal here) that I should raise such a

> request, rather than just using asm(). Subsequently I did then

> notice "Yh" going away, complicating things further ...


There was some work by HJ a while ago that implemented automatic
generation of embedded broadcasts. Perhaps he can answer the question.

Uros.

> I'd also like to note that the choice of types on some of the builtins

> makes it rather cumbersome to use them. Especially for scalar

> operations I've found myself better resorting to asm(). See

> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/tests/x86_emulator/simd.c

> (most of the changes submitted (not so) recently have been

> coming from the work of putting together this and its sibling

> tests for the Xen Project instruction emulator).

>

> Jan

>

>
Segher Boessenkool June 27, 2019, 1:15 p.m. | #7
On Thu, Jun 27, 2019 at 05:46:00AM -0600, Jan Beulich wrote:
> While maybe not explicitly applicable here, the intrinsics aren't

> (afaict) providing full flexibility. In particular (just as example)

> I haven't found a way to use embedded broadcast with the

> intrinsics, but I can easily do so with asm().

> 

> Furthermore there are other reasons to use asm() - things like

> the Linux kernel are full of it for a reason.


Some of it for a (good) reason.  Yes.

> And once one has

> to use asm(), the resulting code typically is easier to follow if

> one doesn't further intermix it with uses of builtins.


If you have to write more than a few lines of assembler, you are usually
*much* better off just writing it in an assembler source file (a .s).

> And finally, if asm() was indeed meant to be deprecated, how

> come it pretty recently got extended to allow for "inline"?


That was just a simple extension for a very specific purpose.  This does
not mean you should use inline assembler if there are better alternatives.
I'm not sure Linux using such extremely huge inline assembler statements
is a good idea at all, or if there are better ways to do what they want
(not that I see any fwiw); but they already did, and this takes the pain
away a bit for them, so why not.

You should not use inline assembler when there are better way of
expressing what you want.  Like with builtins, for example.  Inline
assembler is hard to write correctly, and hard to *keep* correct.  It is
mixing abstractions (that is its *purpose*), which of course makes it a
royal pain to deal with, esp. if overused.


Segher
H.J. Lu June 27, 2019, 5:03 p.m. | #8
On Thu, Jun 27, 2019 at 5:26 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>

> On Thu, Jun 27, 2019 at 2:23 PM Jan Beulich <JBeulich@suse.com> wrote:

> >

> > >>> On 27.06.19 at 14:00, <ubizjak@gmail.com> wrote:

> > > On Thu, Jun 27, 2019 at 1:46 PM Jan Beulich <JBeulich@suse.com> wrote:

> > >>

> > >> >>> On 27.06.19 at 13:09, <ubizjak@gmail.com> wrote:

> > >> > On Thu, Jun 27, 2019 at 12:11 PM Jan Beulich <JBeulich@suse.com> wrote:

> > >> >>

> > >> >> Without these constraints asm() can't make use of mask registers.

> > >> >

> > >> > asm should be deprecated. We have intrinsics for this purpose.

> > >>

> > >> While maybe not explicitly applicable here, the intrinsics aren't

> > >> (afaict) providing full flexibility. In particular (just as example)

> > >> I haven't found a way to use embedded broadcast with the

> > >> intrinsics, but I can easily do so with asm().

> > >>

> > >> Furthermore there are other reasons to use asm() - things like

> > >> the Linux kernel are full of it for a reason. And once one has

> > >> to use asm(), the resulting code typically is easier to follow if

> > >> one doesn't further intermix it with uses of builtins.

> > >>

> > >> And finally, if asm() was indeed meant to be deprecated, how

> > >> come it pretty recently got extended to allow for "inline"?

> > >

> > > I didn't mean that asm() in general should be deprecated, but for SSE

> > > and other vector extensions, where intrinsics are available,

> > > intrinsics should be used instead. There was exactly zero requests to

> > > use new asm constraints, it looks that people are satisfied with

> > > intrinsics approach (which is also future-proof, etc).

> >

> > So what about my embedded broadcast example then? "Zero

> > requests" is clearly not exactly right. It simply didn't occur to me

> > (until I noticed the @internal here) that I should raise such a

> > request, rather than just using asm(). Subsequently I did then

> > notice "Yh" going away, complicating things further ...

>

> There was some work by HJ a while ago that implemented automatic

> generation of embedded broadcasts. Perhaps he can answer the question.


I implemented broadcast for some commonly used instructions.  Adding
broadcast to all will take time, especially when there is no user demand.

> Uros.

>

> > I'd also like to note that the choice of types on some of the builtins

> > makes it rather cumbersome to use them. Especially for scalar

> > operations I've found myself better resorting to asm(). See

> > https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/tests/x86_emulator/simd.c

> > (most of the changes submitted (not so) recently have been

> > coming from the work of putting together this and its sibling

> > tests for the Xen Project instruction emulator).

> >

> > Jan

> >

> >




-- 
H.J.

Patch

--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -79,10 +79,10 @@ 
  "Second from top of 80387 floating-point stack (@code{%st(1)}).")
 
 (define_register_constraint "Yk" "TARGET_AVX512F ? MASK_REGS : NO_REGS"
-"@internal Any mask register that can be used as predicate, i.e. k1-k7.")
+"Any mask register that can be used as predicate, i.e. k1-k7.")
 
 (define_register_constraint "k" "TARGET_AVX512F ? ALL_MASK_REGS : NO_REGS"
-"@internal Any mask register.")
+"Any mask register.")
 
 ;; Vector registers (also used for plain floating point nowadays).
 (define_register_constraint "y" "TARGET_MMX ? MMX_REGS : NO_REGS"