[Aarch64] PR 86538 - Define __ARM_FEATURE_LSE if LSE is available

Message ID 1532463722.15609.40.camel@cavium.com
State New
Headers show
Series
  • [Aarch64] PR 86538 - Define __ARM_FEATURE_LSE if LSE is available
Related show

Commit Message

Steve Ellcey July 24, 2018, 8:22 p.m.
This is a patch for PR 86538, to define an __ARM_FEATURE_LSE macro
when LSE is available.  Richard Earnshaw closed PR 86538 as WONTFIX
because the ACLE (Arm C Language Extension) does not require this
macro and because he is concerned that it might encourage people to
use inline assembly instead of the __sync and atomic intrinsics.
(See actual comments in the defect report.)

While I agree that we want people to use the intrinsics I still think
there are use cases where people may want to know if LSE is available
or not and there is currrently no (simple) way to determine if this feature
is available since it can be turned or and off independently of the
architecture used.  Also, as a general principle, I  think any feature
that can be toggled on or off by the compiler should provide a way for
users to determine what its state is.

So what do other ARM maintainers and users think?  Is this a useful
feature to have in GCC?

Steve Ellcey
sellcey@cavium.com


2018-07-24  Steve Ellcey  <sellcey@cavium.com>

	PR target/86538
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
	Add define of __ARM_FEATURE_LSE.

Comments

James Greenhalgh July 24, 2018, 9:04 p.m. | #1
On Tue, Jul 24, 2018 at 03:22:02PM -0500, Steve Ellcey wrote:
> This is a patch for PR 86538, to define an __ARM_FEATURE_LSE macro

> when LSE is available.  Richard Earnshaw closed PR 86538 as WONTFIX

> because the ACLE (Arm C Language Extension) does not require this

> macro and because he is concerned that it might encourage people to

> use inline assembly instead of the __sync and atomic intrinsics.

> (See actual comments in the defect report.)

> 

> While I agree that we want people to use the intrinsics I still think

> there are use cases where people may want to know if LSE is available

> or not and there is currrently no (simple) way to determine if this feature

> is available since it can be turned or and off independently of the

> architecture used.  Also, as a general principle, I  think any feature

> that can be toggled on or off by the compiler should provide a way for

> users to determine what its state is.


Well, we blow that design principle all over the place (find me a macro
which tells you whether AARCH64_EXTRA_TUNE_SLOW_UNALIGNED_LDPW is on for
example :-) )

A better design principle would be that if we think language programmers
may want to compile in different C code depending on a compiler option, we
should consider adding a feature macro.

> So what do other ARM maintainers and users think?  Is this a useful

> feature to have in GCC?


I'm with Richard on this one.

Whether LSE is available or not at compile time, the best user strategy is
to use the C11/C++11 atomic extensions. That's where the memory model is
well defined, well reasoned about, and well implemented.

Purely in ACLE we're not keen on providing macros that don't provide choice
to a C level programmer (i.e. change the prescence of intrinsics).

You could well imagine an inline asm programmer wanting to choose between an
LSE path and an Armv8.0-A path; but I can't imagine what they would want to
do on that path that couldn't be expressed better in the C language. You
might say they want to validate presence of the instruction; but that will
need to be a dynamic check outside of ACLE anyway.

All of which is to say, I don't think that this is a neccessary macro. Each
time I've seen it requested by a user, we've told them the same thing; what
do you want to express here that isn't better expressed by C atomic
primitives.

I'd say this patch isn't desirable for trunk. I'd be interested in use cases
that need a static decision on presence of LSE that are not better expressed
using higher level language features.

Thanks,
James
Steve Ellcey July 24, 2018, 9:55 p.m. | #2
On Tue, 2018-07-24 at 22:04 +0100, James Greenhalgh wrote:

> 

> I'd say this patch isn't desirable for trunk. I'd be interested in use cases

> that need a static decision on presence of LSE that are not better expressed

> using higher level language features.

> 

> Thanks,

> James


How about when building the higher level features?  Right now,
in sysdeps/aarch64/atomic-machine.h, we
hardcode ATOMIC_EXCHANGE_USES_CAS to 0.  If we had __ARM_FEATURE_LSE we
could use that to determine if we wanted to set
ATOMIC_EXCHANGE_USES_CAS to 0 or 1 which would affect the call
generated in nptl/pthread_spin_lock.c.  That would be useful if we
built a lipthread specifically for a platform that had LSE.

Steve Ellcey
sellcey@cavium.com
Richard Earnshaw (lists) July 25, 2018, 10:04 a.m. | #3
On 24/07/18 22:55, Steve Ellcey wrote:
> On Tue, 2018-07-24 at 22:04 +0100, James Greenhalgh wrote:

>>  

>>

>> I'd say this patch isn't desirable for trunk. I'd be interested in use cases

>> that need a static decision on presence of LSE that are not better expressed

>> using higher level language features.

>>

>> Thanks,

>> James

> 

> How about when building the higher level features?  Right now,

> in sysdeps/aarch64/atomic-machine.h, we

> hardcode ATOMIC_EXCHANGE_USES_CAS to 0.  If we had __ARM_FEATURE_LSE we

> could use that to determine if we wanted to set

> ATOMIC_EXCHANGE_USES_CAS to 0 or 1 which would affect the call

> generated in nptl/pthread_spin_lock.c.  That would be useful if we

> built a lipthread specifically for a platform that had LSE.

> 

> Steve Ellcey

> sellcey@cavium.com

> 


If there is a case for such a define, it needs to be made with the ACLE
specification maintainers.  I don't think GCC should be ploughing a
separate furrow here.

So make your case to the ACLE maintainers.  If that adopts a pre-define,
then implementing it in GCC would go through on the nod.

R.
Ramana Radhakrishnan July 25, 2018, 10:55 a.m. | #4
On Tue, Jul 24, 2018 at 10:55 PM, Steve Ellcey <sellcey@cavium.com> wrote:
> On Tue, 2018-07-24 at 22:04 +0100, James Greenhalgh wrote:

>>

>>

>> I'd say this patch isn't desirable for trunk. I'd be interested in use cases

>> that need a static decision on presence of LSE that are not better expressed

>> using higher level language features.

>>

>> Thanks,

>> James

>

> How about when building the higher level features?



> Right now,

> in sysdeps/aarch64/atomic-machine.h, we

> hardcode ATOMIC_EXCHANGE_USES_CAS to 0.  If we had __ARM_FEATURE_LSE we

> could use that to determine if we wanted to set

> ATOMIC_EXCHANGE_USES_CAS to 0 or 1 which would affect the call

> generated in nptl/pthread_spin_lock.c.  That would be useful if we

> built a lipthread specifically for a platform that had LSE.



No, you don't need to define ATOMIC_EXCHANGE_USES_CAS=1 to get LSE
instructions in libpthread. You get that already with
-march=armv8-a+lse on the command line.

ATOMIC_EXCHANGE_USES_CAS =1 in glibc implies that a CAS is faster than
a SWP and that on AArch64 is a per micro-architectural decision *not*
an architectural decision for the port unless someone can
categorically say that the majority of implementations that glibc
cares about *have* better CAS performance than SWP performance. Both
the SWP and CAS instructions are part of v8.1-A / LSE thus all you
need to build libpthread with lse is merely the command line option
-march=armv8-a+lse. So, no you don't need this macro to build
libpthread for v8.1 or LSE . You need that macro to statically choose
a cas implementation over a swp implementation.

See comment in include/atomic.h :

/* ATOMIC_EXCHANGE_USES_CAS is non-zero if atomic_exchange operations
   are implemented based on a CAS loop; otherwise, this is zero and we assume
   that the atomic_exchange operations could provide better performance
   than a CAS loop.  */


regards
Ramana


>

> Steve Ellcey

> sellcey@cavium.com

>

Patch

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 40c738c..e057ba9 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -154,6 +154,9 @@  aarch64_update_cpp_builtins (cpp_reader *pfile)
   aarch64_def_or_undef (TARGET_SM4, "__ARM_FEATURE_SM4", pfile);
   aarch64_def_or_undef (TARGET_F16FML, "__ARM_FEATURE_FP16_FML", pfile);
 
+  /* This is not required by ACLE, but it is useful.  */
+  aarch64_def_or_undef (TARGET_LSE, "__ARM_FEATURE_LSE", pfile);
+
   /* Not for ACLE, but required to keep "float.h" correct if we switch
      target between implementations that do or do not support ARMv8.2-A
      16-bit floating-point extensions.  */