[2/2,aarch64] Rework fpcr fpsr getter/setter builtins

Message ID gkrsgig8rgr.fsf@arm.com
State New
Headers show
Series
  • [1/2,aarch64] Rework fpcr fpsr getter/setter builtins
Related show

Commit Message

Andrea Corallo March 10, 2020, 9:55 a.m.
Hi all,

second and last patch of the two reworking FPCR and FPSR builtins.

This rework __builtin_aarch64_set_fpcr (unsigned) and
__builtin_aarch64_set_fpsr (unsigned) to emit a read-modify-sequences
as:

         mrs     x1, fpsr
         bfi     x1, x0, 0, 32
         msr     fpsr, x1

This in order to preserve the original high 32 bits of the system
register.  Both FPSR and FPCR became 64bit regs with armv8.1.

Bootstrapped on aarch64-linux-gnu, does not introduce regressions.

Regards

  Andrea

gcc/ChangeLog:

2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

	* config/aarch64/aarch64.md (insv_reg<mode>): Pattern renamed.
	(set_fpcr, set_fpsr): Pattern modified for read-modify-write
	sequence respecting high 32bit register content.

gcc/testsuite/ChangeLog:

2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

	* gcc.target/aarch64/set_fpcr.c: New test.
	* gcc.target/aarch64/get_fpcr.c: New test.
	* gcc.target/aarch64/set_fpsr.c: New test.
	* gcc.target/aarch64/get_fpsr.c: New test.

Comments

Szabolcs Nagy March 11, 2020, 1:50 p.m. | #1
On 10/03/2020 09:55, Andrea Corallo wrote:
> Hi all,

> 

> second and last patch of the two reworking FPCR and FPSR builtins.

> 

> This rework __builtin_aarch64_set_fpcr (unsigned) and

> __builtin_aarch64_set_fpsr (unsigned) to emit a read-modify-sequences

> as:

> 

>          mrs     x1, fpsr

>          bfi     x1, x0, 0, 32

>          msr     fpsr, x1

> 

> This in order to preserve the original high 32 bits of the system

> register.  Both FPSR and FPCR became 64bit regs with armv8.1.


so the architectural fpsr and fpcr registers got
extended to 64bit from 32bit and currently the top
32 bits are reserved 0.

the floating-point environment has to fit in fenv_t
which is currently two unsigned ints on aarch64.

so glibc cannot support 64bit registers without breaking
abi which means in c programs the fpsr, fpcr top bits
must be held 0 at all times when extern calls are made
that involves dynamic linking or calls into the libc
(otherwise fegetenv and fesetenv are not able to
reliably restore the default fp environment).

i think trying to preserve the top bits is wrong
(it may be acceptable depending on what those bits
do, but it seems to me better to only allow 0 top bits).

if the user changes the top bits the behaviour is
undefined if an extern c call is made with such setting.

i think we have to change the builtins to ensure the
top bits of the used x reg is 0. we may or may not need
a new builtin to allow setting 64 bits of fpsr and fpcr:
currently non-zero top bits are unsupportable (but once
there is new meaning for the top 32bits then it may make
sense to break abi).


> 

> Bootstrapped on aarch64-linux-gnu, does not introduce regressions.

> 

> Regards

> 

>   Andrea

> 

> gcc/ChangeLog:

> 

> 2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

> 

> 	* config/aarch64/aarch64.md (insv_reg<mode>): Pattern renamed.

> 	(set_fpcr, set_fpsr): Pattern modified for read-modify-write

> 	sequence respecting high 32bit register content.

> 

> gcc/testsuite/ChangeLog:

> 

> 2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

> 

> 	* gcc.target/aarch64/set_fpcr.c: New test.

> 	* gcc.target/aarch64/get_fpcr.c: New test.

> 	* gcc.target/aarch64/set_fpsr.c: New test.

> 	* gcc.target/aarch64/get_fpsr.c: New test.

>
Richard Earnshaw (lists) March 11, 2020, 1:53 p.m. | #2
On 11/03/2020 13:50, Szabolcs Nagy wrote:
> On 10/03/2020 09:55, Andrea Corallo wrote:

>> Hi all,

>>

>> second and last patch of the two reworking FPCR and FPSR builtins.

>>

>> This rework __builtin_aarch64_set_fpcr (unsigned) and

>> __builtin_aarch64_set_fpsr (unsigned) to emit a read-modify-sequences

>> as:

>>

>>           mrs     x1, fpsr

>>           bfi     x1, x0, 0, 32

>>           msr     fpsr, x1

>>

>> This in order to preserve the original high 32 bits of the system

>> register.  Both FPSR and FPCR became 64bit regs with armv8.1.

> 

> so the architectural fpsr and fpcr registers got

> extended to 64bit from 32bit and currently the top

> 32 bits are reserved 0.

> 

> the floating-point environment has to fit in fenv_t

> which is currently two unsigned ints on aarch64.

> 

> so glibc cannot support 64bit registers without breaking

> abi which means in c programs the fpsr, fpcr top bits

> must be held 0 at all times when extern calls are made

> that involves dynamic linking or calls into the libc

> (otherwise fegetenv and fesetenv are not able to

> reliably restore the default fp environment).


I disagree.  The top bits must be preserved, even if they are not zero. 
This does potentially limit what those top bits might be used for, since 
they are beyond the ability to represent state in C-like programming 
environments, but it doesn't mean they should be reset by a call to 
change the register.


R.

> 

> i think trying to preserve the top bits is wrong

> (it may be acceptable depending on what those bits

> do, but it seems to me better to only allow 0 top bits).

> 

> if the user changes the top bits the behaviour is

> undefined if an extern c call is made with such setting.

> 

> i think we have to change the builtins to ensure the

> top bits of the used x reg is 0. we may or may not need

> a new builtin to allow setting 64 bits of fpsr and fpcr:

> currently non-zero top bits are unsupportable (but once

> there is new meaning for the top 32bits then it may make

> sense to break abi).

> 

> 

>>

>> Bootstrapped on aarch64-linux-gnu, does not introduce regressions.

>>

>> Regards

>>

>>    Andrea

>>

>> gcc/ChangeLog:

>>

>> 2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

>>

>> 	* config/aarch64/aarch64.md (insv_reg<mode>): Pattern renamed.

>> 	(set_fpcr, set_fpsr): Pattern modified for read-modify-write

>> 	sequence respecting high 32bit register content.

>>

>> gcc/testsuite/ChangeLog:

>>

>> 2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

>>

>> 	* gcc.target/aarch64/set_fpcr.c: New test.

>> 	* gcc.target/aarch64/get_fpcr.c: New test.

>> 	* gcc.target/aarch64/set_fpsr.c: New test.

>> 	* gcc.target/aarch64/get_fpsr.c: New test.

>>

>
Szabolcs Nagy March 11, 2020, 4:22 p.m. | #3
On 11/03/2020 13:53, Richard Earnshaw (lists) wrote:
> On 11/03/2020 13:50, Szabolcs Nagy wrote:

>> On 10/03/2020 09:55, Andrea Corallo wrote:

>>> Hi all,

>>>

>>> second and last patch of the two reworking FPCR and FPSR builtins.

>>>

>>> This rework __builtin_aarch64_set_fpcr (unsigned) and

>>> __builtin_aarch64_set_fpsr (unsigned) to emit a read-modify-sequences

>>> as:

>>>

>>>           mrs     x1, fpsr

>>>           bfi     x1, x0, 0, 32

>>>           msr     fpsr, x1

>>>

>>> This in order to preserve the original high 32 bits of the system

>>> register.  Both FPSR and FPCR became 64bit regs with armv8.1.

>>

>> so the architectural fpsr and fpcr registers got

>> extended to 64bit from 32bit and currently the top

>> 32 bits are reserved 0.

>>

>> the floating-point environment has to fit in fenv_t

>> which is currently two unsigned ints on aarch64.

>>

>> so glibc cannot support 64bit registers without breaking

>> abi which means in c programs the fpsr, fpcr top bits

>> must be held 0 at all times when extern calls are made

>> that involves dynamic linking or calls into the libc

>> (otherwise fegetenv and fesetenv are not able to

>> reliably restore the default fp environment).

> 

> I disagree.  The top bits must be preserved, even if they are not zero. This does potentially limit what those top bits might be used for, since

> they are beyond the ability to represent state in C-like programming environments, but it doesn't mean they should be reset by a call to change

> the register.



ok preserving works for fesetenv.
(which internally uses the builtin)

it means the top bits are not part of fenv.
(we will have to change asm in glibc to deal with this)

since the top and bottom bits cannot be set
independently there are implications about
atomic updates, but they are not relevant
on linux where signal handlers save/restore
the fenv (not required by the standard so
theoretically treating the top and bottom bits
separately may cause issues on some systems).

> 

> 

> R.

> 

>>

>> i think trying to preserve the top bits is wrong

>> (it may be acceptable depending on what those bits

>> do, but it seems to me better to only allow 0 top bits).

>>

>> if the user changes the top bits the behaviour is

>> undefined if an extern c call is made with such setting.

>>

>> i think we have to change the builtins to ensure the

>> top bits of the used x reg is 0. we may or may not need

>> a new builtin to allow setting 64 bits of fpsr and fpcr:

>> currently non-zero top bits are unsupportable (but once

>> there is new meaning for the top 32bits then it may make

>> sense to break abi).

>>

>>

>>>

>>> Bootstrapped on aarch64-linux-gnu, does not introduce regressions.

>>>

>>> Regards

>>>

>>>    Andrea

>>>

>>> gcc/ChangeLog:

>>>

>>> 2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

>>>

>>>     * config/aarch64/aarch64.md (insv_reg<mode>): Pattern renamed.

>>>     (set_fpcr, set_fpsr): Pattern modified for read-modify-write

>>>     sequence respecting high 32bit register content.

>>>

>>> gcc/testsuite/ChangeLog:

>>>

>>> 2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

>>>

>>>     * gcc.target/aarch64/set_fpcr.c: New test.

>>>     * gcc.target/aarch64/get_fpcr.c: New test.

>>>     * gcc.target/aarch64/set_fpsr.c: New test.

>>>     * gcc.target/aarch64/get_fpsr.c: New test.

>>>

>>

>
Andrea Corallo March 17, 2020, 10:51 a.m. | #4
Andrea Corallo <andrea.corallo@arm.com> writes:

> Hi all,

>

> second and last patch of the two reworking FPCR and FPSR builtins.

>

> This rework __builtin_aarch64_set_fpcr (unsigned) and

> __builtin_aarch64_set_fpsr (unsigned) to emit a read-modify-sequences

> as:

>

>          mrs     x1, fpsr

>          bfi     x1, x0, 0, 32

>          msr     fpsr, x1

>

> This in order to preserve the original high 32 bits of the system

> register.  Both FPSR and FPCR became 64bit regs with armv8.1.

>

> Bootstrapped on aarch64-linux-gnu, does not introduce regressions.

>

> Regards

>

>   Andrea

>

> gcc/ChangeLog:

>

> 2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

>

> 	* config/aarch64/aarch64.md (insv_reg<mode>): Pattern renamed.

> 	(set_fpcr, set_fpsr): Pattern modified for read-modify-write

> 	sequence respecting high 32bit register content.

>

> gcc/testsuite/ChangeLog:

>

> 2020-??-??  Andrea Corallo  <andrea.corallo@arm.com>

>

> 	* gcc.target/aarch64/set_fpcr.c: New test.

> 	* gcc.target/aarch64/get_fpcr.c: New test.

> 	* gcc.target/aarch64/set_fpsr.c: New test.

> 	* gcc.target/aarch64/get_fpsr.c: New test.


Ping, please let me know if these are okay for trunk.

Thanks

  Andrea

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b6836710c9c2..b6ee2e1a946e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5613,7 +5613,7 @@ 
   operands[3] = force_reg (<MODE>mode, value);
 })
 
-(define_insn "*insv_reg<mode>"
+(define_insn "insv_reg<mode>"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
 			  (match_operand 1 "const_int_operand" "n")
 			  (match_operand 2 "const_int_operand" "n"))
@@ -7173,10 +7173,21 @@ 
    (set_attr "type" "multiple")])
 
 ;; Write Floating-point Control Register.
-(define_insn "set_fpcr"
-  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] UNSPECV_SET_FPCR)]
+
+(define_expand "set_fpcr"
+  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")]
+                    UNSPECV_SET_FPCR)]
   ""
-  "msr\\tfpcr, %0"
+  {
+    rtx val32 = simplify_gen_subreg (DImode, operands[0], SImode, 0);
+    /* Read-modify-write sequence.  */
+    rtx scratch = gen_reg_rtx (DImode);
+    emit_insn (gen_get_fpcr64 (scratch));
+    emit_insn (gen_insv_regdi (scratch, GEN_INT (32), GEN_INT (0),
+			       val32));
+    emit_insn (gen_set_fpcr64 (scratch));
+    DONE;
+  }
   [(set_attr "type" "mrs")])
 
 ;; Read Floating-point Control Register.
@@ -7188,10 +7199,19 @@ 
   [(set_attr "type" "mrs")])
 
 ;; Write Floating-point Status Register.
-(define_insn "set_fpsr"
+(define_expand "set_fpsr"
   [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] UNSPECV_SET_FPSR)]
   ""
-  "msr\\tfpsr, %0"
+  {
+    rtx val32 = simplify_gen_subreg (DImode, operands[0], SImode, 0);
+    /* Read-modify-write sequence.  */
+    rtx scratch = gen_reg_rtx (DImode);
+    emit_insn (gen_get_fpsr64 (scratch));
+    emit_insn (gen_insv_regdi (scratch, GEN_INT (32), GEN_INT (0),
+			       val32));
+    emit_insn (gen_set_fpsr64 (scratch));
+    DONE;
+  }
   [(set_attr "type" "mrs")])
 
 ;; Read Floating-point Status Register.
diff --git a/gcc/testsuite/gcc.target/aarch64/get_fpcr.c b/gcc/testsuite/gcc.target/aarch64/get_fpcr.c
new file mode 100644
index 000000000000..f33e70e34cd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/get_fpcr.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int
+get_fpcr ()
+{
+  return __builtin_aarch64_get_fpcr ();
+}
+
+/* { dg-final { scan-assembler-times "mrs.*fpcr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/get_fpsr.c b/gcc/testsuite/gcc.target/aarch64/get_fpsr.c
new file mode 100644
index 000000000000..2f7d75637d20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/get_fpsr.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int
+get_fpsr ()
+{
+  return __builtin_aarch64_get_fpsr ();
+}
+
+/* { dg-final { scan-assembler-times "mrs.*fpsr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/set_fpcr.c b/gcc/testsuite/gcc.target/aarch64/set_fpcr.c
new file mode 100644
index 000000000000..74525981f323
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/set_fpcr.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+set_fpcr (unsigned int x)
+{
+  return __builtin_aarch64_set_fpcr (x);
+}
+
+/* { dg-final { scan-assembler-times "bfi" 1 } } */
+/* { dg-final { scan-assembler-times "mrs.*fpcr" 1 } } */
+/* { dg-final { scan-assembler-times "msr.*fpcr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/set_fpsr.c b/gcc/testsuite/gcc.target/aarch64/set_fpsr.c
new file mode 100644
index 000000000000..e3e2e631f70c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/set_fpsr.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+set_fpsr (unsigned int x)
+{
+  return __builtin_aarch64_set_fpsr (x);
+}
+
+/* { dg-final { scan-assembler-times "bfi" 1 } } */
+/* { dg-final { scan-assembler-times "mrs.*fpsr" 1 } } */
+/* { dg-final { scan-assembler-times "msr.*fpsr" 1 } } */