[powerpc] get_rounding_mode: utilize faster method to get rounding mode

Message ID 1559773022-26575-1-git-send-email-pc@us.ibm.com
State New
Headers show
Series
  • [powerpc] get_rounding_mode: utilize faster method to get rounding mode
Related show

Commit Message

Paul Clarke June 5, 2019, 10:17 p.m.
From: "Paul A. Clarke" <pc@us.ibm.com>


Add support to use 'mffsl' instruction if compiled for POWER9 (or later).

Also, mask the result to avoid bleeding unrelated bits into the result of
_FPU_GET_RC().

2019-06-05  Paul A. Clarke  <pc@us.ibm.com>

	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
	with call to equivalent function.
	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.
	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
	  (__FPU_MFFS): New.
	  (__FPU_MFFSL): New.
	  (_FPU_GET_RC): New.
	  (_FPU_GETCW): Use __FPU_MFFS().
	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.
---
 sysdeps/powerpc/fpu/fenv_libc.h               | 15 ++++++++++++
 sysdeps/powerpc/fpu_control.h                 | 29 ++++++++++++++++++++---
 sysdeps/powerpc/powerpc64/get-rounding-mode.h | 33 +++++++++++++++++++++++++++
 3 files changed, 74 insertions(+), 3 deletions(-)
 create mode 100644 sysdeps/powerpc/powerpc64/get-rounding-mode.h

-- 
1.8.3.1

Comments

Tulio Magno Quites Machado Filho June 5, 2019, 10:42 p.m. | #1
"Paul A. Clarke" <pc@us.ibm.com> writes:

> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).

>

> Also, mask the result to avoid bleeding unrelated bits into the result of

> _FPU_GET_RC().

>

> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>

>

> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code

> 	with call to equivalent function.


This file is missing in this patch.

> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h

> index 8a0bace..621e57f 100644

> --- a/sysdeps/powerpc/fpu/fenv_libc.h

> +++ b/sysdeps/powerpc/fpu/fenv_libc.h

> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;

>     pointer.  */

>  #define fegetenv_register() __builtin_mffs()

>  

> +/* Equivalent to fegetenv_register, but only returns bits for

> +   status, exception enables, and mode.  */

> +#ifdef _ARCH_PWR9

> +#define fegetenv_status() \

> +  ({union { double __d; unsigned long long __ll; } __u;	\

> +    register double __fr;				\

> +    __asm__ ("mffsl %0" : "=f" (__fr));			\

> +    __u.__d = __fr;					\

> +    __u.__ll;						\

> +  })

> +#else

> +#define fegetenv_status() __builtin_mffs()

> +#error "power8"

> +#endif


Is this macro used in fegetexcept.c?
Is the error intentional?

-- 
Tulio Magno
Paul Clarke June 5, 2019, 10:54 p.m. | #2
On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:

> 

>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).

>>

>> Also, mask the result to avoid bleeding unrelated bits into the result of

>> _FPU_GET_RC().

>>

>> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>

>>

>> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code

>> 	with call to equivalent function.

> 

> This file is missing in this patch.


Copy and paste error from the previous patch I submitted earlier today.  Sorry.
I will fix when committing, presuming the rest of this patch is OK.

PC
Paul Clarke June 5, 2019, 11:04 p.m. | #3
On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:

> 

>> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h

>> index 8a0bace..621e57f 100644

>> --- a/sysdeps/powerpc/fpu/fenv_libc.h

>> +++ b/sysdeps/powerpc/fpu/fenv_libc.h

>> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;

>>     pointer.  */

>>  #define fegetenv_register() __builtin_mffs()

>>  

>> +/* Equivalent to fegetenv_register, but only returns bits for

>> +   status, exception enables, and mode.  */

>> +#ifdef _ARCH_PWR9

>> +#define fegetenv_status() \

>> +  ({union { double __d; unsigned long long __ll; } __u;	\

>> +    register double __fr;				\

>> +    __asm__ ("mffsl %0" : "=f" (__fr));			\

>> +    __u.__d = __fr;					\

>> +    __u.__ll;						\

>> +  })

>> +#else

>> +#define fegetenv_status() __builtin_mffs()

>> +#error "power8"

>> +#endif

> 

> Is this macro used in fegetexcept.c?

> Is the error intentional?


This hunk should not be in this patch submission.  I shall remove it.

Sorry I missed this comment before.

PC
Tulio Magno Quites Machado Filho June 6, 2019, 2:17 p.m. | #4
"Paul A. Clarke" <pc@us.ibm.com> writes:

> From: "Paul A. Clarke" <pc@us.ibm.com>

>

> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).

>

> Also, mask the result to avoid bleeding unrelated bits into the result of

> _FPU_GET_RC().

>

> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>

>

> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code

> 	with call to equivalent function.

> 	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.


As you explained, both entries have to be removed.

> 	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.

> 	  (__FPU_MFFS): New.

> 	  (__FPU_MFFSL): New.

> 	  (_FPU_GET_RC): New.

> 	  (_FPU_GETCW): Use __FPU_MFFS().


        ^ Extra spaces here.  Should be only 1 tab.

> 	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.


I also think this file is unnecessary.

> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h

> new file mode 100644

> index 0000000..e2fdbbb

> --- /dev/null

> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h

> @@ -0,0 +1,33 @@

> +/* Determine floating-point rounding mode within libc.  powerpc64 version.

> +   Copyright (C) 2019 Free Software Foundation, Inc.

> +   This file is part of the GNU C Library.

> +

> +   The GNU C Library is free software; you can redistribute it and/or

> +   modify it under the terms of the GNU Lesser General Public

> +   License as published by the Free Software Foundation; either

> +   version 2.1 of the License, or (at your option) any later version.

> +

> +   The GNU C Library is distributed in the hope that it will be useful,

> +   but WITHOUT ANY WARRANTY; without even the implied warranty of

> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU

> +   Lesser General Public License for more details.

> +

> +   You should have received a copy of the GNU Lesser General Public

> +   License along with the GNU C Library; if not, see

> +   <http://www.gnu.org/licenses/>.  */

> +

> +#ifndef _POWERPC64_GET_ROUNDING_MODE_H

> +#define _POWERPC64_GET_ROUNDING_MODE_H	1

> +

> +#include <fenv.h>

> +#include <fpu_control.h>

> +

> +/* Return the floating-point rounding mode.  */

> +

> +static inline int

> +get_rounding_mode (void)

> +{

> +  return _FPU_GET_RC ();

> +}

> +

> +#endif /* get-rounding-mode.h */


I don't understand why this file/function is needed.
Looks like another file that should be removed from this patch.

LGTM if you fix the ChangeLog and keep only the changes to
sysdeps/powerpc/fpu_control.h.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>


-- 
Tulio Magno
Paul Clarke June 6, 2019, 5:41 p.m. | #5
On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:

>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).

>>

>> Also, mask the result to avoid bleeding unrelated bits into the result of

>> _FPU_GET_RC().

>>

>> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>

>>

>> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code

>> 	with call to equivalent function.

>> 	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.

> 

> As you explained, both entries have to be removed.


Yes, sorry about the cruft.

>> 	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.

>> 	  (__FPU_MFFS): New.

>> 	  (__FPU_MFFSL): New.

>> 	  (_FPU_GET_RC): New.

>> 	  (_FPU_GETCW): Use __FPU_MFFS().

> 

>         ^ Extra spaces here.  Should be only 1 tab.


OK.

>> 	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.

> 

> I also think this file is unnecessary.


(discussed below)

>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h

>> new file mode 100644

>> index 0000000..e2fdbbb

>> --- /dev/null

>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h

>> @@ -0,0 +1,33 @@

>> +/* Determine floating-point rounding mode within libc.  powerpc64 version.

<snip>
>> +static inline int

>> +get_rounding_mode (void)

>> +{

>> +  return _FPU_GET_RC ();

>> +}

>> +

>> +#endif /* get-rounding-mode.h */

> 

> I don't understand why this file/function is needed.

> Looks like another file that should be removed from this patch.


Without this file, the generic file is used, which calls _FPU_GETCW(), which is used generically to get the entire floating point status control register (FPSCR) and must resolve to using the slower "mffs".  get_rounding_mode() only needs the rounding mode, and the new-in-this-patch _FPU_GET_RC() used by this new file resolves to the faster "mffsl" (if compiled for POWER9).

> LGTM if you fix the ChangeLog and keep only the changes to

> sysdeps/powerpc/fpu_control.h.

> 

> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>


PC
Tulio Magno Quites Machado Filho June 6, 2019, 6:05 p.m. | #6
Paul Clarke <pc@us.ibm.com> writes:

> On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote:

>> "Paul A. Clarke" <pc@us.ibm.com> writes:


>>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h

>>> new file mode 100644

>>> index 0000000..e2fdbbb

>>> --- /dev/null

>>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h

>>> @@ -0,0 +1,33 @@

>>> +/* Determine floating-point rounding mode within libc.  powerpc64 version.

> <snip>

>>> +static inline int

>>> +get_rounding_mode (void)

>>> +{

>>> +  return _FPU_GET_RC ();

>>> +}

>>> +

>>> +#endif /* get-rounding-mode.h */

>> 

>> I don't understand why this file/function is needed.

>> Looks like another file that should be removed from this patch.

>

> Without this file, the generic file is used, which calls _FPU_GETCW(), which is used generically to get the entire floating point status control register (FPSCR) and must resolve to using the slower "mffs".  get_rounding_mode() only needs the rounding mode, and the new-in-this-patch _FPU_GET_RC() used by this new file resolves to the faster "mffsl" (if compiled for POWER9).


Very good point.  I had missed this.

In that case, I think you should move this file to
sysdeps/powerpc/fpu/get-rounding-mode.h allowing 32-bits builds to use it
and restricting its usage to fpu.

-- 
Tulio Magno

Patch

diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
index 8a0bace..621e57f 100644
--- a/sysdeps/powerpc/fpu/fenv_libc.h
+++ b/sysdeps/powerpc/fpu/fenv_libc.h
@@ -34,6 +34,21 @@  extern const fenv_t *__fe_mask_env (void) attribute_hidden;
    pointer.  */
 #define fegetenv_register() __builtin_mffs()
 
+/* Equivalent to fegetenv_register, but only returns bits for
+   status, exception enables, and mode.  */
+#ifdef _ARCH_PWR9
+#define fegetenv_status() \
+  ({union { double __d; unsigned long long __ll; } __u;	\
+    register double __fr;				\
+    __asm__ ("mffsl %0" : "=f" (__fr));			\
+    __u.__d = __fr;					\
+    __u.__ll;						\
+  })
+#else
+#define fegetenv_status() __builtin_mffs()
+#error "power8"
+#endif
+
 /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer.  */
 #define fesetenv_register(env) \
 	do { \
diff --git a/sysdeps/powerpc/fpu_control.h b/sysdeps/powerpc/fpu_control.h
index e0c5cf6..22c52b7 100644
--- a/sysdeps/powerpc/fpu_control.h
+++ b/sysdeps/powerpc/fpu_control.h
@@ -71,6 +71,8 @@  extern fpu_control_t __fpu_control;
 # define _FPU_RC_UP      0x02
 # define _FPU_RC_ZERO    0x01
 
+# define _FPU_MASK_RC (_FPU_RC_NEAREST|_FPU_RC_DOWN|_FPU_RC_UP|_FPU_RC_ZERO)
+
 # define _FPU_MASK_NI  0x04 /* non-ieee mode */
 
 /* masking of interrupts */
@@ -94,15 +96,36 @@  extern fpu_control_t __fpu_control;
 typedef unsigned int fpu_control_t;
 
 /* Macros for accessing the hardware control word.  */
+# define __FPU_MFFS()						\
+  ({register double __fr;					\
+    __asm__ ("mffs %0" : "=f" (__fr));				\
+    __fr;							\
+  })
+
 # define _FPU_GETCW(cw)						\
   ({union { double __d; unsigned long long __ll; } __u;		\
-    register double __fr;					\
-    __asm__ ("mffs %0" : "=f" (__fr));				\
-    __u.__d = __fr;						\
+    __u.__d = __FPU_MFFS();					\
     (cw) = (fpu_control_t) __u.__ll;				\
     (fpu_control_t) __u.__ll;					\
   })
 
+#ifdef _ARCH_PWR9
+# define __FPU_MFFSL()						\
+  ({register double __fr;					\
+    __asm__ ("mffsl %0" : "=f" (__fr));				\
+    __fr;							\
+  })
+#else
+# define __FPU_MFFSL() __FPU_MFFS()
+#endif
+    
+# define _FPU_GET_RC()						\
+  ({union { double __d; unsigned long long __ll; } __u;		\
+    __u.__d = __FPU_MFFSL();					\
+    __u.__ll &= _FPU_MASK_RC;					\
+    (fpu_control_t) __u.__ll;					\
+  })
+
 # define _FPU_SETCW(cw)						\
   { union { double __d; unsigned long long __ll; } __u;		\
     register double __fr;					\
diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
new file mode 100644
index 0000000..e2fdbbb
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
@@ -0,0 +1,33 @@ 
+/* Determine floating-point rounding mode within libc.  powerpc64 version.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _POWERPC64_GET_ROUNDING_MODE_H
+#define _POWERPC64_GET_ROUNDING_MODE_H	1
+
+#include <fenv.h>
+#include <fpu_control.h>
+
+/* Return the floating-point rounding mode.  */
+
+static inline int
+get_rounding_mode (void)
+{
+  return _FPU_GET_RC ();
+}
+
+#endif /* get-rounding-mode.h */