[v2,1/2] PPC64: Add libmvec SIMD single-precision natural exponent function.

Message ID 20190527174717.21619-1-shawn@git.icu
State New
Headers show
Series
  • [v2,1/2] PPC64: Add libmvec SIMD single-precision natural exponent function.
Related show

Commit Message

Shawn Landden May 27, 2019, 5:47 p.m.
Passes all tests.

Based off the ./sysdeps/ieee754/dbl-64/e_exp.c implementation,
and thus provides identical results.

Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar expf to figure out what is causing the overflow or underflow.

Suprisingly the special-case path performs as well as the normal path.
(both of which are vectorized)
Running 20 times over 32MiB
vector: mean 432.263032 MiB/s (sd 0.486733)
scalar: mean 178.646197 MiB/s (sd 0.050013)

v2: NFC, fixed some style issues, note that this is based on another impl.

2019-05-11  Shawn Landden  <shawn@git.icu>

	[BZ #24209]
        * NEWS: Noted the addition of PPC64 vector expf function.
        * sysdeps/powerpc/bits/math-vector.h: Added entry for vector expf.
        * sysdeps/powerpc/powerpc64/fpu/Versions: Added vector expf entry.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile:
        (libmvec-sysdep_routines, CFLAGS-vec_s_expf4_vsx.c):
        (CFLAGS-vec_s_exp2f_data.c): Added build of VSX SIMD expf
        function and tests. Added vec_math_errf.c to build.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h: Modified for expf.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c:
        Added entry for vector expf.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c: New file.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c: Likewise.
        * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c: Likewise.
        * sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: SIMD expf
        added.
---
 NEWS                                          |   1 +
 sysdeps/powerpc/bits/math-vector.h            |   2 +
 sysdeps/powerpc/powerpc64/fpu/Versions        |   1 +
 .../powerpc/powerpc64/fpu/multiarch/Makefile  |   6 +-
 .../powerpc64/fpu/multiarch/math_config_flt.h |  64 +++++--
 .../fpu/multiarch/test-float-vlen4-wrappers.c |   1 +
 .../powerpc64/fpu/multiarch/vec_math_errf.c   |  39 +++++
 .../fpu/multiarch/vec_s_exp2f_data.c          |  58 +++++++
 .../powerpc64/fpu/multiarch/vec_s_expf4_vsx.c | 159 ++++++++++++++++++
 .../linux/powerpc/powerpc64/libmvec.abilist   |   1 +
 10 files changed, 321 insertions(+), 11 deletions(-)
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c

-- 
2.20.1

Comments

Shawn Landden May 27, 2019, 6 p.m. | #1
On Mon, May 27, 2019 at 12:47 PM Shawn Landden <shawn@git.icu> wrote:
>

> Passes all tests.

>

> Based off the ./sysdeps/ieee754/dbl-64/e_exp.c implementation,

> and thus provides identical results.

>

> Unlike other libmvec functions, this sets the underflow and overflow bits.

> The caller can check these flags, and possibly re-run the calculations with

> scalar expf to figure out what is causing the overflow or underflow.

>

> Suprisingly the special-case path performs as well as the normal path.

> (both of which are vectorized)

> Running 20 times over 32MiB

> vector: mean 432.263032 MiB/s (sd 0.486733)

> scalar: mean 178.646197 MiB/s (sd 0.050013)

>

> v2: NFC, fixed some style issues, note that this is based on another impl.

>

> 2019-05-11  Shawn Landden  <shawn@git.icu>

>

>         [BZ #24209]

>         * NEWS: Noted the addition of PPC64 vector expf function.

>         * sysdeps/powerpc/bits/math-vector.h: Added entry for vector expf.

>         * sysdeps/powerpc/powerpc64/fpu/Versions: Added vector expf entry.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile:

>         (libmvec-sysdep_routines, CFLAGS-vec_s_expf4_vsx.c):

>         (CFLAGS-vec_s_exp2f_data.c): Added build of VSX SIMD expf

>         function and tests. Added vec_math_errf.c to build.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h: Modified for expf.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c:

>         Added entry for vector expf.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c: New file.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c: Likewise.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c: Likewise.

>         * sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: SIMD expf

>         added.

> ---

>  NEWS                                          |   1 +

>  sysdeps/powerpc/bits/math-vector.h            |   2 +

>  sysdeps/powerpc/powerpc64/fpu/Versions        |   1 +

>  .../powerpc/powerpc64/fpu/multiarch/Makefile  |   6 +-

>  .../powerpc64/fpu/multiarch/math_config_flt.h |  64 +++++--

>  .../fpu/multiarch/test-float-vlen4-wrappers.c |   1 +

>  .../powerpc64/fpu/multiarch/vec_math_errf.c   |  39 +++++

>  .../fpu/multiarch/vec_s_exp2f_data.c          |  58 +++++++

>  .../powerpc64/fpu/multiarch/vec_s_expf4_vsx.c | 159 ++++++++++++++++++

>  .../linux/powerpc/powerpc64/libmvec.abilist   |   1 +

>  10 files changed, 321 insertions(+), 11 deletions(-)

>  create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c

>  create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c

>  create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c

>

> diff --git a/NEWS b/NEWS

> index 6f9af0e24e..26f5c67f75 100644

> --- a/NEWS

> +++ b/NEWS

> @@ -18,10 +18,11 @@ Major new features:

>    - single-precision logarithm: logf

>    - double-precision sine: sin

>    - single-precision sine: sinf

>    - double-precision sincos: sincos

>    - single-precision sincos: sincosf

> +  - single-precision exp: expf

>

>    GCC support for auto-vectorization of functions on PPC64 is not yet

>    available. Until that is done, the new vector math functions are

>    inaccessible to applications.

>

> diff --git a/sysdeps/powerpc/bits/math-vector.h b/sysdeps/powerpc/bits/math-vector.h

> index 6dc62480fd..bbc9db0dd9 100644

> --- a/sysdeps/powerpc/bits/math-vector.h

> +++ b/sysdeps/powerpc/bits/math-vector.h

> @@ -42,8 +42,10 @@

>  #  define __DECL_SIMD_logf __DECL_SIMD_PPC64

>  #  undef __DECL_SIMD_sincos

>  #  define __DECL_SIMD_sincos __DECL_SIMD_PPC64

>  #  undef __DECL_SIMD_sincosf

>  #  define __DECL_SIMD_sincosf __DECL_SIMD_PPC64

> +#  undef __DECL_SIMD_expf

> +#  define __DECL_SIMD_expf __DECL_SIMD_PPC64

>

>  # endif

>  #endif

> diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions

> index c083d9f938..7c45e76074 100644

> --- a/sysdeps/powerpc/powerpc64/fpu/Versions

> +++ b/sysdeps/powerpc/powerpc64/fpu/Versions

> @@ -1,6 +1,7 @@

>  libmvec {

>    GLIBC_2.30 {

>      _ZGVbN2v_cos; _ZGVbN4v_cosf; _ZGVbN2v_sin; _ZGVbN4v_sinf;

>      _ZGVbN2v_log; _ZGVbN4v_logf; _ZGVbN2vvv_sincos; _ZGVbN4vvv_sincosf;

> +    _ZGVbN4v_expf;

>    }

>  }

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile

> index a4f0d1694e..ad3c29b1ab 100644

> --- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile

> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile

> @@ -46,10 +46,12 @@ endif

>  ifeq ($(subdir),mathvec)

>  libmvec-sysdep_routines += vec_d_cos2_vsx vec_s_cosf4_vsx \

>                            vec_d_sin2_vsx vec_s_sinf4_vsx \

>                            vec_d_log2_vsx vec_d_log_data \

>                            vec_s_logf4_vsx vec_s_logf_data \

> +                          vec_s_expf4_vsx vec_s_exp2f_data \

> +                          vec_math_errf \

>                            vec_d_sincos2_vsx vec_s_sincosf4_vsx

>  CFLAGS-vec_d_cos2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_d_log2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_d_log_data.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_s_cosf4_vsx.c += -mabi=altivec -maltivec -mvsx

> @@ -57,19 +59,21 @@ CFLAGS-vec_s_logf4_vsx.c += -mabi=altivec -maltivec -mvsx

>  CFLAGS-vec_s_logf_data.c += -mabi=altivec -maltivec -mvsx

>  CFLAGS-vec_d_sin2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_s_sinf4_vsx.c += -mabi=altivec -maltivec -mvsx

>  CFLAGS-vec_d_sincos2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_s_sincosf4_vsx.c += -mabi=altivec -maltivec -mvsx

> +CFLAGS-vec_s_expf4_vsx.c += -mabi=altivec -maltivec -mvsx

> +CFLAGS-vec_s_exp2f_data.c += -mabi=altivec -maltivec -mvsx

>  endif

>

>  # Variables for libmvec tests.

>  ifeq ($(subdir),math)

>  ifeq ($(build-mathvec),yes)

>  libmvec-tests += double-vlen2 float-vlen4

>

>  double-vlen2-funcs = cos sin sincos log

> -float-vlen4-funcs = cos sin sincos log

> +float-vlen4-funcs = cos sin sincos log exp

>

>  double-vlen2-arch-ext-cflags = -mabi=altivec -maltivec -mvsx -DREQUIRE_VSX

>  float-vlen4-arch-ext-cflags = -mabi=altivec -maltivec -mvsx -DREQUIRE_VSX

>

>  CFLAGS-test-double-vlen2-wrappers.c += -mabi=altivec -maltivec -mvsx -DREQUIRE_VSX

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h

> index b5fbe02d91..18f34197e2 100644

> --- a/sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h

> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h

> @@ -46,20 +46,42 @@

>

>  #if TOINT_INTRINSICS

>  /* Round x to nearest int in all rounding modes, ties have to be rounded

>     consistently with converttoint so the results match.  If the result

>     would be outside of [-2^31, 2^31-1] then the semantics is unspecified.  */

> -static inline double_t

> -roundtoint (double_t x);

> +static inline vector double

> +roundtoint (vector double x);

>

>  /* Convert x to nearest int in all rounding modes, ties have to be rounded

>     consistently with roundtoint.  If the result is not representible in an

>     int32_t then the semantics is unspecified.  */

> -static inline int32_t

> -converttoint (double_t x);

> +static inline vector unsigned long long

> +converttoint (vector double x);

>  #endif

>

> +static inline vector unsigned

> +vasuint (vector float f)

> +{

> +  union

> +  {

> +    vector float f;

> +    vector unsigned i;

> +  } u = {f};

> +  return u.i;

> +}

> +

> +static inline vector float

> +vasfloat (vector unsigned i)

> +{

> +  union

> +  {

> +    vector unsigned i;

> +    vector float f;

> +  } u = {i};

> +  return u.f;

> +}

> +

>  static inline uint32_t

>  asuint (float f)

>  {

>    union

>    {

> @@ -78,10 +100,32 @@ asfloat (uint32_t i)

>      float f;

>    } u = {i};

>    return u.f;

>  }

>

> +static inline vector unsigned long long

> +vasuint64 (vector double f)

> +{

> +  union

> +  {

> +    vector double f;

> +    vector unsigned long long i;

> +  } u = {f};

> +  return u.i;

> +}

> +

> +static inline vector double

> +vasdouble (vector unsigned long long i)

> +{

> +  union

> +  {

> +    vector unsigned long long i;

> +    vector double f;

> +  } u = {i};

> +  return u.f;

> +}

> +

>  static inline uint64_t

>  asuint64 (double f)

>  {

>    union

>    {

> @@ -122,16 +166,16 @@ attribute_hidden float __math_invalidf (float);

>  /* Shared between expf, exp2f and powf.  */

>  #define EXP2F_TABLE_BITS 5

>  #define EXP2F_POLY_ORDER 3

>  extern const struct exp2f_data

>  {

> -  uint64_t tab[1 << EXP2F_TABLE_BITS];

> -  double shift_scaled;

> -  double poly[EXP2F_POLY_ORDER];

> -  double shift;

> -  double invln2_scaled;

> -  double poly_scaled[EXP2F_POLY_ORDER];

> +  long long unsigned tab[1 << EXP2F_TABLE_BITS];

> +  vector double shift_scaled;

> +  vector double poly[EXP2F_POLY_ORDER];

> +  vector double shift;

> +  vector double invln2_scaled;

> +  vector double poly_scaled[EXP2F_POLY_ORDER];

>  } __exp2f_data attribute_hidden;

>

>  #define LOGF_TABLE_BITS 4

>  #define LOGF_POLY_ORDER 4

>  extern const struct logf_data

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c

> index b4721b1dac..37886d1a45 100644

> --- a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c

> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c

> @@ -22,6 +22,7 @@

>  #define VEC_TYPE vector float

>

>  VECTOR_WRAPPER (WRAPPER_NAME (cosf), _ZGVbN4v_cosf)

>  VECTOR_WRAPPER (WRAPPER_NAME (sinf), _ZGVbN4v_sinf)

>  VECTOR_WRAPPER (WRAPPER_NAME (logf), _ZGVbN4v_logf)

> +VECTOR_WRAPPER (WRAPPER_NAME (expf), _ZGVbN4v_expf)

>  VECTOR_WRAPPER_fFF (WRAPPER_NAME (sincosf), _ZGVbN4vvv_sincosf)

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c

> new file mode 100644

> index 0000000000..22b3a62fcf

> --- /dev/null

> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c

> @@ -0,0 +1,39 @@

> +/* Single-precision vector math error handling.

> +   Copyright 2019 Free Software Foundation, Inc.

> +   This file is part of the GNU C Library.

> +

> +   The GNU C Library is free software; you can redistribute it and/or

> +   modify it under the terms of the GNU Lesser General Public

> +   License as published by the Free Software Foundation; either

> +   version 2.1 of the License, or (at your option) any later version.

> +

> +   The GNU C Library is distributed in the hope that it will be useful,

> +   but WITHOUT ANY WARRANTY; without even the implied warranty of

> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU

> +   Lesser General Public License for more details.

> +

> +   You should have received a copy of the GNU Lesser General Public

> +   License along with the GNU C Library; if not, see

> +   <http://www.gnu.org/licenses/>.  */

> +

> +#include "math_config_flt.h"

> +/* NOINLINE prevents fenv semantics breaking optimizations.  */

> +NOINLINE static float

> +xflowf (uint32_t sign, float y)

> +{

> +  y = (sign ? -y : y) * y;

> +  return y;

> +}

> +

> +attribute_hidden float

> +__math_uflowf (uint32_t sign)

> +{

> +  return xflowf (sign, 0x1p-95f);

> +}

> +

> +attribute_hidden float

> +__math_oflowf (uint32_t sign)

> +{

> +  return xflowf (sign, 0x1p97f);

> +}

> +

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c

> new file mode 100644

> index 0000000000..b2232713a9

> --- /dev/null

> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c

> @@ -0,0 +1,58 @@

> +/* Data for exp, rearranged for vector operations

> +   Copyright (C) 2019 Free Software Foundation, Inc.

> +   This file is part of the GNU C Library.

> +

> +   The GNU C Library is free software; you can redistribute it and/or

> +   modify it under the terms of the GNU Lesser General Public

> +   License as published by the Free Software Foundation; either

> +   version 2.1 of the License, or (at your option) any later version.

> +

> +   The GNU C Library is distributed in the hope that it will be useful,

> +   but WITHOUT ANY WARRANTY; without even the implied warranty of

> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU

> +   Lesser General Public License for more details.

> +

> +   You should have received a copy of the GNU Lesser General Public

> +   License along with the GNU C Library; if not, see

> +   <http://www.gnu.org/licenses/>.  */

> +

> +#include "math_config_flt.h"

> +

> +#define N (1 << EXP2F_TABLE_BITS)

> +

> +const struct exp2f_data __exp2f_data = {

> +  /* tab[i] = uint(2^(i/N)) - (i << 52-BITS)

> +     used for computing 2^(k/N) for an int |k| < 150 N as

> +     double(tab[k%N] + (k << 52-BITS)) */

> +  .tab = {

> +0x3ff0000000000000, 0x3fefd9b0d3158574, 0x3fefb5586cf9890f, 0x3fef9301d0125b51,

> +0x3fef72b83c7d517b, 0x3fef54873168b9aa, 0x3fef387a6e756238, 0x3fef1e9df51fdee1,

> +0x3fef06fe0a31b715, 0x3feef1a7373aa9cb, 0x3feedea64c123422, 0x3feece086061892d,

> +0x3feebfdad5362a27, 0x3feeb42b569d4f82, 0x3feeab07dd485429, 0x3feea47eb03a5585,

> +0x3feea09e667f3bcd, 0x3fee9f75e8ec5f74, 0x3feea11473eb0187, 0x3feea589994cce13,

> +0x3feeace5422aa0db, 0x3feeb737b0cdc5e5, 0x3feec49182a3f090, 0x3feed503b23e255d,

> +0x3feee89f995ad3ad, 0x3feeff76f2fb5e47, 0x3fef199bdd85529c, 0x3fef3720dcef9069,

> +0x3fef5818dcfba487, 0x3fef7c97337b9b5f, 0x3fefa4afa2a490da, 0x3fefd0765b6e4540,

> +  },

> +  .shift_scaled = {

> +    0x1.8p+52 / N,

> +    0x1.8p+52 / N },

> +  .poly = {

> +  {0x1.c6af84b912394p-5, 0x1.c6af84b912394p-5},

> +  {0x1.ebfce50fac4f3p-3, 0x1.ebfce50fac4f3p-3},

> +  {0x1.62e42ff0c52d6p-1, 0x1.62e42ff0c52d6p-1},

> +  },

> +  .shift = {

> +    0x1.8p+52,

> +    0x1.8p+52,

> +  },

> +  .invln2_scaled = {

> +    0x1.71547652b82fep+0 * N,

> +    0x1.71547652b82fep+0 * N,

> +  },

> +  .poly_scaled = {

> +  {0x1.c6af84b912394p-5/N/N/N, 0x1.c6af84b912394p-5/N/N/N},

> +  {0x1.ebfce50fac4f3p-3/N/N, 0x1.ebfce50fac4f3p-3/N/N},

> +  {0x1.62e42ff0c52d6p-1/N, 0x1.62e42ff0c52d6p-1/N},

> +  },

> +};

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c

> new file mode 100644

> index 0000000000..d118428a66

> --- /dev/null

> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c

> @@ -0,0 +1,159 @@

> +/* Single-precision vector expf function.

> +   Copyright (C) 2019 Free Software Foundation, Inc.

> +   This file is part of the GNU C Library.

> +

> +   The GNU C Library is free software; you can redistribute it and/or

> +   modify it under the terms of the GNU Lesser General Public

> +   License as published by the Free Software Foundation; either

> +   version 2.1 of the License, or (at your option) any later version.

> +

> +   The GNU C Library is distributed in the hope that it will be useful,

> +   but WITHOUT ANY WARRANTY; without even the implied warranty of

> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU

> +   Lesser General Public License for more details.

> +

> +   You should have received a copy of the GNU Lesser General Public

> +   License along with the GNU C Library; if not, see

> +   <http://www.gnu.org/licenses/>.  */

> +/* Based off ./sysdeps/ieee754/flt-32/e_expf.c

> +   which was contributed by Szabolcs Nagy of ARM Ltd.  */

> +#include <altivec.h>

> +#include <math.h>

> +

> +#include "math_config_flt.h"

> +

> +typedef vector long long unsigned v64u;

> +typedef union {

> +  vector unsigned u;

> +  vector float f;

> +  v64u l;

> +  vector double d;

> +} u;

> +typedef union {

> +  double d;

> +  int64_t l;

> +  unsigned u;

> +  float f;

> +} us;

> +

> +#define N (1 << EXP2F_TABLE_BITS)

> +#define InvLn2N __exp2f_data.invln2_scaled

> +#define T __exp2f_data.tab

> +#define C __exp2f_data.poly_scaled

> +

> +vector float

> +_ZGVbN4v_expf (vector float x)

> +{

> +  u res;

> +  u xu;

> +  xu.f = x;

> +  us c88;

> +  c88.f = 88.0f;

> +  us inf;

> +  inf.f = INFINITY;

> +  us ninf;

> +  ninf.f = -INFINITY;

> +  vector unsigned constants = {(c88.u & 0xfff00000) << 1,

> +                               ninf.u, inf.u, 0};

> +  vector float constants2 = {0x1.62e42ep6f, -0x1.9fe368p6f, 0, 0};

> +  vector unsigned zero = {0, 0, 0, 0};

> +  vector unsigned v88 = vec_splat (constants, 0);

> +  vector unsigned is_special_case

> +       = (vector unsigned) vec_cmpge (xu.u << 1, v88);

> +  vector unsigned is_special_not_covered = is_special_case;

> +  if (__glibc_unlikely (!vec_all_eq (is_special_case, zero)))

> +    {

> +      vector unsigned inf = vec_splat (constants, 2);

> +      vector unsigned is_inf_or_ninf_or_nan

> +       = (vector unsigned) vec_cmpge (xu.u, inf);

> +      u xpx;

> +      xpx.f = x + x;

> +      res.u = xpx.u;

> +      is_special_not_covered &= ~is_inf_or_ninf_or_nan;

> +      vector unsigned ninf = vec_splat (constants, 1);

> +      vector unsigned is_ninf

> +       = (vector unsigned) vec_cmpeq (xu.u, ninf);

> +      res.u = vec_sel (res.u, zero, is_ninf);

> +

> +      vector float overflow_v = vec_splat (constants2, 0);

> +      vector unsigned is_overflow

> +       = (vector unsigned) vec_cmpgt (xu.f, overflow_v);

> +      if (__glibc_unlikely (!vec_all_eq (is_overflow, zero)))

> +         {

> +         // This branch is because we are generating an

> +         // overflow fp flag

> +         us ofu;

> +         ofu.f = __math_oflowf (0);

> +         vector unsigned of = {ofu.u, ofu.u, ofu.u, ofu.u};

> +         res.u = vec_sel (res.u, of, is_overflow);

> +         is_special_not_covered &= ~is_overflow;

> +       }

> +      vector float underflow_v = vec_splat (constants2, 1);

> +      vector unsigned is_underflow

> +       = (vector unsigned) vec_cmplt (xu.f, underflow_v);

> +      if (__glibc_unlikely (!vec_all_eq (is_underflow, zero)))

> +       {

> +         // This branch is because we are generating an

> +         // underflow fp flag

> +         us ufu;

> +         ufu.f = __math_uflowf (0);

> +         vector unsigned uf = {ufu.u, ufu.u, ufu.u, ufu.u};

> +         res.u = vec_sel (res.u, uf, is_underflow);

> +         is_special_not_covered &= ~is_underflow;

> +       }

> +     }

> +

> +#if __GNUC__ >= 8

|| __clang_major__ >= 6
https://reviews.llvm.org/rG41e14c4dfa63c45271cb42368d09b5a94ee16e4f#change-sAqzIEx8bz0L
> +  vector double xl = vec_unpackh (x);

> +  vector double xr = vec_unpackl (x);

> +#else

> +  vector double xl = {x[0], x[1]};

> +  vector double xr = {x[2], x[3]};

> +#endif

> +  vector double zl = InvLn2N * xl;

> +  vector double zr = InvLn2N * xr;

> +#if TOINT_INTRINSICS

> +  vector double kdl = roundtoint (zl);

> +  vector double kdl = roundtoint (zl);

> +  v64u kil = converttoint (zl);

> +  v64u kir = converttoint (zr);

> +#else

> +#define SHIFT __exp2f_data.shift

> +  vector double kdl = zl + SHIFT;

> +  vector double kdr = zr + SHIFT;

> +  u kilu;

> +  kilu.d = kdl;

> +  u kiru;

> +  kiru.d = kdr;

> +  v64u kil = kilu.l;

> +  v64u kir = kiru.l;

> +  kdl -= SHIFT;

> +  kdr -= SHIFT;

> +#endif

> +  vector double rl = zl - kdl;

> +  vector double rr = zr - kdr;

> +

> +  v64u tl = {T[kil[0] % N], T[kil[1] % N]};

> +  v64u tr = {T[kir[0] % N], T[kir[1] % N]};

> +  tl += (kil << (52 - EXP2F_TABLE_BITS));

> +  tr += (kir << (52 - EXP2F_TABLE_BITS));

> +  u slu;

> +  slu.l = tl;

> +  u sru;

> +  sru.l = tr;

> +  // This cast is obnoxious, but there is no vec_ld for double

> +  zl = C[0] * rl + C[1];

> +  zr = C[0] * rr + C[1];

> +  vector double r2l = rl * rl;

> +  vector double r2r = rr * rr;

> +  vector double yl = C[2] * rl + 1;

> +  vector double yr = C[2] * rr + 1;

> +  yl = zl * r2l + yl;

> +  yr = zr * r2r + yr;

> +  yl = yl * slu.d;

> +  yr = yr * sru.d;

> +  vector float restmp = {(float) yl[0], (float) yl[1],

> +       (float) yr[0], (float) yr[1]};

> +  return vec_sel (restmp, res.f,

> +       is_special_case & ~is_special_not_covered);

> +}

> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist

> index bb9f0beea4..63770c8da3 100644

> --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist

> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist

> @@ -4,5 +4,6 @@ GLIBC_2.30 _ZGVbN2v_sin F

>  GLIBC_2.30 _ZGVbN2vvv_sincos F

>  GLIBC_2.30 _ZGVbN4v_cosf F

>  GLIBC_2.30 _ZGVbN4v_logf F

>  GLIBC_2.30 _ZGVbN4v_sinf F

>  GLIBC_2.30 _ZGVbN4vvv_sincosf F

> +GLIBC_2.30 _ZGVbN4v_expf F

> --

> 2.20.1

>
Tulio Magno Quites Machado Filho May 28, 2019, 6:20 p.m. | #2
Shawn Landden <shawn@git.icu> writes:

> 2019-05-11  Shawn Landden  <shawn@git.icu>

>

> 	[BZ #24209]

>         * NEWS: Noted the addition of PPC64 vector expf function.

>         * sysdeps/powerpc/bits/math-vector.h: Added entry for vector expf.

>         * sysdeps/powerpc/powerpc64/fpu/Versions: Added vector expf entry.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile:

>         (libmvec-sysdep_routines, CFLAGS-vec_s_expf4_vsx.c):

>         (CFLAGS-vec_s_exp2f_data.c): Added build of VSX SIMD expf

>         function and tests. Added vec_math_errf.c to build.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h: Modified for expf.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c:

>         Added entry for vector expf.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c: New file.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c: Likewise.

>         * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c: Likewise.

>         * sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: SIMD expf

>         added.


These entries are still indented with spaces.  Fixed.

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile

> index a4f0d1694e..ad3c29b1ab 100644

> --- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile

> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile

> @@ -46,10 +46,12 @@ endif

>  ifeq ($(subdir),mathvec)

>  libmvec-sysdep_routines += vec_d_cos2_vsx vec_s_cosf4_vsx \

>  			   vec_d_sin2_vsx vec_s_sinf4_vsx \

>  			   vec_d_log2_vsx vec_d_log_data \

>  			   vec_s_logf4_vsx vec_s_logf_data \

> +			   vec_s_expf4_vsx vec_s_exp2f_data \

> +			   vec_math_errf \

>  			   vec_d_sincos2_vsx vec_s_sincosf4_vsx

>  CFLAGS-vec_d_cos2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_d_log2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_d_log_data.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_s_cosf4_vsx.c += -mabi=altivec -maltivec -mvsx

> @@ -57,19 +59,21 @@ CFLAGS-vec_s_logf4_vsx.c += -mabi=altivec -maltivec -mvsx

>  CFLAGS-vec_s_logf_data.c += -mabi=altivec -maltivec -mvsx

>  CFLAGS-vec_d_sin2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_s_sinf4_vsx.c += -mabi=altivec -maltivec -mvsx

>  CFLAGS-vec_d_sincos2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector

>  CFLAGS-vec_s_sincosf4_vsx.c += -mabi=altivec -maltivec -mvsx

> +CFLAGS-vec_s_expf4_vsx.c += -mabi=altivec -maltivec -mvsx

> +CFLAGS-vec_s_exp2f_data.c += -mabi=altivec -maltivec -mvsx


vec_math_errf.c indirectly depends on altivec.h, which requires to use the same
flags.

Fixed locally.

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c

> new file mode 100644

> index 0000000000..22b3a62fcf

> --- /dev/null

> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c

> @@ -0,0 +1,39 @@

> ...

> +attribute_hidden float

> +__math_oflowf (uint32_t sign)

> +{

> +  return xflowf (sign, 0x1p97f);

> +}

> +


Extra line at EOF here.  Fixed.

I'm merging this patch with the fixes I mentioned to branch tuliom/libmvec.

Thanks!

-- 
Tulio Magno

Patch

diff --git a/NEWS b/NEWS
index 6f9af0e24e..26f5c67f75 100644
--- a/NEWS
+++ b/NEWS
@@ -18,10 +18,11 @@  Major new features:
   - single-precision logarithm: logf
   - double-precision sine: sin
   - single-precision sine: sinf
   - double-precision sincos: sincos
   - single-precision sincos: sincosf
+  - single-precision exp: expf
 
   GCC support for auto-vectorization of functions on PPC64 is not yet
   available. Until that is done, the new vector math functions are
   inaccessible to applications.
 
diff --git a/sysdeps/powerpc/bits/math-vector.h b/sysdeps/powerpc/bits/math-vector.h
index 6dc62480fd..bbc9db0dd9 100644
--- a/sysdeps/powerpc/bits/math-vector.h
+++ b/sysdeps/powerpc/bits/math-vector.h
@@ -42,8 +42,10 @@ 
 #  define __DECL_SIMD_logf __DECL_SIMD_PPC64
 #  undef __DECL_SIMD_sincos
 #  define __DECL_SIMD_sincos __DECL_SIMD_PPC64
 #  undef __DECL_SIMD_sincosf
 #  define __DECL_SIMD_sincosf __DECL_SIMD_PPC64
+#  undef __DECL_SIMD_expf
+#  define __DECL_SIMD_expf __DECL_SIMD_PPC64
 
 # endif
 #endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions
index c083d9f938..7c45e76074 100644
--- a/sysdeps/powerpc/powerpc64/fpu/Versions
+++ b/sysdeps/powerpc/powerpc64/fpu/Versions
@@ -1,6 +1,7 @@ 
 libmvec {
   GLIBC_2.30 {
     _ZGVbN2v_cos; _ZGVbN4v_cosf; _ZGVbN2v_sin; _ZGVbN4v_sinf;
     _ZGVbN2v_log; _ZGVbN4v_logf; _ZGVbN2vvv_sincos; _ZGVbN4vvv_sincosf;
+    _ZGVbN4v_expf;
   }
 }
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
index a4f0d1694e..ad3c29b1ab 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
@@ -46,10 +46,12 @@  endif
 ifeq ($(subdir),mathvec)
 libmvec-sysdep_routines += vec_d_cos2_vsx vec_s_cosf4_vsx \
 			   vec_d_sin2_vsx vec_s_sinf4_vsx \
 			   vec_d_log2_vsx vec_d_log_data \
 			   vec_s_logf4_vsx vec_s_logf_data \
+			   vec_s_expf4_vsx vec_s_exp2f_data \
+			   vec_math_errf \
 			   vec_d_sincos2_vsx vec_s_sincosf4_vsx
 CFLAGS-vec_d_cos2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector
 CFLAGS-vec_d_log2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector
 CFLAGS-vec_d_log_data.c += -mabi=altivec -maltivec -mvsx -mpower8-vector
 CFLAGS-vec_s_cosf4_vsx.c += -mabi=altivec -maltivec -mvsx
@@ -57,19 +59,21 @@  CFLAGS-vec_s_logf4_vsx.c += -mabi=altivec -maltivec -mvsx
 CFLAGS-vec_s_logf_data.c += -mabi=altivec -maltivec -mvsx
 CFLAGS-vec_d_sin2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector
 CFLAGS-vec_s_sinf4_vsx.c += -mabi=altivec -maltivec -mvsx
 CFLAGS-vec_d_sincos2_vsx.c += -mabi=altivec -maltivec -mvsx -mpower8-vector
 CFLAGS-vec_s_sincosf4_vsx.c += -mabi=altivec -maltivec -mvsx
+CFLAGS-vec_s_expf4_vsx.c += -mabi=altivec -maltivec -mvsx
+CFLAGS-vec_s_exp2f_data.c += -mabi=altivec -maltivec -mvsx
 endif
 
 # Variables for libmvec tests.
 ifeq ($(subdir),math)
 ifeq ($(build-mathvec),yes)
 libmvec-tests += double-vlen2 float-vlen4
 
 double-vlen2-funcs = cos sin sincos log
-float-vlen4-funcs = cos sin sincos log
+float-vlen4-funcs = cos sin sincos log exp
 
 double-vlen2-arch-ext-cflags = -mabi=altivec -maltivec -mvsx -DREQUIRE_VSX
 float-vlen4-arch-ext-cflags = -mabi=altivec -maltivec -mvsx -DREQUIRE_VSX
 
 CFLAGS-test-double-vlen2-wrappers.c += -mabi=altivec -maltivec -mvsx -DREQUIRE_VSX
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h
index b5fbe02d91..18f34197e2 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/math_config_flt.h
@@ -46,20 +46,42 @@ 
 
 #if TOINT_INTRINSICS
 /* Round x to nearest int in all rounding modes, ties have to be rounded
    consistently with converttoint so the results match.  If the result
    would be outside of [-2^31, 2^31-1] then the semantics is unspecified.  */
-static inline double_t
-roundtoint (double_t x);
+static inline vector double
+roundtoint (vector double x);
 
 /* Convert x to nearest int in all rounding modes, ties have to be rounded
    consistently with roundtoint.  If the result is not representible in an
    int32_t then the semantics is unspecified.  */
-static inline int32_t
-converttoint (double_t x);
+static inline vector unsigned long long
+converttoint (vector double x);
 #endif
 
+static inline vector unsigned
+vasuint (vector float f)
+{
+  union
+  {
+    vector float f;
+    vector unsigned i;
+  } u = {f};
+  return u.i;
+}
+
+static inline vector float
+vasfloat (vector unsigned i)
+{
+  union
+  {
+    vector unsigned i;
+    vector float f;
+  } u = {i};
+  return u.f;
+}
+
 static inline uint32_t
 asuint (float f)
 {
   union
   {
@@ -78,10 +100,32 @@  asfloat (uint32_t i)
     float f;
   } u = {i};
   return u.f;
 }
 
+static inline vector unsigned long long
+vasuint64 (vector double f)
+{
+  union
+  {
+    vector double f;
+    vector unsigned long long i;
+  } u = {f};
+  return u.i;
+}
+
+static inline vector double
+vasdouble (vector unsigned long long i)
+{
+  union
+  {
+    vector unsigned long long i;
+    vector double f;
+  } u = {i};
+  return u.f;
+}
+
 static inline uint64_t
 asuint64 (double f)
 {
   union
   {
@@ -122,16 +166,16 @@  attribute_hidden float __math_invalidf (float);
 /* Shared between expf, exp2f and powf.  */
 #define EXP2F_TABLE_BITS 5
 #define EXP2F_POLY_ORDER 3
 extern const struct exp2f_data
 {
-  uint64_t tab[1 << EXP2F_TABLE_BITS];
-  double shift_scaled;
-  double poly[EXP2F_POLY_ORDER];
-  double shift;
-  double invln2_scaled;
-  double poly_scaled[EXP2F_POLY_ORDER];
+  long long unsigned tab[1 << EXP2F_TABLE_BITS];
+  vector double shift_scaled;
+  vector double poly[EXP2F_POLY_ORDER];
+  vector double shift;
+  vector double invln2_scaled;
+  vector double poly_scaled[EXP2F_POLY_ORDER];
 } __exp2f_data attribute_hidden;
 
 #define LOGF_TABLE_BITS 4
 #define LOGF_POLY_ORDER 4
 extern const struct logf_data
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c
index b4721b1dac..37886d1a45 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-float-vlen4-wrappers.c
@@ -22,6 +22,7 @@ 
 #define VEC_TYPE vector float
 
 VECTOR_WRAPPER (WRAPPER_NAME (cosf), _ZGVbN4v_cosf)
 VECTOR_WRAPPER (WRAPPER_NAME (sinf), _ZGVbN4v_sinf)
 VECTOR_WRAPPER (WRAPPER_NAME (logf), _ZGVbN4v_logf)
+VECTOR_WRAPPER (WRAPPER_NAME (expf), _ZGVbN4v_expf)
 VECTOR_WRAPPER_fFF (WRAPPER_NAME (sincosf), _ZGVbN4vvv_sincosf)
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c
new file mode 100644
index 0000000000..22b3a62fcf
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_math_errf.c
@@ -0,0 +1,39 @@ 
+/* Single-precision vector math error handling.
+   Copyright 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "math_config_flt.h"
+/* NOINLINE prevents fenv semantics breaking optimizations.  */
+NOINLINE static float
+xflowf (uint32_t sign, float y)
+{
+  y = (sign ? -y : y) * y;
+  return y;
+}
+
+attribute_hidden float
+__math_uflowf (uint32_t sign)
+{
+  return xflowf (sign, 0x1p-95f);
+}
+
+attribute_hidden float
+__math_oflowf (uint32_t sign)
+{
+  return xflowf (sign, 0x1p97f);
+}
+
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c
new file mode 100644
index 0000000000..b2232713a9
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_exp2f_data.c
@@ -0,0 +1,58 @@ 
+/* Data for exp, rearranged for vector operations
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "math_config_flt.h"
+
+#define N (1 << EXP2F_TABLE_BITS)
+
+const struct exp2f_data __exp2f_data = {
+  /* tab[i] = uint(2^(i/N)) - (i << 52-BITS)
+     used for computing 2^(k/N) for an int |k| < 150 N as
+     double(tab[k%N] + (k << 52-BITS)) */
+  .tab = {
+0x3ff0000000000000, 0x3fefd9b0d3158574, 0x3fefb5586cf9890f, 0x3fef9301d0125b51,
+0x3fef72b83c7d517b, 0x3fef54873168b9aa, 0x3fef387a6e756238, 0x3fef1e9df51fdee1,
+0x3fef06fe0a31b715, 0x3feef1a7373aa9cb, 0x3feedea64c123422, 0x3feece086061892d,
+0x3feebfdad5362a27, 0x3feeb42b569d4f82, 0x3feeab07dd485429, 0x3feea47eb03a5585,
+0x3feea09e667f3bcd, 0x3fee9f75e8ec5f74, 0x3feea11473eb0187, 0x3feea589994cce13,
+0x3feeace5422aa0db, 0x3feeb737b0cdc5e5, 0x3feec49182a3f090, 0x3feed503b23e255d,
+0x3feee89f995ad3ad, 0x3feeff76f2fb5e47, 0x3fef199bdd85529c, 0x3fef3720dcef9069,
+0x3fef5818dcfba487, 0x3fef7c97337b9b5f, 0x3fefa4afa2a490da, 0x3fefd0765b6e4540,
+  },
+  .shift_scaled = {
+    0x1.8p+52 / N,
+    0x1.8p+52 / N },
+  .poly = {
+  {0x1.c6af84b912394p-5, 0x1.c6af84b912394p-5},
+  {0x1.ebfce50fac4f3p-3, 0x1.ebfce50fac4f3p-3},
+  {0x1.62e42ff0c52d6p-1, 0x1.62e42ff0c52d6p-1},
+  },
+  .shift = {
+    0x1.8p+52,
+    0x1.8p+52,
+  },
+  .invln2_scaled = {
+    0x1.71547652b82fep+0 * N,
+    0x1.71547652b82fep+0 * N,
+  },
+  .poly_scaled = {
+  {0x1.c6af84b912394p-5/N/N/N, 0x1.c6af84b912394p-5/N/N/N},
+  {0x1.ebfce50fac4f3p-3/N/N, 0x1.ebfce50fac4f3p-3/N/N},
+  {0x1.62e42ff0c52d6p-1/N, 0x1.62e42ff0c52d6p-1/N},
+  },
+};
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c
new file mode 100644
index 0000000000..d118428a66
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_s_expf4_vsx.c
@@ -0,0 +1,159 @@ 
+/* Single-precision vector expf function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+/* Based off ./sysdeps/ieee754/flt-32/e_expf.c
+   which was contributed by Szabolcs Nagy of ARM Ltd.  */
+#include <altivec.h>
+#include <math.h>
+
+#include "math_config_flt.h"
+
+typedef vector long long unsigned v64u;
+typedef union {
+  vector unsigned u;
+  vector float f;
+  v64u l;
+  vector double d;
+} u;
+typedef union {
+  double d;
+  int64_t l;
+  unsigned u;
+  float f;
+} us;
+
+#define N (1 << EXP2F_TABLE_BITS)
+#define InvLn2N __exp2f_data.invln2_scaled
+#define T __exp2f_data.tab
+#define C __exp2f_data.poly_scaled
+
+vector float
+_ZGVbN4v_expf (vector float x)
+{
+  u res;
+  u xu;
+  xu.f = x;
+  us c88;
+  c88.f = 88.0f;
+  us inf;
+  inf.f = INFINITY;
+  us ninf;
+  ninf.f = -INFINITY;
+  vector unsigned constants = {(c88.u & 0xfff00000) << 1,
+				ninf.u, inf.u, 0};
+  vector float constants2 = {0x1.62e42ep6f, -0x1.9fe368p6f, 0, 0};
+  vector unsigned zero = {0, 0, 0, 0};
+  vector unsigned v88 = vec_splat (constants, 0);
+  vector unsigned is_special_case
+	= (vector unsigned) vec_cmpge (xu.u << 1, v88);
+  vector unsigned is_special_not_covered = is_special_case;
+  if (__glibc_unlikely (!vec_all_eq (is_special_case, zero)))
+    {
+      vector unsigned inf = vec_splat (constants, 2);
+      vector unsigned is_inf_or_ninf_or_nan
+	= (vector unsigned) vec_cmpge (xu.u, inf);
+      u xpx;
+      xpx.f = x + x;
+      res.u = xpx.u;
+      is_special_not_covered &= ~is_inf_or_ninf_or_nan;
+      vector unsigned ninf = vec_splat (constants, 1);
+      vector unsigned is_ninf
+	= (vector unsigned) vec_cmpeq (xu.u, ninf);
+      res.u = vec_sel (res.u, zero, is_ninf);
+
+      vector float overflow_v = vec_splat (constants2, 0);
+      vector unsigned is_overflow
+	= (vector unsigned) vec_cmpgt (xu.f, overflow_v);
+      if (__glibc_unlikely (!vec_all_eq (is_overflow, zero)))
+	  {
+	  // This branch is because we are generating an
+	  // overflow fp flag
+	  us ofu;
+	  ofu.f = __math_oflowf (0);
+	  vector unsigned of = {ofu.u, ofu.u, ofu.u, ofu.u};
+	  res.u = vec_sel (res.u, of, is_overflow);
+	  is_special_not_covered &= ~is_overflow;
+	}
+      vector float underflow_v = vec_splat (constants2, 1);
+      vector unsigned is_underflow
+	= (vector unsigned) vec_cmplt (xu.f, underflow_v);
+      if (__glibc_unlikely (!vec_all_eq (is_underflow, zero)))
+	{
+	  // This branch is because we are generating an
+	  // underflow fp flag
+	  us ufu;
+	  ufu.f = __math_uflowf (0);
+	  vector unsigned uf = {ufu.u, ufu.u, ufu.u, ufu.u};
+	  res.u = vec_sel (res.u, uf, is_underflow);
+	  is_special_not_covered &= ~is_underflow;
+	}
+     }
+
+#if __GNUC__ >= 8
+  vector double xl = vec_unpackh (x);
+  vector double xr = vec_unpackl (x);
+#else
+  vector double xl = {x[0], x[1]};
+  vector double xr = {x[2], x[3]};
+#endif
+  vector double zl = InvLn2N * xl;
+  vector double zr = InvLn2N * xr;
+#if TOINT_INTRINSICS
+  vector double kdl = roundtoint (zl);
+  vector double kdl = roundtoint (zl);
+  v64u kil = converttoint (zl);
+  v64u kir = converttoint (zr);
+#else
+#define SHIFT __exp2f_data.shift
+  vector double kdl = zl + SHIFT;
+  vector double kdr = zr + SHIFT;
+  u kilu;
+  kilu.d = kdl;
+  u kiru;
+  kiru.d = kdr;
+  v64u kil = kilu.l;
+  v64u kir = kiru.l;
+  kdl -= SHIFT;
+  kdr -= SHIFT;
+#endif
+  vector double rl = zl - kdl;
+  vector double rr = zr - kdr;
+
+  v64u tl = {T[kil[0] % N], T[kil[1] % N]};
+  v64u tr = {T[kir[0] % N], T[kir[1] % N]};
+  tl += (kil << (52 - EXP2F_TABLE_BITS));
+  tr += (kir << (52 - EXP2F_TABLE_BITS));
+  u slu;
+  slu.l = tl;
+  u sru;
+  sru.l = tr;
+  // This cast is obnoxious, but there is no vec_ld for double
+  zl = C[0] * rl + C[1];
+  zr = C[0] * rr + C[1];
+  vector double r2l = rl * rl;
+  vector double r2r = rr * rr;
+  vector double yl = C[2] * rl + 1;
+  vector double yr = C[2] * rr + 1;
+  yl = zl * r2l + yl;
+  yr = zr * r2r + yr;
+  yl = yl * slu.d;
+  yr = yr * sru.d;
+  vector float restmp = {(float) yl[0], (float) yl[1],
+	(float) yr[0], (float) yr[1]};
+  return vec_sel (restmp, res.f,
+	is_special_case & ~is_special_not_covered);
+}
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
index bb9f0beea4..63770c8da3 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
@@ -4,5 +4,6 @@  GLIBC_2.30 _ZGVbN2v_sin F
 GLIBC_2.30 _ZGVbN2vvv_sincos F
 GLIBC_2.30 _ZGVbN4v_cosf F
 GLIBC_2.30 _ZGVbN4v_logf F
 GLIBC_2.30 _ZGVbN4v_sinf F
 GLIBC_2.30 _ZGVbN4vvv_sincosf F
+GLIBC_2.30 _ZGVbN4v_expf F