[v2,ARM,2/x] : MVE ACLE intrinsics framework patch.

Message ID DBBPR08MB4775BAFF92BE85AB99508C509B150@DBBPR08MB4775.eurprd08.prod.outlook.com
State Superseded

Commit Message

Srinath Parvathaneni Feb. 14, 2020, 4:34 p.m.
Hello Kyrill,

This patch (v2) addresses all the review comments on the previous
version (v1).

(v1) https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01395.html

#####################

Hello,

This patch is part of the MVE ACLE intrinsics framework.
It adds support for reading and writing the APSR (Application Program Status Register)
and the FPSCR (Floating-point Status and Control Register) for MVE.
It also enables the thumb2 mov RTL patterns for MVE.

A new feature bit, vfp_base, is added. It is enabled for all VFP, MVE and MVE with floating-point
extensions, and it is used to enable the macro TARGET_VFP_BASE. Until now, all the VFP instructions,
RTL patterns, and status and control registers were guarded by TARGET_HARD_FLOAT. With this patch, the
instructions, RTL patterns, and status and control registers common to MVE and VFP are guarded by the
TARGET_VFP_BASE macro instead.
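
For illustration only, the three conditions that TARGET_VFP_BASE combines in
arm.h can be modelled in plain C as below. This is a sketch, not compiler
code: struct target_state, enum float_abi and the function names are
hypothetical stand-ins for arm_float_abi, the isa bitmap and
TARGET_GENERAL_REGS_ONLY.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler's target state; these are not
   GCC's own data structures.  */
enum float_abi { FLOAT_ABI_SOFT, FLOAT_ABI_SOFTFP, FLOAT_ABI_HARD };

struct target_state
{
  enum float_abi abi;       /* Models arm_float_abi.  */
  bool has_vfp_base;        /* Models the vfp_base feature bit.  */
  bool general_regs_only;   /* Models TARGET_GENERAL_REGS_ONLY.  */
};

/* Mirrors the shape of TARGET_VFP_BASE: the float ABI is not soft, the
   vfp_base bit is set, and general-regs-only mode is off.  */
static bool
target_vfp_base (const struct target_state *t)
{
  return t->abi != FLOAT_ABI_SOFT
         && t->has_vfp_base
         && !t->general_regs_only;
}

/* Small self-check covering the cases discussed in the description.  */
static void
check_examples (void)
{
  struct target_state mve_softfp = { FLOAT_ABI_SOFTFP, true, false };
  struct target_state mve_soft = { FLOAT_ABI_SOFT, true, false };
  struct target_state no_fp = { FLOAT_ABI_HARD, false, false };

  assert (target_vfp_base (&mve_softfp));
  assert (!target_vfp_base (&mve_soft));
  assert (!target_vfp_base (&no_fp));
}
```

The point of the model is that integer-only MVE sets the vfp_base bit without
setting any vfpv2 bit, so it can pass this guard even where TARGET_HARD_FLOAT
would not hold.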

The RTL patterns set_fpscr and get_fpscr are updated to use VFPCC_REGNUM because a few MVE intrinsics
set/get the carry bit of the FPSCR register.
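
As a rough sketch of what "set/get the carry bit" means, the FPSCR condition
flags N, Z, C and V occupy bits 31 to 28, so the carry flag is bit 29. The
helpers below treat a saved FPSCR value as a plain uint32_t; the function
names are hypothetical and this is not the code the intrinsics expand to.

```c
#include <assert.h>
#include <stdint.h>

/* FPSCR condition flags are N, Z, C, V in bits [31:28]; carry is bit 29.  */
#define FPSCR_C_BIT 29u
#define FPSCR_C_MASK (UINT32_C (1) << FPSCR_C_BIT)

/* Return the carry flag (0 or 1) from a saved FPSCR value.  */
static unsigned
fpscr_get_carry (uint32_t fpscr)
{
  return (unsigned) ((fpscr & FPSCR_C_MASK) >> FPSCR_C_BIT);
}

/* Return FPSCR with the carry flag replaced and all other bits unchanged.  */
static uint32_t
fpscr_set_carry (uint32_t fpscr, unsigned carry)
{
  return (fpscr & ~FPSCR_C_MASK)
         | ((uint32_t) (carry & 1u) << FPSCR_C_BIT);
}

/* Small self-check of the two helpers.  */
static void
check_carry_helpers (void)
{
  assert (fpscr_get_carry (UINT32_C (0x20000000)) == 1);
  assert (fpscr_get_carry (UINT32_C (0xDFFFFFFF)) == 0);
  assert (fpscr_set_carry (0, 1) == UINT32_C (0x20000000));
}
```

Because such a read-modify-write depends on the current flag state, the
patterns need a register (VFPCC_REGNUM) through which the compiler can see
that dependency.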

Please refer to the Arm reference manual [1] for more details.
[1] https://developer.arm.com/docs/ddi0553/latest

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath

gcc/ChangeLog:

2020-02-11  Andre Vieira  <andre.simoesdiasvieira@arm.com>
	    Mihail Ionescu  <mihail.ionescu@arm.com>
	    Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

	* common/config/arm/arm-common.c (arm_asm_auto_mfpu): When vfp_base
	feature bit is on and -mfpu=auto is passed as compiler option, do not
	generate an error when no matching fpu is found, because in this case
	an fpu is not required.
	* config/arm/arm-cpus.in (vfp_base): Define feature bit, this bit is
	enabled for MVE and also for all VFP extensions.
	(VFPv2): Modify fgroup to enable vfp_base feature bit whenever VFPv2
	is enabled.
	(MVE): Define fgroup to enable feature bits mve, vfp_base and armv7em.
	(MVE_FP): Define fgroup to enable fgroups MVE and FPv5 along with
	feature bits fp16 and mve_float.
	(mve): Modify add options in armv8.1-m.main arch for MVE.
	(mve.fp): Modify add options in armv8.1-m.main arch for MVE with
	floating point.
	* config/arm/arm.c (use_return_insn): Replace the
	check with TARGET_VFP_BASE.
	(thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
	TARGET_VFP_BASE.
	(arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
	with TARGET_VFP_BASE, to allow cost calculations for copies in MVE as
	well.
	(arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
	TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE
	as well.
	(arm_compute_frame_layout): Likewise.
	(arm_save_coproc_regs): Likewise.
	(arm_fixed_condition_code_regs): Modify to enable using VFPCC_REGNUM
	in MVE as well.
	(arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
	with equivalent macro TARGET_VFP_BASE.
	(arm_expand_epilogue_apcs_frame): Likewise.
	(arm_expand_epilogue): Likewise.
	(arm_conditional_register_usage): Likewise.
	(arm_declare_function_name): Add check to skip printing .fpu directive
	in assembly file when TARGET_VFP_BASE is enabled and fpu_to_print is
	"softvfp".
	* config/arm/arm.h (TARGET_VFP_BASE): Define.
	* config/arm/arm.md (arch): Add "mve" to arch.
	(eq_attr "arch" "mve"): Enable when TARGET_HAVE_MVE is true.
	(vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT
	|| TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE.
	* config/arm/constraints.md (Uf): Define for MVE.
	* config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify target guard
	to not allow for MVE.
	* config/arm/unspecs.md (UNSPEC_GET_FPSCR): Define in the unspec
	enum.
	(VUNSPEC_GET_FPSCR): Remove from the volatile unspecs enum.
	* config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR and VMRS
	instructions which move to general-purpose Register from Floating-point
	Special register and vice-versa.
	(thumb2_movhi_fp16): Likewise.
	(thumb2_movsi_vfp): Add support for VMSR and VMRS instructions along
	with MCR and MRC instructions which set and get Floating-point Status
	and Control Register (FPSCR).
	(movdi_vfp): Modify pattern to enable Single-precision scalar float move
	in MVE.
	(thumb2_movdf_vfp): Modify pattern to enable Double-precision scalar
	float move patterns in MVE.
	(thumb2_movsfcc_vfp): Modify pattern to enable single float conditional
	code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
	(thumb2_movdfcc_vfp): Modify pattern to enable double float conditional
	code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
	(push_multi_vfp): Add support to use VFP VPUSH pattern for MVE by adding
	TARGET_VFP_BASE check.
	(set_fpscr): Add support to set FPSCR register for MVE. Modify pattern
	using VFPCC_REGNUM as a few MVE intrinsics use the carry bit of the
	FPSCR register.
	(get_fpscr): Add support to get FPSCR register for MVE. Modify pattern
	using VFPCC_REGNUM as a few MVE intrinsics use the carry bit of the
	FPSCR register.

gcc/testsuite/ChangeLog:

2020-02-11  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

	* gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: New test.
	* gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
	* gcc.target/arm/mve/intrinsics/mve_fpu1.c: Likewise.
	* gcc.target/arm/mve/intrinsics/mve_fpu2.c: Likewise.
	* gcc.target/arm/mve/intrinsics/mve_fpu3.c: Likewise.
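
The new tests check which .fpu directive reaches the assembly output. As a
rough illustration, the rule added to arm_declare_function_name can be
modelled as the predicate below. This is a sketch under stated assumptions:
should_print_fpu and its parameters are hypothetical names, and the "last
printed" state is passed in explicitly instead of living in a global.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decide whether a ".fpu" directive should be printed.
   FPU_NAME is the candidate name, LAST_PRINTED is the name printed
   previously (or NULL if none), and VFP_BASE models TARGET_VFP_BASE.  */
static bool
should_print_fpu (const char *fpu_name, const char *last_printed,
                  bool vfp_base)
{
  /* Skip "softvfp" when the base VFP/MVE register file is in use.  */
  if (strcmp (fpu_name, "softvfp") == 0 && vfp_base)
    return false;

  /* Otherwise print only when the name differs from the last one.  */
  return last_printed == NULL || strcmp (fpu_name, last_printed) != 0;
}

/* Small self-check matching the cases the tests exercise.  */
static void
check_fpu_rule (void)
{
  assert (!should_print_fpu ("softvfp", NULL, true));
  assert (should_print_fpu ("fpv5-sp-d16", NULL, true));
}
```

Under this model, integer-only MVE (mve_fpu1.c, mve_fpu2.c) never emits
".fpu softvfp", while MVE with floating point (mve_fp_fpu1.c, mve_fp_fpu2.c)
still emits ".fpu fpv5-sp-d16".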


###############     Attachment also inlined for ease of reply    ###############


>From 997855cd3551822ba74c7b3c5edb52d97dd0b1d3 Mon Sep 17 00:00:00 2001

From: Srinath Parvathaneni <srinath.parvathaneni@arm.com>

Date: Tue, 11 Feb 2020 17:29:32 +0000
Subject: [PATCH] [PATCH][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.

---
 gcc/common/config/arm/arm-common.c                 |   2 +-
 gcc/config/arm/arm-cpus.in                         |  14 ++-
 gcc/config/arm/arm.c                               |  29 ++---
 gcc/config/arm/arm.h                               |  13 +++
 gcc/config/arm/arm.md                              |   8 +-
 gcc/config/arm/constraints.md                      |   5 +-
 gcc/config/arm/thumb2.md                           |   2 +-
 gcc/config/arm/unspecs.md                          |   2 +-
 gcc/config/arm/vfp.md                              | 129 +++++++++++++--------
 .../gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c    |  14 +++
 .../gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c    |  14 +++
 .../gcc.target/arm/mve/intrinsics/mve_fpu1.c       |  14 +++
 .../gcc.target/arm/mve/intrinsics/mve_fpu2.c       |  14 +++
 .../gcc.target/arm/mve/intrinsics/mve_fpu3.c       |  12 ++
 14 files changed, 200 insertions(+), 72 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c

diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 30a2a1d..a465966 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -1009,7 +1009,7 @@ arm_asm_auto_mfpu (int argc, const char **argv)
 	    }
 	}
 
-      gcc_assert (i != TARGET_FPU_auto);
+      gcc_assert (i != TARGET_FPU_auto || isa_bit_vfp_base);
     }
 
   auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 96f584d..77b4309 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -135,6 +135,10 @@ define feature armv8_1m_main
 # Floating point and Neon extensions.
 # VFPv1 is not supported in GCC.
 
+# This feature bit is enabled for all VFP, MVE and
+# MVE with floating point extensions.
+define feature vfp_base
+
 # Vector floating point v2.
 define feature vfpv2
 
@@ -234,7 +238,7 @@ define fgroup ALL_SIMD	ALL_SIMD_INTERNAL ALL_SIMD_EXTERNAL
 
 # List of all FPU bits to strip out if -mfpu is used to override the
 # default.  fp16 is deliberately missing from this list.
-define fgroup ALL_FPU_INTERNAL	vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
+define fgroup ALL_FPU_INTERNAL	vfp_base vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
 # Similarly, but including fp16 and other extensions that aren't part of
 # -mfpu support.
 define fgroup ALL_FPU_EXTERNAL fp16 bf16
@@ -279,10 +283,12 @@ define fgroup ARMv8r      ARMv8a
 define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
 
 # Useful combinations.
-define fgroup VFPv2	vfpv2
+define fgroup VFPv2	vfp_base vfpv2
 define fgroup VFPv3	VFPv2 vfpv3
 define fgroup VFPv4	VFPv3 vfpv4 fp16conv
 define fgroup FPv5	VFPv4 fpv5
+define fgroup MVE      mve vfp_base armv7em
+define fgroup MVE_FP   MVE FPv5 fp16 mve_float
 
 define fgroup FP_DBL	fp_dbl
 define fgroup FP_D32	FP_DBL fp_d32
@@ -699,8 +705,8 @@ begin arch armv8.1-m.main
  option fp add FPv5 fp16
  option fp.dp add FPv5 FP_DBL fp16
  option nofp remove ALL_FP
- option mve add mve armv7em
- option mve.fp add mve FPv5 fp16 mve_float armv7em
+ option mve add MVE
+ option mve.fp add MVE_FP
 end arch armv8.1-m.main
 
 begin arch iwmmxt
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 10cf0e6..7e993b9 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -334,6 +334,19 @@ emission of floating point pcs attributes.  */
 						isa_bit_mve_float) \
 			       && !TARGET_GENERAL_REGS_ONLY)
 
+/* MVE shares a few instructions with VFP, such as VLDM (alias VPOP), VLDR,
+   VSTM (alias VPUSH), VSTR, VMOV, VMSR and VMRS.  Likewise, it updates a few
+   registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and MVFR2.  All
+   the VFP instructions, RTL patterns and registers are guarded by
+   TARGET_HARD_FLOAT.  But the instructions, RTL patterns and registers common
+   to MVE and VFP are guarded by the following macro, TARGET_VFP_BASE,
+   hereafter.  */
+
+#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
+			 && bitmap_bit_p (arm_active_target.isa, \
+					  isa_bit_vfp_base) \
+			 && !TARGET_GENERAL_REGS_ONLY)
+
 /* Nonzero if integer division instructions supported.  */
 #define TARGET_IDIV	((TARGET_ARM && arm_arch_arm_hwdiv)	\
 			 || (TARGET_THUMB && arm_arch_thumb_hwdiv))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3a95ea3..c28a475 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling)
 
   /* Can't be done if any of the VFP regs are pushed,
      since this also requires an insn.  */
-  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+  if (TARGET_VFP_BASE)
     for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
       if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p (regno))
 	return 0;
@@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool is_double)
     return false;
 
   return (TARGET_32BIT && TARGET_HARD_FLOAT &&
-	  (TARGET_VFP_DOUBLE || !is_double));
+	 (TARGET_VFP_DOUBLE || !is_double));
 }
 
 /* Return true if an argument whose type is TYPE, or mode is MODE, is
@@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode, rtx index, int strict_p)
 
   /* ??? Combine arm and thumb2 coprocessor addressing modes.  */
   /* Standard coprocessor addressing modes.  */
-  if (TARGET_HARD_FLOAT
+  if (TARGET_VFP_BASE
       && (mode == SFmode || mode == DFmode))
     return (code == CONST_INT && INTVAL (index) < 1024
 	    /* Thumb-2 allows only > -256 index range for it's core register
@@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	  /* Assume that most copies can be done with a single insn,
 	     unless we don't have HW FP, in which case everything
 	     larger than word mode will require two insns.  */
-	  *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+	  *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
 				   && GET_MODE_SIZE (mode) > 4)
 				  || mode == DImode)
 				 ? 2 : 1);
@@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void)
 
   saved = 0;
   /* Space for saved VFP registers.  */
-  if (TARGET_HARD_FLOAT)
+  if (TARGET_VFP_BASE)
     {
       count = 0;
       for (regno = FIRST_VFP_REGNUM;
@@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void)
       func_type = arm_current_func_type ();
       /* Space for saved VFP registers.  */
       if (! IS_VOLATILE (func_type)
-	  && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+	  && TARGET_VFP_BASE)
 	saved += arm_get_vfp_saved_size ();
 
       /* Allocate space for saving/restoring FPCXTNS in Armv8.1-M Mainline
@@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void)
 	saved_size += 8;
       }
 
-  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+  if (TARGET_VFP_BASE)
     {
       start_reg = FIRST_VFP_REGNUM;
 
@@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
     return false;
 
   *p1 = CC_REGNUM;
-  *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
+  *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
   return true;
 }
 
@@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
 {
   if (GET_MODE_CLASS (mode) == MODE_CC)
     return (regno == CC_REGNUM
-	    || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+	    || (TARGET_VFP_BASE
 		&& regno == VFPCC_REGNUM));
 
   if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
@@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
        start of an even numbered register pair.  */
     return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
 
-  if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
+  if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
     {
       if (mode == DFmode)
 	return VFP_REGNO_OK_FOR_DOUBLE (regno);
@@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool really_return)
         floats_from_frame += 4;
       }
 
-  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+  if (TARGET_VFP_BASE)
     {
       int start_reg;
       rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
@@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return)
         }
     }
 
-  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+  if (TARGET_VFP_BASE)
     {
       /* Generate VFP register multi-pop.  */
       int end_reg = LAST_VFP_REGNUM + 1;
@@ -29695,7 +29695,7 @@ arm_conditional_register_usage (void)
   if (TARGET_THUMB1)
     fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
 
-  if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+  if (TARGET_32BIT && TARGET_VFP_BASE)
     {
       /* VFPv3 registers are disabled when earlier VFP
 	 versions are selected due to the definition of
@@ -32470,7 +32470,8 @@ arm_declare_function_name (FILE *stream, const char *name, tree decl)
     = TARGET_SOFT_FLOAT
 	? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
 
-  if (fpu_to_print != arm_last_printed_arch_string)
+  if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
+      && (fpu_to_print != arm_last_printed_arch_string))
     {
       asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
       arm_last_printed_fpu_string = fpu_to_print;
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 3a12f18..dcafb71 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -129,7 +129,7 @@
 ; arm_arch6.  "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
 ; Baseline.  This attribute is used to compute attribute "enabled",
 ; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
   (const_string "any"))
 
 (define_attr "arch_enabled" "no,yes"
@@ -183,6 +183,10 @@
 	 (and (eq_attr "arch" "neon")
 	      (match_test "TARGET_NEON"))
 	 (const_string "yes")
+
+	 (and (eq_attr "arch" "mve")
+	      (match_test "TARGET_HAVE_MVE"))
+	 (const_string "yes")
 	]
 
 	(const_string "no")))
@@ -11744,7 +11748,7 @@
                    (match_operand:SI 2 "const_int_I_operand" "I")))
      (set (match_operand:DF 3 "vfp_hard_register_operand" "")
           (mem:DF (match_dup 1)))])]
-  "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
+  "TARGET_32BIT && TARGET_VFP_BASE"
   "*
   {
     int num_regs = XVECLEN (operands[0], 0);
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 3577fb9..0908c79 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -38,7 +38,7 @@
 ;; in all states: Pf, Pg
 
 ;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Uf
 ;; in ARM state: Uq
 ;; in Thumb state: Uu, Uw
 ;; in all states: Q
@@ -46,6 +46,9 @@
 (define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
   "MVE VPR register")
 
+(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
+  "MVE FPCCR register")
+
 (define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
  "The VFP registers @code{s0}-@code{s31}.")
 
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index b0d3bd1..793f670 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -517,7 +517,7 @@
 			  [(match_operand 4 "cc_register" "") (const_int 0)])
 			 (match_operand:SF 1 "s_register_operand" "0,r")
 			 (match_operand:SF 2 "s_register_operand" "r,0")))]
-  "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
+  "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
   "@
    it\\t%D3\;mov%D3\\t%0, %2
    it\\t%d3\;mov%d3\\t%0, %1"
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 8f4a705..73588fc 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -170,6 +170,7 @@
   UNSPEC_TORC		; Used by the intrinsic form of the iWMMXt TORC instruction.
   UNSPEC_TORVSC		; Used by the intrinsic form of the iWMMXt TORVSC instruction.
   UNSPEC_TEXTRC		; Used by the intrinsic form of the iWMMXt TEXTRC instruction.
+  UNSPEC_GET_FPSCR	; Represent fetch of FPSCR content.
 ])
 
 
@@ -216,7 +217,6 @@
   VUNSPEC_SLX		; Represent a store-register-release-exclusive.
   VUNSPEC_LDA		; Represent a store-register-acquire.
   VUNSPEC_STL		; Represent a store-register-release.
-  VUNSPEC_GET_FPSCR	; Represent fetch of FPSCR content.
   VUNSPEC_SET_FPSCR	; Represent assign of FPSCR content.
   VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
   VUNSPEC_CDP		; Represent the coprocessor cdp instruction.
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index ab16a6b..eb6ae7b 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -74,10 +74,10 @@
 (define_insn "*thumb2_movhi_vfp"
  [(set
    (match_operand:HI 0 "nonimmediate_operand"
-    "=rk, r, l, r, m, r, *t, r, *t")
+    "=rk, r, l, r, m, r, *t, r, *t, Up, r")
    (match_operand:HI 1 "general_operand"
-    "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+    "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && TARGET_VFP_BASE
   && !TARGET_VFP_FP16INST
   && (register_operand (operands[0], HImode)
        || register_operand (operands[1], HImode))"
@@ -99,20 +99,24 @@
       return "vmov%?\t%0, %1\t%@ int";
     case 8:
       return "vmov%?.f32\t%0, %1\t%@ int";
+    case 9:
+      return "vmsr%?\t P0, %1\t@ movhi";
+    case 10:
+      return "vmrs%?\t %0, P0\t@ movhi";
     default:
       gcc_unreachable ();
     }
 }
  [(set_attr "predicable" "yes")
   (set_attr "predicable_short_it"
-   "yes, no, yes, no, no, no, no, no, no")
+   "yes, no, yes, no, no, no, no, no, no, no, no")
   (set_attr "type"
    "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
-    f_mcr, f_mrc, fmov")
-  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
-  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
-  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
-  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+    f_mcr, f_mrc, fmov, mve_move, mve_move")
+  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
 )
 
 ;; Patterns for HI moves which provide more data transfer instructions when FP16
@@ -170,10 +174,10 @@
 (define_insn "*thumb2_movhi_fp16"
  [(set
    (match_operand:HI 0 "nonimmediate_operand"
-    "=rk, r, l, r, m, r, *t, r, *t")
+    "=rk, r, l, r, m, r, *t, r, *t, Up, r")
    (match_operand:HI 1 "general_operand"
-    "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_VFP_FP16INST
+    "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)
   && (register_operand (operands[0], HImode)
        || register_operand (operands[1], HImode))"
 {
@@ -194,21 +198,25 @@
       return "vmov.f16\t%0, %1\t%@ int";
     case 8:
       return "vmov%?.f32\t%0, %1\t%@ int";
+    case 9:
+      return "vmsr%?\tP0, %1\t%@ movhi";
+    case 10:
+      return "vmrs%?\t%0, P0\t%@ movhi";
     default:
       gcc_unreachable ();
     }
 }
  [(set_attr "predicable"
-   "yes, yes, yes, yes, yes, yes, no, no, yes")
+   "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")
   (set_attr "predicable_short_it"
-   "yes, no, yes, no, no, no, no, no, no")
+   "yes, no, yes, no, no, no, no, no, no, no, no")
   (set_attr "type"
    "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
-    f_mcr, f_mrc, fmov")
-  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
-  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
-  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
-  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+    f_mcr, f_mrc, fmov, mve_move, mve_move")
+  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
 )
 
 ;; SImode moves
@@ -258,9 +266,11 @@
 ;; is chosen with length 2 when the instruction is predicated for
 ;; arm_restrict_it.
 (define_insn "*thumb2_movsi_vfp"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t,  *Uv")
-	(match_operand:SI 1 "general_operand"	   "rk,I,Py,K,j,mi,lk*r, r,*t,*t,*UvTu,*t"))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,l,*hk,m,*m,*t,\
+						    r,*t,*t,*Uv, Up, r,Uf,r")
+	(match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\
+					       *t,*UvTu,*t, r, Up,r,Uf"))]
+  "TARGET_THUMB2 && TARGET_VFP_BASE
    && (   s_register_operand (operands[0], SImode)
        || s_register_operand (operands[1], SImode))"
   "*
@@ -275,30 +285,44 @@
     case 4:
       return \"movw%?\\t%0, %1\";
     case 5:
+    case 6:
       /* Cannot load it directly, split to load it via MOV / MOVT.  */
       if (!MEM_P (operands[1]) && arm_disable_literal_pool)
 	return \"#\";
       return \"ldr%?\\t%0, %1\";
-    case 6:
-      return \"str%?\\t%1, %0\";
     case 7:
-      return \"vmov%?\\t%0, %1\\t%@ int\";
     case 8:
-      return \"vmov%?\\t%0, %1\\t%@ int\";
+      return \"str%?\\t%1, %0\";
     case 9:
+      return \"vmov%?\\t%0, %1\\t%@ int\";
+    case 10:
+      return \"vmov%?\\t%0, %1\\t%@ int\";
+    case 11:
       return \"vmov%?.f32\\t%0, %1\\t%@ int\";
-    case 10: case 11:
+    case 12: case 13:
       return output_move_vfp (operands);
+    case 14:
+      return \"vmsr\\t P0, %1\";
+    case 15:
+      return \"vmrs\\t %0, P0\";
+    case 16:
+      return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";
+    case 17:
+      return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";
     default:
       gcc_unreachable ();
     }
   "
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no")
-   (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")
-   (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")
-   (set_attr "pool_range"     "*,*,*,*,*,1018,*,*,*,*,1018,*")
-   (set_attr "neg_pool_range" "*,*,*,*,*,   0,*,*,*,*,1008,*")]
+   (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\
+	      no,no,no,no,no")
+   (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\
+	     store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\
+	     mve_move,mrs,mrs")
+   (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")
+   (set_attr "pool_range"     "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")
+   (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")
+   (set_attr "neg_pool_range" "*,*,*,*,*,   0,   0,*,*,*,*,*,1008,*,*,*,*,*")]
 )
 
 
@@ -306,12 +330,12 @@
 
 (define_insn "*movdi_vfp"
   [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,r,r,m,w,!r,w,w, Uv")
-	(match_operand:DI 1 "di_operand"	      "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
-  "TARGET_32BIT && TARGET_HARD_FLOAT
+	(match_operand:DI 1 "di_operand"       "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
+  "TARGET_32BIT && TARGET_VFP_BASE
    && (   register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode))
-   && !(TARGET_NEON && CONST_INT_P (operands[1])
-	&& simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
+   && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])
+       && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
   "*
   switch (which_alternative)
     {
@@ -333,7 +357,7 @@
     case 8:
       return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
     case 9:
-      if (TARGET_VFP_SINGLE)
+      if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)
 	return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0, %p1\\t%@ int\";
       else
 	return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
@@ -390,9 +414,15 @@
     case 6: /* S register from immediate.  */
       return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
     case 7: /* S register from memory.  */
-      return \"vld1.16\\t{%z0}, %A1\";
+      if (TARGET_HAVE_MVE)
+	return \"vldr.16\\t%0, %A1\";
+      else
+	return \"vld1.16\\t{%z0}, %A1\";
     case 8: /* Memory from S register.  */
-      return \"vst1.16\\t{%z1}, %A0\";
+      if (TARGET_HAVE_MVE)
+	return \"vstr.16\\t%1, %A0\";
+      else
+	return \"vst1.16\\t{%z1}, %A0\";
     case 9: /* ARM register from constant.  */
       {
 	long bits;
@@ -593,7 +623,7 @@
 (define_insn "*thumb2_movsf_vfp"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
 	(match_operand:SF 1 "hard_sf_operand"	   " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT
+  "TARGET_THUMB2 && TARGET_VFP_BASE
    && (   s_register_operand (operands[0], SFmode)
        || s_register_operand (operands[1], SFmode))"
   "*
@@ -682,7 +712,7 @@
 (define_insn "*thumb2_movdf_vfp"
   [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
 	(match_operand:DF 1 "hard_df_operand"		   " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT
+  "TARGET_THUMB2 && TARGET_VFP_BASE
    && (   register_operand (operands[0], DFmode)
        || register_operand (operands[1], DFmode))"
   "*
@@ -760,7 +790,7 @@
 	    [(match_operand 4 "cc_register" "") (const_int 0)])
 	  (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
 	  (match_operand:SF 2 "s_register_operand" "t,0,t,?r,0,?r,t,0,t")))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
+  "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
   "@
    it\\t%D3\;vmov%D3.f32\\t%0, %2
    it\\t%d3\;vmov%d3.f32\\t%0, %1
@@ -806,7 +836,8 @@
 	    [(match_operand 4 "cc_register" "") (const_int 0)])
 	  (match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
 	  (match_operand:DF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE && !arm_restrict_it"
+  "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
+   && !arm_restrict_it"
   "@
    it\\t%D3\;vmov%D3.f64\\t%P0, %P2
    it\\t%d3\;vmov%d3.f64\\t%P0, %P1
@@ -1977,7 +2008,7 @@
     [(set (match_operand:BLK 0 "memory_operand" "=m")
 	  (unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
 		      UNSPEC_PUSH_MULT))])]
-  "TARGET_32BIT && TARGET_HARD_FLOAT"
+  "TARGET_32BIT && TARGET_VFP_BASE"
   "* return vfp_output_vstmd (operands);"
   [(set_attr "type" "f_stored")]
 )
@@ -2065,16 +2096,18 @@
 
 ;; Write Floating-point Status and Control Register.
 (define_insn "set_fpscr"
-  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
-  "TARGET_HARD_FLOAT"
+  [(set (reg:SI VFPCC_REGNUM)
+	(unspec_volatile:SI
+	 [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR))]
+  "TARGET_VFP_BASE"
   "mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
   [(set_attr "type" "mrs")])
 
 ;; Read Floating-point Status and Control Register.
 (define_insn "get_fpscr"
   [(set (match_operand:SI 0 "register_operand" "=r")
-        (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
-  "TARGET_HARD_FLOAT"
+	(unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
+  "TARGET_VFP_BASE"
   "mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
   [(set_attr "type" "mrs")])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
new file mode 100644
index 0000000..17ba616
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=hard -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+  int8x16_t b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
new file mode 100644
index 0000000..7b877c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=softfp -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+  int8x16_t b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
new file mode 100644
index 0000000..85fbb57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+  int8x16_t b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler-not "\.fpu softvfp" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
new file mode 100644
index 0000000..23b3683
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=softfp -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+  int8x16_t b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler-not "\.fpu softvfp" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
new file mode 100644
index 0000000..8f7fa34
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
@@ -0,0 +1,12 @@
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=soft -mthumb" } */
+
+int
+foo1 (int value)
+{
+  int b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu softvfp" }  } */
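
Reading aid (not part of the patch): the five new tests exercise how the TARGET_VFP_BASE definition interacts with the extra check added to arm_declare_function_name. The sketch below models only those two conditions in plain C so they can be tried on a host machine. All names in it (target_model, model_vfp_base, model_prints_fpu, model_check_patch_tests) are hypothetical and do not exist in GCC. It also assumes that fpu_to_print is "softvfp" for integer-only MVE, which is what the scan-assembler-not tests suggest but which the patch does not state outright.

```c
/* Hypothetical host-side model of two conditions from the patch.
   None of these names exist in GCC; they only mirror the logic of
   TARGET_VFP_BASE (arm.h) and the .fpu check in
   arm_declare_function_name (arm.c).  */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum float_abi { FLOAT_ABI_SOFT, FLOAT_ABI_SOFTFP, FLOAT_ABI_HARD };

struct target_model
{
  enum float_abi abi;      /* stands in for arm_float_abi */
  bool vfp_base_bit;       /* stands in for isa_bit_vfp_base being set */
  bool general_regs_only;  /* stands in for TARGET_GENERAL_REGS_ONLY */
};

/* Mirrors the TARGET_VFP_BASE macro added to arm.h.  */
bool
model_vfp_base (const struct target_model *t)
{
  return t->abi != FLOAT_ABI_SOFT
         && t->vfp_base_bit
         && !t->general_regs_only;
}

/* Mirrors the new condition in arm_declare_function_name: the .fpu
   directive is skipped when the name to print is "softvfp" while
   TARGET_VFP_BASE holds; otherwise it is printed only if it differs
   from the string printed last.  */
bool
model_prints_fpu (const struct target_model *t, const char *fpu_to_print,
                  const char *last_printed)
{
  bool suppressed = strcmp (fpu_to_print, "softvfp") == 0
                    && model_vfp_base (t);
  return !suppressed && strcmp (fpu_to_print, last_printed) != 0;
}

/* Replays the expectations of the five tests against the model.
   The "softvfp" inputs for +mve are an assumption (see above).  */
void
model_check_patch_tests (void)
{
  struct target_model hard   = { FLOAT_ABI_HARD,   true, false };
  struct target_model softfp = { FLOAT_ABI_SOFTFP, true, false };
  struct target_model soft   = { FLOAT_ABI_SOFT,   true, false };

  assert (model_prints_fpu (&hard, "fpv5-sp-d16", ""));    /* mve_fp_fpu1.c */
  assert (model_prints_fpu (&softfp, "fpv5-sp-d16", ""));  /* mve_fp_fpu2.c */
  assert (!model_prints_fpu (&hard, "softvfp", ""));       /* mve_fpu1.c */
  assert (!model_prints_fpu (&softfp, "softvfp", ""));     /* mve_fpu2.c */
  assert (model_prints_fpu (&soft, "softvfp", ""));        /* mve_fpu3.c */
}
```

Calling model_check_patch_tests () from a small main is enough to replay the cases. The model takes the last printed string as a plain parameter and deliberately says nothing about which variable the real code compares against.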

Comments

Kyrill Tkachov Feb. 17, 2020, 2:57 p.m. | #1
Hi Srinath,

On 2/14/20 4:34 PM, Srinath Parvathaneni wrote:
> Hello Kyrill,
>
> In this patch (v2) all the review comments mentioned in previous patch (v1) are
> addressed.
>
> (v1) https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01395.html
>
> #####################
>
> Hello,
>
> This patch is part of MVE ACLE intrinsics framework.
> This patch adds support to update (read/write) the APSR (Application Program Status Register)
> register and FPSCR (Floating-point Status and Control Register) register for MVE.
> This patch also enables thumb2 mov RTL patterns for MVE.
>
> A new feature bit vfp_base is added. This bit is enabled for all VFP, MVE and MVE with floating point
> extensions. This bit is used to enable the macro TARGET_VFP_BASE. Previously, all the VFP instructions, RTL patterns,
> status and control registers were guarded by TARGET_HARD_FLOAT. This patch modifies that so that the
> common instructions, RTL patterns, status and control registers between MVE and VFP are guarded by
> the TARGET_VFP_BASE macro.
>
> The RTL patterns set_fpscr and get_fpscr are updated to use VFPCC_REGNUM because a few MVE intrinsics
> set/get the carry bit of the FPSCR register.
>
> Please refer to Arm reference manual [1] for more details.
> [1] https://developer.arm.com/docs/ddi0553/latest
>
> Regression tested on arm-none-eabi and found no regressions.
>
> Ok for trunk?


Ok (please test a big-endian target as well, as per the 1st framework 
patch).

Thanks,

Kyrill


>
> Thanks,
> Srinath
> gcc/ChangeLog:
>
> 2020-02-11  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Mihail Ionescu  <mihail.ionescu@arm.com>
>             Srinath Parvathaneni  <srinath.parvathaneni@arm.com>
>
>         * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When vfp_base
>         feature bit is on and -mfpu=auto is passed as compiler option, do not
>         generate error on not finding any match fpu. Because in this case fpu
>         is not required.
>         * config/arm/arm-cpus.in (vfp_base): Define feature bit, this bit is
>         enabled for MVE and also for all VFP extensions.
>         (VFPv2): Modify fgroup to enable vfp_base feature bit when ever VFPv2
>         is enabled.
>         (MVE): Define fgroup to enable feature bits mve, vfp_base and armv7em.
>         (MVE_FP): Define fgroup to enable feature bits is fgroup MVE and FPv5
>         along with feature bits mve_float.
>         (mve): Modify add options in armv8.1-m.main arch for MVE.
>         (mve.fp): Modify add options in armv8.1-m.main arch for MVE with
>         floating point.
>         * config/arm/arm.c (use_return_insn): Replace the
>         check with TARGET_VFP_BASE.
>         (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
>         TARGET_VFP_BASE.
>         (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
>         with TARGET_VFP_BASE, to allow cost calculations for copies in MVE as
>         well.
>         (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
>         TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE
>         as well.
>         (arm_compute_frame_layout): Likewise.
>         (arm_save_coproc_regs): Likewise.
>         (arm_fixed_condition_code_regs): Modify to enable using VFPCC_REGNUM
>         in MVE as well.
>         (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
>         with equivalent macro TARGET_VFP_BASE.
>         (arm_expand_epilogue_apcs_frame): Likewise.
>         (arm_expand_epilogue): Likewise.
>         (arm_conditional_register_usage): Likewise.
>         (arm_declare_function_name): Add check to skip printing .fpu directive
>         in assembly file when TARGET_VFP_BASE is enabled and fpu_to_print is
>         "softvfp".
>         * config/arm/arm.h (TARGET_VFP_BASE): Define.
>         * config/arm/arm.md (arch): Add "mve" to arch.
>         (eq_attr "arch" "mve"): Enable on TARGET_HAVE_MVE is true.
>         (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT
>         || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE.
>         * config/arm/constraints.md (Uf): Define for MVE.
>         * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify target guard
>         to not allow for MVE.
>         * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Move to volatile unspecs
>         enum.
>         (VUNSPEC_GET_FPSCR): Define.
>         * config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR and VMRS
>         instructions which move to general-purpose Register from Floating-point
>         Special register and vice-versa.
>         (thumb2_movhi_fp16): Likewise.
>         (thumb2_movsi_vfp): Add support for VMSR and VMRS instructions along
>         with MCR and MRC instructions which set and get Floating-point Status
>         and Control Register (FPSCR).
>         (movdi_vfp): Modify pattern to enable Single-precision scalar float move
>         in MVE.
>         (thumb2_movdf_vfp): Modify pattern to enable Double-precision scalar
>         float move patterns in MVE.
>         (thumb2_movsfcc_vfp): Modify pattern to enable single float conditional
>         code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
>         (thumb2_movdfcc_vfp): Modify pattern to enable double float conditional
>         code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
>         (push_multi_vfp): Add support to use VFP VPUSH pattern for MVE by adding
>         TARGET_VFP_BASE check.
>         (set_fpscr): Add support to set FPSCR register for MVE. Modify pattern
>         using VFPCC_REGNUM as few MVE intrinsics use carry bit of FPSCR
>         register.
>         (get_fpscr): Add support to get FPSCR register for MVE. Modify pattern
>         using VFPCC_REGNUM as few MVE intrinsics use carry bit of FPSCR
>         register.
>
> gcc/testsuite/ChangeLog:
>
> 2020-02-11  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>
>
>         * gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: New test.
>         * gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/mve_fpu1.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/mve_fpu2.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/mve_fpu3.c: Likewise.
>
>
> ###############     Attachment also inlined for ease of reply    ###############
>
>
> From 997855cd3551822ba74c7b3c5edb52d97dd0b1d3 Mon Sep 17 00:00:00 2001
> From: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
> Date: Tue, 11 Feb 2020 17:29:32 +0000
> Subject: [PATCH] [PATCH][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
>
> ---
>  gcc/common/config/arm/arm-common.c                 |   2 +-
>  gcc/config/arm/arm-cpus.in                         |  14 ++-
>  gcc/config/arm/arm.c                               |  29 ++---
>  gcc/config/arm/arm.h                               |  13 +++
>  gcc/config/arm/arm.md                              |   8 +-
>  gcc/config/arm/constraints.md                      |   5 +-
>  gcc/config/arm/thumb2.md                           |   2 +-
>  gcc/config/arm/unspecs.md                          |   2 +-
>  gcc/config/arm/vfp.md                              | 129 +++++++++++++--------
>  .../gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c    |  14 +++
>  .../gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c    |  14 +++
>  .../gcc.target/arm/mve/intrinsics/mve_fpu1.c       |  14 +++
>  .../gcc.target/arm/mve/intrinsics/mve_fpu2.c       |  14 +++
>  .../gcc.target/arm/mve/intrinsics/mve_fpu3.c       |  12 ++
>  14 files changed, 200 insertions(+), 72 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
>
> diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
> index 30a2a1d..a465966 100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -1009,7 +1009,7 @@ arm_asm_auto_mfpu (int argc, const char **argv)
>              }
>          }
>
> -      gcc_assert (i != TARGET_FPU_auto);
> +      gcc_assert (i != TARGET_FPU_auto || isa_bit_vfp_base);
>      }
>
>    auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));
> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
> index 96f584d..77b4309 100644
> --- a/gcc/config/arm/arm-cpus.in
> +++ b/gcc/config/arm/arm-cpus.in
> @@ -135,6 +135,10 @@ define feature armv8_1m_main
>  # Floating point and Neon extensions.
>  # VFPv1 is not supported in GCC.
>
> +# This feature bit is enabled for all VFP, MVE and
> +# MVE with floating point extensions.
> +define feature vfp_base
> +
>  # Vector floating point v2.
>  define feature vfpv2
>
> @@ -234,7 +238,7 @@ define fgroup ALL_SIMD ALL_SIMD_INTERNAL ALL_SIMD_EXTERNAL
>
>  # List of all FPU bits to strip out if -mfpu is used to override the
>  # default.  fp16 is deliberately missing from this list.
> -define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
> +define fgroup ALL_FPU_INTERNAL vfp_base vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
>  # Similarly, but including fp16 and other extensions that aren't part of
>  # -mfpu support.
>  define fgroup ALL_FPU_EXTERNAL fp16 bf16
> @@ -279,10 +283,12 @@ define fgroup ARMv8r      ARMv8a
>  define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
>
>  # Useful combinations.
> -define fgroup VFPv2    vfpv2
> +define fgroup VFPv2    vfp_base vfpv2
>  define fgroup VFPv3     VFPv2 vfpv3
>  define fgroup VFPv4     VFPv3 vfpv4 fp16conv
>  define fgroup FPv5      VFPv4 fpv5
> +define fgroup MVE      mve vfp_base armv7em
> +define fgroup MVE_FP   MVE FPv5 fp16 mve_float
>
>  define fgroup FP_DBL    fp_dbl
>  define fgroup FP_D32    FP_DBL fp_d32
> @@ -699,8 +705,8 @@ begin arch armv8.1-m.main
>   option fp add FPv5 fp16
>   option fp.dp add FPv5 FP_DBL fp16
>   option nofp remove ALL_FP
> - option mve add mve armv7em
> - option mve.fp add mve FPv5 fp16 mve_float armv7em
> + option mve add MVE
> + option mve.fp add MVE_FP
>  end arch armv8.1-m.main
>
>  begin arch iwmmxt
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 10cf0e6..7e993b9 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -334,6 +334,19 @@ emission of floating point pcs attributes.  */
> isa_bit_mve_float) \
>                                 && !TARGET_GENERAL_REGS_ONLY)
>
> +/* MVE have few common instructions as VFP, like VLDM alias VPOP, VLDR, VSTM
> +   alia VPUSH, VSTR and VMOV, VMSR and VMRS.  In the same manner it updates few
> +   registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and MVFR2.  All
> +   the VFP instructions, RTL patterns and register are guarded by
> +   TARGET_HARD_FLOAT.  But the common instructions, RTL pattern and registers
> +   between MVE and VFP will be guarded by the following macro TARGET_VFP_BASE
> +   hereafter.  */
> +
> +#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
> +                        && bitmap_bit_p (arm_active_target.isa, \
> +                                         isa_bit_vfp_base) \
> +                        && !TARGET_GENERAL_REGS_ONLY)
> +
>  /* Nonzero if integer division instructions supported.  */
>  #define TARGET_IDIV     ((TARGET_ARM && arm_arch_arm_hwdiv)     \
>                           || (TARGET_THUMB && arm_arch_thumb_hwdiv))
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 3a95ea3..c28a475 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling)
>
>    /* Can't be done if any of the VFP regs are pushed,
>       since this also requires an insn.  */
> -  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> +  if (TARGET_VFP_BASE)
>      for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
>        if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p (regno))
>          return 0;
> @@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool is_double)
>      return false;
>
>    return (TARGET_32BIT && TARGET_HARD_FLOAT &&
> -         (TARGET_VFP_DOUBLE || !is_double));
> +        (TARGET_VFP_DOUBLE || !is_double));
>  }
>
>  /* Return true if an argument whose type is TYPE, or mode is MODE, is
> @@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode, rtx index, int strict_p)
>
>    /* ??? Combine arm and thumb2 coprocessor addressing modes.  */
>    /* Standard coprocessor addressing modes.  */
> -  if (TARGET_HARD_FLOAT
> +  if (TARGET_VFP_BASE
>        && (mode == SFmode || mode == DFmode))
>      return (code == CONST_INT && INTVAL (index) < 1024
>              /* Thumb-2 allows only > -256 index range for it's core register
> @@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
>            /* Assume that most copies can be done with a single insn,
>               unless we don't have HW FP, in which case everything
>               larger than word mode will require two insns. */
> -         *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> +         *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
>                                     && GET_MODE_SIZE (mode) > 4)
>                                    || mode == DImode)
>                                   ? 2 : 1);
> @@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void)
>
>    saved = 0;
>    /* Space for saved VFP registers.  */
> -  if (TARGET_HARD_FLOAT)
> +  if (TARGET_VFP_BASE)
>      {
>        count = 0;
>        for (regno = FIRST_VFP_REGNUM;
> @@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void)
>        func_type = arm_current_func_type ();
>        /* Space for saved VFP registers.  */
>        if (! IS_VOLATILE (func_type)
> -         && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
> +         && TARGET_VFP_BASE)
>          saved += arm_get_vfp_saved_size ();
>
>        /* Allocate space for saving/restoring FPCXTNS in Armv8.1-M Mainline
> @@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void)
>          saved_size += 8;
>        }
>
> -  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> +  if (TARGET_VFP_BASE)
>      {
>        start_reg = FIRST_VFP_REGNUM;
>
> @@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
>      return false;
>
>    *p1 = CC_REGNUM;
> -  *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
> +  *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
>    return true;
>  }
>
> @@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
>  {
>    if (GET_MODE_CLASS (mode) == MODE_CC)
>      return (regno == CC_REGNUM
> -           || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> +           || (TARGET_VFP_BASE
>                  && regno == VFPCC_REGNUM));
>
>    if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
> @@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
>         start of an even numbered register pair.  */
>      return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
>
> -  if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
> +  if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
>      {
>        if (mode == DFmode)
>          return VFP_REGNO_OK_FOR_DOUBLE (regno);
> @@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool really_return)
>          floats_from_frame += 4;
>        }
>
> -  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> +  if (TARGET_VFP_BASE)
>      {
>        int start_reg;
>        rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
> @@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return)
>          }
>      }
>
> -  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> +  if (TARGET_VFP_BASE)
>      {
>        /* Generate VFP register multi-pop.  */
>        int end_reg = LAST_VFP_REGNUM + 1;
> @@ -29695,7 +29695,7 @@ arm_conditional_register_usage (void)
>    if (TARGET_THUMB1)
>      fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
>
> -  if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
> +  if (TARGET_32BIT && TARGET_VFP_BASE)
>      {
>        /* VFPv3 registers are disabled when earlier VFP
>           versions are selected due to the definition of
> @@ -32470,7 +32470,8 @@ arm_declare_function_name (FILE *stream, const char *name, tree decl)
>      = TARGET_SOFT_FLOAT
>          ? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
>
> -  if (fpu_to_print != arm_last_printed_arch_string)
> +  if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
> +      && (fpu_to_print != arm_last_printed_arch_string))
>      {
>        asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
>        arm_last_printed_fpu_string = fpu_to_print;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 3a12f18..dcafb71 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -129,7 +129,7 @@
>  ; arm_arch6.  "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
>  ; Baseline.  This attribute is used to compute attribute "enabled",
>  ; use type "any" to enable an alternative in all cases.
> -(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
> +(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
>    (const_string "any"))
>
>  (define_attr "arch_enabled" "no,yes"
> @@ -183,6 +183,10 @@
>           (and (eq_attr "arch" "neon")
>                (match_test "TARGET_NEON"))
>           (const_string "yes")
> +
> +        (and (eq_attr "arch" "mve")
> +             (match_test "TARGET_HAVE_MVE"))
> +        (const_string "yes")
>          ]
>
>          (const_string "no")))
> @@ -11744,7 +11748,7 @@
>                     (match_operand:SI 2 "const_int_I_operand" "I")))
>       (set (match_operand:DF 3 "vfp_hard_register_operand" "")
>            (mem:DF (match_dup 1)))])]
> -  "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
> +  "TARGET_32BIT && TARGET_VFP_BASE"
>    "*
>    {
>      int num_regs = XVECLEN (operands[0], 0);
> diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
> index 3577fb9..0908c79 100644
> --- a/gcc/config/arm/constraints.md
> +++ b/gcc/config/arm/constraints.md
> @@ -38,7 +38,7 @@
>  ;; in all states: Pf, Pg
>
>  ;; The following memory constraints have been used:
> -;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us
> +;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Uf
>  ;; in ARM state: Uq
>  ;; in Thumb state: Uu, Uw
>  ;; in all states: Q
> @@ -46,6 +46,9 @@
>  (define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
>    "MVE VPR register")
>
> +(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
> +  "MVE FPCCR register")
> +
>  (define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
>   "The VFP registers @code{s0}-@code{s31}.")
>
> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
> index b0d3bd1..793f670 100644
> --- a/gcc/config/arm/thumb2.md
> +++ b/gcc/config/arm/thumb2.md
> @@ -517,7 +517,7 @@
>                            [(match_operand 4 "cc_register" "") (const_int 0)])
>                           (match_operand:SF 1 "s_register_operand" "0,r")
>                           (match_operand:SF 2 "s_register_operand" "r,0")))]
> -  "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
> +  "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
>    "@
>     it\\t%D3\;mov%D3\\t%0, %2
>     it\\t%d3\;mov%d3\\t%0, %1"
> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
> index 8f4a705..73588fc 100644
> --- a/gcc/config/arm/unspecs.md
> +++ b/gcc/config/arm/unspecs.md
> @@ -170,6 +170,7 @@
>    UNSPEC_TORC          ; Used by the intrinsic form of the iWMMXt TORC instruction.
>    UNSPEC_TORVSC                ; Used by the intrinsic form of the iWMMXt TORVSC instruction.
>    UNSPEC_TEXTRC                ; Used by the intrinsic form of the iWMMXt TEXTRC instruction.
> +  UNSPEC_GET_FPSCR     ; Represent fetch of FPSCR content.
>  ])
>
>
> @@ -216,7 +217,6 @@
>    VUNSPEC_SLX          ; Represent a store-register-release-exclusive.
>    VUNSPEC_LDA          ; Represent a store-register-acquire.
>    VUNSPEC_STL          ; Represent a store-register-release.
> -  VUNSPEC_GET_FPSCR    ; Represent fetch of FPSCR content.
>    VUNSPEC_SET_FPSCR    ; Represent assign of FPSCR content.
>    VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
>    VUNSPEC_CDP          ; Represent the coprocessor cdp instruction.
> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md

> index ab16a6b..eb6ae7b 100644

> --- a/gcc/config/arm/vfp.md

> +++ b/gcc/config/arm/vfp.md

> @@ -74,10 +74,10 @@

>  (define_insn "*thumb2_movhi_vfp"

>   [(set

>     (match_operand:HI 0 "nonimmediate_operand"

> -    "=rk, r, l, r, m, r, *t, r, *t")

> +    "=rk, r, l, r, m, r, *t, r, *t, Up, r")

>     (match_operand:HI 1 "general_operand"

> -    "rk, I, Py, n, r, m, r, *t, *t"))]

> - "TARGET_THUMB2 && TARGET_HARD_FLOAT

> +    "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]

> + "TARGET_THUMB2 && TARGET_VFP_BASE

>    && !TARGET_VFP_FP16INST

>    && (register_operand (operands[0], HImode)

>         || register_operand (operands[1], HImode))"

> @@ -99,20 +99,24 @@

>        return "vmov%?\t%0, %1\t%@ int";

>      case 8:

>        return "vmov%?.f32\t%0, %1\t%@ int";

> +    case 9:

> +      return "vmsr%?\t P0, %1\t@ movhi";

> +    case 10:

> +      return "vmrs%?\t %0, P0\t@ movhi";

>      default:

>        gcc_unreachable ();

>      }

>  }

>   [(set_attr "predicable" "yes")

>    (set_attr "predicable_short_it"

> -   "yes, no, yes, no, no, no, no, no, no")

> +   "yes, no, yes, no, no, no, no, no, no, no, no")

>    (set_attr "type"

>     "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\

> -    f_mcr, f_mrc, fmov")

> -  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")

> -  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")

> -  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")

> -  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]

> +    f_mcr, f_mrc, fmov, mve_move, mve_move")

> +  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")

> +  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")

> +  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")

> +  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]

>  )

>

>  ;; Patterns for HI moves which provide more data transfer 

> instructions when FP16

> @@ -170,10 +174,10 @@

>  (define_insn "*thumb2_movhi_fp16"

>   [(set

>     (match_operand:HI 0 "nonimmediate_operand"

> -    "=rk, r, l, r, m, r, *t, r, *t")

> +    "=rk, r, l, r, m, r, *t, r, *t, Up, r")

>     (match_operand:HI 1 "general_operand"

> -    "rk, I, Py, n, r, m, r, *t, *t"))]

> - "TARGET_THUMB2 && TARGET_VFP_FP16INST

> +    "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]

> + "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)

>    && (register_operand (operands[0], HImode)

>         || register_operand (operands[1], HImode))"

>  {

> @@ -194,21 +198,25 @@

>        return "vmov.f16\t%0, %1\t%@ int";

>      case 8:

>        return "vmov%?.f32\t%0, %1\t%@ int";

> +    case 9:

> +      return "vmsr%?\tP0, %1\t%@ movhi";

> +    case 10:

> +      return "vmrs%?\t%0, P0\t%@ movhi";

>      default:

>        gcc_unreachable ();

>      }

>  }

>   [(set_attr "predicable"

> -   "yes, yes, yes, yes, yes, yes, no, no, yes")

> +   "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")

>    (set_attr "predicable_short_it"

> -   "yes, no, yes, no, no, no, no, no, no")

> +   "yes, no, yes, no, no, no, no, no, no, no, no")

>    (set_attr "type"

>     "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\

> -    f_mcr, f_mrc, fmov")

> -  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")

> -  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")

> -  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")

> -  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]

> +    f_mcr, f_mrc, fmov, mve_move, mve_move")

> +  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")

> +  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")

> +  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")

> +  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]

>  )

>

>  ;; SImode moves

> @@ -258,9 +266,11 @@

>  ;; is chosen with length 2 when the instruction is predicated for

>  ;; arm_restrict_it.

>  (define_insn "*thumb2_movsi_vfp"

> -  [(set (match_operand:SI 0 "nonimmediate_operand" 

> "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t,  *Uv")

> -       (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,lk*r, 

> r,*t,*t,*UvTu,*t"))]

> -  "TARGET_THUMB2 && TARGET_HARD_FLOAT

> +  [(set (match_operand:SI 0 "nonimmediate_operand" 

> "=rk,r,l,r,r,l,*hk,m,*m,*t,\

> + r,*t,*t,*Uv, Up, r,Uf,r")

> +       (match_operand:SI 1 "general_operand" 

> "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\

> +                                              *t,*UvTu,*t, r, Up,r,Uf"))]

> +  "TARGET_THUMB2 && TARGET_VFP_BASE

>     && (   s_register_operand (operands[0], SImode)

>         || s_register_operand (operands[1], SImode))"

>    "*

> @@ -275,30 +285,44 @@

>      case 4:

>        return \"movw%?\\t%0, %1\";

>      case 5:

> +    case 6:

>        /* Cannot load it directly, split to load it via MOV / MOVT.  */

>        if (!MEM_P (operands[1]) && arm_disable_literal_pool)

>          return \"#\";

>        return \"ldr%?\\t%0, %1\";

> -    case 6:

> -      return \"str%?\\t%1, %0\";

>      case 7:

> -      return \"vmov%?\\t%0, %1\\t%@ int\";

>      case 8:

> -      return \"vmov%?\\t%0, %1\\t%@ int\";

> +      return \"str%?\\t%1, %0\";

>      case 9:

> +      return \"vmov%?\\t%0, %1\\t%@ int\";

> +    case 10:

> +      return \"vmov%?\\t%0, %1\\t%@ int\";

> +    case 11:

>        return \"vmov%?.f32\\t%0, %1\\t%@ int\";

> -    case 10: case 11:

> +    case 12: case 13:

>        return output_move_vfp (operands);

> +    case 14:

> +      return \"vmsr\\t P0, %1\";

> +    case 15:

> +      return \"vmrs\\t %0, P0\";

> +    case 16:

> +      return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";

> +    case 17:

> +      return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";

>      default:

>        gcc_unreachable ();

>      }

>    "

>    [(set_attr "predicable" "yes")

> -   (set_attr "predicable_short_it" 

> "yes,no,yes,no,no,no,no,no,no,no,no,no")

> -   (set_attr "type" 

> "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")

> -   (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")

> -   (set_attr "pool_range" "*,*,*,*,*,1018,*,*,*,*,1018,*")

> -   (set_attr "neg_pool_range" "*,*,*,*,*, 0,*,*,*,*,1008,*")]

> +   (set_attr "predicable_short_it" 

> "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\

> +             no,no,no,no,no")

> +   (set_attr "type" 

> "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\

> + store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\

> +            mve_move,mrs,mrs")

> +   (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")

> +   (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")

> +   (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")

> +   (set_attr "neg_pool_range" "*,*,*,*,*,   0, 

> 0,*,*,*,*,*,1008,*,*,*,*,*")]

>  )

>

>

> @@ -306,12 +330,12 @@

>

>  (define_insn "*movdi_vfp"

>    [(set (match_operand:DI 0 "nonimmediate_di_operand" 

> "=r,r,r,r,r,r,m,w,!r,w,w, Uv")

> -       (match_operand:DI 1 "di_operand" 

> "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]

> -  "TARGET_32BIT && TARGET_HARD_FLOAT

> +       (match_operand:DI 1 "di_operand" 

> "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]

> +  "TARGET_32BIT && TARGET_VFP_BASE

>     && (   register_operand (operands[0], DImode)

>         || register_operand (operands[1], DImode))

> -   && !(TARGET_NEON && CONST_INT_P (operands[1])

> -       && simd_immediate_valid_for_move (operands[1], DImode, NULL, 

> NULL))"

> +   && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])

> +       && simd_immediate_valid_for_move (operands[1], DImode, NULL, 

> NULL))"

>    "*

>    switch (which_alternative)

>      {

> @@ -333,7 +357,7 @@

>      case 8:

>        return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";

>      case 9:

> -      if (TARGET_VFP_SINGLE)

> +      if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)

>          return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0, 

> %p1\\t%@ int\";

>        else

>          return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";

> @@ -390,9 +414,15 @@
>      case 6: /* S register from immediate.  */
>        return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
>      case 7: /* S register from memory.  */
> -      return \"vld1.16\\t{%z0}, %A1\";
> +      if (TARGET_HAVE_MVE)
> +       return \"vldr.16\\t%0, %A1\";
> +      else
> +       return \"vld1.16\\t{%z0}, %A1\";
>      case 8: /* Memory from S register.  */
> -      return \"vst1.16\\t{%z1}, %A0\";
> +      if (TARGET_HAVE_MVE)
> +       return \"vstr.16\\t%1, %A0\";
> +      else
> +       return \"vst1.16\\t{%z1}, %A0\";
>      case 9: /* ARM register from constant.  */
>        {
>          long bits;
> @@ -593,7 +623,7 @@
>  (define_insn "*thumb2_movsf_vfp"
>    [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
>          (match_operand:SF 1 "hard_sf_operand"      " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
> -  "TARGET_THUMB2 && TARGET_HARD_FLOAT
> +  "TARGET_THUMB2 && TARGET_VFP_BASE
>     && (   s_register_operand (operands[0], SFmode)
>         || s_register_operand (operands[1], SFmode))"
>    "*
> @@ -682,7 +712,7 @@
>  (define_insn "*thumb2_movdf_vfp"
>    [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
>          (match_operand:DF 1 "hard_df_operand" " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
> -  "TARGET_THUMB2 && TARGET_HARD_FLOAT
> +  "TARGET_THUMB2 && TARGET_VFP_BASE
>     && (   register_operand (operands[0], DFmode)
>         || register_operand (operands[1], DFmode))"
>    "*
> @@ -760,7 +790,7 @@
>              [(match_operand 4 "cc_register" "") (const_int 0)])
>            (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
>            (match_operand:SF 2 "s_register_operand" "t,0,t,?r,0,?r,t,0,t")))]
> -  "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
> +  "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
>    "@
>     it\\t%D3\;vmov%D3.f32\\t%0, %2
>     it\\t%d3\;vmov%d3.f32\\t%0, %1
> @@ -806,7 +836,8 @@
>              [(match_operand 4 "cc_register" "") (const_int 0)])
>            (match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
>            (match_operand:DF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))]
> -  "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE && !arm_restrict_it"
> +  "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
> +   && !arm_restrict_it"
>    "@
>     it\\t%D3\;vmov%D3.f64\\t%P0, %P2
>     it\\t%d3\;vmov%d3.f64\\t%P0, %P1
> @@ -1977,7 +2008,7 @@
>      [(set (match_operand:BLK 0 "memory_operand" "=m")
>            (unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
>                        UNSPEC_PUSH_MULT))])]
> -  "TARGET_32BIT && TARGET_HARD_FLOAT"
> +  "TARGET_32BIT && TARGET_VFP_BASE"
>    "* return vfp_output_vstmd (operands);"
>    [(set_attr "type" "f_stored")]
>  )
> @@ -2065,16 +2096,18 @@
>
>  ;; Write Floating-point Status and Control Register.
>  (define_insn "set_fpscr"
> -  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
> -  "TARGET_HARD_FLOAT"
> +  [(set (reg:SI VFPCC_REGNUM)
> +       (unspec_volatile:SI
> +        [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR))]
> +  "TARGET_VFP_BASE"
>    "mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
>    [(set_attr "type" "mrs")])
>
>  ;; Read Floating-point Status and Control Register.
>  (define_insn "get_fpscr"
>    [(set (match_operand:SI 0 "register_operand" "=r")
> -        (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
> -  "TARGET_HARD_FLOAT"
> +       (unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
> +  "TARGET_VFP_BASE"
>    "mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
>    [(set_attr "type" "mrs")])
>
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> new file mode 100644
> index 0000000..17ba616
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile  } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=hard -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> +  int8x16_t b = value;
> +  return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> new file mode 100644
> index 0000000..7b877c4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile  } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=softfp -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> +  int8x16_t b = value;
> +  return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> new file mode 100644
> index 0000000..85fbb57
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile  } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> +  int8x16_t b = value;
> +  return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "\.fpu softvfp" }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> new file mode 100644
> index 0000000..23b3683
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile  } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=softfp -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> +  int8x16_t b = value;
> +  return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "\.fpu softvfp" }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> new file mode 100644
> index 0000000..8f7fa34
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile  } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=soft -mthumb" } */
> +
> +int
> +foo1 (int value)
> +{
> +  int b = value;
> +  return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu softvfp" }  } */
> -- 
> 2.7.4
>
>

Patch

diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 30a2a1d..a465966 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -1009,7 +1009,7 @@  arm_asm_auto_mfpu (int argc, const char **argv)
 	    }
 	}
 
-      gcc_assert (i != TARGET_FPU_auto);
+      gcc_assert (i != TARGET_FPU_auto || isa_bit_vfp_base);
     }
 
   auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 96f584d..77b4309 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -135,6 +135,10 @@  define feature armv8_1m_main
 # Floating point and Neon extensions.
 # VFPv1 is not supported in GCC.
 
+# This feature bit is enabled for all VFP, MVE and
+# MVE with floating point extensions.
+define feature vfp_base
+
 # Vector floating point v2.
 define feature vfpv2
 
@@ -234,7 +238,7 @@  define fgroup ALL_SIMD	ALL_SIMD_INTERNAL ALL_SIMD_EXTERNAL
 
 # List of all FPU bits to strip out if -mfpu is used to override the
 # default.  fp16 is deliberately missing from this list.
-define fgroup ALL_FPU_INTERNAL	vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
+define fgroup ALL_FPU_INTERNAL	vfp_base vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
 # Similarly, but including fp16 and other extensions that aren't part of
 # -mfpu support.
 define fgroup ALL_FPU_EXTERNAL fp16 bf16
@@ -279,10 +283,12 @@  define fgroup ARMv8r      ARMv8a
 define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
 
 # Useful combinations.
-define fgroup VFPv2	vfpv2
+define fgroup VFPv2	vfp_base vfpv2
 define fgroup VFPv3	VFPv2 vfpv3
 define fgroup VFPv4	VFPv3 vfpv4 fp16conv
 define fgroup FPv5	VFPv4 fpv5
+define fgroup MVE      mve vfp_base armv7em
+define fgroup MVE_FP   MVE FPv5 fp16 mve_float
 
 define fgroup FP_DBL	fp_dbl
 define fgroup FP_D32	FP_DBL fp_d32
@@ -699,8 +705,8 @@  begin arch armv8.1-m.main
  option fp add FPv5 fp16
  option fp.dp add FPv5 FP_DBL fp16
  option nofp remove ALL_FP
- option mve add mve armv7em
- option mve.fp add mve FPv5 fp16 mve_float armv7em
+ option mve add MVE
+ option mve.fp add MVE_FP
 end arch armv8.1-m.main
 
 begin arch iwmmxt
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 10cf0e6..7e993b9 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -334,6 +334,19 @@  emission of floating point pcs attributes.  */
 						isa_bit_mve_float) \
 			       && !TARGET_GENERAL_REGS_ONLY)
 
+/* MVE shares a few instructions with VFP, such as VLDM (alias VPOP), VLDR,
+   VSTM (alias VPUSH), VSTR, VMOV, VMSR and VMRS.  It likewise updates a few
+   registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and MVFR2.  All
+   the VFP instructions, RTL patterns and registers are guarded by
+   TARGET_HARD_FLOAT.  The instructions, RTL patterns and registers common to
+   MVE and VFP are instead guarded by the macro TARGET_VFP_BASE defined
+   below.  */
+
+#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
+			 && bitmap_bit_p (arm_active_target.isa, \
+					  isa_bit_vfp_base) \
+			 && !TARGET_GENERAL_REGS_ONLY)
+
 /* Nonzero if integer division instructions supported.  */
 #define TARGET_IDIV	((TARGET_ARM && arm_arch_arm_hwdiv)	\
 			 || (TARGET_THUMB && arm_arch_thumb_hwdiv))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3a95ea3..c28a475 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -4295,7 +4295,7 @@  use_return_insn (int iscond, rtx sibling)
 
   /* Can't be done if any of the VFP regs are pushed,
      since this also requires an insn.  */
-  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+  if (TARGET_VFP_BASE)
     for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
       if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p (regno))
 	return 0;
@@ -6289,7 +6289,7 @@  use_vfp_abi (enum arm_pcs pcs_variant, bool is_double)
     return false;
 
   return (TARGET_32BIT && TARGET_HARD_FLOAT &&
-	  (TARGET_VFP_DOUBLE || !is_double));
+	 (TARGET_VFP_DOUBLE || !is_double));
 }
 
 /* Return true if an argument whose type is TYPE, or mode is MODE, is
@@ -8512,7 +8512,7 @@  thumb2_legitimate_index_p (machine_mode mode, rtx index, int strict_p)
 
   /* ??? Combine arm and thumb2 coprocessor addressing modes.  */
   /* Standard coprocessor addressing modes.  */
-  if (TARGET_HARD_FLOAT
+  if (TARGET_VFP_BASE
       && (mode == SFmode || mode == DFmode))
     return (code == CONST_INT && INTVAL (index) < 1024
 	    /* Thumb-2 allows only > -256 index range for it's core register
@@ -9905,7 +9905,7 @@  arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	  /* Assume that most copies can be done with a single insn,
 	     unless we don't have HW FP, in which case everything
 	     larger than word mode will require two insns.  */
-	  *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+	  *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
 				   && GET_MODE_SIZE (mode) > 4)
 				  || mode == DImode)
 				 ? 2 : 1);
@@ -20821,7 +20821,7 @@  arm_get_vfp_saved_size (void)
 
   saved = 0;
   /* Space for saved VFP registers.  */
-  if (TARGET_HARD_FLOAT)
+  if (TARGET_VFP_BASE)
     {
       count = 0;
       for (regno = FIRST_VFP_REGNUM;
@@ -22364,7 +22364,7 @@  arm_compute_frame_layout (void)
       func_type = arm_current_func_type ();
       /* Space for saved VFP registers.  */
       if (! IS_VOLATILE (func_type)
-	  && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+	  && TARGET_VFP_BASE)
 	saved += arm_get_vfp_saved_size ();
 
       /* Allocate space for saving/restoring FPCXTNS in Armv8.1-M Mainline
@@ -22588,7 +22588,7 @@  arm_save_coproc_regs(void)
 	saved_size += 8;
       }
 
-  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+  if (TARGET_VFP_BASE)
     {
       start_reg = FIRST_VFP_REGNUM;
 
@@ -24546,7 +24546,7 @@  arm_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
     return false;
 
   *p1 = CC_REGNUM;
-  *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
+  *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
   return true;
 }
 
@@ -24965,7 +24965,7 @@  arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
 {
   if (GET_MODE_CLASS (mode) == MODE_CC)
     return (regno == CC_REGNUM
-	    || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+	    || (TARGET_VFP_BASE
 		&& regno == VFPCC_REGNUM));
 
   if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
@@ -24982,7 +24982,7 @@  arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
        start of an even numbered register pair.  */
     return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
 
-  if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
+  if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
     {
       if (mode == DFmode)
 	return VFP_REGNO_OK_FOR_DOUBLE (regno);
@@ -26933,7 +26933,7 @@  arm_expand_epilogue_apcs_frame (bool really_return)
         floats_from_frame += 4;
       }
 
-  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+  if (TARGET_VFP_BASE)
     {
       int start_reg;
       rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
@@ -27179,7 +27179,7 @@  arm_expand_epilogue (bool really_return)
         }
     }
 
-  if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+  if (TARGET_VFP_BASE)
     {
       /* Generate VFP register multi-pop.  */
       int end_reg = LAST_VFP_REGNUM + 1;
@@ -29695,7 +29695,7 @@  arm_conditional_register_usage (void)
   if (TARGET_THUMB1)
     fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
 
-  if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+  if (TARGET_32BIT && TARGET_VFP_BASE)
     {
       /* VFPv3 registers are disabled when earlier VFP
 	 versions are selected due to the definition of
@@ -32470,7 +32470,8 @@  arm_declare_function_name (FILE *stream, const char *name, tree decl)
     = TARGET_SOFT_FLOAT
 	? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
 
-  if (fpu_to_print != arm_last_printed_arch_string)
+  if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
+      && (fpu_to_print != arm_last_printed_arch_string))
     {
       asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
       arm_last_printed_fpu_string = fpu_to_print;
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 3a12f18..dcafb71 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -129,7 +129,7 @@ 
 ; arm_arch6.  "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
 ; Baseline.  This attribute is used to compute attribute "enabled",
 ; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
   (const_string "any"))
 
 (define_attr "arch_enabled" "no,yes"
@@ -183,6 +183,10 @@ 
 	 (and (eq_attr "arch" "neon")
 	      (match_test "TARGET_NEON"))
 	 (const_string "yes")
+
+	 (and (eq_attr "arch" "mve")
+	      (match_test "TARGET_HAVE_MVE"))
+	 (const_string "yes")
 	]
 
 	(const_string "no")))
@@ -11744,7 +11748,7 @@ 
                    (match_operand:SI 2 "const_int_I_operand" "I")))
      (set (match_operand:DF 3 "vfp_hard_register_operand" "")
           (mem:DF (match_dup 1)))])]
-  "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
+  "TARGET_32BIT && TARGET_VFP_BASE"
   "*
   {
     int num_regs = XVECLEN (operands[0], 0);
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 3577fb9..0908c79 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -38,7 +38,7 @@ 
 ;; in all states: Pf, Pg
 
 ;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Uf
 ;; in ARM state: Uq
 ;; in Thumb state: Uu, Uw
 ;; in all states: Q
@@ -46,6 +46,9 @@ 
 (define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
   "MVE VPR register")
 
+(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
+  "MVE FPCCR register")
+
 (define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
  "The VFP registers @code{s0}-@code{s31}.")
 
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index b0d3bd1..793f670 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -517,7 +517,7 @@ 
 			  [(match_operand 4 "cc_register" "") (const_int 0)])
 			 (match_operand:SF 1 "s_register_operand" "0,r")
 			 (match_operand:SF 2 "s_register_operand" "r,0")))]
-  "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
+  "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
   "@
    it\\t%D3\;mov%D3\\t%0, %2
    it\\t%d3\;mov%d3\\t%0, %1"
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 8f4a705..73588fc 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -170,6 +170,7 @@ 
   UNSPEC_TORC		; Used by the intrinsic form of the iWMMXt TORC instruction.
   UNSPEC_TORVSC		; Used by the intrinsic form of the iWMMXt TORVSC instruction.
   UNSPEC_TEXTRC		; Used by the intrinsic form of the iWMMXt TEXTRC instruction.
+  UNSPEC_GET_FPSCR	; Represent fetch of FPSCR content.
 ])
 
 
@@ -216,7 +217,6 @@ 
   VUNSPEC_SLX		; Represent a store-register-release-exclusive.
   VUNSPEC_LDA		; Represent a store-register-acquire.
   VUNSPEC_STL		; Represent a store-register-release.
-  VUNSPEC_GET_FPSCR	; Represent fetch of FPSCR content.
   VUNSPEC_SET_FPSCR	; Represent assign of FPSCR content.
   VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
   VUNSPEC_CDP		; Represent the coprocessor cdp instruction.
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index ab16a6b..eb6ae7b 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -74,10 +74,10 @@ 
 (define_insn "*thumb2_movhi_vfp"
  [(set
    (match_operand:HI 0 "nonimmediate_operand"
-    "=rk, r, l, r, m, r, *t, r, *t")
+    "=rk, r, l, r, m, r, *t, r, *t, Up, r")
    (match_operand:HI 1 "general_operand"
-    "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+    "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && TARGET_VFP_BASE
   && !TARGET_VFP_FP16INST
   && (register_operand (operands[0], HImode)
        || register_operand (operands[1], HImode))"
@@ -99,20 +99,24 @@ 
       return "vmov%?\t%0, %1\t%@ int";
     case 8:
       return "vmov%?.f32\t%0, %1\t%@ int";
+    case 9:
+      return "vmsr%?\t P0, %1\t@ movhi";
+    case 10:
+      return "vmrs%?\t %0, P0\t@ movhi";
     default:
       gcc_unreachable ();
     }
 }
  [(set_attr "predicable" "yes")
   (set_attr "predicable_short_it"
-   "yes, no, yes, no, no, no, no, no, no")
+   "yes, no, yes, no, no, no, no, no, no, no, no")
   (set_attr "type"
    "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
-    f_mcr, f_mrc, fmov")
-  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
-  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
-  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
-  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+    f_mcr, f_mrc, fmov, mve_move, mve_move")
+  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
 )
 
 ;; Patterns for HI moves which provide more data transfer instructions when FP16
@@ -170,10 +174,10 @@ 
 (define_insn "*thumb2_movhi_fp16"
  [(set
    (match_operand:HI 0 "nonimmediate_operand"
-    "=rk, r, l, r, m, r, *t, r, *t")
+    "=rk, r, l, r, m, r, *t, r, *t, Up, r")
    (match_operand:HI 1 "general_operand"
-    "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_VFP_FP16INST
+    "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)
   && (register_operand (operands[0], HImode)
        || register_operand (operands[1], HImode))"
 {
@@ -194,21 +198,25 @@ 
       return "vmov.f16\t%0, %1\t%@ int";
     case 8:
       return "vmov%?.f32\t%0, %1\t%@ int";
+    case 9:
+      return "vmsr%?\tP0, %1\t%@ movhi";
+    case 10:
+      return "vmrs%?\t%0, P0\t%@ movhi";
     default:
       gcc_unreachable ();
     }
 }
  [(set_attr "predicable"
-   "yes, yes, yes, yes, yes, yes, no, no, yes")
+   "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")
   (set_attr "predicable_short_it"
-   "yes, no, yes, no, no, no, no, no, no")
+   "yes, no, yes, no, no, no, no, no, no, no, no")
   (set_attr "type"
    "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
-    f_mcr, f_mrc, fmov")
-  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
-  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
-  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
-  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+    f_mcr, f_mrc, fmov, mve_move, mve_move")
+  (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+  (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+  (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+  (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
 )
 
 ;; SImode moves
@@ -258,9 +266,11 @@ 
 ;; is chosen with length 2 when the instruction is predicated for
 ;; arm_restrict_it.
 (define_insn "*thumb2_movsi_vfp"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t,  *Uv")
-	(match_operand:SI 1 "general_operand"	   "rk,I,Py,K,j,mi,lk*r, r,*t,*t,*UvTu,*t"))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,l,*hk,m,*m,*t,\
+						    r,*t,*t,*Uv, Up, r,Uf,r")
+	(match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\
+					       *t,*UvTu,*t, r, Up,r,Uf"))]
+  "TARGET_THUMB2 && TARGET_VFP_BASE
    && (   s_register_operand (operands[0], SImode)
        || s_register_operand (operands[1], SImode))"
   "*
@@ -275,30 +285,44 @@ 
     case 4:
       return \"movw%?\\t%0, %1\";
     case 5:
+    case 6:
       /* Cannot load it directly, split to load it via MOV / MOVT.  */
       if (!MEM_P (operands[1]) && arm_disable_literal_pool)
 	return \"#\";
       return \"ldr%?\\t%0, %1\";
-    case 6:
-      return \"str%?\\t%1, %0\";
     case 7:
-      return \"vmov%?\\t%0, %1\\t%@ int\";
     case 8:
-      return \"vmov%?\\t%0, %1\\t%@ int\";
+      return \"str%?\\t%1, %0\";
     case 9:
+      return \"vmov%?\\t%0, %1\\t%@ int\";
+    case 10:
+      return \"vmov%?\\t%0, %1\\t%@ int\";
+    case 11:
       return \"vmov%?.f32\\t%0, %1\\t%@ int\";
-    case 10: case 11:
+    case 12: case 13:
       return output_move_vfp (operands);
+    case 14:
+      return \"vmsr\\t P0, %1\";
+    case 15:
+      return \"vmrs\\t %0, P0\";
+    case 16:
+      return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";
+    case 17:
+      return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";
     default:
       gcc_unreachable ();
     }
   "
   [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no")
-   (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")
-   (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")
-   (set_attr "pool_range"     "*,*,*,*,*,1018,*,*,*,*,1018,*")
-   (set_attr "neg_pool_range" "*,*,*,*,*,   0,*,*,*,*,1008,*")]
+   (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\
+	      no,no,no,no,no")
+   (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\
+	     store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\
+	     mve_move,mrs,mrs")
+   (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")
+   (set_attr "pool_range"     "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")
+   (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")
+   (set_attr "neg_pool_range" "*,*,*,*,*,   0,   0,*,*,*,*,*,1008,*,*,*,*,*")]
 )
 
 
@@ -306,12 +330,12 @@ 
 
 (define_insn "*movdi_vfp"
   [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,r,r,m,w,!r,w,w, Uv")
-	(match_operand:DI 1 "di_operand"	      "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
-  "TARGET_32BIT && TARGET_HARD_FLOAT
+	(match_operand:DI 1 "di_operand"       "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
+  "TARGET_32BIT && TARGET_VFP_BASE
    && (   register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode))
-   && !(TARGET_NEON && CONST_INT_P (operands[1])
-	&& simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
+   && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])
+       && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
   "*
   switch (which_alternative)
     {
@@ -333,7 +357,7 @@ 
     case 8:
       return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
     case 9:
-      if (TARGET_VFP_SINGLE)
+      if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)
 	return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0, %p1\\t%@ int\";
       else
 	return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
@@ -390,9 +414,15 @@ 
     case 6: /* S register from immediate.  */
       return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
     case 7: /* S register from memory.  */
-      return \"vld1.16\\t{%z0}, %A1\";
+      if (TARGET_HAVE_MVE)
+	return \"vldr.16\\t%0, %A1\";
+      else
+	return \"vld1.16\\t{%z0}, %A1\";
     case 8: /* Memory from S register.  */
-      return \"vst1.16\\t{%z1}, %A0\";
+      if (TARGET_HAVE_MVE)
+	return \"vstr.16\\t%1, %A0\";
+      else
+	return \"vst1.16\\t{%z1}, %A0\";
     case 9: /* ARM register from constant.  */
       {
 	long bits;
@@ -593,7 +623,7 @@ 
 (define_insn "*thumb2_movsf_vfp"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
 	(match_operand:SF 1 "hard_sf_operand"	   " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT
+  "TARGET_THUMB2 && TARGET_VFP_BASE
    && (   s_register_operand (operands[0], SFmode)
        || s_register_operand (operands[1], SFmode))"
   "*
@@ -682,7 +712,7 @@ 
 (define_insn "*thumb2_movdf_vfp"
   [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
 	(match_operand:DF 1 "hard_df_operand"		   " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT
+  "TARGET_THUMB2 && TARGET_VFP_BASE
    && (   register_operand (operands[0], DFmode)
        || register_operand (operands[1], DFmode))"
   "*
@@ -760,7 +790,7 @@ 
 	    [(match_operand 4 "cc_register" "") (const_int 0)])
 	  (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
 	  (match_operand:SF 2 "s_register_operand" "t,0,t,?r,0,?r,t,0,t")))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
+  "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
   "@
    it\\t%D3\;vmov%D3.f32\\t%0, %2
    it\\t%d3\;vmov%d3.f32\\t%0, %1
@@ -806,7 +836,8 @@ 
 	    [(match_operand 4 "cc_register" "") (const_int 0)])
 	  (match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
 	  (match_operand:DF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))]
-  "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE && !arm_restrict_it"
+  "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
+   && !arm_restrict_it"
   "@
    it\\t%D3\;vmov%D3.f64\\t%P0, %P2
    it\\t%d3\;vmov%d3.f64\\t%P0, %P1
@@ -1977,7 +2008,7 @@ 
     [(set (match_operand:BLK 0 "memory_operand" "=m")
 	  (unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
 		      UNSPEC_PUSH_MULT))])]
-  "TARGET_32BIT && TARGET_HARD_FLOAT"
+  "TARGET_32BIT && TARGET_VFP_BASE"
   "* return vfp_output_vstmd (operands);"
   [(set_attr "type" "f_stored")]
 )
@@ -2065,16 +2096,18 @@ 
 
 ;; Write Floating-point Status and Control Register.
 (define_insn "set_fpscr"
-  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
-  "TARGET_HARD_FLOAT"
+  [(set (reg:SI VFPCC_REGNUM)
+	(unspec_volatile:SI
+	 [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR))]
+  "TARGET_VFP_BASE"
   "mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
   [(set_attr "type" "mrs")])
 
 ;; Read Floating-point Status and Control Register.
 (define_insn "get_fpscr"
   [(set (match_operand:SI 0 "register_operand" "=r")
-        (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
-  "TARGET_HARD_FLOAT"
+	(unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
+  "TARGET_VFP_BASE"
   "mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
   [(set_attr "type" "mrs")])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
new file mode 100644
index 0000000..17ba616
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=hard -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+  int8x16_t b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
new file mode 100644
index 0000000..7b877c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=softfp -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+  int8x16_t b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
new file mode 100644
index 0000000..85fbb57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+  int8x16_t b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler-not "\.fpu softvfp" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
new file mode 100644
index 0000000..23b3683
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=softfp -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+  int8x16_t b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler-not "\.fpu softvfp" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
new file mode 100644
index 0000000..8f7fa34
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=soft -mthumb" } */
+
+int
+foo1 (int value)
+{
+  int b = value;
+  return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu softvfp" }  } */