[2/57,Arm,GAS] Add support for MVE instructions: vpst, vadd, vsub and vabd

Message ID 70aab06d-34c1-476d-e760-59ac05f645df@arm.com
State New
Series: Add support for Armv8.1-M Mainline MVE instructions

Commit Message

Andre Vieira (lists) May 1, 2019, 4:54 p.m.
Hello,

This patch adds most of the framework used by the rest of the GAS 
patches for MVE.

This framework adds:
1) VPT/VPST block handling, reusing the IT-block handling already present
in GAS. This patch also cleans up the IT-block handling, adding more
appropriate diagnostics for instructions that are used inside an IT block
where they should not be, or that use condition codes where these are not
allowed. The comments above the function, now named 'handle_pred_state',
explain this in detail.

2) Helper functions to check NEON and/or MVE availability.
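
As a rough illustration of item 1, the sketch below shows how one
block-state record could describe both IT and VPT/VPST blocks. The
pred_type enum mirrors the one this patch adds to tc-arm.h, and the
compatibility test mirrors now_pred_compatible. The names struct
pred_block and pred_compatible, and the use of the usual Arm condition
encodings (EQ = 0, NE = 1, CS = 2), are assumptions made for this example
and are not the GAS definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the enum added to tc-arm.h by this patch.  */
enum pred_type { SCALAR_PRED, VECTOR_PRED };

/* Hypothetical, reduced stand-in for struct current_pred.  */
struct pred_block
{
  int cc;               /* Condition the block was opened with.  */
  enum pred_type type;  /* IT block (scalar) or VPT/VPST block (vector).  */
};

/* A condition fits the block when it is the block's condition or its
   inverse, i.e. when the two values differ at most in bit 0.  */
static int
pred_compatible (const struct pred_block *blk, int cond)
{
  assert (blk != NULL);
  return (cond & ~1) == (blk->cc & ~1);
}
```

With a block opened on EQ (0), both EQ (0) and NE (1) are accepted and
CS (2) is rejected.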


This patch further implements the MVE instructions VPST, VADD, VSUB and
VABD.  VPST is completely new, but the mnemonics VADD, VSUB and VABD clash
with NEON; this patch illustrates how other instructions with similar
mnemonic clashes are handled.
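
The idea behind resolving such a clash can be sketched as follows. This
is a simplified model, not the code in tc-arm.c: struct target, enum
encoding and select_encoding are hypothetical names, and the model covers
only the vector-predicated and unpredicated cases. It assumes that a
vector predication suffix is only meaningful for MVE, and that without
predication MVE is preferred when present, with NEON as the fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what the selected CPU/FPU offers.  */
struct target
{
  bool has_mve;
  bool has_neon;
};

enum encoding { ENC_ERROR, ENC_NEON, ENC_MVE };

/* Pick the encoding for a mnemonic shared by NEON and MVE.  */
static enum encoding
select_encoding (const struct target *t, bool vector_predicated)
{
  assert (t != NULL);

  /* A vector predication suffix requires MVE.  */
  if (vector_predicated)
    return t->has_mve ? ENC_MVE : ENC_ERROR;

  /* Unpredicated: use MVE when available, otherwise try NEON.  */
  if (t->has_mve)
    return ENC_MVE;
  return t->has_neon ? ENC_NEON : ENC_ERROR;
}
```

The sketch leaves out the scalar-conditional case and the later
diagnostics issued by the predication state machine.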

Cheers,
Andre



gas/ChangeLog:

2019-05-01  Andre Vieira  <andre.simoesdiasvieira@arm.com>

	* config/tc-arm.c (enum it_instruction_type): Rename to...
	(enum pred_instruction_type): ... this. Include VPT types.
	(it_insn_type): Rename to ...
	(pred_insn_type): ... this.
	(arm_it): Change comment.
	(enum arm_reg_type): Add new value.
	(reg_expected_msgs): New entry.
	(asm_opcode): Add mayBeVecPred member.
	(BAD_SYNTAX, BAD_NOT_VPT, BAD_OUT_VPT, BAD_VPT_COND, MVE_NOT_IT,
	MVE_NOT_VPT, MVE_BAD_PC, MVE_BAD_SP): New diagnostic MACROS.
	(arm_vcond_hsh): New table for vector condition codes.
	(now_it): Rename to ...
	(now_pred): ... this.
	(now_it_compatible): Rename to ...
	(now_pred_compatible): ... this.
	(in_it_block): Rename to ...
	(in_pred_block): ... this.
	(handle_it_state): Rename to ...
	(handle_pred_state): ... this. And change it to accept VPT blocks.
	(set_it_insn_type): Rename to ...
	(set_pred_insn_type): ... this.
	(set_it_insn_type_nonvoid): Rename to ...
	(set_pred_insn_type_nonvoid): ... this.
	(set_it_insn_type_last): Rename to ...
	(set_pred_insn_type_last): ... this.
	(record_feature_use): Moved.
	(mark_feature_used): Likewise.
	(parse_typed_reg_or_scalar): Add new case for REG_TYPE_MQ.
	(emit_insn): Use renamed functions and variables.
	(enum operand_parse_code): Add new operands.
	(parse_operands): Handle new operands.
	(do_scalar_fp16_v82_encode): Change predication detection.
	(do_it): Use renamed functions and variables.
	(do_t_add_sub): Likewise.
	(do_t_arit3): Likewise.
	(do_t_arit3c): Likewise.
	(do_t_blx): Likewise.
	(do_t_branch): Likewise.
	(do_t_bkpt_hlt1): Likewise.
	(do_t_branch23): Likewise.
	(do_t_bx): Likewise.
	(do_t_bxj): Likewise.
	(do_t_cond): Likewise.
	(do_t_csdb): Likewise.
	(do_t_cps): Likewise.
	(do_t_cpsi): Likewise.
	(do_t_cbz): Likewise.
	(do_t_it): Likewise.
	(do_mve_vpt): New function to handle VPT blocks.
	(encode_thumb2_multi): Use renamed functions and variables.
	(do_t_ldst): Use renamed functions and variables.
	(do_t_mov_cmp): Likewise.
	(do_t_mvn_tst): Likewise.
	(do_t_mul): Likewise.
	(do_t_nop): Likewise.
	(do_t_neg): Likewise.
	(do_t_rsb): Likewise.
	(do_t_setend): Likewise.
	(do_t_shift): Likewise.
	(do_t_smc): Likewise.
	(do_t_tb): Likewise.
	(do_t_udf): Likewise.
	(do_t_loloop): Likewise.
	(do_neon_cvt_1): Likewise.
	(do_vfp_nsyn_cvt_fpv8): Likewise.
	(do_vsel): Likewise.
	(do_vmaxnm): Likewise.
	(do_vrint_1): Likewise.
	(do_crypto_2op_1): Likewise.
	(do_crypto_3op_1): Likewise.
	(do_crc32_1): Likewise.
	(it_fsm_pre_encode): Likewise.
	(it_fsm_post_encode): Likewise.
	(force_automatic_it_block_close): Likewise.
	(check_it_blocks_finished): Likewise.
	(check_pred_blocks_finished): Likewise.
	(arm_cleanup): Likewise.
	(now_it_add_mask): Rename to ...
	(now_pred_add_mask): ... this. And use new variables and functions.
	(NEON_ENC_TAB): Add entries for vabdl, vaddl and vsubl.
	(N_I_MVE, N_F_MVE, N_SU_MVE): New MACROs.
	(neon_check_type): Generalize error message.
	(mve_encode_qqr): New MVE generic encoding function.
	(neon_dyadic_misc): Change to accept MVE variants.
	(do_neon_dyadic_if_su): Likewise.
	(do_neon_addsub_if_i): Likewise.
	(do_neon_dyadic_long): Likewise.
	(vfp_or_neon_is_neon): Add extra checks.
	(check_simd_pred_availability): Helper function to check
	SIMD instruction availability with respect to predication.
	(enum opcode_tag): New suffix value.
	(opcode_lookup): Change to handle VPT blocks.
	(new_automatic_it_block): Rename to ...
	(close_automatic_it_block): ...this.
	(TxCE, TxC3, TxC3w, TUE, TUEc, TUF, CE, C3, ToC, ToU,
	toC, toU, CL, cCE, cCL, C3E, xCM_, UE, UF, NUF, nUF,
	NCE_tag, NCE, NCEF, nCE_tag, nCE, nCEF): Add default value for new
	field.
	(mCEF, mnCEF, mnCE, MNUF, mnUF, mToC, MNCE, MNCEF): New
	MACROs.
	(insns): Redefine vadd, vsub, vabd, vabdl, vaddl, vsubl to accept
	MVE variants. Add entries for vscclrm, and vpst.
	(md_begin): Add arm_vcond_hsh initialization.
	* config/tc-arm.h (enum it_state): Rename to...
	(enum pred_state): ...this.
	(struct current_it): Rename to...
	(struct current_pred): ...this.
	(enum pred_type): New enum.
	(struct arm_segment_info_type): Use current_pred.
	* testsuite/gas/arm/armv8_3-a-fp-bad.l: Update error message.
	* testsuite/gas/arm/armv8_3-a-simd-bad.l: Update error message.
	* testsuite/gas/arm/dotprod-illegal.l: Update error message.
	* testsuite/gas/arm/mve-vaddsubabd-bad-1.d: New test.
	* testsuite/gas/arm/mve-vaddsubabd-bad-1.l: New test.
	* testsuite/gas/arm/mve-vaddsubabd-bad-1.s: New test.
	* testsuite/gas/arm/mve-vaddsubabd-bad-2.d: New test.
	* testsuite/gas/arm/mve-vaddsubabd-bad-2.l: New test.
	* testsuite/gas/arm/mve-vaddsubabd-bad-2.s: New test.
	* testsuite/gas/arm/mve-vpst-bad.d: New test.
	* testsuite/gas/arm/mve-vpst-bad.l: New test.
	* testsuite/gas/arm/mve-vpst-bad.s: New test.
	* testsuite/gas/arm/neon-ldst-es-bad.l:

On 01/05/2019 17:51, Andre Vieira (lists) wrote:
> Hi,
>
> This patch series adds support for all M-profile Vector Extension (MVE)
> instructions to GAS and Objdump.  Their specifications can be found on
> Arm Developer (see https://developer.arm.com/docs/ddi0553/latest).
>
> The patch series is split into three main groups:
> - patches 1-36 are GAS patches
> - patches 37-56 are OBJDUMP patches
> - patch 57 contains all positive testing
>
> The reason to split the testing is because we use assembly macros to
> generate extensive testing, which leads to massive 'expected result'
> files. Which would require zipping most of the patches to be able to
> send them over email. So instead we decided to collate all positive
> testing into one patch and only zip that one. The negative tests are
> smaller and have been included per relevant patch.
>
> The expected value for positive tests have been compared to a different,
> internal implementation.
>
> Cheers,
> Andre

Comments

Nick Clifton May 2, 2019, 10:56 a.m. | #1
Hi Andre,

> This patch adds most of the framework used by the rest of the GAS patches for MVE.


I noticed that this function:

> +static int
> +check_simd_pred_availability (int fp, unsigned check)


returns an integer value, but it is only ever used in boolean
tests.  IMHO it should either have a bfd_boolean return type,
or else an enum with the return values having textual names to
indicate their meaning.
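
A minimal sketch of the enum alternative mentioned above, for
illustration only. The enumerator names and simd_avail_describe are
invented for this example; the numeric values follow the 0 to 3 returns
of the function as posted (0 for available, 1 for a vector-predicated
instruction without MVE, 2 for a scalar-conditional instruction that
fails the NEON check, 3 for an unpredicated instruction with neither MVE
nor NEON).

```c
#include <assert.h>

/* Hypothetical named results replacing the bare 0..3 return values.  */
enum simd_avail
{
  SIMD_AVAILABLE = 0,          /* Instruction may be assembled.  */
  SIMD_VPT_WITHOUT_MVE = 1,    /* Vector predicated, but no MVE.  */
  SIMD_COND_NOT_NEON = 2,      /* Scalar condition, NEON check failed.  */
  SIMD_UNCOND_UNAVAILABLE = 3  /* Unpredicated, neither MVE nor NEON.  */
};

/* Return a short description of RESULT, e.g. for debugging output.  */
static const char *
simd_avail_describe (enum simd_avail result)
{
  switch (result)
    {
    case SIMD_AVAILABLE:
      return "available";
    case SIMD_VPT_WITHOUT_MVE:
      return "vector predication requires MVE";
    case SIMD_COND_NOT_NEON:
      return "conditional form is not valid NEON";
    case SIMD_UNCOND_UNAVAILABLE:
      return "neither MVE nor NEON provides this instruction";
    }
  assert (0 && "unexpected simd_avail value");
  return "unknown";
}
```

Callers would then compare against SIMD_AVAILABLE rather than against
zero.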

I also saw that in do_neon_logic() there is a test against 
the function returning FAIL:

      if (rs == NS_QQQ
	  && check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC)
	  == FAIL)

But FAIL is not one of the values explicitly returned by 
check_simd_pred_availability()....
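
The consequence can be shown with a small stand-alone model. It assumes
FAIL is defined as -1, as is conventional in tc-arm.c; old_check and the
two bails_out helpers are hypothetical stand-ins, not functions from the
patch. Because the function only ever returns 0 to 3, a comparison
against FAIL is never true, so that caller would never bail out.

```c
#include <assert.h>

/* Assumed to match the FAIL convention used in tc-arm.c.  */
#define FAIL (-1)

/* Stand-in for the function as posted: 0 means available, 1..3 are the
   three failure exits.  */
static int
old_check (int exit_path)
{
  assert (exit_path >= 0 && exit_path <= 3);
  return exit_path;
}

/* The do_neon_logic style of test: never true for 0..3.  */
static int
bails_out_comparing_fail (int exit_path)
{
  return old_check (exit_path) == FAIL;
}

/* The style used by the other callers: true for every failure exit.  */
static int
bails_out_on_nonzero (int exit_path)
{
  return old_check (exit_path) != 0;
}
```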

Cheers
  Nick
Andre Vieira (lists) May 13, 2019, 1:41 p.m. | #2
Hi,

After Nick's comments I decided to clean up the definition and uses of
check_simd_pred_availability.  I hope this makes the function clearer.
This patch is to be applied on top of the MVE series (that was easier and
cleaner than rebasing everything).

Is this OK?

Cheers,
Andre

gas/ChangeLog
2019-05-13  Andre Vieira  <andre.simoesdiasvieira@arm.com>

         * config/tc-arm.c (check_simd_pred_availability): Refactor.
         (do_neon_dyadic_i_su): Refactor use of
         check_simd_pred_availability.
         (do_neon_dyadic_i64_su): Likewise.
         (do_neon_shl): Likewise.
         (do_neon_qshl): Likewise.
         (do_neon_rshl): Likewise.
         (do_neon_logic): Likewise.
         (do_neon_dyadic_if_su): Likewise.
         (do_neon_addsub_if_i): Likewise.
         (do_neon_mac_maybe_scalar): Likewise.
         (do_neon_fmac): Likewise.
         (do_neon_mul): Likewise.
         (do_neon_qdmulh): Likewise.
         (do_neon_qrdmlah): Likewise.
         (do_neon_abs_neg): Likewise.
         (do_neon_sli): Likewise.
         (do_neon_sri): Likewise.
         (do_neon_qshlu_imm): Likewise.
         (do_neon_cvt_1): Likewise.
         (do_neon_cvttb_1): Likewise.
         (do_neon_mvn): Likewise.
         (do_neon_rev): Likewise.
         (do_neon_dup): Likewise.
         (do_neon_mov): Likewise.
         (do_neon_rshift_round_imm): Likewise.
         (do_neon_sat_abs_neg): Likewise.
         (do_neon_cls): Likewise.
         (do_neon_clz): Likewise.
         (do_vmaxnm): Likewise.
         (do_vrint_1): Likewise.
         (do_vcmla): Likewise.
         (do_vcadd): Likewise.

On 02/05/2019 11:56, Nick Clifton wrote:
> Hi Andre,
>
>> This patch adds most of the framework used by the rest of the GAS patches for MVE.
>
> I noticed that this function:
>
>> +static int
>> +check_simd_pred_availability (int fp, unsigned check)
>
> returns an integer value, but it is only ever used in boolean
> tests.  IMHO it should either have a bfd_boolean return type,
> or else an enum with the return values having textual names to
> indicate their meaning.
>
> I also saw that in do_neon_logic() there is a test against
> the function returning FAIL:
>
>        if (rs == NS_QQQ
> 	  && check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC)
> 	  == FAIL)
>
> But FAIL is not one of the values explicitly returned by
> check_simd_pred_availability()....
>
> Cheers
>    Nick
>
diff --git a/gas/config/tc-arm.c b/gas/config/tc-arm.c
index 20db5d9b278ccda70ac266f7df9f18e87cc430ea..6ba5e735cb78ec0e6f9d8ea1d93dc4a993a1a9a3 100644
--- a/gas/config/tc-arm.c
+++ b/gas/config/tc-arm.c
@@ -16562,7 +16562,13 @@ if (!thumb_mode && (check & NEON_CHECK_CC))
 return SUCCESS;
 }
 
-static int
+
+/* Return TRUE if the SIMD instruction is available for the current
+   cpu_variant.  FP is set to TRUE if this is a SIMD floating-point
+   instruction.  CHECK contains the set of bits to pass to
+   vfp_or_neon_is_neon for the NEON specific checks.  */
+
+static bfd_boolean
 check_simd_pred_availability (int fp, unsigned check)
 {
 if (inst.cond > COND_ALWAYS)
@@ -16570,7 +16576,7 @@ if (inst.cond > COND_ALWAYS)
     if (!ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
       {
 	inst.error = BAD_FPU;
-	return 1;
+	return FALSE;
       }
     inst.pred_insn_type = INSIDE_VPT_INSN;
   }
@@ -16579,18 +16585,18 @@ else if (inst.cond < COND_ALWAYS)
     if (ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
       inst.pred_insn_type = MVE_OUTSIDE_PRED_INSN;
     else if (vfp_or_neon_is_neon (check) == FAIL)
-      return 2;
+      return FALSE;
   }
 else
   {
     if (!ARM_CPU_HAS_FEATURE (cpu_variant, fp ? mve_fp_ext : mve_ext)
 	&& vfp_or_neon_is_neon (check) == FAIL)
-      return 3;
+      return FALSE;
 
     if (ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
       inst.pred_insn_type = MVE_OUTSIDE_PRED_INSN;
   }
-return 0;
+return TRUE;
 }
 
 /* Neon instruction encoders, in approximate order of appearance.  */
@@ -16598,7 +16604,7 @@ return 0;
 static void
 do_neon_dyadic_i_su (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
    return;
 
   enum neon_shape rs;
@@ -16620,7 +16626,7 @@ do_neon_dyadic_i_su (void)
 static void
 do_neon_dyadic_i64_su (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_CC | NEON_CHECK_ARCH))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_CC | NEON_CHECK_ARCH))
     return;
   enum neon_shape rs;
   struct neon_type_el et;
@@ -16662,7 +16668,7 @@ neon_imm_shift (int write_ubit, int uval, int isquad, struct neon_type_el et,
 static void
 do_neon_shl (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
    return;
 
   if (!inst.operands[2].isreg)
@@ -16742,7 +16748,7 @@ do_neon_shl (void)
 static void
 do_neon_qshl (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
    return;
 
   if (!inst.operands[2].isreg)
@@ -16816,7 +16822,7 @@ do_neon_qshl (void)
 static void
 do_neon_rshl (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
    return;
 
   enum neon_shape rs;
@@ -16930,8 +16936,8 @@ do_neon_logic (void)
     {
       enum neon_shape rs = neon_select_shape (NS_DDD, NS_QQQ, NS_NULL);
       if (rs == NS_QQQ
-	  && check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC)
-	  == FAIL)
+	  && !check_simd_pred_availability (FALSE,
+					    NEON_CHECK_ARCH | NEON_CHECK_CC))
 	return;
       else if (rs != NS_QQQ
 	       && !ARM_CPU_HAS_FEATURE (cpu_variant, fpu_neon_ext_v1))
@@ -16953,8 +16959,8 @@ do_neon_logic (void)
       /* Because neon_select_shape makes the second operand a copy of the first
 	 if the second operand is not present.  */
       if (rs == NS_QQI
-	  && check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC)
-	  == FAIL)
+	  && !check_simd_pred_availability (FALSE,
+					    NEON_CHECK_ARCH | NEON_CHECK_CC))
 	return;
       else if (rs != NS_QQI
 	       && !ARM_CPU_HAS_FEATURE (cpu_variant, fpu_neon_ext_v1))
@@ -17397,8 +17403,8 @@ do_neon_dyadic_if_su (void)
 	      && et.type == NT_float
 	      && !ARM_CPU_HAS_FEATURE (cpu_variant,fpu_neon_ext_v1), BAD_FPU);
 
-  if (check_simd_pred_availability (et.type == NT_float,
-				    NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (et.type == NT_float,
+				     NEON_CHECK_ARCH | NEON_CHECK_CC))
     return;
 
   neon_dyadic_misc (NT_unsigned, N_SUF_32, 0);
@@ -17422,8 +17428,8 @@ do_neon_addsub_if_i (void)
      they are predicated or not.  */
   if ((rs == NS_QQQ || rs == NS_QQR) && et.size != 64)
     {
-      if (check_simd_pred_availability (et.type == NT_float,
-					NEON_CHECK_ARCH | NEON_CHECK_CC))
+      if (!check_simd_pred_availability (et.type == NT_float,
+					 NEON_CHECK_ARCH | NEON_CHECK_CC))
 	return;
     }
   else
@@ -17584,7 +17590,7 @@ do_neon_mac_maybe_scalar (void)
   if (try_vfp_nsyn (3, do_vfp_nsyn_mla_mls) == SUCCESS)
     return;
 
-  if (check_simd_pred_availability (0, NEON_CHECK_CC | NEON_CHECK_ARCH))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_CC | NEON_CHECK_ARCH))
     return;
 
   if (inst.operands[2].isscalar)
@@ -17621,7 +17627,7 @@ do_neon_fmac (void)
       && try_vfp_nsyn (3, do_vfp_nsyn_fma_fms) == SUCCESS)
     return;
 
-  if (check_simd_pred_availability (1, NEON_CHECK_CC | NEON_CHECK_ARCH))
+  if (!check_simd_pred_availability (TRUE, NEON_CHECK_CC | NEON_CHECK_ARCH))
     return;
 
   if (ARM_CPU_HAS_FEATURE (cpu_variant, mve_fp_ext))
@@ -17675,7 +17681,7 @@ do_neon_mul (void)
   if (try_vfp_nsyn (3, do_vfp_nsyn_mul) == SUCCESS)
     return;
 
-  if (check_simd_pred_availability (0, NEON_CHECK_CC | NEON_CHECK_ARCH))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_CC | NEON_CHECK_ARCH))
     return;
 
   if (inst.operands[2].isscalar)
@@ -17708,7 +17714,7 @@ do_neon_mul (void)
 static void
 do_neon_qdmulh (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
    return;
 
   if (inst.operands[2].isscalar)
@@ -18145,7 +18151,7 @@ do_mve_vmaxv (void)
 static void
 do_neon_qrdmlah (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
    return;
   if (!ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
     {
@@ -18225,8 +18231,8 @@ do_neon_abs_neg (void)
   rs = neon_select_shape (NS_DD, NS_QQ, NS_NULL);
   et = neon_check_type (2, rs, N_EQK, N_S_32 | N_F_16_32 | N_KEY);
 
-  if (check_simd_pred_availability (et.type == NT_float,
-				    NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (et.type == NT_float,
+				     NEON_CHECK_ARCH | NEON_CHECK_CC))
     return;
 
   inst.instruction |= LOW4 (inst.operands[0].reg) << 12;
@@ -18243,7 +18249,7 @@ do_neon_abs_neg (void)
 static void
 do_neon_sli (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
     return;
 
   enum neon_shape rs;
@@ -18269,7 +18275,7 @@ do_neon_sli (void)
 static void
 do_neon_sri (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
     return;
 
   enum neon_shape rs;
@@ -18294,7 +18300,7 @@ do_neon_sri (void)
 static void
 do_neon_qshlu_imm (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
     return;
 
   enum neon_shape rs;
@@ -18767,7 +18773,8 @@ do_neon_cvt_1 (enum neon_cvt_mode mode)
 	      || flavour == neon_cvt_flavour_s32_f32
 	      || flavour == neon_cvt_flavour_u32_f32))
 	{
-	  if (check_simd_pred_availability (1, NEON_CHECK_CC | NEON_CHECK_ARCH))
+	  if (!check_simd_pred_availability (TRUE,
+					     NEON_CHECK_CC | NEON_CHECK_ARCH))
 	    return;
 	}
       else if (mode == neon_cvt_mode_n)
@@ -18854,8 +18861,8 @@ do_neon_cvt_1 (enum neon_cvt_mode mode)
 	      || flavour == neon_cvt_flavour_s32_f32
 	      || flavour == neon_cvt_flavour_u32_f32))
 	{
-	  if (check_simd_pred_availability (1,
-					    NEON_CHECK_CC | NEON_CHECK_ARCH8))
+	  if (!check_simd_pred_availability (TRUE,
+					     NEON_CHECK_CC | NEON_CHECK_ARCH8))
 	    return;
 	}
       else if (mode == neon_cvt_mode_z
@@ -18868,8 +18875,8 @@ do_neon_cvt_1 (enum neon_cvt_mode mode)
 		   || flavour == neon_cvt_flavour_s32_f32
 		   || flavour == neon_cvt_flavour_u32_f32))
 	{
-	  if (check_simd_pred_availability (1,
-					    NEON_CHECK_CC | NEON_CHECK_ARCH))
+	  if (!check_simd_pred_availability (TRUE,
+					     NEON_CHECK_CC | NEON_CHECK_ARCH))
 	    return;
 	}
       /* fall through.  */
@@ -18878,8 +18885,8 @@ do_neon_cvt_1 (enum neon_cvt_mode mode)
 	{
 
 	  NEON_ENCODE (FLOAT, inst);
-	  if (check_simd_pred_availability (1,
-					    NEON_CHECK_CC | NEON_CHECK_ARCH8))
+	  if (!check_simd_pred_availability (TRUE,
+					     NEON_CHECK_CC | NEON_CHECK_ARCH8))
 	    return;
 
 	  inst.instruction |= LOW4 (inst.operands[0].reg) << 12;
@@ -19039,7 +19046,7 @@ do_neon_cvttb_1 (bfd_boolean t)
   else if (rs == NS_QQ || rs == NS_QQI)
     {
       int single_to_half = 0;
-      if (check_simd_pred_availability (1, NEON_CHECK_ARCH))
+      if (!check_simd_pred_availability (TRUE, NEON_CHECK_ARCH))
 	return;
 
       enum neon_cvt_flavour flavour = get_neon_cvt_flavour (rs);
@@ -19179,7 +19186,7 @@ neon_move_immediate (void)
 static void
 do_neon_mvn (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_CC | NEON_CHECK_ARCH))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_CC | NEON_CHECK_ARCH))
     return;
 
   if (inst.operands[1].isreg)
@@ -19523,7 +19530,7 @@ do_neon_ext (void)
 static void
 do_neon_rev (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
    return;
 
   enum neon_shape rs;
@@ -19588,7 +19595,7 @@ do_neon_dup (void)
 	N_8 | N_16 | N_32 | N_KEY, N_EQK);
       if (rs == NS_QR)
 	{
-	  if (check_simd_pred_availability (0, NEON_CHECK_ARCH))
+	  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH))
 	    return;
 	}
       else
@@ -19754,7 +19761,8 @@ do_neon_mov (void)
 
     case NS_QQ:  /* case 0/1.  */
       {
-	if (check_simd_pred_availability (0, NEON_CHECK_CC | NEON_CHECK_ARCH))
+	if (!check_simd_pred_availability (FALSE,
+					   NEON_CHECK_CC | NEON_CHECK_ARCH))
 	  return;
 	/* The architecture manual I have doesn't explicitly state which
 	   value the U bit should have for register->register moves, but
@@ -19784,7 +19792,8 @@ do_neon_mov (void)
       /* fall through.  */
 
     case NS_QI:  /* case 2/3.  */
-      if (check_simd_pred_availability (0, NEON_CHECK_CC | NEON_CHECK_ARCH))
+      if (!check_simd_pred_availability (FALSE,
+					 NEON_CHECK_CC | NEON_CHECK_ARCH))
 	return;
       inst.instruction = 0x0800010;
       neon_move_immediate ();
@@ -20089,7 +20098,7 @@ do_mve_movl (void)
 static void
 do_neon_rshift_round_imm (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
    return;
 
   enum neon_shape rs;
@@ -20186,7 +20195,7 @@ do_neon_zip_uzp (void)
 static void
 do_neon_sat_abs_neg (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_CC | NEON_CHECK_ARCH))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_CC | NEON_CHECK_ARCH))
     return;
 
   enum neon_shape rs;
@@ -20222,7 +20231,7 @@ do_neon_recip_est (void)
 static void
 do_neon_cls (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
     return;
 
   enum neon_shape rs;
@@ -20239,7 +20248,7 @@ do_neon_cls (void)
 static void
 do_neon_clz (void)
 {
-  if (check_simd_pred_availability (0, NEON_CHECK_ARCH | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (FALSE, NEON_CHECK_ARCH | NEON_CHECK_CC))
     return;
 
   enum neon_shape rs;
@@ -20792,7 +20801,7 @@ do_vmaxnm (void)
   if (try_vfp_nsyn (3, do_vfp_nsyn_fpv8) == SUCCESS)
     return;
 
-  if (check_simd_pred_availability (1, NEON_CHECK_CC | NEON_CHECK_ARCH8))
+  if (!check_simd_pred_availability (TRUE, NEON_CHECK_CC | NEON_CHECK_ARCH8))
     return;
 
   neon_dyadic_misc (NT_untyped, N_F_16_32, 0);
@@ -20856,7 +20865,8 @@ do_vrint_1 (enum neon_cvt_mode mode)
       if (et.type == NT_invtype)
 	return;
 
-      if (check_simd_pred_availability (1, NEON_CHECK_CC | NEON_CHECK_ARCH8))
+      if (!check_simd_pred_availability (TRUE,
+					 NEON_CHECK_CC | NEON_CHECK_ARCH8))
 	return;
 
       NEON_ENCODE (FLOAT, inst);
@@ -20959,7 +20969,8 @@ do_vcmla (void)
 	      _("immediate out of range"));
   rot /= 90;
 
-  if (check_simd_pred_availability (1, NEON_CHECK_ARCH8 | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (TRUE,
+				     NEON_CHECK_ARCH8 | NEON_CHECK_CC))
     return;
 
   if (inst.operands[2].isscalar)
@@ -21036,8 +21047,8 @@ do_vcadd (void)
   if (et.type == NT_invtype)
     return;
 
-  if (check_simd_pred_availability (et.type == NT_float, NEON_CHECK_ARCH8
-				    | NEON_CHECK_CC))
+  if (!check_simd_pred_availability (et.type == NT_float,
+				     NEON_CHECK_ARCH8 | NEON_CHECK_CC))
     return;
 
   if (et.type == NT_float)
Nick Clifton May 14, 2019, 4:53 p.m. | #3
Hi Andre,

> Sure no problem. Let's see if it all fits in one email.


Thanks - that was very helpful.  I have no more concerns about the patch,
so please consider it approved and apply it at your leisure.

Cheers
  Nick
Nick Clifton May 14, 2019, 4:54 p.m. | #4
Hi Andre,

  Sorry, just to be clear - I meant that the entire patch series
  is approved, not just patch 2/57....

Cheers
  Nick

Patch

diff --git a/gas/config/tc-arm.h b/gas/config/tc-arm.h
index 0d5e79c7ad9fe9a8c416d6f95609a0e2596eba26..39cc9680b96524c730c15fff1af22b180b9c70d1 100644
--- a/gas/config/tc-arm.h
+++ b/gas/config/tc-arm.h
@@ -254,21 +254,25 @@  arm_min (int am_p1, int am_p2)
 /* Registers are generally saved at negative offsets to the CFA.  */
 #define DWARF2_CIE_DATA_ALIGNMENT     (-4)
 
-/* State variables for IT block handling.  */
-enum it_state
+/* State variables for predication block handling.  */
+enum pred_state
 {
-  OUTSIDE_IT_BLOCK, MANUAL_IT_BLOCK, AUTOMATIC_IT_BLOCK
+  OUTSIDE_PRED_BLOCK, MANUAL_PRED_BLOCK, AUTOMATIC_PRED_BLOCK
 };
-struct current_it
+enum pred_type {
+  SCALAR_PRED, VECTOR_PRED
+};
+struct current_pred
 {
   int mask;
-  enum it_state state;
+  enum pred_state state;
   int cc;
   int block_length;
   char *insn;
   int state_handled;
   int warn_deprecated;
   int insn_cond;
+  enum pred_type type;
 };
 
 #ifdef OBJ_ELF
@@ -303,7 +307,7 @@  struct arm_segment_info_type
      emitted only once per section, to save unnecessary bloat.  */
   unsigned int marked_pr_dependency;
 
-  struct current_it current_it;
+  struct current_pred current_pred;
 };
 
 /* We want .cfi_* pseudo-ops for generating unwind info.  */
diff --git a/gas/config/tc-arm.c b/gas/config/tc-arm.c
index 692a73cc20be01a043dfff5fbf4ab0d3bdc87f2c..19729d290c7c5de95ee620d5b6817ab6497b613d 100644
--- a/gas/config/tc-arm.c
+++ b/gas/config/tc-arm.c
@@ -453,16 +453,20 @@  struct neon_type
   unsigned elems;
 };
 
-enum it_instruction_type
+enum pred_instruction_type
 {
-   OUTSIDE_IT_INSN,
+   OUTSIDE_PRED_INSN,
+   INSIDE_VPT_INSN,
    INSIDE_IT_INSN,
    INSIDE_IT_LAST_INSN,
    IF_INSIDE_IT_LAST_INSN, /* Either outside or inside;
 			      if inside, should be the last one.  */
    NEUTRAL_IT_INSN,        /* This could be either inside or outside,
 			      i.e. BKPT and NOP.  */
-   IT_INSN                 /* The IT insn has been parsed.  */
+   IT_INSN,		   /* The IT insn has been parsed.  */
+   VPT_INSN,		   /* The VPT/VPST insn has been parsed.  */
+   MVE_OUTSIDE_PRED_INSN   /* Instruction to indicate a MVE instruction without
+			      a predication code.  */
 };
 
 /* The maximum number of operands we need.  */
@@ -494,7 +498,7 @@  struct arm_it
     int			     pc_rel;
   } relocs[ARM_IT_MAX_RELOCS];
 
-  enum it_instruction_type it_insn_type;
+  enum pred_instruction_type pred_insn_type;
 
   struct
   {
@@ -511,7 +515,7 @@  struct arm_it
        instructions. This allows us to disambiguate ARM <-> vector insns.  */
     unsigned regisimm   : 1;  /* 64-bit immediate, reg forms high 32 bits.  */
     unsigned isvec      : 1;  /* Is a single, double or quad VFP/Neon reg.  */
-    unsigned isquad     : 1;  /* Operand is Neon quad-precision register.  */
+    unsigned isquad     : 1;  /* Operand is SIMD quad register.  */
     unsigned issingle   : 1;  /* Operand is VFP single-precision register.  */
     unsigned hasreloc	: 1;  /* Operand has relocation suffix.  */
     unsigned writeback	: 1;  /* Operand has trailing !  */
@@ -633,12 +637,13 @@  enum arm_reg_type
   REG_TYPE_MVFX,
   REG_TYPE_MVDX,
   REG_TYPE_MVAX,
+  REG_TYPE_MQ,
   REG_TYPE_DSPSC,
   REG_TYPE_MMXWR,
   REG_TYPE_MMXWC,
   REG_TYPE_MMXWCG,
   REG_TYPE_XSCALE,
-  REG_TYPE_RNB
+  REG_TYPE_RNB,
 };
 
 /* Structure for a hash table entry for a register.
@@ -680,6 +685,7 @@  const char * const reg_expected_msgs[] =
   [REG_TYPE_MMXWC]  = N_("iWMMXt control register expected"),
   [REG_TYPE_MMXWCG] = N_("iWMMXt scalar register expected"),
   [REG_TYPE_XSCALE] = N_("XScale accumulator register expected"),
+  [REG_TYPE_MQ]	    = N_("MVE vector register expected"),
   [REG_TYPE_RNB]    = N_("")
 };
 
@@ -719,6 +725,9 @@  struct asm_opcode
 
   /* Function to call to encode instruction in Thumb format.  */
   void (* tencode) (void);
+
+  /* Indicates whether this instruction may be vector predicated.  */
+  unsigned int mayBeVecPred : 1;
 };
 
 /* Defines for various bits that we will want to toggle.  */
@@ -841,6 +850,7 @@  struct asm_opcode
 #define THUMB_LOAD_BIT 0x0800
 #define THUMB2_LOAD_BIT 0x00100000
 
+#define BAD_SYNTAX	_("syntax error")
 #define BAD_ARGS	_("bad arguments to instruction")
 #define BAD_SP          _("r13 not allowed here")
 #define BAD_PC		_("r15 not allowed here")
@@ -852,9 +862,13 @@  struct asm_opcode
 #define BAD_BRANCH	_("branch must be last instruction in IT block")
 #define BAD_BRANCH_OFF	_("branch out of range or not a multiple of 2")
 #define BAD_NOT_IT	_("instruction not allowed in IT block")
+#define BAD_NOT_VPT	_("instruction missing MVE vector predication code")
 #define BAD_FPU		_("selected FPU does not support instruction")
 #define BAD_OUT_IT 	_("thumb conditional instruction should be in IT block")
+#define BAD_OUT_VPT	\
+	_("vector predicated instruction should be in VPT/VPST block")
 #define BAD_IT_COND	_("incorrect condition in IT block")
+#define BAD_VPT_COND	_("incorrect condition in VPT/VPST block")
 #define BAD_IT_IT 	_("IT falling in the range of a previous IT block")
 #define MISSING_FNSTART	_("missing .fnstart before unwinding directive")
 #define BAD_PC_ADDRESSING \
@@ -865,9 +879,18 @@  struct asm_opcode
 #define BAD_FP16	_("selected processor does not support fp16 instruction")
 #define UNPRED_REG(R)	_("using " R " results in unpredictable behaviour")
 #define THUMB1_RELOC_ONLY  _("relocation valid in thumb1 code only")
+#define MVE_NOT_IT	_("Warning: instruction is UNPREDICTABLE in an IT " \
+			  "block")
+#define MVE_NOT_VPT	_("Warning: instruction is UNPREDICTABLE in a VPT " \
+			  "block")
+#define MVE_BAD_PC	_("Warning: instruction is UNPREDICTABLE with PC" \
+			  " operand")
+#define MVE_BAD_SP	_("Warning: instruction is UNPREDICTABLE with SP" \
+			  " operand")
 
 static struct hash_control * arm_ops_hsh;
 static struct hash_control * arm_cond_hsh;
+static struct hash_control * arm_vcond_hsh;
 static struct hash_control * arm_shift_hsh;
 static struct hash_control * arm_psr_hsh;
 static struct hash_control * arm_v7m_psr_hsh;
@@ -919,15 +942,15 @@  typedef enum asmfunc_states
 static asmfunc_states asmfunc_state = OUTSIDE_ASMFUNC;
 
 #ifdef OBJ_ELF
-#  define now_it seg_info (now_seg)->tc_segment_info_data.current_it
+#  define now_pred seg_info (now_seg)->tc_segment_info_data.current_pred
 #else
-static struct current_it now_it;
+static struct current_pred now_pred;
 #endif
 
 static inline int
-now_it_compatible (int cond)
+now_pred_compatible (int cond)
 {
-  return (cond & ~1) == (now_it.cc & ~1);
+  return (cond & ~1) == (now_pred.cc & ~1);
 }
 
 static inline int
@@ -936,39 +959,39 @@  conditional_insn (void)
   return inst.cond != COND_ALWAYS;
 }
 
-static int in_it_block (void);
+static int in_pred_block (void);
 
-static int handle_it_state (void);
+static int handle_pred_state (void);
 
 static void force_automatic_it_block_close (void);
 
 static void it_fsm_post_encode (void);
 
-#define set_it_insn_type(type)			\
+#define set_pred_insn_type(type)			\
   do						\
     {						\
-      inst.it_insn_type = type;			\
-      if (handle_it_state () == FAIL)		\
+      inst.pred_insn_type = type;			\
+      if (handle_pred_state () == FAIL)		\
 	return;					\
     }						\
   while (0)
 
-#define set_it_insn_type_nonvoid(type, failret) \
+#define set_pred_insn_type_nonvoid(type, failret) \
   do						\
     {                                           \
-      inst.it_insn_type = type;			\
-      if (handle_it_state () == FAIL)		\
+      inst.pred_insn_type = type;			\
+      if (handle_pred_state () == FAIL)		\
 	return failret;				\
     }						\
   while(0)
 
-#define set_it_insn_type_last()				\
+#define set_pred_insn_type_last()				\
   do							\
     {							\
       if (inst.cond == COND_ALWAYS)			\
-	set_it_insn_type (IF_INSIDE_IT_LAST_INSN);	\
+	set_pred_insn_type (IF_INSIDE_IT_LAST_INSN);	\
       else						\
-	set_it_insn_type (INSIDE_IT_LAST_INSN);		\
+	set_pred_insn_type (INSIDE_IT_LAST_INSN);		\
     }							\
   while (0)
 
@@ -1500,6 +1523,32 @@  parse_neon_operand_type (struct neon_type_el *vectype, char **ccp)
 #define NEON_ALL_LANES		15
 #define NEON_INTERLEAVE_LANES	14
 
+/* Record a use of the given feature.  */
+static void
+record_feature_use (const arm_feature_set *feature)
+{
+  if (thumb_mode)
+    ARM_MERGE_FEATURE_SETS (thumb_arch_used, thumb_arch_used, *feature);
+  else
+    ARM_MERGE_FEATURE_SETS (arm_arch_used, arm_arch_used, *feature);
+}
+
+/* If the given feature is available in the selected CPU, mark it as used.
+   Returns TRUE iff feature is available.  */
+static bfd_boolean
+mark_feature_used (const arm_feature_set *feature)
+{
+  /* Ensure the option is valid on the current architecture.  */
+  if (!ARM_CPU_HAS_FEATURE (cpu_variant, *feature))
+    return FALSE;
+
+  /* Add the appropriate architecture feature for the barrier option used.
+     */
+  record_feature_use (feature);
+
+  return TRUE;
+}
+
 /* Parse either a register or a scalar, with an optional type. Return the
    register number, and optionally fill in the actual type of the register
    when multiple alternatives were given (NEON_TYPE_NDQ) in *RTYPE, and
@@ -1546,6 +1595,26 @@  parse_typed_reg_or_scalar (char **ccp, enum arm_reg_type type,
 	  && (reg->type == REG_TYPE_MMXWCG)))
     type = (enum arm_reg_type) reg->type;
 
+  if (type == REG_TYPE_MQ)
+    {
+      if (!ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
+	return FAIL;
+
+      if (!reg || reg->type != REG_TYPE_NQ)
+	return FAIL;
+
+      if (reg->number > 14 && !mark_feature_used (&fpu_vfp_ext_d32))
+	{
+	  first_error (_("expected MVE register [q0..q7]"));
+	  return FAIL;
+	}
+      type = REG_TYPE_NQ;
+    }
+  else if (ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext)
+	   && (type == REG_TYPE_NQ))
+    return FAIL;
+
+
   if (type != reg->type)
     return FAIL;
 
@@ -3765,10 +3834,10 @@  emit_insn (expressionS *exp, int nbytes)
 	    }
 	  else
 	    {
-	      if (now_it.state == AUTOMATIC_IT_BLOCK)
-		set_it_insn_type_nonvoid (OUTSIDE_IT_INSN, 0);
+	      if (now_pred.state == AUTOMATIC_PRED_BLOCK)
+		set_pred_insn_type_nonvoid (OUTSIDE_PRED_INSN, 0);
 	      else
-		set_it_insn_type_nonvoid (NEUTRAL_IT_INSN, 0);
+		set_pred_insn_type_nonvoid (NEUTRAL_IT_INSN, 0);
 
 	      if (thumb_mode && (size > THUMB_SIZE) && !target_big_endian)
 		emit_thumb32_expr (exp);
@@ -6296,31 +6365,6 @@  parse_cond (char **str)
   return c->value;
 }
 
-/* Record a use of the given feature.  */
-static void
-record_feature_use (const arm_feature_set *feature)
-{
-  if (thumb_mode)
-    ARM_MERGE_FEATURE_SETS (thumb_arch_used, thumb_arch_used, *feature);
-  else
-    ARM_MERGE_FEATURE_SETS (arm_arch_used, arm_arch_used, *feature);
-}
-
-/* If the given feature is currently allowed, mark it as used and return TRUE.
-   Return FALSE otherwise.  */
-static bfd_boolean
-mark_feature_used (const arm_feature_set *feature)
-{
-  /* Ensure the option is currently allowed.  */
-  if (!ARM_CPU_HAS_FEATURE (cpu_variant, *feature))
-    return FALSE;
-
-  /* Add the appropriate architecture feature for the barrier option used.  */
-  record_feature_use (feature);
-
-  return TRUE;
-}
-
 /* Parse an option for a barrier instruction.  Returns the encoding for the
    option, or FAIL.  */
 static int
@@ -6646,10 +6690,15 @@  enum operand_parse_code
   OP_RVS,	/* VFP single precision register */
   OP_RVD,	/* VFP double precision register (0..15) */
   OP_RND,       /* Neon double precision register (0..31) */
+  OP_RNDMQ,     /* Neon double precision (0..31) or MVE vector register.  */
+  OP_RNDMQR,    /* Neon double precision (0..31), MVE vector or ARM register.
+		 */
   OP_RNQ,	/* Neon quad precision register */
+  OP_RNQMQ,	/* Neon quad or MVE vector register.  */
   OP_RVSD,	/* VFP single or double precision register */
   OP_RNSD,      /* Neon single or double precision register */
   OP_RNDQ,      /* Neon double or quad precision register */
+  OP_RNDQMQ,     /* Neon double, quad or MVE vector register.  */
   OP_RNSDQ,	/* Neon single, double or quad precision register */
   OP_RNSC,      /* Neon scalar D[X] */
   OP_RVC,	/* VFP control register */
@@ -6664,6 +6713,10 @@  enum operand_parse_code
   OP_RIWG,	/* iWMMXt wCG register */
   OP_RXA,	/* XScale accumulator register */
 
+  OP_RNSDQMQ,	/* Neon single, double or quad register or MVE vector register
+		 */
+  OP_RNSDQMQR,	/* Neon single, double or quad register, MVE vector register or
+		   GPR (no SP/PC)  */
   /* New operands for Armv8.1-M Mainline.  */
   OP_LR,	/* ARM LR register */
   OP_RRnpcsp_I32, /* ARM register (no BadReg) or literal 1 .. 32 */
@@ -6756,8 +6809,11 @@  enum operand_parse_code
   OP_oRRw,	 /* ARM register, not r15, optional trailing ! */
   OP_oRND,       /* Optional Neon double precision register */
   OP_oRNQ,       /* Optional Neon quad precision register */
+  OP_oRNDQMQ,     /* Optional Neon double, quad or MVE vector register.  */
   OP_oRNDQ,      /* Optional Neon double or quad precision register */
   OP_oRNSDQ,	 /* Optional single, double or quad precision vector register */
+  OP_oRNSDQMQ,	 /* Optional single, double or quad register or MVE vector
+		    register.  */
   OP_oSHll,	 /* LSL immediate */
   OP_oSHar,	 /* ASR immediate */
   OP_oSHllar,	 /* LSL or ASR immediate */
@@ -6929,6 +6985,14 @@  parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 	case OP_RVS:   po_reg_or_fail (REG_TYPE_VFS);	  break;
 	case OP_RVD:   po_reg_or_fail (REG_TYPE_VFD);	  break;
 	case OP_oRND:
+	case OP_RNDMQR:
+	  po_reg_or_goto (REG_TYPE_RN, try_rndmq);
+	  break;
+	try_rndmq:
+	case OP_RNDMQ:
+	  po_reg_or_goto (REG_TYPE_MQ, try_rnd);
+	  break;
+	try_rnd:
 	case OP_RND:   po_reg_or_fail (REG_TYPE_VFD);	  break;
 	case OP_RVC:
 	  po_reg_or_goto (REG_TYPE_VFC, coproc_reg);
@@ -6948,14 +7012,34 @@  parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 	case OP_RIWG:  po_reg_or_fail (REG_TYPE_MMXWCG);  break;
 	case OP_RXA:   po_reg_or_fail (REG_TYPE_XSCALE);  break;
 	case OP_oRNQ:
+	case OP_RNQMQ:
+	  po_reg_or_goto (REG_TYPE_MQ, try_nq);
+	  break;
+	try_nq:
 	case OP_RNQ:   po_reg_or_fail (REG_TYPE_NQ);      break;
 	case OP_RNSD:  po_reg_or_fail (REG_TYPE_NSD);     break;
+	case OP_oRNDQMQ:
+	case OP_RNDQMQ:
+	  po_reg_or_goto (REG_TYPE_MQ, try_rndq);
+	  break;
+	try_rndq:
 	case OP_oRNDQ:
 	case OP_RNDQ:  po_reg_or_fail (REG_TYPE_NDQ);     break;
 	case OP_RVSD:  po_reg_or_fail (REG_TYPE_VFSD);    break;
 	case OP_oRNSDQ:
 	case OP_RNSDQ: po_reg_or_fail (REG_TYPE_NSDQ);    break;
-
+	case OP_RNSDQMQR:
+	  po_reg_or_goto (REG_TYPE_RN, try_mq);
+	  break;
+	  try_mq:
+	case OP_oRNSDQMQ:
+	case OP_RNSDQMQ:
+	  po_reg_or_goto (REG_TYPE_MQ, try_nsdq2);
+	  break;
+	  try_nsdq2:
+	  po_reg_or_fail (REG_TYPE_NSDQ);
+	  inst.error = 0;
+	  break;
 	/* Neon scalar. Using an element size of 8 means that some invalid
 	   scalars are accepted here, so deal with those in later code.  */
 	case OP_RNSC:  po_scalar_or_goto (8, failure);    break;
@@ -7493,7 +7577,7 @@  parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 	  /* The parse routine should already have set inst.error, but set a
 	     default here just in case.  */
 	  if (!inst.error)
-	    inst.error = _("syntax error");
+	    inst.error = BAD_SYNTAX;
 	  return FAIL;
 	}
 
@@ -7505,7 +7589,7 @@  parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 	  && upat[i+1] == OP_stop)
 	{
 	  if (!inst.error)
-	    inst.error = _("syntax error");
+	    inst.error = BAD_SYNTAX;
 	  return FAIL;
 	}
 
@@ -7586,7 +7670,7 @@  parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 static void
 do_scalar_fp16_v82_encode (void)
 {
-  if (inst.cond != COND_ALWAYS)
+  if (inst.cond < COND_ALWAYS)
     as_warn (_("ARMv8.2 scalar fp16 instruction cannot be conditional,"
 	       " the behaviour is UNPREDICTABLE"));
   constraint (!ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_fp16),
@@ -9059,9 +9143,9 @@  do_it (void)
   inst.size = 0;
   if (unified_syntax)
     {
-      set_it_insn_type (IT_INSN);
-      now_it.mask = (inst.instruction & 0xf) | 0x10;
-      now_it.cc = inst.operands[0].imm;
+      set_pred_insn_type (IT_INSN);
+      now_pred.mask = (inst.instruction & 0xf) | 0x10;
+      now_pred.cc = inst.operands[0].imm;
     }
 }
 
@@ -10803,7 +10887,7 @@  do_t_add_sub (void)
 	: inst.operands[0].reg);  /* Rd, foo -> Rd, Rd, foo */
 
   if (Rd == REG_PC)
-    set_it_insn_type_last ();
+    set_pred_insn_type_last ();
 
   if (unified_syntax)
     {
@@ -10814,9 +10898,9 @@  do_t_add_sub (void)
       flags = (inst.instruction == T_MNEM_adds
 	       || inst.instruction == T_MNEM_subs);
       if (flags)
-	narrow = !in_it_block ();
+	narrow = !in_pred_block ();
       else
-	narrow = in_it_block ();
+	narrow = in_pred_block ();
       if (!inst.operands[2].isreg)
 	{
 	  int add;
@@ -11093,9 +11177,9 @@  do_t_arit3 (void)
 
 	  /* See if we can do this with a 16-bit instruction.  */
 	  if (THUMB_SETS_FLAGS (inst.instruction))
-	    narrow = !in_it_block ();
+	    narrow = !in_pred_block ();
 	  else
-	    narrow = in_it_block ();
+	    narrow = in_pred_block ();
 
 	  if (Rd > 7 || Rn > 7 || Rs > 7)
 	    narrow = FALSE;
@@ -11181,9 +11265,9 @@  do_t_arit3c (void)
 
 	  /* See if we can do this with a 16-bit instruction.  */
 	  if (THUMB_SETS_FLAGS (inst.instruction))
-	    narrow = !in_it_block ();
+	    narrow = !in_pred_block ();
 	  else
-	    narrow = in_it_block ();
+	    narrow = in_pred_block ();
 
 	  if (Rd > 7 || Rn > 7 || Rs > 7)
 	    narrow = FALSE;
@@ -11322,7 +11406,7 @@  do_t_bfx (void)
 static void
 do_t_blx (void)
 {
-  set_it_insn_type_last ();
+  set_pred_insn_type_last ();
 
   if (inst.operands[0].isreg)
     {
@@ -11346,9 +11430,9 @@  do_t_branch (void)
   bfd_reloc_code_real_type reloc;
 
   cond = inst.cond;
-  set_it_insn_type (IF_INSIDE_IT_LAST_INSN);
+  set_pred_insn_type (IF_INSIDE_IT_LAST_INSN);
 
-  if (in_it_block ())
+  if (in_pred_block ())
     {
       /* Conditional branches inside IT blocks are encoded as unconditional
 	 branches.  */
@@ -11415,7 +11499,7 @@  do_t_bkpt_hlt1 (int range)
       inst.instruction |= inst.operands[0].imm;
     }
 
-  set_it_insn_type (NEUTRAL_IT_INSN);
+  set_pred_insn_type (NEUTRAL_IT_INSN);
 }
 
 static void
@@ -11433,7 +11517,7 @@  do_t_bkpt (void)
 static void
 do_t_branch23 (void)
 {
-  set_it_insn_type_last ();
+  set_pred_insn_type_last ();
   encode_branch (BFD_RELOC_THUMB_PCREL_BRANCH23);
 
   /* md_apply_fix blows up with 'bl foo(PLT)' where foo is defined in
@@ -11461,7 +11545,7 @@  do_t_branch23 (void)
 static void
 do_t_bx (void)
 {
-  set_it_insn_type_last ();
+  set_pred_insn_type_last ();
   inst.instruction |= inst.operands[0].reg << 3;
   /* ??? FIXME: Should add a hacky reloc here if reg is REG_PC.	 The reloc
      should cause the alignment to be checked once it is known.	 This is
@@ -11473,7 +11557,7 @@  do_t_bxj (void)
 {
   int Rm;
 
-  set_it_insn_type_last ();
+  set_pred_insn_type_last ();
   Rm = inst.operands[0].reg;
   reject_bad_reg (Rm);
   inst.instruction |= Rm << 16;
@@ -11499,20 +11583,20 @@  do_t_clz (void)
 static void
 do_t_csdb (void)
 {
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
 }
 
 static void
 do_t_cps (void)
 {
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
   inst.instruction |= inst.operands[0].imm;
 }
 
 static void
 do_t_cpsi (void)
 {
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
   if (unified_syntax
       && (inst.operands[1].present || inst.size_req == 4)
       && ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_v6_notm))
@@ -11559,7 +11643,7 @@  do_t_cpy (void)
 static void
 do_t_cbz (void)
 {
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
   constraint (inst.operands[0].reg > 7, BAD_HIREG);
   inst.instruction |= inst.operands[0].reg;
   inst.relocs[0].pc_rel = 1;
@@ -11605,10 +11689,11 @@  do_t_it (void)
 {
   unsigned int cond = inst.operands[0].imm;
 
-  set_it_insn_type (IT_INSN);
-  now_it.mask = (inst.instruction & 0xf) | 0x10;
-  now_it.cc = cond;
-  now_it.warn_deprecated = FALSE;
+  set_pred_insn_type (IT_INSN);
+  now_pred.mask = (inst.instruction & 0xf) | 0x10;
+  now_pred.cc = cond;
+  now_pred.warn_deprecated = FALSE;
+  now_pred.type = SCALAR_PRED;
 
   /* If the condition is a negative condition, invert the mask.  */
   if ((cond & 0x1) == 0x0)
@@ -11618,22 +11703,22 @@  do_t_it (void)
       if ((mask & 0x7) == 0)
 	{
 	  /* No conversion needed.  */
-	  now_it.block_length = 1;
+	  now_pred.block_length = 1;
 	}
       else if ((mask & 0x3) == 0)
 	{
 	  mask ^= 0x8;
-	  now_it.block_length = 2;
+	  now_pred.block_length = 2;
 	}
       else if ((mask & 0x1) == 0)
 	{
 	  mask ^= 0xC;
-	  now_it.block_length = 3;
+	  now_pred.block_length = 3;
 	}
       else
 	{
 	  mask ^= 0xE;
-	  now_it.block_length = 4;
+	  now_pred.block_length = 4;
 	}
 
       inst.instruction &= 0xfff0;
@@ -11643,6 +11728,18 @@  do_t_it (void)
   inst.instruction |= cond << 4;
 }
 
+static void
+do_mve_vpt (void)
+{
+  /* We are dealing with a vector predicated block.  */
+  set_pred_insn_type (VPT_INSN);
+  now_pred.cc = 0;
+  now_pred.mask = ((inst.instruction & 0x00400000) >> 19)
+		  | ((inst.instruction & 0xe000) >> 13);
+  now_pred.warn_deprecated = FALSE;
+  now_pred.type = VECTOR_PRED;
+}
+
 /* Helper function used for both push/pop and ldm/stm.  */
 static void
 encode_thumb2_multi (bfd_boolean do_io, int base, unsigned mask,
@@ -11669,7 +11766,7 @@  encode_thumb2_multi (bfd_boolean do_io, int base, unsigned mask,
 	  if (mask & (1 << 14))
 	    inst.error = _("LR and PC should not both be in register list");
 	  else
-	    set_it_insn_type_last ();
+	    set_pred_insn_type_last ();
 	}
     }
   else if (store)
@@ -11883,7 +11980,7 @@  do_t_ldst (void)
   if (inst.operands[0].isreg
       && !inst.operands[0].preind
       && inst.operands[0].reg == REG_PC)
-    set_it_insn_type_last ();
+    set_pred_insn_type_last ();
 
   opcode = inst.instruction;
   if (unified_syntax)
@@ -12142,7 +12239,7 @@  do_t_mov_cmp (void)
   Rm = inst.operands[1].reg;
 
   if (Rn == REG_PC)
-    set_it_insn_type_last ();
+    set_pred_insn_type_last ();
 
   if (unified_syntax)
     {
@@ -12154,7 +12251,7 @@  do_t_mov_cmp (void)
 
       low_regs = (Rn <= 7 && Rm <= 7);
       opcode = inst.instruction;
-      if (in_it_block ())
+      if (in_pred_block ())
 	narrow = opcode != T_MNEM_movs;
       else
 	narrow = opcode != T_MNEM_movs || low_regs;
@@ -12225,7 +12322,7 @@  do_t_mov_cmp (void)
       if (!inst.operands[1].isreg)
 	{
 	  /* Immediate operand.  */
-	  if (!in_it_block () && opcode == T_MNEM_mov)
+	  if (!in_pred_block () && opcode == T_MNEM_mov)
 	    narrow = 0;
 	  if (low_regs && narrow)
 	    {
@@ -12261,7 +12358,7 @@  do_t_mov_cmp (void)
 	  /* Register shifts are encoded as separate shift instructions.  */
 	  bfd_boolean flags = (inst.instruction == T_MNEM_movs);
 
-	  if (in_it_block ())
+	  if (in_pred_block ())
 	    narrow = !flags;
 	  else
 	    narrow = flags;
@@ -12317,7 +12414,7 @@  do_t_mov_cmp (void)
 	      && (inst.instruction == T_MNEM_mov
 		  || inst.instruction == T_MNEM_movs))
 	    {
-	      if (in_it_block ())
+	      if (in_pred_block ())
 		narrow = (inst.instruction == T_MNEM_mov);
 	      else
 		narrow = (inst.instruction == T_MNEM_movs);
@@ -12496,9 +12593,9 @@  do_t_mvn_tst (void)
 	       || inst.instruction == T_MNEM_tst)
 	narrow = TRUE;
       else if (THUMB_SETS_FLAGS (inst.instruction))
-	narrow = !in_it_block ();
+	narrow = !in_pred_block ();
       else
-	narrow = in_it_block ();
+	narrow = in_pred_block ();
 
       if (!inst.operands[1].isreg)
 	{
@@ -12663,9 +12760,9 @@  do_t_mul (void)
 	  || Rm > 7)
 	narrow = FALSE;
       else if (inst.instruction == T_MNEM_muls)
-	narrow = !in_it_block ();
+	narrow = !in_pred_block ();
       else
-	narrow = in_it_block ();
+	narrow = in_pred_block ();
     }
   else
     {
@@ -12731,7 +12828,7 @@  do_t_mull (void)
 static void
 do_t_nop (void)
 {
-  set_it_insn_type (NEUTRAL_IT_INSN);
+  set_pred_insn_type (NEUTRAL_IT_INSN);
 
   if (unified_syntax)
     {
@@ -12769,9 +12866,9 @@  do_t_neg (void)
       bfd_boolean narrow;
 
       if (THUMB_SETS_FLAGS (inst.instruction))
-	narrow = !in_it_block ();
+	narrow = !in_pred_block ();
       else
-	narrow = in_it_block ();
+	narrow = in_pred_block ();
       if (inst.operands[0].reg > 7 || inst.operands[1].reg > 7)
 	narrow = FALSE;
       if (inst.size_req == 4)
@@ -13033,9 +13130,9 @@  do_t_rsb (void)
       bfd_boolean narrow;
 
       if ((inst.instruction & 0x00100000) != 0)
-	narrow = !in_it_block ();
+	narrow = !in_pred_block ();
       else
-	narrow = in_it_block ();
+	narrow = in_pred_block ();
 
       if (Rd > 7 || Rs > 7)
 	narrow = FALSE;
@@ -13073,7 +13170,7 @@  do_t_setend (void)
       && ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_v8))
       as_tsktsk (_("setend use is deprecated for ARMv8"));
 
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
   if (inst.operands[0].imm)
     inst.instruction |= 0x8;
 }
@@ -13103,9 +13200,9 @@  do_t_shift (void)
 	}
 
       if (THUMB_SETS_FLAGS (inst.instruction))
-	narrow = !in_it_block ();
+	narrow = !in_pred_block ();
       else
-	narrow = in_it_block ();
+	narrow = in_pred_block ();
       if (inst.operands[0].reg > 7 || inst.operands[1].reg > 7)
 	narrow = FALSE;
       if (!inst.operands[2].isreg && shift_kind == SHIFT_ROR)
@@ -13275,7 +13372,7 @@  do_t_smc (void)
   inst.instruction |= (value & 0x0ff0);
   inst.instruction |= (value & 0x000f) << 16;
   /* PR gas/15623: SMC instructions must be last in an IT block.  */
-  set_it_insn_type_last ();
+  set_pred_insn_type_last ();
 }
 
 static void
@@ -13450,7 +13547,7 @@  do_t_tb (void)
   int half;
 
   half = (inst.instruction & 0x10) != 0;
-  set_it_insn_type_last ();
+  set_pred_insn_type_last ();
   constraint (inst.operands[0].immisreg,
 	      _("instruction requires register index"));
 
@@ -13486,7 +13583,7 @@  do_t_udf (void)
       inst.instruction |= inst.operands[0].imm;
     }
 
-  set_it_insn_type (NEUTRAL_IT_INSN);
+  set_pred_insn_type (NEUTRAL_IT_INSN);
 }
 
 
@@ -13674,7 +13771,7 @@  do_t_loloop (void)
 {
   unsigned long insn = inst.instruction;
 
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
   inst.instruction = THUMB_OP32 (inst.instruction);
 
   switch (insn)
@@ -13716,13 +13813,16 @@  struct neon_tab_entry
 /* Map overloaded Neon opcodes to their respective encodings.  */
 #define NEON_ENC_TAB					\
   X(vabd,	0x0000700, 0x1200d00, N_INV),		\
+  X(vabdl,	0x0800700, N_INV,     N_INV),		\
   X(vmax,	0x0000600, 0x0000f00, N_INV),		\
   X(vmin,	0x0000610, 0x0200f00, N_INV),		\
   X(vpadd,	0x0000b10, 0x1000d00, N_INV),		\
   X(vpmax,	0x0000a00, 0x1000f00, N_INV),		\
   X(vpmin,	0x0000a10, 0x1200f00, N_INV),		\
   X(vadd,	0x0000800, 0x0000d00, N_INV),		\
+  X(vaddl,	0x0800000, N_INV,     N_INV),		\
   X(vsub,	0x1000800, 0x0200d00, N_INV),		\
+  X(vsubl,	0x0800200, N_INV,     N_INV),		\
   X(vceq,	0x1000810, 0x0000e00, 0x1b10100),	\
   X(vcge,	0x0000310, 0x1000e00, 0x1b10080),	\
   X(vcgt,	0x0000300, 0x1200e00, 0x1b10000),	\
@@ -13865,6 +13965,7 @@  NEON_ENC_TAB
   X(3, (Q, Q, I), QUAD),		\
   X(3, (D, D, S), DOUBLE),		\
   X(3, (Q, Q, S), QUAD),		\
+  X(3, (Q, Q, R), QUAD),		\
   X(2, (D, D), DOUBLE),			\
   X(2, (Q, Q), QUAD),			\
   X(2, (D, S), DOUBLE),			\
@@ -14054,6 +14155,9 @@  enum neon_type_mask
 #define N_I_ALL    (N_I8 | N_I16 | N_I32 | N_I64)
 #define N_IF_32    (N_I8 | N_I16 | N_I32 | N_F16 | N_F32)
 #define N_F_ALL    (N_F16 | N_F32 | N_F64)
+#define N_I_MVE	   (N_I8 | N_I16 | N_I32)
+#define N_F_MVE	   (N_F16 | N_F32)
+#define N_SU_MVE   (N_S8 | N_S16 | N_S32 | N_U8 | N_U16 | N_U32)
 
 /* Pass this as the first type argument to neon_check_type to ignore types
    altogether.  */
@@ -14582,7 +14686,7 @@  neon_check_type (unsigned els, enum neon_shape ns, ...)
 
 		  if ((given_type & types_allowed) == 0)
 		    {
-		      first_error (_("bad type in Neon instruction"));
+		      first_error (_("bad type in SIMD instruction"));
 		      return badtype;
 		    }
 		}
@@ -15025,6 +15129,45 @@  neon_logbits (unsigned x)
 #define LOW4(R) ((R) & 0xf)
 #define HI1(R) (((R) >> 4) & 1)
 
+static void
+mve_encode_qqr (int size, int fp)
+{
+  if (inst.operands[2].reg == REG_SP)
+    as_tsktsk (MVE_BAD_SP);
+  else if (inst.operands[2].reg == REG_PC)
+    as_tsktsk (MVE_BAD_PC);
+
+  if (fp)
+    {
+      /* vadd.  */
+      if (((unsigned)inst.instruction) == 0xd00)
+	inst.instruction = 0xee300f40;
+      /* vsub.  */
+      else if (((unsigned)inst.instruction) == 0x200d00)
+	inst.instruction = 0xee301f40;
+
+      /* Setting size which is 1 for F16 and 0 for F32.  */
+      inst.instruction |= (size == 16) << 28;
+    }
+  else
+    {
+      /* vadd.  */
+      if (((unsigned)inst.instruction) == 0x800)
+	inst.instruction = 0xee010f40;
+      /* vsub.  */
+      else if (((unsigned)inst.instruction) == 0x1000800)
+	inst.instruction = 0xee011f40;
+      /* Setting bits for size.  */
+      inst.instruction |= neon_logbits (size) << 20;
+    }
+  inst.instruction |= LOW4 (inst.operands[0].reg) << 12;
+  inst.instruction |= HI1 (inst.operands[0].reg) << 22;
+  inst.instruction |= LOW4 (inst.operands[1].reg) << 16;
+  inst.instruction |= HI1 (inst.operands[1].reg) << 7;
+  inst.instruction |= inst.operands[2].reg;
+  inst.is_neon = 1;
+}
+
 /* Encode insns with bit pattern:
 
   |28/24|23|22 |21 20|19 16|15 12|11    8|7|6|5|4|3  0|
@@ -15346,26 +15489,27 @@  static void
 neon_dyadic_misc (enum neon_el_type ubit_meaning, unsigned types,
 		  unsigned destbits)
 {
-  enum neon_shape rs = neon_select_shape (NS_DDD, NS_QQQ, NS_NULL);
+  enum neon_shape rs = neon_select_shape (NS_DDD, NS_QQQ, NS_QQR, NS_NULL);
   struct neon_type_el et = neon_check_type (3, rs, N_EQK | destbits, N_EQK,
 					    types | N_KEY);
   if (et.type == NT_float)
     {
       NEON_ENCODE (FLOAT, inst);
-      neon_three_same (neon_quad (rs), 0, et.size == 16 ? (int) et.size : -1);
+      if (rs == NS_QQR)
+	mve_encode_qqr (et.size, 1);
+      else
+	neon_three_same (neon_quad (rs), 0, et.size == 16 ? (int) et.size : -1);
     }
   else
     {
       NEON_ENCODE (INTEGER, inst);
-      neon_three_same (neon_quad (rs), et.type == ubit_meaning, et.size);
+      if (rs == NS_QQR)
+	mve_encode_qqr (et.size, 0);
+      else
+	neon_three_same (neon_quad (rs), et.type == ubit_meaning, et.size);
     }
 }
 
-static void
-do_neon_dyadic_if_su (void)
-{
-  neon_dyadic_misc (NT_unsigned, N_SUF_32, 0);
-}
 
 static void
 do_neon_dyadic_if_su_d (void)
@@ -15424,32 +15568,93 @@  vfp_or_neon_is_neon (unsigned check)
 	inst.instruction |= inst.uncond_value << 28;
     }
 
-  if ((check & NEON_CHECK_ARCH)
-      && !mark_feature_used (&fpu_neon_ext_v1))
+
+    if (((check & NEON_CHECK_ARCH) && !mark_feature_used (&fpu_neon_ext_v1))
+	|| ((check & NEON_CHECK_ARCH8)
+	    && !mark_feature_used (&fpu_neon_ext_armv8)))
+      {
+	first_error (_(BAD_FPU));
+	return FAIL;
+      }
+
+  return SUCCESS;
+}
+
+static int
+check_simd_pred_availability (int fp, unsigned check)
+{
+  if (inst.cond > COND_ALWAYS)
     {
-      first_error (_(BAD_FPU));
-      return FAIL;
+      if (!ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
+	{
+	  inst.error = BAD_FPU;
+	  return 1;
+	}
+      inst.pred_insn_type = INSIDE_VPT_INSN;
     }
-
-  if ((check & NEON_CHECK_ARCH8)
-      && !mark_feature_used (&fpu_neon_ext_armv8))
+  else if (inst.cond < COND_ALWAYS)
     {
-      first_error (_(BAD_FPU));
-      return FAIL;
+      if (ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
+	inst.pred_insn_type = MVE_OUTSIDE_PRED_INSN;
+      else if (vfp_or_neon_is_neon (check) == FAIL)
+	return 2;
     }
+  else
+    {
+      if (!ARM_CPU_HAS_FEATURE (cpu_variant, fp ? mve_fp_ext : mve_ext)
+	  && vfp_or_neon_is_neon (check) == FAIL)
+	return 3;
 
-  return SUCCESS;
+      if (ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
+	inst.pred_insn_type = MVE_OUTSIDE_PRED_INSN;
+    }
+  return 0;
 }
 
 static void
-do_neon_addsub_if_i (void)
+do_neon_dyadic_if_su (void)
 {
-  if (try_vfp_nsyn (3, do_vfp_nsyn_add_sub) == SUCCESS)
+  enum neon_shape rs = neon_select_shape (NS_DDD, NS_QQQ, NS_QQR, NS_NULL);
+  struct neon_type_el et = neon_check_type (3, rs, N_EQK , N_EQK,
+					    N_SUF_32 | N_KEY);
+
+  if (check_simd_pred_availability (et.type == NT_float,
+				    NEON_CHECK_ARCH | NEON_CHECK_CC))
     return;
 
-  if (vfp_or_neon_is_neon (NEON_CHECK_CC | NEON_CHECK_ARCH) == FAIL)
+  neon_dyadic_misc (NT_unsigned, N_SUF_32, 0);
+}
+
+static void
+do_neon_addsub_if_i (void)
+{
+  if (ARM_CPU_HAS_FEATURE (cpu_variant, fpu_vfp_ext_v1xd)
+      && try_vfp_nsyn (3, do_vfp_nsyn_add_sub) == SUCCESS)
     return;
 
+  enum neon_shape rs = neon_select_shape (NS_DDD, NS_QQQ, NS_QQR, NS_NULL);
+  struct neon_type_el et = neon_check_type (3, rs, N_EQK,
+					    N_EQK, N_IF_32 | N_I64 | N_KEY);
+
+  constraint (rs == NS_QQR && et.size == 64, BAD_FPU);
+  /* If we are parsing Q registers and the element types match MVE, which NEON
+     also supports, then we must check whether this is an instruction that can
+     be used by both MVE/NEON.  This distinction can be made based on whether
+     they are predicated or not.  */
+  if ((rs == NS_QQQ || rs == NS_QQR) && et.size != 64)
+    {
+      if (check_simd_pred_availability (et.type == NT_float,
+					NEON_CHECK_ARCH | NEON_CHECK_CC))
+	return;
+    }
+  else
+    {
+      /* If they are in a D register or use an unsupported element size.  */
+      if (rs != NS_QQR
+	  && vfp_or_neon_is_neon (NEON_CHECK_CC | NEON_CHECK_ARCH) == FAIL)
+	return;
+    }
+
   /* The "untyped" case can't happen. Do this to stop the "U" bit being
      affected if we specify unsigned args.  */
   neon_dyadic_misc (NT_untyped, N_IF_32 | N_I64, 0);
@@ -16131,7 +16336,7 @@  do_vfp_nsyn_cvt_fpv8 (enum neon_cvt_flavour flavour,
     constraint (!ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_fp16),
 		_(BAD_FP16));
 
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
 
   switch (flavour)
     {
@@ -16286,7 +16491,7 @@  do_neon_cvt_1 (enum neon_cvt_mode mode)
       if (mode != neon_cvt_mode_x && mode != neon_cvt_mode_z)
 	{
 	  NEON_ENCODE (FLOAT, inst);
-	  set_it_insn_type (OUTSIDE_IT_INSN);
+	  set_pred_insn_type (OUTSIDE_PRED_INSN);
 
 	  if (vfp_or_neon_is_neon (NEON_CHECK_CC | NEON_CHECK_ARCH8) == FAIL)
 	    return;
@@ -16587,10 +16792,49 @@  neon_mixed_length (struct neon_type_el et, unsigned size)
 static void
 do_neon_dyadic_long (void)
 {
-  /* FIXME: Type checking for lengthening op.  */
-  struct neon_type_el et = neon_check_type (3, NS_QDD,
-    N_EQK | N_DBL, N_EQK, N_SU_32 | N_KEY);
-  neon_mixed_length (et, et.size);
+  enum neon_shape rs = neon_select_shape (NS_QDD, NS_QQQ, NS_QQR, NS_NULL);
+  if (rs == NS_QDD)
+    {
+      if (vfp_or_neon_is_neon (NEON_CHECK_ARCH | NEON_CHECK_CC) == FAIL)
+	return;
+
+      NEON_ENCODE (INTEGER, inst);
+      /* FIXME: Type checking for lengthening op.  */
+      struct neon_type_el et = neon_check_type (3, NS_QDD,
+	N_EQK | N_DBL, N_EQK, N_SU_32 | N_KEY);
+      neon_mixed_length (et, et.size);
+    }
+  else if (ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext)
+	   && (inst.cond == 0xf || inst.cond == 0x10))
+    {
+      /* If parsing for MVE, vaddl/vsubl/vabdl{e,t} can only be vadd/vsub/vabd
+	 in an IT block with le/lt conditions.  */
+
+      if (inst.cond == 0xf)
+	inst.cond = 0xb;
+      else if (inst.cond == 0x10)
+	inst.cond = 0xd;
+
+      inst.pred_insn_type = INSIDE_IT_INSN;
+
+      if (inst.instruction == N_MNEM_vaddl)
+	{
+	  inst.instruction = N_MNEM_vadd;
+	  do_neon_addsub_if_i ();
+	}
+      else if (inst.instruction == N_MNEM_vsubl)
+	{
+	  inst.instruction = N_MNEM_vsub;
+	  do_neon_addsub_if_i ();
+	}
+      else if (inst.instruction == N_MNEM_vabdl)
+	{
+	  inst.instruction = N_MNEM_vabd;
+	  do_neon_dyadic_if_su ();
+	}
+    }
+  else
+    first_error (BAD_FPU);
 }
 
 static void
@@ -17863,7 +18107,7 @@  do_vfp_nsyn_fpv8 (enum neon_shape rs)
 static void
 do_vsel (void)
 {
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
 
   if (try_vfp_nsyn (3, do_vfp_nsyn_fpv8) != SUCCESS)
     first_error (_("invalid instruction shape"));
@@ -17872,7 +18116,7 @@  do_vsel (void)
 static void
 do_vmaxnm (void)
 {
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
 
   if (try_vfp_nsyn (3, do_vfp_nsyn_fpv8) == SUCCESS)
     return;
@@ -17905,7 +18149,7 @@  do_vrint_1 (enum neon_cvt_mode mode)
       /* VFP encodings.  */
       if (mode == neon_cvt_mode_a || mode == neon_cvt_mode_n
 	  || mode == neon_cvt_mode_p || mode == neon_cvt_mode_m)
-	set_it_insn_type (OUTSIDE_IT_INSN);
+	set_pred_insn_type (OUTSIDE_PRED_INSN);
 
       NEON_ENCODE (FPV8, inst);
       if (rs == NS_FF || rs == NS_HH)
@@ -17941,7 +18185,7 @@  do_vrint_1 (enum neon_cvt_mode mode)
       if (et.type == NT_invtype)
 	return;
 
-      set_it_insn_type (OUTSIDE_IT_INSN);
+      set_pred_insn_type (OUTSIDE_PRED_INSN);
       NEON_ENCODE (FLOAT, inst);
 
       if (vfp_or_neon_is_neon (NEON_CHECK_CC | NEON_CHECK_ARCH8) == FAIL)
@@ -18170,7 +18414,7 @@  do_neon_dotproduct_u (void)
 static void
 do_crypto_2op_1 (unsigned elttype, int op)
 {
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
 
   if (neon_check_type (2, NS_QQ, N_EQK | N_UNT, elttype | N_UNT | N_KEY).type
       == NT_invtype)
@@ -18195,7 +18439,7 @@  do_crypto_2op_1 (unsigned elttype, int op)
 static void
 do_crypto_3op_1 (int u, int op)
 {
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
 
   if (neon_check_type (3, NS_QQQ, N_EQK | N_UNT, N_EQK | N_UNT,
 		       N_32 | N_UNT | N_KEY).type == NT_invtype)
@@ -18298,7 +18542,7 @@  do_crc32_1 (unsigned int poly, unsigned int sz)
   unsigned int Rn = inst.operands[1].reg;
   unsigned int Rm = inst.operands[2].reg;
 
-  set_it_insn_type (OUTSIDE_IT_INSN);
+  set_pred_insn_type (OUTSIDE_PRED_INSN);
   inst.instruction |= LOW4 (Rd) << (thumb_mode ? 8 : 12);
   inst.instruction |= LOW4 (Rn) << 16;
   inst.instruction |= LOW4 (Rm);
@@ -18540,9 +18784,10 @@  enum opcode_tag
   OT_unconditionalF,	/* Instruction cannot be conditionalized
 			   and carries 0xF in its ARM condition field.  */
   OT_csuffix,		/* Instruction takes a conditional suffix.  */
-  OT_csuffixF,		/* Some forms of the instruction take a conditional
-			   suffix, others place 0xF where the condition field
-			   would be.  */
+  OT_csuffixF,		/* Some forms of the instruction take a scalar
+			   conditional suffix, others place 0xF where the
+			   condition field would be, others take a vector
+			   conditional suffix.  */
   OT_cinfix3,		/* Instruction takes a conditional infix,
 			   beginning at character index 3.  (In
 			   unified mode, it becomes a suffix.)  */
@@ -18688,17 +18933,35 @@  opcode_lookup (char **str)
       inst.cond = cond->value;
       return opcode;
     }
+ if (ARM_CPU_HAS_FEATURE (cpu_variant, mve_ext))
+   {
+    /* Cannot have a conditional suffix on a mnemonic shorter than one
+       character.  */
+    if (end - base < 2)
+      return NULL;
+     affix = end - 1;
+     cond = (const struct asm_cond *) hash_find_n (arm_vcond_hsh, affix, 1);
+     opcode = (const struct asm_opcode *) hash_find_n (arm_ops_hsh, base,
+						      affix - base);
+     /* If this opcode can not be vector predicated then don't accept it with a
+	vector predication code.  */
+     if (opcode && !opcode->mayBeVecPred)
+       opcode = NULL;
+   }
+  if (!opcode || !cond)
+    {
+      /* Cannot have a conditional suffix on a mnemonic of less than two
+	 characters.  */
+      if (end - base < 3)
+	return NULL;
 
-  /* Cannot have a conditional suffix on a mnemonic of less than two
-     characters.  */
-  if (end - base < 3)
-    return NULL;
+      /* Look for suffixed mnemonic.  */
+      affix = end - 2;
+      cond = (const struct asm_cond *) hash_find_n (arm_cond_hsh, affix, 2);
+      opcode = (const struct asm_opcode *) hash_find_n (arm_ops_hsh, base,
+							affix - base);
+    }
 
-  /* Look for suffixed mnemonic.  */
-  affix = end - 2;
-  cond = (const struct asm_cond *) hash_find_n (arm_cond_hsh, affix, 2);
-  opcode = (const struct asm_opcode *) hash_find_n (arm_ops_hsh, base,
-						    affix - base);
   if (opcode && cond)
     {
       /* step CE */
@@ -18777,7 +19040,7 @@  opcode_lookup (char **str)
 
 /* This function generates an initial IT instruction, leaving its block
    virtually open for the new instructions. Eventually,
-   the mask will be updated by now_it_add_mask () each time
+   the mask will be updated by now_pred_add_mask () each time
    a new instruction needs to be included in the IT block.
    Finally, the block is closed with close_automatic_it_block ().
    The block closure can be requested either from md_assemble (),
@@ -18786,14 +19049,14 @@  opcode_lookup (char **str)
 static void
 new_automatic_it_block (int cond)
 {
-  now_it.state = AUTOMATIC_IT_BLOCK;
-  now_it.mask = 0x18;
-  now_it.cc = cond;
-  now_it.block_length = 1;
+  now_pred.state = AUTOMATIC_PRED_BLOCK;
+  now_pred.mask = 0x18;
+  now_pred.cc = cond;
+  now_pred.block_length = 1;
   mapping_state (MAP_THUMB);
-  now_it.insn = output_it_inst (cond, now_it.mask, NULL);
-  now_it.warn_deprecated = FALSE;
-  now_it.insn_cond = TRUE;
+  now_pred.insn = output_it_inst (cond, now_pred.mask, NULL);
+  now_pred.warn_deprecated = FALSE;
+  now_pred.insn_cond = TRUE;
 }
 
 /* Close an automatic IT block.
@@ -18802,29 +19065,29 @@  new_automatic_it_block (int cond)
 static void
 close_automatic_it_block (void)
 {
-  now_it.mask = 0x10;
-  now_it.block_length = 0;
+  now_pred.mask = 0x10;
+  now_pred.block_length = 0;
 }
 
 /* Update the mask of the current automatically-generated IT
    instruction. See comments in new_automatic_it_block ().  */
 
 static void
-now_it_add_mask (int cond)
+now_pred_add_mask (int cond)
 {
 #define CLEAR_BIT(value, nbit)  ((value) & ~(1 << (nbit)))
 #define SET_BIT_VALUE(value, bitvalue, nbit)  (CLEAR_BIT (value, nbit) \
 					      | ((bitvalue) << (nbit)))
   const int resulting_bit = (cond & 1);
 
-  now_it.mask &= 0xf;
-  now_it.mask = SET_BIT_VALUE (now_it.mask,
+  now_pred.mask &= 0xf;
+  now_pred.mask = SET_BIT_VALUE (now_pred.mask,
 				   resulting_bit,
-				  (5 - now_it.block_length));
-  now_it.mask = SET_BIT_VALUE (now_it.mask,
+				  (5 - now_pred.block_length));
+  now_pred.mask = SET_BIT_VALUE (now_pred.mask,
 				   1,
-				   ((5 - now_it.block_length) - 1) );
-  output_it_inst (now_it.cc, now_it.mask, now_it.insn);
+				   ((5 - now_pred.block_length) - 1));
+  output_it_inst (now_pred.cc, now_pred.mask, now_pred.insn);
 
 #undef CLEAR_BIT
 #undef SET_BIT_VALUE
@@ -18832,9 +19095,9 @@  now_it_add_mask (int cond)
 
 /* The IT blocks handling machinery is accessed through the these functions:
      it_fsm_pre_encode ()               from md_assemble ()
-     set_it_insn_type ()                optional, from the tencode functions
-     set_it_insn_type_last ()           ditto
-     in_it_block ()                     ditto
+     set_pred_insn_type ()		optional, from the tencode functions
+     set_pred_insn_type_last ()		ditto
+     in_pred_block ()			ditto
      it_fsm_post_encode ()              from md_assemble ()
      force_automatic_it_block_close ()  from label handling functions
 
@@ -18844,37 +19107,38 @@  now_it_add_mask (int cond)
 	on the inst.condition.
      2) During the tencode function, two things may happen:
 	a) The tencode function overrides the IT insn type by
-	   calling either set_it_insn_type (type) or set_it_insn_type_last ().
+	   calling either set_pred_insn_type (type) or
+	   set_pred_insn_type_last ().
 	b) The tencode function queries the IT block state by
-	   calling in_it_block () (i.e. to determine narrow/not narrow mode).
+	   calling in_pred_block () (i.e. to determine narrow/not narrow mode).
 
-	Both set_it_insn_type and in_it_block run the internal FSM state
-	handling function (handle_it_state), because: a) setting the IT insn
+	Both set_pred_insn_type and in_pred_block run the internal FSM state
+	handling function (handle_pred_state), because: a) setting the IT insn
 	type may incur in an invalid state (exiting the function),
 	and b) querying the state requires the FSM to be updated.
 	Specifically we want to avoid creating an IT block for conditional
 	branches, so it_fsm_pre_encode is actually a guess and we can't
 	determine whether an IT block is required until the tencode () routine
 	has decided what type of instruction this actually it.
-	Because of this, if set_it_insn_type and in_it_block have to be used,
-	set_it_insn_type has to be called first.
+	Because of this, if set_pred_insn_type and in_pred_block have to be
+	used, set_pred_insn_type has to be called first.
 
-	set_it_insn_type_last () is a wrapper of set_it_insn_type (type), that
-	determines the insn IT type depending on the inst.cond code.
+	set_pred_insn_type_last () is a wrapper of set_pred_insn_type (type),
+	that determines the insn IT type depending on the inst.cond code.
 	When a tencode () routine encodes an instruction that can be
 	either outside an IT block, or, in the case of being inside, has to be
-	the last one, set_it_insn_type_last () will determine the proper
+	the last one, set_pred_insn_type_last () will determine the proper
 	IT instruction type based on the inst.cond code. Otherwise,
-	set_it_insn_type can be called for overriding that logic or
+	set_pred_insn_type can be called for overriding that logic or
 	for covering other cases.
 
-	Calling handle_it_state () may not transition the IT block state to
-	OUTSIDE_IT_BLOCK immediately, since the (current) state could be
+	Calling handle_pred_state () may not transition the IT block state to
+	OUTSIDE_PRED_BLOCK immediately, since the (current) state could be
 	still queried. Instead, if the FSM determines that the state should
-	be transitioned to OUTSIDE_IT_BLOCK, a flag is marked to be closed
+	be transitioned to OUTSIDE_PRED_BLOCK, a flag is marked to be closed
 	after the tencode () function: that's what it_fsm_post_encode () does.
 
-	Since in_it_block () calls the state handling function to get an
+	Since in_pred_block () calls the state handling function to get an
 	updated state, an error may occur (due to invalid insns combination).
 	In that case, inst.error is set.
 	Therefore, inst.error has to be checked after the execution of
@@ -18882,74 +19146,150 @@  now_it_add_mask (int cond)
 
      3) Back in md_assemble(), it_fsm_post_encode () is called to commit
 	any pending state change (if any) that didn't take place in
-	handle_it_state () as explained above.  */
+	handle_pred_state () as explained above.  */
 
 static void
 it_fsm_pre_encode (void)
 {
   if (inst.cond != COND_ALWAYS)
-    inst.it_insn_type = INSIDE_IT_INSN;
+    inst.pred_insn_type = INSIDE_IT_INSN;
   else
-    inst.it_insn_type = OUTSIDE_IT_INSN;
+    inst.pred_insn_type = OUTSIDE_PRED_INSN;
 
-  now_it.state_handled = 0;
+  now_pred.state_handled = 0;
 }
 
 /* IT state FSM handling function.  */
+/* MVE instructions and non-MVE instructions are handled differently because of
+   the introduction of VPT blocks.
+   The specification says that any non-MVE instruction inside a VPT block is
+   UNPREDICTABLE, with the exception of the BKPT instruction, whereas most MVE
+   instructions are UNPREDICTABLE inside an IT block.  The few exceptions are
+   handled in their respective handler functions.
+   The diagnostics given for each possible combination are described in the
+   cases below:
+   For 'most' MVE instructions:
+   1) In an IT block, with an IT code: syntax error
+   2) In an IT block, with a VPT code: error: must be in a VPT block
+   3) In an IT block, with no code: warning: UNPREDICTABLE
+   4) In a VPT block, with an IT code: syntax error
+   5) In a VPT block, with a VPT code: OK!
+   6) In a VPT block, with no code: error: missing code
+   7) Outside a pred block, with an IT code: error: syntax error
+   8) Outside a pred block, with a VPT code: error: should be in a VPT block
+   9) Outside a pred block, with no code: OK!
+   For non-MVE instructions:
+   10) In an IT block, with an IT code: OK!
+   11) In an IT block, with a VPT code: syntax error
+   12) In an IT block, with no code: error: missing code
+   13) In a VPT block, with an IT code: error: should be in an IT block
+   14) In a VPT block, with a VPT code: syntax error
+   15) In a VPT block, with no code: warning: UNPREDICTABLE
+   16) Outside a pred block, with an IT code: error: should be in an IT block
+   17) Outside a pred block, with a VPT code: syntax error
+   18) Outside a pred block, with no code: OK!
+ */
+
 
 static int
-handle_it_state (void)
+handle_pred_state (void)
 {
-  now_it.state_handled = 1;
-  now_it.insn_cond = FALSE;
+  now_pred.state_handled = 1;
+  now_pred.insn_cond = FALSE;
 
-  switch (now_it.state)
+  switch (now_pred.state)
     {
-    case OUTSIDE_IT_BLOCK:
-      switch (inst.it_insn_type)
+    case OUTSIDE_PRED_BLOCK:
+      switch (inst.pred_insn_type)
 	{
-	case OUTSIDE_IT_INSN:
+	case MVE_OUTSIDE_PRED_INSN:
+	  if (inst.cond < COND_ALWAYS)
+	    {
+	      /* Case 7: Outside a pred block, with an IT code: error: syntax
+		 error.  */
+	      inst.error = BAD_SYNTAX;
+	      return FAIL;
+	    }
+	  /* Case 9: Outside a pred block, with no code: OK!  */
+	  break;
+	case OUTSIDE_PRED_INSN:
+	  if (inst.cond > COND_ALWAYS)
+	    {
+	      /* Case 17: Outside a pred block, with a VPT code: syntax error.
+	       */
+	      inst.error = BAD_SYNTAX;
+	      return FAIL;
+	    }
+	  /* Case 18: Outside a pred block, with no code: OK!  */
 	  break;
 
+	case INSIDE_VPT_INSN:
+	  /* Case 8: Outside a pred block, with a VPT code: error: should be in
+	     a VPT block.  */
+	  inst.error = BAD_OUT_VPT;
+	  return FAIL;
+
 	case INSIDE_IT_INSN:
 	case INSIDE_IT_LAST_INSN:
-	  if (thumb_mode == 0)
-	    {
-	      if (unified_syntax
-		  && !(implicit_it_mode & IMPLICIT_IT_MODE_ARM))
-		as_tsktsk (_("Warning: conditional outside an IT block"\
-			     " for Thumb."));
-	    }
-	  else
+	  if (inst.cond < COND_ALWAYS)
 	    {
-	      if ((implicit_it_mode & IMPLICIT_IT_MODE_THUMB)
-		  && ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_v6t2))
+	      /* Case 16: Outside a pred block, with an IT code: error: should
+		 be in an IT block.  */
+	      if (thumb_mode == 0)
 		{
-		  /* Automatically generate the IT instruction.  */
-		  new_automatic_it_block (inst.cond);
-		  if (inst.it_insn_type == INSIDE_IT_LAST_INSN)
-		    close_automatic_it_block ();
+		  if (unified_syntax
+		      && !(implicit_it_mode & IMPLICIT_IT_MODE_ARM))
+		    as_tsktsk (_("Warning: conditional outside an IT block"\
+				 " for Thumb."));
 		}
 	      else
 		{
-		  inst.error = BAD_OUT_IT;
-		  return FAIL;
+		  if ((implicit_it_mode & IMPLICIT_IT_MODE_THUMB)
+		      && ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_v6t2))
+		    {
+		      /* Automatically generate the IT instruction.  */
+		      new_automatic_it_block (inst.cond);
+		      if (inst.pred_insn_type == INSIDE_IT_LAST_INSN)
+			close_automatic_it_block ();
+		    }
+		  else
+		    {
+		      inst.error = BAD_OUT_IT;
+		      return FAIL;
+		    }
 		}
+	      break;
 	    }
-	  break;
-
+	  else if (inst.cond > COND_ALWAYS)
+	    {
+	      /* Case 17: Outside a pred block, with a VPT code: syntax error.
+	       */
+	      inst.error = BAD_SYNTAX;
+	      return FAIL;
+	    }
+	  else
+	    gas_assert (0);
 	case IF_INSIDE_IT_LAST_INSN:
 	case NEUTRAL_IT_INSN:
 	  break;
 
+	case VPT_INSN:
+	  if (inst.cond != COND_ALWAYS)
+	    first_error (BAD_SYNTAX);
+	  now_pred.state = MANUAL_PRED_BLOCK;
+	  now_pred.block_length = 0;
+	  now_pred.type = VECTOR_PRED;
+	  now_pred.cc = 0;
+	  break;
 	case IT_INSN:
-	  now_it.state = MANUAL_IT_BLOCK;
-	  now_it.block_length = 0;
+	  now_pred.state = MANUAL_PRED_BLOCK;
+	  now_pred.block_length = 0;
+	  now_pred.type = SCALAR_PRED;
 	  break;
 	}
       break;
 
-    case AUTOMATIC_IT_BLOCK:
+    case AUTOMATIC_PRED_BLOCK:
       /* Three things may happen now:
 	 a) We should increment current it block size;
 	 b) We should close current it block (closing insn or 4 insns);
@@ -18957,82 +19297,211 @@  handle_it_state (void)
 	 to incompatible conditions or
 	 4 insns-length block reached).  */
 
-      switch (inst.it_insn_type)
+      switch (inst.pred_insn_type)
 	{
-	case OUTSIDE_IT_INSN:
+	case INSIDE_VPT_INSN:
+	case VPT_INSN:
+	case MVE_OUTSIDE_PRED_INSN:
+	  gas_assert (0);
+	case OUTSIDE_PRED_INSN:
 	  /* The closure of the block shall happen immediately,
-	     so any in_it_block () call reports the block as closed.  */
+	     so any in_pred_block () call reports the block as closed.  */
 	  force_automatic_it_block_close ();
 	  break;
 
 	case INSIDE_IT_INSN:
 	case INSIDE_IT_LAST_INSN:
 	case IF_INSIDE_IT_LAST_INSN:
-	  now_it.block_length++;
+	  now_pred.block_length++;
 
-	  if (now_it.block_length > 4
-	      || !now_it_compatible (inst.cond))
+	  if (now_pred.block_length > 4
+	      || !now_pred_compatible (inst.cond))
 	    {
 	      force_automatic_it_block_close ();
-	      if (inst.it_insn_type != IF_INSIDE_IT_LAST_INSN)
+	      if (inst.pred_insn_type != IF_INSIDE_IT_LAST_INSN)
 		new_automatic_it_block (inst.cond);
 	    }
 	  else
 	    {
-	      now_it.insn_cond = TRUE;
-	      now_it_add_mask (inst.cond);
+	      now_pred.insn_cond = TRUE;
+	      now_pred_add_mask (inst.cond);
 	    }
 
-	  if (now_it.state == AUTOMATIC_IT_BLOCK
-	      && (inst.it_insn_type == INSIDE_IT_LAST_INSN
-		  || inst.it_insn_type == IF_INSIDE_IT_LAST_INSN))
+	  if (now_pred.state == AUTOMATIC_PRED_BLOCK
+	      && (inst.pred_insn_type == INSIDE_IT_LAST_INSN
+		  || inst.pred_insn_type == IF_INSIDE_IT_LAST_INSN))
 	    close_automatic_it_block ();
 	  break;
 
 	case NEUTRAL_IT_INSN:
-	  now_it.block_length++;
-	  now_it.insn_cond = TRUE;
+	  now_pred.block_length++;
+	  now_pred.insn_cond = TRUE;
 
-	  if (now_it.block_length > 4)
+	  if (now_pred.block_length > 4)
 	    force_automatic_it_block_close ();
 	  else
-	    now_it_add_mask (now_it.cc & 1);
+	    now_pred_add_mask (now_pred.cc & 1);
 	  break;
 
 	case IT_INSN:
 	  close_automatic_it_block ();
-	  now_it.state = MANUAL_IT_BLOCK;
+	  now_pred.state = MANUAL_PRED_BLOCK;
 	  break;
 	}
       break;
 
-    case MANUAL_IT_BLOCK:
+    case MANUAL_PRED_BLOCK:
       {
-	/* Check conditional suffixes.  */
-	const int cond = now_it.cc ^ ((now_it.mask >> 4) & 1) ^ 1;
-	int is_last;
-	now_it.mask <<= 1;
-	now_it.mask &= 0x1f;
-	is_last = (now_it.mask == 0x10);
-	now_it.insn_cond = TRUE;
-
-	switch (inst.it_insn_type)
+	int cond, is_last;
+	if (now_pred.type == SCALAR_PRED)
 	  {
-	  case OUTSIDE_IT_INSN:
-	    inst.error = BAD_NOT_IT;
-	    return FAIL;
+	    /* Check conditional suffixes.  */
+	    cond = now_pred.cc ^ ((now_pred.mask >> 4) & 1) ^ 1;
+	    now_pred.mask <<= 1;
+	    now_pred.mask &= 0x1f;
+	    is_last = (now_pred.mask == 0x10);
+	  }
+	else
+	  {
+	    now_pred.cc ^= (now_pred.mask >> 4);
+	    cond = now_pred.cc + 0xf;
+	    now_pred.mask <<= 1;
+	    now_pred.mask &= 0x1f;
+	    is_last = (now_pred.mask == 0x10);
+	  }
+	now_pred.insn_cond = TRUE;
 
+	switch (inst.pred_insn_type)
+	  {
+	  case OUTSIDE_PRED_INSN:
+	    if (now_pred.type == SCALAR_PRED)
+	      {
+		if (inst.cond == COND_ALWAYS)
+		  {
+		    /* Case 12: In an IT block, with no code: error: missing
+		       code.  */
+		    inst.error = BAD_NOT_IT;
+		    return FAIL;
+		  }
+		else if (inst.cond > COND_ALWAYS)
+		  {
+		    /* Case 11: In an IT block, with a VPT code: syntax error.
+		     */
+		    inst.error = BAD_SYNTAX;
+		    return FAIL;
+		  }
+		else if (thumb_mode)
+		  {
+		    /* This is for some special cases where a non-MVE
+		       instruction that is not allowed in an IT block, such as
+		       cbz, is put into one with a condition code.
+		       You could argue this should be a syntax error, but we
+		       gave the 'not allowed in IT block' diagnostic in the
+		       past so we will keep doing so.  */
+		    inst.error = BAD_NOT_IT;
+		    return FAIL;
+		  }
+		break;
+	      }
+	    else
+	      {
+		/* Case 15: In a VPT block, with no code: UNPREDICTABLE.  */
+		as_tsktsk (MVE_NOT_VPT);
+		return SUCCESS;
+	      }
+	  case MVE_OUTSIDE_PRED_INSN:
+	    if (now_pred.type == SCALAR_PRED)
+	      {
+		if (inst.cond == COND_ALWAYS)
+		  {
+		    /* Case 3: In an IT block, with no code: warning:
+		       UNPREDICTABLE.  */
+		    as_tsktsk (MVE_NOT_IT);
+		    return SUCCESS;
+		  }
+		else if (inst.cond < COND_ALWAYS)
+		  {
+		    /* Case 1: In an IT block, with an IT code: syntax error.
+		     */
+		    inst.error = BAD_SYNTAX;
+		    return FAIL;
+		  }
+		else
+		  gas_assert (0);
+	      }
+	    else
+	      {
+		if (inst.cond < COND_ALWAYS)
+		  {
+		    /* Case 4: In a VPT block, with an IT code: syntax error.
+		     */
+		    inst.error = BAD_SYNTAX;
+		    return FAIL;
+		  }
+		else if (inst.cond == COND_ALWAYS)
+		  {
+		    /* Case 6: In a VPT block, with no code: error: missing
+		       code.  */
+		    inst.error = BAD_NOT_VPT;
+		    return FAIL;
+		  }
+		else
+		  {
+		    gas_assert (0);
+		  }
+	      }
 	  case INSIDE_IT_INSN:
-	    if (cond != inst.cond)
+	    if (inst.cond > COND_ALWAYS)
 	      {
-		inst.error = BAD_IT_COND;
+		/* Case 11: In an IT block, with a VPT code: syntax error.  */
+		/* Case 14: In a VPT block, with a VPT code: syntax error.  */
+		inst.error = BAD_SYNTAX;
+		return FAIL;
+	      }
+	    else if (now_pred.type == SCALAR_PRED)
+	      {
+		/* Case 10: In an IT block, with an IT code: OK!  */
+		if (cond != inst.cond)
+		  {
+		    /* The enclosing block is known to be an IT block here.  */
+		    inst.error = BAD_IT_COND;
+		    return FAIL;
+		  }
+	      }
+	    else
+	      {
+		/* Case 13: In a VPT block, with an IT code: error: should be
+		   in an IT block.  */
+		inst.error = BAD_OUT_IT;
 		return FAIL;
 	      }
 	    break;
 
+	  case INSIDE_VPT_INSN:
+	    if (now_pred.type == SCALAR_PRED)
+	      {
+		/* Case 2: In an IT block, with a VPT code: error: must be in a
+		   VPT block.  */
+		inst.error = BAD_OUT_VPT;
+		return FAIL;
+	      }
+	    /* Case 5: In a VPT block, with a VPT code: OK!  */
+	    else if (cond != inst.cond)
+	      {
+		inst.error = BAD_VPT_COND;
+		return FAIL;
+	      }
+	    break;
 	  case INSIDE_IT_LAST_INSN:
 	  case IF_INSIDE_IT_LAST_INSN:
-	    if (cond != inst.cond)
+	    if (now_pred.type == VECTOR_PRED || inst.cond > COND_ALWAYS)
+	      {
+		/* Case 4: In a VPT block, with an IT code: syntax error.  */
+		/* Case 11: In an IT block, with a VPT code: syntax error.  */
+		inst.error = BAD_SYNTAX;
+		return FAIL;
+	      }
+	    else if (cond != inst.cond)
 	      {
 		inst.error = BAD_IT_COND;
 		return FAIL;
@@ -19045,14 +19514,37 @@  handle_it_state (void)
 	    break;
 
 	  case NEUTRAL_IT_INSN:
-	    /* The BKPT instruction is unconditional even in an IT block.  */
+	    /* The BKPT instruction is unconditional even in an IT or VPT
+	       block.  */
 	    break;
 
 	  case IT_INSN:
-	    inst.error = BAD_IT_IT;
-	    return FAIL;
+	    if (now_pred.type == SCALAR_PRED)
+	      {
+		inst.error = BAD_IT_IT;
+		return FAIL;
+	      }
+	    /* fall through.  */
+	  case VPT_INSN:
+	    if (inst.cond == COND_ALWAYS)
+	      {
+		/* Executing a VPT/VPST instruction inside an IT block or a
+		   VPT/VPST/IT instruction inside a VPT block is UNPREDICTABLE.
+		 */
+		if (now_pred.type == SCALAR_PRED)
+		  as_tsktsk (MVE_NOT_IT);
+		else
+		  as_tsktsk (MVE_NOT_VPT);
+		return SUCCESS;
+	      }
+	    else
+	      {
+		/* VPT/VPST do not accept condition codes.  */
+		inst.error = BAD_SYNTAX;
+		return FAIL;
+	      }
 	  }
-      }
+	}
       break;
     }
 
@@ -19086,11 +19578,11 @@  it_fsm_post_encode (void)
 {
   int is_last;
 
-  if (!now_it.state_handled)
-    handle_it_state ();
+  if (!now_pred.state_handled)
+    handle_pred_state ();
 
-  if (now_it.insn_cond
-      && !now_it.warn_deprecated
+  if (now_pred.insn_cond
+      && !now_pred.warn_deprecated
       && warn_on_deprecated
       && ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_v8)
       && !ARM_CPU_HAS_FEATURE (cpu_variant, arm_ext_m))
@@ -19099,7 +19591,7 @@  it_fsm_post_encode (void)
 	{
 	  as_tsktsk (_("IT blocks containing 32-bit Thumb instructions are "
 		     "performance deprecated in ARMv8-A and ARMv8-R"));
-	  now_it.warn_deprecated = TRUE;
+	  now_pred.warn_deprecated = TRUE;
 	}
       else
 	{
@@ -19113,7 +19605,7 @@  it_fsm_post_encode (void)
 			       "instructions of the following class are "
 			       "performance deprecated in ARMv8-A and "
 			       "ARMv8-R: %s"), p->description);
-		  now_it.warn_deprecated = TRUE;
+		  now_pred.warn_deprecated = TRUE;
 		  break;
 		}
 
@@ -19121,41 +19613,41 @@  it_fsm_post_encode (void)
 	    }
 	}
 
-      if (now_it.block_length > 1)
+      if (now_pred.block_length > 1)
 	{
 	  as_tsktsk (_("IT blocks containing more than one conditional "
 		     "instruction are performance deprecated in ARMv8-A and "
 		     "ARMv8-R"));
-	  now_it.warn_deprecated = TRUE;
+	  now_pred.warn_deprecated = TRUE;
 	}
     }
 
-  is_last = (now_it.mask == 0x10);
-  if (is_last)
-    {
-      now_it.state = OUTSIDE_IT_BLOCK;
-      now_it.mask = 0;
-    }
+    is_last = (now_pred.mask == 0x10);
+    if (is_last)
+      {
+	now_pred.state = OUTSIDE_PRED_BLOCK;
+	now_pred.mask = 0;
+      }
 }
 
 static void
 force_automatic_it_block_close (void)
 {
-  if (now_it.state == AUTOMATIC_IT_BLOCK)
+  if (now_pred.state == AUTOMATIC_PRED_BLOCK)
     {
       close_automatic_it_block ();
-      now_it.state = OUTSIDE_IT_BLOCK;
-      now_it.mask = 0;
+      now_pred.state = OUTSIDE_PRED_BLOCK;
+      now_pred.mask = 0;
     }
 }
 
 static int
-in_it_block (void)
+in_pred_block (void)
 {
-  if (!now_it.state_handled)
-    handle_it_state ();
+  if (!now_pred.state_handled)
+    handle_pred_state ();
 
-  return now_it.state != OUTSIDE_IT_BLOCK;
+  return now_pred.state != OUTSIDE_PRED_BLOCK;
 }
 
 /* Whether OPCODE only has T32 encoding.  Since this function is only used by
@@ -19308,7 +19800,7 @@  md_assemble (char *str)
 
       if (!parse_operands (p, opcode->operands, /*thumb=*/TRUE))
 	{
-	  /* Prepare the it_insn_type for those encodings that don't set
+	  /* Prepare the pred_insn_type for those encodings that don't set
 	     it.  */
 	  it_fsm_pre_encode ();
 
@@ -19411,21 +19903,30 @@  md_assemble (char *str)
 }
 
 static void
-check_it_blocks_finished (void)
+check_pred_blocks_finished (void)
 {
 #ifdef OBJ_ELF
   asection *sect;
 
   for (sect = stdoutput->sections; sect != NULL; sect = sect->next)
-    if (seg_info (sect)->tc_segment_info_data.current_it.state
-	== MANUAL_IT_BLOCK)
+    if (seg_info (sect)->tc_segment_info_data.current_pred.state
+	== MANUAL_PRED_BLOCK)
       {
-	as_warn (_("section '%s' finished with an open IT block."),
-		 sect->name);
+	if (now_pred.type == SCALAR_PRED)
+	  as_warn (_("section '%s' finished with an open IT block."),
+		   sect->name);
+	else
+	  as_warn (_("section '%s' finished with an open VPT/VPST block."),
+		   sect->name);
       }
 #else
-  if (now_it.state == MANUAL_IT_BLOCK)
-    as_warn (_("file finished with an open IT block."));
+  if (now_pred.state == MANUAL_PRED_BLOCK)
+    {
+      if (now_pred.type == SCALAR_PRED)
+	as_warn (_("file finished with an open IT block."));
+      else
+	as_warn (_("file finished with an open VPT/VPST block."));
+    }
 #endif
 }
 
@@ -19827,7 +20328,7 @@  static struct reloc_entry reloc_names[] =
 };
 #endif
 
-/* Table of all conditional affixes.  0xF is not defined as a condition code.  */
+/* Table of all conditional affixes.  */
 static const struct asm_cond conds[] =
 {
   {"eq", 0x0},
@@ -19846,6 +20347,11 @@  static const struct asm_cond conds[] =
   {"le", 0xd},
   {"al", 0xe}
 };
+static const struct asm_cond vconds[] =
+{
+  {"t", 0xf},
+  {"e", 0x10}
+};
 
 #define UL_BARRIER(L,U,CODE,FEAT) \
   { L, CODE, ARM_FEATURE_CORE_LOW (FEAT) }, \
@@ -19904,7 +20410,7 @@  static struct asm_barrier_opt barrier_opt_names[] =
 /* The normal sort of mnemonic; has a Thumb variant; takes a conditional suffix.  */
 #define TxCE(mnem, op, top, nops, ops, ae, te) \
   { mnem, OPS##nops ops, OT_csuffix, 0x##op, top, ARM_VARIANT, \
-    THUMB_VARIANT, do_##ae, do_##te }
+    THUMB_VARIANT, do_##ae, do_##te, 0 }
 
 /* Two variants of the above - TCE for a numeric Thumb opcode, tCE for
    a T_MNEM_xyz enumerator.  */
@@ -19917,10 +20423,10 @@  static struct asm_barrier_opt barrier_opt_names[] =
    infix after the third character.  */
 #define TxC3(mnem, op, top, nops, ops, ae, te) \
   { mnem, OPS##nops ops, OT_cinfix3, 0x##op, top, ARM_VARIANT, \
-    THUMB_VARIANT, do_##ae, do_##te }
+    THUMB_VARIANT, do_##ae, do_##te, 0 }
 #define TxC3w(mnem, op, top, nops, ops, ae, te) \
   { mnem, OPS##nops ops, OT_cinfix3_deprecated, 0x##op, top, ARM_VARIANT, \
-    THUMB_VARIANT, do_##ae, do_##te }
+    THUMB_VARIANT, do_##ae, do_##te, 0 }
 #define TC3(mnem, aop, top, nops, ops, ae, te) \
       TxC3 (mnem, aop, 0x##top, nops, ops, ae, te)
 #define TC3w(mnem, aop, top, nops, ops, ae, te) \
@@ -19935,74 +20441,74 @@  static struct asm_barrier_opt barrier_opt_names[] =
    conditionally, so this is checked separately.  */
 #define TUE(mnem, op, top, nops, ops, ae, te)				\
   { mnem, OPS##nops ops, OT_unconditional, 0x##op, 0x##top, ARM_VARIANT, \
-    THUMB_VARIANT, do_##ae, do_##te }
+    THUMB_VARIANT, do_##ae, do_##te, 0 }
 
 /* Same as TUE but the encoding function for ARM and Thumb modes is the same.
    Used by mnemonics that have very minimal differences in the encoding for
    ARM and Thumb variants and can be handled in a common function.  */
 #define TUEc(mnem, op, top, nops, ops, en) \
   { mnem, OPS##nops ops, OT_unconditional, 0x##op, 0x##top, ARM_VARIANT, \
-    THUMB_VARIANT, do_##en, do_##en }
+    THUMB_VARIANT, do_##en, do_##en, 0 }
 
 /* Mnemonic that cannot be conditionalized, and bears 0xF in its ARM
    condition code field.  */
 #define TUF(mnem, op, top, nops, ops, ae, te)				\
   { mnem, OPS##nops ops, OT_unconditionalF, 0x##op, 0x##top, ARM_VARIANT, \
-    THUMB_VARIANT, do_##ae, do_##te }
+    THUMB_VARIANT, do_##ae, do_##te, 0 }
 
 /* ARM-only variants of all the above.  */
 #define CE(mnem,  op, nops, ops, ae)	\
-  { mnem, OPS##nops ops, OT_csuffix, 0x##op, 0x0, ARM_VARIANT, 0, do_##ae, NULL }
+  { mnem, OPS##nops ops, OT_csuffix, 0x##op, 0x0, ARM_VARIANT, 0, do_##ae, NULL, 0 }
 
 #define C3(mnem, op, nops, ops, ae)	\
-  { #mnem, OPS##nops ops, OT_cinfix3, 0x##op, 0x0, ARM_VARIANT, 0, do_##ae, NULL }
+  { #mnem, OPS##nops ops, OT_cinfix3, 0x##op, 0x0, ARM_VARIANT, 0, do_##ae, NULL, 0 }
 
 /* Thumb-only variants of TCE and TUE.  */
 #define ToC(mnem, top, nops, ops, te) \
   { mnem, OPS##nops ops, OT_csuffix, 0x0, 0x##top, 0, THUMB_VARIANT, NULL, \
-    do_##te }
+    do_##te, 0 }
 
 #define ToU(mnem, top, nops, ops, te) \
   { mnem, OPS##nops ops, OT_unconditional, 0x0, 0x##top, 0, THUMB_VARIANT, \
-    NULL, do_##te }
+    NULL, do_##te, 0 }
 
 /* T_MNEM_xyz enumerator variants of ToC.  */
 #define toC(mnem, top, nops, ops, te) \
   { mnem, OPS##nops ops, OT_csuffix, 0x0, T_MNEM##top, 0, THUMB_VARIANT, NULL, \
-    do_##te }
+    do_##te, 0 }
 
 /* T_MNEM_xyz enumerator variants of ToU.  */
 #define toU(mnem, top, nops, ops, te) \
   { mnem, OPS##nops ops, OT_unconditional, 0x0, T_MNEM##top, 0, THUMB_VARIANT, \
-    NULL, do_##te }
+    NULL, do_##te, 0 }
 
 /* Legacy mnemonics that always have conditional infix after the third
    character.  */
 #define CL(mnem, op, nops, ops, ae)	\
   { mnem, OPS##nops ops, OT_cinfix3_legacy, \
-    0x##op, 0x0, ARM_VARIANT, 0, do_##ae, NULL }
+    0x##op, 0x0, ARM_VARIANT, 0, do_##ae, NULL, 0 }
 
 /* Coprocessor instructions.  Isomorphic between Arm and Thumb-2.  */
 #define cCE(mnem,  op, nops, ops, ae)	\
-  { mnem, OPS##nops ops, OT_csuffix, 0x##op, 0xe##op, ARM_VARIANT, ARM_VARIANT, do_##ae, do_##ae }
+  { mnem, OPS##nops ops, OT_csuffix, 0x##op, 0xe##op, ARM_VARIANT, ARM_VARIANT, do_##ae, do_##ae, 0 }
 
 /* Legacy coprocessor instructions where conditional infix and conditional
    suffix are ambiguous.  For consistency this includes all FPA instructions,
    not just the potentially ambiguous ones.  */
 #define cCL(mnem, op, nops, ops, ae)	\
   { mnem, OPS##nops ops, OT_cinfix3_legacy, \
-    0x##op, 0xe##op, ARM_VARIANT, ARM_VARIANT, do_##ae, do_##ae }
+    0x##op, 0xe##op, ARM_VARIANT, ARM_VARIANT, do_##ae, do_##ae, 0 }
 
 /* Coprocessor, takes either a suffix or a position-3 infix
    (for an FPA corner case). */
 #define C3E(mnem, op, nops, ops, ae) \
   { mnem, OPS##nops ops, OT_csuf_or_in3, \
-    0x##op, 0xe##op, ARM_VARIANT, ARM_VARIANT, do_##ae, do_##ae }
+    0x##op, 0xe##op, ARM_VARIANT, ARM_VARIANT, do_##ae, do_##ae, 0 }
 
 #define xCM_(m1, m2, m3, op, nops, ops, ae)	\
   { m1 #m2 m3, OPS##nops ops, \
     sizeof (#m2) == 1 ? OT_odd_infix_unc : OT_odd_infix_0 + sizeof (m1) - 1, \
-    0x##op, 0x0, ARM_VARIANT, 0, do_##ae, NULL }
+    0x##op, 0x0, ARM_VARIANT, 0, do_##ae, NULL, 0 }
 
 #define CM(m1, m2, op, nops, ops, ae)	\
   xCM_ (m1,   , m2, op, nops, ops, ae),	\
@@ -20026,47 +20532,83 @@  static struct asm_barrier_opt barrier_opt_names[] =
   xCM_ (m1, al, m2, op, nops, ops, ae)
 
 #define UE(mnem, op, nops, ops, ae)	\
-  { #mnem, OPS##nops ops, OT_unconditional, 0x##op, 0, ARM_VARIANT, 0, do_##ae, NULL }
+  { #mnem, OPS##nops ops, OT_unconditional, 0x##op, 0, ARM_VARIANT, 0, do_##ae, NULL, 0 }
 
 #define UF(mnem, op, nops, ops, ae)	\
-  { #mnem, OPS##nops ops, OT_unconditionalF, 0x##op, 0, ARM_VARIANT, 0, do_##ae, NULL }
+  { #mnem, OPS##nops ops, OT_unconditionalF, 0x##op, 0, ARM_VARIANT, 0, do_##ae, NULL, 0 }
 
 /* Neon data-processing. ARM versions are unconditional with cond=0xf.
    The Thumb and ARM variants are mostly the same (bits 0-23 and 24/28), so we
    use the same encoding function for each.  */
 #define NUF(mnem, op, nops, ops, enc)					\
   { #mnem, OPS##nops ops, OT_unconditionalF, 0x##op, 0x##op,		\
-    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc }
+    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc, 0 }
 
 /* Neon data processing, version which indirects through neon_enc_tab for
    the various overloaded versions of opcodes.  */
 #define nUF(mnem, op, nops, ops, enc)					\
   { #mnem, OPS##nops ops, OT_unconditionalF, N_MNEM##op, N_MNEM##op,	\
-    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc }
+    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc, 0 }
 
 /* Neon insn with conditional suffix for the ARM version, non-overloaded
    version.  */
-#define NCE_tag(mnem, op, nops, ops, enc, tag)				\
+#define NCE_tag(mnem, op, nops, ops, enc, tag, mve_p)				\
   { #mnem, OPS##nops ops, tag, 0x##op, 0x##op, ARM_VARIANT,		\
-    THUMB_VARIANT, do_##enc, do_##enc }
+    THUMB_VARIANT, do_##enc, do_##enc, mve_p }
 
 #define NCE(mnem, op, nops, ops, enc)					\
-   NCE_tag (mnem, op, nops, ops, enc, OT_csuffix)
+   NCE_tag (mnem, op, nops, ops, enc, OT_csuffix, 0)
 
 #define NCEF(mnem, op, nops, ops, enc)					\
-    NCE_tag (mnem, op, nops, ops, enc, OT_csuffixF)
+    NCE_tag (mnem, op, nops, ops, enc, OT_csuffixF, 0)
 
 /* Neon insn with conditional suffix for the ARM version, overloaded types.  */
-#define nCE_tag(mnem, op, nops, ops, enc, tag)				\
+#define nCE_tag(mnem, op, nops, ops, enc, tag, mve_p)				\
   { #mnem, OPS##nops ops, tag, N_MNEM##op, N_MNEM##op,		\
-    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc }
+    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc, mve_p }
 
 #define nCE(mnem, op, nops, ops, enc)					\
-   nCE_tag (mnem, op, nops, ops, enc, OT_csuffix)
+   nCE_tag (mnem, op, nops, ops, enc, OT_csuffix, 0)
 
 #define nCEF(mnem, op, nops, ops, enc)					\
-    nCE_tag (mnem, op, nops, ops, enc, OT_csuffixF)
+    nCE_tag (mnem, op, nops, ops, enc, OT_csuffixF, 0)
+
+/* MVE insn, maybe predicated: Thumb opcode M_MNEM##op, ARM opcode 0.  */
+#define mCEF(mnem, op, nops, ops, enc)				\
+  { #mnem, OPS##nops ops, OT_csuffixF, 0, M_MNEM##op,		\
+    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc, 1 }
+
+
+/* nCEF but for MVE predicated instructions.  */
+#define mnCEF(mnem, op, nops, ops, enc)					\
+    nCE_tag (mnem, op, nops, ops, enc, OT_csuffixF, 1)
+
+/* nCE but for MVE predicated instructions.  */
+#define mnCE(mnem, op, nops, ops, enc)					\
+   nCE_tag (mnem, op, nops, ops, enc, OT_csuffix, 1)
 
+/* NUF but for potentially MVE predicated instructions.  */
+#define MNUF(mnem, op, nops, ops, enc)					\
+  { #mnem, OPS##nops ops, OT_unconditionalF, 0x##op, 0x##op,		\
+    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc, 1 }
+
+/* nUF but for potentially MVE predicated instructions.  */
+#define mnUF(mnem, op, nops, ops, enc)					\
+  { #mnem, OPS##nops ops, OT_unconditionalF, N_MNEM##op, N_MNEM##op,	\
+    ARM_VARIANT, THUMB_VARIANT, do_##enc, do_##enc, 1 }
+
+/* ToC but for potentially MVE predicated instructions.  */
+#define mToC(mnem, top, nops, ops, te) \
+  { mnem, OPS##nops ops, OT_csuffix, 0x0, 0x##top, 0, THUMB_VARIANT, NULL, \
+    do_##te, 1 }
+
+/* NCE but for MVE predicated instructions.  */
+#define MNCE(mnem, op, nops, ops, enc)					\
+   NCE_tag (mnem, op, nops, ops, enc, OT_csuffix, 1)
+
+/* NCEF but for MVE predicated instructions.  */
+#define MNCEF(mnem, op, nops, ops, enc)					\
+    NCE_tag (mnem, op, nops, ops, enc, OT_csuffixF, 1)
 #define do_0 0
 
 static const struct asm_opcode insns[] =
@@ -21370,9 +21912,6 @@  static const struct asm_opcode insns[] =
  nCEF(vmla,     _vmla,    3, (RNSDQ, oRNSDQ, RNSDQ_RNSC), neon_mac_maybe_scalar),
  nCEF(vmls,     _vmls,    3, (RNSDQ, oRNSDQ, RNSDQ_RNSC), neon_mac_maybe_scalar),
 
- nCEF(vadd,     _vadd,    3, (RNSDQ, oRNSDQ, RNSDQ), neon_addsub_if_i),
- nCEF(vsub,     _vsub,    3, (RNSDQ, oRNSDQ, RNSDQ), neon_addsub_if_i),
-
  NCEF(vabs,     1b10300, 2, (RNSDQ, RNSDQ), neon_abs_neg),
  NCEF(vneg,     1b10380, 2, (RNSDQ, RNSDQ), neon_abs_neg),
 
@@ -21465,7 +22004,6 @@  static const struct asm_opcode insns[] =
  NUF(vbif,      1300110, 3, (RNDQ, RNDQ, RNDQ), neon_bitfield),
  NUF(vbifq,     1300110, 3, (RNQ,  RNQ,  RNQ),  neon_bitfield),
   /* Int and float variants, types S8 S16 S32 U8 U16 U32 F16 F32.  */
- nUF(vabd,      _vabd,    3, (RNDQ, oRNDQ, RNDQ), neon_dyadic_if_su),
  nUF(vabdq,     _vabd,    3, (RNQ,  oRNQ,  RNQ),  neon_dyadic_if_su),
  nUF(vmax,      _vmax,    3, (RNDQ, oRNDQ, RNDQ), neon_dyadic_if_su),
  nUF(vmaxq,     _vmax,    3, (RNQ,  oRNQ,  RNQ),  neon_dyadic_if_su),
@@ -21566,9 +22104,6 @@  static const struct asm_opcode insns[] =
   /* Data processing, three registers of different lengths.  */
   /* Dyadic, long insns. Types S8 S16 S32 U8 U16 U32.  */
  NUF(vabal,     0800500, 3, (RNQ, RND, RND),  neon_abal),
- NUF(vabdl,     0800700, 3, (RNQ, RND, RND),  neon_dyadic_long),
- NUF(vaddl,     0800000, 3, (RNQ, RND, RND),  neon_dyadic_long),
- NUF(vsubl,     0800200, 3, (RNQ, RND, RND),  neon_dyadic_long),
   /* If not scalar, fall back to neon_dyadic_long.
      Vector types as above, scalar types S16 S32 U16 U32.  */
  nUF(vmlal,     _vmlal,   3, (RNQ, RND, RND_RNSC), neon_mac_maybe_scalar_long),
@@ -22083,7 +22618,40 @@  static const struct asm_opcode insns[] =
  toU("le",  _le,  2, (oLR, EXP),	 t_loloop),
 
  ToC("clrm",	e89f0000, 1, (CLRMLST),  t_clrm),
- ToC("vscclrm",	ec9f0a00, 1, (VRSDVLST), t_vscclrm)
+ ToC("vscclrm",	ec9f0a00, 1, (VRSDVLST), t_vscclrm),
+
+#undef  THUMB_VARIANT
+#define THUMB_VARIANT & mve_ext
+ ToC("vpst",	fe710f4d, 0, (), mve_vpt),
+ ToC("vpstt",	fe318f4d, 0, (), mve_vpt),
+ ToC("vpste",	fe718f4d, 0, (), mve_vpt),
+ ToC("vpsttt",	fe314f4d, 0, (), mve_vpt),
+ ToC("vpstte",	fe31cf4d, 0, (), mve_vpt),
+ ToC("vpstet",	fe71cf4d, 0, (), mve_vpt),
+ ToC("vpstee",	fe714f4d, 0, (), mve_vpt),
+ ToC("vpstttt",	fe312f4d, 0, (), mve_vpt),
+ ToC("vpsttte",	fe316f4d, 0, (), mve_vpt),
+ ToC("vpsttet",	fe31ef4d, 0, (), mve_vpt),
+ ToC("vpsttee",	fe31af4d, 0, (), mve_vpt),
+ ToC("vpstett",	fe71af4d, 0, (), mve_vpt),
+ ToC("vpstete",	fe71ef4d, 0, (), mve_vpt),
+ ToC("vpsteet",	fe716f4d, 0, (), mve_vpt),
+ ToC("vpsteee",	fe712f4d, 0, (), mve_vpt),
+
+#undef  ARM_VARIANT
+#define ARM_VARIANT    & fpu_vfp_ext_v1xd
+#undef  THUMB_VARIANT
+#define THUMB_VARIANT  & arm_ext_v6t2
+
+ mnCEF(vadd,     _vadd,    3, (RNSDQMQ, oRNSDQMQ, RNSDQMQR), neon_addsub_if_i),
+ mnCEF(vsub,     _vsub,    3, (RNSDQMQ, oRNSDQMQ, RNSDQMQR), neon_addsub_if_i),
+
+#undef ARM_VARIANT
+#define ARM_VARIANT & fpu_neon_ext_v1
+ mnUF(vabd,      _vabd,    3, (RNDQMQ, oRNDQMQ, RNDQMQ), neon_dyadic_if_su),
+ mnUF(vabdl,     _vabdl,	  3, (RNQMQ, RNDMQ, RNDMQ),   neon_dyadic_long),
+ mnUF(vaddl,     _vaddl,	  3, (RNQMQ, RNDMQ, RNDMQR),  neon_dyadic_long),
+ mnUF(vsubl,     _vsubl,	  3, (RNQMQ, RNDMQ, RNDMQR),  neon_dyadic_long),
 };
 #undef ARM_VARIANT
 #undef THUMB_VARIANT
@@ -25962,8 +26530,8 @@  arm_cleanup (void)
 {
   literal_pool * pool;
 
-  /* Ensure that all the IT blocks are properly closed.  */
-  check_it_blocks_finished ();
+  /* Ensure that all the predication blocks are properly closed.  */
+  check_pred_blocks_finished ();
 
   for (pool = list_of_pools; pool; pool = pool->next)
     {
@@ -26155,6 +26723,7 @@  md_begin (void)
 
   if (	 (arm_ops_hsh = hash_new ()) == NULL
       || (arm_cond_hsh = hash_new ()) == NULL
+      || (arm_vcond_hsh = hash_new ()) == NULL
       || (arm_shift_hsh = hash_new ()) == NULL
       || (arm_psr_hsh = hash_new ()) == NULL
       || (arm_v7m_psr_hsh = hash_new ()) == NULL
@@ -26167,6 +26736,8 @@  md_begin (void)
     hash_insert (arm_ops_hsh, insns[i].template_name, (void *) (insns + i));
   for (i = 0; i < sizeof (conds) / sizeof (struct asm_cond); i++)
     hash_insert (arm_cond_hsh, conds[i].template_name, (void *) (conds + i));
+  for (i = 0; i < sizeof (vconds) / sizeof (struct asm_cond); i++)
+    hash_insert (arm_vcond_hsh, vconds[i].template_name, (void *) (vconds + i));
   for (i = 0; i < sizeof (shift_names) / sizeof (struct asm_shift_name); i++)
     hash_insert (arm_shift_hsh, shift_names[i].name, (void *) (shift_names + i));
   for (i = 0; i < sizeof (psrs) / sizeof (struct asm_psr); i++)
diff --git a/gas/testsuite/gas/arm/armv8_3-a-fp-bad.l b/gas/testsuite/gas/arm/armv8_3-a-fp-bad.l
index 755b6f74aed07b54a57efcfd1ac5e0aadd6e61b7..6b7e30ff1d96bf0cc1e49141bd3924bf8afa2b13 100644
--- a/gas/testsuite/gas/arm/armv8_3-a-fp-bad.l
+++ b/gas/testsuite/gas/arm/armv8_3-a-fp-bad.l
@@ -3,5 +3,5 @@ 
 [^:]+:4: Error: VFP single precision register expected -- `vjcvt\.s32\.f64 r0,d1'
 [^:]+:5: Error: VFP/Neon double precision register expected -- `vjcvt\.s32\.f64 s0,s1'
 [^:]+:6: Error: VFP/Neon double precision register expected -- `vjcvt\.s32\.f32 s0,s1'
-[^:]+:7: Error: bad type in Neon instruction -- `vjcvt\.s32\.f32 s0,d1'
-[^:]+:8: Error: bad type in Neon instruction -- `vjcvt\.f32\.f64 s0,d1'
+[^:]+:7: Error: bad type in SIMD instruction -- `vjcvt\.s32\.f32 s0,d1'
+[^:]+:8: Error: bad type in SIMD instruction -- `vjcvt\.f32\.f64 s0,d1'
diff --git a/gas/testsuite/gas/arm/armv8_3-a-simd-bad.l b/gas/testsuite/gas/arm/armv8_3-a-simd-bad.l
index 2a3ea9b1d24dde073216b951b9eebed48fa2130f..d440d64011f63b687691a7e05e4e0f5cedcfb19b 100644
--- a/gas/testsuite/gas/arm/armv8_3-a-simd-bad.l
+++ b/gas/testsuite/gas/arm/armv8_3-a-simd-bad.l
@@ -3,15 +3,15 @@ 
 [^:]+:7: Error: immediate out of range -- `vcadd\.f32 q0,q1,q2,#0'
 [^:]+:8: Error: immediate out of range -- `vcadd\.f32 q0,q1,q2,#180'
 [^:]+:9: Error: Neon double or quad precision register expected -- `vcadd\.f16 s0,s1,s2,#90'
-[^:]+:10: Error: bad type in Neon instruction -- `vcadd\.f64 d0,d1,d2,#90'
-[^:]+:11: Error: bad type in Neon instruction -- `vcadd\.f64 q0,q1,q2,#90'
+[^:]+:10: Error: bad type in SIMD instruction -- `vcadd\.f64 d0,d1,d2,#90'
+[^:]+:11: Error: bad type in SIMD instruction -- `vcadd\.f64 q0,q1,q2,#90'
 [^:]+:13: Error: operand types can't be inferred -- `vcmla d0,d1,d2,#90'
 [^:]+:14: Error: immediate out of range -- `vcmla\.f32 q0,q1,q2,#-90'
 [^:]+:15: Error: immediate out of range -- `vcmla\.f32 q0,q1,q2,#120'
 [^:]+:16: Error: immediate out of range -- `vcmla\.f32 q0,q1,q2,#360'
 [^:]+:17: Error: Neon double or quad precision register expected -- `vcmla\.f16 s0,s1,s2,#90'
-[^:]+:18: Error: bad type in Neon instruction -- `vcmla\.f64 d0,d1,d2,#90'
-[^:]+:19: Error: bad type in Neon instruction -- `vcmla\.f64 q0,q1,q2,#90'
+[^:]+:18: Error: bad type in SIMD instruction -- `vcmla\.f64 d0,d1,d2,#90'
+[^:]+:19: Error: bad type in SIMD instruction -- `vcmla\.f64 q0,q1,q2,#90'
 [^:]+:21: Error: only D registers may be indexed -- `vcmla\.f16 q0,q1,q2\[0\],#90'
 [^:]+:22: Error: only D registers may be indexed -- `vcmla\.f32 q0,q1,q2\[0\],#90'
 [^:]+:23: Error: scalar out of range -- `vcmla\.f16 d0,d1,d2\[2\],#90'
@@ -22,15 +22,15 @@ 
 [^:]+:32: Error: immediate out of range -- `vcadd\.f32 q0,q1,q2,#0'
 [^:]+:33: Error: immediate out of range -- `vcadd\.f32 q0,q1,q2,#180'
 [^:]+:34: Error: Neon double or quad precision register expected -- `vcadd\.f16 s0,s1,s2,#90'
-[^:]+:35: Error: bad type in Neon instruction -- `vcadd\.f64 d0,d1,d2,#90'
-[^:]+:36: Error: bad type in Neon instruction -- `vcadd\.f64 q0,q1,q2,#90'
+[^:]+:35: Error: bad type in SIMD instruction -- `vcadd\.f64 d0,d1,d2,#90'
+[^:]+:36: Error: bad type in SIMD instruction -- `vcadd\.f64 q0,q1,q2,#90'
 [^:]+:38: Error: operand types can't be inferred -- `vcmla d0,d1,d2,#90'
 [^:]+:39: Error: immediate out of range -- `vcmla\.f32 q0,q1,q2,#-90'
 [^:]+:40: Error: immediate out of range -- `vcmla\.f32 q0,q1,q2,#120'
 [^:]+:41: Error: immediate out of range -- `vcmla\.f32 q0,q1,q2,#360'
 [^:]+:42: Error: Neon double or quad precision register expected -- `vcmla\.f16 s0,s1,s2,#90'
-[^:]+:43: Error: bad type in Neon instruction -- `vcmla\.f64 d0,d1,d2,#90'
-[^:]+:44: Error: bad type in Neon instruction -- `vcmla\.f64 q0,q1,q2,#90'
+[^:]+:43: Error: bad type in SIMD instruction -- `vcmla\.f64 d0,d1,d2,#90'
+[^:]+:44: Error: bad type in SIMD instruction -- `vcmla\.f64 q0,q1,q2,#90'
 [^:]+:46: Error: only D registers may be indexed -- `vcmla\.f16 q0,q1,q2\[0\],#90'
 [^:]+:47: Error: only D registers may be indexed -- `vcmla\.f32 q0,q1,q2\[0\],#90'
 [^:]+:48: Error: scalar out of range -- `vcmla\.f16 d0,d1,d2\[2\],#90'
diff --git a/gas/testsuite/gas/arm/dotprod-illegal.l b/gas/testsuite/gas/arm/dotprod-illegal.l
index 5b88bc3002b9516956822f5604e33261d164722b..c0c8708b367ef14c36cb8ecc6d57722d3200591e 100644
--- a/gas/testsuite/gas/arm/dotprod-illegal.l
+++ b/gas/testsuite/gas/arm/dotprod-illegal.l
@@ -1,9 +1,9 @@ 
 [^:]*: Assembler messages:
-[^:]*:4: Error: bad type in Neon instruction -- `vudot.s8 d0,d2,d5'
-[^:]*:6: Error: bad type in Neon instruction -- `vudot.u16 d0,d2,d5'
-[^:]*:7: Error: bad type in Neon instruction -- `vsdot.s16 d1,d12,d18'
-[^:]*:9: Error: bad type in Neon instruction -- `vudot.u32 d2,d22,d1'
-[^:]*:10: Error: bad type in Neon instruction -- `vsdot.s32 d3,d30,d9'
+[^:]*:4: Error: bad type in SIMD instruction -- `vudot.s8 d0,d2,d5'
+[^:]*:6: Error: bad type in SIMD instruction -- `vudot.u16 d0,d2,d5'
+[^:]*:7: Error: bad type in SIMD instruction -- `vsdot.s16 d1,d12,d18'
+[^:]*:9: Error: bad type in SIMD instruction -- `vudot.u32 d2,d22,d1'
+[^:]*:10: Error: bad type in SIMD instruction -- `vsdot.s32 d3,d30,d9'
 [^:]*:12: Error: scalar out of range for multiply instruction -- `vudot.u8 d31,d2,d16\[0\]'
 [^:]*:13: Error: scalar out of range for multiply instruction -- `vsdot.s8 q13,q14,d22\[1\]'
 [^:]*:15: Error: scalar out of range for multiply instruction -- `vudot.u8 d1,d8,d15\[2\]'
diff --git a/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.d b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.d
new file mode 100644
index 0000000000000000000000000000000000000000..86394e3edbda1b9a9760a44ce07263bbfa9de694
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.d
@@ -0,0 +1,5 @@ 
+#name: bad MVE VADD, VSUB and VABD instructions
+#as: -march=armv8.1-m.main+mve
+#error_output: mve-vaddsubabd-bad-1.l
+
+.*: +file format .*arm.*
diff --git a/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.l b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.l
new file mode 100644
index 0000000000000000000000000000000000000000..d4d7bfe7f89b688de35744f19a4c3bd7f3519546
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.l
@@ -0,0 +1,55 @@ 
+[^:]*: Assembler messages:
+[^:]*:11: Error: bad type in SIMD instruction -- `vadd.p8 q0,q1,q2'
+[^:]*:12: Error: selected FPU does not support instruction -- `vadd.f16 q0,q1,q2'
+[^:]*:13: Error: selected FPU does not support instruction -- `vadd.f32 q0,q1,q2'
+[^:]*:14: Error: selected FPU does not support instruction -- `vadd.i64 q0,q1,q2'
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:16: Error: bad type in SIMD instruction -- `vsub.p8 q0,q1,q2'
+[^:]*:17: Error: selected FPU does not support instruction -- `vsub.f16 q0,q1,q2'
+[^:]*:18: Error: selected FPU does not support instruction -- `vsub.f32 q0,q1,q2'
+[^:]*:19: Error: selected FPU does not support instruction -- `vsub.i64 q0,q1,q2'
+[^:]*:20: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:20: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:20: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:20: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:20: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:20: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:21: Error: bad type in SIMD instruction -- `vadd.p8 q0,q1,r2'
+[^:]*:22: Error: selected FPU does not support instruction -- `vadd.f16 q0,q1,r2'
+[^:]*:23: Error: selected FPU does not support instruction -- `vadd.f32 q0,q1,r2'
+[^:]*:24: Error: selected FPU does not support instruction -- `vadd.i64 q0,q1,r2'
+[^:]*:25: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:25: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:25: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:25: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:25: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:25: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:26: Error: bad type in SIMD instruction -- `vsub.p8 q0,q1,r2'
+[^:]*:27: Error: selected FPU does not support instruction -- `vsub.f16 q0,q1,r2'
+[^:]*:28: Error: selected FPU does not support instruction -- `vsub.f32 q0,q1,r2'
+[^:]*:29: Error: selected FPU does not support instruction -- `vsub.i64 q0,q1,r2'
+[^:]*:30: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:30: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:30: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:30: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:30: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:30: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:31: Error: bad type in SIMD instruction -- `vabd.p8 q0,q1,q2'
+[^:]*:32: Error: selected FPU does not support instruction -- `vabd.f16 q0,q1,q2'
+[^:]*:33: Error: selected FPU does not support instruction -- `vabd.f32 q0,q1,q2'
+[^:]*:34: Error: bad type in SIMD instruction -- `vabd.i64 q0,q1,q2'
+[^:]*:35: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:35: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:35: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:35: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:35: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:35: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:36: Warning: instruction is UNPREDICTABLE with SP operand
+[^:]*:37: Warning: instruction is UNPREDICTABLE with SP operand
+[^:]*:38: Warning: instruction is UNPREDICTABLE with PC operand
+[^:]*:39: Warning: instruction is UNPREDICTABLE with PC operand
diff --git a/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.s b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.s
new file mode 100644
index 0000000000000000000000000000000000000000..809937d4c1ed99974b1670ce95edb70c2edf8ed1
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-1.s
@@ -0,0 +1,39 @@ 
+.macro cond op, lastreg, size
+.irp cond, eq, ne, gt, ge, lt, le
+it \cond
+\op\size q0, q1, \lastreg
+.endr
+.endm
+
+.syntax unified
+.text
+.thumb
+vadd.p8 q0, q1, q2
+vadd.f16 q0, q1, q2
+vadd.f32 q0, q1, q2
+vadd.i64 q0, q1, q2
+cond vadd, q2, .i32
+vsub.p8 q0, q1, q2
+vsub.f16 q0, q1, q2
+vsub.f32 q0, q1, q2
+vsub.i64 q0, q1, q2
+cond vsub, q2, .i32
+vadd.p8 q0, q1, r2
+vadd.f16 q0, q1, r2
+vadd.f32 q0, q1, r2
+vadd.i64 q0, q1, r2
+cond vadd, r2, .i32
+vsub.p8 q0, q1, r2
+vsub.f16 q0, q1, r2
+vsub.f32 q0, q1, r2
+vsub.i64 q0, q1, r2
+cond vsub, r2, .i32
+vabd.p8 q0, q1, q2
+vabd.f16 q0, q1, q2
+vabd.f32 q0, q1, q2
+vabd.i64 q0, q1, q2
+cond vabd, q2, .s32
+vadd.i32 q0, q1, sp
+vsub.i32 q0, q1, sp
+vadd.i32 q0, q1, pc
+vsub.i32 q0, q1, pc
diff --git a/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.d b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.d
new file mode 100644
index 0000000000000000000000000000000000000000..602dc3276ae1f916196f777ad9d934fbe229ec68
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.d
@@ -0,0 +1,6 @@ 
+#name: bad MVE FP VADD, VSUB and VABD instructions
+#as: -march=armv8.1-m.main+mve.fp
+#error_output: mve-vaddsubabd-bad-2.l
+
+.*: +file format .*arm.*
+
diff --git a/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.l b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.l
new file mode 100644
index 0000000000000000000000000000000000000000..77d634c45618edabd6f95174741dd66772b111cb
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.l
@@ -0,0 +1,46 @@ 
+[^:]*: Assembler messages:
+[^:]*:13: Error: bad type in SIMD instruction -- `vadd.p8 q0,q1,q2'
+[^:]*:14: Error: selected FPU does not support instruction -- `vadd.i64 q0,q1,q2'
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:16: Error: bad type in SIMD instruction -- `vsub.p8 q0,q1,q2'
+[^:]*:17: Error: selected FPU does not support instruction -- `vsub.i64 q0,q1,q2'
+[^:]*:18: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:18: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:18: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:18: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:18: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:18: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:19: Error: bad type in SIMD instruction -- `vadd.p8 q0,q1,r2'
+[^:]*:20: Error: selected FPU does not support instruction -- `vadd.i64 q0,q1,r2'
+[^:]*:21: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:21: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:21: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:21: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:21: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:21: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:22: Error: bad type in SIMD instruction -- `vsub.p8 q0,q1,r2'
+[^:]*:23: Error: selected FPU does not support instruction -- `vsub.i64 q0,q1,r2'
+[^:]*:24: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:24: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:24: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:24: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:24: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:24: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:25: Error: bad type in SIMD instruction -- `vabd.p8 q0,q1,q2'
+[^:]*:26: Error: bad type in SIMD instruction -- `vabd.i64 q0,q1,q2'
+[^:]*:27: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:27: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:27: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:27: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:27: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:27: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:28: Warning: instruction is UNPREDICTABLE with SP operand
+[^:]*:29: Warning: instruction is UNPREDICTABLE with SP operand
+[^:]*:30: Warning: instruction is UNPREDICTABLE with PC operand
+[^:]*:31: Warning: instruction is UNPREDICTABLE with PC operand
+
diff --git a/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.s b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.s
new file mode 100644
index 0000000000000000000000000000000000000000..15242909d9657ba56709c128d96928e677af2036
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vaddsubabd-bad-2.s
@@ -0,0 +1,32 @@ 
+.macro cond op, lastreg
+.irp cond, eq, ne, gt, ge, lt, le
+it \cond
+\op\().f32 q0, q1, \lastreg
+.endr
+.endm
+
+
+
+.syntax unified
+.text
+.thumb
+vadd.p8 q0, q1, q2
+vadd.i64 q0, q1, q2
+cond vadd, q2
+vsub.p8 q0, q1, q2
+vsub.i64 q0, q1, q2
+cond vsub, q2
+vadd.p8 q0, q1, r2
+vadd.i64 q0, q1, r2
+cond vadd, r2
+vsub.p8 q0, q1, r2
+vsub.i64 q0, q1, r2
+cond vsub, r2
+vabd.p8 q0, q1, q2
+vabd.i64 q0, q1, q2
+cond vabd, q2
+vadd.i32 q0, q1, sp
+vsub.i32 q0, q1, sp
+vadd.i32 q0, q1, pc
+vsub.i32 q0, q1, pc
+
diff --git a/gas/testsuite/gas/arm/mve-vpst-bad.d b/gas/testsuite/gas/arm/mve-vpst-bad.d
new file mode 100644
index 0000000000000000000000000000000000000000..f328abc80f188ded287273fe04edee4ed42e4f15
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vpst-bad.d
@@ -0,0 +1,6 @@ 
+#name: bad VPST instructions
+#as: -march=armv8.1-m.main+mve
+#error_output: mve-vpst-bad.l
+
+.*: +file format .*arm.*
+
diff --git a/gas/testsuite/gas/arm/mve-vpst-bad.l b/gas/testsuite/gas/arm/mve-vpst-bad.l
new file mode 100644
index 0000000000000000000000000000000000000000..35a56890c85fd44ac22e47ea97b3bddd6811f14e
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vpst-bad.l
@@ -0,0 +1,19 @@ 
+[^:]*: Assembler messages:
+[^:]*:6: Error: syntax error -- `vpsteq'
+[^:]*:9: Error: vector predicated instruction should be in VPT/VPST block -- `vaddt.i32 q0,q1,q2'
+[^:]*:12: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:15: Error: syntax error -- `vaddeq.i32 q0,q1,q2'
+[^:]*:21: Error: instruction missing MVE vector predication code -- `vadd.i32 q0,q1,q2'
+[^:]*:23: Error: syntax error -- `vaddeq.i32 q0,q1,q2'
+[^:]*:25: Error: vector predicated instruction should be in VPT/VPST block -- `vaddt.i32 q0,q1,q2'
+[^:]*:33: Error: bad instruction `addt r0,r0,r1'
+[^:]*:37: Error: instruction not allowed in IT block -- `add r0,r0,r1'
+[^:]*:40: Error: thumb conditional instruction should be in IT block -- `addeq r0,r0,r1'
+[^:]*:43: Error: bad instruction `addt r0,r0,r1'
+[^:]*:47: Warning: instruction is UNPREDICTABLE in a VPT block
+[^:]*:49: Error: thumb conditional instruction should be in IT block -- `addeq r0,r0,r1'
+[^:]*:51: Error: bad instruction `addt r0,r0,r1'
+[^:]*:55: Warning: instruction is UNPREDICTABLE in an IT block
+[^:]*:62: Error: incorrect condition in VPT/VPST block -- `vaddt.i32 q0,q1,q2'
+[^:]*:65: Error: syntax error -- `vaddeq.i32 q0,q1,q2'
+[^:]*:68: Warning: section '.text' finished with an open VPT/VPST block.
diff --git a/gas/testsuite/gas/arm/mve-vpst-bad.s b/gas/testsuite/gas/arm/mve-vpst-bad.s
new file mode 100644
index 0000000000000000000000000000000000000000..f41d66bfccb136cf042a7d82a13f8b639b7cc556
--- /dev/null
+++ b/gas/testsuite/gas/arm/mve-vpst-bad.s
@@ -0,0 +1,68 @@ 
+.syntax unified
+.text
+.thumb
+@ Case 1
+it eq
+vpsteq
+@ Case 2
+it eq
+vaddt.i32 q0, q1, q2
+@ Case 3
+it eq
+vadd.i32 q0, q1, q2
+@ Case 4
+vpst
+vaddeq.i32 q0, q1, q2
+@ Case 5
+vpst
+vaddt.i32 q0, q1, q2
+@ Case 6
+vpst
+vadd.i32 q0, q1, q2
+@ Case 7
+vaddeq.i32 q0, q1, q2
+@ Case 8
+vaddt.i32 q0, q1, q2
+@ Case 9
+vadd.i32 q0, q1, q2
+@ Case 10
+it eq
+addeq r0, r0, r1
+@ Case 11
+it eq
+addt r0, r0, r1
+addeq r0, r0, r1
+@ Case 12
+it eq
+add r0, r0, r1
+@ Case 13
+vpst
+addeq r0, r0, r1
+@ Case 14
+vpst
+addt r0, r0, r1
+vaddt.i32 q0, q0, q1
+@ Case 15
+vpst
+add r0, r0, r1
+@ Case 16
+addeq r0, r0, r1
+@ Case 17
+addt r0, r0, r1
+@ Case 18
+add r0, r0, r1
+it le
+vpstete
+vaddt.i32 q0, q1, q2
+vadde.i32 q0, q1, q2
+vaddt.i32 q0, q1, q2
+vadde.i32 q0, q1, q2
+vpste
+vaddt.i32 q0, q1, q2
+vaddt.i32 q0, q1, q2
+vpste
+vaddt.i32 q0, q1, q2
+vaddeq.i32 q0, q1, q2
+vpstet
+vaddt.i32 q0, q1, q2
+vadde.i32 q0, q1, q2
diff --git a/gas/testsuite/gas/arm/neon-ldst-es-bad.l b/gas/testsuite/gas/arm/neon-ldst-es-bad.l
index b0c854eee715cb15b14c418a157d0b5c6874b92f..84758c6b2b338be20160e63586b01b7fca4ef35b 100644
--- a/gas/testsuite/gas/arm/neon-ldst-es-bad.l
+++ b/gas/testsuite/gas/arm/neon-ldst-es-bad.l
@@ -1,12 +1,12 @@ 
 [^:]*: Assembler messages:
-[^:]*:2: Error: bad type in Neon instruction -- `vld1\.64 {d0\[1\]},\[r0\]'
-[^:]*:3: Error: bad type in Neon instruction -- `vld1\.64 {d0\[\]},\[r0\]'
-[^:]*:4: Error: bad type in Neon instruction -- `vld2\.64 {d0\[1\]},\[r0\]'
-[^:]*:5: Error: bad type in Neon instruction -- `vld2\.64 {d0\[\]},\[r0\]'
+[^:]*:2: Error: bad type in SIMD instruction -- `vld1\.64 {d0\[1\]},\[r0\]'
+[^:]*:3: Error: bad type in SIMD instruction -- `vld1\.64 {d0\[\]},\[r0\]'
+[^:]*:4: Error: bad type in SIMD instruction -- `vld2\.64 {d0\[1\]},\[r0\]'
+[^:]*:5: Error: bad type in SIMD instruction -- `vld2\.64 {d0\[\]},\[r0\]'
 [^:]*:6: Error: bad element type for instruction -- `vld2\.64 {d0-d1},\[r0\]'
-[^:]*:7: Error: bad type in Neon instruction -- `vld3\.64 {d0\[1\]},\[r0\]'
-[^:]*:8: Error: bad type in Neon instruction -- `vld3\.64 {d0\[\]},\[r0\]'
+[^:]*:7: Error: bad type in SIMD instruction -- `vld3\.64 {d0\[1\]},\[r0\]'
+[^:]*:8: Error: bad type in SIMD instruction -- `vld3\.64 {d0\[\]},\[r0\]'
 [^:]*:9: Error: bad element type for instruction -- `vld3\.64 {d0-d2},\[r0\]'
-[^:]*:10: Error: bad type in Neon instruction -- `vld4\.64 {d0\[1\]},\[r0\]'
-[^:]*:11: Error: bad type in Neon instruction -- `vld4\.64 {d0\[\]},\[r0\]'
+[^:]*:10: Error: bad type in SIMD instruction -- `vld4\.64 {d0\[1\]},\[r0\]'
+[^:]*:11: Error: bad type in SIMD instruction -- `vld4\.64 {d0\[\]},\[r0\]'
 [^:]*:12: Error: bad element type for instruction -- `vld4\.64 {d0-d3},\[r0\]'