[ARM,12/16] Scalar Low Overhead loop instructions for Armv8.1-M Mainline

Message ID 1b0a8ea3-6226-2d81-7453-542e9c15f6bb@arm.com
State New
Series
  • Add support for Armv8.1-M Mainline

Commit Message

Andre Vieira (lists) April 4, 2019, 1:41 p.m.
Hi

This patch is part of a series of patches to add support for Armv8.1-M 
Mainline instructions to binutils.
This patch adds support for the Scalar low overhead loop instructions:
LE
WLS
DLS

We also add a new assembler-resolvable relocation to the
bfd_reloc_code_real enum for the 12-bit branch offset used in these
instructions.
Testing: Builds successfully. Added new tests for valid and invalid
instruction operands. The testsuite shows no regressions when run for
arm-none-eabi targets.
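
As a rough illustration of the 12-bit offset described above (a sketch
only, not the code in the patch): the names loop_offset_ok and
loop_encode are hypothetical, and the range check is an assumption
inferred from the error cases in the new tests (the offset must be
positive, even and below 4096). The field layout follows the immh/imml
packing used in tc-arm.c in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range check (an assumption, not the gas helper):
   the offset must be positive, even and fit in 12 bits.  */
static int
loop_offset_ok (int32_t off)
{
  return off > 0 && off < (1 << 12) && (off & 1) == 0;
}

/* Pack OFF into the immh (bits 1..10) and imml (bit 11) fields of the
   32-bit opcode BASE.  For LE the caller is assumed to have negated
   the backwards offset already, so OFF is always positive here.  */
static uint32_t
loop_encode (uint32_t base, int32_t off)
{
  assert (loop_offset_ok (off));

  uint32_t immh = ((uint32_t) off & 0x00000ffcu) >> 2;
  uint32_t imml = ((uint32_t) off & 0x00000002u) >> 1;

  return base | (imml << 11) | (immh << 1);
}
```

With the base opcodes from this patch, loop_encode (0xf02fc001, 12)
gives 0xf02fc007, which agrees with the expected encoding of
`le #-12' in armv8_1-m-loloop.d.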

Thanks
Sudi


ChangeLog entries are as follows:

*** bfd/ChangeLog ***

2019-04-04  Sudakshina Das  <sudi.das@arm.com>

	* reloc.c (BFD_RELOC_ARM_THUMB_LOOP12): New.
	* bfd-in2.h: Regenerated.
	* libbfd.h: Regenerated.

*** gas/ChangeLog ***

2019-04-04  Sudakshina Das  <sudi.das@arm.com>

	* config/tc-arm.c (operand_parse_code): Add OP_LR and OP_oLR
	for the LR operand and optional LR operand.
	(parse_operands): Add switch cases for OP_LR and OP_oLR for
	both type checking and value checking.
	(encode_thumb32_addr_mode): New entries for DLS, WLS and LE.
	(v8_1_loop_reloc): New helper function for handling labels
	for the low overhead loop instructions.
	(do_t_loloop): New function to encode DLS, WLS and LE.
	(insns): New entries for WLS, DLS and LE.
	(md_pcrel_from_section): New switch case
	for BFD_RELOC_ARM_THUMB_LOOP12.
	(md_apply_fix): Likewise.
	(tc_gen_reloc): Likewise.
	* testsuite/gas/arm/armv8_1-m-loloop.s: New.
	* testsuite/gas/arm/armv8_1-m-loloop.d: New.
	* testsuite/gas/arm/armv8_1-m-loloop-bad.s: New.
	* testsuite/gas/arm/armv8_1-m-loloop-bad.d: New.
	* testsuite/gas/arm/armv8_1-m-loloop-bad.l: New.

*** opcodes/ChangeLog ***

2019-04-04  Sudakshina Das  <sudi.das@arm.com>

	* arm-dis.c (print_insn_thumb32): Updated to accept new %P
	and %Q patterns.
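
For reference, a minimal sketch of how the two disassembler patterns
recover the branch target from the encoded fields. The function name
loop_target and the is_le flag are hypothetical; the bit masks and the
pc + 4 base mirror the %Q (forward, WLS) and %P (backward, LE) cases
added to print_insn_thumb32, with addresses assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild the 12-bit offset from INSN and apply it to PC + 4.
   IS_LE selects the backward (LE, like %P) or forward (WLS, like %Q)
   direction.  Arithmetic wraps modulo 2^32.  */
static uint32_t
loop_target (uint32_t pc, uint32_t insn, int is_le)
{
  uint32_t immh = (insn & 0x000007feu) >> 1;
  uint32_t imml = (insn & 0x00000800u) >> 11;
  uint32_t imm32 = (immh << 2) | (imml << 1);

  /* Ten bits of immh plus one bit of imml never reach 4096.  */
  assert (imm32 < 4096u);

  return is_le ? pc + 4 - imm32 : pc + 4 + imm32;
}
```

For the `wls lr, r2, .LB1' at address 0 in the new test, the encoding
f042 c00d decodes to 0x1c, and the `le lr, .Lstart' at address 0xc
(f00f c009) decodes back to 0, as in armv8_1-m-loloop.d.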

Comments

Andre Vieira (lists) April 12, 2019, 10:39 a.m. | #1
Hi,

The former patch had an issue with the LE branch value sign flip: it
was not applied for big-endian targets because of an incorrect use of
'md_chars_to_number'.  Replacing it with 'get_thumb32_insn' fixes the
issue.
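
To illustrate why the single 32-bit read only matched on little-endian,
here is a minimal standalone sketch. read_number and read_thumb32 are
hypothetical stand-ins, not the gas functions themselves; the sketch
assumes a 32-bit Thumb instruction is stored as two 16-bit halfwords,
first halfword first, each in target byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read an N-byte number from BUF in the given byte order.  */
static uint32_t
read_number (const unsigned char *buf, size_t n, int big_endian)
{
  uint32_t v = 0;

  assert (buf != NULL && n <= 4);
  for (size_t i = 0; i < n; i++)
    {
      /* Walk from the most significant byte to the least.  */
      size_t idx = big_endian ? i : n - 1 - i;
      v = (v << 8) | buf[idx];
    }
  return v;
}

/* Read a 32-bit Thumb instruction halfword by halfword, so the result
   does not depend on the byte order.  */
static uint32_t
read_thumb32 (const unsigned char *buf, int big_endian)
{
  return (read_number (buf, 2, big_endian) << 16)
	 | read_number (buf + 2, 2, big_endian);
}
```

For `le lr, <label>' (f00f c001) the bytes are 0f f0 01 c0 on
little-endian and f0 0f c0 01 on big-endian.  A single 4-byte read
gives 0xc001f00f on little-endian but 0xf00fc001 on big-endian, so a
comparison against 0xc001f00f only matches on little-endian.  The
halfword-wise read gives 0xf00fc001 in both cases.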

Also fixed a testism.

Is this OK?

Cheers,
Andre

*** bfd/ChangeLog ***

2019-04-12  Sudakshina Das  <sudi.das@arm.com>

	* reloc.c (BFD_RELOC_ARM_THUMB_LOOP12): New.
	* bfd-in2.h: Regenerated.
	* libbfd.h: Regenerated.

*** gas/ChangeLog ***

2019-04-12  Sudakshina Das  <sudi.das@arm.com>
             Andre Vieira  <andre.simoesdiasvieira@arm.com>

	* config/tc-arm.c (operand_parse_code): Add OP_LR and OP_oLR
	for the LR operand and optional LR operand.
	(parse_operands): Add switch cases for OP_LR and OP_oLR for
	both type checking and value checking.
	(encode_thumb32_addr_mode): New entries for DLS, WLS and LE.
	(v8_1_loop_reloc): New helper function for handling labels
	for the low overhead loop instructions.
	(do_t_loloop): New function to encode DLS, WLS and LE.
	(insns): New entries for WLS, DLS and LE.
	(md_pcrel_from_section): New switch case
	for BFD_RELOC_ARM_THUMB_LOOP12.
	(md_apply_fix): Likewise.
	(tc_gen_reloc): Likewise.
	* testsuite/gas/arm/armv8_1-m-loloop.s: New.
	* testsuite/gas/arm/armv8_1-m-loloop.d: New.
	* testsuite/gas/arm/armv8_1-m-loloop-bad.s: New.
	* testsuite/gas/arm/armv8_1-m-loloop-bad.d: New.
	* testsuite/gas/arm/armv8_1-m-loloop-bad.l: New.

*** opcodes/ChangeLog ***

2019-04-12  Sudakshina Das  <sudi.das@arm.com>

	* arm-dis.c (print_insn_thumb32): Updated to accept new %P
	and %Q patterns.

diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h
index 4a3fa75867c814f082ba3ab3079cf60c30ad2b62..540b9f71c181841ea782c63d87d0d5271c864966 100644
--- a/bfd/bfd-in2.h
+++ b/bfd/bfd-in2.h
@@ -3579,6 +3579,9 @@ field in the instruction.  */
 /* ARM 19-bit pc-relative branch for Branch Future Link instruction.  */
   BFD_RELOC_ARM_THUMB_BF19,
 
+/* ARM 12-bit pc-relative branch for Low Overhead Loop instructions.  */
+  BFD_RELOC_ARM_THUMB_LOOP12,
+
 /* Thumb 7-, 9-, 12-, 20-, 23-, and 25-bit pc-relative branches.
 The lowest bit must be zero and is not stored in the instruction.
 Note that the corresponding ELF R_ARM_THM_JUMPnn constant has an
diff --git a/bfd/libbfd.h b/bfd/libbfd.h
index 32080db8c3f6141ae9aa9674c8776694db29905a..f64a8f3892ad3aff5c4570f0281875bce35846a6 100644
--- a/bfd/libbfd.h
+++ b/bfd/libbfd.h
@@ -1534,6 +1534,7 @@ static const char *const bfd_reloc_code_real_names[] = { "@@uninitialized@@",
   "BFD_RELOC_ARM_THUMB_BF17",
   "BFD_RELOC_ARM_THUMB_BF13",
   "BFD_RELOC_ARM_THUMB_BF19",
+  "BFD_RELOC_ARM_THUMB_LOOP12",
   "BFD_RELOC_THUMB_PCREL_BRANCH7",
   "BFD_RELOC_THUMB_PCREL_BRANCH9",
   "BFD_RELOC_THUMB_PCREL_BRANCH12",
diff --git a/bfd/reloc.c b/bfd/reloc.c
index c0e413cd19dbfaf5100143c8879d84cb63ba4a17..e6ba9e265027a6c34a8ad183dd5825a9af9c1f82 100644
--- a/bfd/reloc.c
+++ b/bfd/reloc.c
@@ -3039,6 +3039,11 @@ ENUM
 ENUMDOC
   ARM 19-bit pc-relative branch for Branch Future Link instruction.
 
+ENUM
+  BFD_RELOC_ARM_THUMB_LOOP12
+ENUMDOC
+  ARM 12-bit pc-relative branch for Low Overhead Loop instructions.
+
 ENUM
   BFD_RELOC_THUMB_PCREL_BRANCH7
 ENUMX
diff --git a/gas/config/tc-arm.c b/gas/config/tc-arm.c
index 5e59078890752b0580ae443c7b330e32baecccf6..828dfc1eddce9857a1c038b2ff927d71eccda6e9 100644
--- a/gas/config/tc-arm.c
+++ b/gas/config/tc-arm.c
@@ -6543,6 +6543,10 @@ enum operand_parse_code
   OP_RIWG,	/* iWMMXt wCG register */
   OP_RXA,	/* XScale accumulator register */
 
+  /* New operands for Armv8.1-M Mainline.  */
+  OP_LR,	/* ARM LR register */
+  OP_RRnpcsp_I32, /* ARM register (no BadReg) or literal 1 .. 32 */
+
   OP_REGLST,	/* ARM register list */
   OP_VRSLST,	/* VFP single-precision register list */
   OP_VRDLST,	/* VFP double-precision register list */
@@ -6622,6 +6626,7 @@ enum operand_parse_code
   OP_oI255c,	 /*	  curly-brace enclosed, 0 .. 255 */
 
   OP_oRR,	 /* ARM register */
+  OP_oLR,	 /* ARM LR register */
   OP_oRRnpc,	 /* ARM register, not the PC */
   OP_oRRnpcsp,	 /* ARM register, neither the PC nor the SP (a.k.a. BadReg) */
   OP_oRRw,	 /* ARM register, not r15, optional trailing ! */
@@ -6790,6 +6795,8 @@ parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 	case OP_RRnpc:
 	case OP_RRnpcsp:
 	case OP_oRR:
+	case OP_LR:
+	case OP_oLR:
 	case OP_RR:    po_reg_or_fail (REG_TYPE_RN);	  break;
 	case OP_RCP:   po_reg_or_fail (REG_TYPE_CP);	  break;
 	case OP_RCN:   po_reg_or_fail (REG_TYPE_CN);	  break;
@@ -7307,6 +7314,12 @@ parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 	  inst.operands[i].imm = val;
 	  break;
 
+	case OP_LR:
+	case OP_oLR:
+	  if (inst.operands[i].reg != REG_LR)
+	    inst.error = _("operand must be LR register");
+	  break;
+
 	default:
 	  break;
 	}
@@ -10518,6 +10531,7 @@ encode_thumb32_addr_mode (int i, bfd_boolean is_t, bfd_boolean is_d)
   X(_cpsid, b670, f3af8600),			\
   X(_cpy,   4600, ea4f0000),			\
   X(_dec_sp,80dd, f1ad0d00),			\
+  X(_dls,   0000, f040e001),			\
   X(_eor,   4040, ea800000),			\
   X(_eors,  4040, ea900000),			\
   X(_inc_sp,00dd, f10d0d00),			\
@@ -10530,6 +10544,7 @@ encode_thumb32_addr_mode (int i, bfd_boolean is_t, bfd_boolean is_d)
   X(_ldr_pc,4800, f85f0000),			\
   X(_ldr_pc2,4800, f85f0000),			\
   X(_ldr_sp,9800, f85d0000),			\
+  X(_le,    0000, f00fc001),			\
   X(_lsl,   0000, fa00f000),			\
   X(_lsls,  0000, fa10f000),			\
   X(_lsr,   0800, fa20f000),			\
@@ -10571,6 +10586,7 @@ encode_thumb32_addr_mode (int i, bfd_boolean is_t, bfd_boolean is_d)
   X(_yield, bf10, f3af8001),			\
   X(_wfe,   bf20, f3af8002),			\
   X(_wfi,   bf30, f3af8003),			\
+  X(_wls,   0000, f040c001),			\
   X(_sev,   bf40, f3af8004),                    \
   X(_sevl,  bf50, f3af8005),			\
   X(_udf,   de00, f7f0a000)
@@ -13434,6 +13450,64 @@ do_t_branch_future (void)
     }
 }
 
+/* Helper function for do_t_loloop to handle relocations.  */
+static void
+v8_1_loop_reloc (int is_le)
+{
+  if (inst.relocs[0].exp.X_op == O_constant)
+    {
+      int value = inst.relocs[0].exp.X_add_number;
+      value = (is_le) ? -value : value;
+
+      if (v8_1_branch_value_check (value, 12, FALSE) == FAIL)
+	as_bad (BAD_BRANCH_OFF);
+
+      int imml, immh;
+
+      immh = (value & 0x00000ffc) >> 2;
+      imml = (value & 0x00000002) >> 1;
+
+      inst.instruction |= (imml << 11) | (immh << 1);
+    }
+  else
+    {
+      inst.relocs[0].type = BFD_RELOC_ARM_THUMB_LOOP12;
+      inst.relocs[0].pc_rel = 1;
+    }
+}
+
+/* To handle the Scalar Low Overhead Loop instructions
+   in Armv8.1-M Mainline.  */
+static void
+do_t_loloop (void)
+{
+  unsigned long insn = inst.instruction;
+
+  set_it_insn_type (OUTSIDE_IT_INSN);
+  inst.instruction = THUMB_OP32 (inst.instruction);
+
+  switch (insn)
+    {
+    case T_MNEM_le:
+      /* le <label>.  */
+      if (!inst.operands[0].present)
+	inst.instruction |= 1 << 21;
+
+      v8_1_loop_reloc (TRUE);
+      break;
+
+    case T_MNEM_wls:
+      v8_1_loop_reloc (FALSE);
+      /* Fall through.  */
+    case T_MNEM_dls:
+      constraint (inst.operands[1].isreg != 1, BAD_ARGS);
+      inst.instruction |= (inst.operands[1].reg << 16);
+      break;
+
+    default: abort();
+    }
+}
+
 /* Neon instruction encoder helpers.  */
 
 /* Encodings for the different types for various Neon opcodes.  */
@@ -21756,6 +21830,10 @@ static const struct asm_opcode insns[] =
  toC("bfx",    _bfx,	2, (EXPs, RRnpcsp),	     t_branch_future),
  toC("bfl",    _bfl,	2, (EXPs, EXPs),	     t_branch_future),
  toC("bflx",   _bflx,	2, (EXPs, RRnpcsp),	     t_branch_future),
+
+ toU("dls", _dls, 2, (LR, RRnpcsp),	 t_loloop),
+ toU("wls", _wls, 3, (LR, RRnpcsp, EXP), t_loloop),
+ toU("le",  _le,  2, (oLR, EXP),	 t_loloop),
 };
 #undef ARM_VARIANT
 #undef THUMB_VARIANT
@@ -22996,6 +23074,7 @@ md_pcrel_from_section (fixS * fixP, segT seg)
     case BFD_RELOC_ARM_THUMB_BF17:
     case BFD_RELOC_ARM_THUMB_BF19:
     case BFD_RELOC_ARM_THUMB_BF13:
+    case BFD_RELOC_ARM_THUMB_LOOP12:
       return base + 4;
 
     case BFD_RELOC_THUMB_PCREL_BRANCH23:
@@ -25025,6 +25104,39 @@ md_apply_fix (fixS *	fixP,
 	}
       break;
 
+    case BFD_RELOC_ARM_THUMB_LOOP12:
+      if (fixP->fx_addsy
+	  && (S_GET_SEGMENT (fixP->fx_addsy) == seg)
+	  && !S_FORCE_RELOC (fixP->fx_addsy, TRUE)
+	  && ARM_IS_FUNC (fixP->fx_addsy)
+	  && ARM_CPU_HAS_FEATURE (selected_cpu, arm_ext_v8_1m_main))
+	{
+	  /* Force a relocation for a branch 12 bits wide.  */
+	  fixP->fx_done = 0;
+	}
+
+      bfd_vma insn = get_thumb32_insn (buf);
+      /* le lr, <label> or le <label> */
+      if (((insn & 0xffffffff) == 0xf00fc001)
+	  || ((insn & 0xffffffff) == 0xf02fc001))
+	value = -value;
+
+      if (v8_1_branch_value_check (value, 12, FALSE) == FAIL)
+	as_bad_where (fixP->fx_file, fixP->fx_line,
+		      BAD_BRANCH_OFF);
+      if (fixP->fx_done || !seg->use_rela_p)
+	{
+	  addressT imml, immh;
+
+	  immh = (value & 0x00000ffc) >> 2;
+	  imml = (value & 0x00000002) >> 1;
+
+	  newval  = md_chars_to_number (buf + THUMB_SIZE, THUMB_SIZE);
+	  newval |= (imml << 11) | (immh << 1);
+	  md_number_to_chars (buf + THUMB_SIZE, newval, THUMB_SIZE);
+	}
+      break;
+
     case BFD_RELOC_ARM_V4BX:
       /* This will need to go in the object file.  */
       fixP->fx_done = 0;
@@ -25241,6 +25353,7 @@ tc_gen_reloc (asection *section, fixS *fixp)
 
     case BFD_RELOC_THUMB_PCREL_BRANCH5:
     case BFD_RELOC_THUMB_PCREL_BFCSEL:
+    case BFD_RELOC_ARM_THUMB_LOOP12:
       as_bad_where (fixp->fx_file, fixp->fx_line,
 		    _("%s used for a symbol not defined in the same file"),
 		    bfd_get_reloc_code_name (fixp->fx_r_type));
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.d b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.d
new file mode 100644
index 0000000000000000000000000000000000000000..d1f2a8dfae4fdcb7ca031f8d198d3e5091697391
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.d
@@ -0,0 +1,4 @@
+#name: Invalid Armv8.1-M Mainline Low Overhead Loop instructions
+#source: armv8_1-m-loloop-bad.s
+#as: -march=armv8.1-m.main
+#error_output: armv8_1-m-loloop-bad.l
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.l b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.l
new file mode 100644
index 0000000000000000000000000000000000000000..691917ebdcefb33aadf8e25314a2aaa4dbc13f86
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.l
@@ -0,0 +1,7 @@
+.*: Assembler messages:
+.*:5: Error: operand must be LR register -- `wls r1,r2,.LB1'
+.*:6: Error: operand must be LR register -- `dls r2,r2'
+.*:7: Error: r15 not allowed here -- `dls lr,pc'
+.*:8: Error: branch out of range or not a multiple of 2
+.*:9: Error: branch out of range or not a multiple of 2
+.*:10: Error: branch out of range or not a multiple of 2
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.s b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.s
new file mode 100644
index 0000000000000000000000000000000000000000..b4f19625db1406e503c2d80ef19162c2d0c1f27e
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.s
@@ -0,0 +1,12 @@
+	.syntax unified
+	.text
+	.thumb
+foo:
+	wls r1, r2, .LB1
+	dls r2, r2
+	dls lr, pc
+	le lr, #4096
+	le #-4098
+	le #-4095
+.LB1:
+	mov r3, r2
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop.d b/gas/testsuite/gas/arm/armv8_1-m-loloop.d
new file mode 100644
index 0000000000000000000000000000000000000000..1e02b82651f7127c5c98d3518a0a653445db4c9a
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop.d
@@ -0,0 +1,17 @@
+#name: Valid Armv8.1-M Mainline Low Overhead loop instructions
+#source: armv8_1-m-loloop.s
+#as: -march=armv8.1-m.main
+#objdump: -dr --prefix-addresses --show-raw-insn
+
+.*: +file format .*arm.*
+
+Disassembly of section .text:
+0[0-9a-f]+ <[^>]+> f042 c00d 	wls	lr, r2, 0000001c <foo\+0x1c>
+0[0-9a-f]+ <[^>]+> f042 e001 	dls	lr, r2
+0[0-9a-f]+ <[^>]+> f04e e001 	dls	lr, lr
+0[0-9a-f]+ <[^>]+> f00f c009 	le	lr, 00000000 <foo>
+0[0-9a-f]+ <[^>]+> f02f c00b 	le	00000000 <foo>
+0[0-9a-f]+ <[^>]+> f00f c24b 	le	lr, fffffb84 <foo\+0xfffffb84>
+0[0-9a-f]+ <[^>]+> f02f c007 	le	00000010 <foo\+0x10>
+0[0-9a-f]+ <[^>]+> 4613      	mov	r3, r2
+#...
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop.s b/gas/testsuite/gas/arm/armv8_1-m-loloop.s
new file mode 100644
index 0000000000000000000000000000000000000000..8fb87e40aa5d17955903ff473e55c0f37eac50bb
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop.s
@@ -0,0 +1,14 @@
+	.syntax unified
+	.text
+	.thumb
+foo:
+.Lstart:
+	wls lr, r2, .LB1
+	dls lr, r2
+	dls lr, lr
+	le lr, .Lstart
+	le .Lstart
+	le lr, #-1172
+	le #-12
+.LB1:
+	mov r3, r2
diff --git a/opcodes/arm-dis.c b/opcodes/arm-dis.c
index b4865c1a42a04bb6a9b156d7369c065ca351efb4..2cf9507fbb5fe6505df66d2cd0c5abda2d53bdf9 100644
--- a/opcodes/arm-dis.c
+++ b/opcodes/arm-dis.c
@@ -2718,6 +2718,8 @@ static const struct opcode16 thumb_opcodes[] =
        %W		print an offset for BF instruction
        %Y		print an offset for BFL instruction
        %Z		print an offset for BFCSEL instruction
+       %Q		print an offset for Low Overhead Loop instructions
+       %P		print an offset for Low Overhead Loop end instructions
        %b		print a conditional branch offset
        %B		print an unconditional branch offset
        %s		print the shift field of an SSAT instruction
@@ -2751,6 +2753,15 @@ static const struct opcode16 thumb_opcodes[] =
 static const struct opcode32 thumb32_opcodes[] =
 {
   /* Armv8.1-M Mainline instructions.  */
+  {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
+    0xf040c001, 0xfff0f001, "wls\tlr, %16-19S, %Q"},
+  {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
+    0xf040e001, 0xfff0ffff, "dls\tlr, %16-19S"},
+  {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
+    0xf02fc001, 0xfffff001, "le\t%P"},
+  {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
+    0xf00fc001, 0xfffff001, "le\tlr, %P"},
+
   {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
     0xf040e001, 0xf860f001, "bf%c\t%G, %W"},
   {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
@@ -5945,6 +5956,32 @@ print_insn_thumb32 (bfd_vma pc, struct disassemble_info *info, long given)
 		}
 		break;
 
+	      case 'Q':
+		{
+		  unsigned int immh = (given & 0x000007feu) >> 1;
+		  unsigned int imml = (given & 0x00000800u) >> 11;
+		  bfd_vma imm32 = 0;
+
+		  imm32 |= immh << 2;
+		  imm32 |= imml << 1;
+
+		  info->print_address_func (pc + 4 + imm32, info);
+		}
+		break;
+
+	      case 'P':
+		{
+		  unsigned int immh = (given & 0x000007feu) >> 1;
+		  unsigned int imml = (given & 0x00000800u) >> 11;
+		  bfd_vma imm32 = 0;
+
+		  imm32 |= immh << 2;
+		  imm32 |= imml << 1;
+
+		  info->print_address_func (pc + 4 - imm32, info);
+		}
+		break;
+
 	      case 'b':
 		{
 		  unsigned int S = (given & 0x04000000u) >> 26;

Patch

diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h
index 4a3fa75867c814f082ba3ab3079cf60c30ad2b62..540b9f71c181841ea782c63d87d0d5271c864966 100644
--- a/bfd/bfd-in2.h
+++ b/bfd/bfd-in2.h
@@ -3579,6 +3579,9 @@  field in the instruction.  */
 /* ARM 19-bit pc-relative branch for Branch Future Link instruction.  */
   BFD_RELOC_ARM_THUMB_BF19,
 
+/* ARM 12-bit pc-relative branch for Low Overhead Loop instructions.  */
+  BFD_RELOC_ARM_THUMB_LOOP12,
+
 /* Thumb 7-, 9-, 12-, 20-, 23-, and 25-bit pc-relative branches.
 The lowest bit must be zero and is not stored in the instruction.
 Note that the corresponding ELF R_ARM_THM_JUMPnn constant has an
diff --git a/bfd/libbfd.h b/bfd/libbfd.h
index 32080db8c3f6141ae9aa9674c8776694db29905a..f64a8f3892ad3aff5c4570f0281875bce35846a6 100644
--- a/bfd/libbfd.h
+++ b/bfd/libbfd.h
@@ -1534,6 +1534,7 @@  static const char *const bfd_reloc_code_real_names[] = { "@@uninitialized@@",
   "BFD_RELOC_ARM_THUMB_BF17",
   "BFD_RELOC_ARM_THUMB_BF13",
   "BFD_RELOC_ARM_THUMB_BF19",
+  "BFD_RELOC_ARM_THUMB_LOOP12",
   "BFD_RELOC_THUMB_PCREL_BRANCH7",
   "BFD_RELOC_THUMB_PCREL_BRANCH9",
   "BFD_RELOC_THUMB_PCREL_BRANCH12",
diff --git a/bfd/reloc.c b/bfd/reloc.c
index c0e413cd19dbfaf5100143c8879d84cb63ba4a17..e6ba9e265027a6c34a8ad183dd5825a9af9c1f82 100644
--- a/bfd/reloc.c
+++ b/bfd/reloc.c
@@ -3039,6 +3039,11 @@  ENUM
 ENUMDOC
   ARM 19-bit pc-relative branch for Branch Future Link instruction.
 
+ENUM
+  BFD_RELOC_ARM_THUMB_LOOP12
+ENUMDOC
+  ARM 12-bit pc-relative branch for Low Overhead Loop instructions.
+
 ENUM
   BFD_RELOC_THUMB_PCREL_BRANCH7
 ENUMX
diff --git a/gas/config/tc-arm.c b/gas/config/tc-arm.c
index 592d658efbe7bb8169353f49e8faf6dd396647b0..ca143a9c7d1ffd0c73dafc972f2c50aed631665c 100644
--- a/gas/config/tc-arm.c
+++ b/gas/config/tc-arm.c
@@ -6543,6 +6543,10 @@  enum operand_parse_code
   OP_RIWG,	/* iWMMXt wCG register */
   OP_RXA,	/* XScale accumulator register */
 
+  /* New operands for Armv8.1-M Mainline.  */
+  OP_LR,	/* ARM LR register */
+  OP_RRnpcsp_I32, /* ARM register (no BadReg) or literal 1 .. 32 */
+
   OP_REGLST,	/* ARM register list */
   OP_VRSLST,	/* VFP single-precision register list */
   OP_VRDLST,	/* VFP double-precision register list */
@@ -6622,6 +6626,7 @@  enum operand_parse_code
   OP_oI255c,	 /*	  curly-brace enclosed, 0 .. 255 */
 
   OP_oRR,	 /* ARM register */
+  OP_oLR,	 /* ARM LR register */
   OP_oRRnpc,	 /* ARM register, not the PC */
   OP_oRRnpcsp,	 /* ARM register, neither the PC nor the SP (a.k.a. BadReg) */
   OP_oRRw,	 /* ARM register, not r15, optional trailing ! */
@@ -6790,6 +6795,8 @@  parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 	case OP_RRnpc:
 	case OP_RRnpcsp:
 	case OP_oRR:
+	case OP_LR:
+	case OP_oLR:
 	case OP_RR:    po_reg_or_fail (REG_TYPE_RN);	  break;
 	case OP_RCP:   po_reg_or_fail (REG_TYPE_CP);	  break;
 	case OP_RCN:   po_reg_or_fail (REG_TYPE_CN);	  break;
@@ -7307,6 +7314,12 @@  parse_operands (char *str, const unsigned int *pattern, bfd_boolean thumb)
 	  inst.operands[i].imm = val;
 	  break;
 
+	case OP_LR:
+	case OP_oLR:
+	  if (inst.operands[i].reg != REG_LR)
+	    inst.error = _("operand must be LR register");
+	  break;
+
 	default:
 	  break;
 	}
@@ -10518,6 +10531,7 @@  encode_thumb32_addr_mode (int i, bfd_boolean is_t, bfd_boolean is_d)
   X(_cpsid, b670, f3af8600),			\
   X(_cpy,   4600, ea4f0000),			\
   X(_dec_sp,80dd, f1ad0d00),			\
+  X(_dls,   0000, f040e001),			\
   X(_eor,   4040, ea800000),			\
   X(_eors,  4040, ea900000),			\
   X(_inc_sp,00dd, f10d0d00),			\
@@ -10530,6 +10544,7 @@  encode_thumb32_addr_mode (int i, bfd_boolean is_t, bfd_boolean is_d)
   X(_ldr_pc,4800, f85f0000),			\
   X(_ldr_pc2,4800, f85f0000),			\
   X(_ldr_sp,9800, f85d0000),			\
+  X(_le,    0000, f00fc001),			\
   X(_lsl,   0000, fa00f000),			\
   X(_lsls,  0000, fa10f000),			\
   X(_lsr,   0800, fa20f000),			\
@@ -10571,6 +10586,7 @@  encode_thumb32_addr_mode (int i, bfd_boolean is_t, bfd_boolean is_d)
   X(_yield, bf10, f3af8001),			\
   X(_wfe,   bf20, f3af8002),			\
   X(_wfi,   bf30, f3af8003),			\
+  X(_wls,   0000, f040c001),			\
   X(_sev,   bf40, f3af8004),                    \
   X(_sevl,  bf50, f3af8005),			\
   X(_udf,   de00, f7f0a000)
@@ -13434,6 +13450,64 @@  do_t_branch_future (void)
     }
 }
 
+/* Helper function for do_t_loloop to handle relocations.  */
+static void
+v8_1_loop_reloc (int is_le)
+{
+  if (inst.relocs[0].exp.X_op == O_constant)
+    {
+      int value = inst.relocs[0].exp.X_add_number;
+      value = (is_le) ? -value : value;
+
+      if (v8_1_branch_value_check (value, 12, FALSE) == FAIL)
+	as_bad (BAD_BRANCH_OFF);
+
+      int imml, immh;
+
+      immh = (value & 0x00000ffc) >> 2;
+      imml = (value & 0x00000002) >> 1;
+
+      inst.instruction |= (imml << 11) | (immh << 1);
+    }
+  else
+    {
+      inst.relocs[0].type = BFD_RELOC_ARM_THUMB_LOOP12;
+      inst.relocs[0].pc_rel = 1;
+    }
+}
+
+/* To handle the Scalar Low Overhead Loop instructions
+   in Armv8.1-M Mainline.  */
+static void
+do_t_loloop (void)
+{
+  unsigned long insn = inst.instruction;
+
+  set_it_insn_type (OUTSIDE_IT_INSN);
+  inst.instruction = THUMB_OP32 (inst.instruction);
+
+  switch (insn)
+    {
+    case T_MNEM_le:
+      /* le <label>.  */
+      if (!inst.operands[0].present)
+	inst.instruction |= 1 << 21;
+
+      v8_1_loop_reloc (TRUE);
+      break;
+
+    case T_MNEM_wls:
+      v8_1_loop_reloc (FALSE);
+      /* Fall through.  */
+    case T_MNEM_dls:
+      constraint (inst.operands[1].isreg != 1, BAD_ARGS);
+      inst.instruction |= (inst.operands[1].reg << 16);
+      break;
+
+    default: abort();
+    }
+}
+
 /* Neon instruction encoder helpers.  */
 
 /* Encodings for the different types for various Neon opcodes.  */
@@ -21756,6 +21830,10 @@  static const struct asm_opcode insns[] =
  toC("bfx",    _bfx,	2, (EXPs, RRnpcsp),	     t_branch_future),
  toC("bfl",    _bfl,	2, (EXPs, EXPs),	     t_branch_future),
  toC("bflx",   _bflx,	2, (EXPs, RRnpcsp),	     t_branch_future),
+
+ toU("dls", _dls, 2, (LR, RRnpcsp),	 t_loloop),
+ toU("wls", _wls, 3, (LR, RRnpcsp, EXP), t_loloop),
+ toU("le",  _le,  2, (oLR, EXP),	 t_loloop),
 };
 #undef ARM_VARIANT
 #undef THUMB_VARIANT
@@ -22996,6 +23074,7 @@  md_pcrel_from_section (fixS * fixP, segT seg)
     case BFD_RELOC_ARM_THUMB_BF17:
     case BFD_RELOC_ARM_THUMB_BF19:
     case BFD_RELOC_ARM_THUMB_BF13:
+    case BFD_RELOC_ARM_THUMB_LOOP12:
       return base + 4;
 
     case BFD_RELOC_THUMB_PCREL_BRANCH23:
@@ -25025,6 +25104,39 @@  md_apply_fix (fixS *	fixP,
 	}
       break;
 
+    case BFD_RELOC_ARM_THUMB_LOOP12:
+      if (fixP->fx_addsy
+	  && (S_GET_SEGMENT (fixP->fx_addsy) == seg)
+	  && !S_FORCE_RELOC (fixP->fx_addsy, TRUE)
+	  && ARM_IS_FUNC (fixP->fx_addsy)
+	  && ARM_CPU_HAS_FEATURE (selected_cpu, arm_ext_v8_1m_main))
+	{
+	  /* Force a relocation for a branch 12 bits wide.  */
+	  fixP->fx_done = 0;
+	}
+
+      bfd_vma insn = md_chars_to_number (buf, INSN_SIZE);
+      /* le lr, <label> or le <label> */
+      if (((insn & 0xffffffff) == 0xc001f00f)
+	  || ((insn & 0xffffffff) == 0xc001f02f))
+	value = -value;
+
+      if (v8_1_branch_value_check (value, 12, FALSE) == FAIL)
+	as_bad_where (fixP->fx_file, fixP->fx_line,
+		      BAD_BRANCH_OFF);
+      if (fixP->fx_done || !seg->use_rela_p)
+	{
+	  addressT imml, immh;
+
+	  immh = (value & 0x00000ffc) >> 2;
+	  imml = (value & 0x00000002) >> 1;
+
+	  newval  = md_chars_to_number (buf + THUMB_SIZE, THUMB_SIZE);
+	  newval |= (imml << 11) | (immh << 1);
+	  md_number_to_chars (buf + THUMB_SIZE, newval, THUMB_SIZE);
+	}
+      break;
+
     case BFD_RELOC_ARM_V4BX:
       /* This will need to go in the object file.  */
       fixP->fx_done = 0;
@@ -25241,6 +25353,7 @@  tc_gen_reloc (asection *section, fixS *fixp)
 
     case BFD_RELOC_THUMB_PCREL_BRANCH5:
     case BFD_RELOC_THUMB_PCREL_BFCSEL:
+    case BFD_RELOC_ARM_THUMB_LOOP12:
       as_bad_where (fixp->fx_file, fixp->fx_line,
 		    _("%s used for a symbol not defined in the same file"),
 		    bfd_get_reloc_code_name (fixp->fx_r_type));
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.d b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.d
new file mode 100644
index 0000000000000000000000000000000000000000..d1f2a8dfae4fdcb7ca031f8d198d3e5091697391
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.d
@@ -0,0 +1,4 @@ 
+#name: Invalid Armv8.1-M Mainline Low Overhead Loop instructions
+#source: armv8_1-m-loloop-bad.s
+#as: -march=armv8.1-m.main
+#error_output: armv8_1-m-loloop-bad.l
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.l b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.l
new file mode 100644
index 0000000000000000000000000000000000000000..691917ebdcefb33aadf8e25314a2aaa4dbc13f86
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.l
@@ -0,0 +1,7 @@ 
+.*: Assembler messages:
+.*:5: Error: operand must be LR register -- `wls r1,r2,.LB1'
+.*:6: Error: operand must be LR register -- `dls r2,r2'
+.*:7: Error: r15 not allowed here -- `dls lr,pc'
+.*:8: Error: branch out of range or not a multiple of 2
+.*:9: Error: branch out of range or not a multiple of 2
+.*:10: Error: branch out of range or not a multiple of 2
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.s b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.s
new file mode 100644
index 0000000000000000000000000000000000000000..b4f19625db1406e503c2d80ef19162c2d0c1f27e
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop-bad.s
@@ -0,0 +1,12 @@ 
+	.syntax unified
+	.text
+	.thumb
+foo:
+	wls r1, r2, .LB1
+	dls r2, r2
+	dls lr, pc
+	le lr, #4096
+	le #-4098
+	le #-4095
+.LB1:
+	mov r3, r2
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop.d b/gas/testsuite/gas/arm/armv8_1-m-loloop.d
new file mode 100644
index 0000000000000000000000000000000000000000..25661738a338c0054da0ba212e2e3388b966d2b2
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop.d
@@ -0,0 +1,16 @@ 
+#name: Valid Armv8.1-M Mainline Low Overhead loop instructions
+#source: armv8_1-m-loloop.s
+#as: -march=armv8.1-m.main
+#objdump: -dr --prefix-addresses --show-raw-insn
+
+.*: +file format .*arm.*
+
+Disassembly of section .text:
+0[0-9a-f]+ <[^>]+> f042 c00d 	wls	lr, r2, 0000001c <foo\+0x1c>
+0[0-9a-f]+ <[^>]+> f042 e001 	dls	lr, r2
+0[0-9a-f]+ <[^>]+> f04e e001 	dls	lr, lr
+0[0-9a-f]+ <[^>]+> f00f c009 	le	lr, 00000000 <foo>
+0[0-9a-f]+ <[^>]+> f02f c00b 	le	00000000 <foo>
+0[0-9a-f]+ <[^>]+> f00f c24b 	le	lr, fffffb84 <foo\+0xfffffb84>
+0[0-9a-f]+ <[^>]+> f02f c007 	le	00000010 <foo\+0x10>
+0[0-9a-f]+ <[^>]+> 4613      	mov	r3, r2
diff --git a/gas/testsuite/gas/arm/armv8_1-m-loloop.s b/gas/testsuite/gas/arm/armv8_1-m-loloop.s
new file mode 100644
index 0000000000000000000000000000000000000000..8fb87e40aa5d17955903ff473e55c0f37eac50bb
--- /dev/null
+++ b/gas/testsuite/gas/arm/armv8_1-m-loloop.s
@@ -0,0 +1,14 @@ 
+	.syntax unified
+	.text
+	.thumb
+foo:
+.Lstart:
+	wls lr, r2, .LB1
+	dls lr, r2
+	dls lr, lr
+	le lr, .Lstart
+	le .Lstart
+	le lr, #-1172
+	le #-12
+.LB1:
+	mov r3, r2
diff --git a/opcodes/arm-dis.c b/opcodes/arm-dis.c
index b4865c1a42a04bb6a9b156d7369c065ca351efb4..2cf9507fbb5fe6505df66d2cd0c5abda2d53bdf9 100644
--- a/opcodes/arm-dis.c
+++ b/opcodes/arm-dis.c
@@ -2718,6 +2718,8 @@  static const struct opcode16 thumb_opcodes[] =
        %W		print an offset for BF instruction
        %Y		print an offset for BFL instruction
        %Z		print an offset for BFCSEL instruction
+       %Q		print an offset for Low Overhead Loop instructions
+       %P		print an offset for Low Overhead Loop end instructions
        %b		print a conditional branch offset
        %B		print an unconditional branch offset
        %s		print the shift field of an SSAT instruction
@@ -2751,6 +2753,15 @@  static const struct opcode16 thumb_opcodes[] =
 static const struct opcode32 thumb32_opcodes[] =
 {
   /* Armv8.1-M Mainline instructions.  */
+  {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
+    0xf040c001, 0xfff0f001, "wls\tlr, %16-19S, %Q"},
+  {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
+    0xf040e001, 0xfff0ffff, "dls\tlr, %16-19S"},
+  {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
+    0xf02fc001, 0xfffff001, "le\t%P"},
+  {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
+    0xf00fc001, 0xfffff001, "le\tlr, %P"},
+
   {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
     0xf040e001, 0xf860f001, "bf%c\t%G, %W"},
   {ARM_FEATURE_CORE_HIGH (ARM_EXT2_V8_1M_MAIN),
@@ -5945,6 +5956,32 @@  print_insn_thumb32 (bfd_vma pc, struct disassemble_info *info, long given)
 		}
 		break;
 
+	      case 'Q':
+		{
+		  unsigned int immh = (given & 0x000007feu) >> 1;
+		  unsigned int imml = (given & 0x00000800u) >> 11;
+		  bfd_vma imm32 = 0;
+
+		  imm32 |= immh << 2;
+		  imm32 |= imml << 1;
+
+		  info->print_address_func (pc + 4 + imm32, info);
+		}
+		break;
+
+	      case 'P':
+		{
+		  unsigned int immh = (given & 0x000007feu) >> 1;
+		  unsigned int imml = (given & 0x00000800u) >> 11;
+		  bfd_vma imm32 = 0;
+
+		  imm32 |= immh << 2;
+		  imm32 |= imml << 1;
+
+		  info->print_address_func (pc + 4 - imm32, info);
+		}
+		break;
+
 	      case 'b':
 		{
 		  unsigned int S = (given & 0x04000000u) >> 26;