[ARM] Fix PR89222

Message ID DB5PR08MB10309574AEA34670F820784F83640@DB5PR08MB1030.eurprd08.prod.outlook.com
State New
Headers show
Series
  • [ARM] Fix PR89222
Related show

Commit Message

Wilco Dijkstra Feb. 11, 2019, 5:35 p.m.
The GCC optimizer can generate symbols with non-zero offset from simple
if-statements. Bit zero is used for the Arm/Thumb state bit, so relocations
with offsets fail if it changes bit zero and the relocation forces bit zero
to true.  The fix is to disable offsets on function pointer symbols.  

ARMv5te bootstrap OK, regression tests pass. OK for commit?

ChangeLog:
2019-02-06  Wilco Dijkstra  <wdijkstr@arm.com>

    gcc/
	PR target/89222
	* config/arm/arm.md (movsi): Use arm_cannot_force_const_mem
	to decide when to split off an offset from a symbol.
	* config/arm/arm.c (arm_cannot_force_const_mem): Disallow offsets
	in function symbols.
	* config/arm/arm-protos.h (arm_cannot_force_const_mem): Add.

    testsuite/
	PR target/89222
	* gcc.target/arm/pr89222.c: Add new test.

--

Comments

Alexander Monakov Feb. 11, 2019, 6:12 p.m. | #1
On Mon, 11 Feb 2019, Wilco Dijkstra wrote:

> The GCC optimizer can generate symbols with non-zero offset from simple

> if-statements. Bit zero is used for the Arm/Thumb state bit, so relocations

> with offsets fail if it changes bit zero and the relocation forces bit zero

> to true.  The fix is to disable offsets on function pointer symbols.  

> 

> ARMv5te bootstrap OK, regression tests pass. OK for commit?


Just to be sure the issue is analyzed properly: if it's certain that this usage
is not allowed, shouldn't the linker produce a diagnostic instead of silently
concealing the issue?

With Gold linker this is handled correctly.  So it looks to me like a
bug in BFD linker, where it ignores any addend (not just +1/-1) when
resolving a relocation against a Thumb function.

Alexander
Wilco Dijkstra Feb. 11, 2019, 7:37 p.m. | #2
Hi Alexander,

> Just to be sure the issue is analyzed properly: if it's certain that this usage

> is not allowed, shouldn't the linker produce a diagnostic instead of silently

> concealing the issue?


The ABI doesn't require this but yes a linker could report a warning if the
addend of a function symbol has bit 0 set.

> With Gold linker this is handled correctly.  So it looks to me like a

> bug in BFD linker, where it ignores any addend (not just +1/-1) when

> resolving a relocation against a Thumb function.


If the Gold linker doesn't fail that means Gold has a serious bug in the way
it handles Thumb relocations. Can you elaborate, does it do S+A+1 rather than
(S+A) | 1 as the ARM-ELF spec requires?

Cheers,
Wilco
Alexander Monakov Feb. 11, 2019, 8:36 p.m. | #3
On Mon, 11 Feb 2019, Wilco Dijkstra wrote:
> > With Gold linker this is handled correctly.  So it looks to me like a

> > bug in BFD linker, where it ignores any addend (not just +1/-1) when

> > resolving a relocation against a Thumb function.

> 

> If the Gold linker doesn't fail that means Gold has a serious bug in the way

> it handles Thumb relocations. Can you elaborate, does it do S+A+1 rather than

> (S+A) | 1 as the ARM-ELF spec requires?


Apologies - it appears I might have mistyped something, as re-trying my tests
shows that both linkers properly implement the required '(S+A) | 1'.  I can't
reproduce any linker bug I suspected.

It seems odd to me that the spec requires '(S+A) | T' instead of the (imho
more intuitive) '(S|T) + A', but apart from the missing diagnostic from the
linkers, it seems they do as they must and GCC was at fault.

(perhaps it's okay to allow addends with low bit zero though, instead of
allowing only zero addends as your patch does?)

Thanks for clearing this up!
Alexander
Ramana Radhakrishnan Feb. 11, 2019, 9:32 p.m. | #4
On Mon, Feb 11, 2019 at 5:35 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>

> The GCC optimizer can generate symbols with non-zero offset from simple

> if-statements. Bit zero is used for the Arm/Thumb state bit, so relocations

> with offsets fail if it changes bit zero and the relocation forces bit zero

> to true.  The fix is to disable offsets on function pointer symbols.

>

> ARMv5te bootstrap OK, regression tests pass. OK for commit?


Interesting bug. armv5te-linux bootstrap ? Can you share your --target
and --with-arch flags ?

>

> ChangeLog:

> 2019-02-06  Wilco Dijkstra  <wdijkstr@arm.com>

>

>     gcc/

>         PR target/89222

>         * config/arm/arm.md (movsi): Use arm_cannot_force_const_mem

>         to decide when to split off an offset from a symbol.

>         * config/arm/arm.c (arm_cannot_force_const_mem): Disallow offsets

>         in function symbols.

>         * config/arm/arm-protos.h (arm_cannot_force_const_mem): Add.

>

>     testsuite/

>         PR target/89222

>         * gcc.target/arm/pr89222.c: Add new test.

>

> --

> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h

> index 79ede0db174fcce87abe8b4d18893550d4c7e2f6..0bedbe5110853617ecf7456bbaa56b1405fb65dd 100644

> --- a/gcc/config/arm/arm-protos.h

> +++ b/gcc/config/arm/arm-protos.h

> @@ -184,6 +184,7 @@ extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);

>  extern bool arm_pad_reg_upward (machine_mode, tree, int);

>  #endif

>  extern int arm_apply_result_size (void);

> +extern bool arm_cannot_force_const_mem (machine_mode, rtx);

>

>  #endif /* RTX_CODE */

>

> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c

> index c4c9b4a667100d81d918196713e40b01ee232ee2..ccd4211045066d8edb89dd4c23d554517639f8f6 100644

> --- a/gcc/config/arm/arm.c

> +++ b/gcc/config/arm/arm.c

> @@ -178,7 +178,6 @@ static void arm_internal_label (FILE *, const char *, unsigned long);

>  static void arm_output_mi_thunk (FILE *, tree, HOST_WIDE_INT, HOST_WIDE_INT,

>                                  tree);

>  static bool arm_have_conditional_execution (void);

> -static bool arm_cannot_force_const_mem (machine_mode, rtx);

>  static bool arm_legitimate_constant_p (machine_mode, rtx);

>  static bool arm_rtx_costs (rtx, machine_mode, int, int, int *, bool);

>  static int arm_address_cost (rtx, machine_mode, addr_space_t, bool);

> @@ -8936,15 +8935,20 @@ arm_legitimate_constant_p (machine_mode mode, rtx x)

>

>  /* Implement TARGET_CANNOT_FORCE_CONST_MEM.  */

>

> -static bool


Let's keep this static ...

> +bool

>  arm_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)

>  {

>    rtx base, offset;

> +  split_const (x, &base, &offset);

>

> -  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)

> +  if (GET_CODE (base) == SYMBOL_REF)


Isn't there a SYMBOL_REF_P predicate for this ?

>      {

> -      split_const (x, &base, &offset);

> -      if (GET_CODE (base) == SYMBOL_REF

> +      /* Function symbols cannot have an offset due to the Thumb bit.  */

> +      if ((SYMBOL_REF_FLAGS (base) & SYMBOL_FLAG_FUNCTION)

> +         && INTVAL (offset) != 0)

> +       return true;

> +


Can we look to allow anything that is a power of 2 as an offset i.e.
anything with bit 0 set to 0 ? Could you please file an enhancement
request on binutils for both gold and ld to catch the linker warning
case ? I suspect we are looking for addends which have the lower bit
set and function symbols ?


> +      if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P

>           && !offset_within_block_p (base, INTVAL (offset)))

>         return true;

>      }


this looks ok.

> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md

> index aa759624f8f617576773aa75fd6239d6e06e8a13..00fccd964a86dd814f15e4a1fdf5b47173a3ee3f 100644

> --- a/gcc/config/arm/arm.md

> +++ b/gcc/config/arm/arm.md

> @@ -5981,17 +5981,13 @@ (define_expand "movsi"

>          }

>      }

>

> -  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)

> +  if (arm_cannot_force_const_mem (SImode, operands[1]))


Firstly (targetm.cannot_force_const_mem (...)) please instead of
arm_cannot_force_const_mem , then that can remain static.  Let's look
to use the targetm interface instead of direct calls here. We weren't
hitting this path for non-vxworks code , however now we do so if
arm_tls_referenced_p is true at the end of arm_cannot_force_const_mem
which means that we could well have a TLS address getting spat out or
am I mis-reading something ?

This is my main concern with this patch ..

>      {

>        split_const (operands[1], &base, &offset);


> -      if (GET_CODE (base) == SYMBOL_REF

> -         && !offset_within_block_p (base, INTVAL (offset)))

> -       {

> -         tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];

> -         emit_move_insn (tmp, base);

> -         emit_insn (gen_addsi3 (operands[0], tmp, offset));

> -         DONE;

> -       }

> +      tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];

> +      emit_move_insn (tmp, base);

> +      emit_insn (gen_addsi3 (operands[0], tmp, offset));

> +      DONE;


Can Olivier or someone who cares about vxworks also give this a quick
sanity run for the alternate code path once we resolve some of the
review questions here ? Don't we also need to worry about
-mslow-flash-data where we get rid of literal pools and have movw /
movt instructions ?

regards
Ramana

>      }

>

>    /* Recognize the case where operand[1] is a reference to thread-local

> diff --git a/gcc/testsuite/gcc.target/arm/pr89222.c b/gcc/testsuite/gcc.target/arm/pr89222.c

> new file mode 100644

> index 0000000000000000000000000000000000000000..d26d7df17544db8426331e67b9a36d749ec6c6d1

> --- /dev/null

> +++ b/gcc/testsuite/gcc.target/arm/pr89222.c

> @@ -0,0 +1,32 @@

> +/* { dg-do compile } */

> +/* { dg-options "-O2" } */

> +

> +void g (void);

> +

> +void f1 (int x)

> +{

> +  if (x != (int) g + 3)

> +    return;

> +  g();

> +}

> +

> +void (*a2)(void);

> +

> +void f2 (void)

> +{

> +  a2 = &g + 3;

> +}

> +

> +typedef void (*__sighandler_t)(int);

> +void handler (int);

> +

> +void f3 (int x)

> +{

> +  __sighandler_t h = &handler;

> +  if (h != (__sighandler_t) 2 && h != (__sighandler_t) 1)

> +    h (x);

> +}

> +

> +/* { dg-final { scan-assembler-times {add(?:s)?\tr[0-9]+, r[0-9]+, #3} 2 } } */

> +/* { dg-final { scan-assembler-not {.word\tg\+3} } } */

> +/* { dg-final { scan-assembler-not {.word\thandler-1} } } */

>
Wilco Dijkstra Feb. 12, 2019, 12:51 p.m. | #5
Hi Alexander,

> It seems odd to me that the spec requires '(S+A) | T' instead of the (imho

> more intuitive) '(S|T) + A', but apart from the missing diagnostic from the

> linkers, it seems they do as they must and GCC was at fault.


Doing (S+A) | T means bit zero always correctly encodes the Thumb state,
otherwise the +A could change Thumb into Arm and visa versa.

> (perhaps it's okay to allow addends with low bit zero though, instead of

> allowing only zero addends as your patch does?)


Maybe, but there aren't many cases where an addend is useful. One really
shouldn't be doing arbitrary arithmetic with function pointers.

Cheers,
Wilco
Olivier Hainque Feb. 13, 2019, 9:24 a.m. | #6
> On 11 Feb 2019, at 22:32, Ramana Radhakrishnan <ramana.gcc@googlemail.com> wrote:


> Can Olivier or someone who cares about vxworks also give this a quick

> sanity run for the alternate code path once we resolve some of the

> review questions here ? Don't we also need to worry about

> -mslow-flash-data where we get rid of literal pools and have movw /

> movt instructions ?


Thanks fo asking :)

I don't see obvious reasons why this would be problematic, but
you never know for sure until you try and I'll be happy to give
it a whirl on VxWorks.

I'll probably only be able to perform execution testing
on top of a gcc-8 toolchain though. I can also attempt
a VxWorks toolchain build with mainline of course.

Olivier
Wilco Dijkstra Feb. 13, 2019, 12:23 p.m. | #7
Hi Ramana,

>> ARMv5te bootstrap OK, regression tests pass. OK for commit?

>

> Interesting bug. armv5te-linux bootstrap ? Can you share your --target

> and --with-arch flags ?


--target/host/build=arm-linux-gnueabi --with-arch=armv5te  --with-mode=arm

>> +  if (GET_CODE (base) == SYMBOL_REF)

>

> Isn't there a SYMBOL_REF_P predicate for this ?


Yes, I've changed this one, but this really should be done as a cleanup across
Arm and AArch64 given there are 100 occurrences that need to be fixed.

> Can we look to allow anything that is a power of 2 as an offset i.e.

> anything with bit 0 set to 0 ? Could you please file an enhancement

> request on binutils for both gold and ld to catch the linker warning

> case ? I suspect we are looking for addends which have the lower bit

> set and function symbols ?


I don't think it is useful optimization to allow an offset on function symbols.
A linker warning would be useful indeed, I'll file an enhancement request.

> Firstly (targetm.cannot_force_const_mem (...)) please instead of

>arm_cannot_force_const_mem , then that can remain static.  Let's look

> to use the targetm interface instead of direct calls here. We weren't


I've changed it to use targetm in the new version.

> hitting this path for non-vxworks code , however now we do so if

> arm_tls_referenced_p is true at the end of arm_cannot_force_const_mem

> which means that we could well have a TLS address getting spat out or

> am I mis-reading something ?


Yes there was a path where we could end up in an endless expansion loop
if arm_tls_referenced_p is true. I fixed this by checking the offset is nonzero
before expanding. This also allowed a major simplification of the TLS code
which was trying to do the same thing.

> Can Olivier or someone who cares about vxworks also give this a quick

> sanity run for the alternate code path once we resolve some of the

> review questions here ? Don't we also need to worry about

> -mslow-flash-data where we get rid of literal pools and have movw /

> movt instructions ?


Splitting the offset early means it works fine for MOVW/MOVT. Eg before
my change -mcpu=cortex-m3 -mslow-flash-data:

f3:
	movw	r3, #:lower16:handler-1
	movt	r3, #:upper16:handler-1

After:
	movw	r3, #:lower16:handler
	movt	r3, #:upper16:handler
	subs	r3, r3, #1


Here is the updated version:

The GCC optimizer can generate symbols with non-zero offset from simple
if-statements. Bit zero is used for the Arm/Thumb state bit, so relocations
with offsets fail if it changes bit zero and the relocation forces bit zero
to true.  The fix is to disable offsets on function pointer symbols.  

ARMv5te and armhf bootstrap OK, regression tests pass. OK for commit?

ChangeLog:
2019-02-12  Wilco Dijkstra  <wdijkstr@arm.com>

    gcc/
	PR target/89222
	* config/arm/arm.md (movsi): Use targetm.cannot_force_const_mem
	to decide when to split off a non-zero offset from a symbol.
	* config/arm/arm.c (arm_cannot_force_const_mem): Disallow offsets
	in function symbols.

    testsuite/
	PR target/89222
	* gcc.target/arm/pr89222.c: Add new test.
---
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f07f4cc47b6cfcea8f44960bf4760ea9a46b8f87..69b74a237a5f10b4137aa995ad43b77d6ecd04db 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8940,11 +8940,16 @@ static bool
 arm_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
 {
   rtx base, offset;
+  split_const (x, &base, &offset);
 
-  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)
+  if (SYMBOL_REF_P (base))
     {
-      split_const (x, &base, &offset);
-      if (GET_CODE (base) == SYMBOL_REF
+      /* Function symbols cannot have an offset due to the Thumb bit.  */
+      if ((SYMBOL_REF_FLAGS (base) & SYMBOL_FLAG_FUNCTION)
+	  && INTVAL (offset) != 0)
+	return true;
+
+      if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P
 	  && !offset_within_block_p (base, INTVAL (offset)))
 	return true;
     }
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index aa759624f8f617576773aa75fd6239d6e06e8a13..88110cb732b52866d4fdcad70fd5a202aa62bd03 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5981,53 +5981,29 @@ (define_expand "movsi"
         }
     }
 
-  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)
+  split_const (operands[1], &base, &offset);
+  if (INTVAL (offset) != 0
+      && targetm.cannot_force_const_mem (SImode, operands[1]))
     {
-      split_const (operands[1], &base, &offset);
-      if (GET_CODE (base) == SYMBOL_REF
-	  && !offset_within_block_p (base, INTVAL (offset)))
-	{
-	  tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];
-	  emit_move_insn (tmp, base);
-	  emit_insn (gen_addsi3 (operands[0], tmp, offset));
-	  DONE;
-	}
+      tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];
+      emit_move_insn (tmp, base);
+      emit_insn (gen_addsi3 (operands[0], tmp, offset));
+      DONE;
     }
 
+  tmp = can_create_pseudo_p () ? NULL_RTX : operands[0];
+
   /* Recognize the case where operand[1] is a reference to thread-local
-     data and load its address to a register.  */
+     data and load its address to a register.  Offsets have been split off
+     already.  */
   if (arm_tls_referenced_p (operands[1]))
-    {
-      rtx tmp = operands[1];
-      rtx addend = NULL;
-
-      if (GET_CODE (tmp) == CONST && GET_CODE (XEXP (tmp, 0)) == PLUS)
-        {
-          addend = XEXP (XEXP (tmp, 0), 1);
-          tmp = XEXP (XEXP (tmp, 0), 0);
-        }
-
-      gcc_assert (GET_CODE (tmp) == SYMBOL_REF);
-      gcc_assert (SYMBOL_REF_TLS_MODEL (tmp) != 0);
-
-      tmp = legitimize_tls_address (tmp,
-				    !can_create_pseudo_p () ? operands[0] : 0);
-      if (addend)
-        {
-          tmp = gen_rtx_PLUS (SImode, tmp, addend);
-          tmp = force_operand (tmp, operands[0]);
-        }
-      operands[1] = tmp;
-    }
+    operands[1] = legitimize_tls_address (operands[1], tmp);
   else if (flag_pic
 	   && (CONSTANT_P (operands[1])
 	       || symbol_mentioned_p (operands[1])
 	       || label_mentioned_p (operands[1])))
-      operands[1] = legitimize_pic_address (operands[1], SImode,
-					    (!can_create_pseudo_p ()
-					     ? operands[0]
-					     : NULL_RTX), NULL_RTX,
-					    false /*compute_now*/);
+    operands[1] =
+      legitimize_pic_address (operands[1], SImode, tmp, NULL_RTX, false);
   }
   "
 )
diff --git a/gcc/testsuite/gcc.target/arm/pr89222.c b/gcc/testsuite/gcc.target/arm/pr89222.c
new file mode 100644
index 0000000000000000000000000000000000000000..d26d7df17544db8426331e67b9a36d749ec6c6d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr89222.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void g (void);
+
+void f1 (int x)
+{
+  if (x != (int) g + 3)
+    return;
+  g();
+}
+
+void (*a2)(void);
+
+void f2 (void)
+{
+  a2 = &g + 3;
+}
+
+typedef void (*__sighandler_t)(int);
+void handler (int);
+
+void f3 (int x)
+{
+  __sighandler_t h = &handler;
+  if (h != (__sighandler_t) 2 && h != (__sighandler_t) 1)
+    h (x);
+}
+
+/* { dg-final { scan-assembler-times {add(?:s)?\tr[0-9]+, r[0-9]+, #3} 2 } } */
+/* { dg-final { scan-assembler-not {.word\tg\+3} } } */
+/* { dg-final { scan-assembler-not {.word\thandler-1} } } */
Wilco Dijkstra March 5, 2019, 12:33 p.m. | #8
ping



From: Wilco Dijkstra

Sent: 13 February 2019 12:23
To: Ramana Radhakrishnan
Cc: GCC Patches; nd; Olivier Hainque
Subject: Re: [PATCH][ARM] Fix PR89222
  

Hi Ramana,

>> ARMv5te bootstrap OK, regression tests pass. OK for commit?

>

> Interesting bug. armv5te-linux bootstrap ? Can you share your --target

> and --with-arch flags ?


--target/host/build=arm-linux-gnueabi --with-arch=armv5te  --with-mode=arm

>> +  if (GET_CODE (base) == SYMBOL_REF)

>

> Isn't there a SYMBOL_REF_P predicate for this ?


Yes, I've changed this one, but this really should be done as a cleanup across
Arm and AArch64 given there are 100 occurrences that need to be fixed.

> Can we look to allow anything that is a power of 2 as an offset i.e.

> anything with bit 0 set to 0 ? Could you please file an enhancement

> request on binutils for both gold and ld to catch the linker warning

> case ? I suspect we are looking for addends which have the lower bit

> set and function symbols ?


I don't think it is useful optimization to allow an offset on function symbols.
A linker warning would be useful indeed, I'll file an enhancement request.

> Firstly (targetm.cannot_force_const_mem (...)) please instead of

>arm_cannot_force_const_mem , then that can remain static.  Let's look

> to use the targetm interface instead of direct calls here. We weren't


I've changed it to use targetm in the new version.

> hitting this path for non-vxworks code , however now we do so if

> arm_tls_referenced_p is true at the end of arm_cannot_force_const_mem

> which means that we could well have a TLS address getting spat out or

> am I mis-reading something ?


Yes there was a path where we could end up in an endless expansion loop
if arm_tls_referenced_p is true. I fixed this by checking the offset is nonzero
before expanding. This also allowed a major simplification of the TLS code
which was trying to do the same thing.

> Can Olivier or someone who cares about vxworks also give this a quick

> sanity run for the alternate code path once we resolve some of the

> review questions here ? Don't we also need to worry about

> -mslow-flash-data where we get rid of literal pools and have movw /

> movt instructions ?


Splitting the offset early means it works fine for MOVW/MOVT. Eg before
my change -mcpu=cortex-m3 -mslow-flash-data:

f3:
        movw    r3, #:lower16:handler-1
        movt    r3, #:upper16:handler-1

After:
        movw    r3, #:lower16:handler
        movt    r3, #:upper16:handler
        subs    r3, r3, #1


Here is the updated version:

The GCC optimizer can generate symbols with non-zero offset from simple
if-statements. Bit zero is used for the Arm/Thumb state bit, so relocations
with offsets fail if it changes bit zero and the relocation forces bit zero
to true.  The fix is to disable offsets on function pointer symbols.  

ARMv5te and armhf bootstrap OK, regression tests pass. OK for commit?

ChangeLog:
2019-02-12  Wilco Dijkstra  <wdijkstr@arm.com>

    gcc/
        PR target/89222
        * config/arm/arm.md (movsi): Use targetm.cannot_force_const_mem
        to decide when to split off a non-zero offset from a symbol.
        * config/arm/arm.c (arm_cannot_force_const_mem): Disallow offsets
        in function symbols.

    testsuite/
        PR target/89222
        * gcc.target/arm/pr89222.c: Add new test.
---
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f07f4cc47b6cfcea8f44960bf4760ea9a46b8f87..69b74a237a5f10b4137aa995ad43b77d6ecd04db 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8940,11 +8940,16 @@ static bool
 arm_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
 {
   rtx base, offset;
+  split_const (x, &base, &offset);
 
-  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)
+  if (SYMBOL_REF_P (base))
     {
-      split_const (x, &base, &offset);
-      if (GET_CODE (base) == SYMBOL_REF
+      /* Function symbols cannot have an offset due to the Thumb bit.  */
+      if ((SYMBOL_REF_FLAGS (base) & SYMBOL_FLAG_FUNCTION)
+         && INTVAL (offset) != 0)
+       return true;
+
+      if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P
           && !offset_within_block_p (base, INTVAL (offset)))
         return true;
     }
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index aa759624f8f617576773aa75fd6239d6e06e8a13..88110cb732b52866d4fdcad70fd5a202aa62bd03 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5981,53 +5981,29 @@ (define_expand "movsi"
         }
     }
 
-  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)
+  split_const (operands[1], &base, &offset);
+  if (INTVAL (offset) != 0
+      && targetm.cannot_force_const_mem (SImode, operands[1]))
     {
-      split_const (operands[1], &base, &offset);
-      if (GET_CODE (base) == SYMBOL_REF
-         && !offset_within_block_p (base, INTVAL (offset)))
-       {
-         tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];
-         emit_move_insn (tmp, base);
-         emit_insn (gen_addsi3 (operands[0], tmp, offset));
-         DONE;
-       }
+      tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];
+      emit_move_insn (tmp, base);
+      emit_insn (gen_addsi3 (operands[0], tmp, offset));
+      DONE;
     }
 
+  tmp = can_create_pseudo_p () ? NULL_RTX : operands[0];
+
   /* Recognize the case where operand[1] is a reference to thread-local
-     data and load its address to a register.  */
+     data and load its address to a register.  Offsets have been split off
+     already.  */
   if (arm_tls_referenced_p (operands[1]))
-    {
-      rtx tmp = operands[1];
-      rtx addend = NULL;
-
-      if (GET_CODE (tmp) == CONST && GET_CODE (XEXP (tmp, 0)) == PLUS)
-        {
-          addend = XEXP (XEXP (tmp, 0), 1);
-          tmp = XEXP (XEXP (tmp, 0), 0);
-        }
-
-      gcc_assert (GET_CODE (tmp) == SYMBOL_REF);
-      gcc_assert (SYMBOL_REF_TLS_MODEL (tmp) != 0);
-
-      tmp = legitimize_tls_address (tmp,
-                                   !can_create_pseudo_p () ? operands[0] : 0);
-      if (addend)
-        {
-          tmp = gen_rtx_PLUS (SImode, tmp, addend);
-          tmp = force_operand (tmp, operands[0]);
-        }
-      operands[1] = tmp;
-    }
+    operands[1] = legitimize_tls_address (operands[1], tmp);
   else if (flag_pic
            && (CONSTANT_P (operands[1])
                || symbol_mentioned_p (operands[1])
                || label_mentioned_p (operands[1])))
-      operands[1] = legitimize_pic_address (operands[1], SImode,
-                                           (!can_create_pseudo_p ()
-                                            ? operands[0]
-                                            : NULL_RTX), NULL_RTX,
-                                           false /*compute_now*/);
+    operands[1] =
+      legitimize_pic_address (operands[1], SImode, tmp, NULL_RTX, false);
   }
   "
 )
diff --git a/gcc/testsuite/gcc.target/arm/pr89222.c b/gcc/testsuite/gcc.target/arm/pr89222.c
new file mode 100644
index 0000000000000000000000000000000000000000..d26d7df17544db8426331e67b9a36d749ec6c6d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr89222.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void g (void);
+
+void f1 (int x)
+{
+  if (x != (int) g + 3)
+    return;
+  g();
+}
+
+void (*a2)(void);
+
+void f2 (void)
+{
+  a2 = &g + 3;
+}
+
+typedef void (*__sighandler_t)(int);
+void handler (int);
+
+void f3 (int x)
+{
+  __sighandler_t h = &handler;
+  if (h != (__sighandler_t) 2 && h != (__sighandler_t) 1)
+    h (x);
+}
+
+/* { dg-final { scan-assembler-times {add(?:s)?\tr[0-9]+, r[0-9]+, #3} 2 } } */
+/* { dg-final { scan-assembler-not {.word\tg\+3} } } */
+/* { dg-final { scan-assembler-not {.word\thandler-1} } } */
Ramana Radhakrishnan March 5, 2019, 12:58 p.m. | #9
On 05/03/2019 12:33, Wilco Dijkstra wrote:
> 

> 

> ping

> 

> 

> 

> From: Wilco Dijkstra

> Sent: 13 February 2019 12:23

> To: Ramana Radhakrishnan

> Cc: GCC Patches; nd; Olivier Hainque

> Subject: Re: [PATCH][ARM] Fix PR89222

>    

> 

> Hi Ramana,

> 

>>> ARMv5te bootstrap OK, regression tests pass. OK for commit?

>>

>> Interesting bug. armv5te-linux bootstrap ? Can you share your --target

>> and --with-arch flags ?

> 

> --target/host/build=arm-linux-gnueabi --with-arch=armv5te  --with-mode=arm

> 

>>> +  if (GET_CODE (base) == SYMBOL_REF)

>>

>> Isn't there a SYMBOL_REF_P predicate for this ?

> 

> Yes, I've changed this one, but this really should be done as a cleanup across

> Arm and AArch64 given there are 100 occurrences that need to be fixed.

> 

>> Can we look to allow anything that is a power of 2 as an offset i.e.

>> anything with bit 0 set to 0 ? Could you please file an enhancement

>> request on binutils for both gold and ld to catch the linker warning

>> case ? I suspect we are looking for addends which have the lower bit

>> set and function symbols ?

> 

> I don't think it is useful optimization to allow an offset on function symbols.

> A linker warning would be useful indeed, I'll file an enhancement request.

> 

>> Firstly (targetm.cannot_force_const_mem (...)) please instead of

>> arm_cannot_force_const_mem , then that can remain static.  Let's look

>> to use the targetm interface instead of direct calls here. We weren't

> 

> I've changed it to use targetm in the new version.

> 

>> hitting this path for non-vxworks code , however now we do so if

>> arm_tls_referenced_p is true at the end of arm_cannot_force_const_mem

>> which means that we could well have a TLS address getting spat out or

>> am I mis-reading something ?

> 

> Yes there was a path where we could end up in an endless expansion loop

> if arm_tls_referenced_p is true. I fixed this by checking the offset is nonzero

> before expanding. This also allowed a major simplification of the TLS code

> which was trying to do the same thing.

> 

>> Can Olivier or someone who cares about vxworks also give this a quick

>> sanity run for the alternate code path once we resolve some of the

>> review questions here ? Don't we also need to worry about

>> -mslow-flash-data where we get rid of literal pools and have movw /

>> movt instructions ?

> 

> Splitting the offset early means it works fine for MOVW/MOVT. Eg before

> my change -mcpu=cortex-m3 -mslow-flash-data:

> 

> f3:

>          movw    r3, #:lower16:handler-1

>          movt    r3, #:upper16:handler-1

> 

> After:

>          movw    r3, #:lower16:handler

>          movt    r3, #:upper16:handler

>          subs    r3, r3, #1

> 

> 

> Here is the updated version:

> 

> The GCC optimizer can generate symbols with non-zero offset from simple

> if-statements. Bit zero is used for the Arm/Thumb state bit, so relocations

> with offsets fail if it changes bit zero and the relocation forces bit zero

> to true.  The fix is to disable offsets on function pointer symbols.

> 

> ARMv5te and armhf bootstrap OK, regression tests pass. OK for commit?

> 

> ChangeLog:

> 2019-02-12  Wilco Dijkstra  <wdijkstr@arm.com>

> 

>      gcc/

>          PR target/89222

>          * config/arm/arm.md (movsi): Use targetm.cannot_force_const_mem

>          to decide when to split off a non-zero offset from a symbol.

>          * config/arm/arm.c (arm_cannot_force_const_mem): Disallow offsets

>          in function symbols.

> 

>      testsuite/

>          PR target/89222

>          * gcc.target/arm/pr89222.c: Add new test.

> ---

> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c

> index f07f4cc47b6cfcea8f44960bf4760ea9a46b8f87..69b74a237a5f10b4137aa995ad43b77d6ecd04db 100644

> --- a/gcc/config/arm/arm.c

> +++ b/gcc/config/arm/arm.c

> @@ -8940,11 +8940,16 @@ static bool

>   arm_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)

>   {

>     rtx base, offset;

> +  split_const (x, &base, &offset);

>   

> -  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)

> +  if (SYMBOL_REF_P (base))

>       {

> -      split_const (x, &base, &offset);

> -      if (GET_CODE (base) == SYMBOL_REF

> +      /* Function symbols cannot have an offset due to the Thumb bit.  */

> +      if ((SYMBOL_REF_FLAGS (base) & SYMBOL_FLAG_FUNCTION)

> +         && INTVAL (offset) != 0)

> +       return true;

> +

> +      if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P

>             && !offset_within_block_p (base, INTVAL (offset)))

>           return true;

>       }

> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md

> index aa759624f8f617576773aa75fd6239d6e06e8a13..88110cb732b52866d4fdcad70fd5a202aa62bd03 100644

> --- a/gcc/config/arm/arm.md

> +++ b/gcc/config/arm/arm.md

> @@ -5981,53 +5981,29 @@ (define_expand "movsi"

>           }

>       }

>   

> -  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)

> +  split_const (operands[1], &base, &offset);

> +  if (INTVAL (offset) != 0

> +      && targetm.cannot_force_const_mem (SImode, operands[1]))

>       {

> -      split_const (operands[1], &base, &offset);

> -      if (GET_CODE (base) == SYMBOL_REF

> -         && !offset_within_block_p (base, INTVAL (offset)))

> -       {

> -         tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];

> -         emit_move_insn (tmp, base);

> -         emit_insn (gen_addsi3 (operands[0], tmp, offset));

> -         DONE;

> -       }

> +      tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];

> +      emit_move_insn (tmp, base);

> +      emit_insn (gen_addsi3 (operands[0], tmp, offset));

> +      DONE;

>       }

>   

> +  tmp = can_create_pseudo_p () ? NULL_RTX : operands[0];

> +

>     /* Recognize the case where operand[1] is a reference to thread-local

> -     data and load its address to a register.  */

> +     data and load its address to a register.  Offsets have been split off

> +     already.  */

>     if (arm_tls_referenced_p (operands[1]))

> -    {

> -      rtx tmp = operands[1];

> -      rtx addend = NULL;

> -

> -      if (GET_CODE (tmp) == CONST && GET_CODE (XEXP (tmp, 0)) == PLUS)

> -        {

> -          addend = XEXP (XEXP (tmp, 0), 1);

> -          tmp = XEXP (XEXP (tmp, 0), 0);

> -        }

> -

> -      gcc_assert (GET_CODE (tmp) == SYMBOL_REF);

> -      gcc_assert (SYMBOL_REF_TLS_MODEL (tmp) != 0);

> -

> -      tmp = legitimize_tls_address (tmp,

> -                                   !can_create_pseudo_p () ? operands[0] : 0);

> -      if (addend)

> -        {

> -          tmp = gen_rtx_PLUS (SImode, tmp, addend);

> -          tmp = force_operand (tmp, operands[0]);

> -        }

> -      operands[1] = tmp;

> -    }

> +    operands[1] = legitimize_tls_address (operands[1], tmp);

>     else if (flag_pic

>              && (CONSTANT_P (operands[1])

>                  || symbol_mentioned_p (operands[1])

>                  || label_mentioned_p (operands[1])))

> -      operands[1] = legitimize_pic_address (operands[1], SImode,

> -                                           (!can_create_pseudo_p ()

> -                                            ? operands[0]

> -                                            : NULL_RTX), NULL_RTX,

> -                                           false /*compute_now*/);

> +    operands[1] =

> +      legitimize_pic_address (operands[1], SImode, tmp, NULL_RTX, false);

>     }

>     "

>   )

> diff --git a/gcc/testsuite/gcc.target/arm/pr89222.c b/gcc/testsuite/gcc.target/arm/pr89222.c

> new file mode 100644

> index 0000000000000000000000000000000000000000..d26d7df17544db8426331e67b9a36d749ec6c6d1

> --- /dev/null

> +++ b/gcc/testsuite/gcc.target/arm/pr89222.c

> @@ -0,0 +1,32 @@

> +/* { dg-do compile } */

> +/* { dg-options "-O2" } */

> +

> +void g (void);

> +

> +void f1 (int x)

> +{

> +  if (x != (int) g + 3)

> +    return;

> +  g();

> +}

> +

> +void (*a2)(void);

> +

> +void f2 (void)

> +{

> +  a2 = &g + 3;

> +}

> +

> +typedef void (*__sighandler_t)(int);

> +void handler (int);

> +

> +void f3 (int x)

> +{

> +  __sighandler_t h = &handler;

> +  if (h != (__sighandler_t) 2 && h != (__sighandler_t) 1)

> +    h (x);

> +}

> +

> +/* { dg-final { scan-assembler-times {add(?:s)?\tr[0-9]+, r[0-9]+, #3} 2 } } */

> +/* { dg-final { scan-assembler-not {.word\tg\+3} } } */

> +/* { dg-final { scan-assembler-not {.word\thandler-1} } } */

>      

> 


OK. (Watch out for any testisms / regressions as usual).

Ramana

Patch

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 79ede0db174fcce87abe8b4d18893550d4c7e2f6..0bedbe5110853617ecf7456bbaa56b1405fb65dd 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -184,6 +184,7 @@  extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
 extern bool arm_pad_reg_upward (machine_mode, tree, int);
 #endif
 extern int arm_apply_result_size (void);
+extern bool arm_cannot_force_const_mem (machine_mode, rtx);
 
 #endif /* RTX_CODE */
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c4c9b4a667100d81d918196713e40b01ee232ee2..ccd4211045066d8edb89dd4c23d554517639f8f6 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -178,7 +178,6 @@  static void arm_internal_label (FILE *, const char *, unsigned long);
 static void arm_output_mi_thunk (FILE *, tree, HOST_WIDE_INT, HOST_WIDE_INT,
 				 tree);
 static bool arm_have_conditional_execution (void);
-static bool arm_cannot_force_const_mem (machine_mode, rtx);
 static bool arm_legitimate_constant_p (machine_mode, rtx);
 static bool arm_rtx_costs (rtx, machine_mode, int, int, int *, bool);
 static int arm_address_cost (rtx, machine_mode, addr_space_t, bool);
@@ -8936,15 +8935,20 @@  arm_legitimate_constant_p (machine_mode mode, rtx x)
 
 /* Implement TARGET_CANNOT_FORCE_CONST_MEM.  */
 
-static bool
+bool
 arm_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
 {
   rtx base, offset;
+  split_const (x, &base, &offset);
 
-  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)
+  if (GET_CODE (base) == SYMBOL_REF)
     {
-      split_const (x, &base, &offset);
-      if (GET_CODE (base) == SYMBOL_REF
+      /* Function symbols cannot have an offset due to the Thumb bit.  */
+      if ((SYMBOL_REF_FLAGS (base) & SYMBOL_FLAG_FUNCTION)
+	  && INTVAL (offset) != 0)
+	return true;
+
+      if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P
 	  && !offset_within_block_p (base, INTVAL (offset)))
 	return true;
     }
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index aa759624f8f617576773aa75fd6239d6e06e8a13..00fccd964a86dd814f15e4a1fdf5b47173a3ee3f 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5981,17 +5981,13 @@  (define_expand "movsi"
         }
     }
 
-  if (ARM_OFFSETS_MUST_BE_WITHIN_SECTIONS_P)
+  if (arm_cannot_force_const_mem (SImode, operands[1]))
     {
       split_const (operands[1], &base, &offset);
-      if (GET_CODE (base) == SYMBOL_REF
-	  && !offset_within_block_p (base, INTVAL (offset)))
-	{
-	  tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];
-	  emit_move_insn (tmp, base);
-	  emit_insn (gen_addsi3 (operands[0], tmp, offset));
-	  DONE;
-	}
+      tmp = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];
+      emit_move_insn (tmp, base);
+      emit_insn (gen_addsi3 (operands[0], tmp, offset));
+      DONE;
     }
 
   /* Recognize the case where operand[1] is a reference to thread-local
diff --git a/gcc/testsuite/gcc.target/arm/pr89222.c b/gcc/testsuite/gcc.target/arm/pr89222.c
new file mode 100644
index 0000000000000000000000000000000000000000..d26d7df17544db8426331e67b9a36d749ec6c6d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr89222.c
@@ -0,0 +1,32 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void g (void);
+
+void f1 (int x)
+{
+  if (x != (int) g + 3)
+    return;
+  g();
+}
+
+void (*a2)(void);
+
+void f2 (void)
+{
+  a2 = &g + 3;
+}
+
+typedef void (*__sighandler_t)(int);
+void handler (int);
+
+void f3 (int x)
+{
+  __sighandler_t h = &handler;
+  if (h != (__sighandler_t) 2 && h != (__sighandler_t) 1)
+    h (x);
+}
+
+/* { dg-final { scan-assembler-times {add(?:s)?\tr[0-9]+, r[0-9]+, #3} 2 } } */
+/* { dg-final { scan-assembler-not {.word\tg\+3} } } */
+/* { dg-final { scan-assembler-not {.word\thandler-1} } } */