[PR,target/83994] Fix stack-clash-protection code generation on x86

Message ID b13ea7fe-42be-357d-cc03-25a234fdae30@gmail.com
State New
Headers show
Series
  • [PR,target/83994] Fix stack-clash-protection code generation on x86
Related show

Commit Message

Jeff Law Jan. 23, 2018, 11:15 p.m.
pr83994 is a code generation bug in the stack-clash support that affects
openssl (we've turned on stack-clash-protection by default for the F28
builds).

The core problem is stack-clash (like stack-check) will emit a probing
loop if the prologue allocates enough stack space.  When emitting a loop
both implementations will need a scratch register.

They use get_scratch_register_at_entry to find a suitable scratch
register.  This routine assumes that callee registers saves are
completed at the point where the scratch register is going to be used.

In this particular testcase we select %ebx because ax,cx,dx are used for
parameter passing.  That's fine.  The problem is %ebx hasn't been saved yet!

-fstack-check has a bit of code in the frame setup/layout code which
forces the prologue to use pushes rather than reg->mem moves for saving
registers.  There's a gcc_assert in the prologue expander to catch any
case where the registers aren't saved.

-fstack-clash-protection doesn't have that same bit of magic in the
frame setup/layout code and it bypasses the assertion due to a change I
made back in Nov 2017 due to not being aware of this particular issue.

This patch reverts the assertion bypass I added back in Nov 2017 and
adds clarifying comments.  The patch also forces use of push to save
integer registers for a stack-clash protected prologue if probes are
going to be needed.

Bootstrapped and regression tested on x86_64.

While the bug is not marked as a regression, ISTM this needs to be fixed
for gcc-8.

OK for the trunk?

Jeff
* i386.c (get_probe_interval): Move to earlier point.
	(ix86_compute_frame_layout): If -fstack-clash-protection and
	the frame is larger than the probe interval, then use pushes
	to save registers rather than reg->mem moves.
	(ix86_expand_prologue): Remove conditional for int_registers_saved
	assertion.

	* gcc.target/i386/pr83994.c: New test.

Comments

Uros Bizjak Jan. 24, 2018, 7:11 a.m. | #1
On Wed, Jan 24, 2018 at 12:15 AM, Jeff Law <suzanne.jeff.law@gmail.com> wrote:
>

> pr83994 is a code generation bug in the stack-clash support that affects

> openssl (we've turned on stack-clash-protection by default for the F28

> builds).

>

> The core problem is stack-clash (like stack-check) will emit a probing

> loop if the prologue allocates enough stack space.  When emitting a loop

> both implementations will need a scratch register.

>

> They use get_scratch_register_at_entry to find a suitable scratch

> register.  This routine assumes that callee registers saves are

> completed at the point where the scratch register is going to be used.

>

> In this particular testcase we select %ebx because ax,cx,dx are used for

> parameter passing.  That's fine.  The problem is %ebx hasn't been saved yet!

>

> -fstack-check has a bit of code in the frame setup/layout code which

> forces the prologue to use pushes rather than reg->mem moves for saving

> registers.  There's a gcc_assert in the prologue expander to catch any

> case where the registers aren't saved.

>

> -fstack-clash-protection doesn't have that same bit of magic in the

> frame setup/layout code and it bypasses the assertion due to a change I

> made back in Nov 2017 due to not being aware of this particular issue.

>

> This patch reverts the assertion bypass I added back in Nov 2017 and

> adds clarifying comments.  The patch also forces use of push to save

> integer registers for a stack-clash protected prologue if probes are

> going to be needed.

>

> Bootstrapped and regression tested on x86_64.

>

> While the bug is not marked as a regression, ISTM this needs to be fixed

> for gcc-8.

>

> OK for the trunk?

>

> Jeff

>

>         * i386.c (get_probe_interval): Move to earlier point.

>         (ix86_compute_frame_layout): If -fstack-clash-protection and

>         the frame is larger than the probe interval, then use pushes

>         to save registers rather than reg->mem moves.

>         (ix86_expand_prologue): Remove conditional for int_registers_saved

>         assertion.

>

>         * gcc.target/i386/pr83994.c: New test.


OK with the fixed testcase (see below).

Thanks,
Uros.

>

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c

> index 72d25ae..4cb55a8 100644

> --- a/gcc/config/i386/i386.c

> +++ b/gcc/config/i386/i386.c

> @@ -11497,6 +11497,18 @@ static void warn_once_call_ms2sysv_xlogues (const char *feature)

>      }

>  }

>

> +/* Return the probing interval for -fstack-clash-protection.  */

> +

> +static HOST_WIDE_INT

> +get_probe_interval (void)

> +{

> +  if (flag_stack_clash_protection)

> +    return (HOST_WIDE_INT_1U

> +           << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL));

> +  else

> +    return (HOST_WIDE_INT_1U << STACK_CHECK_PROBE_INTERVAL_EXP);

> +}

> +

>  /* When using -fsplit-stack, the allocation routines set a field in

>     the TCB to the bottom of the stack plus this much space, measured

>     in bytes.  */

> @@ -11773,7 +11785,14 @@ ix86_compute_frame_layout (void)

>    to_allocate = offset - frame->sse_reg_save_offset;

>

>    if ((!to_allocate && frame->nregs <= 1)

> -      || (TARGET_64BIT && to_allocate >= HOST_WIDE_INT_C (0x80000000)))

> +      || (TARGET_64BIT && to_allocate >= HOST_WIDE_INT_C (0x80000000))

> +      /* If stack clash probing needs a loop, then it needs a

> +        scratch register.  But the returned register is only guaranteed

> +        to be safe to use after register saves are complete.  So if

> +        stack clash protections are enabled and the allocated frame is

> +        larger than the probe interval, then use pushes to save

> +        callee saved registers.  */

> +      || (flag_stack_clash_protection && to_allocate > get_probe_interval ()))

>      frame->save_regs_using_mov = false;

>

>    if (ix86_using_red_zone ()

> @@ -12567,18 +12586,6 @@ release_scratch_register_on_entry (struct scratch_reg *sr)

>      }

>  }

>

> -/* Return the probing interval for -fstack-clash-protection.  */

> -

> -static HOST_WIDE_INT

> -get_probe_interval (void)

> -{

> -  if (flag_stack_clash_protection)

> -    return (HOST_WIDE_INT_1U

> -           << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL));

> -  else

> -    return (HOST_WIDE_INT_1U << STACK_CHECK_PROBE_INTERVAL_EXP);

> -}

> -

>  /* Emit code to adjust the stack pointer by SIZE bytes while probing it.

>

>     This differs from the next routine in that it tries hard to prevent

> @@ -13727,12 +13734,11 @@ ix86_expand_prologue (void)

>        && (flag_stack_check == STATIC_BUILTIN_STACK_CHECK

>           || flag_stack_clash_protection))

>      {

> -      /* This assert wants to verify that integer registers were saved

> -        prior to probing.  This is necessary when probing may be implemented

> -        as a function call (Windows).  It is not necessary for stack clash

> -        protection probing.  */

> -      if (!flag_stack_clash_protection)

> -       gcc_assert (int_registers_saved);

> +      /* We expect the GP registers to be saved when probes are used

> +        as the probing sequences might need a scratch register and

> +        the routine to allocate one assumes the integer registers

> +        have already been saved.  */

> +      gcc_assert (int_registers_saved);

>

>        if (flag_stack_clash_protection)

>         {

> diff --git a/gcc/testsuite/gcc.target/i386/pr83994.c b/gcc/testsuite/gcc.target/i386/pr83994.c

> new file mode 100644

> index 0000000..b57b04b

> --- /dev/null

> +++ b/gcc/testsuite/gcc.target/i386/pr83994.c

> @@ -0,0 +1,15 @@

> +/* { dg-do compile } */

> +/* { dg-options "-O2 -m32 -march=i686 -fpic -fstack-clash-protection" } */


Please use

/* { dg-require-effective-target ia32 } */

and remove "-m32" from dg-options.

> +void f1 (char *);

> +

> +__attribute__ ((regparm (3)))

> +int

> +f2 (int arg1, int arg2, int arg3)

> +{

> +  char buf[16384];

> +  f1 (buf);

> +  f1 (buf);

> +  return 0;

> +}

> +

>
Jeff Law Jan. 24, 2018, 10:17 p.m. | #2
On 01/24/2018 12:11 AM, Uros Bizjak wrote:
> On Wed, Jan 24, 2018 at 12:15 AM, Jeff Law <suzanne.jeff.law@gmail.com> wrote:

>>

>> pr83994 is a code generation bug in the stack-clash support that affects

>> openssl (we've turned on stack-clash-protection by default for the F28

>> builds).

>>

>> The core problem is stack-clash (like stack-check) will emit a probing

>> loop if the prologue allocates enough stack space.  When emitting a loop

>> both implementations will need a scratch register.

>>

>> They use get_scratch_register_at_entry to find a suitable scratch

>> register.  This routine assumes that callee registers saves are

>> completed at the point where the scratch register is going to be used.

>>

>> In this particular testcase we select %ebx because ax,cx,dx are used for

>> parameter passing.  That's fine.  The problem is %ebx hasn't been saved yet!

>>

>> -fstack-check has a bit of code in the frame setup/layout code which

>> forces the prologue to use pushes rather than reg->mem moves for saving

>> registers.  There's a gcc_assert in the prologue expander to catch any

>> case where the registers aren't saved.

>>

>> -fstack-clash-protection doesn't have that same bit of magic in the

>> frame setup/layout code and it bypasses the assertion due to a change I

>> made back in Nov 2017 due to not being aware of this particular issue.

>>

>> This patch reverts the assertion bypass I added back in Nov 2017 and

>> adds clarifying comments.  The patch also forces use of push to save

>> integer registers for a stack-clash protected prologue if probes are

>> going to be needed.

>>

>> Bootstrapped and regression tested on x86_64.

>>

>> While the bug is not marked as a regression, ISTM this needs to be fixed

>> for gcc-8.

>>

>> OK for the trunk?

>>

>> Jeff

>>

>>         * i386.c (get_probe_interval): Move to earlier point.

>>         (ix86_compute_frame_layout): If -fstack-clash-protection and

>>         the frame is larger than the probe interval, then use pushes

>>         to save registers rather than reg->mem moves.

>>         (ix86_expand_prologue): Remove conditional for int_registers_saved

>>         assertion.

>>

>>         * gcc.target/i386/pr83994.c: New test.

> 

> OK with the fixed testcase (see below).

> 

> Thanks,

> Uros.


[ ... snip ... ]

>> diff --git a/gcc/testsuite/gcc.target/i386/pr83994.c b/gcc/testsuite/gcc.target/i386/pr83994.c

>> new file mode 100644

>> index 0000000..b57b04b

>> --- /dev/null

>> +++ b/gcc/testsuite/gcc.target/i386/pr83994.c

>> @@ -0,0 +1,15 @@

>> +/* { dg-do compile } */

>> +/* { dg-options "-O2 -m32 -march=i686 -fpic -fstack-clash-protection" } */

> 

> Please use

> 

> /* { dg-require-effective-target ia32 } */

> 

> and remove "-m32" from dg-options.

Done.  Thanks for the quick turnaround.

jeff

Patch

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 72d25ae..4cb55a8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11497,6 +11497,18 @@  static void warn_once_call_ms2sysv_xlogues (const char *feature)
     }
 }
 
+/* Return the probing interval for -fstack-clash-protection.  */
+
+static HOST_WIDE_INT
+get_probe_interval (void)
+{
+  if (flag_stack_clash_protection)
+    return (HOST_WIDE_INT_1U
+	    << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL));
+  else
+    return (HOST_WIDE_INT_1U << STACK_CHECK_PROBE_INTERVAL_EXP);
+}
+
 /* When using -fsplit-stack, the allocation routines set a field in
    the TCB to the bottom of the stack plus this much space, measured
    in bytes.  */
@@ -11773,7 +11785,14 @@  ix86_compute_frame_layout (void)
   to_allocate = offset - frame->sse_reg_save_offset;
 
   if ((!to_allocate && frame->nregs <= 1)
-      || (TARGET_64BIT && to_allocate >= HOST_WIDE_INT_C (0x80000000)))
+      || (TARGET_64BIT && to_allocate >= HOST_WIDE_INT_C (0x80000000))
+      /* If stack clash probing needs a loop, then it needs a
+	 scratch register.  But the returned register is only guaranteed
+	 to be safe to use after register saves are complete.  So if
+	 stack clash protections are enabled and the allocated frame is
+	 larger than the probe interval, then use pushes to save
+	 callee saved registers.  */
+      || (flag_stack_clash_protection && to_allocate > get_probe_interval ()))
     frame->save_regs_using_mov = false;
 
   if (ix86_using_red_zone ()
@@ -12567,18 +12586,6 @@  release_scratch_register_on_entry (struct scratch_reg *sr)
     }
 }
 
-/* Return the probing interval for -fstack-clash-protection.  */
-
-static HOST_WIDE_INT
-get_probe_interval (void)
-{
-  if (flag_stack_clash_protection)
-    return (HOST_WIDE_INT_1U
-	    << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL));
-  else
-    return (HOST_WIDE_INT_1U << STACK_CHECK_PROBE_INTERVAL_EXP);
-}
-
 /* Emit code to adjust the stack pointer by SIZE bytes while probing it.
 
    This differs from the next routine in that it tries hard to prevent
@@ -13727,12 +13734,11 @@  ix86_expand_prologue (void)
       && (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
 	  || flag_stack_clash_protection))
     {
-      /* This assert wants to verify that integer registers were saved
-	 prior to probing.  This is necessary when probing may be implemented
-	 as a function call (Windows).  It is not necessary for stack clash
-	 protection probing.  */
-      if (!flag_stack_clash_protection)
-	gcc_assert (int_registers_saved);
+      /* We expect the GP registers to be saved when probes are used
+	 as the probing sequences might need a scratch register and
+	 the routine to allocate one assumes the integer registers
+	 have already been saved.  */
+      gcc_assert (int_registers_saved);
 
       if (flag_stack_clash_protection)
 	{
diff --git a/gcc/testsuite/gcc.target/i386/pr83994.c b/gcc/testsuite/gcc.target/i386/pr83994.c
new file mode 100644
index 0000000..b57b04b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr83994.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -m32 -march=i686 -fpic -fstack-clash-protection" } */
+
+void f1 (char *);
+
+__attribute__ ((regparm (3)))
+int
+f2 (int arg1, int arg2, int arg3)
+{
+  char buf[16384];
+  f1 (buf);
+  f1 (buf);
+  return 0;
+}
+