i386: Also skip ENDBR32 at the target function entry

Message ID 20200213120601.GA160128@gmail.com
State New
Headers show
Series
  • i386: Also skip ENDBR32 at the target function entry
Related show

Commit Message

H.J. Lu Feb. 13, 2020, 12:06 p.m.
On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote:
> On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> >

> > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubizjak@gmail.com> wrote:

> > >

> > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > >

> > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > >

> > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > >

> > > > > > Since nested function isn't only called directly, there is ENDBR32 at

> > > > > > function entry and we need to skip it for direct jump in trampoline.

> > > > >

> > > > > Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?

> > > > >

> > > >

> > > > ix86_trampoline_init has

> > > >

> > > >      /* Compute offset from the end of the jmp to the target function.

> > > >          In the case in which the trampoline stores the static chain on

> > > >          the stack, we need to skip the first insn which pushes the

> > > >          (call-saved) register static chain; this push is 1 byte.  */

> > > >       offset += 5;

> > > >       disp = expand_binop (SImode, sub_optab, fnaddr,

> > > >                            plus_constant (Pmode, XEXP (m_tramp, 0),

> > > >                                           offset - (MEM_P (chain) ? 1 : 0)),

> > > >                            NULL_RTX, 1, OPTAB_DIRECT);

> > > >       emit_move_insn (mem, disp);

> > > >

> > > > Without CET, we got

> > > >

> > > > 0000011 <bar.1878>:

> > > >   11: 56                    push   %esi

> > > >   12: 55                    push   %ebp   <<<<<< trampoline jumps here.

> > > >   13: 89 e5                mov    %esp,%ebp

> > > >   15: 83 ec 08              sub    $0x8,%esp

> > > >

> > > > With CET, if bar isn't only called directly, we got

> > > >

> > > > 00000015 <bar.1878>:

> > > >   15: f3 0f 1e fb          endbr32

> > > >   19: 56                    push   %esi

> > > >   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.

> > > >   1b: 89 e5                mov    %esp,%ebp

> > > >   1d: 83 ec 08              sub    $0x8,%esp

> > > >

> > > > We need to add 4 bytes for trampoline to skip endbr32.

> > > >

> > > > Here is the updated patch to check if nested function isn't only

> > > > called directly,

> > >

> > > Please figure out the final patch. I don't want to waste my time

> > > reviewing different version every half hour. Ping me in a couple of

> > > days.

> >

> > This is the final version:

> >

> > https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html

> >

> > You can try the testcase in the patch on any machine with CET binutils

> > since ENDBR32 is nop on none-CET machines.  Without this patch,

> > the test will fail.

> 

> Please rephrase the comment. I don't understand what it tries to say.

> 


Here is the patch with updated comments.  OK for master and 8/9 branches?

Thanks.

H.J.
---
Since pass_insert_endbranch inserts ENDBR32 at entry of the target
function if it may be called indirectly, we also need to skip the
4-byte ENDBR32 for direct jump in trampoline if it is the case.

Tested on Linux/x86-64 CET machine with and without -m32.

gcc/

	PR target/93656
	* config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
	the target function entry.

gcc/testsuite/

	PR target/93656
	* gcc.target/i386/pr93656.c: New test.
---
 gcc/config/i386/i386.c                  | 9 ++++++++-
 gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

-- 
2.24.1

Comments

Uros Bizjak Feb. 13, 2020, 12:28 p.m. | #1
On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>

> On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote:

> > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > >

> > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > >

> > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > >

> > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > > >

> > > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > >

> > > > > > > Since nested function isn't only called directly, there is ENDBR32 at

> > > > > > > function entry and we need to skip it for direct jump in trampoline.

> > > > > >

> > > > > > Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?

> > > > > >

> > > > >

> > > > > ix86_trampoline_init has

> > > > >

> > > > >      /* Compute offset from the end of the jmp to the target function.

> > > > >          In the case in which the trampoline stores the static chain on

> > > > >          the stack, we need to skip the first insn which pushes the

> > > > >          (call-saved) register static chain; this push is 1 byte.  */

> > > > >       offset += 5;

> > > > >       disp = expand_binop (SImode, sub_optab, fnaddr,

> > > > >                            plus_constant (Pmode, XEXP (m_tramp, 0),

> > > > >                                           offset - (MEM_P (chain) ? 1 : 0)),

> > > > >                            NULL_RTX, 1, OPTAB_DIRECT);

> > > > >       emit_move_insn (mem, disp);

> > > > >

> > > > > Without CET, we got

> > > > >

> > > > > 0000011 <bar.1878>:

> > > > >   11: 56                    push   %esi

> > > > >   12: 55                    push   %ebp   <<<<<< trampoline jumps here.

> > > > >   13: 89 e5                mov    %esp,%ebp

> > > > >   15: 83 ec 08              sub    $0x8,%esp

> > > > >

> > > > > With CET, if bar isn't only called directly, we got

> > > > >

> > > > > 00000015 <bar.1878>:

> > > > >   15: f3 0f 1e fb          endbr32

> > > > >   19: 56                    push   %esi

> > > > >   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.

> > > > >   1b: 89 e5                mov    %esp,%ebp

> > > > >   1d: 83 ec 08              sub    $0x8,%esp

> > > > >

> > > > > We need to add 4 bytes for trampoline to skip endbr32.

> > > > >

> > > > > Here is the updated patch to check if nested function isn't only

> > > > > called directly,

> > > >

> > > > Please figure out the final patch. I don't want to waste my time

> > > > reviewing different version every half hour. Ping me in a couple of

> > > > days.

> > >

> > > This is the final version:

> > >

> > > https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html

> > >

> > > You can try the testcase in the patch on any machine with CET binutils

> > > since ENDBR32 is nop on none-CET machines.  Without this patch,

> > > the test will fail.

> >

> > Please rephrase the comment. I don't understand what it tries to say.

> >

>

> Here is the patch with updated comments.  OK for master and 8/9 branches?

>

> Thanks.

>

> H.J.

> ---

> Since pass_insert_endbranch inserts ENDBR32 at entry of the target

> function if it may be called indirectly, we also need to skip the

> 4-byte ENDBR32 for direct jump in trampoline if it is the case.


"
Skip ENDBR32 at the entry if called indirectly.
"
pass_insert_endbranch inserts ENDBR at the entry of the target
function.  We need to skip
>

> Tested on Linux/x86-64 CET machine with and without -m32.

>

> gcc/

>

>         PR target/93656

>         * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at

>         the target function entry.

>

> gcc/testsuite/

>

>         PR target/93656

>         * gcc.target/i386/pr93656.c: New test.

> ---

>  gcc/config/i386/i386.c                  | 9 ++++++++-

>  gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++

>  2 files changed, 12 insertions(+), 1 deletion(-)

>  create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

>

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c

> index 44bc0e0176a..52640b74cc8 100644

> --- a/gcc/config/i386/i386.c

> +++ b/gcc/config/i386/i386.c

> @@ -16839,9 +16839,16 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)

>          the stack, we need to skip the first insn which pushes the

>          (call-saved) register static chain; this push is 1 byte.  */

>        offset += 5;

> +      int skip = MEM_P (chain) ? 1 : 0;


offset += 5 - MEM_P (chain) ? 1: 0;

> +      /* Since pass_insert_endbranch inserts ENDBR32 at entry of the

> +        target function if it may be called indirectly, we also need

> +        to skip the 4-byte ENDBR32 if it is the case.  */


/* Skip ENDBR32 at the entry of the target function when called indirectly.  */

> +      if (need_endbr

> +         && !cgraph_node::get (fndecl)->only_called_directly_p ())

> +       skip += 4;


offset += 4;

>        disp = expand_binop (SImode, sub_optab, fnaddr,

>                            plus_constant (Pmode, XEXP (m_tramp, 0),

> -                                         offset - (MEM_P (chain) ? 1 : 0)),

> +                                         offset - skip),


plus_constant (Pmode, XEXP (m_tramp, 0), offset),

Uros.

>                            NULL_RTX, 1, OPTAB_DIRECT);

>        emit_move_insn (mem, disp);

>      }

> diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c b/gcc/testsuite/gcc.target/i386/pr93656.c

> new file mode 100644

> index 00000000000..f0ac8c8edaa

> --- /dev/null

> +++ b/gcc/testsuite/gcc.target/i386/pr93656.c

> @@ -0,0 +1,4 @@

> +/* { dg-do run { target { ia32 && cet } } } */

> +/* { dg-options "-O2 -fcf-protection" } */

> +

> +#include "pr67770.c"

> --

> 2.24.1

>
H.J. Lu Feb. 13, 2020, 12:42 p.m. | #2
On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote:
> On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> >

> > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote:

> > > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > >

> > > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > >

> > > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > >

> > > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > > > >

> > > > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > > >

> > > > > > > > Since nested function isn't only called directly, there is ENDBR32 at

> > > > > > > > function entry and we need to skip it for direct jump in trampoline.

> > > > > > >

> > > > > > > Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?

> > > > > > >

> > > > > >

> > > > > > ix86_trampoline_init has

> > > > > >

> > > > > >      /* Compute offset from the end of the jmp to the target function.

> > > > > >          In the case in which the trampoline stores the static chain on

> > > > > >          the stack, we need to skip the first insn which pushes the

> > > > > >          (call-saved) register static chain; this push is 1 byte.  */

> > > > > >       offset += 5;

> > > > > >       disp = expand_binop (SImode, sub_optab, fnaddr,

> > > > > >                            plus_constant (Pmode, XEXP (m_tramp, 0),

> > > > > >                                           offset - (MEM_P (chain) ? 1 : 0)),

> > > > > >                            NULL_RTX, 1, OPTAB_DIRECT);

> > > > > >       emit_move_insn (mem, disp);

> > > > > >

> > > > > > Without CET, we got

> > > > > >

> > > > > > 0000011 <bar.1878>:

> > > > > >   11: 56                    push   %esi

> > > > > >   12: 55                    push   %ebp   <<<<<< trampoline jumps here.

> > > > > >   13: 89 e5                mov    %esp,%ebp

> > > > > >   15: 83 ec 08              sub    $0x8,%esp

> > > > > >

> > > > > > With CET, if bar isn't only called directly, we got

> > > > > >

> > > > > > 00000015 <bar.1878>:

> > > > > >   15: f3 0f 1e fb          endbr32

> > > > > >   19: 56                    push   %esi

> > > > > >   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.

> > > > > >   1b: 89 e5                mov    %esp,%ebp

> > > > > >   1d: 83 ec 08              sub    $0x8,%esp

> > > > > >

> > > > > > We need to add 4 bytes for trampoline to skip endbr32.

> > > > > >

> > > > > > Here is the updated patch to check if nested function isn't only

> > > > > > called directly,

> > > > >

> > > > > Please figure out the final patch. I don't want to waste my time

> > > > > reviewing different version every half hour. Ping me in a couple of

> > > > > days.

> > > >

> > > > This is the final version:

> > > >

> > > > https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html

> > > >

> > > > You can try the testcase in the patch on any machine with CET binutils

> > > > since ENDBR32 is nop on none-CET machines.  Without this patch,

> > > > the test will fail.

> > >

> > > Please rephrase the comment. I don't understand what it tries to say.

> > >

> >

> > Here is the patch with updated comments.  OK for master and 8/9 branches?

> >

> > Thanks.

> >

> > H.J.

> > ---

> > Since pass_insert_endbranch inserts ENDBR32 at entry of the target

> > function if it may be called indirectly, we also need to skip the

> > 4-byte ENDBR32 for direct jump in trampoline if it is the case.

> 

> "

> Skip ENDBR32 at the entry if called indirectly.

> "

> pass_insert_endbranch inserts ENDBR at the entry of the target

> function.  We need to skip

> >

> > Tested on Linux/x86-64 CET machine with and without -m32.

> >

> > gcc/

> >

> >         PR target/93656

> >         * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at

> >         the target function entry.

> >

> > gcc/testsuite/

> >

> >         PR target/93656

> >         * gcc.target/i386/pr93656.c: New test.

> > ---

> >  gcc/config/i386/i386.c                  | 9 ++++++++-

> >  gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++

> >  2 files changed, 12 insertions(+), 1 deletion(-)

> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

> >

> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c

> > index 44bc0e0176a..52640b74cc8 100644

> > --- a/gcc/config/i386/i386.c

> > +++ b/gcc/config/i386/i386.c

> > @@ -16839,9 +16839,16 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)

> >          the stack, we need to skip the first insn which pushes the

> >          (call-saved) register static chain; this push is 1 byte.  */

> >        offset += 5;

> > +      int skip = MEM_P (chain) ? 1 : 0;

> 

> offset += 5 - MEM_P (chain) ? 1: 0;


This is done on purpose sinc there is

  gcc_assert (offset <= TRAMPOLINE_SIZE);

after it.  We shouldn't update offset.

> 

> > +      /* Since pass_insert_endbranch inserts ENDBR32 at entry of the

> > +        target function if it may be called indirectly, we also need

> > +        to skip the 4-byte ENDBR32 if it is the case.  */

> 

> /* Skip ENDBR32 at the entry of the target function when called indirectly.  */


Fixed.

> 

> > +      if (need_endbr

> > +         && !cgraph_node::get (fndecl)->only_called_directly_p ())

> > +       skip += 4;

> 

> offset += 4;

> 

> >        disp = expand_binop (SImode, sub_optab, fnaddr,

> >                            plus_constant (Pmode, XEXP (m_tramp, 0),

> > -                                         offset - (MEM_P (chain) ? 1 : 0)),

> > +                                         offset - skip),

> 

> plus_constant (Pmode, XEXP (m_tramp, 0), offset),

> 

> Uros.

> 


Here is the updated patch.  OK for master and 8/9 branches?

Thanks.


H.J.
--
Skip ENDBR32 at the entry of the target function when called indirectly.

Tested on Linux/x86-64 CET machine with and without -m32.

gcc/

	PR target/93656
	* config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
	the target function entry.

gcc/testsuite/

	PR target/93656
	* gcc.target/i386/pr93656.c: New test.
---
 gcc/config/i386/i386.c                  | 8 +++++++-
 gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
 2 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 44bc0e0176a..3f3dcd53eb2 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -16839,9 +16839,15 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 	 the stack, we need to skip the first insn which pushes the
 	 (call-saved) register static chain; this push is 1 byte.  */
       offset += 5;
+      int skip = MEM_P (chain) ? 1 : 0;
+      /* Skip ENDBR32 at the entry of the target function when called
+	 indirectly.  */
+      if (need_endbr
+	  && !cgraph_node::get (fndecl)->only_called_directly_p ())
+	skip += 4;
       disp = expand_binop (SImode, sub_optab, fnaddr,
 			   plus_constant (Pmode, XEXP (m_tramp, 0),
-					  offset - (MEM_P (chain) ? 1 : 0)),
+					  offset - skip),
 			   NULL_RTX, 1, OPTAB_DIRECT);
       emit_move_insn (mem, disp);
     }
diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c b/gcc/testsuite/gcc.target/i386/pr93656.c
new file mode 100644
index 00000000000..f0ac8c8edaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr93656.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target { ia32 && cet } } } */
+/* { dg-options "-O2 -fcf-protection" } */
+
+#include "pr67770.c"
-- 
2.24.1
Uros Bizjak Feb. 13, 2020, 1:09 p.m. | #3
On Thu, Feb 13, 2020 at 1:42 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>

> On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote:

> > On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > >

> > > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote:

> > > > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > >

> > > > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > > >

> > > > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > >

> > > > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > > > > >

> > > > > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > > > >

> > > > > > > > > Since nested function isn't only called directly, there is ENDBR32 at

> > > > > > > > > function entry and we need to skip it for direct jump in trampoline.

> > > > > > > >

> > > > > > > > Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?

> > > > > > > >

> > > > > > >

> > > > > > > ix86_trampoline_init has

> > > > > > >

> > > > > > >      /* Compute offset from the end of the jmp to the target function.

> > > > > > >          In the case in which the trampoline stores the static chain on

> > > > > > >          the stack, we need to skip the first insn which pushes the

> > > > > > >          (call-saved) register static chain; this push is 1 byte.  */

> > > > > > >       offset += 5;

> > > > > > >       disp = expand_binop (SImode, sub_optab, fnaddr,

> > > > > > >                            plus_constant (Pmode, XEXP (m_tramp, 0),

> > > > > > >                                           offset - (MEM_P (chain) ? 1 : 0)),

> > > > > > >                            NULL_RTX, 1, OPTAB_DIRECT);

> > > > > > >       emit_move_insn (mem, disp);

> > > > > > >

> > > > > > > Without CET, we got

> > > > > > >

> > > > > > > 0000011 <bar.1878>:

> > > > > > >   11: 56                    push   %esi

> > > > > > >   12: 55                    push   %ebp   <<<<<< trampoline jumps here.

> > > > > > >   13: 89 e5                mov    %esp,%ebp

> > > > > > >   15: 83 ec 08              sub    $0x8,%esp

> > > > > > >

> > > > > > > With CET, if bar isn't only called directly, we got

> > > > > > >

> > > > > > > 00000015 <bar.1878>:

> > > > > > >   15: f3 0f 1e fb          endbr32

> > > > > > >   19: 56                    push   %esi

> > > > > > >   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.

> > > > > > >   1b: 89 e5                mov    %esp,%ebp

> > > > > > >   1d: 83 ec 08              sub    $0x8,%esp

> > > > > > >

> > > > > > > We need to add 4 bytes for trampoline to skip endbr32.

> > > > > > >

> > > > > > > Here is the updated patch to check if nested function isn't only

> > > > > > > called directly,

> > > > > >

> > > > > > Please figure out the final patch. I don't want to waste my time

> > > > > > reviewing different version every half hour. Ping me in a couple of

> > > > > > days.

> > > > >

> > > > > This is the final version:

> > > > >

> > > > > https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html

> > > > >

> > > > > You can try the testcase in the patch on any machine with CET binutils

> > > > > since ENDBR32 is nop on none-CET machines.  Without this patch,

> > > > > the test will fail.

> > > >

> > > > Please rephrase the comment. I don't understand what it tries to say.

> > > >

> > >

> > > Here is the patch with updated comments.  OK for master and 8/9 branches?

> > >

> > > Thanks.

> > >

> > > H.J.

> > > ---

> > > Since pass_insert_endbranch inserts ENDBR32 at entry of the target

> > > function if it may be called indirectly, we also need to skip the

> > > 4-byte ENDBR32 for direct jump in trampoline if it is the case.

> >

> > "

> > Skip ENDBR32 at the entry if called indirectly.

> > "

> > pass_insert_endbranch inserts ENDBR at the entry of the target

> > function.  We need to skip

> > >

> > > Tested on Linux/x86-64 CET machine with and without -m32.

> > >

> > > gcc/

> > >

> > >         PR target/93656

> > >         * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at

> > >         the target function entry.

> > >

> > > gcc/testsuite/

> > >

> > >         PR target/93656

> > >         * gcc.target/i386/pr93656.c: New test.

> > > ---

> > >  gcc/config/i386/i386.c                  | 9 ++++++++-

> > >  gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++

> > >  2 files changed, 12 insertions(+), 1 deletion(-)

> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

> > >

> > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c

> > > index 44bc0e0176a..52640b74cc8 100644

> > > --- a/gcc/config/i386/i386.c

> > > +++ b/gcc/config/i386/i386.c

> > > @@ -16839,9 +16839,16 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)

> > >          the stack, we need to skip the first insn which pushes the

> > >          (call-saved) register static chain; this push is 1 byte.  */

> > >        offset += 5;

> > > +      int skip = MEM_P (chain) ? 1 : 0;

> >

> > offset += 5 - MEM_P (chain) ? 1: 0;

>

> This is done on purpose sinc there is

>

>   gcc_assert (offset <= TRAMPOLINE_SIZE);

>

> after it.  We shouldn't update offset.

>

> >

> > > +      /* Since pass_insert_endbranch inserts ENDBR32 at entry of the

> > > +        target function if it may be called indirectly, we also need

> > > +        to skip the 4-byte ENDBR32 if it is the case.  */

> >

> > /* Skip ENDBR32 at the entry of the target function when called indirectly.  */

>

> Fixed.

>

> >

> > > +      if (need_endbr

> > > +         && !cgraph_node::get (fndecl)->only_called_directly_p ())

> > > +       skip += 4;

> >

> > offset += 4;

> >

> > >        disp = expand_binop (SImode, sub_optab, fnaddr,

> > >                            plus_constant (Pmode, XEXP (m_tramp, 0),

> > > -                                         offset - (MEM_P (chain) ? 1 : 0)),

> > > +                                         offset - skip),

> >

> > plus_constant (Pmode, XEXP (m_tramp, 0), offset),

> >

> > Uros.

> >

>

> Here is the updated patch.  OK for master and 8/9 branches?

>

> Thanks.

>

>

> H.J.

> --

> Skip ENDBR32 at the entry of the target function when called indirectly.


Hm, this was sent too early, I was trying to say:

"Skip ENDBR32 at the entry of target function when initializing trampoline."

>

> Tested on Linux/x86-64 CET machine with and without -m32.

>

> gcc/

>

>         PR target/93656

>         * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at

>         the target function entry.

>

> gcc/testsuite/

>

>         PR target/93656

>         * gcc.target/i386/pr93656.c: New test.


LGTM for mainline, please wait a week before backporting.

Uros.

> ---

>  gcc/config/i386/i386.c                  | 8 +++++++-

>  gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++

>  2 files changed, 11 insertions(+), 1 deletion(-)

>  create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

>

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c

> index 44bc0e0176a..3f3dcd53eb2 100644

> --- a/gcc/config/i386/i386.c

> +++ b/gcc/config/i386/i386.c

> @@ -16839,9 +16839,15 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)

>          the stack, we need to skip the first insn which pushes the

>          (call-saved) register static chain; this push is 1 byte.  */

>        offset += 5;

> +      int skip = MEM_P (chain) ? 1 : 0;

> +      /* Skip ENDBR32 at the entry of the target function when called

> +        indirectly.  */


Just say "Skip ENDBR32 at the entry of the target function."

The condition below is clear, the purpose is not.

> +      if (need_endbr

> +         && !cgraph_node::get (fndecl)->only_called_directly_p ())

> +       skip += 4;

>        disp = expand_binop (SImode, sub_optab, fnaddr,

>                            plus_constant (Pmode, XEXP (m_tramp, 0),

> -                                         offset - (MEM_P (chain) ? 1 : 0)),

> +                                         offset - skip),

>                            NULL_RTX, 1, OPTAB_DIRECT);

>        emit_move_insn (mem, disp);

>      }

> diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c b/gcc/testsuite/gcc.target/i386/pr93656.c

> new file mode 100644

> index 00000000000..f0ac8c8edaa

> --- /dev/null

> +++ b/gcc/testsuite/gcc.target/i386/pr93656.c

> @@ -0,0 +1,4 @@

> +/* { dg-do run { target { ia32 && cet } } } */

> +/* { dg-options "-O2 -fcf-protection" } */

> +

> +#include "pr67770.c"

> --

> 2.24.1

>
H.J. Lu Feb. 13, 2020, 1:20 p.m. | #4
On Thu, Feb 13, 2020 at 5:10 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>

> On Thu, Feb 13, 2020 at 1:42 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> >

> > On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote:

> > > On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > >

> > > > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote:

> > > > > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > >

> > > > > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > > > >

> > > > > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > > >

> > > > > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:

> > > > > > > > >

> > > > > > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > > > > >

> > > > > > > > > > Since nested function isn't only called directly, there is ENDBR32 at

> > > > > > > > > > function entry and we need to skip it for direct jump in trampoline.

> > > > > > > > >

> > > > > > > > > Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?

> > > > > > > > >

> > > > > > > >

> > > > > > > > ix86_trampoline_init has

> > > > > > > >

> > > > > > > >      /* Compute offset from the end of the jmp to the target function.

> > > > > > > >          In the case in which the trampoline stores the static chain on

> > > > > > > >          the stack, we need to skip the first insn which pushes the

> > > > > > > >          (call-saved) register static chain; this push is 1 byte.  */

> > > > > > > >       offset += 5;

> > > > > > > >       disp = expand_binop (SImode, sub_optab, fnaddr,

> > > > > > > >                            plus_constant (Pmode, XEXP (m_tramp, 0),

> > > > > > > >                                           offset - (MEM_P (chain) ? 1 : 0)),

> > > > > > > >                            NULL_RTX, 1, OPTAB_DIRECT);

> > > > > > > >       emit_move_insn (mem, disp);

> > > > > > > >

> > > > > > > > Without CET, we got

> > > > > > > >

> > > > > > > > 0000011 <bar.1878>:

> > > > > > > >   11: 56                    push   %esi

> > > > > > > >   12: 55                    push   %ebp   <<<<<< trampoline jumps here.

> > > > > > > >   13: 89 e5                mov    %esp,%ebp

> > > > > > > >   15: 83 ec 08              sub    $0x8,%esp

> > > > > > > >

> > > > > > > > With CET, if bar isn't only called directly, we got

> > > > > > > >

> > > > > > > > 00000015 <bar.1878>:

> > > > > > > >   15: f3 0f 1e fb          endbr32

> > > > > > > >   19: 56                    push   %esi

> > > > > > > >   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.

> > > > > > > >   1b: 89 e5                mov    %esp,%ebp

> > > > > > > >   1d: 83 ec 08              sub    $0x8,%esp

> > > > > > > >

> > > > > > > > We need to add 4 bytes for trampoline to skip endbr32.

> > > > > > > >

> > > > > > > > Here is the updated patch to check if nested function isn't only

> > > > > > > > called directly,

> > > > > > >

> > > > > > > Please figure out the final patch. I don't want to waste my time

> > > > > > > reviewing different version every half hour. Ping me in a couple of

> > > > > > > days.

> > > > > >

> > > > > > This is the final version:

> > > > > >

> > > > > > https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html

> > > > > >

> > > > > > You can try the testcase in the patch on any machine with CET binutils

> > > > > > since ENDBR32 is nop on none-CET machines.  Without this patch,

> > > > > > the test will fail.

> > > > >

> > > > > Please rephrase the comment. I don't understand what it tries to say.

> > > > >

> > > >

> > > > Here is the patch with updated comments.  OK for master and 8/9 branches?

> > > >

> > > > Thanks.

> > > >

> > > > H.J.

> > > > ---

> > > > Since pass_insert_endbranch inserts ENDBR32 at entry of the target

> > > > function if it may be called indirectly, we also need to skip the

> > > > 4-byte ENDBR32 for direct jump in trampoline if it is the case.

> > >

> > > "

> > > Skip ENDBR32 at the entry if called indirectly.

> > > "

> > > pass_insert_endbranch inserts ENDBR at the entry of the target

> > > function.  We need to skip

> > > >

> > > > Tested on Linux/x86-64 CET machine with and without -m32.

> > > >

> > > > gcc/

> > > >

> > > >         PR target/93656

> > > >         * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at

> > > >         the target function entry.

> > > >

> > > > gcc/testsuite/

> > > >

> > > >         PR target/93656

> > > >         * gcc.target/i386/pr93656.c: New test.

> > > > ---

> > > >  gcc/config/i386/i386.c                  | 9 ++++++++-

> > > >  gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++

> > > >  2 files changed, 12 insertions(+), 1 deletion(-)

> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

> > > >

> > > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c

> > > > index 44bc0e0176a..52640b74cc8 100644

> > > > --- a/gcc/config/i386/i386.c

> > > > +++ b/gcc/config/i386/i386.c

> > > > @@ -16839,9 +16839,16 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)

> > > >          the stack, we need to skip the first insn which pushes the

> > > >          (call-saved) register static chain; this push is 1 byte.  */

> > > >        offset += 5;

> > > > +      int skip = MEM_P (chain) ? 1 : 0;

> > >

> > > offset += 5 - MEM_P (chain) ? 1: 0;

> >

> > This is done on purpose sinc there is

> >

> >   gcc_assert (offset <= TRAMPOLINE_SIZE);

> >

> > after it.  We shouldn't update offset.

> >

> > >

> > > > +      /* Since pass_insert_endbranch inserts ENDBR32 at entry of the

> > > > +        target function if it may be called indirectly, we also need

> > > > +        to skip the 4-byte ENDBR32 if it is the case.  */

> > >

> > > /* Skip ENDBR32 at the entry of the target function when called indirectly.  */

> >

> > Fixed.

> >

> > >

> > > > +      if (need_endbr

> > > > +         && !cgraph_node::get (fndecl)->only_called_directly_p ())

> > > > +       skip += 4;

> > >

> > > offset += 4;

> > >

> > > >        disp = expand_binop (SImode, sub_optab, fnaddr,

> > > >                            plus_constant (Pmode, XEXP (m_tramp, 0),

> > > > -                                         offset - (MEM_P (chain) ? 1 : 0)),

> > > > +                                         offset - skip),

> > >

> > > plus_constant (Pmode, XEXP (m_tramp, 0), offset),

> > >

> > > Uros.

> > >

> >

> > Here is the updated patch.  OK for master and 8/9 branches?

> >

> > Thanks.

> >

> >

> > H.J.

> > --

> > Skip ENDBR32 at the entry of the target function when called indirectly.

>

> Hm, this was sent too early, I was trying to say:

>

> "Skip ENDBR32 at the entry of target function when initializing trampoline."

>

> >

> > Tested on Linux/x86-64 CET machine with and without -m32.

> >

> > gcc/

> >

> >         PR target/93656

> >         * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at

> >         the target function entry.

> >

> > gcc/testsuite/

> >

> >         PR target/93656

> >         * gcc.target/i386/pr93656.c: New test.

>

> LGTM for mainline, please wait a week before backporting.

>

> Uros.

>

> > ---

> >  gcc/config/i386/i386.c                  | 8 +++++++-

> >  gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++

> >  2 files changed, 11 insertions(+), 1 deletion(-)

> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

> >

> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c

> > index 44bc0e0176a..3f3dcd53eb2 100644

> > --- a/gcc/config/i386/i386.c

> > +++ b/gcc/config/i386/i386.c

> > @@ -16839,9 +16839,15 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)

> >          the stack, we need to skip the first insn which pushes the

> >          (call-saved) register static chain; this push is 1 byte.  */

> >        offset += 5;

> > +      int skip = MEM_P (chain) ? 1 : 0;

> > +      /* Skip ENDBR32 at the entry of the target function when called

> > +        indirectly.  */

>

> Just say "Skip ENDBR32 at the entry of the target function."

>

> The condition below is clear, the purpose is not.

>


This is the patch I am checking in.  I will backport it next week.

Thanks.

-- 
H.J.
From 66e602cc3d1ada59013f99e2a9a629135c369b22 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Mon, 10 Feb 2020 11:10:52 -0800
Subject: [PATCH] i386: Skip ENDBR32 at the target function entry

Skip ENDBR32 at the target function entry when initializing trampoline.

Tested on Linux/x86-64 CET machine with and without -m32.

gcc/

	PR target/93656
	* config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
	the target function entry.

gcc/testsuite/

	PR target/93656
	* gcc.target/i386/pr93656.c: New test.
---
 gcc/config/i386/i386.c                  | 7 ++++++-
 gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 44bc0e0176a..dac7a3fc5fd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -16839,9 +16839,14 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 	 the stack, we need to skip the first insn which pushes the
 	 (call-saved) register static chain; this push is 1 byte.  */
       offset += 5;
+      int skip = MEM_P (chain) ? 1 : 0;
+      /* Skip ENDBR32 at the entry of the target function.  */
+      if (need_endbr
+	  && !cgraph_node::get (fndecl)->only_called_directly_p ())
+	skip += 4;
       disp = expand_binop (SImode, sub_optab, fnaddr,
 			   plus_constant (Pmode, XEXP (m_tramp, 0),
-					  offset - (MEM_P (chain) ? 1 : 0)),
+					  offset - skip),
 			   NULL_RTX, 1, OPTAB_DIRECT);
       emit_move_insn (mem, disp);
     }
diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c b/gcc/testsuite/gcc.target/i386/pr93656.c
new file mode 100644
index 00000000000..f0ac8c8edaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr93656.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target { ia32 && cet } } } */
+/* { dg-options "-O2 -fcf-protection" } */
+
+#include "pr67770.c"

Patch

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 44bc0e0176a..52640b74cc8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -16839,9 +16839,16 @@  ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 	 the stack, we need to skip the first insn which pushes the
 	 (call-saved) register static chain; this push is 1 byte.  */
       offset += 5;
+      int skip = MEM_P (chain) ? 1 : 0;
+      /* Since pass_insert_endbranch inserts ENDBR32 at entry of the
+	 target function if it may be called indirectly, we also need
+	 to skip the 4-byte ENDBR32 if it is the case.  */
+      if (need_endbr
+	  && !cgraph_node::get (fndecl)->only_called_directly_p ())
+	skip += 4;
       disp = expand_binop (SImode, sub_optab, fnaddr,
 			   plus_constant (Pmode, XEXP (m_tramp, 0),
-					  offset - (MEM_P (chain) ? 1 : 0)),
+					  offset - skip),
 			   NULL_RTX, 1, OPTAB_DIRECT);
       emit_move_insn (mem, disp);
     }
diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c b/gcc/testsuite/gcc.target/i386/pr93656.c
new file mode 100644
index 00000000000..f0ac8c8edaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr93656.c
@@ -0,0 +1,4 @@ 
+/* { dg-do run { target { ia32 && cet } } } */
+/* { dg-options "-O2 -fcf-protection" } */
+
+#include "pr67770.c"