Fix target clone indirection elimination.

Message ID 20200620161646.1821883-2-yyc1992@gmail.com
State New
Headers show
Series
  • Fix target clone indirection elimination.
Related show

Commit Message

Christophe Lyon via Gcc-patches June 20, 2020, 4:16 p.m.
The current logic seems to be comparing the whole attribute tree between the callee
and caller (or at least the tree starting from the target attribute).
This is unnecessary and causes strange dependency of the indirection
elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

This changes the comparison to be only on the `target` attribute which should be
the intent of the code.
---
 gcc/multiple_target.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
2.27.0

Comments

Christophe Lyon via Gcc-patches June 20, 2020, 5:24 p.m. | #1
On Sat, Jun 20, 2020 at 9:17 AM Yichao Yu via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>

> The current logic seems to be comparing the whole attribute tree between the callee

> and caller (or at least the tree starting from the target attribute).

> This is unnecessary and causes strange dependency of the indirection

> elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).


Does it fix PR95780 and PR95778?  Can you include testcases for them?

> This changes the comparison to be only on the `target` attribute which should be

> the intent of the code.

> ---

>  gcc/multiple_target.c | 4 ++--

>  1 file changed, 2 insertions(+), 2 deletions(-)

>

> diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c

> index c1cfe8ff978..9eb0afd58cc 100644

> --- a/gcc/multiple_target.c

> +++ b/gcc/multiple_target.c

> @@ -483,7 +483,7 @@ redirect_to_specific_clone (cgraph_node *node)

>                                             DECL_ATTRIBUTES (e->callee->decl));

>

>        /* Function is not calling proper target clone.  */

> -      if (!attribute_list_equal (attr_target, attr_target2))

> +      if (attr_target2 == NULL_TREE || !attribute_value_equal (attr_target, attr_target2))

>         {

>           while (fv2->prev != NULL)

>             fv2 = fv2->prev;

> @@ -494,7 +494,7 @@ redirect_to_specific_clone (cgraph_node *node)

>               cgraph_node *callee = fv2->this_node;

>               attr_target2 = lookup_attribute ("target",

>                                                DECL_ATTRIBUTES (callee->decl));

> -             if (attribute_list_equal (attr_target, attr_target2))

> +             if (attr_target2 != NULL_TREE && attribute_value_equal (attr_target, attr_target2))

>                 {

>                   e->redirect_callee (callee);

>                   cgraph_edge::redirect_call_stmt_to_callee (e);

> --

> 2.27.0

>



-- 
H.J.
Christophe Lyon via Gcc-patches June 20, 2020, 5:54 p.m. | #2
> > The current logic seems to be comparing the whole attribute tree between the callee

> > and caller (or at least the tree starting from the target attribute).

> > This is unnecessary and causes strange dependency of the indirection

> > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

>

> Does it fix PR95780 and PR95778?  Can you include testcases for them?


Yes it at least partially fixes them. As mentioned in the other email,
this still does not allow inlining but maybe that should be reported
as another issue?

I need to work on the test and might need some help.
Right now I'm not sure how to write such a test (I assume blaming on
the old implementation should tell me where the right test is though).
Also, make -k check somehow gives me some segfaults from LLVM. I was
wondering if it is related to that I was using an out-of-tree non
bootstrap build so I'm recompiling a bootstrap in-source build...

>

> > This changes the comparison to be only on the `target` attribute which should be

> > the intent of the code.

> > ---

> >  gcc/multiple_target.c | 4 ++--

> >  1 file changed, 2 insertions(+), 2 deletions(-)

> >

> > diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c

> > index c1cfe8ff978..9eb0afd58cc 100644

> > --- a/gcc/multiple_target.c

> > +++ b/gcc/multiple_target.c

> > @@ -483,7 +483,7 @@ redirect_to_specific_clone (cgraph_node *node)

> >                                             DECL_ATTRIBUTES (e->callee->decl));

> >

> >        /* Function is not calling proper target clone.  */

> > -      if (!attribute_list_equal (attr_target, attr_target2))

> > +      if (attr_target2 == NULL_TREE || !attribute_value_equal (attr_target, attr_target2))

> >         {

> >           while (fv2->prev != NULL)

> >             fv2 = fv2->prev;

> > @@ -494,7 +494,7 @@ redirect_to_specific_clone (cgraph_node *node)

> >               cgraph_node *callee = fv2->this_node;

> >               attr_target2 = lookup_attribute ("target",

> >                                                DECL_ATTRIBUTES (callee->decl));

> > -             if (attribute_list_equal (attr_target, attr_target2))

> > +             if (attr_target2 != NULL_TREE && attribute_value_equal (attr_target, attr_target2))

> >                 {

> >                   e->redirect_callee (callee);

> >                   cgraph_edge::redirect_call_stmt_to_callee (e);

> > --

> > 2.27.0

> >

>

>

> --

> H.J.
Christophe Lyon via Gcc-patches June 20, 2020, 6:37 p.m. | #3
On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:
>

> > > The current logic seems to be comparing the whole attribute tree between the callee

> > > and caller (or at least the tree starting from the target attribute).

> > > This is unnecessary and causes strange dependency of the indirection

> > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> >

> > Does it fix PR95780 and PR95778?  Can you include testcases for them?


OK so replacing
https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5
with/adding to the test the following one should work. I still
couldn't get test to run though......

```
/* { dg-do compile } */
/* { dg-require-ifunc "" } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

__attribute__ ((flatten,target ("default"),cold))
static unsigned foo(const char *buf, unsigned size) {
 return 1;
}

__attribute__ ((flatten,target ("avx"),cold))
static unsigned foo(const char *buf, unsigned size) {
 return 2;
}

__attribute__ ((target ("default")))
unsigned bar() {
 char buf[4096];
 unsigned acc = 0;
 for (int i = 0; i < sizeof(buf); i++) {
   acc += foo(&buf[i], 1);
 }
 return acc;
}

__attribute__ ((target ("avx")))
unsigned bar() {
 char buf[4096];
 unsigned acc = 0;
 for (int i = 0; i < sizeof(buf); i++) {
   acc += foo(&buf[i], 1);
 }
 return acc;
}

/* { dg-final { scan-tree-dump-times "return 4096;" 1 "optimized" } } */
/* { dg-final { scan-tree-dump-times "return 8192;" 1 "optimized" } } */
```

>

> Yes it at least partially fixes them. As mentioned in the other email,

> this still does not allow inlining but maybe that should be reported

> as another issue?

>

> I need to work on the test and might need some help.

> Right now I'm not sure how to write such a test (I assume blaming on

> the old implementation should tell me where the right test is though).

> Also, make -k check somehow gives me some segfaults from LLVM. I was

> wondering if it is related to that I was using an out-of-tree non

> bootstrap build so I'm recompiling a bootstrap in-source build...

>

> >

> > > This changes the comparison to be only on the `target` attribute which should be

> > > the intent of the code.

> > > ---

> > >  gcc/multiple_target.c | 4 ++--

> > >  1 file changed, 2 insertions(+), 2 deletions(-)

> > >

> > > diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c

> > > index c1cfe8ff978..9eb0afd58cc 100644

> > > --- a/gcc/multiple_target.c

> > > +++ b/gcc/multiple_target.c

> > > @@ -483,7 +483,7 @@ redirect_to_specific_clone (cgraph_node *node)

> > >                                             DECL_ATTRIBUTES (e->callee->decl));

> > >

> > >        /* Function is not calling proper target clone.  */

> > > -      if (!attribute_list_equal (attr_target, attr_target2))

> > > +      if (attr_target2 == NULL_TREE || !attribute_value_equal (attr_target, attr_target2))

> > >         {

> > >           while (fv2->prev != NULL)

> > >             fv2 = fv2->prev;

> > > @@ -494,7 +494,7 @@ redirect_to_specific_clone (cgraph_node *node)

> > >               cgraph_node *callee = fv2->this_node;

> > >               attr_target2 = lookup_attribute ("target",

> > >                                                DECL_ATTRIBUTES (callee->decl));

> > > -             if (attribute_list_equal (attr_target, attr_target2))

> > > +             if (attr_target2 != NULL_TREE && attribute_value_equal (attr_target, attr_target2))

> > >                 {

> > >                   e->redirect_callee (callee);

> > >                   cgraph_edge::redirect_call_stmt_to_callee (e);

> > > --

> > > 2.27.0

> > >

> >

> >

> > --

> > H.J.
Christophe Lyon via Gcc-patches June 20, 2020, 7:11 p.m. | #4
> https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> with/adding to the test the following one should work. I still

> couldn't get test to run though......


And yet another related issue that I think I would appreciate some
help on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95790 . For
`target_clones` usage I can probably just check that the target_clones
attribute matches but I'm not sure how to get the list of targets for
the function when the target attribute is used directly.

Another issue that I've noticed but haven't really debugged yet is
that this elimination doesn't work when target_clones and target are
mixed even in the case where the two generate basically identical code
without this optimization.

>

> ```

> /* { dg-do compile } */

> /* { dg-require-ifunc "" } */

> /* { dg-options "-O2 -fdump-tree-optimized" } */

>

> __attribute__ ((flatten,target ("default"),cold))

> static unsigned foo(const char *buf, unsigned size) {

>  return 1;

> }

>

> __attribute__ ((flatten,target ("avx"),cold))

> static unsigned foo(const char *buf, unsigned size) {

>  return 2;

> }

>

> __attribute__ ((target ("default")))

> unsigned bar() {

>  char buf[4096];

>  unsigned acc = 0;

>  for (int i = 0; i < sizeof(buf); i++) {

>    acc += foo(&buf[i], 1);

>  }

>  return acc;

> }

>

> __attribute__ ((target ("avx")))

> unsigned bar() {

>  char buf[4096];

>  unsigned acc = 0;

>  for (int i = 0; i < sizeof(buf); i++) {

>    acc += foo(&buf[i], 1);

>  }

>  return acc;

> }

>

> /* { dg-final { scan-tree-dump-times "return 4096;" 1 "optimized" } } */

> /* { dg-final { scan-tree-dump-times "return 8192;" 1 "optimized" } } */

> ```

>

> >

> > Yes it at least partially fixes them. As mentioned in the other email,

> > this still does not allow inlining but maybe that should be reported

> > as another issue?

> >

> > I need to work on the test and might need some help.

> > Right now I'm not sure how to write such a test (I assume blaming on

> > the old implementation should tell me where the right test is though).

> > Also, make -k check somehow gives me some segfaults from LLVM. I was

> > wondering if it is related to that I was using an out-of-tree non

> > bootstrap build so I'm recompiling a bootstrap in-source build...

> >

> > >

> > > > This changes the comparison to be only on the `target` attribute which should be

> > > > the intent of the code.

> > > > ---

> > > >  gcc/multiple_target.c | 4 ++--

> > > >  1 file changed, 2 insertions(+), 2 deletions(-)

> > > >

> > > > diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c

> > > > index c1cfe8ff978..9eb0afd58cc 100644

> > > > --- a/gcc/multiple_target.c

> > > > +++ b/gcc/multiple_target.c

> > > > @@ -483,7 +483,7 @@ redirect_to_specific_clone (cgraph_node *node)

> > > >                                             DECL_ATTRIBUTES (e->callee->decl));

> > > >

> > > >        /* Function is not calling proper target clone.  */

> > > > -      if (!attribute_list_equal (attr_target, attr_target2))

> > > > +      if (attr_target2 == NULL_TREE || !attribute_value_equal (attr_target, attr_target2))

> > > >         {

> > > >           while (fv2->prev != NULL)

> > > >             fv2 = fv2->prev;

> > > > @@ -494,7 +494,7 @@ redirect_to_specific_clone (cgraph_node *node)

> > > >               cgraph_node *callee = fv2->this_node;

> > > >               attr_target2 = lookup_attribute ("target",

> > > >                                                DECL_ATTRIBUTES (callee->decl));

> > > > -             if (attribute_list_equal (attr_target, attr_target2))

> > > > +             if (attr_target2 != NULL_TREE && attribute_value_equal (attr_target, attr_target2))

> > > >                 {

> > > >                   e->redirect_callee (callee);

> > > >                   cgraph_edge::redirect_call_stmt_to_callee (e);

> > > > --

> > > > 2.27.0

> > > >

> > >

> > >

> > > --

> > > H.J.
Christophe Lyon via Gcc-patches June 20, 2020, 7:12 p.m. | #5
On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:
>

> On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> >

> > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > and caller (or at least the tree starting from the target attribute).

> > > > This is unnecessary and causes strange dependency of the indirection

> > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > >

> > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

>

> OK so replacing

> https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> with/adding to the test the following one should work. I still

> couldn't get test to run though......

>

Can you try the enclosed testcases?

>

> >

> > Yes it at least partially fixes them. As mentioned in the other email,

> > this still does not allow inlining but maybe that should be reported

> > as another issue?


Please open a separate bug.

> > I need to work on the test and might need some help.

> > Right now I'm not sure how to write such a test (I assume blaming on

> > the old implementation should tell me where the right test is though).

> > Also, make -k check somehow gives me some segfaults from LLVM. I was

> > wondering if it is related to that I was using an out-of-tree non

> > bootstrap build so I'm recompiling a bootstrap in-source build...

> >

> > >

> > > > This changes the comparison to be only on the `target` attribute which should be

> > > > the intent of the code.

> > > > ---

> > > >  gcc/multiple_target.c | 4 ++--

> > > >  1 file changed, 2 insertions(+), 2 deletions(-)

> > > >

> > > > diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c

> > > > index c1cfe8ff978..9eb0afd58cc 100644

> > > > --- a/gcc/multiple_target.c

> > > > +++ b/gcc/multiple_target.c

> > > > @@ -483,7 +483,7 @@ redirect_to_specific_clone (cgraph_node *node)

> > > >                                             DECL_ATTRIBUTES (e->callee->decl));

> > > >

> > > >        /* Function is not calling proper target clone.  */

> > > > -      if (!attribute_list_equal (attr_target, attr_target2))

> > > > +      if (attr_target2 == NULL_TREE || !attribute_value_equal (attr_target, attr_target2))

> > > >         {

> > > >           while (fv2->prev != NULL)

> > > >             fv2 = fv2->prev;

> > > > @@ -494,7 +494,7 @@ redirect_to_specific_clone (cgraph_node *node)

> > > >               cgraph_node *callee = fv2->this_node;

> > > >               attr_target2 = lookup_attribute ("target",

> > > >                                                DECL_ATTRIBUTES (callee->decl));

> > > > -             if (attribute_list_equal (attr_target, attr_target2))

> > > > +             if (attr_target2 != NULL_TREE && attribute_value_equal (attr_target, attr_target2))

> > > >                 {

> > > >                   e->redirect_callee (callee);

> > > >                   cgraph_edge::redirect_call_stmt_to_callee (e);

> > > > --

> > > > 2.27.0

> > > >

> > >

> > >

> > > --

> > > H.J.




-- 
H.J.
/* { dg-do compile { target fpic } } */
/* { dg-options "-O3 -fPIC -fno-asynchronous-unwind-tables" } */
/* { dg-require-ifunc "" } */

__attribute__((target_clones("default,avx2")))
static int
f2(int *p)
{
  asm volatile ("" :: "r"(p) : "memory");
  return *p;
}

__attribute__((target_clones("default,avx2")))
int
g2(int *p)
{
  return f2(p);
}

/* { dg-final { scan-assembler "g2.default.1:\n\tjmp\tf2.default.1\n" } } */
/* { dg-final { scan-assembler "g2.avx2.0:\n\tjmp\tf2.avx2.0\n" } } */
/* { dg-do compile { target fpic } } */
/* { dg-options "-O3 -fPIC -fno-asynchronous-unwind-tables" } */
/* { dg-require-ifunc "" } */

__attribute__((visibility("internal"),target_clones("default,avx2")))
int
f2(int *p)
{
  asm volatile ("" :: "r"(p) : "memory");
  return *p;
}

__attribute__((target_clones("default,avx2")))
int
g2(int *p)
{
  return f2(p);
}

/* { dg-final { scan-assembler "g2.default.1:\n\tjmp\tf2.default.1\n" } } */
/* { dg-final { scan-assembler "g2.avx2.0:\n\tjmp\tf2.avx2.0\n" } } */
Christophe Lyon via Gcc-patches June 20, 2020, 7:20 p.m. | #6
On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>

> On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> >

> > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > >

> > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > and caller (or at least the tree starting from the target attribute).

> > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > >

> > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> >

> > OK so replacing

> > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > with/adding to the test the following one should work. I still

> > couldn't get test to run though......

> >

> Can you try the enclosed testcases?



The assembly produced are the following with my patch and if I
understand it correctly those should work. Unfortunately I don't know
how to actually run the test as a test (if that makes sense....).

yuyichao% PATH=~/projects/contrib/gcc/host-x86_64-pc-linux-gnu/gcc
~/projects/contrib/gcc/host-
x86_64-pc-linux-gnu/gcc/xg++ -O3 -fPIC pr95778-1.c -S -o -
-fno-asynchronous-unwind-tables -g0
-fno-exceptions
       .file   "pr95778-1.c"
       .text
       .p2align 4
       .type   _ZL2f2Pi.default.1, @function
_ZL2f2Pi.default.1:
       movl    (%rdi), %eax
       ret
       .size   _ZL2f2Pi.default.1, .-_ZL2f2Pi.default.1
       .p2align 4
       .type   _Z2g2Pi.default.1, @function
_Z2g2Pi.default.1:
       jmp     _ZL2f2Pi.default.1
       .size   _Z2g2Pi.default.1, .-_Z2g2Pi.default.1
       .p2align 4
       .type   _ZL2f2Pi.avx2.0, @function
_ZL2f2Pi.avx2.0:
       movl    (%rdi), %eax
       ret
       .size   _ZL2f2Pi.avx2.0, .-_ZL2f2Pi.avx2.0
       .p2align 4
       .type   _Z2g2Pi.avx2.0, @function
_Z2g2Pi.avx2.0:
       jmp     _ZL2f2Pi.avx2.0
       .size   _Z2g2Pi.avx2.0, .-_Z2g2Pi.avx2.0
       .section
.text._Z2g2Pi.resolver,"axG",@progbits,_Z2g2Pi.resolver,comdat
       .p2align 4
       .weak   _Z2g2Pi.resolver
       .type   _Z2g2Pi.resolver, @function
_Z2g2Pi.resolver:
       subq    $8, %rsp
       call    __cpu_indicator_init@PLT
       movq    __cpu_model@GOTPCREL(%rip), %rax
       leaq    _Z2g2Pi.avx2.0(%rip), %rdx
       testb   $4, 13(%rax)
       leaq    _Z2g2Pi.default.1(%rip), %rax
       cmovne  %rdx, %rax
       addq    $8, %rsp
       ret
       .size   _Z2g2Pi.resolver, .-_Z2g2Pi.resolver
       .globl  _Z2g2Pi
       .type   _Z2g2Pi, @gnu_indirect_function
       .set    _Z2g2Pi,_Z2g2Pi.resolver
       .text
       .p2align 4
       .type   _ZL2f2Pi.resolver, @function
_ZL2f2Pi.resolver:
       subq    $8, %rsp
       call    __cpu_indicator_init@PLT
       movq    __cpu_model@GOTPCREL(%rip), %rax
       leaq    _ZL2f2Pi.avx2.0(%rip), %rdx
       testb   $4, 13(%rax)
       leaq    _ZL2f2Pi.default.1(%rip), %rax
       cmovne  %rdx, %rax
       addq    $8, %rsp
       ret
       .size   _ZL2f2Pi.resolver, .-_ZL2f2Pi.resolver
       .ident  "GCC: (GNU) 11.0.0 20200620 (experimental)"
       .section        .note.GNU-stack,"",@progbits
office:~
yuyichao% PATH=~/projects/contrib/gcc/host-x86_64-pc-linux-gnu/gcc
~/projects/contrib/gcc/host-
x86_64-pc-linux-gnu/gcc/xg++ -O3 -fPIC pr95778-2.c -S -o -
-fno-asynchronous-unwind-tables -g0
-fno-exceptions
       .file   "pr95778-2.c"
       .text
       .p2align 4
       .type   _Z2f2Pi.default.1, @function
_Z2f2Pi.default.1:
       movl    (%rdi), %eax
       ret
       .size   _Z2f2Pi.default.1, .-_Z2f2Pi.default.1
       .p2align 4
       .type   _Z2g2Pi.default.1, @function
_Z2g2Pi.default.1:
       jmp     _Z2f2Pi.default.1
       .size   _Z2g2Pi.default.1, .-_Z2g2Pi.default.1
       .p2align 4
       .type   _Z2f2Pi.avx2.0, @function
_Z2f2Pi.avx2.0:
       movl    (%rdi), %eax
       ret
       .size   _Z2f2Pi.avx2.0, .-_Z2f2Pi.avx2.0
       .p2align 4
       .type   _Z2g2Pi.avx2.0, @function
_Z2g2Pi.avx2.0:
       jmp     _Z2f2Pi.avx2.0
       .size   _Z2g2Pi.avx2.0, .-_Z2g2Pi.avx2.0
       .section
.text._Z2g2Pi.resolver,"axG",@progbits,_Z2g2Pi.resolver,comdat
       .p2align 4
       .weak   _Z2g2Pi.resolver
       .type   _Z2g2Pi.resolver, @function
_Z2g2Pi.resolver:
       subq    $8, %rsp
       call    __cpu_indicator_init@PLT
       movq    __cpu_model@GOTPCREL(%rip), %rax
       leaq    _Z2g2Pi.avx2.0(%rip), %rdx
       testb   $4, 13(%rax)
       leaq    _Z2g2Pi.default.1(%rip), %rax
       cmovne  %rdx, %rax
       addq    $8, %rsp
       ret
       .size   _Z2g2Pi.resolver, .-_Z2g2Pi.resolver
       .globl  _Z2g2Pi
       .type   _Z2g2Pi, @gnu_indirect_function
       .set    _Z2g2Pi,_Z2g2Pi.resolver
       .section
.text._Z2f2Pi.resolver,"axG",@progbits,_Z2f2Pi.resolver,comdat
       .p2align 4
       .weak   _Z2f2Pi.resolver
       .type   _Z2f2Pi.resolver, @function
_Z2f2Pi.resolver:
       subq    $8, %rsp
       call    __cpu_indicator_init@PLT
       movq    __cpu_model@GOTPCREL(%rip), %rax
       leaq    _Z2f2Pi.avx2.0(%rip), %rdx
       testb   $4, 13(%rax)
       leaq    _Z2f2Pi.default.1(%rip), %rax
       cmovne  %rdx, %rax
       addq    $8, %rsp
       ret
       .size   _Z2f2Pi.resolver, .-_Z2f2Pi.resolver
       .globl  _Z2f2Pi
       .type   _Z2f2Pi, @gnu_indirect_function
       .set    _Z2f2Pi,_Z2f2Pi.resolver
       .ident  "GCC: (GNU) 11.0.0 20200620 (experimental)"
       .section        .note.GNU-stack,"",@progbits




>

> >

> > >

> > > Yes it at least partially fixes them. As mentioned in the other email,

> > > this still does not allow inlining but maybe that should be reported

> > > as another issue?

>

> Please open a separate bug.


OK, will do. thanks.

>

> > > I need to work on the test and might need some help.

> > > Right now I'm not sure how to write such a test (I assume blaming on

> > > the old implementation should tell me where the right test is though).

> > > Also, make -k check somehow gives me some segfaults from LLVM. I was

> > > wondering if it is related to that I was using an out-of-tree non

> > > bootstrap build so I'm recompiling a bootstrap in-source build...

> > >

> > > >

> > > > > This changes the comparison to be only on the `target` attribute which should be

> > > > > the intent of the code.

> > > > > ---

> > > > >  gcc/multiple_target.c | 4 ++--

> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)

> > > > >

> > > > > diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c

> > > > > index c1cfe8ff978..9eb0afd58cc 100644

> > > > > --- a/gcc/multiple_target.c

> > > > > +++ b/gcc/multiple_target.c

> > > > > @@ -483,7 +483,7 @@ redirect_to_specific_clone (cgraph_node *node)

> > > > >                                             DECL_ATTRIBUTES (e->callee->decl));

> > > > >

> > > > >        /* Function is not calling proper target clone.  */

> > > > > -      if (!attribute_list_equal (attr_target, attr_target2))

> > > > > +      if (attr_target2 == NULL_TREE || !attribute_value_equal (attr_target, attr_target2))

> > > > >         {

> > > > >           while (fv2->prev != NULL)

> > > > >             fv2 = fv2->prev;

> > > > > @@ -494,7 +494,7 @@ redirect_to_specific_clone (cgraph_node *node)

> > > > >               cgraph_node *callee = fv2->this_node;

> > > > >               attr_target2 = lookup_attribute ("target",

> > > > >                                                DECL_ATTRIBUTES (callee->decl));

> > > > > -             if (attribute_list_equal (attr_target, attr_target2))

> > > > > +             if (attr_target2 != NULL_TREE && attribute_value_equal (attr_target, attr_target2))

> > > > >                 {

> > > > >                   e->redirect_callee (callee);

> > > > >                   cgraph_edge::redirect_call_stmt_to_callee (e);

> > > > > --

> > > > > 2.27.0

> > > > >

> > > >

> > > >

> > > > --

> > > > H.J.

>

>

>

> --

> H.J.
Christophe Lyon via Gcc-patches June 20, 2020, 7:24 p.m. | #7
On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu <yyc1992@gmail.com> wrote:
>

> On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> >

> > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> > >

> > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > >

> > > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > > and caller (or at least the tree starting from the target attribute).

> > > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > > >

> > > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> > >

> > > OK so replacing

> > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > > with/adding to the test the following one should work. I still

> > > couldn't get test to run though......

> > >

> > Can you try the enclosed testcases?

>

>

> The assembly produced are the following with my patch and if I

> understand it correctly those should work. Unfortunately I don't know

> how to actually run the test as a test (if that makes sense....).

>


Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build directory, do

$ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'
i386.exp=pr95778-*.c"

They should fail without your patch and pass with your patch.


-- 
H.J.
Christophe Lyon via Gcc-patches June 20, 2020, 7:26 p.m. | #8
On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>

> On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu <yyc1992@gmail.com> wrote:

> >

> > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > >

> > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> > > >

> > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > >

> > > > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > > > and caller (or at least the tree starting from the target attribute).

> > > > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > > > >

> > > > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> > > >

> > > > OK so replacing

> > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > > > with/adding to the test the following one should work. I still

> > > > couldn't get test to run though......

> > > >

> > > Can you try the enclosed testcases?

> >

> >

> > The assembly produced are the following with my patch and if I

> > understand it correctly those should work. Unfortunately I don't know

> > how to actually run the test as a test (if that makes sense....).

> >

>

> Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build directory, do

>

> $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'

> i386.exp=pr95778-*.c"

>

> They should fail without your patch and pass with your patch.


Thanks. (need to run now will be back in a hour or two)

>

>

> --

> H.J.
Christophe Lyon via Gcc-patches June 20, 2020, 7:41 p.m. | #9
On Sat, Jun 20, 2020 at 3:26 PM Yichao Yu <yyc1992@gmail.com> wrote:
>

> On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> >

> > On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > >

> > > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > >

> > > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > >

> > > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > >

> > > > > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > > > > and caller (or at least the tree starting from the target attribute).

> > > > > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > > > > >

> > > > > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> > > > >

> > > > > OK so replacing

> > > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > > > > with/adding to the test the following one should work. I still

> > > > > couldn't get test to run though......

> > > > >

> > > > Can you try the enclosed testcases?

> > >

> > >

> > > The assembly produced are the following with my patch and if I

> > > understand it correctly those should work. Unfortunately I don't know

> > > how to actually run the test as a test (if that makes sense....).

> > >

> >

> > Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build directory, do

> >

> > $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'

> > i386.exp=pr95778-*.c"

> >


Finally got it start running after installing dejagnu...
It seems to be runing something unrelated though....

e.g. Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp
...
Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp
...

> > They should fail without your patch and pass with your patch.

>

> Thanks. (need to run now will be back in a hour or two)

>

> >

> >

> > --

> > H.J.
Christophe Lyon via Gcc-patches June 21, 2020, 12:16 a.m. | #10
On Sat, Jun 20, 2020 at 3:41 PM Yichao Yu <yyc1992@gmail.com> wrote:
>

> On Sat, Jun 20, 2020 at 3:26 PM Yichao Yu <yyc1992@gmail.com> wrote:

> >

> > On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > >

> > > On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > >

> > > > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > >

> > > > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > >

> > > > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > >

> > > > > > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > > > > > and caller (or at least the tree starting from the target attribute).

> > > > > > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > > > > > >

> > > > > > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> > > > > >

> > > > > > OK so replacing

> > > > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > > > > > with/adding to the test the following one should work. I still

> > > > > > couldn't get test to run though......

> > > > > >

> > > > > Can you try the enclosed testcases?

> > > >

> > > >

> > > > The assembly produced are the following with my patch and if I

> > > > understand it correctly those should work. Unfortunately I don't know

> > > > how to actually run the test as a test (if that makes sense....).

> > > >

> > >

> > > Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build directory, do

> > >

> > > $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'

> > > i386.exp=pr95778-*.c"

> > >

>

> Finally got it start running after installing dejagnu...

> It seems to be runing something unrelated though....

>

> e.g. Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp

> ...

> Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp


OK it seems that the `-*.c` syntax doesn't work for me somehow, there
are still a lot of unrelated outputs but when I replaced the * with 1
and 2 separately I can confirm that there are only expected pass with
my patch and there are some unexpected failures when the patch is
reverted.

> ...

>

> > > They should fail without your patch and pass with your patch.

> >

> > Thanks. (need to run now will be back in a hour or two)

> >

> > >

> > >

> > > --

> > > H.J.
Christophe Lyon via Gcc-patches June 21, 2020, 12:31 a.m. | #11
On Sat, Jun 20, 2020 at 8:16 PM Yichao Yu <yyc1992@gmail.com> wrote:
>

> On Sat, Jun 20, 2020 at 3:41 PM Yichao Yu <yyc1992@gmail.com> wrote:

> >

> > On Sat, Jun 20, 2020 at 3:26 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > >

> > > On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > >

> > > > On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > >

> > > > > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > >

> > > > > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > >

> > > > > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > >

> > > > > > > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > > > > > > and caller (or at least the tree starting from the target attribute).

> > > > > > > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > > > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > > > > > > >

> > > > > > > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> > > > > > >

> > > > > > > OK so replacing

> > > > > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > > > > > > with/adding to the test the following one should work. I still

> > > > > > > couldn't get test to run though......

> > > > > > >

> > > > > > Can you try the enclosed testcases?

> > > > >

> > > > >

> > > > > The assembly produced are the following with my patch and if I

> > > > > understand it correctly those should work. Unfortunately I don't know

> > > > > how to actually run the test as a test (if that makes sense....).

> > > > >

> > > >

> > > > Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build directory, do

> > > >

> > > > $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'

> > > > i386.exp=pr95778-*.c"

> > > >

> >

> > Finally got it start running after installing dejagnu...

> > It seems to be runing something unrelated though....

> >

> > e.g. Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp

> > ...

> > Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp

>

> OK it seems that the `-*.c` syntax doesn't work for me somehow, there

> are still a lot of unrelated outputs but when I replaced the * with 1

> and 2 separately I can confirm that there are only expected pass with

> my patch and there are some unexpected failures when the patch is

> reverted.


And the updated patch is

From b219facb0960cd09bbcef72e9c03cc045474a8e6 Mon Sep 17 00:00:00 2001
From: Yichao Yu <yyc1992@gmail.com>

Date: Sat, 20 Jun 2020 11:59:36 -0400
Subject: [PATCH] Fix target clone indirection elimination.

The current logic seems to be comparing the whole attribute tree
between the callee
and caller (or at least the tree starting from the target attribute).
This is unnecessary and causes strange dependency of the indirection
elimination on unrelated properties like `noinline`(PR95780) and
`visibility`(PR95778).

This changes the comparison to be only on the `target` attribute which should be
the intent of the code.
---
gcc/multiple_target.c                     |  4 ++--
gcc/testsuite/gcc.target/i386/pr95778-1.c | 21 +++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr95778-2.c | 21 +++++++++++++++++++++
3 files changed, 44 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr95778-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr95778-2.c

diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index c1cfe8ff978..9eb0afd58cc 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -483,7 +483,7 @@ redirect_to_specific_clone (cgraph_node *node)
                                           DECL_ATTRIBUTES (e->callee->decl));

      /* Function is not calling proper target clone.  */
-      if (!attribute_list_equal (attr_target, attr_target2))
+      if (attr_target2 == NULL_TREE || !attribute_value_equal
(attr_target, attr_target2))
       {
         while (fv2->prev != NULL)
           fv2 = fv2->prev;
@@ -494,7 +494,7 @@ redirect_to_specific_clone (cgraph_node *node)
             cgraph_node *callee = fv2->this_node;
             attr_target2 = lookup_attribute ("target",
                                              DECL_ATTRIBUTES (callee->decl));
-             if (attribute_list_equal (attr_target, attr_target2))
+             if (attr_target2 != NULL_TREE && attribute_value_equal
(attr_target, attr_target2
))
               {
                 e->redirect_callee (callee);
                 cgraph_edge::redirect_call_stmt_to_callee (e);
diff --git a/gcc/testsuite/gcc.target/i386/pr95778-1.c
b/gcc/testsuite/gcc.target/i386/pr95778-
1.c
new file mode 100644
index 00000000000..3238303d696
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr95778-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target fpic } } */
+/* { dg-options "-O3 -fPIC -fno-asynchronous-unwind-tables" } */
+/* { dg-require-ifunc "" } */
+
+__attribute__((target_clones("default,avx2")))
+static int
+f2(int *p)
+{
+  asm volatile ("" :: "r"(p) : "memory");
+  return *p;
+}
+
+__attribute__((target_clones("default,avx2")))
+int
+g2(int *p)
+{
+  return f2(p);
+}
+
+/* { dg-final { scan-assembler "g2.default.1:\n\tjmp\tf2.default.1\n" } } */
+/* { dg-final { scan-assembler "g2.avx2.0:\n\tjmp\tf2.avx2.0\n" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr95778-2.c
b/gcc/testsuite/gcc.target/i386/pr95778-
2.c
new file mode 100644
index 00000000000..e88702d2b82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr95778-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target fpic } } */
+/* { dg-options "-O3 -fPIC -fno-asynchronous-unwind-tables" } */
+/* { dg-require-ifunc "" } */
+
+__attribute__((visibility("internal"),target_clones("default,avx2")))
+int
+f2(int *p)
+{
+  asm volatile ("" :: "r"(p) : "memory");
+  return *p;
+}
+
+__attribute__((target_clones("default,avx2")))
+int
+g2(int *p)
+{
+  return f2(p);
+}
+
+/* { dg-final { scan-assembler "g2.default.1:\n\tjmp\tf2.default.1\n" } } */
+/* { dg-final { scan-assembler "g2.avx2.0:\n\tjmp\tf2.avx2.0\n" } } */
--
2.27.0
Christophe Lyon via Gcc-patches June 22, 2020, 9:24 a.m. | #12
On Sun, Jun 21, 2020 at 2:32 AM Yichao Yu via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>

> On Sat, Jun 20, 2020 at 8:16 PM Yichao Yu <yyc1992@gmail.com> wrote:

> >

> > On Sat, Jun 20, 2020 at 3:41 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > >

> > > On Sat, Jun 20, 2020 at 3:26 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > >

> > > > On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > >

> > > > > On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > >

> > > > > > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > >

> > > > > > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > >

> > > > > > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > > >

> > > > > > > > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > > > > > > > and caller (or at least the tree starting from the target attribute).

> > > > > > > > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > > > > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > > > > > > > >

> > > > > > > > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> > > > > > > >

> > > > > > > > OK so replacing

> > > > > > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > > > > > > > with/adding to the test the following one should work. I still

> > > > > > > > couldn't get test to run though......

> > > > > > > >

> > > > > > > Can you try the enclosed testcases?

> > > > > >

> > > > > >

> > > > > > The assembly produced are the following with my patch and if I

> > > > > > understand it correctly those should work. Unfortunately I don't know

> > > > > > how to actually run the test as a test (if that makes sense....).

> > > > > >

> > > > >

> > > > > Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build directory, do

> > > > >

> > > > > $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'

> > > > > i386.exp=pr95778-*.c"

> > > > >

> > >

> > > Finally got it start running after installing dejagnu...

> > > It seems to be runing something unrelated though....

> > >

> > > e.g. Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp

> > > ...

> > > Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp

> >

> > OK it seems that the `-*.c` syntax doesn't work for me somehow, there

> > are still a lot of unrelated outputs but when I replaced the * with 1

> > and 2 separately I can confirm that there are only expected pass with

> > my patch and there are some unexpected failures when the patch is

> > reverted.

>

> And the updated patch is


OK.

Thanks,
Richard,

> From b219facb0960cd09bbcef72e9c03cc045474a8e6 Mon Sep 17 00:00:00 2001

> From: Yichao Yu <yyc1992@gmail.com>

> Date: Sat, 20 Jun 2020 11:59:36 -0400

> Subject: [PATCH] Fix target clone indirection elimination.

>

> The current logic seems to be comparing the whole attribute tree

> between the callee

> and caller (or at least the tree starting from the target attribute).

> This is unnecessary and causes strange dependency of the indirection

> elimination on unrelated properties like `noinline`(PR95780) and

> `visibility`(PR95778).

>

> This changes the comparison to be only on the `target` attribute which should be

> the intent of the code.

> ---

> gcc/multiple_target.c                     |  4 ++--

> gcc/testsuite/gcc.target/i386/pr95778-1.c | 21 +++++++++++++++++++++

> gcc/testsuite/gcc.target/i386/pr95778-2.c | 21 +++++++++++++++++++++

> 3 files changed, 44 insertions(+), 2 deletions(-)

> create mode 100644 gcc/testsuite/gcc.target/i386/pr95778-1.c

> create mode 100644 gcc/testsuite/gcc.target/i386/pr95778-2.c

>

> diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c

> index c1cfe8ff978..9eb0afd58cc 100644

> --- a/gcc/multiple_target.c

> +++ b/gcc/multiple_target.c

> @@ -483,7 +483,7 @@ redirect_to_specific_clone (cgraph_node *node)

>                                            DECL_ATTRIBUTES (e->callee->decl));

>

>       /* Function is not calling proper target clone.  */

> -      if (!attribute_list_equal (attr_target, attr_target2))

> +      if (attr_target2 == NULL_TREE || !attribute_value_equal

> (attr_target, attr_target2))

>        {

>          while (fv2->prev != NULL)

>            fv2 = fv2->prev;

> @@ -494,7 +494,7 @@ redirect_to_specific_clone (cgraph_node *node)

>              cgraph_node *callee = fv2->this_node;

>              attr_target2 = lookup_attribute ("target",

>                                               DECL_ATTRIBUTES (callee->decl));

> -             if (attribute_list_equal (attr_target, attr_target2))

> +             if (attr_target2 != NULL_TREE && attribute_value_equal

> (attr_target, attr_target2

> ))

>                {

>                  e->redirect_callee (callee);

>                  cgraph_edge::redirect_call_stmt_to_callee (e);

> diff --git a/gcc/testsuite/gcc.target/i386/pr95778-1.c

> b/gcc/testsuite/gcc.target/i386/pr95778-

> 1.c

> new file mode 100644

> index 00000000000..3238303d696

> --- /dev/null

> +++ b/gcc/testsuite/gcc.target/i386/pr95778-1.c

> @@ -0,0 +1,21 @@

> +/* { dg-do compile { target fpic } } */

> +/* { dg-options "-O3 -fPIC -fno-asynchronous-unwind-tables" } */

> +/* { dg-require-ifunc "" } */

> +

> +__attribute__((target_clones("default,avx2")))

> +static int

> +f2(int *p)

> +{

> +  asm volatile ("" :: "r"(p) : "memory");

> +  return *p;

> +}

> +

> +__attribute__((target_clones("default,avx2")))

> +int

> +g2(int *p)

> +{

> +  return f2(p);

> +}

> +

> +/* { dg-final { scan-assembler "g2.default.1:\n\tjmp\tf2.default.1\n" } } */

> +/* { dg-final { scan-assembler "g2.avx2.0:\n\tjmp\tf2.avx2.0\n" } } */

> diff --git a/gcc/testsuite/gcc.target/i386/pr95778-2.c

> b/gcc/testsuite/gcc.target/i386/pr95778-

> 2.c

> new file mode 100644

> index 00000000000..e88702d2b82

> --- /dev/null

> +++ b/gcc/testsuite/gcc.target/i386/pr95778-2.c

> @@ -0,0 +1,21 @@

> +/* { dg-do compile { target fpic } } */

> +/* { dg-options "-O3 -fPIC -fno-asynchronous-unwind-tables" } */

> +/* { dg-require-ifunc "" } */

> +

> +__attribute__((visibility("internal"),target_clones("default,avx2")))

> +int

> +f2(int *p)

> +{

> +  asm volatile ("" :: "r"(p) : "memory");

> +  return *p;

> +}

> +

> +__attribute__((target_clones("default,avx2")))

> +int

> +g2(int *p)

> +{

> +  return f2(p);

> +}

> +

> +/* { dg-final { scan-assembler "g2.default.1:\n\tjmp\tf2.default.1\n" } } */

> +/* { dg-final { scan-assembler "g2.avx2.0:\n\tjmp\tf2.avx2.0\n" } } */

> --

> 2.27.0
Christophe Lyon via Gcc-patches June 26, 2020, 9:50 p.m. | #13
On Mon, 2020-06-22 at 11:24 +0200, Richard Biener via Gcc-patches wrote:
> On Sun, Jun 21, 2020 at 2:32 AM Yichao Yu via Gcc-patches

> <gcc-patches@gcc.gnu.org> wrote:

> > On Sat, Jun 20, 2020 at 8:16 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > On Sat, Jun 20, 2020 at 3:41 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > On Sat, Jun 20, 2020 at 3:26 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > > > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > > > > > > > > and caller (or at least the tree starting from the target attribute).

> > > > > > > > > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > > > > > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > > > > > > > > > 

> > > > > > > > > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> > > > > > > > > 

> > > > > > > > > OK so replacing

> > > > > > > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > > > > > > > > with/adding to the test the following one should work. I still

> > > > > > > > > couldn't get test to run though......

> > > > > > > > > 

> > > > > > > > Can you try the enclosed testcases?

> > > > > > > 

> > > > > > > The assembly produced are the following with my patch and if I

> > > > > > > understand it correctly those should work. Unfortunately I don't know

> > > > > > > how to actually run the test as a test (if that makes sense....).

> > > > > > > 

> > > > > > 

> > > > > > Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build directory, do

> > > > > > 

> > > > > > $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'

> > > > > > i386.exp=pr95778-*.c"

> > > > > > 

> > > > 

> > > > Finally got it start running after installing dejagnu...

> > > > It seems to be runing something unrelated though....

> > > > 

> > > > e.g. Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp

> > > > ...

> > > > Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp

> > > 

> > > OK it seems that the `-*.c` syntax doesn't work for me somehow, there

> > > are still a lot of unrelated outputs but when I replaced the * with 1

> > > and 2 separately I can confirm that there are only expected pass with

> > > my patch and there are some unexpected failures when the patch is

> > > reverted.

> > 

> > And the updated patch is

> 

> OK.

I fixed up the obvious formatting goofs and constructed a ChangeLog and have
pushed the change to the trunk.
jeff
>
Christophe Lyon via Gcc-patches June 27, 2020, 1:34 a.m. | #14
On Fri, Jun 26, 2020 at 5:51 PM Jeff Law <law@redhat.com> wrote:
>

> On Mon, 2020-06-22 at 11:24 +0200, Richard Biener via Gcc-patches wrote:

> > On Sun, Jun 21, 2020 at 2:32 AM Yichao Yu via Gcc-patches

> > <gcc-patches@gcc.gnu.org> wrote:

> > > On Sat, Jun 20, 2020 at 8:16 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > On Sat, Jun 20, 2020 at 3:41 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > On Sat, Jun 20, 2020 at 3:26 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > > On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu <hjl.tools@gmail.com> wrote:

> > > > > > > > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu <yyc1992@gmail.com> wrote:

> > > > > > > > > > > > > The current logic seems to be comparing the whole attribute tree between the callee

> > > > > > > > > > > > > and caller (or at least the tree starting from the target attribute).

> > > > > > > > > > > > > This is unnecessary and causes strange dependency of the indirection

> > > > > > > > > > > > > elimination on unrelated properties like `noinline`(PR95780) and `visibility`(PR95778).

> > > > > > > > > > > >

> > > > > > > > > > > > Does it fix PR95780 and PR95778?  Can you include testcases for them?

> > > > > > > > > >

> > > > > > > > > > OK so replacing

> > > > > > > > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5

> > > > > > > > > > with/adding to the test the following one should work. I still

> > > > > > > > > > couldn't get test to run though......

> > > > > > > > > >

> > > > > > > > > Can you try the enclosed testcases?

> > > > > > > >

> > > > > > > > The assembly produced are the following with my patch and if I

> > > > > > > > understand it correctly those should work. Unfortunately I don't know

> > > > > > > > how to actually run the test as a test (if that makes sense....).

> > > > > > > >

> > > > > > >

> > > > > > > Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build directory, do

> > > > > > >

> > > > > > > $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'

> > > > > > > i386.exp=pr95778-*.c"

> > > > > > >

> > > > >

> > > > > Finally got it start running after installing dejagnu...

> > > > > It seems to be runing something unrelated though....

> > > > >

> > > > > e.g. Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp

> > > > > ...

> > > > > Running /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp

> > > >

> > > > OK it seems that the `-*.c` syntax doesn't work for me somehow, there

> > > > are still a lot of unrelated outputs but when I replaced the * with 1

> > > > and 2 separately I can confirm that there are only expected pass with

> > > > my patch and there are some unexpected failures when the patch is

> > > > reverted.

> > >

> > > And the updated patch is

> >

> > OK.

> I fixed up the obvious formatting goofs and constructed a ChangeLog and have

> pushed the change to the trunk.


Thank you.

> jeff

> >

>

Patch

diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index c1cfe8ff978..9eb0afd58cc 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -483,7 +483,7 @@  redirect_to_specific_clone (cgraph_node *node)
 					    DECL_ATTRIBUTES (e->callee->decl));
 
       /* Function is not calling proper target clone.  */
-      if (!attribute_list_equal (attr_target, attr_target2))
+      if (attr_target2 == NULL_TREE || !attribute_value_equal (attr_target, attr_target2))
 	{
 	  while (fv2->prev != NULL)
 	    fv2 = fv2->prev;
@@ -494,7 +494,7 @@  redirect_to_specific_clone (cgraph_node *node)
 	      cgraph_node *callee = fv2->this_node;
 	      attr_target2 = lookup_attribute ("target",
 					       DECL_ATTRIBUTES (callee->decl));
-	      if (attribute_list_equal (attr_target, attr_target2))
+	      if (attr_target2 != NULL_TREE && attribute_value_equal (attr_target, attr_target2))
 		{
 		  e->redirect_callee (callee);
 		  cgraph_edge::redirect_call_stmt_to_callee (e);