[poweprc] RFA: patch changing expected code generation for test vsx-simode2.c

Message ID e4e6092f-540a-0ec3-c137-33d55e2edb93@redhat.com
State New
Headers show
Series
  • [poweprc] RFA: patch changing expected code generation for test vsx-simode2.c
Related show

Commit Message

Vladimir Makarov Feb. 8, 2019, 7:18 p.m.
Recently I committed a patch solving

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88560

The patch resulted in test vsx-simode2.c failure.  Here is the 
difference in generated code:

@@ -13,9 +13,8 @@ foo:
  .LFB0:
         .cfi_startproc
         std 3,-16(1)
-       ori 2,2,0
-       lwz 9,-12(1)
-       mtvsrwz 32,9
+       addi 9,1,-12
+       lxsiwzx 32,0,9

The new version is one insn less.  So I propose the following patch 
changing the expected code generation.

Is it ok to commit it?

Comments

Segher Boessenkool Feb. 9, 2019, 1:28 p.m. | #1
Hi Vlad,

On Fri, Feb 08, 2019 at 02:18:40PM -0500, Vladimir Makarov wrote:
> Recently I committed a patch solving

> 

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88560

> 

> The patch resulted in test vsx-simode2.c failure.  Here is the 

> difference in generated code:

> 

> @@ -13,9 +13,8 @@ foo:

>  .LFB0:

>         .cfi_startproc

>         std 3,-16(1)

> -       ori 2,2,0

> -       lwz 9,-12(1)

> -       mtvsrwz 32,9

> +       addi 9,1,-12

> +       lxsiwzx 32,0,9

> 

> The new version is one insn less.  So I propose the following patch 

> changing the expected code generation.

> 

> Is it ok to commit it?


This is not okay.  The test is supposed to test that we get a direct
move instruction instead of going via memory.  But, trunk does the
std+lwz as you see; this is because IRA decides this pseudo needs to
go to memory:

    r125: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS

  a1(r125,l0) costs: BASE_REGS:14004,14004 GENERAL_REGS:14004,14004 LINK_REGS:24010,24010 CTR_REGS:24010,24010 LINK_OR_CTR_REGS:24010,24010 SPEC_OR_GEN_REGS:24010,24010 MEM:12000,12000

Is there something wrong in our tuning?


For reference, 7 and 8 do just

        mtvsrwz 32,3
#APP
 # 10 "vsx-simode2.c" 1
        xxlor 32,32,32  # v, v constraints
 # 0 "" 2
#NO_APP
        mfvsrwz 3,32
        blr

which is the expected code.  The test really should check there is no
memory used, or that there are no extra insns other than the 4 expected.

Your patch seems to be fine btw, this breakage was really there already,
just not detected by the testcase.


Segher
Vladimir Makarov Feb. 9, 2019, 9:13 p.m. | #2
On 2019-02-09 8:28 a.m., Segher Boessenkool wrote:
> Hi Vlad,

>

> On Fri, Feb 08, 2019 at 02:18:40PM -0500, Vladimir Makarov wrote:

>> Recently I committed a patch solving

>>

>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88560

>>

>> The patch resulted in test vsx-simode2.c failure.  Here is the

>> difference in generated code:

>>

>> @@ -13,9 +13,8 @@ foo:

>>   .LFB0:

>>          .cfi_startproc

>>          std 3,-16(1)

>> -       ori 2,2,0

>> -       lwz 9,-12(1)

>> -       mtvsrwz 32,9

>> +       addi 9,1,-12

>> +       lxsiwzx 32,0,9

>>

>> The new version is one insn less.  So I propose the following patch

>> changing the expected code generation.

>>

>> Is it ok to commit it?

> This is not okay.  The test is supposed to test that we get a direct

> move instruction instead of going via memory.  But, trunk does the

> std+lwz as you see; this is because IRA decides this pseudo needs to

> go to memory:

>

>      r125: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS

>

>    a1(r125,l0) costs: BASE_REGS:14004,14004 GENERAL_REGS:14004,14004 LINK_REGS:24010,24010 CTR_REGS:24010,24010 LINK_OR_CTR_REGS:24010,24010 SPEC_OR_GEN_REGS:24010,24010 MEM:12000,12000


Thank you for informing me what we expect from the test.

Apparently, the test did not catch what was supposed to be catched.

Although the new generated code is better than the old one (2 insns vs 3 
insns, one insn is a load in the both cases), I see there is no sense 
for this patch.  Simply, the test did not fail before even if the code 
was bad.  Now the test fails as it should be.

> Is there something wrong in our tuning?

>

I have no idea.  It needs more investigation.
> For reference, 7 and 8 do just

>

>          mtvsrwz 32,3

> #APP

>   # 10 "vsx-simode2.c" 1

>          xxlor 32,32,32  # v, v constraints

>   # 0 "" 2

> #NO_APP

>          mfvsrwz 3,32

>          blr

>

> which is the expected code.  The test really should check there is no

> memory used, or that there are no extra insns other than the 4 expected.

>

> Your patch seems to be fine btw, this breakage was really there already,

> just not detected by the testcase.

>

Yes, the patch is fine in a sense that the code is a bit better. But 
still the generated code is bad and the test started to fail. I don't 
think we need to change the test.  The original test now reminds us to 
fix the bad code generation.
Segher Boessenkool Feb. 9, 2019, 10:14 p.m. | #3
On Sat, Feb 09, 2019 at 04:13:57PM -0500, Vladimir Makarov wrote:
> 

> On 2019-02-09 8:28 a.m., Segher Boessenkool wrote:

> >Hi Vlad,

> >

> >On Fri, Feb 08, 2019 at 02:18:40PM -0500, Vladimir Makarov wrote:

> >>Recently I committed a patch solving

> >>

> >>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88560

> >>

> >>The patch resulted in test vsx-simode2.c failure.  Here is the

> >>difference in generated code:

> >>

> >>@@ -13,9 +13,8 @@ foo:

> >>  .LFB0:

> >>         .cfi_startproc

> >>         std 3,-16(1)

> >>-       ori 2,2,0

> >>-       lwz 9,-12(1)

> >>-       mtvsrwz 32,9

> >>+       addi 9,1,-12

> >>+       lxsiwzx 32,0,9

> >>

> >>The new version is one insn less.  So I propose the following patch

> >>changing the expected code generation.

> >>

> >>Is it ok to commit it?

> >This is not okay.  The test is supposed to test that we get a direct

> >move instruction instead of going via memory.  But, trunk does the

> >std+lwz as you see; this is because IRA decides this pseudo needs to

> >go to memory:

> >

> >     r125: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS

> >

> >   a1(r125,l0) costs: BASE_REGS:14004,14004 GENERAL_REGS:14004,14004 

> >   LINK_REGS:24010,24010 CTR_REGS:24010,24010 LINK_OR_CTR_REGS:24010,24010 

> >   SPEC_OR_GEN_REGS:24010,24010 MEM:12000,12000

> 

> Thank you for informing me what we expect from the test.

> 

> Apparently, the test did not catch what was supposed to be catched.


Yes, exactly.

> Although the new generated code is better than the old one (2 insns vs 3 

> insns, one insn is a load in the both cases), I see there is no sense 

> for this patch.  Simply, the test did not fail before even if the code 

> was bad.  Now the test fails as it should be.

> 

> >Is there something wrong in our tuning?

> >

> I have no idea.  It needs more investigation.


Where do the above costs come from?  Regs 14k, mem 12k.

> >For reference, 7 and 8 do just

> >

> >         mtvsrwz 32,3

> >#APP

> >  # 10 "vsx-simode2.c" 1

> >         xxlor 32,32,32  # v, v constraints

> >  # 0 "" 2

> >#NO_APP

> >         mfvsrwz 3,32

> >         blr

> >

> >which is the expected code.  The test really should check there is no

> >memory used, or that there are no extra insns other than the 4 expected.

> >

> >Your patch seems to be fine btw, this breakage was really there already,

> >just not detected by the testcase.

> >

> Yes, the patch is fine in a sense that the code is a bit better.


If we decided to assign memory to this pseudo, it now uses better code
for that.  (Not really fewer insns though, the ori 2,2,0 went missing,
and that is still required for good performance on Power8 at least).

> But 

> still the generated code is bad and the test started to fail. I don't 

> think we need to change the test.  The original test now reminds us to 

> fix the bad code generation.


Yeah.  And I'll improve the test a bit so it would have failed earlier.

It didn't fail before because combine used to usurp the RA job, doing some
kind of greedy register allocation, increasing the lifetime of argument
registers.

I opened PR89271 (and put you on cc:).


Segher

Patch

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 268581)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@ 
+2019-02-08  Vladimir Makarov  <vmakarov@redhat.com>
+
+	* gcc.target/powerpc/vsx-simode2.c: Expect lxsiwzx instead of
+	mtvsrwz.
+
 2019-02-06  Richard Biener  <rguenther@suse.de>
 
 	PR tree-optimization/89182
Index: gcc.target/powerpc/vsx-simode2.c
===================================================================
--- gcc.target/powerpc/vsx-simode2.c	(revision 268581)
+++ gcc.target/powerpc/vsx-simode2.c	(working copy)
@@ -11,5 +11,5 @@  unsigned int foo (unsigned int u)
   return ret;
 }
 
-/* { dg-final { scan-assembler "mtvsrwz" } } */
+/* { dg-final { scan-assembler "lxsiwzx" } } */
 /* { dg-final { scan-assembler "mfvsrwz" } } */