[rs6000] Add ALTIVEC_REGS as pressure class

Message ID 63c9b8a8-ff2f-cb54-6f97-ba8d401612d1@linux.ibm.com
State New
Headers show
Series
  • [rs6000] Add ALTIVEC_REGS as pressure class
Related show

Commit Message

Marek Polacek via Gcc-patches May 7, 2021, 3:53 p.m.
Add ALTIVEC_REGS as pressure class.

Code that has heavy register pressure on Altivec registers can suffer from
over-aggressive scheduling during sched1, which then leads to increased
register spill. This is due to the fact that registers that prefer
ALTIVEC_REGS are currently assigned an allocno class of VSX_REGS. This then
misleads the scheduler to think there are 64 regs available, when in reality
there are only 32 Altivec regs. This patch fixes the problem by assigning an
allocno class of ALTIVEC_REGS and adding ALTIVEC_REGS as a pressure class.

Bootstrap/regtest on powerpc64/powerpc64le with no new regressions. Testing
on CPU2017 showed no significant differences. Ok for trunk?

-Pat


2021-05-07  Pat Haugen  <pthaugen@linux.ibm.com>

gcc/ChangeLog:

	* config/rs6000/rs6000.c (rs6000_ira_change_pseudo_allocno_class):
	Return ALTIVEC_REGS if that is best_class.
	(rs6000_compute_pressure_classes): Add ALTIVEC_REGS.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/fold-vec-insert-float-p9.c: Adjust instruction counts.
	* gcc.target/powerpc/vec-rlmi-rlnm.c: Likewise.

Comments

Segher Boessenkool May 7, 2021, 11 p.m. | #1
Hi!

On Fri, May 07, 2021 at 10:53:31AM -0500, Pat Haugen wrote:
> Code that has heavy register pressure on Altivec registers can suffer from

> over-aggressive scheduling during sched1, which then leads to increased

> register spill. This is due to the fact that registers that prefer

> ALTIVEC_REGS are currently assigned an allocno class of VSX_REGS. This then

> misleads the scheduler to think there are 64 regs available, when in reality

> there are only 32 Altivec regs. This patch fixes the problem by assigning an

> allocno class of ALTIVEC_REGS and adding ALTIVEC_REGS as a pressure class.


> gcc/testsuite/ChangeLog:

> 

> 	* gcc.target/powerpc/fold-vec-insert-float-p9.c: Adjust instruction counts.


(line too long)

> +    case VSX_REGS:

> +      if (best_class == ALTIVEC_REGS)

> +	return ALTIVEC_REGS;


Should this be under just this case, or should we do this always?
Maybe change the big switch to be on best_class instead of on
allocno_class?

> --- a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c

> +++ b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c

> @@ -62,6 +62,6 @@ rlnm_test_2 (vector unsigned long long x, vector unsigned long long y,

>  /* { dg-final { scan-assembler-times "vextsb2d" 1 } } */

>  /* { dg-final { scan-assembler-times "vslw" 1 } } */

>  /* { dg-final { scan-assembler-times "vsld" 1 } } */

> -/* { dg-final { scan-assembler-times "xxlor" 3 } } */

> +/* { dg-final { scan-assembler-times "xxlor" 2 } } */

>  /* { dg-final { scan-assembler-times "vrlwnm" 2 } } */

>  /* { dg-final { scan-assembler-times "vrldnm" 2 } } */


So what is this replaced with?  Was it an "xxlmr" and it is just
unnecessary now?

The patch is okay for trunk.  Thanks!


Segher
Marek Polacek via Gcc-patches May 10, 2021, 12:52 p.m. | #2
On 5/7/21 6:00 PM, Segher Boessenkool wrote:
>> --- a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c

>> +++ b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c

>> @@ -62,6 +62,6 @@ rlnm_test_2 (vector unsigned long long x, vector unsigned long long y,

>>  /* { dg-final { scan-assembler-times "vextsb2d" 1 } } */

>>  /* { dg-final { scan-assembler-times "vslw" 1 } } */

>>  /* { dg-final { scan-assembler-times "vsld" 1 } } */

>> -/* { dg-final { scan-assembler-times "xxlor" 3 } } */

>> +/* { dg-final { scan-assembler-times "xxlor" 2 } } */

>>  /* { dg-final { scan-assembler-times "vrlwnm" 2 } } */

>>  /* { dg-final { scan-assembler-times "vrldnm" 2 } } */

> So what is this replaced with?  Was it an "xxlmr" and it is just

> unnecessary now?


Different RA choice made the reg copy unnecessary.

< 	xxspltib 0,8
< 	xxlor 32,0,0
---
> 	xxspltib 32,8


-Pat
Marek Polacek via Gcc-patches May 10, 2021, 1:53 p.m. | #3
On 5/10/21 7:52 AM, Pat Haugen wrote:
> On 5/7/21 6:00 PM, Segher Boessenkool wrote:

>> So what is this replaced with?  Was it an "xxlmr" and it is just

>> unnecessary now?

> 

> Different RA choice made the reg copy unnecessary.

> 

> < 	xxspltib 0,8

> < 	xxlor 32,0,0

> ---

>> 	xxspltib 32,8


Given how we use xxlor's for vsx reg copies and how easily they
can change, I'm not sure we should even be counting them at all,
since they can change with the phase of the moon or the day of 
the week.

Peter
Segher Boessenkool May 10, 2021, 2:28 p.m. | #4
On Mon, May 10, 2021 at 08:53:31AM -0500, Peter Bergner wrote:
> On 5/10/21 7:52 AM, Pat Haugen wrote:

> > On 5/7/21 6:00 PM, Segher Boessenkool wrote:

> >> So what is this replaced with?  Was it an "xxlmr" and it is just

> >> unnecessary now?

> > 

> > Different RA choice made the reg copy unnecessary.

> > 

> > < 	xxspltib 0,8

> > < 	xxlor 32,0,0

> > ---

> >> 	xxspltib 32,8

> 

> Given how we use xxlor's for vsx reg copies and how easily they

> can change, I'm not sure we should even be counting them at all,

> since they can change with the phase of the moon or the day of 

> the week.


Yeah -- otoh, it is probably a good idea to keep it for some simpler
testcases, so we are alerted to regressions on this.  It's a tradeoff,
there is no one best way.  But maybe remove it if it "randomly" changed
on a testcase?


Segher

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 844fee8..fee4eef 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -22487,11 +22487,14 @@  rs6000_ira_change_pseudo_allocno_class (int regno ATTRIBUTE_UNUSED,
 	 of allocno class.  */
       if (best_class == BASE_REGS)
 	return GENERAL_REGS;
-      if (TARGET_VSX
-	  && (best_class == FLOAT_REGS || best_class == ALTIVEC_REGS))
+      if (TARGET_VSX && best_class == FLOAT_REGS)
 	return VSX_REGS;
       return best_class;
 
+    case VSX_REGS:
+      if (best_class == ALTIVEC_REGS)
+	return ALTIVEC_REGS;
+
     default:
       break;
     }
@@ -23609,12 +23612,12 @@  rs6000_compute_pressure_classes (enum reg_class *pressure_classes)
 
   n = 0;
   pressure_classes[n++] = GENERAL_REGS;
+  if (TARGET_ALTIVEC)
+    pressure_classes[n++] = ALTIVEC_REGS;
   if (TARGET_VSX)
     pressure_classes[n++] = VSX_REGS;
   else
     {
-      if (TARGET_ALTIVEC)
-	pressure_classes[n++] = ALTIVEC_REGS;
       if (TARGET_HARD_FLOAT)
 	pressure_classes[n++] = FLOAT_REGS;
     }
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c
index 1c57672..4541768 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c
@@ -31,5 +31,5 @@  testf_cst (float f, vector float vf)
 /* { dg-final { scan-assembler-times {\mstfs\M} 2 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {\mlxv\M} 2 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {\mlvewx\M} 1 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {\mvperm\M} 1 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {\mxxperm\M} 2 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mvperm\M} 2 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mxxperm\M} 1 { target ilp32 } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
index 1e7d739..5512c0f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
@@ -62,6 +62,6 @@  rlnm_test_2 (vector unsigned long long x, vector unsigned long long y,
 /* { dg-final { scan-assembler-times "vextsb2d" 1 } } */
 /* { dg-final { scan-assembler-times "vslw" 1 } } */
 /* { dg-final { scan-assembler-times "vsld" 1 } } */
-/* { dg-final { scan-assembler-times "xxlor" 3 } } */
+/* { dg-final { scan-assembler-times "xxlor" 2 } } */
 /* { dg-final { scan-assembler-times "vrlwnm" 2 } } */
 /* { dg-final { scan-assembler-times "vrldnm" 2 } } */