[04/25] SPECIAL_REGNO_P

Message ID a02c39933eaad0ade4276dd9b55f6fc6ab1954ce.1536144068.git.ams@codesourcery.com
State New
Headers show
Series
  • AMD GCN Port
Related show

Commit Message

Andrew Stubbs Sept. 5, 2018, 11:48 a.m.
GCN has some registers which are special purpose, but not "fixed" because we
want the register allocator to track their usage and select alternatives that
use different special registers (e.g. scalar cc vs. vector cc).

Sometimes this leads the regrename pass to ICE.  Quite how it gets confused is
not well understood, but considering such registers for renaming is surely not
useful.

This patch creates a new macro SPECIAL_REGNO_P which disables regrename.  In
other words, the register is fixed once allocated.

2018-09-05  Kwok Cheung Yeung  <kcy@codesourcery.com>

	gcc/
	* defaults.h (SPECIAL_REGNO_P): Define to false by default.
	* regrename.c (check_new_reg_p): Do not rename to a special register.
	(rename_chains): Do not rename special registers.
---
 gcc/defaults.h  | 4 ++++
 gcc/regrename.c | 2 ++
 2 files changed, 6 insertions(+)

Comments

Joseph Myers Sept. 5, 2018, 12:21 p.m. | #1
On Wed, 5 Sep 2018, ams@codesourcery.com wrote:

> This patch creates a new macro SPECIAL_REGNO_P which disables regrename.  In

> other words, the register is fixed once allocated.


Creating new target macros is generally suspect - the presumption is that 
target hooks should be used instead, unless it's clear the macro is part 
of a group of very closely related macros that should all become hooks at 
the same time (e.g. if adding a new one of the set of *_TYPE macros for 
standard typedefs, a macro would probably be appropriate rather than 
making just the new one into a hook).

-- 
Joseph S. Myers
joseph@codesourcery.com
Jeff Law Sept. 11, 2018, 10:42 p.m. | #2
On 9/5/18 5:48 AM, ams@codesourcery.com wrote:
> 

> GCN has some registers which are special purpose, but not "fixed" because we

> want the register allocator to track their usage and select alternatives that

> use different special registers (e.g. scalar cc vs. vector cc).

> 

> Sometimes this leads the regrename pass to ICE.  Quite how it gets confused is

> not well understood, but considering such registers for renaming is surely not

> useful.

> 

> This patch creates a new macro SPECIAL_REGNO_P which disables regrename.  In

> other words, the register is fixed once allocated.

> 

> 2018-09-05  Kwok Cheung Yeung  <kcy@codesourcery.com>

> 

> 	gcc/

> 	* defaults.h (SPECIAL_REGNO_P): Define to false by default.

> 	* regrename.c (check_new_reg_p): Do not rename to a special register.

> 	(rename_chains): Do not rename special registers.

This feels like you're papering over a problem in regrename and/or the
GCN port..  regrename should be checking the predicate and constraints
when it makes changes.  And I think that you're still allowed to refer
to a fixed register in alternatives.

Jeff
Andrew Stubbs Sept. 12, 2018, 11:29 a.m. | #3
On 11/09/18 23:42, Jeff Law wrote:
> This feels like you're papering over a problem in regrename and/or the

> GCN port..  regrename should be checking the predicate and constraints

> when it makes changes.  And I think that you're still allowed to refer

> to a fixed register in alternatives.


I think you're allowed to use a constraint to match an already-present 
hardreg, fixed or otherwise, but my understanding is that LRA will never 
convert a pseudoreg to a fixed hardreg, no matter what the constraint says.

Just to make sure, I just tried to fix EXEC (the only register matching 
the "e" constraint, and one of the "special" ones), and as expected the 
compiler blows up with "unable to generate reloads for ...".

Anyway, back to the issue of SPECIAL_REGNO_P ...

I've just retested the motivating example that we had, and that no 
longer fails in regrename.  That could be because the problem is fixed, 
or simply that the compiler no longer generates the exact instruction 
sequence that demonstrates the problem.

If I can't reproduce the issue then this macro becomes just a small 
compile-time optimization and we can remove it safely.

I'll report back when I've done more testing.

Andrew
Richard Henderson Sept. 12, 2018, 3:31 p.m. | #4
On 09/05/2018 04:48 AM, ams@codesourcery.com wrote:
> @@ -1198,6 +1198,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see

>  #define NO_FUNCTION_CSE false

>  #endif

>  

> +#ifndef SPECIAL_REGNO_P

> +#define SPECIAL_REGNO_P(REGNO) false

> +#endif

> +

>  #ifndef HARD_REGNO_RENAME_OK

>  #define HARD_REGNO_RENAME_OK(FROM, TO) true

>  #endif

...

> @@ -320,6 +320,7 @@ check_new_reg_p (int reg ATTRIBUTE_UNUSED, int new_reg,

>      if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)

>  	|| fixed_regs[new_reg + i]

>  	|| global_regs[new_reg + i]

> +	|| SPECIAL_REGNO_P (new_reg + i)

>  	/* Can't use regs which aren't saved by the prologue.  */

>  	|| (! df_regs_ever_live_p (new_reg + i)

>  	    && ! call_used_regs[new_reg + i])


How is this different from HARD_REGNO_RENAME_OK via the TO argument?
Seems like the hook you're looking for already exists...


r~
Andrew Stubbs Sept. 12, 2018, 4:14 p.m. | #5
On 12/09/18 16:31, Richard Henderson wrote:
> How is this different from HARD_REGNO_RENAME_OK via the TO argument?

> Seems like the hook you're looking for already exists...


I don't know how we got here (I didn't do the original work), but the 
SPECIAL_REGNO_P was indeed used in HARD_REGNO_RENAME_OK, as well as in 
some extra places in regrename.

This definitely was necessary at some point, but I've not yet reproduced 
the issue in the current code-base, so I suspect you may be correct.

Andrew
Andrew Stubbs Sept. 13, 2018, 10:01 a.m. | #6
On 12/09/18 12:29, Andrew Stubbs wrote:
> I'll report back when I've done more testing.


I reproduced the problem, in the latest sources, with the 
SPECIAL_REGNO_P patch removed (and HARD_REGNO_RENAME_OK adjusted 
accordingly).

Testcase: gcc.c-torture/compile/20020706-2.c -O3 -funroll-loops

> during RTL pass: rnreg

> dump file: /scratch/astubbs/amd/upstream/tmp/target.290r.rnreg

> /scratch/astubbs/amd/upstream/src/gcc-gcn-master/gcc/testsuite/gcc.c-torture/compile/20020706-2.c: In function 'crashIt':                                                                                                                     

> /scratch/astubbs/amd/upstream/src/gcc-gcn-master/gcc/testsuite/gcc.c-torture/compile/20020706-2.c:26:1: internal compiler error: in merge_overlapping_regs, at regrename.c:300                                                                

> 26 | }

>    | ^

> 0xef149d merge_overlapping_regs

>         /scratch/astubbs/amd/upstream/src/gcc-gcn-master/gcc/regrename.c:300

> 0xef17cb find_rename_reg(du_head*, reg_class, unsigned long (*) [7], int, bool)

>         /scratch/astubbs/amd/upstream/src/gcc-gcn-master/gcc/regrename.c:373

> 0xef1c84 rename_chains

>         /scratch/astubbs/amd/upstream/src/gcc-gcn-master/gcc/regrename.c:497

> 0xef612b regrename_optimize

>         /scratch/astubbs/amd/upstream/src/gcc-gcn-master/gcc/regrename.c:1951

> 0xef61ae execute

>         /scratch/astubbs/amd/upstream/src/gcc-gcn-master/gcc/regrename.c:1986

> Please submit a full bug report,

> with preprocessed source if appropriate.

> Please include the complete backtrace with any bug report.

> See <https://sourcery.mentor.com/GNUToolchain/> for instructions.


The register that find_rename_reg is considering is SCC, which is one of 
the "special" registers.  There is a short-cut in rename_chains for 
fixed registers, global registers, and frame pointers.  It does not 
check HARD_REGNO_RENAME_OK.

Presumably the bug is not that it will actually try to rename SCC, but 
that it trips an assert while trying to compute the other parameter for 
the HARD_REGNO_RENAME_OK hook.

The SPECIAL_REGNO_P macro fixed the issue by extending the short-cut to 
include the additional registers.

The assert is caused because the def-use chains indicate that SCC 
conflicts with itself. I suppose the question is why is it doing that, 
but it's probably do do with that being a special register that gets 
used in split2 (particularly by the addptrdi3 pattern). Although, those 
patterns are careful to save SCC to one side and then restore it again 
after, so I'd have thought the DF analysis would work out?

Andrew
Andrew Stubbs Sept. 13, 2018, 2:08 p.m. | #7
On 13/09/18 11:01, Andrew Stubbs wrote:
> The assert is caused because the def-use chains indicate that SCC 

> conflicts with itself. I suppose the question is why is it doing that, 

> but it's probably do do with that being a special register that gets 

> used in split2 (particularly by the addptrdi3 pattern). Although, those 

> patterns are careful to save SCC to one side and then restore it again 

> after, so I'd have thought the DF analysis would work out?


I think I may have a theory on this one now....

The addptrdi3 pattern must use two 32-bit adds with a carry in SCC, but 
addptr patterns are not allowed to clobber SCC, so the splitter 
carefully saves and restores the old value.

This is correct at runtime, and looks correct in RTL dumps, but it means 
that there's still a single rtx REG instance holding the live SCC 
register, and its still live before and after the new add instruction.

Would I be right in thinking that the dataflow analysis doesn't like this?

I think I have a work-around (by using different instructions), but is 
there a correct way to do this if there weren't an alternative?

Andrew
Paul Koning Sept. 13, 2018, 2:16 p.m. | #8
> On Sep 13, 2018, at 10:08 AM, Andrew Stubbs <ams@codesourcery.com> wrote:

> 

> On 13/09/18 11:01, Andrew Stubbs wrote:

>> The assert is caused because the def-use chains indicate that SCC conflicts with itself. I suppose the question is why is it doing that, but it's probably do do with that being a special register that gets used in split2 (particularly by the addptrdi3 pattern). Although, those patterns are careful to save SCC to one side and then restore it again after, so I'd have thought the DF analysis would work out?

> 

> I think I may have a theory on this one now....

> 

> The addptrdi3 pattern must use two 32-bit adds with a carry in SCC, but addptr patterns are not allowed to clobber SCC, so the splitter carefully saves and restores the old value.


If you don't have machine operations that add without messing with condition codes, wouldn't it make sense to omit the definition of the add-pointer patterns?  GCC will build things out of normal (CC-clobbering) adds if there are no add-pointer operations, which may well be more efficient in most cases than explicitly saving/restoring a CC that may in fact not matter right at that spot.

	paul
Andrew Stubbs Sept. 13, 2018, 2:39 p.m. | #9
On 13/09/18 15:16, Paul Koning wrote:
> If you don't have machine operations that add without messing with

> condition codes, wouldn't it make sense to omit the definition of the

> add-pointer patterns?  GCC will build things out of normal

> (CC-clobbering) adds if there are no add-pointer operations, which

> may well be more efficient in most cases than explicitly

> saving/restoring a CC that may in fact not matter right at that

> spot.


I thought the whole point of addptr is that it *is* needed when add
clobbers CC? As in, LRA spills are malformed without this.

Did something change? The internals manual still says "It only needs to
be defined if addm3 sets the condition code."

Andrew
Paul Koning Sept. 13, 2018, 2:49 p.m. | #10
> On Sep 13, 2018, at 10:39 AM, Andrew Stubbs <ams@codesourcery.com> wrote:

> 

> On 13/09/18 15:16, Paul Koning wrote:

>> If you don't have machine operations that add without messing with

>> condition codes, wouldn't it make sense to omit the definition of the

>> add-pointer patterns?  GCC will build things out of normal

>> (CC-clobbering) adds if there are no add-pointer operations, which

>> may well be more efficient in most cases than explicitly

>> saving/restoring a CC that may in fact not matter right at that

>> spot.

> 

> I thought the whole point of addptr is that it *is* needed when add

> clobbers CC? As in, LRA spills are malformed without this.

> 

> Did something change? The internals manual still says "It only needs to

> be defined if addm3 sets the condition code."


It's ambiguous, because the last sentence of that paragraph says "addm3 is used if addptrm3 is not defined."  

I don't know of any change in this area.  All I know is that pdp11 has adds that clobber CC and it doesn't define addptrm3, relying on that last sentence.  I've tried LRA and for the most part it compiles successfully, I suppose I should verify the generated code based on the point you raised.  If I really have to define addptr, I'm in trouble because  save/restore CC is not easy on pdp11.

	paul
Andrew Stubbs Sept. 13, 2018, 2:58 p.m. | #11
On 13/09/18 15:49, Paul Koning wrote:
> It's ambiguous, because the last sentence of that paragraph says "addm3 is used if addptrm3 is not defined."


I didn't read that as ambiguous; I read it as addm3 is assumed to work 
fine when addptr is not defined.

> I don't know of any change in this area.  All I know is that pdp11 has adds that clobber CC and it doesn't define addptrm3, relying on that last sentence.  I've tried LRA and for the most part it compiles successfully, I suppose I should verify the generated code based on the point you raised.  If I really have to define addptr, I'm in trouble because  save/restore CC is not easy on pdp11.


The code was added because we had a number of testcases that failed at 
runtime without it.

Admittedly, that was in a GCC 7 code-base, and I can't reproduce the 
failure with one of those test cases now (with addptr deleted), but 
possibly that's just noise.

Andrew
Paul Koning Sept. 13, 2018, 5:09 p.m. | #12
> On Sep 13, 2018, at 10:58 AM, Andrew Stubbs <ams@codesourcery.com> wrote:

> 

> On 13/09/18 15:49, Paul Koning wrote:

>> It's ambiguous, because the last sentence of that paragraph says "addm3 is used if addptrm3 is not defined."

> 

> I didn't read that as ambiguous; I read it as addm3 is assumed to work fine when addptr is not defined.

> 

>> I don't know of any change in this area.  All I know is that pdp11 has adds that clobber CC and it doesn't define addptrm3, relying on that last sentence.  I've tried LRA and for the most part it compiles successfully, I suppose I should verify the generated code based on the point you raised.  If I really have to define addptr, I'm in trouble because  save/restore CC is not easy on pdp11.

> 

> The code was added because we had a number of testcases that failed at runtime without it.

> 

> Admittedly, that was in a GCC 7 code-base, and I can't reproduce the failure with one of those test cases now (with addptr deleted), but possibly that's just noise.


Possibly relevant is that pdp11 is a "type 2" CC setting target, one where the machine description doesn't mention CC until after reload.  So if reload (LRA) is generating adds, the CC effect of that is invisible anyway until later passes that deal with the resulting clobbers and elimination, or not, of compares.

If that's what this is all about, some documentation clarification would help.  Can someone confirm (or refute) my guess?

	paul
Jeff Law Sept. 17, 2018, 10:56 p.m. | #13
On 9/13/18 8:08 AM, Andrew Stubbs wrote:
> On 13/09/18 11:01, Andrew Stubbs wrote:

>> The assert is caused because the def-use chains indicate that SCC

>> conflicts with itself. I suppose the question is why is it doing that,

>> but it's probably do do with that being a special register that gets

>> used in split2 (particularly by the addptrdi3 pattern). Although,

>> those patterns are careful to save SCC to one side and then restore it

>> again after, so I'd have thought the DF analysis would work out?

> 

> I think I may have a theory on this one now....

> 

> The addptrdi3 pattern must use two 32-bit adds with a carry in SCC, but

> addptr patterns are not allowed to clobber SCC, so the splitter

> carefully saves and restores the old value.

> 

> This is correct at runtime, and looks correct in RTL dumps, but it means

> that there's still a single rtx REG instance holding the live SCC

> register, and its still live before and after the new add instruction.

> 

> Would I be right in thinking that the dataflow analysis doesn't like this?

> 

> I think I have a work-around (by using different instructions), but is

> there a correct way to do this if there weren't an alternative?

I would expect dataflow to treat the SCC save as a use of the SCC
register.  That's likely to cause it to be live on all paths from the
entry to the SCC save.

Jeff
Jeff Law Oct. 4, 2018, 6:27 p.m. | #14
On 9/13/18 4:01 AM, Andrew Stubbs wrote:
> 

> The register that find_rename_reg is considering is SCC, which is one of

> the "special" registers.  There is a short-cut in rename_chains for

> fixed registers, global registers, and frame pointers.  It does not

> check HARD_REGNO_RENAME_OK.

I wonder if it expects the caller to have avoided putting these
registers in the chain.


> 

> The assert is caused because the def-use chains indicate that SCC

> conflicts with itself. I suppose the question is why is it doing that,

> but it's probably do do with that being a special register that gets

> used in split2 (particularly by the addptrdi3 pattern). Although, those

> patterns are careful to save SCC to one side and then restore it again

> after, so I'd have thought the DF analysis would work out?

If you have SCC before its first set, then DF is going to think the SCC
register is live at function entry.

Jeff

Patch

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 9035b33..40ecf61 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1198,6 +1198,10 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define NO_FUNCTION_CSE false
 #endif
 
+#ifndef SPECIAL_REGNO_P
+#define SPECIAL_REGNO_P(REGNO) false
+#endif
+
 #ifndef HARD_REGNO_RENAME_OK
 #define HARD_REGNO_RENAME_OK(FROM, TO) true
 #endif
diff --git a/gcc/regrename.c b/gcc/regrename.c
index 8424093..92e403e 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -320,6 +320,7 @@  check_new_reg_p (int reg ATTRIBUTE_UNUSED, int new_reg,
     if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
 	|| fixed_regs[new_reg + i]
 	|| global_regs[new_reg + i]
+	|| SPECIAL_REGNO_P (new_reg + i)
 	/* Can't use regs which aren't saved by the prologue.  */
 	|| (! df_regs_ever_live_p (new_reg + i)
 	    && ! call_used_regs[new_reg + i])
@@ -480,6 +481,7 @@  rename_chains (void)
 	continue;
 
       if (fixed_regs[reg] || global_regs[reg]
+	  || SPECIAL_REGNO_P (reg)
 	  || (!HARD_FRAME_POINTER_IS_FRAME_POINTER && frame_pointer_needed
 	      && reg == HARD_FRAME_POINTER_REGNUM)
 	  || (HARD_FRAME_POINTER_IS_FRAME_POINTER && frame_pointer_needed