[committed] amdgcn: fix ICE on subreg of BI reg

Message ID f9c7c2db-2a50-bf93-6f19-745f66b4c9c0@codesourcery.com
State New
Headers show
  • [committed] amdgcn: fix ICE on subreg of BI reg
Related show

Commit Message

Andrew Stubbs Feb. 27, 2020, 12:23 p.m.
This patch fixes an LRA ICE that was exposed by another patch on Sunday. 
I can't see any reason why that patch should cause an ICE, so presumably 
it merely perturbed something.

The problem was that LRA with checking enabled was confirming that all 
the registers it had allocated are considered "ever live", and found 
that the high-part of the VCC register was not live.

The reason was that it had created an instruction like this:

   (set (reg:SI s2)
        (subreg:SI (reg:BI VCC) 0))

This seems like it ought to be fine, except that gcn.c defines (reg:BI 
VCC) to have nregs == 2, whereas (reg:SI VCC) nregs == 1.  The checking 
code uses nregs from the inner register mode, and DF uses nregs from the 
outer subreg mode, and the mismatch causes the ICE.

Using 64-bits for a BImode register is unusual but makes sense because 
instructions writing BImode condition codes to VCC will normally clobber 
the entire DImode register pair, whereas SImode register modes only 
touch one of the two registers.

The solution is to transform the instruction like this:

   (set (subreg:BI (reg:SI s2) 0)
        (reg:BI VCC))

This says approximately the same thing, but now nregs is firmly "2", and 
the ICE goes away.



amdgcn: fix ICE on subreg of BI reg.

BImode usually only requires one bit, but instructions that write to VCC also
clobber the reset of the DImode register pair, so gcn_class_max_nregs reports
that two registers are needed for BImode.  Paradoxically, accessing VCC via
SImode is therefore uses fewer registers than accessing via BImode.

The LRA checking code takes this into account, but the DF liveness data also
looks at the subreg, so it says (subreg:SI (reg:BI VCC) 0) only makes the low
part live.  Both are "correct", but they disagree, which causes an ICE.

This doesn't happen when writing conditions to VCC; it happens when accessing
VCC_LO via a regular move to a regular SImode register.

If we transform the subregs so that BImode is always the outer mode then it
basically means the same thing, except that now both LRA and DF calculate nregs
the same, and ICE goes away.

As soon as LRA is done the subregs all evaporate anyway.

2020-02-27  Andrew Stubbs  <ams@codesourcery.com>

	* config/gcn/gcn.md (mov<mode>): Add transformations for BI subregs.

diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index b527d9a7a8b..d8b49dfd640 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -395,6 +395,31 @@ 
 	(match_operand:MOV_MODE 1 "general_operand"))]
+    if (SUBREG_P (operands[1])
+	&& GET_MODE (operands[1]) == SImode
+	&& GET_MODE (SUBREG_REG (operands[1])) == BImode)
+    {
+      /* (reg:BI VCC) has nregs==2 to ensure it gets clobbered as a whole,
+	 but (subreg:SI (reg:BI VCC)) doesn't, which causes the LRA liveness
+	 checks to assert.  Transform this:
+	   (set (reg:SI) (subreg:SI (reg:BI)))
+	 to this:
+	   (set (subreg:BI (reg:SI)) (reg:BI))  */
+      operands[0] = gen_rtx_SUBREG (BImode, operands[0], 0);
+      operands[1] = SUBREG_REG (operands[1]);
+    }
+    if (SUBREG_P (operands[0])
+	&& GET_MODE (operands[0]) == SImode
+	&& GET_MODE (SUBREG_REG (operands[0])) == BImode)
+      {
+	/* Likewise, transform this:
+	     (set (subreg:SI (reg:BI)) (reg:SI))
+	   to this:
+	     (set (reg:BI) (subreg:BI (reg:SI))) */
+	operands[0] = SUBREG_REG (operands[0]);
+	operands[1] = gen_rtx_SUBREG (BImode, operands[1], 0);
+      }
     if (MEM_P (operands[0]))
       operands[1] = force_reg (<MODE>mode, operands[1]);