RL78 movdi improvement

Message ID 000001d372a4$9bd42150$d37c63f0$@renesas.com
State New
Headers show
Series
  • RL78 movdi improvement
Related show

Commit Message

Sebastian Perta Dec. 11, 2017, 5:22 p.m.
Hello,

The following patch improves 64 bit operations by instructing GCC to use 16
bit movw instead of 8 bit mov.
On the following test case the patch reduces the code size from 323 bytes to
245 bytes.
unsigned long long my_anddi3(unsigned long long x, unsigned long long y){ 
return x & y;
}
I did not add this to the regression as it very simple and there many test
cases in the regression especially c-torture which use this patch.
Regression test is OK, tested with the following command:
make -k check-gcc RUNTESTFLAGS=--target_board=rl78-sim

Please let me know if this is OK, Thank you!
Sebastian

Comments

Sebastian Perta Jan. 11, 2018, 7:39 p.m. | #1
Hi DJ,

I managed to reproduce the issue, it is in function test_7,  sorry I'm not
sure how I missed it I always compare the logs carefully with and without
the patch.
After investigating this I found the problem is caused by a removal of 1
instruction due to wrong UNUSED note being
added(rl78_calculate_death_notes), more exactly in rl78_note_reg_set :

  r = REGNO (d);
  if (dead [r])
    add_reg_note (insn, REG_UNUSED, gen_rtx_REG (GET_MODE (d), r));

For example if the reg is HImode and the high part is not dead (dead[r + 1])
the register should NOT be marked as UNUSED but it is.
After I corrected the problem the code was generated correctly; I will
prepare another patch with this fix.

Below you can see the relevant parts from the rtl dump(.314r.devirt) which
show this:

(insn 41 91 92 2 (set (reg:HI 0 x)
        (mem/c:HI (reg:HI 6 l) [0  S2 A16])) "../test_movdi.c":20 43
{*movhi_real}
     (nil))
(insn 92 41 42 2 (set (reg:HI 8 r8 [orig:44 a.0_3 ] [44])
        (reg:HI 0 x)) "../test_movdi.c":20 -1
     (nil))
...
(insn 17 105 106 2 (set (reg:QI 1 a)
        (reg:QI 9 r9 [ a.0_3+1 ])) "../test_movdi.c":20 42 {*movqi_real}
     (nil))
...
(insn 19 107 108 2 (set (reg:QI 1 a)
        (reg:QI 9 r9 [ a.0_3+1 ])) "../test_movdi.c":20 42 {*movqi_real}
     (nil))
...
(insn 22 110 23 2 (set (reg/f:HI 8 r8 [53])
        (symbol_ref:HI ("c") <var_decl 0xb713c0fc c>)) "../test_movdi.c":20
43 {*movhi_real}
     (expr_list:REG_EQUIV (symbol_ref:HI ("c") <var_decl 0xb713c0fc c>)
        (nil)))


When the death notes are calculated we have the following
(rl78_calculate_death_notes):
Because this parse the functions backwards insn 22 is processed first.

note set reg 8 size 2
(insn 22 110 23 2 (set (reg/f:HI 8 r8 [53])
        (symbol_ref:HI ("c") <var_decl 0xb713c0fc c>)) "../test_movdi.c":20
43 {*movhi_real}
     (expr_list:REG_EQUIV (symbol_ref:HI ("c") <var_decl 0xb713c0fc c>)
        (nil)))

--------------------------------------------------
Dead: x a c b r8 r9 r10 r11 r12 r13 r16 r17 <-----------As you can see R8
and R9 are marked as dead which is correct

....

note set reg 1 size 1
note use reg 9 size 1 on insn 19
(reg:QI 9 r9 [ a.0_3+1 ])
(insn 19 107 108 2 (set (reg:QI 1 a)
        (reg:QI 9 r9 [ a.0_3+1 ])) "../test_movdi.c":20 42 {*movqi_real}
     (expr_list:REG_DEAD (reg:QI 9 r9)
        (nil)))

--------------------------------------------------
Dead: x a c b r8 r10 r13 r16 r17 <-----------As you can see R9 is not dead
anymore
....
Now we get to insn 92 and we can see that r8 is wrongfully marked as unused
because the code check only R8 (and it should check R9 as well).
As a consequence insn 92 get removed later on.
--------------------------------------------------
Dead: x a c b r8 r10 r11 r12 r13 r14 r15 r16 r17
(insn 92 41 42 2 (set (reg:HI 8 r8 [orig:44 a.0_3 ] [44])
        (reg:HI 0 x)) "../test_movdi.c":20 43 {*movhi_real}
     (nil))
note set reg 8 size 2
note use reg 0 size 2 on insn 92
(reg:HI 0 x)
(insn 92 41 42 2 (set (reg:HI 8 r8 [orig:44 a.0_3 ] [44])
        (reg:HI 0 x)) "../test_movdi.c":20 43 {*movhi_real}
     (expr_list:REG_DEAD (reg:HI 0 x)
        (expr_list:REG_UNUSED (reg:HI 8
r8)<------------------------------------here is the problem
            (nil))))

Also later on (in 290r.peephole2) we see that instruction 41 is eliminated
as well:
DCE: Deleting insn 41
deleting insn with uid = 41.

Best Regards,
Sebastian

> -----Original Message-----

> From: DJ Delorie [mailto:dj@redhat.com]

> Sent: 09 January 2018 22:55

> To: Sebastian Perta <Sebastian.Perta@renesas.com>

> Cc: gcc-patches@gcc.gnu.org

> Subject: Re: [PATCH] RL78 movdi improvement

> 

> 

> I saw one regression with this patch:

> 

> PASS-FAIL: gcc.dg/torture/vshuf-v8qi.c   -O2  execution test

> 

> Could you confirm this?  Note: I'm testing with your other (pre-2018 at

> least) patches applied at the same time (smax etc, anddi, bswaphi) so it

> might be an interaction, but the code looks like a movdi bug.  The other

> patches look OK otherwise.

Patch

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 255538)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@ 
+2017-12-12  Sebastian Perta  <sebastian.perta@renesas.com>
+
+	* config/rl78/rl78-protos.h: New function declaration
rl78_split_movdi
+	* config/rl78/rl78.md: New define_expand "movdi"
+	* config/rl78/rl78.c: New function definition rl78_split_movdi
+	
 2017-12-10  Gerald Pfeifer  <gerald@pfeifer.com>
 
 	* doc/install.texi (Specific): Tweak link to mkssoftware.com.
Index: config/rl78/rl78-protos.h
===================================================================
--- config/rl78/rl78-protos.h	(revision 255538)
+++ config/rl78/rl78-protos.h	(working copy)
@@ -23,6 +23,7 @@ 
 void		rl78_expand_compare (rtx *);
 void		rl78_expand_movsi (rtx *);
 void		rl78_split_movsi (rtx *, machine_mode);
+void 		rl78_split_movdi (rtx *, enum machine_mode);
 int		rl78_force_nonfar_2 (rtx *, rtx (*gen)(rtx,rtx));
 int		rl78_force_nonfar_3 (rtx *, rtx (*gen)(rtx,rtx,rtx));
 void		rl78_expand_eh_epilogue (rtx);
Index: config/rl78/rl78.c
===================================================================
--- config/rl78/rl78.c	(revision 255538)
+++ config/rl78/rl78.c	(working copy)
@@ -596,6 +596,18 @@ 
     }
 }
 
+void
+rl78_split_movdi (rtx *operands, enum machine_mode omode)
+{
+    rtx op00, op04, op10, op14;
+    op00 = rl78_subreg (SImode, operands[0], omode, 0);
+    op04 = rl78_subreg (SImode, operands[0], omode, 4);
+    op10 = rl78_subreg (SImode, operands[1], omode, 0);
+    op14 = rl78_subreg (SImode, operands[1], omode, 4);
+    emit_insn (gen_movsi (op00, op10));
+    emit_insn (gen_movsi (op04, op14));
+}
+
 /* Used by various two-operand expanders which cannot accept all
    operands in the "far" namespace.  Force some such operands into
    registers so that each pattern has at most one far operand.  */
Index: config/rl78/rl78.md
===================================================================
--- config/rl78/rl78.md	(revision 255538)
+++ config/rl78/rl78.md	(working copy)
@@ -718,3 +718,11 @@ 
   [(set_attr "valloc" "macax")
    (set_attr "is_g13_muldiv_insn" "yes")]
 )
+
+(define_expand "movdi"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "")
+        (match_operand:DI 1 "general_operand"      ""))]
+  ""
+  "rl78_split_movdi(operands, DImode);
+  DONE;"
+)