[2/4,ARC] Use TARGET_INSN_COST.

Message ID 20200203113832.20270-2-claziss@gmail.com
State New
Headers show
Series
  • [1/4,ARC] Update mlo/mhi handling when big-endian CPU.
Related show

Commit Message

Claudiu Zissulescu Feb. 3, 2020, 11:38 a.m.
TARGET_INSN_COST gives us a better control over the instruction costs
than classical RTX_COSTS.  A simple cost scheme is in place for the
time being, when optimizing for size, the cost is given by the
instruction length. When optimizing for speed, the cost is 1 for any
recognized instruction, and 2 for any load/store instruction.  The
latter one can be overwritten by using cost attribute for an
instruction.  Due to this change, we need to update also a number of
instruction patterns with a new predicate to better reflect the costs.

gcc/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.c (arc_insn_cost): New function.
	(TARGET_INSN_COST): Define.
	* config/arc/arc.md (cost): New attribute.
	(add_n): Use arc_nonmemory_operand.
	(ashlsi3_insn): Likewise, also update constraints.
	(ashrsi3_insn): Likewise.
	(rotrsi3): Likewise.
	(add_shift): Likewise.
	* config/arc/predicates.md (arc_nonmemory_operand): New predicate.

testsuite/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

	* gcc.target/arc/or-cnst-size2.c: Update test.
---
 gcc/config/arc/arc.c                         | 52 ++++++++++++++++++++
 gcc/config/arc/arc.md                        | 47 +++++++++---------
 gcc/config/arc/predicates.md                 |  5 ++
 gcc/testsuite/gcc.target/arc/or-cnst-size2.c |  2 +-
 4 files changed, 83 insertions(+), 23 deletions(-)

-- 
2.24.1

Comments

Jeff Law Feb. 3, 2020, 4:24 p.m. | #1
On Mon, 2020-02-03 at 12:38 +0100, Claudiu Zissulescu wrote:
> TARGET_INSN_COST gives us a better control over the instruction costs

> than classical RTX_COSTS.  A simple cost scheme is in place for the

> time being, when optimizing for size, the cost is given by the

> instruction length. When optimizing for speed, the cost is 1 for any

> recognized instruction, and 2 for any load/store instruction.  The

> latter one can be overwritten by using cost attribute for an

> instruction.  Due to this change, we need to update also a number of

> instruction patterns with a new predicate to better reflect the costs.

> 

> gcc/

> xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

> 

> 	* config/arc/arc.c (arc_insn_cost): New function.

> 	(TARGET_INSN_COST): Define.

> 	* config/arc/arc.md (cost): New attribute.

> 	(add_n): Use arc_nonmemory_operand.

> 	(ashlsi3_insn): Likewise, also update constraints.

> 	(ashrsi3_insn): Likewise.

> 	(rotrsi3): Likewise.

> 	(add_shift): Likewise.

> 	* config/arc/predicates.md (arc_nonmemory_operand): New predicate.

> 

> testsuite/

> xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

> 

> 	* gcc.target/arc/or-cnst-size2.c: Update test.

My only worry would be asking for the length early in the RTL pipeline
may not be as accurate, but it's supposed to work, so if you're
comfortable with the end results, then OK.

jeff
>
Claudiu Zissulescu Feb. 4, 2020, 4:56 p.m. | #2
> My only worry would be asking for the length early in the RTL pipeline

> may not be as accurate, but it's supposed to work, so if you're

> comfortable with the end results, then OK.

>

Indeed, the length is not accurate, but the results seem slightly
better than using COST_RTX. Using INSN_COSTS seems to me a better
manageable way in controlling what combiner does.
Anyhow, for ARC the instruction size is accurate quite late in the
compilation process as it needs register and immediate value info :(

Thank you for you review,
Claudiu

Patch

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 7fa0fb51a4b..bd1e12b8a1f 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -11786,6 +11786,55 @@  arc_can_use_return_insn (void)
 	  && !ARC_INTERRUPT_P (arc_compute_function_type (cfun)));
 }
 
+/* Helper for INSN_COST.
+
+   Per Segher Boessenkool: rtx_costs computes the cost for any rtx (an
+   insn, a set, a set source, any random piece of one).  set_src_cost,
+   set_rtx_cost, etc. are helper functions that use that.
+
+   Those functions do not work for parallels.  Also, costs are not
+   additive like this simplified model assumes.  Also, more complex
+   backends tend to miss many cases in their rtx_costs function.
+
+   Many passes that want costs want to know the cost of a full insn.  Like
+   combine.  That's why I created insn_cost: it solves all of the above
+   problems.  */
+
+static int
+arc_insn_cost (rtx_insn *insn, bool speed)
+{
+  int cost;
+  if (recog_memoized (insn) < 0)
+    return 0;
+
+  /* If optimizing for size, we want the insn size.  */
+  if (!speed)
+    return get_attr_length (insn);
+
+  /* Use cost if provided.  */
+  cost = get_attr_cost (insn);
+  if (cost > 0)
+    return cost;
+
+  /* For speed make a simple cost model: memory access is more
+     expensive than any other instruction.  */
+  enum attr_type type = get_attr_type (insn);
+
+  switch (type)
+    {
+    case TYPE_LOAD:
+    case TYPE_STORE:
+      cost = COSTS_N_INSNS (2);
+      break;
+
+    default:
+      cost = COSTS_N_INSNS (1);
+      break;
+    }
+
+  return cost;
+}
+
 #undef TARGET_USE_ANCHORS_FOR_SYMBOL_P
 #define TARGET_USE_ANCHORS_FOR_SYMBOL_P arc_use_anchors_for_symbol_p
 
@@ -11807,6 +11856,9 @@  arc_can_use_return_insn (void)
 #undef TARGET_MEMORY_MOVE_COST
 #define TARGET_MEMORY_MOVE_COST arc_memory_move_cost
 
+#undef  TARGET_INSN_COST
+#define TARGET_INSN_COST arc_insn_cost
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-arc.h"
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index f19f2c32641..fb25aafb024 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -229,6 +229,10 @@ 
   ]
 )
 
+;; What is the insn_cost for this insn?  The target hook can still override
+;; this.  For optimizing for size the "length" attribute is used instead.
+(define_attr "cost" "" (const_int 0))
+
 (define_attr "is_sfunc" "no,yes" (const_string "no"))
 
 ;; Insn type.  Used to default other attribute values.
@@ -3140,9 +3144,9 @@  archs4x, archs4xd"
   [(set (match_operand:SI 0 "dest_reg_operand" "=q,r,r")
 	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "q,r,r")
 			  (match_operand:SI 2 "_2_4_8_operand" ""))
-		 (match_operand:SI 3 "nonmemory_operand" "0,r,Csz")))]
+		 (match_operand:SI 3 "arc_nonmemory_operand" "0,r,Csz")))]
   ""
-  "add%z2%?\\t%0,%3,%1%&"
+  "add%z2%?\\t%0,%3,%1"
   [(set_attr "type" "shift")
    (set_attr "length" "*,4,8")
    (set_attr "predicable" "yes,no,no")
@@ -3573,26 +3577,26 @@  archs4x, archs4xd"
 ; to truncate a symbol in a u6 immediate; but that's rather exotic, so only
 ; provide one alternatice for this, without condexec support.
 (define_insn "*ashlsi3_insn"
-  [(set (match_operand:SI 0 "dest_reg_operand"           "=Rcq,Rcqq,Rcqq,Rcw, w,   w")
-	(ashift:SI (match_operand:SI 1 "nonmemory_operand" "!0,Rcqq,   0,  0, c,cCsz")
-		   (match_operand:SI 2 "nonmemory_operand"  "K,  K,RcqqM, cL,cL,cCal")))]
+  [(set (match_operand:SI 0 "dest_reg_operand"                 "=q,q, q, r, r,   r")
+	(ashift:SI (match_operand:SI 1 "arc_nonmemory_operand" "!0,q, 0, 0, r,rCsz")
+		   (match_operand:SI 2 "nonmemory_operand"      "K,K,qM,rL,rL,rCal")))]
   "TARGET_BARREL_SHIFTER
    && (register_operand (operands[1], SImode)
        || register_operand (operands[2], SImode))"
-  "asl%? %0,%1,%2%&"
+  "asl%?\\t%0,%1,%2"
   [(set_attr "type" "shift")
    (set_attr "iscompact" "maybe,maybe,maybe,false,false,false")
    (set_attr "predicable" "no,no,no,yes,no,no")
    (set_attr "cond" "canuse,nocond,canuse,canuse,nocond,nocond")])
 
 (define_insn "*ashrsi3_insn"
-  [(set (match_operand:SI 0 "dest_reg_operand"             "=Rcq,Rcqq,Rcqq,Rcw, w,   w")
-	(ashiftrt:SI (match_operand:SI 1 "nonmemory_operand" "!0,Rcqq,   0,  0, c,cCsz")
-		     (match_operand:SI 2 "nonmemory_operand"  "K,  K,RcqqM, cL,cL,cCal")))]
+  [(set (match_operand:SI 0 "dest_reg_operand"                   "=q,q, q, r, r,   r")
+	(ashiftrt:SI (match_operand:SI 1 "arc_nonmemory_operand" "!0,q, 0, 0, r,rCsz")
+		     (match_operand:SI 2 "nonmemory_operand"      "K,K,qM,rL,rL,rCal")))]
   "TARGET_BARREL_SHIFTER
    && (register_operand (operands[1], SImode)
        || register_operand (operands[2], SImode))"
-  "asr%? %0,%1,%2%&"
+  "asr%?\\t%0,%1,%2"
   [(set_attr "type" "shift")
    (set_attr "iscompact" "maybe,maybe,maybe,false,false,false")
    (set_attr "predicable" "no,no,no,yes,no,no")
@@ -3613,11 +3617,11 @@  archs4x, archs4xd"
    (set_attr "cond" "canuse,nocond,canuse,canuse,nocond,nocond")])
 
 (define_insn "rotrsi3"
-  [(set (match_operand:SI 0 "dest_reg_operand"             "=Rcw, w,   w")
-	(rotatert:SI (match_operand:SI 1 "register_operand"  " 0,cL,cCsz")
-		     (match_operand:SI 2 "nonmemory_operand" "cL,cL,cCal")))]
+  [(set (match_operand:SI 0 "dest_reg_operand"                    "=r, r,   r")
+	(rotatert:SI (match_operand:SI 1 "arc_nonmemory_operand"  " 0,rL,rCsz")
+		     (match_operand:SI 2 "nonmemory_operand"      "rL,rL,rCal")))]
   "TARGET_BARREL_SHIFTER"
-  "ror%? %0,%1,%2"
+  "ror%?\\t%0,%1,%2"
   [(set_attr "type" "shift,shift,shift")
    (set_attr "predicable" "yes,no,no")
    (set_attr "length" "4,4,8")])
@@ -4349,16 +4353,15 @@  archs4x, archs4xd"
 	(ashift:SI (match_operand:SI 1 "register_operand" "")
 		   (match_operand:SI 2 "_1_2_3_operand" "")))
   (set (match_operand:SI 3 "dest_reg_operand" "")
-       (plus:SI (match_operand:SI 4 "nonmemory_operand" "")
-		(match_operand:SI 5 "nonmemory_operand" "")))]
+       (plus:SI (match_operand:SI 4 "arc_nonmemory_operand" "")
+		(match_operand:SI 5 "arc_nonmemory_operand" "")))]
   "(true_regnum (operands[4]) == true_regnum (operands[0])
        || true_regnum (operands[5]) == true_regnum (operands[0]))
    && (peep2_reg_dead_p (2, operands[0])
-       || (true_regnum (operands[3]) == true_regnum (operands[0])))
-   && !(optimize_size && satisfies_constraint_I (operands[4]))
-   && !(optimize_size && satisfies_constraint_I (operands[5]))"
- ;; the preparation statements take care to put proper operand in operands[4]
- ;; operands[4] will always contain the correct operand. This is added to satisfy commutativity
+       || (true_regnum (operands[3]) == true_regnum (operands[0])))"
+ ;; the preparation statements take care to put proper operand in
+ ;; operands[4] operands[4] will always contain the correct
+ ;; operand. This is added to satisfy commutativity
   [(set (match_dup 3)
 	(plus:SI (mult:SI (match_dup 1)
 			  (match_dup 2))
@@ -6435,7 +6438,7 @@  archs4x, archs4xd"
   [(set (match_operand:SI 0 "register_operand" "=q,r,r")
 	(plus:SI (ashift:SI (match_operand:SI 1 "register_operand" "q,r,r")
 			    (match_operand:SI 2 "_1_2_3_operand" ""))
-		 (match_operand:SI 3 "nonmemory_operand"  "0,r,Csz")))]
+		 (match_operand:SI 3 "arc_nonmemory_operand"  "0,r,Csz")))]
   ""
   "add%2%?\\t%0,%3,%1"
   [(set_attr "length" "*,4,8")
diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
index 3c03436c901..2ad476d5755 100644
--- a/gcc/config/arc/predicates.md
+++ b/gcc/config/arc/predicates.md
@@ -795,3 +795,8 @@ 
   {
    return arc_check_multi (op, false);
 })
+
+(define_predicate "arc_nonmemory_operand"
+  (ior (match_test "register_operand (op, mode)")
+       (and (match_code "const_int, symbol_ref")
+	    (match_test "!optimize_size"))))
diff --git a/gcc/testsuite/gcc.target/arc/or-cnst-size2.c b/gcc/testsuite/gcc.target/arc/or-cnst-size2.c
index 33af97bbdbf..8fa1e65c7f6 100644
--- a/gcc/testsuite/gcc.target/arc/or-cnst-size2.c
+++ b/gcc/testsuite/gcc.target/arc/or-cnst-size2.c
@@ -9,4 +9,4 @@  int foo (void)
 }
 
 /* { dg-final { scan-assembler "tst" } } */
-/* { dg-final { scan-assembler "bset.eq" } } */
+/* { dg-final { scan-assembler "bset" } } */