RFC: make combine do as advertised (cheaper-than)?

Message ID 202007060201.06621sH0032715@ignucius.se.axis.com
State New
Headers show
Series
  • RFC: make combine do as advertised (cheaper-than)?
Related show

Commit Message

Richard Biener via Gcc-patches July 6, 2020, 2:01 a.m.
Most comments, including the second sentence in the head comment
of combine_validate_cost, the main decision-maker of the combine
pass, refer to the function as returning true if the new
insns(s) *cheaper* than the old insns, when in fact the function
returned true also if the cost was the same.  Returning true for
cheaper also seems more sane than as-cheap-as considering the
need to avoid oscillation between same-cost combinations.  Also,
it makes the job of later passes harder, having combine make
more complex combinations of the same cost.

Right, you can affect this with your target TARGET_RTX_COSTS and
TARGET_INSN_COST hooks, but only for trivial cases, and you have
increasingly more complex combinations (many-to-many
combinations) where you have to twist simple logic to appease
combine (stop it from combining) or give up.

Main-interest ports are unsurprisingly pretty tied to this
effect.  I'd love to install the following patch, adjusting the
function and the two opposing comments.  But...it causes
hundreds of regressions for each of x86_64-linux and
aarch64-linux (tens for ppc64le-linux), so I'm just not up to
the task, at least not without previous buy-in from reviewers.

It would need those targets to have their TARGET_INSN_COST
and/or TARGET_RTX_COSTS functions adjusted.

Alternatives from the top of my head, one of:

- With buy-in from global reviewers, installing this patch on a
development branch and let all target maintainers adjust their
target test-cases and cost-functions there, for merge when
first-class targets are done.  (I'm a dreamer.)

- A target combine hook for the decision (passing for inspection
tuples of from-insns and to-insns and costs) and just falling
back to the current addition of rtx costs.

- A simpler target combine decision hook that says which one of
"cheaper" or "as-cheap-as".

- Adjusting documentation and comments that are currently
untruthful about the cost decision to instead say (to the effect
of) "as cheap as" instead of "cheaper".

So, WDYT?

(Tested as above, causing massive pattern-match regressions.)

gcc:
	* combine.c (combine_validate_cost): Reject unless the new total
 	cost is cheaper than the original.  Adjust the minority of
 	comments that don't say "cheaper":

---
 gcc/combine.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

-- 
2.11.0

Comments

Richard Biener via Gcc-patches July 6, 2020, 7:50 a.m. | #1
On Mon, Jul 6, 2020 at 4:03 AM Hans-Peter Nilsson via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>

> Most comments, including the second sentence in the head comment

> of combine_validate_cost, the main decision-maker of the combine

> pass, refer to the function as returning true if the new

> insns(s) *cheaper* than the old insns, when in fact the function

> returned true also if the cost was the same.  Returning true for

> cheaper also seems more sane than as-cheap-as considering the

> need to avoid oscillation between same-cost combinations.  Also,

> it makes the job of later passes harder, having combine make

> more complex combinations of the same cost.

>

> Right, you can affect this with your target TARGET_RTX_COSTS and

> TARGET_INSN_COST hooks, but only for trivial cases, and you have

> increasingly more complex combinations (many-to-many

> combinations) where you have to twist simple logic to appease

> combine (stop it from combining) or give up.

>

> Main-interest ports are unsurprisingly pretty tied to this

> effect.  I'd love to install the following patch, adjusting the

> function and the two opposing comments.  But...it causes

> hundreds of regressions for each of x86_64-linux and

> aarch64-linux (tens for ppc64le-linux), so I'm just not up to

> the task, at least not without previous buy-in from reviewers.

>

> It would need those targets to have their TARGET_INSN_COST

> and/or TARGET_RTX_COSTS functions adjusted.

>

> Alternatives from the top of my head, one of:

>

> - With buy-in from global reviewers, installing this patch on a

> development branch and let all target maintainers adjust their

> target test-cases and cost-functions there, for merge when

> first-class targets are done.  (I'm a dreamer.)

>

> - A target combine hook for the decision (passing for inspection

> tuples of from-insns and to-insns and costs) and just falling

> back to the current addition of rtx costs.

>

> - A simpler target combine decision hook that says which one of

> "cheaper" or "as-cheap-as".


A target combine decision hook that on old == new cost
decides between both and given the actual insns?  Might
lead to quite hackish target hook implementations...

> - Adjusting documentation and comments that are currently

> untruthful about the cost decision to instead say (to the effect

> of) "as cheap as" instead of "cheaper".


Well, adjusting the function comment to reflect reality is
kind-of obvious ;)

> So, WDYT?

>

> (Tested as above, causing massive pattern-match regressions.)

>

> gcc:

>         * combine.c (combine_validate_cost): Reject unless the new total

>         cost is cheaper than the original.  Adjust the minority of

>         comments that don't say "cheaper":

>

> ---

>  gcc/combine.c | 8 ++++----

>  1 file changed, 4 insertions(+), 4 deletions(-)

>

> diff --git a/gcc/combine.c b/gcc/combine.c

> index f69413a..7da144e 100644

> --- a/gcc/combine.c

> +++ b/gcc/combine.c

> @@ -846,8 +846,8 @@ do_SUBST_LINK (struct insn_link **into, struct insn_link *newval)

>     than the original sequence I0, I1, I2, I3 and undobuf.other_insn.  Note

>     that I0, I1 and/or NEWI2PAT may be NULL_RTX.  Similarly, NEWOTHERPAT and

>     undobuf.other_insn may also both be NULL_RTX.  Return false if the cost

> -   of all the instructions can be estimated and the replacements are more

> -   expensive than the original sequence.  */

> +   of all the instructions can be estimated and the replacements are not

> +   cheaper than the original sequence.  */

>

>  static bool

>  combine_validate_cost (rtx_insn *i0, rtx_insn *i1, rtx_insn *i2, rtx_insn *i3,

> @@ -938,8 +938,8 @@ combine_validate_cost (rtx_insn *i0, rtx_insn *i1, rtx_insn *i2, rtx_insn *i3,

>      }

>

>    /* Disallow this combination if both new_cost and old_cost are greater than

> -     zero, and new_cost is greater than old cost.  */

> -  int reject = old_cost > 0 && new_cost > old_cost;

> +     zero, and new_cost is greater than or equal to the old cost.  */

> +  int reject = old_cost > 0 && new_cost >= old_cost;

>

>    if (dump_file)

>      {

> --

> 2.11.0
Richard Sandiford July 6, 2020, 9:48 a.m. | #2
Hans-Peter Nilsson via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Most comments, including the second sentence in the head comment

> of combine_validate_cost, the main decision-maker of the combine

> pass, refer to the function as returning true if the new

> insns(s) *cheaper* than the old insns, when in fact the function

> returned true also if the cost was the same.  Returning true for

> cheaper also seems more sane than as-cheap-as considering the

> need to avoid oscillation between same-cost combinations.  Also,

> it makes the job of later passes harder, having combine make

> more complex combinations of the same cost.

>

> Right, you can affect this with your target TARGET_RTX_COSTS and

> TARGET_INSN_COST hooks, but only for trivial cases, and you have

> increasingly more complex combinations (many-to-many

> combinations) where you have to twist simple logic to appease

> combine (stop it from combining) or give up.

>

> Main-interest ports are unsurprisingly pretty tied to this

> effect.  I'd love to install the following patch, adjusting the

> function and the two opposing comments.  But...it causes

> hundreds of regressions for each of x86_64-linux and

> aarch64-linux (tens for ppc64le-linux), so I'm just not up to

> the task, at least not without previous buy-in from reviewers.


Out of interest, how do the results change if we still allow the
combination for equal costs if the new sequence is shorter than
the original?  I think that still counts as “cheaper than”,
but maybe I'm too RISC-centric. ;-)  (I'm not saying that's
what we should do, I'm just curious.)

Originally combine always produced shorter sequences, so by the
above reasoning, combining for == was correct.  These days we allow
N-to-N replacements too, which are obviously a good thing when
the new cost is lower, but are more of a wash when the costs
are the same.  But even then, the combination should have a
“canonicalisation” effect.  (Unfortunately that effect includes
the result of expand_compound_operation/make_compound_operation.)

Do you have specific examples of where things go wrong?

Thanks,
Richard
Segher Boessenkool July 14, 2020, 9:44 p.m. | #3
Hi!

On Mon, Jul 06, 2020 at 10:48:25AM +0100, Richard Sandiford wrote:
> Originally combine always produced shorter sequences, so by the


(Shorter in # insns, not # bytes).

> above reasoning, combining for == was correct.  These days we allow

> N-to-N replacements too, which are obviously a good thing when


Only 2-2 so far (not 1-1 yet, there are some target problems to
overcome first).

> the new cost is lower, but are more of a wash when the costs

> are the same.  But even then, the combination should have a

> “canonicalisation” effect.  (Unfortunately that effect includes

> the result of expand_compound_operation/make_compound_operation.)


2-2 is always reducing latency if the costs are equal (and sane ;-) ),
that is a large part of what makes 2-2 combinations useful.  Originally
the output of i2 is input to i3, but not anymore in the new insns.


Segher
Segher Boessenkool July 14, 2020, 9:58 p.m. | #4
Hi!

On Mon, Jul 06, 2020 at 04:01:54AM +0200, Hans-Peter Nilsson via Gcc-patches wrote:
> Most comments, including the second sentence in the head comment

> of combine_validate_cost, the main decision-maker of the combine

> pass, refer to the function as returning true if the new

> insns(s) *cheaper* than the old insns, when in fact the function

> returned true also if the cost was the same.  Returning true for

> cheaper also seems more sane than as-cheap-as considering the

> need to avoid oscillation between same-cost combinations.  Also,

> it makes the job of later passes harder, having combine make

> more complex combinations of the same cost.


All of this was added in https://gcc.gnu.org/g:64b8935d4809 , which also
adds the

+  /* Disallow this recombination if both new_cost and old_cost are
+     greater than zero, and new_cost is greater than old cost.  */

comment (which is what it actually does).  Before that change, combine
didn't look at costs at all.

Combine never has used a "strictly smaller than" cost thing.

> Right, you can affect this with your target TARGET_RTX_COSTS and

> TARGET_INSN_COST hooks, but only for trivial cases, and you have

> increasingly more complex combinations (many-to-many

> combinations) where you have to twist simple logic to appease

> combine (stop it from combining) or give up.


There are 2-1, 3-1, 4-1, 3-2, 4-2, which all reduce the number of insns,
and then there is 2-2, which gets rid of one log_link.  If the isnn_cost
stays the same, it always wins something else.

> Alternatives from the top of my head, one of:


...

5) Improve your target so that its insn_cost reflects ithe costs of
the insns better.

Can you share some typical examples where things are worse with the
current behaviour?


Segher

Patch

diff --git a/gcc/combine.c b/gcc/combine.c
index f69413a..7da144e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -846,8 +846,8 @@  do_SUBST_LINK (struct insn_link **into, struct insn_link *newval)
    than the original sequence I0, I1, I2, I3 and undobuf.other_insn.  Note
    that I0, I1 and/or NEWI2PAT may be NULL_RTX.  Similarly, NEWOTHERPAT and
    undobuf.other_insn may also both be NULL_RTX.  Return false if the cost
-   of all the instructions can be estimated and the replacements are more
-   expensive than the original sequence.  */
+   of all the instructions can be estimated and the replacements are not
+   cheaper than the original sequence.  */
 
 static bool
 combine_validate_cost (rtx_insn *i0, rtx_insn *i1, rtx_insn *i2, rtx_insn *i3,
@@ -938,8 +938,8 @@  combine_validate_cost (rtx_insn *i0, rtx_insn *i1, rtx_insn *i2, rtx_insn *i3,
     }
 
   /* Disallow this combination if both new_cost and old_cost are greater than
-     zero, and new_cost is greater than old cost.  */
-  int reject = old_cost > 0 && new_cost > old_cost;
+     zero, and new_cost is greater than or equal to the old cost.  */
+  int reject = old_cost > 0 && new_cost >= old_cost;
 
   if (dump_file)
     {