patch to fix PR93564

Message ID 79613404-2576-4163-d75c-565b8b32d020@redhat.com
State New
Headers show
Series
  • patch to fix PR93564
Related show

Commit Message

Vladimir Makarov Feb. 23, 2020, 9:25 p.m.
The following patch is for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93564

The patch was successfully bootstrapped on x86-64 and benchmarked on 
SPEC2000.

Comments

Christophe Lyon Feb. 24, 2020, 9:54 a.m. | #1
Hi,

On Sun, 23 Feb 2020 at 22:26, Vladimir Makarov <vmakarov@redhat.com> wrote:
>

> The following patch is for

>

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93564

>

> The patch was successfully bootstrapped on x86-64 and benchmarked on

> SPEC2000.

>



It seems this patch causes regression on some arm cores (seen on
cortex-a15 and cortex-57).
When configuring
--target arm-none-linux-gnueabihf
--with-cpu cortex-a57
--with-fpu crypto-neon-fp-armv8

gcc.target/arm/unaligned-memcpy-2.c: ldrd found 1 times
FAIL: gcc.target/arm/unaligned-memcpy-2.c scan-assembler-times ldrd 0
gcc.target/arm/unaligned-memcpy-2.c: strd found 2 times
FAIL: gcc.target/arm/unaligned-memcpy-2.c scan-assembler-times strd 1

This one regresses only on cortex-a15:
gcc.target/arm/unaligned-memcpy-3.c: ldrd found 2 times
FAIL: gcc.target/arm/unaligned-memcpy-3.c scan-assembler-times ldrd 1
gcc.target/arm/unaligned-memcpy-3.c: strd found 1 times
FAIL: gcc.target/arm/unaligned-memcpy-3.c scan-assembler-times strd 0

Christophe
Andrew Stubbs Feb. 26, 2020, 3:16 p.m. | #2
On 23/02/2020 21:25, Vladimir Makarov wrote:
> The following patch is for

> 

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93564

> 

> The patch was successfully bootstrapped on x86-64 and benchmarked on 

> SPEC2000.


Since this patch I get an ICE with checking enabled, for amdgcn-amdhsa:

     during RTL pass: reload
     dump file: /scratch/astubbs/amd/upstreamB/tmp/target.283r.reload
     ....../gcc.c-torture/execute/ieee/compare-fp-1.c: In function 'ieq':
     ....../gcc.c-torture/execute/ieee/compare-fp-1.c:33:1: internal 
compiler error: in lra_constraints, at lra-constraints.c:5051
        33 | }
           | ^
     0x103d7cf lra_constraints(bool)
             ..../gcc/lra-constraints.c:5051
     0x102541a lra(_IO_FILE*)
             ..../gcc/lra.c:2437
     0xfcc9c9 do_reload
             ..../gcc/ira.c:5523
     0xfcce50 execute
             ..../gcc/ira.c:5709
     Please submit a full bug report,
     with preprocessed source if appropriate.
     Please include the complete backtrace with any bug report.
     See <https://gcc.gnu.org/bugs/> for instructions.

This happens at all optimization levels (but not -O0).

The problem appears to be that the high-part of a register pair is not 
marked as "ever live".  I'm trying to figure out whether this is some 
kind of target-specific issue that has merely been exposed, but it's 
difficult to see what's going on. I'm pretty sure I've never seen this 
one before.

Andrew
Andrew Stubbs Feb. 27, 2020, 12:33 p.m. | #3
On 26/02/2020 15:16, Andrew Stubbs wrote:
> The problem appears to be that the high-part of a register pair is not 

> marked as "ever live".  I'm trying to figure out whether this is some 

> kind of target-specific issue that has merely been exposed, but it's 

> difficult to see what's going on. I'm pretty sure I've never seen this 

> one before.


I'm now pretty sure your patch didn't cause this issue so much as expose it.

Either way, it's fixed now.

Andrew
Vladimir Makarov Feb. 27, 2020, 2:54 p.m. | #4
On 2020-02-27 7:33 a.m., Andrew Stubbs wrote:
> On 26/02/2020 15:16, Andrew Stubbs wrote:

>> The problem appears to be that the high-part of a register pair is 

>> not marked as "ever live".  I'm trying to figure out whether this is 

>> some kind of target-specific issue that has merely been exposed, but 

>> it's difficult to see what's going on. I'm pretty sure I've never 

>> seen this one before.

>

> I'm now pretty sure your patch didn't cause this issue so much as 

> expose it.

>

> Either way, it's fixed now.

>

Thank you for informing me about this.  Such heuristic changes should 
not affect generated code correctness unless they trigger a hidden bug 
in RA or machine-depended code used by RA.
Vladimir Makarov Feb. 28, 2020, 4:39 p.m. | #5
On 2020-02-24 4:54 a.m., Christophe Lyon wrote:
> Hi,

>

> On Sun, 23 Feb 2020 at 22:26, Vladimir Makarov <vmakarov@redhat.com> wrote:

>> The following patch is for

>>

>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93564

>>

>> The patch was successfully bootstrapped on x86-64 and benchmarked on

>> SPEC2000.

>>

>

> It seems this patch causes regression on some arm cores (seen on

> cortex-a15 and cortex-57).

> When configuring

> --target arm-none-linux-gnueabihf

> --with-cpu cortex-a57

> --with-fpu crypto-neon-fp-armv8

>

> gcc.target/arm/unaligned-memcpy-2.c: ldrd found 1 times

> FAIL: gcc.target/arm/unaligned-memcpy-2.c scan-assembler-times ldrd 0

> gcc.target/arm/unaligned-memcpy-2.c: strd found 2 times

> FAIL: gcc.target/arm/unaligned-memcpy-2.c scan-assembler-times strd 1

>

> This one regresses only on cortex-a15:

> gcc.target/arm/unaligned-memcpy-3.c: ldrd found 2 times

> FAIL: gcc.target/arm/unaligned-memcpy-3.c scan-assembler-times ldrd 1

> gcc.target/arm/unaligned-memcpy-3.c: strd found 1 times

> FAIL: gcc.target/arm/unaligned-memcpy-3.c scan-assembler-times strd 0

>

I've just submitted a patch which I hope will solve some of the failures.

Patch

commit 3133bed5d0327e8a9cd0a601b7ecdb9de4fc825d
Author: Vladimir N. Makarov <vmakarov@redhat.com>
Date:   Sun Feb 23 16:20:05 2020 -0500

    Changing cost propagation and ordering colorable bucket heuristics for PR93564.
    
    2020-02-23  Vladimir Makarov  <vmakarov@redhat.com>
    
            PR rtl-optimization/93564
            * ira-color.c (struct update_cost_queue_elem): New member start.
            (queue_update_cost, get_next_update_cost): Add new arg start.
            (allocnos_conflict_p): New function.
            (update_costs_from_allocno): Add new arg conflict_cost_update_p.
            Add checking conflicts with allocnos_conflict_p.
            (update_costs_from_prefs, restore_costs_from_copies): Adjust
            update_costs_from_allocno calls.
            (update_conflict_hard_regno_costs): Add checking conflicts with
            allocnos_conflict_p.  Adjust calls of queue_update_cost and
            get_next_update_cost.
            (assign_hard_reg): Adjust calls of queue_update_cost.  Add
            debugging print.
            (bucket_allocno_compare_func): Restore previous version.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cea52f00523..b3b9d92a1ec 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,20 @@ 
+2020-02-23  Vladimir Makarov  <vmakarov@redhat.com>
+
+	PR rtl-optimization/93564
+	* ira-color.c (struct update_cost_queue_elem): New member start.
+	(queue_update_cost, get_next_update_cost): Add new arg start.
+	(allocnos_conflict_p): New function.
+	(update_costs_from_allocno): Add new arg conflict_cost_update_p.
+	Add checking conflicts with allocnos_conflict_p.
+	(update_costs_from_prefs, restore_costs_from_copies): Adjust
+	update_costs_from_allocno calls.
+	(update_conflict_hard_regno_costs): Add checking conflicts with
+	allocnos_conflict_p.  Adjust calls of queue_update_cost and
+	get_next_update_cost.
+	(assign_hard_reg): Adjust calls of queue_update_cost.  Add
+	debugging print.
+	(bucket_allocno_compare_func): Restore previous version.
+
 2020-02-21  John David Anglin  <danglin@gcc.gnu.org>
 
 	* gcc/config/pa/pa.c (pa_function_value): Fix check for word and
diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index 0bcc80463b0..0ffdd192020 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -1199,6 +1199,10 @@  struct update_cost_queue_elem
      connecting this allocno to the one being allocated.  */
   int divisor;
 
+  /* Allocno from which we started chaining costs of connected
+     allocnos. */
+  ira_allocno_t start;
+
   /* Allocno from which we are chaining costs of connected allocnos.
      It is used not go back in graph of allocnos connected by
      copies.  */
@@ -1258,10 +1262,11 @@  start_update_cost (void)
   update_cost_queue = NULL;
 }
 
-/* Add (ALLOCNO, FROM, DIVISOR) to the end of update_cost_queue, unless
+/* Add (ALLOCNO, START, FROM, DIVISOR) to the end of update_cost_queue, unless
    ALLOCNO is already in the queue, or has NO_REGS class.  */
 static inline void
-queue_update_cost (ira_allocno_t allocno, ira_allocno_t from, int divisor)
+queue_update_cost (ira_allocno_t allocno, ira_allocno_t start,
+		   ira_allocno_t from, int divisor)
 {
   struct update_cost_queue_elem *elem;
 
@@ -1270,6 +1275,7 @@  queue_update_cost (ira_allocno_t allocno, ira_allocno_t from, int divisor)
       && ALLOCNO_CLASS (allocno) != NO_REGS)
     {
       elem->check = update_cost_check;
+      elem->start = start;
       elem->from = from;
       elem->divisor = divisor;
       elem->next = NULL;
@@ -1282,10 +1288,11 @@  queue_update_cost (ira_allocno_t allocno, ira_allocno_t from, int divisor)
 }
 
 /* Try to remove the first element from update_cost_queue.  Return
-   false if the queue was empty, otherwise make (*ALLOCNO, *FROM,
-   *DIVISOR) describe the removed element.  */
+   false if the queue was empty, otherwise make (*ALLOCNO, *START,
+   *FROM, *DIVISOR) describe the removed element.  */
 static inline bool
-get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *from, int *divisor)
+get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *start,
+		      ira_allocno_t *from, int *divisor)
 {
   struct update_cost_queue_elem *elem;
 
@@ -1294,6 +1301,7 @@  get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *from, int *divisor)
 
   *allocno = update_cost_queue;
   elem = &update_cost_queue_elems[ALLOCNO_NUM (*allocno)];
+  *start = elem->start;
   *from = elem->from;
   *divisor = elem->divisor;
   update_cost_queue = elem->next;
@@ -1325,18 +1333,41 @@  update_allocno_cost (ira_allocno_t allocno, int hard_regno,
   return true;
 }
 
+/* Return TRUE if allocnos A1 and A2 conflicts. Here we are
+   interesting only in conflicts of allocnos with intersected allocno
+   classes. */
+static bool
+allocnos_conflict_p (ira_allocno_t a1, ira_allocno_t a2)
+{
+  ira_object_t obj, conflict_obj;
+  ira_object_conflict_iterator oci;
+  int word, nwords = ALLOCNO_NUM_OBJECTS (a1);
+  
+  for (word = 0; word < nwords; word++)
+    {
+      obj = ALLOCNO_OBJECT (a1, word);
+      /* Take preferences of conflicting allocnos into account.  */
+      FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
+	if (OBJECT_ALLOCNO (conflict_obj) == a2)
+	  return true;
+    }
+  return false;
+}  
+
 /* Update (decrease if DECR_P) HARD_REGNO cost of allocnos connected
    by copies to ALLOCNO to increase chances to remove some copies as
-   the result of subsequent assignment.  Record cost updates if
-   RECORD_P is true.  */
+   the result of subsequent assignment.  Update conflict costs only
+   for true CONFLICT_COST_UPDATE_P.  Record cost updates if RECORD_P is
+   true.  */
 static void
 update_costs_from_allocno (ira_allocno_t allocno, int hard_regno,
-			   int divisor, bool decr_p, bool record_p)
+			   int divisor, bool decr_p, bool record_p,
+			   bool conflict_cost_update_p)
 {
   int cost, update_cost, update_conflict_cost;
   machine_mode mode;
   enum reg_class rclass, aclass;
-  ira_allocno_t another_allocno, from = NULL;
+  ira_allocno_t another_allocno, start = allocno, from = NULL;
   ira_copy_t cp, next_cp;
 
   rclass = REGNO_REG_CLASS (hard_regno);
@@ -1359,7 +1390,8 @@  update_costs_from_allocno (ira_allocno_t allocno, int hard_regno,
 	  else
 	    gcc_unreachable ();
 
-	  if (another_allocno == from)
+	  if (another_allocno == from
+	      || allocnos_conflict_p (another_allocno, start))
 	    continue;
 
 	  aclass = ALLOCNO_CLASS (another_allocno);
@@ -1384,7 +1416,8 @@  update_costs_from_allocno (ira_allocno_t allocno, int hard_regno,
 	  if (decr_p)
 	    cost = -cost;
 
-	  update_conflict_cost = update_cost = cp->freq * cost / divisor;
+	  update_cost = cp->freq * cost / divisor;
+	  update_conflict_cost = conflict_cost_update_p ? update_cost : 0;
 
 	  if (ALLOCNO_COLOR_DATA (another_allocno) != NULL
 	      && (ALLOCNO_COLOR_DATA (allocno)->first_thread_allocno
@@ -1399,7 +1432,8 @@  update_costs_from_allocno (ira_allocno_t allocno, int hard_regno,
 	  if (! update_allocno_cost (another_allocno, hard_regno,
 				     update_cost, update_conflict_cost))
 	    continue;
-	  queue_update_cost (another_allocno, allocno, divisor * COST_HOP_DIVISOR);
+	  queue_update_cost (another_allocno, start, allocno,
+			     divisor * COST_HOP_DIVISOR);
 	  if (record_p && ALLOCNO_COLOR_DATA (another_allocno) != NULL)
 	    ALLOCNO_COLOR_DATA (another_allocno)->update_cost_records
 	      = get_update_cost_record (hard_regno, divisor,
@@ -1407,7 +1441,7 @@  update_costs_from_allocno (ira_allocno_t allocno, int hard_regno,
 					->update_cost_records);
 	}
     }
-  while (get_next_update_cost (&allocno, &from, &divisor));
+  while (get_next_update_cost (&allocno, &start, &from, &divisor));
 }
 
 /* Decrease preferred ALLOCNO hard register costs and costs of
@@ -1420,7 +1454,7 @@  update_costs_from_prefs (ira_allocno_t allocno)
   start_update_cost ();
   for (pref = ALLOCNO_PREFS (allocno); pref != NULL; pref = pref->next_pref)
     update_costs_from_allocno (allocno, pref->hard_regno,
-			       COST_HOP_DIVISOR, true, true);
+			       COST_HOP_DIVISOR, true, true, false);
 }
 
 /* Update (decrease if DECR_P) the cost of allocnos connected to
@@ -1435,7 +1469,7 @@  update_costs_from_copies (ira_allocno_t allocno, bool decr_p, bool record_p)
   hard_regno = ALLOCNO_HARD_REGNO (allocno);
   ira_assert (hard_regno >= 0 && ALLOCNO_CLASS (allocno) != NO_REGS);
   start_update_cost ();
-  update_costs_from_allocno (allocno, hard_regno, 1, decr_p, record_p);
+  update_costs_from_allocno (allocno, hard_regno, 1, decr_p, record_p, true);
 }
 
 /* Update conflict_allocno_hard_prefs of allocnos conflicting with
@@ -1485,7 +1519,7 @@  restore_costs_from_copies (ira_allocno_t allocno)
   start_update_cost ();
   for (curr = records; curr != NULL; curr = curr->next)
     update_costs_from_allocno (allocno, curr->hard_regno,
-			       curr->divisor, true, false);
+			       curr->divisor, true, false, true);
   free_update_cost_record_list (records);
   ALLOCNO_COLOR_DATA (allocno)->update_cost_records = NULL;
 }
@@ -1503,10 +1537,10 @@  update_conflict_hard_regno_costs (int *costs, enum reg_class aclass,
   int *conflict_costs;
   bool cont_p;
   enum reg_class another_aclass;
-  ira_allocno_t allocno, another_allocno, from;
+  ira_allocno_t allocno, another_allocno, start, from;
   ira_copy_t cp, next_cp;
 
-  while (get_next_update_cost (&allocno, &from, &divisor))
+  while (get_next_update_cost (&allocno, &start, &from, &divisor))
     for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp)
       {
 	if (cp->first == allocno)
@@ -1522,7 +1556,8 @@  update_conflict_hard_regno_costs (int *costs, enum reg_class aclass,
 	else
 	  gcc_unreachable ();
 
-	if (another_allocno == from)
+	if (another_allocno == from
+	    || allocnos_conflict_p (another_allocno, start))
 	  continue;
 
  	another_aclass = ALLOCNO_CLASS (another_allocno);
@@ -1568,7 +1603,7 @@  update_conflict_hard_regno_costs (int *costs, enum reg_class aclass,
 			   * COST_HOP_DIVISOR
 			   * COST_HOP_DIVISOR
 			   * COST_HOP_DIVISOR))
-	  queue_update_cost (another_allocno, allocno, divisor * COST_HOP_DIVISOR);
+	  queue_update_cost (another_allocno, start, from, divisor * COST_HOP_DIVISOR);
       }
 }
 
@@ -1837,8 +1872,7 @@  assign_hard_reg (ira_allocno_t a, bool retry_p)
 		      continue;
 		    full_costs[j] -= conflict_costs[k];
 		  }
-	      queue_update_cost (conflict_a, NULL, COST_HOP_DIVISOR);
-
+	      queue_update_cost (conflict_a, conflict_a, NULL, COST_HOP_DIVISOR);
 	    }
 	}
     }
@@ -1852,7 +1886,7 @@  assign_hard_reg (ira_allocno_t a, bool retry_p)
   if (! retry_p)
     {
       start_update_cost ();
-      queue_update_cost (a, NULL,  COST_HOP_DIVISOR);
+      queue_update_cost (a, a, NULL, COST_HOP_DIVISOR);
       update_conflict_hard_regno_costs (full_costs, aclass, false);
     }
   min_cost = min_full_cost = INT_MAX;
@@ -1897,6 +1931,8 @@  assign_hard_reg (ira_allocno_t a, bool retry_p)
 	  best_hard_regno = hard_regno;
 	  ira_assert (hard_regno >= 0);
 	}
+      if (internal_flag_ira_verbose > 5 && ira_dump_file != NULL)
+	fprintf (ira_dump_file, "(%d=%d,%d) ", hard_regno, cost, full_cost);
     }
   if (min_full_cost > mem_cost
       /* Do not spill static chain pointer pseudo when non-local goto
@@ -2259,16 +2295,6 @@  bucket_allocno_compare_func (const void *v1p, const void *v2p)
   ira_allocno_t t2 = ALLOCNO_COLOR_DATA (a2)->first_thread_allocno;
   int cl1 = ALLOCNO_CLASS (a1), cl2 = ALLOCNO_CLASS (a2);
 
-  /* Push allocnos with minimal hard_reg_prefs first.  */
-  pref1 = ALLOCNO_COLOR_DATA (a1)->hard_reg_prefs;
-  pref2 = ALLOCNO_COLOR_DATA (a2)->hard_reg_prefs;
-  if ((diff = pref1 - pref2) != 0)
-    return diff;
-  /* Push allocnos with minimal conflict_allocno_hard_prefs first.  */
-  pref1 = ALLOCNO_COLOR_DATA (a1)->conflict_allocno_hard_prefs;
-  pref2 = ALLOCNO_COLOR_DATA (a2)->conflict_allocno_hard_prefs;
-  if ((diff = pref1 - pref2) != 0)
-    return diff;
   freq1 = ALLOCNO_COLOR_DATA (t1)->thread_freq;
   freq2 = ALLOCNO_COLOR_DATA (t2)->thread_freq;
   if ((diff = freq1 - freq2) != 0)
@@ -2294,6 +2320,11 @@  bucket_allocno_compare_func (const void *v1p, const void *v2p)
   a2_num = ALLOCNO_COLOR_DATA (a2)->available_regs_num;
   if ((diff = a2_num - a1_num) != 0)
     return diff;
+  /* Push allocnos with minimal conflict_allocno_hard_prefs first.  */
+  pref1 = ALLOCNO_COLOR_DATA (a1)->conflict_allocno_hard_prefs;
+  pref2 = ALLOCNO_COLOR_DATA (a2)->conflict_allocno_hard_prefs;
+  if ((diff = pref1 - pref2) != 0)
+    return diff;
   return ALLOCNO_NUM (a2) - ALLOCNO_NUM (a1);
 }