[v3] Make function clone name numbering independent.

Message ID ded14873-6181-849f-6a3e-faf804d3f097@oracle.com
State New
Headers show
Series
  • [v3] Make function clone name numbering independent.
Related show

Commit Message

Michael Ploujnikov Nov. 28, 2018, 9:09 p.m.
Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html

It took longer than expected, but I've finally rebased this on top of
the new clone_function_name* API and included the requested
optimizations.

There are a few remaining spots where we could probably apply similar
optimizations:

- gcc/multiple_target.c:create_target_clone
- gcc/multiple_target.c:create_dispatcher_calls
- gcc/omp-expand.c:grid_expand_target_grid_body

But I've yet to figure out how these work in detail and with the new
API these shouldn't block the main change from being merged.

I've also included a small change to rs6000 which I'm pretty sure is
safe, but I have no way of testing.

I'm not sure what's the consensus on what's more appropriate, but I
thought that it would be a good idea to keep these changes as separate
patches since only the first one really changes behaviour, while the
rest are optimizations. It's conceivable that someone who is
backporting this to an older release might want to just backport the
core bits of the change. I can re-submit it as one patch if that's
more appropriate.

Everything in these patches was bootstrapped and regression tested on
x86_64.

Ok for trunk?

- Michael

Comments

Segher Boessenkool Nov. 28, 2018, 10:49 p.m. | #1
Hi!

On Wed, Nov 28, 2018 at 04:09:14PM -0500, Michael Ploujnikov wrote:
> I've also included a small change to rs6000 which I'm pretty sure is

> safe, but I have no way of testing.


Do you have an account on the GCC Compile Farm?
https://cfarm.tetaneutral.net/
There are some nice Power machines in there!

Does this patch dependent on the rest of the series?

If it tests okay, it is okay for trunk of course.  Thanks!

One comment about your changelog:

> 2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

> 

> 	* config/rs6000/rs6000.c (make_resolver_func): Generate

> 	  resolver symbol with clone_function_name instead of

> 	  clone_function_name_numbered.


Those last two lines should not start with the spaces.  It should be:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* config/rs6000/rs6000.c (make_resolver_func): Generate
	resolver symbol with clone_function_name instead of
	clone_function_name_numbered.


Segher
Michael Ploujnikov Nov. 29, 2018, 2:13 p.m. | #2
On 2018-11-28 5:49 p.m., Segher Boessenkool wrote:
> Hi!

> 

> On Wed, Nov 28, 2018 at 04:09:14PM -0500, Michael Ploujnikov wrote:

>> I've also included a small change to rs6000 which I'm pretty sure is

>> safe, but I have no way of testing.

> 

> Do you have an account on the GCC Compile Farm?

> https://cfarm.tetaneutral.net/

> There are some nice Power machines in there!

> 

> Does this patch dependent on the rest of the series?

> 

> If it tests okay, it is okay for trunk of course.  Thanks!

> 

> One comment about your changelog:

> 

>> 2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

>>

>> 	* config/rs6000/rs6000.c (make_resolver_func): Generate

>> 	  resolver symbol with clone_function_name instead of

>> 	  clone_function_name_numbered.

> 

> Those last two lines should not start with the spaces.  It should be:

> 

> 2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

> 

> 	* config/rs6000/rs6000.c (make_resolver_func): Generate

> 	resolver symbol with clone_function_name instead of

> 	clone_function_name_numbered.

> 

> 

> Segher

> 


Thanks for the review and suggestion to use the cfarm! I've
successfully regtested this on power9 and committed the stand-alone
change to rs6000.c after fixing the changelog.

- Michael
Richard Biener Nov. 30, 2018, 7:25 a.m. | #3
On Wed, Nov 28, 2018 at 10:09 PM Michael Ploujnikov
<michael.ploujnikov@oracle.com> wrote:
>

> Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html

>

> It took longer than expected, but I've finally rebased this on top of

> the new clone_function_name* API and included the requested

> optimizations.

>

> There are a few remaining spots where we could probably apply similar

> optimizations:

>

> - gcc/multiple_target.c:create_target_clone

> - gcc/multiple_target.c:create_dispatcher_calls

> - gcc/omp-expand.c:grid_expand_target_grid_body

>

> But I've yet to figure out how these work in detail and with the new

> API these shouldn't block the main change from being merged.

>

> I've also included a small change to rs6000 which I'm pretty sure is

> safe, but I have no way of testing.

>

> I'm not sure what's the consensus on what's more appropriate, but I

> thought that it would be a good idea to keep these changes as separate

> patches since only the first one really changes behaviour, while the

> rest are optimizations. It's conceivable that someone who is

> backporting this to an older release might want to just backport the

> core bits of the change. I can re-submit it as one patch if that's

> more appropriate.

>

> Everything in these patches was bootstrapped and regression tested on

> x86_64.

>

> Ok for trunk?


+  unsigned &clone_number = lto_clone_numbers->get_or_insert (
+     IDENTIFIER_POINTER (DECL_NAME (decl)));
   name = maybe_rewrite_identifier (name);
   symtab->change_decl_assembler_name (decl,
-      clone_function_name_numbered (
-  name, "lto_priv"));
+      clone_function_name (
+  name, "lto_priv", clone_number));

since we use 'name' from maybe_rewrite_identifier in the end wouldn't it
make more sense to use that as a key for lto_clone_numbers?

For the ipa-hsa.c changes since we iterate over all defined functions
we at most create a single clone per cgraph_node.  That means your
hash-map keyed on cgraph_node is useless and you could use
an unnumbered clone (or a static zero number).

-  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
-   suffix));
+  SET_DECL_ASSEMBLER_NAME (new_decl,
+   clone_function_name (
+     IDENTIFIER_POINTER (
+       DECL_ASSEMBLER_NAME (old_decl)),
+     suffix,
+     num_suffix));

can you please hide the implementation detail of accessing the assembler name
by adding an overload to clone_function_name_numbered with an explicit
number?

OK with those changes.

Thanks,
Richard.



>

> - Michael
Michael Ploujnikov Nov. 30, 2018, 9:10 p.m. | #4
Hi,

On 2018-11-30 2:25 a.m., Richard Biener wrote:
> +  unsigned &clone_number = lto_clone_numbers->get_or_insert (

> +     IDENTIFIER_POINTER (DECL_NAME (decl)));

>    name = maybe_rewrite_identifier (name);

>    symtab->change_decl_assembler_name (decl,

> -      clone_function_name_numbered (

> -  name, "lto_priv"));

> +      clone_function_name (

> +  name, "lto_priv", clone_number));

> 

> since we use 'name' from maybe_rewrite_identifier in the end wouldn't it

> make more sense to use that as a key for lto_clone_numbers?


Yup, that looks much better. Fixed it.

> For the ipa-hsa.c changes since we iterate over all defined functions

> we at most create a single clone per cgraph_node.  That means your

> hash-map keyed on cgraph_node is useless and you could use

> an unnumbered clone (or a static zero number).


Good catch. I was thinking of doing this, but it somehow fell through the cracks during the rebase.

> 

> -  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,

> -   suffix));

> +  SET_DECL_ASSEMBLER_NAME (new_decl,

> +   clone_function_name (

> +     IDENTIFIER_POINTER (

> +       DECL_ASSEMBLER_NAME (old_decl)),

> +     suffix,

> +     num_suffix));

> 

> can you please hide the implementation detail of accessing the assembler name

> by adding an overload to clone_function_name_numbered with an explicit

> number?


Done.


> 

> OK with those changes.

> 

> Thanks,

> Richard.


Thank you for the review. I will commit as soon as my last test run finishes.

- Michael
From ac1f1579d37804c97833d460ec6cd5b87d6184c7 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>

Date: Thu, 1 Nov 2018 12:57:30 -0400
Subject: [PATCH 1/3] Make function assembly more independent.

This is achieved by having clone_function_name assign unique clone
numbers for each function independently.

gcc:

2018-11-30  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids;
	hash map.
	(clone_function_name_numbered): Use clone_fn_ids.

gcc/testsuite:

2018-11-30  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* gcc.dg/independent-cloneids-1.c: New test.
---
 gcc/cgraphclones.c                            | 10 ++++-
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++++++++++++++++++
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index e17959c0ca..fdd84b60d3 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map<const char *, unsigned> *clone_fn_ids;
 
 /* Return a new assembler name for a clone of decl named NAME.  Apart
    from the string SUFFIX, the new name will end with a unique (for
@@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num;
 tree
 clone_function_name_numbered (const char *name, const char *suffix)
 {
-  return clone_function_name (name, suffix, clone_fn_id_num++);
+  /* Initialize the function->counter mapping the first time it's
+     needed.  */
+  if (!clone_fn_ids)
+    clone_fn_ids = hash_map<const char *, unsigned int>::create_ggc (64);
+  unsigned int &suffix_counter = clone_fn_ids->get_or_insert (
+				   IDENTIFIER_POINTER (get_identifier (name)));
+  return clone_function_name (name, suffix, suffix_counter++);
 }
 
 /* Return a new assembler name for a clone of DECL.  Apart from string
diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c
new file mode 100644
index 0000000000..3203895492
--- /dev/null
+++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone"  } */
+
+extern int printf (const char *, ...);
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+int
+baz (int arg)
+{
+  printf("%d\n", bar (3));
+  printf("%d\n", bar (4));
+  printf("%d\n", foo (5));
+  printf("%d\n", foo (6));
+  /* adding or removing the following call should not affect foo
+     function's clone numbering */
+  printf("%d\n", bar (7));
+  return foo (8);
+}
+
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]3:} } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]4:} } } */
-- 
2.19.1
From 9656fc66b05c4ee9efd1b4a0533d564a35a85bc5 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>

Date: Mon, 17 Sep 2018 16:02:27 -0400
Subject: [PATCH 2/3] Minimize clone counter memory usage in
 create_virtual_clone.

Based on Martin Jambour's suggestion:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00111.html

gcc:

2018-11-30  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* cgraph.h (clone_function_name): Add a variant that takes a
	tree decl.
	* cgraph.h (cgraph_node::create_virtual_clone): Add a new
	argument: num_suffix.
	* cgraphclones.c (cgraph_node::create_virtual_clone): Pass
	num_suffix to clone_function_name.
	(clone_function_name): Add a variant that takes a tree decl.
	* ipa-cp.c (create_specialized_node): Keep track of clone
	counters in clone_num_suffixes hash map.
	(ipcp_driver): Free the counter hash map.
	* ipa-hsa.c (process_hsa_functions): Creates at most one hsa
	clone per function.
---
 gcc/cgraph.h       | 10 +++++++---
 gcc/cgraphclones.c | 25 ++++++++++++++++++++-----
 gcc/ipa-cp.c       | 10 +++++++++-
 gcc/ipa-hsa.c      |  4 ++--
 4 files changed, 38 insertions(+), 11 deletions(-)

diff --git gcc/cgraph.h gcc/cgraph.h
index c13d79850f..689bb828ca 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -968,11 +968,13 @@ public:
 			     cgraph_node *new_inlined_to,
 			     bitmap args_to_skip, const char *suffix = NULL);
 
-  /* Create callgraph node clone with new declaration.  The actual body will
-     be copied later at compilation stage.  */
+  /* Create callgraph node clone with new declaration.  The actual body will be
+     copied later at compilation stage.  The name of the new clone will be
+     constructed from the name of the original node, SUFFIX and NUM_SUFFIX.  */
   cgraph_node *create_virtual_clone (vec<cgraph_edge *> redirect_callers,
 				     vec<ipa_replace_map *, va_gc> *tree_map,
-				     bitmap args_to_skip, const char * suffix);
+				     bitmap args_to_skip, const char * suffix,
+				     unsigned num_suffix);
 
   /* cgraph node being removed from symbol table; see if its entry can be
    replaced by other inline clone.  */
@@ -2386,6 +2388,8 @@ tree clone_function_name_numbered (const char *name, const char *suffix);
 tree clone_function_name_numbered (tree decl, const char *suffix);
 tree clone_function_name (const char *name, const char *suffix,
 			  unsigned long number);
+tree clone_function_name (tree decl, const char *suffix,
+			  unsigned long number);
 tree clone_function_name (tree decl, const char *suffix);
 
 void tree_function_versioning (tree, tree, vec<ipa_replace_map *, va_gc> *,
diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index fdd84b60d3..2b7598e29e 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -568,6 +568,19 @@ clone_function_name (const char *name, const char *suffix,
   return get_identifier (tmp_name);
 }
 
+/* Return a new assembler name for a clone of DECL.  Apart from the
+   string SUFFIX, the new name will end with the specified NUMBER.  If
+   clone numbering is not needed then the two argument
+   clone_function_name should be used instead.  */
+
+tree
+clone_function_name (tree decl, const char *suffix,
+		     unsigned long number)
+{
+  return clone_function_name (
+	   IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)), suffix, number);
+}
+
 /* Return a new assembler name ending with the string SUFFIX for a
    clone of DECL.  */
 
@@ -595,8 +608,9 @@ clone_function_name (tree decl, const char *suffix)
 }
 
 
-/* Create callgraph node clone with new declaration.  The actual body will
-   be copied later at compilation stage.
+/* Create callgraph node clone with new declaration.  The actual body will be
+   copied later at compilation stage.  The name of the new clone will be
+   constructed from the name of the original node, SUFFIX and NUM_SUFFIX.
 
    TODO: after merging in ipa-sra use function call notes instead of args_to_skip
    bitmap interface.
@@ -604,7 +618,8 @@ clone_function_name (tree decl, const char *suffix)
 cgraph_node *
 cgraph_node::create_virtual_clone (vec<cgraph_edge *> redirect_callers,
 				   vec<ipa_replace_map *, va_gc> *tree_map,
-				   bitmap args_to_skip, const char * suffix)
+				   bitmap args_to_skip, const char * suffix,
+				   unsigned num_suffix)
 {
   tree old_decl = decl;
   cgraph_node *new_node = NULL;
@@ -639,8 +654,8 @@ cgraph_node::create_virtual_clone (vec<cgraph_edge *> redirect_callers,
   strcpy (name + len + 1, suffix);
   name[len] = '.';
   DECL_NAME (new_decl) = get_identifier (name);
-  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
-								   suffix));
+  SET_DECL_ASSEMBLER_NAME (new_decl,
+			   clone_function_name (old_decl, suffix, num_suffix));
   SET_DECL_RTL (new_decl, NULL);
 
   new_node = create_clone (new_decl, count, false,
diff --git gcc/ipa-cp.c gcc/ipa-cp.c
index 4471bae11c..e0cd1bc45b 100644
--- gcc/ipa-cp.c
+++ gcc/ipa-cp.c
@@ -376,6 +376,9 @@ static profile_count max_count;
 
 static long overall_size, max_new_size;
 
+/* Node to unique clone suffix number map.  */
+static hash_map<cgraph_node *, unsigned> *clone_num_suffixes;
+
 /* Return the param lattices structure corresponding to the Ith formal
    parameter of the function described by INFO.  */
 static inline struct ipcp_param_lattices *
@@ -3828,8 +3831,11 @@ create_specialized_node (struct cgraph_node *node,
 	}
     }
 
+  unsigned &suffix_counter = clone_num_suffixes->get_or_insert (node);
   new_node = node->create_virtual_clone (callers, replace_trees,
-					 args_to_skip, "constprop");
+					 args_to_skip, "constprop",
+					 suffix_counter);
+  suffix_counter++;
 
   bool have_self_recursive_calls = !self_recursive_calls.is_empty ();
   for (unsigned j = 0; j < self_recursive_calls.length (); j++)
@@ -5043,6 +5049,7 @@ ipcp_driver (void)
 
   ipa_check_create_node_params ();
   ipa_check_create_edge_args ();
+  clone_num_suffixes = new hash_map<cgraph_node *, unsigned>;
 
   if (dump_file)
     {
@@ -5064,6 +5071,7 @@ ipcp_driver (void)
   ipcp_store_vr_results ();
 
   /* Free all IPCP structures.  */
+  delete clone_num_suffixes;
   free_toporder_info (&topo);
   delete edge_clone_summaries;
   edge_clone_summaries = NULL;
diff --git gcc/ipa-hsa.c gcc/ipa-hsa.c
index 7c6ceaab8f..63b41eabc9 100644
--- gcc/ipa-hsa.c
+++ gcc/ipa-hsa.c
@@ -88,7 +88,7 @@ process_hsa_functions (void)
 	    continue;
 	  cgraph_node *clone
 	    = node->create_virtual_clone (vec <cgraph_edge *> (),
-					  NULL, NULL, "hsa");
+					  NULL, NULL, "hsa", 0);
 	  TREE_PUBLIC (clone->decl) = TREE_PUBLIC (node->decl);
 	  clone->externally_visible = node->externally_visible;
 
@@ -109,7 +109,7 @@ process_hsa_functions (void)
 	    continue;
 	  cgraph_node *clone
 	    = node->create_virtual_clone (vec <cgraph_edge *> (),
-					  NULL, NULL, "hsa");
+					  NULL, NULL, "hsa", 0);
 	  TREE_PUBLIC (clone->decl) = TREE_PUBLIC (node->decl);
 	  clone->externally_visible = node->externally_visible;
 
-- 
2.19.1
From 15283936692d1f0a8ff72c3e3211bb72eb5d11f8 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>

Date: Mon, 26 Nov 2018 14:22:39 -0500
Subject: [PATCH 3/3] Minimize clone counter memory usage in LTO.

gcc/lto:

2018-11-30  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* lto-partition.c (privatize_symbol_name_1): Keep track of
	non-unique symbol counters in the lto_clone_numbers hash
	map.
	(lto_promote_cross_file_statics): Allocate and free the
	lto_clone_numbers hash map.
	(lto_promote_statics_nonwpa): Free the lto_clone_numbers hash
	map.
---
 gcc/lto/lto-partition.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git gcc/lto/lto-partition.c gcc/lto/lto-partition.c
index 24e7c23859..867f075b58 100644
--- gcc/lto/lto-partition.c
+++ gcc/lto/lto-partition.c
@@ -951,6 +951,9 @@ validize_symbol_for_target (symtab_node *node)
     }
 }
 
+/* Maps symbol names to unique lto clone counters.  */
+static hash_map<const char *, unsigned> *lto_clone_numbers;
+
 /* Helper for privatize_symbol_name.  Mangle NODE symbol name
    represented by DECL.  */
 
@@ -963,9 +966,11 @@ privatize_symbol_name_1 (symtab_node *node, tree decl)
     return false;
 
   name = maybe_rewrite_identifier (name);
+  unsigned &clone_number = lto_clone_numbers->get_or_insert (name);
   symtab->change_decl_assembler_name (decl,
-				      clone_function_name_numbered (
-					  name, "lto_priv"));
+				      clone_function_name (
+					  name, "lto_priv", clone_number));
+  clone_number++;
 
   if (node->lto_file_data)
     lto_record_renamed_decl (node->lto_file_data, name,
@@ -1157,6 +1162,8 @@ lto_promote_cross_file_statics (void)
       part->encoder = compute_ltrans_boundary (part->encoder);
     }
 
+  lto_clone_numbers = new hash_map<const char *, unsigned>;
+
   /* Look at boundaries and promote symbols as needed.  */
   for (i = 0; i < n_sets; i++)
     {
@@ -1187,6 +1194,7 @@ lto_promote_cross_file_statics (void)
           promote_symbol (node);
         }
     }
+  delete lto_clone_numbers;
 }
 
 /* Rename statics in the whole unit in the case that 
@@ -1196,9 +1204,12 @@ void
 lto_promote_statics_nonwpa (void)
 {
   symtab_node *node;
+
+  lto_clone_numbers = new hash_map<const char *, unsigned>;
   FOR_EACH_SYMBOL (node)
     {
       rename_statics (NULL, node);
       validize_symbol_for_target (node);
     }
+  delete lto_clone_numbers;
 }
-- 
2.19.1
H.J. Lu Dec. 1, 2018, 4:29 p.m. | #5
On Fri, Nov 30, 2018 at 1:11 PM Michael Ploujnikov
<michael.ploujnikov@oracle.com> wrote:
>

> Hi,

>

> On 2018-11-30 2:25 a.m., Richard Biener wrote:

> > +  unsigned &clone_number = lto_clone_numbers->get_or_insert (

> > +     IDENTIFIER_POINTER (DECL_NAME (decl)));

> >    name = maybe_rewrite_identifier (name);

> >    symtab->change_decl_assembler_name (decl,

> > -      clone_function_name_numbered (

> > -  name, "lto_priv"));

> > +      clone_function_name (

> > +  name, "lto_priv", clone_number));

> >

> > since we use 'name' from maybe_rewrite_identifier in the end wouldn't it

> > make more sense to use that as a key for lto_clone_numbers?

>

> Yup, that looks much better. Fixed it.

>

> > For the ipa-hsa.c changes since we iterate over all defined functions

> > we at most create a single clone per cgraph_node.  That means your

> > hash-map keyed on cgraph_node is useless and you could use

> > an unnumbered clone (or a static zero number).

>

> Good catch. I was thinking of doing this, but it somehow fell through the cracks during the rebase.

>

> >

> > -  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,

> > -   suffix));

> > +  SET_DECL_ASSEMBLER_NAME (new_decl,

> > +   clone_function_name (

> > +     IDENTIFIER_POINTER (

> > +       DECL_ASSEMBLER_NAME (old_decl)),

> > +     suffix,

> > +     num_suffix));

> >

> > can you please hide the implementation detail of accessing the assembler name

> > by adding an overload to clone_function_name_numbered with an explicit

> > number?

>

> Done.

>

>

> >

> > OK with those changes.

> >

> > Thanks,

> > Richard.

>

> Thank you for the review. I will commit as soon as my last test run finishes.

>


This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297

-- 
H.J.
Michael Ploujnikov Dec. 3, 2018, 5 p.m. | #6
On 2018-12-01 11:29 a.m., H.J. Lu wrote:
> This caused:

> 

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297

> 


Sorry about that. Looks like I should have been testing with
--with-build-config=bootstrap-lto rather than just --enable-bootstrap.

The quick fix would be to undo the patch to create_virtual_clone or to
just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME
(node->decl) instead of node pointers. Any preferences?

The harder fix would be to figure out why some nodes share the same
names and fix that, but maybe that's just inevitable with LTO?

- Michael
Michael Ploujnikov Dec. 4, 2018, 3:06 a.m. | #7
On 2018-12-03 12:00 p.m., Michael Ploujnikov wrote:
> On 2018-12-01 11:29 a.m., H.J. Lu wrote:

>> This caused:

>>

>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297

>>

> 

> Sorry about that. Looks like I should have been testing with

> --with-build-config=bootstrap-lto rather than just --enable-bootstrap.

> 

> The quick fix would be to undo the patch to create_virtual_clone or to

> just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME

> (node->decl) instead of node pointers. Any preferences?

> 

> The harder fix would be to figure out why some nodes share the same

> names and fix that, but maybe that's just inevitable with LTO?

> 

> - Michael

> 


Here's a quick fix while the issue is being investigated.

Bootstrapped (--with-build-config=bootstrap-lto) and regtested on x86_64.

Ok for trunk?
From f5e2500f30ad337e85e0b53eaa15c724657966a2 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>

Date: Mon, 3 Dec 2018 18:19:18 -0500
Subject: [PATCH] PR ipa/88297

gcc/ChangeLog:

2018-12-03  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	PR ipa/88297
	* ipa-cp.c (create_specialized_node): Store clone counters by
	node assembler names.
	(ipcp_driver): Change type of clone_num_suffixes key type to
	const char*.
---
 gcc/ipa-cp.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git gcc/ipa-cp.c gcc/ipa-cp.c
index e0cd1bc45b..f82473e37c 100644
--- gcc/ipa-cp.c
+++ gcc/ipa-cp.c
@@ -376,8 +376,8 @@ static profile_count max_count;
 
 static long overall_size, max_new_size;
 
-/* Node to unique clone suffix number map.  */
-static hash_map<cgraph_node *, unsigned> *clone_num_suffixes;
+/* Node name to unique clone suffix number map.  */
+static hash_map<const char *, unsigned> *clone_num_suffixes;
 
 /* Return the param lattices structure corresponding to the Ith formal
    parameter of the function described by INFO.  */
@@ -3831,7 +3831,9 @@ create_specialized_node (struct cgraph_node *node,
 	}
     }
 
-  unsigned &suffix_counter = clone_num_suffixes->get_or_insert (node);
+  unsigned &suffix_counter = clone_num_suffixes->get_or_insert (
+			       IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (
+				 node->decl)));
   new_node = node->create_virtual_clone (callers, replace_trees,
 					 args_to_skip, "constprop",
 					 suffix_counter);
@@ -5049,7 +5051,7 @@ ipcp_driver (void)
 
   ipa_check_create_node_params ();
   ipa_check_create_edge_args ();
-  clone_num_suffixes = new hash_map<cgraph_node *, unsigned>;
+  clone_num_suffixes = new hash_map<const char *, unsigned>;
 
   if (dump_file)
     {
-- 
2.19.1
Jan Hubicka Dec. 4, 2018, 2:06 p.m. | #8
> On 2018-12-03 12:00 p.m., Michael Ploujnikov wrote:

> > On 2018-12-01 11:29 a.m., H.J. Lu wrote:

> >> This caused:

> >>

> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297

> >>

> > 

> > Sorry about that. Looks like I should have been testing with

> > --with-build-config=bootstrap-lto rather than just --enable-bootstrap.

> > 

> > The quick fix would be to undo the patch to create_virtual_clone or to

> > just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME

> > (node->decl) instead of node pointers. Any preferences?

> > 

> > The harder fix would be to figure out why some nodes share the same

> > names and fix that, but maybe that's just inevitable with LTO?

> > 

> > - Michael

> > 

> 

> Here's a quick fix while the issue is being investigated.

> 

> Bootstrapped (--with-build-config=bootstrap-lto) and regtested on x86_64.

> 

> Ok for trunk?


> From f5e2500f30ad337e85e0b53eaa15c724657966a2 Mon Sep 17 00:00:00 2001

> From: Michael Ploujnikov <michael.ploujnikov@oracle.com>

> Date: Mon, 3 Dec 2018 18:19:18 -0500

> Subject: [PATCH] PR ipa/88297

> 

> gcc/ChangeLog:

> 

> 2018-12-03  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

> 

> 	PR ipa/88297

> 	* ipa-cp.c (create_specialized_node): Store clone counters by

> 	node assembler names.

> 	(ipcp_driver): Change type of clone_num_suffixes key type to

> 	const char*.


OK,
thanks!
Honza

Patch

From a822759e422f0cea11640ec3d6d6fba94bcd4ee9 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov <michael.ploujnikov@oracle.com>
Date: Wed, 28 Nov 2018 09:24:30 -0500
Subject: [PATCH 4/4] There can be at most one .resolver clone per function

gcc:

2018-11-28  Michael Ploujnikov  <michael.ploujnikov@oracle.com>

	* config/rs6000/rs6000.c (make_resolver_func): Generate
	  resolver symbol with clone_function_name instead of
	  clone_function_name_numbered.
---
 gcc/config/rs6000/rs6000.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git gcc/config/rs6000/rs6000.c gcc/config/rs6000/rs6000.c
index 4f113cb025..a9d038829b 100644
--- gcc/config/rs6000/rs6000.c
+++ gcc/config/rs6000/rs6000.c
@@ -36997,7 +36997,7 @@  make_resolver_func (const tree default_decl,
 {
   /* Make the resolver function static.  The resolver function returns
      void *.  */
-  tree decl_name = clone_function_name_numbered (default_decl, "resolver");
+  tree decl_name = clone_function_name (default_decl, "resolver");
   const char *resolver_name = IDENTIFIER_POINTER (decl_name);
   tree type = build_function_type_list (ptr_type_node, NULL_TREE);
   tree decl = build_fn_decl (resolver_name, type);
-- 
2.19.1