Fix PR83452

Message ID alpine.LSU.2.20.1712191530500.12252@zhemvz.fhfr.qr
State New
Headers show
Series
  • Fix PR83452
Related show

Commit Message

Richard Biener Dec. 19, 2017, 2:37 p.m.
The following enables a hpux specific workaround for the WEAK UNDEF
symbols used by LTO debuginfo extraction.  It makes sure to use
a special name (gnu_lto_v1) for those.

It also adjusts removed local symbols to not use UNDEFs (similar
to solaris ld for globals hpux doesn't like undefs it cannot resolve even
if they are unused).  Instead it uses DEFS with NULL name.

I also fixed some readelf complaints about stale sh_info/link in
removed sections.  (maybe that fixes the solaris complaints? ... just
having some christmas wishes ;))

LTO Bootstrapped and tested on x86_64-unknown-linux-gnu.

I've tested GNU ld and gold for a simple testcase exercising multiple
WEAK UNDEFs with the same name and local defs, they seem happy
(but they also retain all those local defs now in the partial link ... :/)

I'll probably see to implement "real" section removal in stage3 to
fix the Solaris complaints but I hope to not need reloc section
rewriting (aka really pruning stuff from the symtab).

If I hear back from John / Rainer about this patch (or Alan) I see
to commit this during the holidays when it is convenient.

Richard.

2017-12-19  Richard Biener  <rguenther@suse.de>

	PR lto/83452
	* simple-object-elf.c (simple_object_elf_copy_lto_debug_section):
	Do not use UNDEF locals for removed symbols but instead just
	define them in the first prevailing section and with no name.
	Use the same gnu_lto_v1 name for all removed globals we promote to
	WEAK UNDEFs so hpux can use a stub to provide this symbol.  Clear
	sh_info and sh_link in removed sections.

Comments

Alan Modra Dec. 20, 2017, 5:20 a.m. | #1
On Tue, Dec 19, 2017 at 03:37:10PM +0100, Richard Biener wrote:
> +	      while ((gnu_lto = memchr (gnu_lto, 'g',

> +					strings + strsz - gnu_lto)))

> +		if (strncmp (gnu_lto, "gnu_lto_v1",

> +			     strings + strsz - gnu_lto) == 0)


Could be strcmp, I believe.  Entries in .strtab are NUL terminated.
The first byte of .strtab is always NUL too.

A thought occurred to me when looking at the patch.  If gcc emitted
"____gnu_lto_v1" as well as "__gnu_lto_v1" then this code could look
for the former symbol and use it as a replacement.  That would avoid
the user namespace symbol we're emitting here.  With .strtab string
tail merging the cost would be two extra bytes in .strtab and one
extra symbol, 20 or 24 bytes in .symtab, per lto object file.
(A target that adds leading underscores to symbols would complicate
things a little for the above code, and is why it is necessary to add
two extra underscores to the replacement symbol.  Currently gcc looks
for "__gnu_lto_v1" and "___gnu_lto_v1", while ld, gold, ar, nm look
for "__gnu_lto_slim" and "___gnu_lto_slim"..  So another possibility
would be for gcc to switch to using "__gnu__LTO_v1", with the
replacement being "__LTO_v1".  Or emit an entirely new symbol just to
use as a replacement.)

-- 
Alan Modra
Australia Development Lab, IBM
Rainer Orth Dec. 20, 2017, 11 a.m. | #2
Hi Richard,

> The following enables a hpux specific workaround for the WEAK UNDEF

> symbols used by LTO debuginfo extraction.  It makes sure to use

> a special name (gnu_lto_v1) for those.

>

> It also adjusts removed local symbols to not use UNDEFs (similar

> to solaris ld for globals hpux doesn't like undefs it cannot resolve even

> if they are unused).  Instead it uses DEFS with NULL name.

>

> I also fixed some readelf complaints about stale sh_info/link in

> removed sections.  (maybe that fixes the solaris complaints? ... just

> having some christmas wishes ;))


unfortunately, Santa doesn't grant all our wishes ;-)

> LTO Bootstrapped and tested on x86_64-unknown-linux-gnu.

>

> I've tested GNU ld and gold for a simple testcase exercising multiple

> WEAK UNDEFs with the same name and local defs, they seem happy

> (but they also retain all those local defs now in the partial link ... :/)

>

> I'll probably see to implement "real" section removal in stage3 to

> fix the Solaris complaints but I hope to not need reloc section

> rewriting (aka really pruning stuff from the symtab).


I've tested all of unmodified mainline, trunk + current pr81968 patch,
and trunk + this one last night on i386-pc-solaris2.11 and
sparc-sun-solaris2.11.  There's no difference between results with and
without this patch, while the pr81968 (which I've been keeping in my
tree for quite some time) does help quite a bit, but several errors
remain.

	Rainer
 
-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Richard Biener Jan. 2, 2018, 8:44 a.m. | #3
On Wed, 20 Dec 2017, Alan Modra wrote:

> On Tue, Dec 19, 2017 at 03:37:10PM +0100, Richard Biener wrote:

> > +	      while ((gnu_lto = memchr (gnu_lto, 'g',

> > +					strings + strsz - gnu_lto)))

> > +		if (strncmp (gnu_lto, "gnu_lto_v1",

> > +			     strings + strsz - gnu_lto) == 0)

> 

> Could be strcmp, I believe.  Entries in .strtab are NUL terminated.

> The first byte of .strtab is always NUL too.

> 

> A thought occurred to me when looking at the patch.  If gcc emitted

> "____gnu_lto_v1" as well as "__gnu_lto_v1" then this code could look

> for the former symbol and use it as a replacement.  That would avoid

> the user namespace symbol we're emitting here.


The issue is that the symbol is hardcoded in ld IIRC to trigger
plugin autoloading / diagnostics.  So if we re-emit either then
we'll get infinite recursion.

As much as it was convenient to detect whether we have an object
with LTO bytecode by using a symbol simply keying on the LTO
section names would have been better .... (harder to do with
the non-plugin path which just uses nm to look for the LTO
signature).

I've installed the patch as-is.

>  With .strtab string

> tail merging the cost would be two extra bytes in .strtab and one

> extra symbol, 20 or 24 bytes in .symtab, per lto object file.

> (A target that adds leading underscores to symbols would complicate

> things a little for the above code, and is why it is necessary to add

> two extra underscores to the replacement symbol.  Currently gcc looks

> for "__gnu_lto_v1" and "___gnu_lto_v1", while ld, gold, ar, nm look

> for "__gnu_lto_slim" and "___gnu_lto_slim"..  So another possibility

> would be for gcc to switch to using "__gnu__LTO_v1", with the

> replacement being "__LTO_v1".  Or emit an entirely new symbol just to

> use as a replacement.)


Appending something to the string section would be indeed possible
as well.

Richard.

Patch

Index: libiberty/simple-object-elf.c
===================================================================
--- libiberty/simple-object-elf.c	(revision 255777)
+++ libiberty/simple-object-elf.c	(working copy)
@@ -1091,6 +1091,7 @@  simple_object_elf_copy_lto_debug_section
   int changed;
   int *pfnret;
   const char **pfnname;
+  unsigned first_shndx = 0;
 
   shdr_size = (ei_class == ELFCLASS32
 	       ? sizeof (Elf32_External_Shdr)
@@ -1158,6 +1159,9 @@  simple_object_elf_copy_lto_debug_section
       ret = (*pfn) (&name);
       pfnret[i - 1] = ret == 1 ? 0 : -1;
       pfnname[i - 1] = name;
+      if (first_shndx == 0
+	  && pfnret[i - 1] == 0)
+	first_shndx = i;
     }
 
   /* Mark sections as preserved that are required by to be preserved
@@ -1327,6 +1331,15 @@  simple_object_elf_copy_lto_debug_section
 					   sobj->offset + stroff,
 					   (unsigned char *)strings,
 					   strsz, &errmsg, err);
+	      /* Find gnu_lto_ in strings.  */
+	      char *gnu_lto = strings;
+	      while ((gnu_lto = memchr (gnu_lto, 'g',
+					strings + strsz - gnu_lto)))
+		if (strncmp (gnu_lto, "gnu_lto_v1",
+			     strings + strsz - gnu_lto) == 0)
+		  break;
+		else
+		  gnu_lto++;
 	      for (ent = buf; ent < buf + length; ent += entsize)
 		{
 		  unsigned st_shndx = ELF_FETCH_FIELD (type_functions, ei_class,
@@ -1366,31 +1379,27 @@  simple_object_elf_copy_lto_debug_section
 		         in case it is local.  */
 		      int bind = ELF_ST_BIND (*st_info);
 		      int other = STV_DEFAULT;
-		      size_t st_name;
-
 		      if (bind == STB_LOCAL)
-			ELF_SET_FIELD (type_functions, ei_class, Sym,
-				       ent, st_name, Elf_Word, 0);
+			{
+			  /* Make discarded local symbols unnamed and
+			     defined in the first prevailing section.  */
+			  ELF_SET_FIELD (type_functions, ei_class, Sym,
+					 ent, st_name, Elf_Word, 0);
+			  ELF_SET_FIELD (type_functions, ei_class, Sym,
+					 ent, st_shndx, Elf_Half, first_shndx);
+			}
 		      else
 			{
+			  /* Make discarded global symbols hidden weak
+			     undefined and sharing the gnu_lto_ name.  */
 			  bind = STB_WEAK;
-			  st_name = ELF_FETCH_FIELD (type_functions, ei_class,
-						     Sym, ent, st_name,
-						     Elf_Word);
-			  if (st_name < strsz)
-			    {
-			      char *p = strings + st_name;
-			      if (p[0] == '_'
-				  && p[1] == '_'
-				  && strncmp (p + (p[2] == '_'),
-					      "__gnu_lto_", 10) == 0)
-				{
-				  other = STV_HIDDEN;
-				  ELF_SET_FIELD (type_functions, ei_class, Sym,
-						 ent, st_name, Elf_Word,
-						 st_name + 2);
-				}
-			    }
+			  other = STV_HIDDEN;
+			  if (gnu_lto)
+			    ELF_SET_FIELD (type_functions, ei_class, Sym,
+					   ent, st_name, Elf_Word,
+					   gnu_lto - strings);
+			  ELF_SET_FIELD (type_functions, ei_class, Sym,
+					 ent, st_shndx, Elf_Half, SHN_UNDEF);
 			}
 		      *st_other = other;
 		      *st_info = ELF_ST_INFO (bind, STT_NOTYPE);
@@ -1398,8 +1407,6 @@  simple_object_elf_copy_lto_debug_section
 				     ent, st_value, Elf_Addr, 0);
 		      ELF_SET_FIELD (type_functions, ei_class, Sym,
 				     ent, st_size, Elf_Word, 0);
-		      ELF_SET_FIELD (type_functions, ei_class, Sym,
-				     ent, st_shndx, Elf_Half, SHN_UNDEF);
 		    }
 		}
 	      XDELETEVEC (strings);
@@ -1422,6 +1429,10 @@  simple_object_elf_copy_lto_debug_section
 	     link.  */
 	  ELF_SET_FIELD (type_functions, ei_class, Shdr,
 			 shdr, sh_type, Elf_Word, SHT_NULL);
+	  ELF_SET_FIELD (type_functions, ei_class, Shdr,
+			 shdr, sh_info, Elf_Word, 0);
+	  ELF_SET_FIELD (type_functions, ei_class, Shdr,
+			 shdr, sh_link, Elf_Word, 0);
 	}
 
       flags = ELF_FETCH_FIELD (type_functions, ei_class, Shdr,