RFC: [PATCH] ELF: Don't require section header on ELF objects

Message ID 20200308175947.GA911529@gmail.com
State Superseded

Commit Message

H.J. Lu March 8, 2020, 5:59 p.m.
Any comments?

Kaylee, do you have copyright paper with FSF?

H.J.
---
The section header isn't mandatory in an ELF executable or shared
library.  This patch adds a new linker option, -z nosectionheader, to
omit the ELF section header when building an executable or shared
library, and adds an objcopy and strip option, --remove-section-header,
to remove the ELF section header from an executable or shared library.

The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
reconstruct dynamic symbol table when section header isn't available.
For DT_HASH, the number of dynamic symbol table entries equals the
number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
with non-STB_LOCAL bindings are in hash table.  Since in dynamic symbol
table, all symbols with STB_LOCAL binding are placed before symbols with
other bindings and all defined symbols are placed before undefined ones,
the highest symbol index in DT_GNU_HASH/DT_MIPS_XHASH is the highest
dynamic symbol table index.
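
As a rough illustration of the counting rule described above (a sketch,
not the code in this patch): the function names and the bloom_word32s
parameter are made up for this example, the hash tables are assumed to
be already loaded in memory as host-endian 32-bit words, and no bounds
checking is done.  For DT_GNU_HASH the walk only finds the highest
hashed symbol index, so any unhashed symbol placed after it would be
missed.

```c
#include <stddef.h>
#include <stdint.h>

/* DT_HASH layout: word 0 is nbucket, word 1 is nchain.  nchain equals
   the number of dynamic symbol table entries.  */
static uint32_t
dynsym_count_from_sysv_hash (const uint32_t *hash)
{
  return hash[1];
}

/* DT_GNU_HASH layout: nbuckets, symoffset, bloom_size, bloom_shift,
   then the bloom filter, the buckets and the chains.  BLOOM_WORD32S
   (a name invented for this sketch) is the size of one bloom word in
   32-bit units: 1 for ELFCLASS32, 2 for ELFCLASS64.  */
static uint32_t
dynsym_count_from_gnu_hash (const uint32_t *hash, size_t bloom_word32s)
{
  uint32_t nbuckets = hash[0];
  uint32_t symoffset = hash[1];
  uint32_t bloom_size = hash[2];
  const uint32_t *buckets = hash + 4 + (size_t) bloom_size * bloom_word32s;
  const uint32_t *chains = buckets + nbuckets;
  uint32_t last = 0;

  /* Each non-empty bucket holds the lowest symbol index of its chain;
     the highest hashed symbol lives in the chain that starts highest.  */
  for (uint32_t i = 0; i < nbuckets; i++)
    if (buckets[i] > last)
      last = buckets[i];

  /* No hashed symbols: only the symoffset unhashed entries are known.  */
  if (last < symoffset)
    return symoffset;

  /* Follow the chain until the entry whose low bit marks its end.  */
  while ((chains[last - symoffset] & 1) == 0)
    last++;

  return last + 1;
}
```

A caller would locate the table through the DT_HASH or DT_GNU_HASH
entry in PT_DYNAMIC and then use the result together with DT_SYMTAB
and DT_SYMENT to size the dynamic symbol table.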

bfd/

2020-03-XX  H.J. Lu  <hongjiu.lu@intel.com>
	    Kaylee Blake  <klkblake@gmail.com>

	PR ld/25617
	* bfd.c (BFD_NO_SECTION_HEADER): New.
	(BFD_FLAGS_SAVED): Add BFD_NO_SECTION_HEADER.
	(BFD_FLAGS_FOR_BFD_USE_MASK): Likewise.
	* elfcode.h (elf_swap_ehdr_out): Omit section header on
	non-relocatable output with BFD_NO_SECTION_HEADER.
	(elf_write_shdrs_and_ehdr): Likewise.
	* elfxx-target.h (TARGET_BIG_SYM): Add BFD_NO_SECTION_HEADER
	to object_flags.
	(TARGET_LITTLE_SYM): Likewise.
	* bfd-in2.h: Regenerated.

binutils/

2020-03-XX  H.J. Lu  <hongjiu.lu@intel.com>

	PR ld/25617
	* NEWS: Mention --remove-section-header for objcopy and strip.
	* doc/binutils.texi: Document --remove-section-header for objcopy
	and strip.
	* objcopy.c (remove_section_header): New.
	(command_line_switch): Add OPTION_REMOVE_SECTION_HEADER.
	(strip_options): Add --remove-section-header.
	(copy_options): Likewise.
	(copy_usage): Add --remove-section-header.
	(strip_usage): Likewise.
	(copy_object): Renamed to ...
	(copy_object_1): This.  Issue a warning for
	--remove-section-header on non-ELF targets.
	(copy_object): New.
	(strip_main): Handle OPTION_REMOVE_SECTION_HEADER.
	(copy_main): Likewise.

ld/

2020-03-XX  H.J. Lu  <hongjiu.lu@intel.com>
	    Kaylee Blake  <klkblake@gmail.com>

	PR ld/25617
	* NEWS: Mention -z nosectionheader.
	* emultempl/elf.em: Support -z sectionheader and
	-z nosectionheader.
	* ld.h (ld_config_type): Add no_section_header.
	* ld.texi: Document -z sectionheader and -z nosectionheader.
	* ldlang.c (ldlang_open_output): Handle
	config.no_section_header.
	* lexsup.c (parse_args): Disallow -z nosectionheader with -r.
	(elf_static_list_options): Add -z sectionheader and
	-z nosectionheader.
---
 bfd/bfd-in2.h              |  8 ++++--
 bfd/bfd.c                  |  8 ++++--
 bfd/elfcode.h              | 41 ++++++++++++++++++++++-------
 bfd/elfxx-target.h         |  6 +++--
 binutils/NEWS              |  3 +++
 binutils/doc/binutils.texi | 12 +++++++++
 binutils/objcopy.c         | 54 ++++++++++++++++++++++++++++++++++++--
 ld/NEWS                    |  3 +++
 ld/emultempl/elf.em        |  4 +++
 ld/ld.h                    |  3 +++
 ld/ld.texi                 |  6 +++++
 ld/ldlang.c                |  4 +++
 ld/lexsup.c                | 12 +++++++++
 13 files changed, 146 insertions(+), 18 deletions(-)

-- 
2.24.1

Comments

H.J. Lu March 8, 2020, 6:06 p.m. | #1
On Sun, Mar 8, 2020 at 10:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Any comments?
>
> Kaylee, do you have copyright paper with FSF?
>
> H.J.
> ---
> Section header isn't mandatory on ELF executable nor shared library.
> This patch adds a new linker option, -z nosectionheader, to omit ELF
> section header when building an executable or shared library, adds
> an objcopy and strip option, --remove-section-header, to remove ELF
> section header from an executable or shared library.
>
> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
> reconstruct dynamic symbol table when section header isn't available.
> For DT_HASH, the number of dynamic symbol table entries equals the
> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
> with non-STB_LOCAL indings are in hash table.  Since in dynamic symbol
> table, all symbols with STB_LOCAL binding are placed before symbols with
> other bindings and all defined symbols are placed before undefined ones,


It should read

---
all symbols with STB_LOCAL binding are placed
before symbols with other bindings and all undefined symbols are placed
before defined ones,
---

The complete patch set is on users/hjl/pr25617/master branch at

https://gitlab.com/x86-binutils/binutils-gdb

> the highest symbol index in DT_GNU_HASH/DT_MIPS_XHASH is the highest
> dynamic symbol table index.

-- 
H.J.
Kaylee Blake March 8, 2020, 11:24 p.m. | #2
On 9/3/20 4:29 am, H.J. Lu wrote:
> Any comments?
>
> Kaylee, do you have copyright paper with FSF?
>
> H.J.



I don't at present; is my contribution here significant enough to
require it?

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
H.J. Lu March 8, 2020, 11:29 p.m. | #3
On Sun, Mar 8, 2020 at 4:25 PM Kaylee Blake <klkblake@gmail.com> wrote:
>
> On 9/3/20 4:29 am, H.J. Lu wrote:
> > Any comments?
> >
> > Kaylee, do you have copyright paper with FSF?
> >
> > H.J.
>
> I don't at present; is my contribution here significant enough to
> require it?
>


I think so.

-- 
H.J.
Alan Modra March 8, 2020, 11:35 p.m. | #4
On Sun, Mar 08, 2020 at 11:06:33AM -0700, H.J. Lu wrote:
> On Sun, Mar 8, 2020 at 10:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > Any comments?
> >
> > Kaylee, do you have copyright paper with FSF?
> >
> > H.J.
> > ---
> > Section header isn't mandatory on ELF executable nor shared library.
> > This patch adds a new linker option, -z nosectionheader, to omit ELF
> > section header when building an executable or shared library, adds
> > an objcopy and strip option, --remove-section-header, to remove ELF
> > section header from an executable or shared library.
> >
> > The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
> > DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
> > reconstruct dynamic symbol table when section header isn't available.
> > For DT_HASH, the number of dynamic symbol table entries equals the
> > number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
> > with non-STB_LOCAL indings are in hash table.  Since in dynamic symbol
> > table, all symbols with STB_LOCAL binding are placed before symbols with
> > other bindings and all defined symbols are placed before undefined ones,
>
> It should read
>
> ---
> all symbols with STB_LOCAL binding are placed
> before symbols with other bindings and all undefined symbols are placed
> before defined ones,
> ---


That's new to me.  I don't think there is any ordering in .dynsym
among non-local symbols.

The patch looks OK.

-- 
Alan Modra
Australia Development Lab, IBM
Alan Modra March 8, 2020, 11:38 p.m. | #5
On Sun, Mar 08, 2020 at 04:29:27PM -0700, H.J. Lu wrote:
> On Sun, Mar 8, 2020 at 4:25 PM Kaylee Blake <klkblake@gmail.com> wrote:
> >
> > On 9/3/20 4:29 am, H.J. Lu wrote:
> > > Any comments?
> > >
> > > Kaylee, do you have copyright paper with FSF?
> > >
> > > H.J.
> >
> > I don't at present; is my contribution here significant enough to
> > require it?
> >
>
> I think so.


Yes, please wait for the copyright assignment to be in place before
committing.

-- 
Alan Modra
Australia Development Lab, IBM
H.J. Lu March 8, 2020, 11:45 p.m. | #6
On Sun, Mar 8, 2020 at 4:38 PM Alan Modra <amodra@gmail.com> wrote:
>
> On Sun, Mar 08, 2020 at 04:29:27PM -0700, H.J. Lu wrote:
> > On Sun, Mar 8, 2020 at 4:25 PM Kaylee Blake <klkblake@gmail.com> wrote:
> > >
> > > On 9/3/20 4:29 am, H.J. Lu wrote:
> > > > Any comments?
> > > >
> > > > Kaylee, do you have copyright paper with FSF?
> > > >
> > > > H.J.
> > >
> > > I don't at present; is my contribution here significant enough to
> > > require it?
> > >
> >
> > I think so.
>
> Yes, please wait for the copyright assignment to be in place before
> committing.


I will.   I submitted the whole patch set.

Thanks.

-- 
H.J.
H.J. Lu March 8, 2020, 11:46 p.m. | #7
On Sun, Mar 8, 2020 at 4:35 PM Alan Modra <amodra@gmail.com> wrote:
>
> On Sun, Mar 08, 2020 at 11:06:33AM -0700, H.J. Lu wrote:
> > On Sun, Mar 8, 2020 at 10:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > Any comments?
> > >
> > > Kaylee, do you have copyright paper with FSF?
> > >
> > > H.J.
> > > ---
> > > Section header isn't mandatory on ELF executable nor shared library.
> > > This patch adds a new linker option, -z nosectionheader, to omit ELF
> > > section header when building an executable or shared library, adds
> > > an objcopy and strip option, --remove-section-header, to remove ELF
> > > section header from an executable or shared library.
> > >
> > > The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
> > > DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
> > > reconstruct dynamic symbol table when section header isn't available.
> > > For DT_HASH, the number of dynamic symbol table entries equals the
> > > number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
> > > with non-STB_LOCAL indings are in hash table.  Since in dynamic symbol
> > > table, all symbols with STB_LOCAL binding are placed before symbols with
> > > other bindings and all defined symbols are placed before undefined ones,
> >
> > It should read
> >
> > ---
> > all symbols with STB_LOCAL binding are placed
> > before symbols with other bindings and all undefined symbols are placed
> > before defined ones,
> > ---
>
> That's new to me.  I don't think there is any ordering in .dynsym
> among non-local symbols.


I will get clarification from gABI group.

> The patch looks OK.
>


-- 
H.J.
H.J. Lu March 9, 2020, 12:02 a.m. | #8
On Sun, Mar 8, 2020 at 4:46 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Sun, Mar 8, 2020 at 4:35 PM Alan Modra <amodra@gmail.com> wrote:
> >
> > On Sun, Mar 08, 2020 at 11:06:33AM -0700, H.J. Lu wrote:
> > > On Sun, Mar 8, 2020 at 10:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > Any comments?
> > > >
> > > > Kaylee, do you have copyright paper with FSF?
> > > >
> > > > H.J.
> > > > ---
> > > > Section header isn't mandatory on ELF executable nor shared library.
> > > > This patch adds a new linker option, -z nosectionheader, to omit ELF
> > > > section header when building an executable or shared library, adds
> > > > an objcopy and strip option, --remove-section-header, to remove ELF
> > > > section header from an executable or shared library.
> > > >
> > > > The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
> > > > DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
> > > > reconstruct dynamic symbol table when section header isn't available.
> > > > For DT_HASH, the number of dynamic symbol table entries equals the
> > > > number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
> > > > with non-STB_LOCAL indings are in hash table.  Since in dynamic symbol
> > > > table, all symbols with STB_LOCAL binding are placed before symbols with
> > > > other bindings and all defined symbols are placed before undefined ones,
> > >
> > > It should read
> > >
> > > ---
> > > all symbols with STB_LOCAL binding are placed
> > > before symbols with other bindings and all undefined symbols are placed
> > > before defined ones,
> > > ---
> >
> > That's new to me.  I don't think there is any ordering in .dynsym
> > among non-local symbols.
>
> I will get clarification from gABI group.


FYI,

https://groups.google.com/forum/#!topic/generic-abi/oDQ3Z3IDYuU

-- 
H.J.
Kaylee Blake March 9, 2020, 12:02 a.m. | #9
On 9/3/20 10:16 am, H.J. Lu wrote:
> On Sun, Mar 8, 2020 at 4:35 PM Alan Modra <amodra@gmail.com> wrote:
>>
>> On Sun, Mar 08, 2020 at 11:06:33AM -0700, H.J. Lu wrote:
>>> On Sun, Mar 8, 2020 at 10:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>
>>>> Any comments?
>>>>
>>>> Kaylee, do you have copyright paper with FSF?
>>>>
>>>> H.J.
>>>> ---
>>>> Section header isn't mandatory on ELF executable nor shared library.
>>>> This patch adds a new linker option, -z nosectionheader, to omit ELF
>>>> section header when building an executable or shared library, adds
>>>> an objcopy and strip option, --remove-section-header, to remove ELF
>>>> section header from an executable or shared library.
>>>>
>>>> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
>>>> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
>>>> reconstruct dynamic symbol table when section header isn't available.
>>>> For DT_HASH, the number of dynamic symbol table entries equals the
>>>> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
>>>> with non-STB_LOCAL indings are in hash table.  Since in dynamic symbol
>>>> table, all symbols with STB_LOCAL binding are placed before symbols with
>>>> other bindings and all defined symbols are placed before undefined ones,
>>>
>>> It should read
>>>
>>> ---
>>> all symbols with STB_LOCAL binding are placed
>>> before symbols with other bindings and all undefined symbols are placed
>>> before defined ones,
>>> ---
>>
>> That's new to me.  I don't think there is any ordering in .dynsym
>> among non-local symbols.
>
> I will get clarification from gABI group.
>
>> The patch looks OK.
>>
>


Looks like it's required by DT_GNU_HASH, from what I could find.

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Alan Modra March 9, 2020, 12:05 a.m. | #10
On Sun, Mar 08, 2020 at 04:46:51PM -0700, H.J. Lu wrote:
> On Sun, Mar 8, 2020 at 4:35 PM Alan Modra <amodra@gmail.com> wrote:
> >
> > On Sun, Mar 08, 2020 at 11:06:33AM -0700, H.J. Lu wrote:
> > > On Sun, Mar 8, 2020 at 10:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > Any comments?
> > > >
> > > > Kaylee, do you have copyright paper with FSF?
> > > >
> > > > H.J.
> > > > ---
> > > > Section header isn't mandatory on ELF executable nor shared library.
> > > > This patch adds a new linker option, -z nosectionheader, to omit ELF
> > > > section header when building an executable or shared library, adds
> > > > an objcopy and strip option, --remove-section-header, to remove ELF
> > > > section header from an executable or shared library.
> > > >
> > > > The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
> > > > DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
> > > > reconstruct dynamic symbol table when section header isn't available.
> > > > For DT_HASH, the number of dynamic symbol table entries equals the
> > > > number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
> > > > with non-STB_LOCAL indings are in hash table.  Since in dynamic symbol
> > > > table, all symbols with STB_LOCAL binding are placed before symbols with
> > > > other bindings and all defined symbols are placed before undefined ones,
> > >
> > > It should read
> > >
> > > ---
> > > all symbols with STB_LOCAL binding are placed
> > > before symbols with other bindings and all undefined symbols are placed
> > > before defined ones,
> > > ---
> >
> > That's new to me.  I don't think there is any ordering in .dynsym
> > among non-local symbols.
>
> I will get clarification from gABI group.


Well, we certainly don't do such sorting.  For example, from a freshly
built ld/ld-new --enable-targets=all

   148: 0000000000f08380     4 OBJECT  GLOBAL DEFAULT   25 opterr@GLIBC_2.2.5 (3)
   149: 0000000000402f80     0 FUNC    GLOBAL DEFAULT  UND calloc@GLIBC_2.2.5 (3)
   150: 0000000000881536    35 FUNC    GLOBAL DEFAULT   13 _obstack_allocated_p

-- 
Alan Modra
Australia Development Lab, IBM
H.J. Lu March 9, 2020, 1:36 a.m. | #11
On Sun, Mar 8, 2020 at 5:05 PM Alan Modra <amodra@gmail.com> wrote:
>
> On Sun, Mar 08, 2020 at 04:46:51PM -0700, H.J. Lu wrote:
> > On Sun, Mar 8, 2020 at 4:35 PM Alan Modra <amodra@gmail.com> wrote:
> > >
> > > On Sun, Mar 08, 2020 at 11:06:33AM -0700, H.J. Lu wrote:
> > > > On Sun, Mar 8, 2020 at 10:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > > >
> > > > > Any comments?
> > > > >
> > > > > Kaylee, do you have copyright paper with FSF?
> > > > >
> > > > > H.J.
> > > > > ---
> > > > > Section header isn't mandatory on ELF executable nor shared library.
> > > > > This patch adds a new linker option, -z nosectionheader, to omit ELF
> > > > > section header when building an executable or shared library, adds
> > > > > an objcopy and strip option, --remove-section-header, to remove ELF
> > > > > section header from an executable or shared library.
> > > > >
> > > > > The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
> > > > > DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
> > > > > reconstruct dynamic symbol table when section header isn't available.
> > > > > For DT_HASH, the number of dynamic symbol table entries equals the
> > > > > number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
> > > > > with non-STB_LOCAL indings are in hash table.  Since in dynamic symbol
> > > > > table, all symbols with STB_LOCAL binding are placed before symbols with
> > > > > other bindings and all defined symbols are placed before undefined ones,
> > > >
> > > > It should read
> > > >
> > > > ---
> > > > all symbols with STB_LOCAL binding are placed
> > > > before symbols with other bindings and all undefined symbols are placed
> > > > before defined ones,
> > > > ---
> > >
> > > That's new to me.  I don't think there is any ordering in .dynsym
> > > among non-local symbols.
> >
> > I will get clarification from gABI group.
>
> Well we certainly don't do such sorting.  For example, from a freshly
> build ld/ld-new --enable-targets=all
>
>    148: 0000000000f08380     4 OBJECT  GLOBAL DEFAULT   25 opterr@GLIBC_2.2.5 (3)
>    149: 0000000000402f80     0 FUNC    GLOBAL DEFAULT  UND calloc@GLIBC_2.2.5 (3)
>    150: 0000000000881536    35 FUNC    GLOBAL DEFAULT   13 _obstack_allocated_p
>


I will make 2 changes:

1.  Update -z nosectionheader to guarantee that the last entry in the
    dynamic symbol table is defined.
2.  Update --remove-section-header to issue an error if the last entry
    in the dynamic symbol table is undefined.
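
A minimal sketch of what check 2 could look like, assuming the dynamic
symbol table has already been read into memory.  struct sketch_sym64
only mirrors the Elf64_Sym field layout, and every name here is
illustrative rather than taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SHN_UNDEF 0	/* Same value as SHN_UNDEF in the gABI.  */

/* Hypothetical stand-in with the Elf64_Sym field layout.  */
struct sketch_sym64
{
  uint32_t st_name;
  unsigned char st_info;
  unsigned char st_other;
  uint16_t st_shndx;
  uint64_t st_value;
  uint64_t st_size;
};

/* Return true if the last dynamic symbol is defined, i.e. if a reader
   that only finds defined symbols through the hash table would still
   see the full length of the table.  */
static bool
last_dynsym_is_defined (const struct sketch_sym64 *syms, size_t count)
{
  /* Entry 0 is the reserved null symbol, so a table with at most one
     entry has no real last symbol that could be missed.  */
  if (count <= 1)
    return true;

  return syms[count - 1].st_shndx != SKETCH_SHN_UNDEF;
}
```

With such a helper, --remove-section-header would report an error when
the function returns false instead of writing the output file.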

-- 
H.J.
Kaylee Blake March 9, 2020, 1:59 a.m. | #12
On 9/3/20 12:06 pm, H.J. Lu wrote:
> On Sun, Mar 8, 2020 at 5:05 PM Alan Modra <amodra@gmail.com> wrote:
>>
>> On Sun, Mar 08, 2020 at 04:46:51PM -0700, H.J. Lu wrote:
>>> On Sun, Mar 8, 2020 at 4:35 PM Alan Modra <amodra@gmail.com> wrote:
>>>>
>>>> On Sun, Mar 08, 2020 at 11:06:33AM -0700, H.J. Lu wrote:
>>>>> On Sun, Mar 8, 2020 at 10:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>>
>>>>>> Any comments?
>>>>>>
>>>>>> Kaylee, do you have copyright paper with FSF?
>>>>>>
>>>>>> H.J.
>>>>>> ---
>>>>>> Section header isn't mandatory on ELF executable nor shared library.
>>>>>> This patch adds a new linker option, -z nosectionheader, to omit ELF
>>>>>> section header when building an executable or shared library, adds
>>>>>> an objcopy and strip option, --remove-section-header, to remove ELF
>>>>>> section header from an executable or shared library.
>>>>>>
>>>>>> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,
>>>>>> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to
>>>>>> reconstruct dynamic symbol table when section header isn't available.
>>>>>> For DT_HASH, the number of dynamic symbol table entries equals the
>>>>>> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols
>>>>>> with non-STB_LOCAL indings are in hash table.  Since in dynamic symbol
>>>>>> table, all symbols with STB_LOCAL binding are placed before symbols with
>>>>>> other bindings and all defined symbols are placed before undefined ones,
>>>>>
>>>>> It should read
>>>>>
>>>>> ---
>>>>> all symbols with STB_LOCAL binding are placed
>>>>> before symbols with other bindings and all undefined symbols are placed
>>>>> before defined ones,
>>>>> ---
>>>>
>>>> That's new to me.  I don't think there is any ordering in .dynsym
>>>> among non-local symbols.
>>>
>>> I will get clarification from gABI group.
>>
>> Well we certainly don't do such sorting.  For example, from a freshly
>> build ld/ld-new --enable-targets=all
>>
>>    148: 0000000000f08380     4 OBJECT  GLOBAL DEFAULT   25 opterr@GLIBC_2.2.5 (3)
>>    149: 0000000000402f80     0 FUNC    GLOBAL DEFAULT  UND calloc@GLIBC_2.2.5 (3)
>>    150: 0000000000881536    35 FUNC    GLOBAL DEFAULT   13 _obstack_allocated_p
>>
>
> I will make 2 changes:
>
> 1.  Update -z nosectionheader to guarantee that the last entry in
> dynamic symbol table
> is defined.
> 2.  Update --remove-section-header to issue an error if the last entry
> in dynamic symbol
> table is undefined.
>


With some testing, it seems like ld will emit an ordered symbol table
iff it's using the DT_GNU_HASH hash table style, and my understanding is
that DT_GNU_HASH in fact requires this behaviour. So in that case, we
don't need to do an additional check, because we only need the ordering
if we are looking up through DT_GNU_HASH instead of DT_HASH.

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Alan Modra March 9, 2020, 2:23 a.m. | #13
On Mon, Mar 09, 2020 at 12:29:48PM +1030, Kaylee Blake wrote:
> On 9/3/20 12:06 pm, H.J. Lu wrote:
> > On Sun, Mar 8, 2020 at 5:05 PM Alan Modra <amodra@gmail.com> wrote:
> >> Well we certainly don't do such sorting.  For example, from a freshly
> >> build ld/ld-new --enable-targets=all
> >>
> >>    148: 0000000000f08380     4 OBJECT  GLOBAL DEFAULT   25 opterr@GLIBC_2.2.5 (3)
> >>    149: 0000000000402f80     0 FUNC    GLOBAL DEFAULT  UND calloc@GLIBC_2.2.5 (3)
> >>    150: 0000000000881536    35 FUNC    GLOBAL DEFAULT   13 _obstack_allocated_p
> >>
> >
> > I will make 2 changes:
> >
> > 1.  Update -z nosectionheader to guarantee that the last entry in
> > dynamic symbol table
> > is defined.
> > 2.  Update --remove-section-header to issue an error if the last entry
> > in dynamic symbol
> > table is undefined.
> >
>
> With some testing, it seems like ld will emit an ordered symbol table
> iff it's using the DT_GNU_HASH hash table style


It doesn't.  The snippet of .dynsym I posted was from a binary with
DT_GNU_HASH.  elflink.c:_bfd_elf_link_renumber_dynsyms should convince
you that any ordering seen is by chance.

>, and my understanding is
> that DT_GNU_HASH in fact requires this behaviour.


Apparently not.  ;-)

> So in that case, we
> don't need to do an additional check, because we only need the ordering
> if we are looking up through DT_GNU_HASH instead of DT_HASH.
>
> -- 
> Kaylee Blake <klkblake@gmail.com>
> C is the worst language, except for all the others.


-- 
Alan Modra
Australia Development Lab, IBM
H.J. Lu March 9, 2020, 2:35 a.m. | #14
On Sun, Mar 8, 2020 at 7:23 PM Alan Modra <amodra@gmail.com> wrote:
>
> On Mon, Mar 09, 2020 at 12:29:48PM +1030, Kaylee Blake wrote:
> > On 9/3/20 12:06 pm, H.J. Lu wrote:
> > > On Sun, Mar 8, 2020 at 5:05 PM Alan Modra <amodra@gmail.com> wrote:
> > >> Well we certainly don't do such sorting.  For example, from a freshly
> > >> build ld/ld-new --enable-targets=all
> > >>
> > >>    148: 0000000000f08380     4 OBJECT  GLOBAL DEFAULT   25 opterr@GLIBC_2.2.5 (3)
> > >>    149: 0000000000402f80     0 FUNC    GLOBAL DEFAULT  UND calloc@GLIBC_2.2.5 (3)
> > >>    150: 0000000000881536    35 FUNC    GLOBAL DEFAULT   13 _obstack_allocated_p
> > >>
> > >
> > > I will make 2 changes:
> > >
> > > 1.  Update -z nosectionheader to guarantee that the last entry in
> > > dynamic symbol table
> > > is defined.
> > > 2.  Update --remove-section-header to issue an error if the last entry
> > > in dynamic symbol
> > > table is undefined.
> > >
> >
> > With some testing, it seems like ld will emit an ordered symbol table
> > iff it's using the DT_GNU_HASH hash table style
>
> It doesn't.  The snippet of .dynsym I posted was from a binary with
> DT_GNU_HASH.  elflink.c:_bfd_elf_link_renumber_dynsyms should convince
> you that any ordering seen is by chance.
>
> >, and my understanding is
> > that DT_GNU_HASH in fact requires this behaviour.
>
> Apparently not.  ;-)
>
> > So in that case, we
> > don't need to do an additional check, because we only need the ordering
> > if we are looking up through DT_GNU_HASH instead of DT_HASH.
> >
> > --
> > Kaylee Blake <klkblake@gmail.com>
> > C is the worst language, except for all the others.
>


x86 backend does:

 if (!local_undefweak
      && !h->def_regular
      && (h->plt.offset != (bfd_vma) -1
          || eh->plt_got.offset != (bfd_vma) -1))
    {
      /* Mark the symbol as undefined, rather than as defined in
         the .plt section.  Leave the value if there were any
         relocations where pointer equality matters (this is a clue
         for the dynamic linker, to make function pointer
         comparisons work between an application and shared
         library), otherwise set it to zero.  If a function is only
         called from a binary, there is no need to slow down
         shared libraries because of that.  */
      sym->st_shndx = SHN_UNDEF;
      if (!h->pointer_equality_needed)
        sym->st_value = 0;
    }

Entries in DT_GNU_HASH were originally defined.  A backend
may change some entries to undefined.  I think my patch is OK.

-- 
H.J.
H.J. Lu March 9, 2020, 4:14 a.m. | #15
On Sun, Mar 8, 2020 at 7:35 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Sun, Mar 8, 2020 at 7:23 PM Alan Modra <amodra@gmail.com> wrote:
> >
> > On Mon, Mar 09, 2020 at 12:29:48PM +1030, Kaylee Blake wrote:
> > > On 9/3/20 12:06 pm, H.J. Lu wrote:
> > > > On Sun, Mar 8, 2020 at 5:05 PM Alan Modra <amodra@gmail.com> wrote:
> > > >> Well we certainly don't do such sorting.  For example, from a freshly
> > > >> build ld/ld-new --enable-targets=all
> > > >>
> > > >>    148: 0000000000f08380     4 OBJECT  GLOBAL DEFAULT   25 opterr@GLIBC_2.2.5 (3)
> > > >>    149: 0000000000402f80     0 FUNC    GLOBAL DEFAULT  UND calloc@GLIBC_2.2.5 (3)
> > > >>    150: 0000000000881536    35 FUNC    GLOBAL DEFAULT   13 _obstack_allocated_p
> > > >>
> > > >
> > > > I will make 2 changes:
> > > >
> > > > 1.  Update -z nosectionheader to guarantee that the last entry in
> > > > dynamic symbol table
> > > > is defined.
> > > > 2.  Update --remove-section-header to issue an error if the last entry
> > > > in dynamic symbol
> > > > table is undefined.
> > > >
> > >
> > > With some testing, it seems like ld will emit an ordered symbol table
> > > iff it's using the DT_GNU_HASH hash table style
> >
> > It doesn't.  The snippet of .dynsym I posted was from a binary with
> > DT_GNU_HASH.  elflink.c:_bfd_elf_link_renumber_dynsyms should convince
> > you that any ordering seen is by chance.
> >
> > >, and my understanding is
> > > that DT_GNU_HASH in fact requires this behaviour.
> >
> > Apparently not.  ;-)
> >
> > > So in that case, we
> > > don't need to do an additional check, because we only need the ordering
> > > if we are looking up through DT_GNU_HASH instead of DT_HASH.
> > >
> > > --
> > > Kaylee Blake <klkblake@gmail.com>
> > > C is the worst language, except for all the others.
> >
>
> x86 backend does:
>
>  if (!local_undefweak
>       && !h->def_regular
>       && (h->plt.offset != (bfd_vma) -1
>           || eh->plt_got.offset != (bfd_vma) -1))
>     {
>       /* Mark the symbol as undefined, rather than as defined in
>          the .plt section.  Leave the value if there were any
>          relocations where pointer equality matters (this is a clue
>          for the dynamic linker, to make function pointer
>          comparisons work between an application and shared
>          library), otherwise set it to zero.  If a function is only
>          called from a binary, there is no need to slow down
>          shared libraries because of that.  */
>       sym->st_shndx = SHN_UNDEF;
>       if (!h->pointer_equality_needed)
>         sym->st_value = 0;
>     }
>
> Entries in DT_GNU_HASH were originally defined.  A backend
> may change some entries to undefined.  I think my patch is OK.
>


[hjl@gnu-cfl-2 pr25617]$ cat y.s
.data
bar:
.dc.a foo
[hjl@gnu-cfl-2 pr25617]$ gcc -c y.s
[hjl@gnu-cfl-2 pr25617]$ ./ld -shared y.o --hash-style=sysv
[hjl@gnu-cfl-2 pr25617]$ readelf -D -s  a.out

Symbol table for image:
  Num Buc:    Value          Size   Type   Bind Vis      Ndx Name
    1   0: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT UND foo
[hjl@gnu-cfl-2 pr25617]$ ./ld -shared y.o --hash-style=gnu
[hjl@gnu-cfl-2 pr25617]$ readelf -D -s  a.out
[hjl@gnu-cfl-2 pr25617]$

I will update my patch so that it does not generate such a binary
without a section header.
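
For illustration only, one way a tool might recognise the
--hash-style=gnu case above, where the only dynamic symbol is
undefined and so nothing is hashed.  This is a sketch under the
assumption that the DT_GNU_HASH table is in memory as host-endian
32-bit words; the function name and the bloom_word32s parameter are
invented for the example and are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DT_GNU_HASH starts with nbuckets, symoffset, bloom_size and
   bloom_shift, followed by the bloom filter and then the buckets.
   BLOOM_WORD32S is the size of one bloom word in 32-bit units:
   1 for ELFCLASS32, 2 for ELFCLASS64.  A bucket value of 0 means the
   bucket is empty.  */
static bool
gnu_hash_has_no_symbols (const uint32_t *hash, size_t bloom_word32s)
{
  uint32_t nbuckets = hash[0];
  uint32_t bloom_size = hash[2];
  const uint32_t *buckets = hash + 4 + (size_t) bloom_size * bloom_word32s;

  for (uint32_t i = 0; i < nbuckets; i++)
    if (buckets[i] != 0)
      return false;

  /* Every bucket is empty, so the hash table cannot tell a reader how
     many undefined symbols follow the first symoffset entries.  */
  return true;
}
```

When this returns true and DT_HASH is absent, the dynamic symbol count
cannot be recovered from the hash table alone, which is the situation
the readelf output above shows.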

-- 
H.J.
Kaylee Blake March 9, 2020, 4:59 a.m. | #16
On 9/3/20 2:44 pm, H.J. Lu wrote:
> On Sun, Mar 8, 2020 at 7:35 PM H.J. Lu <hjl.tools@gmail.com> wrote:

>>

>> On Sun, Mar 8, 2020 at 7:23 PM Alan Modra <amodra@gmail.com> wrote:

>>>

>>> On Mon, Mar 09, 2020 at 12:29:48PM +1030, Kaylee Blake wrote:

>>>> On 9/3/20 12:06 pm, H.J. Lu wrote:

>>>>> On Sun, Mar 8, 2020 at 5:05 PM Alan Modra <amodra@gmail.com> wrote:

>>>>>> Well we certainly don't do such sorting.  For example, from a freshly

>>>>>> build ld/ld-new --enable-targets=all

>>>>>>

>>>>>>    148: 0000000000f08380     4 OBJECT  GLOBAL DEFAULT   25 opterr@GLIBC_2.2.5 (3)

>>>>>>    149: 0000000000402f80     0 FUNC    GLOBAL DEFAULT  UND calloc@GLIBC_2.2.5 (3)

>>>>>>    150: 0000000000881536    35 FUNC    GLOBAL DEFAULT   13 _obstack_allocated_p

>>>>>>

>>>>>

>>>>> I will make 2 changes:

>>>>>

>>>>> 1.  Update -z nosectionheader to guarantee that the last entry in

>>>>> dynamic symbol table

>>>>> is defined.

>>>>> 2.  Update --remove-section-header to issue an error if the last entry

>>>>> in dynamic symbol

>>>>> table is undefined.

>>>>>

>>>>

>>>> With some testing, it seems like ld will emit an ordered symbol table

>>>> iff it's using the DT_GNU_HASH hash table style

>>>

>>> It doesn't.  The snippet of .dynsym I posted was from a binary with

>>> DT_GNU_HASH.  elflink.c:_bfd_elf_link_renumber_dynsyms should convince

>>> you that any ordering seen is by chance.

>>>

>>>> , and my understanding is

>>>> that DT_GNU_HASH in fact requires this behaviour.

>>>

>>> Apparently not.  ;-)

>>>

>>>> So in that case, we

>>>> don't need to do an additional check, because we only need the ordering

>>>> if we are looking up through DT_GNU_HASH instead of DT_HASH.

>>>>

>>>> --

>>>> Kaylee Blake <klkblake@gmail.com>

>>>> C is the worst language, except for all the others.

>>>

>>

>> x86 backend does:

>>

>>  if (!local_undefweak
>>       && !h->def_regular
>>       && (h->plt.offset != (bfd_vma) -1
>>           || eh->plt_got.offset != (bfd_vma) -1))
>>     {
>>       /* Mark the symbol as undefined, rather than as defined in
>>          the .plt section.  Leave the value if there were any
>>          relocations where pointer equality matters (this is a clue
>>          for the dynamic linker, to make function pointer
>>          comparisons work between an application and shared
>>          library), otherwise set it to zero.  If a function is only
>>          called from a binary, there is no need to slow down
>>          shared libraries because of that.  */
>>       sym->st_shndx = SHN_UNDEF;
>>       if (!h->pointer_equality_needed)
>>         sym->st_value = 0;
>>     }

>>

>> Entries in DT_GNU_HASH were originally defined.  A backend

>> may change some entries to undefined.  I think my patch is OK.

>>

> 

> [hjl@gnu-cfl-2 pr25617]$ cat y.s
> .data
> bar:
> .dc.a foo
> [hjl@gnu-cfl-2 pr25617]$ gcc -c y.s
> [hjl@gnu-cfl-2 pr25617]$ ./ld -shared y.o --hash-style=sysv
> [hjl@gnu-cfl-2 pr25617]$ readelf -D -s  a.out
>
> Symbol table for image:
>   Num Buc:    Value          Size   Type   Bind Vis      Ndx Name
>     1   0: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT UND foo
> [hjl@gnu-cfl-2 pr25617]$ ./ld -shared y.o --hash-style=gnu
> [hjl@gnu-cfl-2 pr25617]$ readelf -D -s  a.out
> [hjl@gnu-cfl-2 pr25617]$
>
> I will update my patch to not to generate such binary without section
> header.

> 


A possible alternative if we have DT_GNU_HASH is to scan through the
relocation list. Every symbol we care about for linking must either be
something this library is providing (in which case it's in the range
provided by DT_GNU_HASH), or something it needs (in which case there
will be a relocation referencing it). So if there is no DT_HASH, we can
take the max of the highest DT_GNU_HASH symbol and the highest symbol
referenced by a relocation entry. Theoretically there could be a symbol
which is undefined but never referenced in a relocation, but the dynamic
linker doesn't have any information we don't, so it can't affect
anything if we don't have a way to get it.
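
A rough Python sketch of that idea, to make the arithmetic concrete.  It
assumes a 64-bit little-endian object (8-byte bloom words, Elf64_Rela
entries) and that the raw DT_GNU_HASH and relocation bytes have already
been located; all function names are hypothetical and this is not what
the patch implements:

```python
import struct


def gnu_hash_symbol_count(table: bytes, bloom_word_size: int = 8) -> int:
    """One past the highest .dynsym index reachable through DT_GNU_HASH."""
    nbuckets, symoffset, bloom_size, _shift = struct.unpack_from("<4I", table, 0)
    buckets_off = 16 + bloom_size * bloom_word_size
    buckets = struct.unpack_from(f"<{nbuckets}I", table, buckets_off)
    chains_off = buckets_off + 4 * nbuckets
    idx = max(buckets, default=0)
    if idx < symoffset:
        # No hashed symbols at all: only the unhashed prefix is known.
        return symoffset
    # Walk the chain that starts at the highest bucket entry; the low bit
    # of a chain word marks the last symbol in that chain.
    while True:
        (word,) = struct.unpack_from("<I", table, chains_off + 4 * (idx - symoffset))
        if word & 1:
            return idx + 1
        idx += 1


def max_reloc_symbol_index(rela: bytes) -> int:
    """Highest symbol index named by any Elf64_Rela entry (0 if none)."""
    best = 0
    for off in range(0, len(rela) - 23, 24):
        _r_offset, r_info, _r_addend = struct.unpack_from("<QQq", rela, off)
        best = max(best, r_info >> 32)  # ELF64_R_SYM
    return best


def estimate_dynsym_count(gnu_hash: bytes, rela: bytes) -> int:
    """Max of the DT_GNU_HASH range and the relocation references."""
    return max(gnu_hash_symbol_count(gnu_hash), max_reloc_symbol_index(rela) + 1)
```

For a library like the y.s example, the hash table contributes only its
unhashed prefix and the relocation against foo supplies the final index.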

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Florian Weimer March 9, 2020, 8:13 a.m. | #17
* H. J. Lu:

> Section header isn't mandatory on ELF executable nor shared library.

> This patch adds a new linker option, -z nosectionheader, to omit ELF

> section header when building an executable or shared library, adds

> an objcopy and strip option, --remove-section-header, to remove ELF

> section header from an executable or shared library.

>

> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,

> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to

> reconstruct dynamic symbol table when section header isn't available.

> For DT_HASH, the number of dynamic symbol table entries equals the

> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols

> with non-STB_LOCAL bindings are in hash table.  Since in dynamic symbol

> table, all symbols with STB_LOCAL binding are placed before symbols with

> other bindings and all defined symbols are placed before undefined ones,

> the highest symbol index in DT_GNU_HASH/DT_MIPS_XHASH is the highest

> dynamic symbol table index.


Does this patch enable ld to use shared objects without a section
header for linking?

I think the NEWS and manual update should clarify this.

In my opinion, it should NOT be possible to link against objects
without section headers.  Lack of section headers clearly marks the
object as a run-time only object.  This is useful if you want to
prevent developers from creating DT_NEEDED dependencies on internal
libraries, for example.
Alan Modra March 9, 2020, 11:56 a.m. | #18
On Sun, Mar 08, 2020 at 07:35:56PM -0700, H.J. Lu wrote:
> On Sun, Mar 8, 2020 at 7:23 PM Alan Modra <amodra@gmail.com> wrote:

> > It doesn't.  The snippet of .dynsym I posted was from a binary with

> > DT_GNU_HASH.  elflink.c:_bfd_elf_link_renumber_dynsyms should convince

> > you that any ordering seen is by chance.


I forgot that elf_gnu_hash_process_symidx does in fact perform yet
another ordering of .dynsyms, so undefined non-locals are put before
defined non-locals.

> Entries in DT_GNU_HASH were originally defined.  A backend

> may change some entries to undefined.


Yeah, the function pointer hack to make pointer comparisons work
between an application and shared library.  SHN_UNDEF but with a
non-zero value like defined symbols, so not really undefined.

-- 
Alan Modra
Australia Development Lab, IBM
Kaylee Blake March 9, 2020, 12:54 p.m. | #19
On 9/3/20 6:43 pm, Florian Weimer wrote:
> * H. J. Lu:

> 

>> Section header isn't mandatory on ELF executable nor shared library.

>> This patch adds a new linker option, -z nosectionheader, to omit ELF

>> section header when building an executable or shared library, adds

>> an objcopy and strip option, --remove-section-header, to remove ELF

>> section header from an executable or shared library.

>>

>> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,

>> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to

>> reconstruct dynamic symbol table when section header isn't available.

>> For DT_HASH, the number of dynamic symbol table entries equals the

>> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols

>> with non-STB_LOCAL bindings are in hash table.  Since in dynamic symbol

>> table, all symbols with STB_LOCAL binding are placed before symbols with

>> other bindings and all defined symbols are placed before undefined ones,

>> the highest symbol index in DT_GNU_HASH/DT_MIPS_XHASH is the highest

>> dynamic symbol table index.

> 

> Does this patch enable ld to use shared objects without a section

> header for linking?

> 

> I think the NEWS and manual update should clarify this.

> 

> In my opinion, it should NOT be possible to link against objects

> without section headers.  Lack of section headers clearly marks the

> object as a run-time only object.  This is useful if you want to

> prevent developers to create DT_NEEDED dependencies on internal

> libraries, for example.


For shared objects without debug symbols, the section header table is,
on average, ~2kB of redundant data. I'm also not a fan of the
inconsistency of having shared libraries that the dynamic linker is
perfectly happy to load, but ld can't link against, especially since
this seems like an oversight rather than an intended design decision.

If the internal library use case is worth supporting, adding a note
tagging said internal library as not meant to be linked against seems
like a better (and much more efficient) approach? This could also
actually result in the dynamic linker rejecting an attempt to load the
library through a DT_NEEDED entry.

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Florian Weimer March 9, 2020, 1:06 p.m. | #20
* Kaylee Blake:

> On 9/3/20 6:43 pm, Florian Weimer wrote:

>> * H. J. Lu:

>> 

>>> Section header isn't mandatory on ELF executable nor shared library.

>>> This patch adds a new linker option, -z nosectionheader, to omit ELF

>>> section header when building an executable or shared library, adds

>>> an objcopy and strip option, --remove-section-header, to remove ELF

>>> section header from an executable or shared library.

>>>

>>> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,

>>> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to

>>> reconstruct dynamic symbol table when section header isn't available.

>>> For DT_HASH, the number of dynamic symbol table entries equals the

>>> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols

>>> with non-STB_LOCAL bindings are in hash table.  Since in dynamic symbol

>>> table, all symbols with STB_LOCAL binding are placed before symbols with

>>> other bindings and all defined symbols are placed before undefined ones,

>>> the highest symbol index in DT_GNU_HASH/DT_MIPS_XHASH is the highest

>>> dynamic symbol table index.

>> 

>> Does this patch enable ld to use shared objects without a section

>> header for linking?

>> 

>> I think the NEWS and manual update should clarify this.

>> 

>> In my opinion, it should NOT be possible to link against objects

>> without section headers.  Lack of section headers clearly marks the

>> object as a run-time only object.  This is useful if you want to

>> prevent developers to create DT_NEEDED dependencies on internal

>> libraries, for example.

>

> For shared objects without debug symbols, the section header table is

> ~2kB on average of redundant data. I'm also not a fan of the

> inconsistency of having shared libraries that the dynamic linker is

> perfectly happy to load, but ld can't link against, especially since

> this seems like an oversight rather than an intended design decision.


You didn't answer my question. 8-)

> If the internal library use case is worth supporting, adding a note

> tagging said internal library as not meant to be linked against seems

> like a better (and much more efficient) approach?


The dynamic linker does not look at section headers at all.

> This could also actually result in the dynamic linker rejecting

> attempting to load through DT_NEEDED entry.


No, DT_NEEDED entries would be how the library is loaded.
Kaylee Blake March 9, 2020, 1:14 p.m. | #21
On 9/3/20 11:36 pm, Florian Weimer wrote:
> * Kaylee Blake:

> 

>> On 9/3/20 6:43 pm, Florian Weimer wrote:

>>> * H. J. Lu:

>>>

>>>> Section header isn't mandatory on ELF executable nor shared library.

>>>> This patch adds a new linker option, -z nosectionheader, to omit ELF

>>>> section header when building an executable or shared library, adds

>>>> an objcopy and strip option, --remove-section-header, to remove ELF

>>>> section header from an executable or shared library.

>>>>

>>>> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,

>>>> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to

>>>> reconstruct dynamic symbol table when section header isn't available.

>>>> For DT_HASH, the number of dynamic symbol table entries equals the

>>>> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols

>>>> with non-STB_LOCAL bindings are in hash table.  Since in dynamic symbol

>>>> table, all symbols with STB_LOCAL binding are placed before symbols with

>>>> other bindings and all defined symbols are placed before undefined ones,

>>>> the highest symbol index in DT_GNU_HASH/DT_MIPS_XHASH is the highest

>>>> dynamic symbol table index.

>>>

>>> Does this patch enable ld to use shared objects without a section

>>> header for linking?

>>>

>>> I think the NEWS and manual update should clarify this.

>>>

>>> In my opinion, it should NOT be possible to link against objects

>>> without section headers.  Lack of section headers clearly marks the

>>> object as a run-time only object.  This is useful if you want to

>>> prevent developers to create DT_NEEDED dependencies on internal

>>> libraries, for example.

>>

>> For shared objects without debug symbols, the section header table is

>> ~2kB on average of redundant data. I'm also not a fan of the

>> inconsistency of having shared libraries that the dynamic linker is

>> perfectly happy to load, but ld can't link against, especially since

>> this seems like an oversight rather than an intended design decision.

> 

> You didn't answer my question. 8-)


Ah, yes, sorry. It does enable that; that was my primary motivation for
my part in it.

>> If the internal library use case is worth supporting, adding a note

>> tagging said internal library as not meant to be linked against seems

>> like a better (and much more efficient) approach?

> 

> The dynamic linker does not look at section headers at all.


I was thinking of using the PT_NOTE program header, as that's already
used by the dynamic linker to check some things.

>> This could also actually result in the dynamic linker rejecting

>> attempting to load through DT_NEEDED entry.

> 

> No, DT_NEEDED entries would be how the library is loaded.

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Florian Weimer March 9, 2020, 1:16 p.m. | #22
* Kaylee Blake:

> On 9/3/20 11:36 pm, Florian Weimer wrote:

>> * Kaylee Blake:

>> 

>>> On 9/3/20 6:43 pm, Florian Weimer wrote:

>>>> * H. J. Lu:

>>>>

>>>>> Section header isn't mandatory on ELF executable nor shared library.

>>>>> This patch adds a new linker option, -z nosectionheader, to omit ELF

>>>>> section header when building an executable or shared library, adds

>>>>> an objcopy and strip option, --remove-section-header, to remove ELF

>>>>> section header from an executable or shared library.

>>>>>

>>>>> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,

>>>>> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to

>>>>> reconstruct dynamic symbol table when section header isn't available.

>>>>> For DT_HASH, the number of dynamic symbol table entries equals the

>>>>> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols

>>>>> with non-STB_LOCAL bindings are in hash table.  Since in dynamic symbol

>>>>> table, all symbols with STB_LOCAL binding are placed before symbols with

>>>>> other bindings and all defined symbols are placed before undefined ones,

>>>>> the highest symbol index in DT_GNU_HASH/DT_MIPS_XHASH is the highest

>>>>> dynamic symbol table index.

>>>>

>>>> Does this patch enable ld to use shared objects without a section

>>>> header for linking?

>>>>

>>>> I think the NEWS and manual update should clarify this.

>>>>

>>>> In my opinion, it should NOT be possible to link against objects

>>>> without section headers.  Lack of section headers clearly marks the

>>>> object as a run-time only object.  This is useful if you want to

>>>> prevent developers to create DT_NEEDED dependencies on internal

>>>> libraries, for example.

>>>

>>> For shared objects without debug symbols, the section header table is

>>> ~2kB on average of redundant data. I'm also not a fan of the

>>> inconsistency of having shared libraries that the dynamic linker is

>>> perfectly happy to load, but ld can't link against, especially since

>>> this seems like an oversight rather than an intended design decision.

>> 

>> You didn't answer my question. 8-)

>

> Ah, yes, sorry. It does enable that; that was my primary motivation for

> my part in it.


I think that's conceptually the wrong thing to do for ELF, sorry.  If
there is no section header, the object should be unlinkable.  The
linker should not use the dynamic segment to locate the symbol
information, only the dynamic section (in case the link ABI and
run-time ABI are different).

You could still get the run-time savings by stripping the DSO and
using different (but matching) DSOs for linker input and at run time.
Kaylee Blake March 9, 2020, 1:28 p.m. | #23
On 9/3/20 11:46 pm, Florian Weimer wrote:
> * Kaylee Blake:

> 

>> On 9/3/20 11:36 pm, Florian Weimer wrote:

>>> * Kaylee Blake:

>>>

>>>> On 9/3/20 6:43 pm, Florian Weimer wrote:

>>>>> * H. J. Lu:

>>>>>

>>>>>> Section header isn't mandatory on ELF executable nor shared library.

>>>>>> This patch adds a new linker option, -z nosectionheader, to omit ELF

>>>>>> section header when building an executable or shared library, adds

>>>>>> an objcopy and strip option, --remove-section-header, to remove ELF

>>>>>> section header from an executable or shared library.

>>>>>>

>>>>>> The PT_DYNAMIC segment contains DT_HASH/DT_GNU_HASH/DT_MIPS_XHASH,

>>>>>> DT_STRTAB, DT_SYMTAB, DT_STRSZ and DT_SYMENT, which can be used to

>>>>>> reconstruct dynamic symbol table when section header isn't available.

>>>>>> For DT_HASH, the number of dynamic symbol table entries equals the

>>>>>> number of chains.  For DT_GNU_HASH/DT_MIPS_XHASH, only defined symbols

>>>>>> with non-STB_LOCAL bindings are in hash table.  Since in dynamic symbol

>>>>>> table, all symbols with STB_LOCAL binding are placed before symbols with

>>>>>> other bindings and all defined symbols are placed before undefined ones,

>>>>>> the highest symbol index in DT_GNU_HASH/DT_MIPS_XHASH is the highest

>>>>>> dynamic symbol table index.

>>>>>

>>>>> Does this patch enable ld to use shared objects without a section

>>>>> header for linking?

>>>>>

>>>>> I think the NEWS and manual update should clarify this.

>>>>>

>>>>> In my opinion, it should NOT be possible to link against objects

>>>>> without section headers.  Lack of section headers clearly marks the

>>>>> object as a run-time only object.  This is useful if you want to

>>>>> prevent developers to create DT_NEEDED dependencies on internal

>>>>> libraries, for example.

>>>>

>>>> For shared objects without debug symbols, the section header table is

>>>> ~2kB on average of redundant data. I'm also not a fan of the

>>>> inconsistency of having shared libraries that the dynamic linker is

>>>> perfectly happy to load, but ld can't link against, especially since

>>>> this seems like an oversight rather than an intended design decision.

>>>

>>> You didn't answer my question. 8-)

>>

>> Ah, yes, sorry. It does enable that; that was my primary motivation for

>> my part in it.

> 

> I think that's conceptually the wrong thing to do for ELF, sorry.  If

> there is no section header, the object should be unlinkable.  The

> linker should not use the dynamic segment to locate the symbol

> information, only the dynamic section (in case the link ABI and

> run-time ABI are different).


I'm confused by your comment about link and run-time ABIs differing;
surely if the ABI at runtime differs from the ABI at link time, you are
just going to crash at runtime?

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Florian Weimer March 9, 2020, 1:29 p.m. | #24
* Kaylee Blake:

>> I think that's conceptually the wrong thing to do for ELF, sorry.  If

>> there is no section header, the object should be unlinkable.  The

>> linker should not use the dynamic segment to locate the symbol

>> information, only the dynamic section (in case the link ABI and

>> run-time ABI are different).

>

> I'm confused by your comment about link and run-time ABIs differing;

> surely if the ABI at runtime differs from the ABI at link time, you are

> just going to crash at runtime?


No, the typical application is having fewer symbols in the DSO at link
time than at load time, for example when linking against an older
version of glibc than is installed on the system.
Alan Modra March 9, 2020, 1:44 p.m. | #25
On Mon, Mar 09, 2020 at 11:24:44PM +1030, Kaylee Blake wrote:
> On 9/3/20 6:43 pm, Florian Weimer wrote:

> > In my opinion, it should NOT be possible to link against objects

> > without section headers.  Lack of section headers clearly marks the

> > object as a run-time only object.  This is useful if you want to

> > prevent developers to create DT_NEEDED dependencies on internal

> > libraries, for example.


I agree.

> For shared objects without debug symbols, the section header table is

> ~2kB on average of redundant data. I'm also not a fan of the

> inconsistency of having shared libraries that the dynamic linker is

> perfectly happy to load, but ld can't link against, especially since

> this seems like an oversight rather than an intended design decision.


The ELF spec designed things that way.  See Figure 4-1, which I'll try
to represent in text.

Figure 4-1: Object File Format

|----------------------|    |----------------------|    
|      ELF Header      |    |      ELF Header      |    
|----------------------|    |----------------------|    
| Program header table |    | Program header table |    
|       optional       |    |       required       |    
|----------------------|    |----------------------|    
|       Section 1      |    |       Segment 1      |    
|----------------------|    |----------------------|    
|          ...         |    |       Segment 2      |    
|----------------------|    |----------------------|    
|       Section n      |    |       Segment 3      |    
|----------------------|    |----------------------|    
|          ...         |    |          ...         |    
|----------------------|    |----------------------|    
| Section header table |    | Section header table |    
|       required       |    |       optional       |    
|----------------------|    |----------------------|    
      Linking View               Execution View      
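
As an illustration of the "optional" section header table in the
execution view, here is a small Python sketch that reports which of the
two tables an ELF header advertises.  It assumes an ELFCLASS64,
little-endian header and relies on the ELF rule that e_phoff / e_shoff
hold zero when the corresponding table is absent; the helper name and
the example header are hypothetical:

```python
import struct

# Elf64_Ehdr, little-endian: e_ident, e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx.
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")


def header_tables(ehdr: bytes) -> dict:
    """Report whether the program and section header tables are present."""
    if len(ehdr) < EHDR64.size or ehdr[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ehdr[4] != 2 or ehdr[5] != 1:
        raise ValueError("sketch handles ELFCLASS64 / ELFDATA2LSB only")
    fields = EHDR64.unpack_from(ehdr, 0)
    e_phoff, e_shoff = fields[5], fields[6]
    return {"program_headers": e_phoff != 0, "section_headers": e_shoff != 0}


# Hypothetical header of a shared object with its section header removed:
# ET_DYN (3), EM_X86_64 (62), one program header right after the ELF header.
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
stripped = EHDR64.pack(ident, 3, 62, 1, 0, 64, 0, 0, 64, 56, 1, 0, 0, 0)
print(header_tables(stripped))
```

On a real file the same function can be fed the first 64 bytes read in
binary mode.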


-- 
Alan Modra
Australia Development Lab, IBM
Kaylee Blake March 9, 2020, 1:45 p.m. | #26
On 9/3/20 11:59 pm, Florian Weimer wrote:
> * Kaylee Blake:

> 

>>> I think that's conceptually the wrong thing to do for ELF, sorry.  If

>>> there is no section header, the object should be unlinkable.  The

>>> linker should not use the dynamic segment to locate the symbol

>>> information, only the dynamic section (in case the link ABI and

>>> run-time ABI are different).

>>

>> I'm confused by your comment about link and run-time ABIs differing;

>> surely if the ABI at runtime differs from the ABI at link time, you are

>> just going to crash at runtime?

> 

> No, the typical application are fewer symbols in the DSO at link time

> than at load time, for example for linking against an older version of

> glibc than is installed on the system.


How is that being done? On my machine, the symbols in glibc found
through the section header are identical to the ones found through the
dynamic array, except that some of the latter are missing symbol
versions, which I think is due to this patch not looking them up? (I'm
not actually sure if this patch does that or not).

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
H.J. Lu March 9, 2020, 1:54 p.m. | #27
On Mon, Mar 9, 2020 at 6:46 AM Kaylee Blake <klkblake@gmail.com> wrote:
>

> On 9/3/20 11:59 pm, Florian Weimer wrote:

> > * Kaylee Blake:

> >

> >>> I think that's conceptually the wrong thing to do for ELF, sorry.  If

> >>> there is no section header, the object should be unlinkable.  The

> >>> linker should not use the dynamic segment to locate the symbol

> >>> information, only the dynamic section (in case the link ABI and

> >>> run-time ABI are different).

> >>

> >> I'm confused by your comment about link and run-time ABIs differing;

> >> surely if the ABI at runtime differs from the ABI at link time, you are

> >> just going to crash at runtime?

> >

> > No, the typical application are fewer symbols in the DSO at link time

> > than at load time, for example for linking against an older version of

> > glibc than is installed on the system.

>

> How is that being done? On my machine, the symbols in glibc found

> through the section header are identical to the ones found through the

> dynamic array, except that some of the latter are missing symbol

> versions, which I think is due to this patch not looking them up? (I'm

> not actually sure if this patch does that or not).

>


Symbol versioning is a real problem.  We need to reconstruct all dynamic
symbol info from PT_DYNAMIC segment.   I am running into a wrong
output problem on i386.  I am leaning toward Florian's suggestion to
only add --remove-section-header and -z nosectionheader without
reconstructing dynamic symbol info from PT_DYNAMIC segment.

-- 
H.J.
Kaylee Blake March 9, 2020, 1:54 p.m. | #28
On 10/3/20 12:14 am, Alan Modra wrote:
> On Mon, Mar 09, 2020 at 11:24:44PM +1030, Kaylee Blake wrote:

>> On 9/3/20 6:43 pm, Florian Weimer wrote:

>>> In my opinion, it should NOT be possible to link against objects

>>> without section headers.  Lack of section headers clearly marks the

>>> object as a run-time only object.  This is useful if you want to

>>> prevent developers to create DT_NEEDED dependencies on internal

>>> libraries, for example.

> 

> I agree.

> 

>> For shared objects without debug symbols, the section header table is

>> ~2kB on average of redundant data. I'm also not a fan of the

>> inconsistency of having shared libraries that the dynamic linker is

>> perfectly happy to load, but ld can't link against, especially since

>> this seems like an oversight rather than an intended design decision.

> 

> The ELF spec designed things that way.  See figure 4.1 which I'll try

> to represent in text.

> 

> Figure 4-1: Object File Format
>
> |----------------------|    |----------------------|
> |      ELF Header      |    |      ELF Header      |
> |----------------------|    |----------------------|
> | Program header table |    | Program header table |
> |       optional       |    |       required       |
> |----------------------|    |----------------------|
> |       Section 1      |    |       Segment 1      |
> |----------------------|    |----------------------|
> |          ...         |    |       Segment 2      |
> |----------------------|    |----------------------|
> |       Section n      |    |       Segment 3      |
> |----------------------|    |----------------------|
> |          ...         |    |          ...         |
> |----------------------|    |----------------------|
> | Section header table |    | Section header table |
> |       required       |    |       optional       |
> |----------------------|    |----------------------|
>       Linking View               Execution View

> 


I had interpreted that table, in combination with various other
references to which things are required vs optional in shared objects,
as meaning that the "execution view" applied to executables and shared
objects, and the "linking view" applied to relocatable objects. You're
saying that the table should be interpreted as saying that if a shared
object is to be linkable, the spec requires it to have both sets of
headers?


-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Kaylee Blake March 9, 2020, 2:02 p.m. | #29
On 10/3/20 12:24 am, H.J. Lu wrote:
> On Mon, Mar 9, 2020 at 6:46 AM Kaylee Blake <klkblake@gmail.com> wrote:

>>

>> On 9/3/20 11:59 pm, Florian Weimer wrote:

>>> * Kaylee Blake:

>>>

>>>>> I think that's conceptually the wrong thing to do for ELF, sorry.  If

>>>>> there is no section header, the object should be unlinkable.  The

>>>>> linker should not use the dynamic segment to locate the symbol

>>>>> information, only the dynamic section (in case the link ABI and

>>>>> run-time ABI are different).

>>>>

>>>> I'm confused by your comment about link and run-time ABIs differing;

>>>> surely if the ABI at runtime differs from the ABI at link time, you are

>>>> just going to crash at runtime?

>>>

>>> No, the typical application are fewer symbols in the DSO at link time

>>> than at load time, for example for linking against an older version of

>>> glibc than is installed on the system.

>>

>> How is that being done? On my machine, the symbols in glibc found

>> through the section header are identical to the ones found through the

>> dynamic array, except that some of the latter are missing symbol

>> versions, which I think is due to this patch not looking them up? (I'm

>> not actually sure if this patch does that or not).

>>

> 

> Symbol versioning is a real problem.  We need to reconstruct all dynamic

> symbol info from PT_DYNAMIC segment.   I am running into a wrong

> output problem on i386.  I am leaning toward Florian's suggestion to

> only add --remove-section-header and -z nosectionheader without

> reconstructing dynamic symbol info from PT_DYNAMIC segment.


Ah, it's not as simple as just grabbing addresses / sizes from the
DT_VER* entries? Admittedly I don't actually know much about how symbol
versioning operates.


-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Michael Matz March 9, 2020, 2:34 p.m. | #30
Hello,

On Tue, 10 Mar 2020, Alan Modra wrote:

> On Mon, Mar 09, 2020 at 11:24:44PM +1030, Kaylee Blake wrote:

> > On 9/3/20 6:43 pm, Florian Weimer wrote:

> > > In my opinion, it should NOT be possible to link against objects

> > > without section headers.  Lack of section headers clearly marks the

> > > object as a run-time only object.  This is useful if you want to

> > > prevent developers to create DT_NEEDED dependencies on internal

> > > libraries, for example.

> 

> I agree.


Me as well, as much as I like the cute hack of reconstructing the linking 
view from only the executable view.  (We probably could make everything 
work with some additional constraints that give us some guarantees, but I 
don't think that's in the spirit of ELF).


Ciao,
Michael.
Florian Weimer March 9, 2020, 2:52 p.m. | #31
* Kaylee Blake:

> On 9/3/20 11:59 pm, Florian Weimer wrote:

>> * Kaylee Blake:

>> 

>>>> I think that's conceptually the wrong thing to do for ELF, sorry.  If

>>>> there is no section header, the object should be unlinkable.  The

>>>> linker should not use the dynamic segment to locate the symbol

>>>> information, only the dynamic section (in case the link ABI and

>>>> run-time ABI are different).

>>>

>>> I'm confused by your comment about link and run-time ABIs differing;

>>> surely if the ABI at runtime differs from the ABI at link time, you are

>>> just going to crash at runtime?

>> 

>> No, the typical application are fewer symbols in the DSO at link time

>> than at load time, for example for linking against an older version of

>> glibc than is installed on the system.

>

> How is that being done?


libc.so is replaced with something that contains a stub library which
only exports the intended symbols, at the right versions.

If I recall correctly, some enterprise database products are linked
using ld against such a stub library upon installation.  It's a bit
weird for sure, but it's what people do.

> On my machine, the symbols in glibc found through the section header

> are identical to the ones found through the dynamic array, except

> that some of the latter are missing symbol versions, which I think

> is due to this patch not looking them up?


Ah, didn't know that the compat symbols (those with @ symbol versions
instead of @@ symbol versions) are not present in .symtab.  That makes
sense, given that they are not supposed to be linked against for new
binaries.  We should definitely keep hiding symbols which lack a
default symbol version.
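
For context on the @ versus @@ distinction: in the GNU versioning
scheme each .dynsym entry has a 16-bit entry in the version symbol table
(DT_VERSYM), and bit 15 (VERSYM_HIDDEN) marks a non-default version.  A
minimal Python sketch of how a defined symbol's display name could be
derived from that entry; the helper name and the example version table
are hypothetical, and symbols that take their version from a needed
library are not covered:

```python
VERSYM_HIDDEN = 0x8000   # bit 15: version is not the default one
VERSYM_VERSION = 0x7FFF  # low 15 bits: version index


def versioned_name(name: str, versym: int, version_names: dict) -> str:
    """Format a defined dynamic symbol as name, name@VER or name@@VER."""
    index = versym & VERSYM_VERSION
    if index <= 1:
        # 0 (VER_NDX_LOCAL) and 1 (VER_NDX_GLOBAL) carry no version name.
        return name
    separator = "@" if versym & VERSYM_HIDDEN else "@@"
    return f"{name}{separator}{version_names[index]}"


# Hypothetical version-definition table: index -> version name.
versions = {2: "GLIBC_2.2.5", 3: "GLIBC_2.14"}
print(versioned_name("memcpy", 0x8002, versions))  # hidden (compat) version
print(versioned_name("memcpy", 0x0003, versions))  # default version
```

A reader that never consults DT_VERSYM cannot tell the two cases apart.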
Kaylee Blake March 9, 2020, 3:07 p.m. | #32
On 10/3/20 1:22 am, Florian Weimer wrote:
> * Kaylee Blake:
>
>> On 9/3/20 11:59 pm, Florian Weimer wrote:
>>> * Kaylee Blake:
>>>
>>>>> I think that's conceptually the wrong thing to do for ELF, sorry.  If
>>>>> there is no section header, the object should be unlinkable.  The
>>>>> linker should not use the dynamic segment to locate the symbol
>>>>> information, only the dynamic section (in case the link ABI and
>>>>> run-time ABI are different).
>>>>
>>>> I'm confused by your comment about link and run-time ABIs differing;
>>>> surely if the ABI at runtime differs from the ABI at link time, you are
>>>> just going to crash at runtime?
>>>
>>> No, the typical application are fewer symbols in the DSO at link time
>>> than at load time, for example for linking against an older version of
>>> glibc than is installed on the system.
>>
>> How is that being done?
>
> libc.so is replaced with something that contains a stub library which
> only exports the intended symbols, at the right versions.
>
> If I recall correctly, some enterprise database products are linked
> using ld against such a stub library upon installation.  It's a bit
> weird for sure, but it's what people do.
>
>> On my machine, the symbols in glibc found through the section header
>> are identical to the ones found through the dynamic array, except
>> that some of the latter are missing symbol versions, which I think
>> is due to this patch not looking them up?
>
> Ah, didn't know that the compat symbols (those with @ symbol versions
> instead of @@ symbol versions) are not present in .symtab.  That makes
> sense, given that they are not supposed to be linked against for new
> binaries.  We should definitely keep hiding symbols which lack a
> default symbol version.

Oh, I think I was unclear. They contain the same set of symbols, but
readelf reports the name of the symbol without any @version or @@version
suffix if reading from the dynamic array.

-- 
Kaylee Blake <klkblake@gmail.com>
C is the worst language, except for all the others.
Florian Weimer March 9, 2020, 3:29 p.m. | #33
* Kaylee Blake:

> Oh, I think I was unclear. They contain the same set of symbols,

Right, I was confused.

> but readelf reports the name of the symbol without any @version or
> @@version suffix if reading from the dynamic array.

I think that's a readelf implementation choice, up to a certain
degree.

The link editor definitely has to consult the GNU_verdef section, it's
not just data used at run time.
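
For readers following the @ versus @@ distinction, here is a minimal
sketch of how the suffix can be derived from a symbol's .gnu.version
entry.  It assumes the symbol is defined in the object and that its
version name has already been looked up in the version definitions.
The helper name format_versioned_name is hypothetical; the two masks
mirror the values in binutils' include/elf/common.h.  This is not
readelf's or ld's actual code.

```c
#include <stdint.h>
#include <stdio.h>

/* Same values as VERSYM_HIDDEN and VERSYM_VERSION in binutils'
   include/elf/common.h; redefined here to keep the sketch standalone.  */
#define VERSYM_HIDDEN  0x8000
#define VERSYM_VERSION 0x7fff

/* Hypothetical helper: write NAME plus its version suffix into BUF.
   VERSYM is the symbol's .gnu.version entry and VER is the version
   name it refers to (NULL if none).  Assumes a defined symbol.  */
int
format_versioned_name (char *buf, size_t len, const char *name,
                       const char *ver, uint16_t versym)
{
  unsigned int ndx = versym & VERSYM_VERSION;

  /* Index 0 (VER_NDX_LOCAL) and 1 (VER_NDX_GLOBAL) carry no version
     name, so the symbol is printed bare.  */
  if (ndx <= 1 || ver == NULL)
    return snprintf (buf, len, "%s", name);

  /* The hidden bit marks a non-default (compat) version: "@".
     Without it the version is the default one: "@@".  */
  return snprintf (buf, len, "%s%s%s", name,
                   (versym & VERSYM_HIDDEN) != 0 ? "@" : "@@", ver);
}
```

A tool that skips the version lookup prints the bare name in every
case, which matches the readelf output described above.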
Alan Modra March 9, 2020, 10:34 p.m. | #34
On Tue, Mar 10, 2020 at 12:24:51AM +1030, Kaylee Blake wrote:
> On 10/3/20 12:14 am, Alan Modra wrote:
> > On Mon, Mar 09, 2020 at 11:24:44PM +1030, Kaylee Blake wrote:
> >> On 9/3/20 6:43 pm, Florian Weimer wrote:
> >>> In my opinion, it should NOT be possible to link against objects
> >>> without section headers.  Lack of section headers clearly marks the
> >>> object as a run-time only object.  This is useful if you want to
> >>> prevent developers to create DT_NEEDED dependencies on internal
> >>> libraries, for example.
> >
> > I agree.
> >
> >> For shared objects without debug symbols, the section header table is
> >> ~2kB on average of redundant data. I'm also not a fan of the
> >> inconsistency of having shared libraries that the dynamic linker is
> >> perfectly happy to load, but ld can't link against, especially since
> >> this seems like an oversight rather than an intended design decision.
> >
> > The ELF spec designed things that way.  See figure 4.1 which I'll try
> > to represent in text.
> >
> > Figure 4-1: Object File Format
> >
> > |----------------------|    |----------------------|
> > |      ELF Header      |    |      ELF Header      |
> > |----------------------|    |----------------------|
> > | Program header table |    | Program header table |
> > |       optional       |    |       required       |
> > |----------------------|    |----------------------|
> > |       Section 1      |    |       Segment 1      |
> > |----------------------|    |----------------------|
> > |          ...         |    |       Segment 2      |
> > |----------------------|    |----------------------|
> > |       Section n      |    |       Segment 3      |
> > |----------------------|    |----------------------|
> > |          ...         |    |          ...         |
> > |----------------------|    |----------------------|
> > | Section header table |    | Section header table |
> > |       required       |    |       optional       |
> > |----------------------|    |----------------------|
> >       Linking View               Execution View
> >
>
> I had interpreted that table in combination to various other references
> to which things are required vs optional in shared objects as meaning
> that the "execution view" applied to executables and shared objects, and
> the "linking view" applied to relocatable objects. You're saying that
> that table should be interpreted as saying that if a shared object is to
> be linkable, the spec is requiring it to have both sets of headers?

Yes.  Just below the table: "Files used during linking must have a
section header table".

-- 
Alan Modra
Australia Development Lab, IBM
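
The two views in Figure 4-1 can be told apart from the ELF header
alone.  The following is a rough sketch only: it assumes a 64-bit
header that has already been read into memory in host byte order and
relies on the gABI convention that e_shoff and e_phoff are zero when
the corresponding table is absent.  The enum and the function name
classify_elf_view are hypothetical, not part of BFD.

```c
#include <elf.h>

/* Hypothetical labels for the columns of Figure 4-1.  */
enum elf_view { VIEW_NONE, VIEW_LINKING, VIEW_EXECUTION, VIEW_BOTH };

/* Report which views EHDR can serve.  e_shoff is tested rather than
   e_shnum because e_shnum may legitimately be zero when the section
   count is stored in the first section header instead.  */
enum elf_view
classify_elf_view (const Elf64_Ehdr *ehdr)
{
  int has_shdrs = ehdr->e_shoff != 0;	/* Section header table.  */
  int has_phdrs = ehdr->e_phoff != 0;	/* Program header table.  */

  if (has_shdrs && has_phdrs)
    return VIEW_BOTH;
  if (has_shdrs)
    return VIEW_LINKING;
  if (has_phdrs)
    return VIEW_EXECUTION;
  return VIEW_NONE;
}
```

Under the reading given above, a shared object that is meant to be
used as linker input would have to come out as VIEW_BOTH.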
H.J. Lu March 10, 2020, 12:14 a.m. | #35
On Mon, Mar 9, 2020 at 3:43 PM Alan Modra <amodra@gmail.com> wrote:
>
> On Tue, Mar 10, 2020 at 12:24:51AM +1030, Kaylee Blake wrote:
> > On 10/3/20 12:14 am, Alan Modra wrote:
> > > On Mon, Mar 09, 2020 at 11:24:44PM +1030, Kaylee Blake wrote:
> > >> On 9/3/20 6:43 pm, Florian Weimer wrote:
> > >>> In my opinion, it should NOT be possible to link against objects
> > >>> without section headers.  Lack of section headers clearly marks the
> > >>> object as a run-time only object.  This is useful if you want to
> > >>> prevent developers to create DT_NEEDED dependencies on internal
> > >>> libraries, for example.
> > >
> > > I agree.
> > >
> > >> For shared objects without debug symbols, the section header table is
> > >> ~2kB on average of redundant data. I'm also not a fan of the
> > >> inconsistency of having shared libraries that the dynamic linker is
> > >> perfectly happy to load, but ld can't link against, especially since
> > >> this seems like an oversight rather than an intended design decision.
> > >
> > > The ELF spec designed things that way.  See figure 4.1 which I'll try
> > > to represent in text.
> > >
> > > Figure 4-1: Object File Format
> > >
> > > |----------------------|    |----------------------|
> > > |      ELF Header      |    |      ELF Header      |
> > > |----------------------|    |----------------------|
> > > | Program header table |    | Program header table |
> > > |       optional       |    |       required       |
> > > |----------------------|    |----------------------|
> > > |       Section 1      |    |       Segment 1      |
> > > |----------------------|    |----------------------|
> > > |          ...         |    |       Segment 2      |
> > > |----------------------|    |----------------------|
> > > |       Section n      |    |       Segment 3      |
> > > |----------------------|    |----------------------|
> > > |          ...         |    |          ...         |
> > > |----------------------|    |----------------------|
> > > | Section header table |    | Section header table |
> > > |       required       |    |       optional       |
> > > |----------------------|    |----------------------|
> > >       Linking View               Execution View
> > >
> >
> > I had interpreted that table in combination to various other references
> > to which things are required vs optional in shared objects as meaning
> > that the "execution view" applied to executables and shared objects, and
> > the "linking view" applied to relocatable objects. You're saying that
> > that table should be interpreted as saying that if a shared object is to
> > be linkable, the spec is requiring it to have both sets of headers?
>
> Yes.  Just below the table: "Files used during linking must have a
> section header table".
>

I posted a new set of patches without linker support for PT_DYNAMIC:

https://sourceware.org/pipermail/binutils/2020-March/110157.html

-- 
H.J.
Fangrui Song March 12, 2020, 2:14 a.m. | #36
I am not subscribed to the list. The new Mailman2/Pipermail archive does
not show the Cc:/To: headers, so I may be missing some Cc: recipients.

On 2020-03-08, H.J. Lu wrote:
>On Sun, Mar 8, 2020 at 4:38 PM Alan Modra <amodra@gmail.com> wrote:
>>
>> On Sun, Mar 08, 2020 at 04:29:27PM -0700, H.J. Lu wrote:
>> > On Sun, Mar 8, 2020 at 4:25 PM Kaylee Blake <klkblake@gmail.com> wrote:
>> > >
>> > > On 9/3/20 4:29 am, H.J. Lu wrote:
>> > > > Any comments?
>> > > >
>> > > > Kaylee, do you have copyright paper with FSF?
>> > > >
>> > > > H.J.
>> > >
>> > >
>> > > I don't at present; is my contribution here significant enough to
>> > > require it?
>> > >
>> >
>> > I think so.
>>
>> Yes, please wait for the copyright assignment to be in place before
>> committing.
>
>I will.   I submitted the whole patch set.
>
>Thanks.

eu-strip and llvm-objcopy (since https://reviews.llvm.org/D38335) implement --strip-sections, which can strip
the section header table.
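
In terms of ELF header fields, dropping the section header table
amounts to what elf_swap_ehdr_out does in the patch below when
BFD_NO_SECTION_HEADER is set: the four section-related fields are
written as zero.  A standalone sketch of just that step follows,
assuming a 64-bit header in host byte order.  The function name
clear_section_header_fields is hypothetical, and this is not the
implementation used by eu-strip or llvm-objcopy.

```c
#include <elf.h>

/* Hypothetical helper: make EHDR describe a file that has no
   section header table.  Only the header is touched; the bytes of
   the table itself, if still present in the file, are not removed.  */
void
clear_section_header_fields (Elf64_Ehdr *ehdr)
{
  ehdr->e_shoff = 0;		/* No table offset.  */
  ehdr->e_shentsize = 0;	/* No entry size.  */
  ehdr->e_shnum = 0;		/* No entries.  */
  ehdr->e_shstrndx = SHN_UNDEF;	/* No section name string table.  */
}
```

A complete tool would also stop writing the table and the
section-only data, which the patch handles by returning early from
elf_write_shdrs_and_ehdr and by implying --strip-all.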

Patch

diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h
index 37114607b5..d81d0e20dd 100644
--- a/bfd/bfd-in2.h
+++ b/bfd/bfd-in2.h
@@ -6588,17 +6588,21 @@  struct bfd
   /* Put pathnames into archives (non-POSIX).  */
 #define BFD_ARCHIVE_FULL_PATH  0x100000
 
+  /* Don't generate ELF section header.  */
+#define BFD_NO_SECTION_HEADER  0x200000
+
   /* Flags bits to be saved in bfd_preserve_save.  */
 #define BFD_FLAGS_SAVED \
   (BFD_IN_MEMORY | BFD_COMPRESS | BFD_DECOMPRESS | BFD_LINKER_CREATED \
    | BFD_PLUGIN | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON \
-   | BFD_USE_ELF_STT_COMMON)
+   | BFD_USE_ELF_STT_COMMON | BFD_NO_SECTION_HEADER)
 
   /* Flags bits which are for BFD use only.  */
 #define BFD_FLAGS_FOR_BFD_USE_MASK \
   (BFD_IN_MEMORY | BFD_COMPRESS | BFD_DECOMPRESS | BFD_LINKER_CREATED \
    | BFD_PLUGIN | BFD_TRADITIONAL_FORMAT | BFD_DETERMINISTIC_OUTPUT \
-   | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON | BFD_USE_ELF_STT_COMMON)
+   | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON | BFD_USE_ELF_STT_COMMON \
+   | BFD_NO_SECTION_HEADER)
 
   /* The format which belongs to the BFD. (object, core, etc.)  */
   ENUM_BITFIELD (bfd_format) format : 3;
diff --git a/bfd/bfd.c b/bfd/bfd.c
index 1c1238c036..366e662592 100644
--- a/bfd/bfd.c
+++ b/bfd/bfd.c
@@ -176,17 +176,21 @@  CODE_FRAGMENT
 .  {* Put pathnames into archives (non-POSIX).  *}
 .#define BFD_ARCHIVE_FULL_PATH  0x100000
 .
+.  {* Don't generate ELF section header.  *}
+.#define BFD_NO_SECTION_HEADER	0x200000
+.
 .  {* Flags bits to be saved in bfd_preserve_save.  *}
 .#define BFD_FLAGS_SAVED \
 .  (BFD_IN_MEMORY | BFD_COMPRESS | BFD_DECOMPRESS | BFD_LINKER_CREATED \
 .   | BFD_PLUGIN | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON \
-.   | BFD_USE_ELF_STT_COMMON)
+.   | BFD_USE_ELF_STT_COMMON | BFD_NO_SECTION_HEADER)
 .
 .  {* Flags bits which are for BFD use only.  *}
 .#define BFD_FLAGS_FOR_BFD_USE_MASK \
 .  (BFD_IN_MEMORY | BFD_COMPRESS | BFD_DECOMPRESS | BFD_LINKER_CREATED \
 .   | BFD_PLUGIN | BFD_TRADITIONAL_FORMAT | BFD_DETERMINISTIC_OUTPUT \
-.   | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON | BFD_USE_ELF_STT_COMMON)
+.   | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON | BFD_USE_ELF_STT_COMMON \
+.   | BFD_NO_SECTION_HEADER)
 .
 .  {* The format which belongs to the BFD. (object, core, etc.)  *}
 .  ENUM_BITFIELD (bfd_format) format : 3;
diff --git a/bfd/elfcode.h b/bfd/elfcode.h
index 18a6dac64e..4dde24e02a 100644
--- a/bfd/elfcode.h
+++ b/bfd/elfcode.h
@@ -266,6 +266,10 @@  elf_swap_ehdr_out (bfd *abfd,
 {
   unsigned int tmp;
   int signed_vma = get_elf_backend_data (abfd)->sign_extend_vma;
+  /* Relocatable object must have section header.  */
+  bfd_boolean no_section_header
+    = ((abfd->flags & BFD_NO_SECTION_HEADER) != 0
+       && (abfd->flags & (EXEC_P | DYNAMIC)) != 0);
   memcpy (dst->e_ident, src->e_ident, EI_NIDENT);
   /* note that all elements of dst are *arrays of unsigned char* already...  */
   H_PUT_16 (abfd, src->e_type, dst->e_type);
@@ -276,7 +280,10 @@  elf_swap_ehdr_out (bfd *abfd,
   else
     H_PUT_WORD (abfd, src->e_entry, dst->e_entry);
   H_PUT_WORD (abfd, src->e_phoff, dst->e_phoff);
-  H_PUT_WORD (abfd, src->e_shoff, dst->e_shoff);
+  if (no_section_header)
+    H_PUT_WORD (abfd, 0, dst->e_shoff);
+  else
+    H_PUT_WORD (abfd, src->e_shoff, dst->e_shoff);
   H_PUT_32 (abfd, src->e_flags, dst->e_flags);
   H_PUT_16 (abfd, src->e_ehsize, dst->e_ehsize);
   H_PUT_16 (abfd, src->e_phentsize, dst->e_phentsize);
@@ -284,15 +291,24 @@  elf_swap_ehdr_out (bfd *abfd,
   if (tmp > PN_XNUM)
     tmp = PN_XNUM;
   H_PUT_16 (abfd, tmp, dst->e_phnum);
-  H_PUT_16 (abfd, src->e_shentsize, dst->e_shentsize);
-  tmp = src->e_shnum;
-  if (tmp >= (SHN_LORESERVE & 0xffff))
-    tmp = SHN_UNDEF;
-  H_PUT_16 (abfd, tmp, dst->e_shnum);
-  tmp = src->e_shstrndx;
-  if (tmp >= (SHN_LORESERVE & 0xffff))
-    tmp = SHN_XINDEX & 0xffff;
-  H_PUT_16 (abfd, tmp, dst->e_shstrndx);
+  if (no_section_header)
+    {
+      H_PUT_16 (abfd, 0, dst->e_shentsize);
+      H_PUT_16 (abfd, 0, dst->e_shnum);
+      H_PUT_16 (abfd, 0, dst->e_shstrndx);
+    }
+  else
+    {
+      H_PUT_16 (abfd, src->e_shentsize, dst->e_shentsize);
+      tmp = src->e_shnum;
+      if (tmp >= (SHN_LORESERVE & 0xffff))
+        tmp = SHN_UNDEF;
+      H_PUT_16 (abfd, tmp, dst->e_shnum);
+      tmp = src->e_shstrndx;
+      if (tmp >= (SHN_LORESERVE & 0xffff))
+        tmp = SHN_XINDEX & 0xffff;
+      H_PUT_16 (abfd, tmp, dst->e_shstrndx);
+    }
 }
 
 /* Translate an ELF section header table entry in external format into an
@@ -1041,6 +1057,11 @@  elf_write_shdrs_and_ehdr (bfd *abfd)
       || bfd_bwrite (&x_ehdr, amt, abfd) != amt)
     return FALSE;
 
+  /* Relocatable object must have section header.  */
+  if ((abfd->flags & BFD_NO_SECTION_HEADER) != 0
+      && (abfd->flags & (EXEC_P | DYNAMIC)) != 0)
+    return TRUE;
+
   /* Some fields in the first section header handle overflow of ehdr
      fields.  */
   if (i_ehdrp->e_phnum >= PN_XNUM)
diff --git a/bfd/elfxx-target.h b/bfd/elfxx-target.h
index 1ae17f45ee..8d25c84e80 100644
--- a/bfd/elfxx-target.h
+++ b/bfd/elfxx-target.h
@@ -970,7 +970,8 @@  const bfd_target TARGET_BIG_SYM =
   /* object_flags: mask of all file flags */
   (HAS_RELOC | EXEC_P | HAS_LINENO | HAS_DEBUG | HAS_SYMS | HAS_LOCALS
    | DYNAMIC | WP_TEXT | D_PAGED | BFD_COMPRESS | BFD_DECOMPRESS
-   | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON | BFD_USE_ELF_STT_COMMON),
+   | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON | BFD_USE_ELF_STT_COMMON
+   | BFD_NO_SECTION_HEADER),
 
   /* section_flags: mask of all section flags */
   (SEC_HAS_CONTENTS | SEC_ALLOC | SEC_LOAD | SEC_RELOC | SEC_READONLY
@@ -1071,7 +1072,8 @@  const bfd_target TARGET_LITTLE_SYM =
   /* object_flags: mask of all file flags */
   (HAS_RELOC | EXEC_P | HAS_LINENO | HAS_DEBUG | HAS_SYMS | HAS_LOCALS
    | DYNAMIC | WP_TEXT | D_PAGED | BFD_COMPRESS | BFD_DECOMPRESS
-   | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON | BFD_USE_ELF_STT_COMMON),
+   | BFD_COMPRESS_GABI | BFD_CONVERT_ELF_COMMON | BFD_USE_ELF_STT_COMMON
+   | BFD_NO_SECTION_HEADER),
 
   /* section_flags: mask of all section flags */
   (SEC_HAS_CONTENTS | SEC_ALLOC | SEC_LOAD | SEC_RELOC | SEC_READONLY
diff --git a/binutils/NEWS b/binutils/NEWS
index 1650a3ac93..5116e2eedc 100644
--- a/binutils/NEWS
+++ b/binutils/NEWS
@@ -1,5 +1,8 @@ 
 -*- text -*-
 
+* Add command-line option, --remove-section-header, to objcopy and strip
+  to remove ELF section header from an executable or shared library.
+
 Changes in 2.34:
 
 * Binutils now supports debuginfod, an HTTP server for distributing
diff --git a/binutils/doc/binutils.texi b/binutils/doc/binutils.texi
index de3f1babb2..ece82e9e39 100644
--- a/binutils/doc/binutils.texi
+++ b/binutils/doc/binutils.texi
@@ -1186,6 +1186,7 @@  objcopy [@option{-F} @var{bfdname}|@option{--target=}@var{bfdname}]
         [@option{-R} @var{sectionpattern}|@option{--remove-section=}@var{sectionpattern}]
         [@option{--keep-section=}@var{sectionpattern}]
         [@option{--remove-relocations=}@var{sectionpattern}]
+        [@option{--remove-section-header}]
         [@option{-p}|@option{--preserve-dates}]
         [@option{-D}|@option{--enable-deterministic-archives}]
         [@option{-U}|@option{--disable-deterministic-archives}]
@@ -1403,6 +1404,11 @@  will remove all relocations for sections matching the pattern
 '.text.*', but will not remove relocations for the section
 '.text.foo'.
 
+@item --remove-section-header
+Remove section header from an ELF executable or shared object.  This
+option is specific to ELF files and is ignored on relocatable object
+files.  Implies @option{--strip-all} and @option{--merge-notes}.
+
 @item -S
 @itemx --strip-all
 Do not copy relocation and symbol information from the source file.
@@ -3262,6 +3268,7 @@  strip [@option{-F} @var{bfdname} |@option{--target=}@var{bfdname}]
       [@option{-R} @var{sectionname} |@option{--remove-section=}@var{sectionname}]
       [@option{--keep-section=}@var{sectionpattern}]
       [@option{--remove-relocations=}@var{sectionpattern}]
+      [@option{--remove-section-header}]
       [@option{-o} @var{file}] [@option{-p}|@option{--preserve-dates}]
       [@option{-D}|@option{--enable-deterministic-archives}]
       [@option{-U}|@option{--disable-deterministic-archives}]
@@ -3363,6 +3370,11 @@  will remove all relocations for sections matching the pattern
 '.text.*', but will not remove relocations for the section
 '.text.foo'.
 
+@item --remove-section-header
+Remove section header from an ELF executable or shared object.  This
+option is specific to ELF files and is ignored on relocatable object
+files.  Implies @option{--strip-all} and @option{--merge-notes}.
+
 @item -s
 @itemx --strip-all
 Remove all symbols.
diff --git a/binutils/objcopy.c b/binutils/objcopy.c
index 09facf0061..10891bc3f0 100644
--- a/binutils/objcopy.c
+++ b/binutils/objcopy.c
@@ -96,6 +96,9 @@  static bfd_boolean preserve_dates;	/* Preserve input file timestamp.  */
 static int deterministic = -1;		/* Enable deterministic archives.  */
 static int status = 0;			/* Exit status.  */
 
+/* Remove section header.  */
+static bfd_boolean remove_section_header = FALSE;
+
 static bfd_boolean    merge_notes = FALSE;	/* Merge note sections.  */
 
 typedef struct merged_note_section
@@ -352,6 +355,7 @@  enum command_line_switch
   OPTION_REDEFINE_SYMS,
   OPTION_REMOVE_LEADING_CHAR,
   OPTION_REMOVE_RELOCS,
+  OPTION_REMOVE_SECTION_HEADER,
   OPTION_RENAME_SECTION,
   OPTION_REVERSE_BYTES,
   OPTION_PE_SECTION_ALIGNMENT,
@@ -399,6 +403,7 @@  static struct option strip_options[] =
   {"preserve-dates", no_argument, 0, 'p'},
   {"remove-section", required_argument, 0, 'R'},
   {"remove-relocations", required_argument, 0, OPTION_REMOVE_RELOCS},
+  {"remove-section-header", no_argument, 0, OPTION_REMOVE_SECTION_HEADER},
   {"strip-all", no_argument, 0, 's'},
   {"strip-debug", no_argument, 0, 'S'},
   {"strip-dwo", no_argument, 0, OPTION_STRIP_DWO},
@@ -487,6 +492,7 @@  static struct option copy_options[] =
   {"remove-leading-char", no_argument, 0, OPTION_REMOVE_LEADING_CHAR},
   {"remove-section", required_argument, 0, 'R'},
   {"remove-relocations", required_argument, 0, OPTION_REMOVE_RELOCS},
+  {"remove-section-header", no_argument, 0, OPTION_REMOVE_SECTION_HEADER},
   {"rename-section", required_argument, 0, OPTION_RENAME_SECTION},
   {"reverse-bytes", required_argument, 0, OPTION_REVERSE_BYTES},
   {"section-alignment", required_argument, 0, OPTION_PE_SECTION_ALIGNMENT},
@@ -582,6 +588,7 @@  copy_usage (FILE *stream, int exit_status)
      --add-gnu-debuglink=<file>    Add section .gnu_debuglink linking to <file>\n\
   -R --remove-section <name>       Remove section <name> from the output\n\
      --remove-relocations <name>   Remove relocations from section <name>\n\
+     --remove-section-header       Remove section header from the output\n\
   -S --strip-all                   Remove all symbol and relocation information\n\
   -g --strip-debug                 Remove all debugging symbols & sections\n\
      --strip-dwo                   Remove all DWO sections\n\
@@ -719,6 +726,7 @@  strip_usage (FILE *stream, int exit_status)
   fprintf (stream, _("\
   -R --remove-section=<name>       Also remove section <name> from the output\n\
      --remove-relocations <name>   Remove relocations from section <name>\n\
+     --remove-section-header       Remove section header from the output\n\
   -s --strip-all                   Remove all symbol and relocation information\n\
   -g -S -d --strip-debug           Remove all debugging symbols & sections\n\
      --strip-dwo                   Remove all DWO sections\n\
@@ -2583,7 +2591,7 @@  check_new_section_flags (flagword flags, bfd * abfd, const char * secname)
    Returns TRUE upon success, FALSE otherwise.  */
 
 static bfd_boolean
-copy_object (bfd *ibfd, bfd *obfd, const bfd_arch_info_type *input_arch)
+copy_object_1 (bfd *ibfd, bfd *obfd, const bfd_arch_info_type *input_arch)
 {
   bfd_vma start;
   long symcount;
@@ -2637,6 +2645,13 @@  copy_object (bfd *ibfd, bfd *obfd, const bfd_arch_info_type *input_arch)
 		     bfd_get_archive_filename (ibfd));
 	  return FALSE;
 	}
+
+      if (remove_section_header)
+	{
+	  non_fatal (_("--remove-section-header is unsupported on `%s'"),
+		     bfd_get_archive_filename (ibfd));
+	  return FALSE;
+	}
     }
 
   if (verbose)
@@ -3360,7 +3375,7 @@  copy_object (bfd *ibfd, bfd *obfd, const bfd_arch_info_type *input_arch)
 	  free (merged);
 	}
     }
-  else if (merge_notes && ! is_strip)
+  else if (merge_notes && ! is_strip && ! remove_section_header)
     non_fatal (_("%s: Could not find any mergeable note sections"),
 	       bfd_get_filename (ibfd));
 
@@ -3458,6 +3473,34 @@  copy_object (bfd *ibfd, bfd *obfd, const bfd_arch_info_type *input_arch)
   return TRUE;
 }
 
+/* Copy object file IBFD onto OBFD, preserving strip_symbols and
+   merge_notes, which may be changed by --remove-section-header.
+   Returns TRUE upon success, FALSE otherwise.  */
+
+static bfd_boolean
+copy_object (bfd *ibfd, bfd *obfd, const bfd_arch_info_type *input_arch)
+{
+  enum strip_action saved_strip_symbols = strip_symbols;
+  bfd_boolean saved_merge_notes = merge_notes;
+  bfd_boolean res;
+
+  if (bfd_get_flavour (ibfd) == bfd_target_elf_flavour
+      && remove_section_header
+      && (ibfd->flags & (EXEC_P | DYNAMIC)) != 0)
+    {
+      ibfd->flags |= BFD_NO_SECTION_HEADER;
+      strip_symbols = STRIP_ALL;
+      merge_notes = TRUE;
+    }
+
+  res = copy_object_1 (ibfd, obfd, input_arch);
+
+  strip_symbols = saved_strip_symbols;
+  merge_notes = saved_merge_notes;
+
+  return res;
+}
+
 /* Read each archive element in turn from IBFD, copy the
    contents to temp file, and keep the temp file handle.
    If 'force_output_target' is TRUE then make sure that
@@ -4671,6 +4714,9 @@  strip_main (int argc, char *argv[])
 	case OPTION_REMOVE_RELOCS:
 	  handle_remove_relocations_option (optarg);
 	  break;
+	case OPTION_REMOVE_SECTION_HEADER:
+	  remove_section_header = TRUE;
+	  break;
 	case 's':
 	  strip_symbols = STRIP_ALL;
 	  break;
@@ -5102,6 +5148,10 @@  copy_main (int argc, char *argv[])
 	  handle_remove_relocations_option (optarg);
 	  break;
 
+	case OPTION_REMOVE_SECTION_HEADER:
+	  remove_section_header = TRUE;
+	  break;
+
 	case 'S':
 	  strip_symbols = STRIP_ALL;
 	  break;
diff --git a/ld/NEWS b/ld/NEWS
index 7734d23d5b..79cad101a6 100644
--- a/ld/NEWS
+++ b/ld/NEWS
@@ -1,5 +1,8 @@ 
 -*- text -*-
 
+* Add command-line option, -z nosectionheader, to omit ELF section header
+  when building an executable or shared library.
+
 Changes in 2.34:
 
 * The ld check for "PHDR segment not covered by LOAD segment" is more
diff --git a/ld/emultempl/elf.em b/ld/emultempl/elf.em
index bb7e537530..1ce1af1236 100644
--- a/ld/emultempl/elf.em
+++ b/ld/emultempl/elf.em
@@ -752,6 +752,10 @@  fragment <<EOF
 	{
 	  link_info.flags_1 |= DF_1_GLOBAUDIT;
 	}
+      else if (strcmp (optarg, "sectionheader") == 0)
+	config.no_section_header = FALSE;
+      else if (strcmp (optarg, "nosectionheader") == 0)
+	config.no_section_header = TRUE;
 EOF
 
 if test x"$GENERATE_SHLIB_SCRIPT" = xyes; then
diff --git a/ld/ld.h b/ld/ld.h
index 71fd781267..e969263b3b 100644
--- a/ld/ld.h
+++ b/ld/ld.h
@@ -280,6 +280,9 @@  typedef struct
   /* If set, code and non-code sections should never be in one segment.  */
   bfd_boolean separate_code;
 
+  /* If set, generation of ELF section header should be suppressed.  */
+  bfd_boolean no_section_header;
+
   /* The rpath separation character.  Usually ':'.  */
   char rpath_separator;
 
diff --git a/ld/ld.texi b/ld/ld.texi
index 27343c798f..54ba174cc2 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -1295,6 +1295,12 @@  relocation, if supported.  Specifying @samp{common-page-size} smaller
 than the system page size will render this protection ineffective.
 Don't create an ELF @code{PT_GNU_RELRO} segment if @samp{norelro}.
 
+@item sectionheader
+@itemx nosectionheader
+Generate section header when building an executable or shared library.
+Don't generate section header if @samp{nosectionheader} is used.
+@option{sectionheader} is the default.
+
 @item separate-code
 @itemx noseparate-code
 Create separate code @code{PT_LOAD} segment header in the object.  This
diff --git a/ld/ldlang.c b/ld/ldlang.c
index 63f9d182ea..df187c21bf 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -3432,6 +3432,10 @@  ldlang_open_output (lang_statement_union_type *statement)
 	link_info.output_bfd->flags |= BFD_TRADITIONAL_FORMAT;
       else
 	link_info.output_bfd->flags &= ~BFD_TRADITIONAL_FORMAT;
+      if (config.no_section_header)
+	link_info.output_bfd->flags |= BFD_NO_SECTION_HEADER;
+      else
+	link_info.output_bfd->flags &= ~BFD_NO_SECTION_HEADER;
       break;
 
     case lang_target_statement_enum:
diff --git a/ld/lexsup.c b/ld/lexsup.c
index 3d15cc491d..621c8d6605 100644
--- a/ld/lexsup.c
+++ b/ld/lexsup.c
@@ -1676,6 +1676,14 @@  parse_args (unsigned argc, char **argv)
       break;
     }
 
+  if (config.no_section_header)
+    {
+      if (link_info.type == type_relocatable)
+	einfo (_("%F%P: -r and -z nosectionheader may not be used together\n"));
+      /* -z nosectionheader implies --strip-all.  */
+      link_info.strip = strip_all;
+    }
+
   if (!bfd_link_dll (&link_info))
     {
       if (command_line.filter_shlib)
@@ -1879,6 +1887,10 @@  elf_static_list_options (FILE *file)
   -z noexecstack              Mark executable as not requiring executable stack\n"));
   fprintf (file, _("\
   -z globalaudit              Mark executable requiring global auditing\n"));
+  fprintf (file, _("\
+  -z sectionheader            Generate section header (default)\n"));
+  fprintf (file, _("\
+  -z nosectionheader          Do not generate section header\n"));
 }
 
 static void