RISC-V: enable lui relaxation for CODE and MERGE sections.

Message ID 36e6a5e5752cb65bc13667c50c3d105c@optimitech.com
State New
Headers show
Series
  • RISC-V: enable lui relaxation for CODE and MERGE sections.
Related show

Commit Message

Ilia Diachkov June 21, 2019, 6:58 p.m.
Hello,

This patch enables lui relaxation for CODE and MERGE sections on RISC-V. 
It helps to reduce code size.

The enabling exposed a linking time failure with a message like 
"relocation truncated to fit: R_RISCV_RVC_LUI against `.LC46'" during 
building of some large projects. The issue was caused by an error in 
symbol offset evaluation in merge sections on relaxation. It's also 
fixed in the patch.

Dejagnu testing on riscv-toolchain was passed for the patch. Also we 
have successfully built freedom-u-sdk, meta-openembedded and spec2006 
C/C++ tests by patched toolchain.

Please commit this patch since I have no write access.

Best regards,
Ilia.

bfd/ChangeLog:
	* elfnn-riscv.c (_bfd_riscv_relax_lui): enable relaxation for CODE and 
MERGE sections.
	(_bfd_riscv_relax_section): fix error in symbol offset evaluation in 
merge sections on relaxation.

Comments

Jim Wilson June 24, 2019, 8:55 p.m. | #1
On Fri, Jun 21, 2019 at 11:59 AM Ilia Diachkov
<ilia.diachkov@optimitech.com> wrote:
> This patch enables lui relaxation for CODE and MERGE sections on RISC-V.

> It helps to reduce code size.


I spotted two minor style issues.  An extra blank line where we had
the bfd_assert call during testing.  And a missing space before an
open paren.

I rewrote the ChangeLog entry.  It should describe the actual changes
that are being made, and not the effect of the change.  Also, it
should be sentences that start with capital letter.

I did a quick test to make sure I didn't accidentally break it and
then committed and pushed it.

Jim
Kito Cheng July 23, 2019, 6:40 a.m. | #2
Hi Illa:

I just found a case will got `relocation truncated to fit` error on
rv32 linux with this patch.

Here is the tarball for reproduce, included minimized sysroot and
necessary objects/libraries.

`make relax` can reproduce the problem and `make norelax` for disable
relax build.

https://drive.google.com/file/d/1l-IOiK3Ay7s2mnFWYXOkHYyPFNsIV58H/view?usp=sharing


On Tue, Jun 25, 2019 at 4:55 AM Jim Wilson <jimw@sifive.com> wrote:
>

> On Fri, Jun 21, 2019 at 11:59 AM Ilia Diachkov

> <ilia.diachkov@optimitech.com> wrote:

> > This patch enables lui relaxation for CODE and MERGE sections on RISC-V.

> > It helps to reduce code size.

>

> I spotted two minor style issues.  An extra blank line where we had

> the bfd_assert call during testing.  And a missing space before an

> open paren.

>

> I rewrote the ChangeLog entry.  It should describe the actual changes

> that are being made, and not the effect of the change.  Also, it

> should be sentences that start with capital letter.

>

> I did a quick test to make sure I didn't accidentally break it and

> then committed and pushed it.

>

> Jim
Kito Cheng July 23, 2019, 6:45 a.m. | #3
Hi Ilia:

Forgot to attend my binutils configuration and git hash:
../binutils-gdb/configure  --target=riscv32-unknown-linux-gnu

commit 89356123a17c27548c7e71f4f000b1f74e551c31
Refs: {origin/master}, {origin/HEAD},
users/ARM/embedded-gdb-master-2018q4-2078-g89356123a1
Author:     GDB Administrator <gdbadmin@sourceware.org>
AuthorDate: Tue Jul 23 00:00:36 2019 +0000
Commit:     GDB Admin <brobecker@adacore.com>
CommitDate: Tue Jul 23 00:00:36 2019 +0000

   Automatic date update in version.in

On Tue, Jul 23, 2019 at 2:40 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>

> Hi Illa:

>

> I just found a case will got `relocation truncated to fit` error on

> rv32 linux with this patch.

>

> Here is the tarball for reproduce, included minimized sysroot and

> necessary objects/libraries.

>

> `make relax` can reproduce the problem and `make norelax` for disable

> relax build.

>

> https://drive.google.com/file/d/1l-IOiK3Ay7s2mnFWYXOkHYyPFNsIV58H/view?usp=sharing

>

>

> On Tue, Jun 25, 2019 at 4:55 AM Jim Wilson <jimw@sifive.com> wrote:

> >

> > On Fri, Jun 21, 2019 at 11:59 AM Ilia Diachkov

> > <ilia.diachkov@optimitech.com> wrote:

> > > This patch enables lui relaxation for CODE and MERGE sections on RISC-V.

> > > It helps to reduce code size.

> >

> > I spotted two minor style issues.  An extra blank line where we had

> > the bfd_assert call during testing.  And a missing space before an

> > open paren.

> >

> > I rewrote the ChangeLog entry.  It should describe the actual changes

> > that are being made, and not the effect of the change.  Also, it

> > should be sentences that start with capital letter.

> >

> > I did a quick test to make sure I didn't accidentally break it and

> > then committed and pushed it.

> >

> > Jim
Ilia Diachkov July 31, 2019, 11:19 p.m. | #4
Hi Kito,

I have investigated the fail. It is triggered by my fix but the fix is 
definitely correct and the problem lays in the general relaxation 
mechanism and its implementation for riscv.

If you look to the end of _bfd_riscv_relax_lui, you'll find comments 
"Alignment might move the section forward;     account for this assuming 
page alignment at worst." and check "VALID_RVC_LUI_IMM 
(RISCV_CONST_HIGH_PART (symval + ELF_MAXPAGESIZE))". I.e. the range of 
addresses accepted by lui is reduced to avoid errors during relaxation. 
It's reasonable since current implementation of relaxation runs several 
passes and the first/second ones may give to a symbol the smallest 
address while the next/last one may increase it.

Briefly saying my investigation shown that the check must be more 
conservative in the case when we link objects with RELRO segment. That 
is because the code in lang_size_relro_segment_1 (ld/ldlang.c) tries to 
make the end of the segment aligned by page size. It achieves this 
(later) by increasing the addresses of sections belonging to the 
segment. So if you have a symbol which address is bigger then the 
address of the first section of RELRO segment, the symbol's address may 
be shifted forward by 1 page size by lang_size_relro_segment which is 
called during relaxation.

In the failed test case, address of .tm_clone_table section is loaded by 
a lui insn in .text section. At one moment during relaxation (it's pass 
0, see comments for _bfd_riscv_relax_section) the section gets address 
0x1e104. So it's Ok to convert that lui insn to c.lui. However later on 
pass 2 all sections are moved forward and the address becomes 0x1f0f4. 
It would be correct address for c.lui instruction. However RELRO segment 
is located before tm_clone_table section and later 
lang_size_relro_segment makes additional alignment of the segment's end 
by page size so tm_clone_table gets 0x20000 address which is not 
suitable for c.lui and breaks the relocation processing.

Thus we have to change the condition of c.lui->lui conversion. The 
simplest way I see is just to harden the worst assumption in the case we 
have RELRO segment. This patch is attached. It heals the failure. I have 
tested it by dejagnu with both eabi and linux toolchains and have 
checked freedom-u-sdk build. Also I have verified on tests (EEMBC 
automotive 1.1/2.0, SPEC2006 C/C++, riscv-beebs benchmarks) that the fix 
does not affect code size with eabi toolchain.

Since the symval is positive I'm not sure whether we really need to 
check this condition in elfnn-riscv.c twice:
       && VALID_RVC_LUI_IMM (RISCV_CONST_HIGH_PART (symval))
       && VALID_RVC_LUI_IMM (RISCV_CONST_HIGH_PART (symval + 
ELF_MAXPAGESIZE))
so I leaved the stricter check.

The more complex approach may include address comparison of RELRO 
segment (its first section) and relocatable symbol and if the symbol 
address is bigger then use stricter conditions on the symbol address.

Best regards,
Ilia.

bfd/ChangeLog:
     * elfnn-riscv.c (_bfd_riscv_relax_lui): set lui relax safety area to 
two pages in relro presence.

Kito Cheng wrote 2019-07-23 09:45:
> Hi Ilia:

> 

> Forgot to attend my binutils configuration and git hash:

> ../binutils-gdb/configure  --target=riscv32-unknown-linux-gnu

> 

> commit 89356123a17c27548c7e71f4f000b1f74e551c31

> Refs: {origin/master}, {origin/HEAD},

> users/ARM/embedded-gdb-master-2018q4-2078-g89356123a1

> Author:     GDB Administrator <gdbadmin@sourceware.org>

> AuthorDate: Tue Jul 23 00:00:36 2019 +0000

> Commit:     GDB Admin <brobecker@adacore.com>

> CommitDate: Tue Jul 23 00:00:36 2019 +0000

> 

>    Automatic date update in version.in

> 

> On Tue, Jul 23, 2019 at 2:40 PM Kito Cheng <kito.cheng@gmail.com> 

> wrote:

>> 

>> Hi Illa:

>> 

>> I just found a case will got `relocation truncated to fit` error on

>> rv32 linux with this patch.

>> 

>> Here is the tarball for reproduce, included minimized sysroot and

>> necessary objects/libraries.

>> 

>> `make relax` can reproduce the problem and `make norelax` for disable

>> relax build.

>> 

>> https://drive.google.com/file/d/1l-IOiK3Ay7s2mnFWYXOkHYyPFNsIV58H/view?usp=sharing

>> 

>> 

>> On Tue, Jun 25, 2019 at 4:55 AM Jim Wilson <jimw@sifive.com> wrote:

>> >

>> > On Fri, Jun 21, 2019 at 11:59 AM Ilia Diachkov

>> > <ilia.diachkov@optimitech.com> wrote:

>> > > This patch enables lui relaxation for CODE and MERGE sections on RISC-V.

>> > > It helps to reduce code size.

>> >

>> > I spotted two minor style issues.  An extra blank line where we had

>> > the bfd_assert call during testing.  And a missing space before an

>> > open paren.

>> >

>> > I rewrote the ChangeLog entry.  It should describe the actual changes

>> > that are being made, and not the effect of the change.  Also, it

>> > should be sentences that start with capital letter.

>> >

>> > I did a quick test to make sure I didn't accidentally break it and

>> > then committed and pushed it.

>> >

>> > Jim
---
 bfd/elfnn-riscv.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 003b4f8..e618477 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -3562,11 +3562,14 @@ _bfd_riscv_relax_lui (bfd *abfd,
     }
 
   /* Can we relax LUI to C.LUI?  Alignment might move the section forward;
-     account for this assuming page alignment at worst.  */
+     account for this assuming page alignment at worst. In the presence of 
+     RELRO segment the linker aligns it by one page size, therefore sections
+     after the segment can be moved more than one page. */
+  symval += link_info->relro ? 2 * ELF_MAXPAGESIZE : ELF_MAXPAGESIZE;
+
   if (use_rvc
       && ELFNN_R_TYPE (rel->r_info) == R_RISCV_HI20
-      && VALID_RVC_LUI_IMM (RISCV_CONST_HIGH_PART (symval))
-      && VALID_RVC_LUI_IMM (RISCV_CONST_HIGH_PART (symval + ELF_MAXPAGESIZE)))
+      && VALID_RVC_LUI_IMM (RISCV_CONST_HIGH_PART (symval)))
     {
       /* Replace LUI with C.LUI if legal (i.e., rd != x0 and rd != x2/sp).  */
       bfd_vma lui = bfd_get_32 (abfd, contents + rel->r_offset);
Jim Wilson Aug. 1, 2019, 11:42 p.m. | #5
On Wed, Jul 31, 2019 at 4:19 PM Ilia Diachkov
<ilia.diachkov@optimitech.com> wrote:
> Briefly saying my investigation shown that the check must be more

> conservative in the case when we link objects with RELRO segment. That

> is because the code in lang_size_relro_segment_1 (ld/ldlang.c) tries to

> make the end of the segment aligned by page size. It achieves this

> (later) by increasing the addresses of sections belonging to the

> segment. So if you have a symbol which address is bigger then the

> address of the first section of RELRO segment, the symbol's address may

> be shifted forward by 1 page size by lang_size_relro_segment which is

> called during relaxation.


I think it is a little more complicated than this.  I think the linker
is getting confused by some padding alignment.  The linker is trying
to align the data sections to start at a page boundary.  And it is
trying to align the end of the relro sections to a page boundary.  In
theory this should just be one alignment as they are adjacent.  But
along the way some alignment padding bytes end up between the relro
sections and the data sections, and this seems to confuse the linker
into doing two alignments.  In one pass, it aligns the start of the
data section.  Then in the next pass, it notices that the relro
section ends 0xf4 bytes before a page boundary because of padding
bytes and aligns it, putting the data sections 0xf4 bytes after a page
boundary.  Then it has to align the data sections again in the same
pass.  When we come around the next time, it seems to properly handle
the padding bytes and correctly only does one alignment.

relro is linux specific, and it is the embedded developers that care
the most about code size, so I think increasing the alignment check to
two pages with relro is fine.

> Since the symval is positive I'm not sure whether we really need to

> check this condition in elfnn-riscv.c twice:

>        && VALID_RVC_LUI_IMM (RISCV_CONST_HIGH_PART (symval))

>        && VALID_RVC_LUI_IMM (RISCV_CONST_HIGH_PART (symval +

> ELF_MAXPAGESIZE))

> so I leaved the stricter check.


On second thought, I don't think that this is right.  c.lui can load
0xfffe0000 to 0xfffff000 and 0x00001000 to 0x0001f000.  So there is a
problem if the symval is in the zero page and adding the offset places
it after 0x00001000.  And there is a problem if symval is before
0xfffe0000 and adding the offset puts if after.  So I think we still
need the two tests.

Otherwise this looks OK to me.  I committed a slightly modified form
of the patch that keeps the two symval checks.

Jim

Patch

---
 bfd/elfnn-riscv.c | 57 +++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 47 insertions(+), 10 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 1bddbca..e26da5d 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -3517,10 +3517,6 @@  _bfd_riscv_relax_lui (bfd *abfd,
   bfd_vma gp = riscv_global_pointer_value (link_info);
   int use_rvc = elf_elfheader (abfd)->e_flags & EF_RISCV_RVC;
 
-  /* Mergeable symbols and code might later move out of range.  */
-  if (sym_sec->flags & (SEC_MERGE | SEC_CODE))
-    return TRUE;
-
   BFD_ASSERT (rel->r_offset + 4 <= sec->size);
 
   if (gp)
@@ -3881,6 +3877,7 @@  _bfd_riscv_relax_section (bfd *abfd, asection *sec,
       relax_func_t relax_func;
       int type = ELFNN_R_TYPE (rel->r_info);
       bfd_vma symval;
+      char symtype;
 
       relax_func = NULL;
       if (info->relax_pass == 0)
@@ -3946,7 +3943,7 @@  _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	    ? 0 : isym->st_size - rel->r_addend;
 
 	  if (isym->st_shndx == SHN_UNDEF)
-	    sym_sec = sec, symval = sec_addr (sec) + rel->r_offset;
+	    sym_sec = sec, symval = rel->r_offset;
 	  else
 	    {
 	      BFD_ASSERT (isym->st_shndx < elf_numsections (abfd));
@@ -3959,8 +3956,9 @@  _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	      if (sec_addr (sym_sec) == 0)
 		continue;
 #endif
-	      symval = sec_addr (sym_sec) + isym->st_value;
+	      symval = isym->st_value;
 	    }
+	  symtype = ELF_ST_TYPE (isym->st_info);
 	}
       else
 	{
@@ -3975,21 +3973,60 @@  _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	    h = (struct elf_link_hash_entry *) h->root.u.i.link;
 
 	  if (h->plt.offset != MINUS_ONE)
-	    symval = sec_addr (htab->elf.splt) + h->plt.offset;
+	    {
+	      sym_sec = htab->elf.splt;
+	      symval = h->plt.offset;
+	    }
 	  else if (h->root.u.def.section->output_section == NULL
 		   || (h->root.type != bfd_link_hash_defined
 		       && h->root.type != bfd_link_hash_defweak))
 	    continue;
 	  else
-	    symval = sec_addr (h->root.u.def.section) + h->root.u.def.value;
+	    {
+	      symval = h->root.u.def.value;
+	      sym_sec = h->root.u.def.section;
+	    }
 
 	  if (h->type != STT_FUNC)
 	    reserve_size =
 	      (h->size - rel->r_addend) > h->size ? 0 : h->size - rel->r_addend;
-	  sym_sec = h->root.u.def.section;
+	  symtype = h->type;
 	}
 
-      symval += rel->r_addend;
+      if (sym_sec->sec_info_type == SEC_INFO_TYPE_MERGE
+          && (sym_sec->flags & SEC_MERGE))
+	{
+
+	  /* At this stage in linking, no SEC_MERGE symbol has been
+	     adjusted, so all references to such symbols need to be
+	     passed through _bfd_merged_section_offset.  (Later, in
+	     relocate_section, all SEC_MERGE symbols *except* for
+	     section symbols have been adjusted.)
+
+	     gas may reduce relocations against symbols in SEC_MERGE
+	     sections to a relocation against the section symbol when
+	     the original addend was zero.  When the reloc is against
+	     a section symbol we should include the addend in the
+	     offset passed to _bfd_merged_section_offset, since the
+	     location of interest is the original symbol.  On the
+	     other hand, an access to "sym+addend" where "sym" is not
+	     a section symbol should not include the addend;  Such an
+	     access is presumed to be an offset from "sym";  The
+	     location of interest is just "sym".  */
+	   if (symtype == STT_SECTION)
+	     symval += rel->r_addend;
+
+	   symval = _bfd_merged_section_offset (abfd, &sym_sec,
+						elf_section_data (sym_sec)->sec_info,
+						symval);
+
+	   if (symtype != STT_SECTION)
+	     symval += rel->r_addend;
+	}
+      else
+	symval += rel->r_addend;
+
+      symval += sec_addr(sym_sec);
 
       if (!relax_func (abfd, sec, sym_sec, info, rel, symval,
 		       max_alignment, reserve_size, again,
-- 
1.8.3.1