RFC: Disassembling from a symbol

Message ID ddb00f15-36fc-c452-d54f-0c834d681da5@redhat.com
State New

Commit Message

Nick Clifton Jan. 16, 2019, 12:03 p.m.
Hi Florian, Hi Guys,

  OK, I think that I have got a solution to this problem now.
  The attached patch updates objdump's --disassemble=<symbol>
  option so that if <symbol> is not a function symbol then
  disassembly stops at the next symbol encountered (i.e.
  exactly as it did before).  But if <symbol> is a function
  symbol then disassembly continues for the next N octets,
  where N is the size of <symbol>.  If <symbol> is a function
  symbol but not an ELF symbol (or its size is zero) then
  disassembly continues up to the next function symbol.

  Does this sound right to you guys?

Cheers
  Nick

Comments

Michael Matz Jan. 21, 2019, 3:52 p.m. | #1
Hi,

Ah, sorry for repeating this, but I answered on the other thread before
I had gone through all the mails.  Anyway, I don't like the check of the
symbol being a function symbol.  It can validly be a data array containing
instruction bytes or an unmarked symbol, and when someone says
--disassemble=foobar and then the tool merely doesn't do anything even
though 'foobar' is clearly a symbol per symtab, then that user will
rightfully roll their eyes at the unhelpfulness of objdump.  "Sure I could
disassemble this, but I won't, hahaha!".


Ciao,
Michael.


On Wed, 16 Jan 2019, Nick Clifton wrote:

> Hi Florian, Hi Guys,
>
>   OK, I think that I have got a solution to this problem now.
>   The attached patch updates objdump's --disassemble=<symbol>
>   option so that if <symbol> is not a function symbol then
>   disassembly stops at the next symbol encountered (i.e.
>   exactly as it did before).  But if <symbol> is a function
>   symbol then disassembly continues for the next N octets,
>   where N is the size of <symbol>.  If <symbol> is a function
>   symbol but not an ELF symbol (or its size is zero) then
>   disassembly continues up to the next function symbol.
>
>   Does this sound right to you guys?
Nick Clifton Jan. 21, 2019, 3:59 p.m. | #2
Hi Michael,

> Anyway, I don't like the check of the symbol
> being a function symbol.  It can validly be a data array containing
> instruction bytes or an unmarked symbol,


Ah sorry, I was not clear.  The symbol *can* be any kind of symbol.
It is just that for a non-function symbol disassembly will only continue
until the next symbol is encountered.  For a function symbol however,
disassembly will continue until the end of the function is reached, even
if other symbols are encountered on the way.

Does that make sense?

Cheers
  Nick

PS.  Just to be clear, the "next symbol" in this context means the next
symbol that objdump would display whilst disassembling.  Objdump
automatically skips over certain symbols, like the ARM mapping state
symbols and assembler local symbols.  So these would not be considered
as the "next symbol".
Michael Matz Jan. 21, 2019, 11:22 p.m. | #3
Hi,

On Mon, 21 Jan 2019, Nick Clifton wrote:

> > Anyway, I don't like the check of the symbol
> > being a function symbol.  It can validly be a data array containing
> > instruction bytes or an unmarked symbol,
>
> Ah sorry, I was not clear.  The symbol *can* be any kind of symbol.
> It is just that for a non-function symbol disassembly will only continue
> until the next symbol is encountered.  For a function symbol however,
> disassembly will continue until the end of the function is reached, even
> if other symbols are encountered on the way.
>
> Does that make sense?


Ah, thanks for the clarification; yes that makes sense.


Ciao,
Michael.
P.S: the restriction that only executable sections are disassembled could 
also be relaxed when a specific symbol is wanted, though that might be 
more hackery than it's worth in objdump.

Patch

2019-01-16  Nick Clifton  <nickc@redhat.com>

	* objdump.c (disassemble_section): When disassembling from a
	symbol only stop at the next symbol if the original symbol was not
	a function symbol.  Otherwise continue disassembling until a new
	function is reached.
	* testsuite/binutils-all/objdump.exp: Add tests of extended
	functionality.
	* testsuite/binutils-all/disasm.s: New test source file.

diff --git a/binutils/NEWS b/binutils/NEWS
index 56bd7400d2..0b8cc8e18b 100644
--- a/binutils/NEWS
+++ b/binutils/NEWS
@@ -13,7 +13,7 @@ 
 
 * Objdump's --disassemble option can now take a parameter, specifying the
   starting symbol for disassembly.  Disassembly will continue from this
-  symbol up to the next symbol.
+  symbol up to the next symbol or the end of the function.
 
 * The MIPS port now supports the Loongson 2K1000 processor which implements
   the MIPS64r2 ISA, the Loongson-mmi ASE, Loongson-cam ASE, Loongson-ext ASE,
diff --git a/binutils/doc/binutils.texi b/binutils/doc/binutils.texi
index 5664b9c7a5..49101888f5 100644
--- a/binutils/doc/binutils.texi
+++ b/binutils/doc/binutils.texi
@@ -2230,9 +2230,11 @@  with ctags tool.
 Display the assembler mnemonics for the machine instructions from the
 input file.  This option only disassembles those sections which are 
 expected to contain instructions.  If the optional @var{symbol}
-argument is given, then display the assembler mnemonics only from
-@var{symbol} up to next symbol.  If there are no matches for
-@var{symbol} then nothing will be displayed.
+argument is given, then display the assembler mnemonics starting at
+@var{symbol}.  If @var{symbol} is a function name then disassembly
+will stop at the end of the function, otherwise it will stop when the
+next symbol is encountered.  If there are no matches for @var{symbol}
+then nothing will be displayed.
 
 @item -D
 @itemx --disassemble-all
diff --git a/binutils/objdump.c b/binutils/objdump.c
index 2300a66a8a..872539068c 100644
--- a/binutils/objdump.c
+++ b/binutils/objdump.c
@@ -2211,6 +2211,13 @@  disassemble_section (bfd *abfd, asection *section, void *inf)
   long                         rel_count;
   bfd_vma                      rel_offset;
   unsigned long                addr_offset;
+  bfd_boolean                  do_print;
+  enum loop_control
+  {
+   stop_offset_reached,
+   function_sym,
+   next_sym
+  } loop_until;
 
   /* Sections that do not contain machine
      code are not normally disassembled.  */
@@ -2328,13 +2335,15 @@  disassemble_section (bfd *abfd, asection *section, void *inf)
      the symbol we have just found.  Then print the symbol and find the
      next symbol on.  Repeat until we have disassembled the entire section
      or we have reached the end of the address range we are interested in.  */
+  do_print = paux->symbol == NULL;
+  loop_until = stop_offset_reached;
+
   while (addr_offset < stop_offset)
     {
       bfd_vma addr;
       asymbol *nextsym;
       bfd_vma nextstop_offset;
       bfd_boolean insns;
-      bfd_boolean do_print = TRUE;
 
       addr = section->vma + addr_offset;
       addr = ((addr & ((sign_adjust << 1) - 1)) ^ sign_adjust) - sign_adjust;
@@ -2360,20 +2369,80 @@  disassemble_section (bfd *abfd, asection *section, void *inf)
 	  pinfo->symtab_pos = -1;
 	}
 
+      /* If we are only disassembling from a specific symbol,
+	 check to see if we should start or stop displaying.  */
       if (sym && paux->symbol)
 	{
-	  const char *name = bfd_asymbol_name (sym);
-	  char *alloc = NULL;
+	  if (do_print)
+	    {
+	      /* See if we should stop printing.  */
+	      switch (loop_until)
+		{
+		case function_sym:
+		  if (sym->flags & BSF_FUNCTION)
+		    do_print = FALSE;
+		  break;
 
-	  if (do_demangle && name[0] != '\0')
+		case stop_offset_reached:
+		  /* Handled by the while loop.  */
+		  break;
+
+		case next_sym:
+		  /* FIXME: There is an implicit assumption here
+		     that the name of sym is different from
+		     paux->symbol.  */
+		  if (! bfd_is_local_label (abfd, sym))
+		    do_print = FALSE;
+		  break;
+		}
+	    }
+	  else
 	    {
-	      /* Demangle the name.  */
-	      alloc = bfd_demangle (abfd, name, demangle_flags);
-	      if (alloc != NULL)
-		name = alloc;
+	      const char * name = bfd_asymbol_name (sym);
+	      char * alloc = NULL;
+
+	      if (do_demangle && name[0] != '\0')
+		{
+		  /* Demangle the name.  */
+		  alloc = bfd_demangle (abfd, name, demangle_flags);
+		  if (alloc != NULL)
+		    name = alloc;
+		}
+
+	      /* We are not currently printing.  Check to see
+		 if the current symbol matches the requested symbol.  */
+	      if (streq (name, paux->symbol))
+		{
+		  do_print = TRUE;
+
+		  if (sym->flags & BSF_FUNCTION)
+		    {
+		      if (bfd_get_flavour (abfd) == bfd_target_elf_flavour
+			  && ((elf_symbol_type *) sym)->internal_elf_sym.st_size > 0)
+			{
+			  /* Sym is a function symbol with a size associated
+			     with it.  Turn on automatic disassembly for the
+			     next VALUE bytes.  */
+			  stop_offset = addr_offset
+			    + ((elf_symbol_type *) sym)->internal_elf_sym.st_size;
+			  loop_until = stop_offset_reached;
+			}
+		      else
+			{
+			  /* Otherwise we need to tell the loop heuristic to
+			     loop until the next function symbol is encountered.  */
+			  loop_until = function_sym;
+			}
+		    }
+		  else
+		    {
+		      /* Otherwise loop until the next symbol is encountered.  */
+		      loop_until = next_sym;
+		    }
+		}
+
+	      free (alloc);
 	    }
-	  do_print = streq (name, paux->symbol);
-	  free (alloc);
 	}
 
       if (! prefix_addresses && do_print)
@@ -2438,13 +2507,9 @@  disassemble_section (bfd *abfd, asection *section, void *inf)
 	insns = FALSE;
 
       if (do_print)
-	{
-	  disassemble_bytes (pinfo, paux->disassemble_fn, insns, data,
-			     addr_offset, nextstop_offset,
-			     rel_offset, &rel_pp, rel_ppend);
-	  if (paux->symbol)
-	    break;
-	}
+	disassemble_bytes (pinfo, paux->disassemble_fn, insns, data,
+			   addr_offset, nextstop_offset,
+			   rel_offset, &rel_pp, rel_ppend);
 
       addr_offset = nextstop_offset;
       sym = nextsym;
diff --git a/binutils/testsuite/binutils-all/objdump.exp b/binutils/testsuite/binutils-all/objdump.exp
index 50c81ba3d5..dd2e9bb02d 100644
--- a/binutils/testsuite/binutils-all/objdump.exp
+++ b/binutils/testsuite/binutils-all/objdump.exp
@@ -62,7 +62,7 @@  if [regexp $want $got] then {
 
 
 if {![binutils_assemble $srcdir/$subdir/bintest.s tmpdir/bintest.o]} then {
-    fail "objdump (assembling)"
+    fail "objdump (assembling bintest.s)"
     return
 }
 if {![binutils_assemble $srcdir/$subdir/bintest.s tmpdir/bintest2.o]} then {
@@ -280,8 +280,95 @@  proc test_objdump_d_sym { testfile dumpfile } {
 }
 
 test_objdump_d_sym $testfile $testfile
-if { [ remote_file host exists $testarchive ] } then {
-    test_objdump_d_sym $testarchive bintest2.o
+
+proc test_objdump_d_func_sym { testfile dumpfile } {
+    global OBJDUMP
+    global OBJDUMPFLAGS
+
+    set got [binutils_run $OBJDUMP "$OBJDUMPFLAGS --disassemble=func --disassemble-zeroes $testfile"]
+
+    set want "$dumpfile:.*Disassembly of section"
+    if ![regexp $want $got] then {
+	fail "objdump --disassemble=func $testfile: No disassembly title"
+	return
+    }
+
+    set want "$dumpfile:.*00+0 <start_of_text>"
+    if [regexp $want $got] then {
+	fail "objdump --disassemble=func $testfile: First symbol displayed, when it should be absent"
+	return
+    }
+
+    set want "$dumpfile:.*00+. <func>"
+    if ![regexp $want $got] then {
+	fail "objdump --disassemble=func $testfile: Disassembly does not start at function symbol"
+	return
+    }
+
+    set want "$dumpfile:.*00+. <global_non_func_sym>"
+    if ![regexp $want $got] then {
+	fail "objdump --disassemble=func $testfile: Non function symbol not displayed"
+	return
+    }
+
+    set want "$dumpfile:.*00+. <next_func>"
+    if [regexp $want $got] then {
+	fail "objdump --disassemble=func $testfile: Disassembly did not stop at the next function"
+	return
+    }
+
+    pass "objdump --disassemble=func $testfile"
+}
+
+proc test_objdump_d_non_func_sym { testfile dumpfile } {
+    global OBJDUMP
+    global OBJDUMPFLAGS
+
+    set got [binutils_run $OBJDUMP "$OBJDUMPFLAGS --disassemble=global_non_func_sym $testfile"]
+
+    set want "$dumpfile:.*Disassembly of section"
+    if ![regexp $want $got] then {
+	fail "objdump --disassemble=non_func $testfile: No disassembly title"
+	return
+    }
+
+    set want "$dumpfile:.*00+0 <start_of_text>"
+    if [regexp $want $got] then {
+	fail "objdump --disassemble=non_func $testfile: First symbol displayed, when it should be absent"
+	return
+    }
+
+    set want "$dumpfile:.*00+. <global_non_func_sym>"
+    if ![regexp $want $got] then {
+	fail "objdump --disassemble=non_func $testfile: Non function symbol not displayed"
+	return
+    }
+
+    set want "$dumpfile:.*00+. <local_non_func_sym>"
+    if [regexp $want $got] then {
+	fail "objdump --disassemble=non_func $testfile: Disassembly did not stop at the next symbol"
+	return
+    }
+
+    pass "objdump --disassemble=non_func $testfile"
+}
+
+# Extra test for ELF format - check that --disassemble=func disassembles
+# all of func, and does not stop at the next symbol.
+if { [is_elf_format] } then {
+
+    if {![binutils_assemble $srcdir/$subdir/disasm.s tmpdir/disasm.o]} then {
+	fail "objdump --disassemble=func (assembling disasm.s)"
+    } else {
+	if [is_remote host] {
+	    set elftestfile [remote_download host tmpdir/disasm.o]
+	} else {
+	    set elftestfile tmpdir/disasm.o
+	}
+    
+	test_objdump_d_func_sym $elftestfile $elftestfile
+	test_objdump_d_non_func_sym $elftestfile $elftestfile
+    }
 }
 
 
--- /dev/null	2019-01-16 08:04:59.280999228 +0000
+++ binutils/testsuite/binutils-all/disasm.s	2019-01-16 10:40:12.272649563 +0000
@@ -0,0 +1,24 @@ 
+	.text
+	
+	.globl start_of_text
+start_of_text:
+	.type start_of_text, "function"
+	.long	1
+	.size start_of_text, . - start_of_text
+
+	.globl func
+func:
+	.type func, "function"
+	.long	2
+	.global global_non_func_sym
+global_non_func_sym:
+	.long	3
+local_non_func_sym:
+	.long	4
+	.size func, . - func
+
+	.globl next_func
+next_func:	
+	.type next_func, "function"
+	.long	5
+	.size next_func, . - next_func