[PATCHv5] Fix range end handling of inlined subroutines

Message ID AM6PR03MB51707FDDD28177A2B85DBE3BE4C40@AM6PR03MB5170.eurprd03.prod.outlook.com
State New
Headers show
Series
  • [PATCHv5] Fix range end handling of inlined subroutines
Related show

Commit Message

Bernd Edlinger April 4, 2020, 7:50 p.m.
Hi,

this is an updated version of my patch that was originally only intended
to fix the issues with inline functions in the same file.  That needed
re-basing anyway, because of a merge conflict.

I removed the code that does the special handling of end sequence
markers in record_line now, since it seems to cause more problems than
it solves.

I believe it will fix the regression that Louis pointed out, and
should fix the regression that Andrew wanted to fix with his
patch:

[PATCH 2/2] gdb: Preserve is-stmt lines when switch between files
https://marc.info/?l=gdb-patches&m=158595247916817&w=2

So I hope that will not be necessary after this.

There is a theoretic issue with line numbers at the end
of a function, these could coincidentally have the same
PC as the following function.  That might need more work
to solve that, in the moment I have not yet looked at
tracking that down.


Thanks
Bernd.

Comments

Andrew Burgess April 4, 2020, 10:07 p.m. | #1
* Bernd Edlinger <bernd.edlinger@hotmail.de> [2020-04-04 21:50:21 +0200]:

> Hi,

> 

> this is an updated version of my patch that was originally only intended

> to fix the issues with inline functions in the same file.  That needed

> re-basing anyway, because of a merge conflict.

> 

> I removed the code that does the special handling of end sequence

> markers in record_line now, since it seems to cause more problems than

> it solves.

> 

> I believe it will fix the regression that Louis pointed out,


Do you have a link or any extra information for this?  I'd like to
understand what the actual regression was.  I noticed there didn't
seem to be any extra tests added over previous versions of the patch,
is this regression covered by an existing test?

>                                                               and

> should fix the regression that Andrew wanted to fix with his

> patch:


> 

> [PATCH 2/2] gdb: Preserve is-stmt lines when switch between files

> https://marc.info/?l=gdb-patches&m=158595247916817&w=2

> 

> So I hope that will not be necessary after this.


It would be nice if you picked up the extra tests from that patch and
included them here if this is a replacement solution.

> 

> There is a theoretic issue with line numbers at the end

> of a function, these could coincidentally have the same

> PC as the following function.  That might need more work

> to solve that, in the moment I have not yet looked at

> tracking that down.

> 

> 

> Thanks

> Bernd.


> From 330eadf4b42e44bfa82c30a6bda21393fa4a54c8 Mon Sep 17 00:00:00 2001

> From: Bernd Edlinger <bernd.edlinger@hotmail.de>

> Date: Sun, 9 Feb 2020 21:13:17 +0100

> Subject: [PATCH] Fix range end handling of inlined subroutines

> 

> Since the is_stmt is now handled, it becomes

> possible to locate dubious is_stmt line entries

> at the end of an inlined function, even if the

> called inline function is in the same subfile.

> 

> When there is a sequence of line entries at the

> same address where an inline range ends, and the

> last item has is_stmt = 0, we force all previous

> items to have is_stmt = 0 as well.

> 

> If the last line at that address has is_stmt = 1,

> there is no need to change anything, since a step

> over will always stop at that last line from the

> same address, which is fine, since it is outside

> the subroutine.

> 

> With this change we loose the ability to set

> breakpoints on some lines using file:line syntax,

> but the data is not completely lost, as the

> line table is still holding those lines, just

> with the is_stmt flag reset.


This is a trade off that I don't think we should breeze over quite so
easily.  Many, many users make use of an IDE to drive GDB, and in
those cases they mostly use file:line syntax exclusively, so any loss
in this area is something we should consider seriously.

> 

> This is necessary as breakpoints on these lines

> are problematic, as the call stack is often

> wrong.


Maybe.  Yes I absolutely agree that in the test case, when we stop at
these line numbers, the call stack is wrong.  But, as GDB developers,
we can have a slightly different take on things.  In this particular
case the debug information we're supplied is just plain wrong.  This
is documented in this GCC bug:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474

( Bernd, I know you've already seen that, I include the link for anyone else
  following this thread. )

The lines that are part of the inline subroutine, in this test are
marked as is-stmt _and_ are outside the range of the inline
subroutine.  So the call stack GDB presents is not wrong w.r.t the
debug information we're given, it's the debug information that is
wrong in this case.

Now, there are many places where GDB tries to work around broken debug
information, but I think that in this case we give up too much.  I
would prefer to see if GCC can fix their DWARF creation first before
we take something like this.

>        From the block info they appear to be

> located in the caller, but they are actually meant

> to be part of the subroutine, therefore usually one

> frame is missing from the callstack when the

> execution stops there.


The lines shouldn't be marked as is-stmt in the location they are, or
GCC should extend the range slightly.

> 

> This is about the best we can do at the moment,

> unless location view information are added to the

> block ranges debug info structure, and location

> views are implemented in gdb in general.


Please do correct me if I'm wrong, but I don't think GCC can yet
produce view information for range endings, right?

I agree that the best solution here would be to have location views
for range start (GCC can do this) and range end (I think this is
missing), then scopes can merge over within a single address.

> 

> gdb:

> 2020-04-04  Bernd Edlinger  <bernd.edlinger@hotmail.de>

> 

> 	* buildsym.c (buildsym_compunit::record_line): Remove line deletion

> 	at end sequence marker.

> 	(buildsym_compunit::record_inline_range_end,

> 	patch_inline_end_pos): New helper functions.

> 	(buildsym_compunit::end_symtab_with_blockvector): Patch line table.

> 	(buildsym_compunit::~buildsym_compunit): Cleanup m_inline_end_vector.

> 	* buildsym.h (buildsym_compunit::record_inline_range_end): Declare.

> 	(buildsym_compunit::m_inline_end_vector,

> 	buildsym_compunit::m_inline_end_vector_length,

> 	buildsym_compunit::m_inline_end_vector_nitems): New data items.

> 	* dwarf2/read.c (dwarf2_rnglists_process,

> 	dwarf2_ranges_process): Don't ignore empty ranges here.

> 	(dwarf2_ranges_read): Ignore empty ranges here.

> 	(dwarf2_record_block_ranges): Pass end of range PC to

> 	record_inline_range_end for inline functions.

> 

> gdb/testsuite:

> 2020-04-04  Bernd Edlinger  <bernd.edlinger@hotmail.de>

> 

> 	* gdb.cp/step-and-next-inline.exp: Adjust test.

> ---

>  gdb/buildsym.c                                | 115 ++++++++++++++++++++------

>  gdb/buildsym.h                                |  11 +++

>  gdb/dwarf2/read.c                             |  22 +++--

>  gdb/testsuite/gdb.cp/step-and-next-inline.exp |  17 ----

>  4 files changed, 114 insertions(+), 51 deletions(-)

> 

> diff --git a/gdb/buildsym.c b/gdb/buildsym.c

> index fe07103..e6e7437 100644

> --- a/gdb/buildsym.c

> +++ b/gdb/buildsym.c

> @@ -113,6 +113,8 @@ struct pending_block

>        next1 = next->next;

>        xfree ((void *) next);

>      }

> +

> +  xfree (m_inline_end_vector);

>  }

>  

>  struct macro_table *

> @@ -691,31 +693,6 @@ struct blockvector *

>  		      * sizeof (struct linetable_entry))));

>      }

>  

> -  /* The end of sequence marker is special.  We need to reset the

> -     is_stmt flag on previous lines at the same PC, otherwise these

> -     lines may cause problems since they might be at the same address

> -     as the following function.  For instance suppose a function calls

> -     abort there is no reason to emit a ret after that point (no joke).

> -     So the label may be at the same address where the following

> -     function begins.  A similar problem appears if a label is at the

> -     same address where an inline function ends we cannot reliably tell

> -     if this is considered part of the inline function or the calling

> -     program or even the next inline function, so stack traces may

> -     give surprising results.  Expect gdb.cp/step-and-next-inline.exp

> -     to fail if these lines are not modified here.  */

> -  if (line == 0 && subfile->line_vector->nitems > 0)

> -    {

> -      e = subfile->line_vector->item + subfile->line_vector->nitems;

> -      do

> -	{

> -	  e--;

> -	  if (e->pc != pc || e->line == 0)

> -	    break;

> -	  e->is_stmt = 0;

> -	}

> -      while (e > subfile->line_vector->item);

> -    }

> -


I'm slightly nervous with this proposed change.  Code like this
existed before we started messing with is-stmt support, and I suspect
it probably dates from a time before testing was as good as it is
now.  I assume nothing regresses with this removed, but it would be
nice to try a few experiments I think with different combinations of
two files, empty line ranges, and having is-stmt before and after the
transition between files, just to confirm that we really don't need
this code any more.  I suspect this is the "line numbers at the end of
a function" issue you mentioned in your introductory text.

That said, I see how removing this provides an alternative solution to
the patch I posted.

>    e = subfile->line_vector->item + subfile->line_vector->nitems++;

>    e->line = line;

>    e->is_stmt = is_stmt ? 1 : 0;

> @@ -723,6 +700,90 @@ struct blockvector *

>  }

>  

>  

> +/* Record a PC where a inlined subroutine ends.  */

> +

> +void

> +buildsym_compunit::record_inline_range_end (CORE_ADDR end)

> +{

> +  /* The performance of this function is very important,

> +     it shall be O(n*log(n)) therefore we do not use std::vector

> +     here since some compilers, e.g. visual studio, do not

> +     guarantee that for vector::push_back.  */


I think we're going to need more of a group discussion on this.
Simply saying std::vector is too slow doesn't I'm afraid convince me.
There seem to be lots of other places in GDB where both performance is
critical, and we use ::push_back.

Is your claim that we should move away from std::vector in all these
cases?  Is this case somehow special?

Obviously I've made this point before, and you're sticking to your
position, which is fine, but I can't approve this patch without seeing
a more compelling argument against std::vector.  That said, if some
other maintainer is convinced, they're welcome to approve it.  Either
way, the comment would be better placed at the declaration of the
variable, rather than here I think.

> +  if (m_inline_end_vector == nullptr)

> +    {

> +      m_inline_end_vector_length = INITIAL_LINE_VECTOR_LENGTH;

> +      m_inline_end_vector = (CORE_ADDR *)

> +	xmalloc (sizeof (CORE_ADDR) * m_inline_end_vector_length);

> +      m_inline_end_vector_nitems = 0;

> +    }

> +  else if (m_inline_end_vector_nitems == m_inline_end_vector_length)

> +    {

> +      m_inline_end_vector_length *= 2;

> +      m_inline_end_vector = (CORE_ADDR *)

> +	xrealloc ((char *) m_inline_end_vector,

> +		  sizeof (CORE_ADDR) * m_inline_end_vector_length);

> +    }

> +

> +  m_inline_end_vector[m_inline_end_vector_nitems++] = end;

> +}

> +

> +

> +/* Patch the is_stmt bits at the given inline end address.

> +   The line table has to be already sorted.  */

> +

> +static void

> +patch_inline_end_pos (struct linetable *table, CORE_ADDR end)

> +{

> +  int a = 2, b = table->nitems - 1;

> +  struct linetable_entry *items = table->item;

> +

> +  /* We need at least two items with pc = end in the table.

> +     The lowest usable items are at pos 0 and 1, the highest

> +     usable items are at pos b - 2 and b - 1.  */

> +  if (a > b || end < items[1].pc || end > items[b - 2].pc)

> +    return;

> +

> +  /* Look for the first item with pc > end in the range [a,b].

> +     The previous element has pc = end or there is no match.

> +     We set a = 2, since we need at least two consecutive elements

> +     with pc = end to do anything useful.

> +     We set b = nitems - 1, since we are not interested in the last

> +     element which should be an end of sequence marker with line = 0

> +     and is_stmt = 1.  */

> +  while (a < b)

> +    {

> +      int c = (a + b) / 2;

> +

> +      if (end < items[c].pc)

> +	b = c;

> +      else

> +	a = c + 1;

> +    }

> +

> +  a--;

> +  if (items[a].pc != end || items[a].is_stmt)

> +    return;

> +

> +  /* When there is a sequence of line entries at the same address

> +     where an inline range ends, and the last item has is_stmt = 0,

> +     we force all previous items to have is_stmt = 0 as well.

> +     Setting breakpoints at those addresses is currently not

> +     supported, since it is unclear if the previous addresses are

> +     part of the subroutine or the calling program.  */

> +  do

> +    {

> +      /* We stop at the first line entry with a different address,

> +	 or when we see an end of sequence marker.  */

> +      a--;

> +      if (items[a].pc != end || items[a].line == 0)

> +	break;

> +

> +      items[a].is_stmt = 0;

> +    }

> +  while (a > 0);

> +}

> +

> +

>  /* Subroutine of end_symtab to simplify it.  Look for a subfile that

>     matches the main source file's basename.  If there is only one, and

>     if the main source file doesn't have any symbol or line number

> @@ -956,6 +1017,10 @@ struct compunit_symtab *

>  			      subfile->line_vector->item

>  			      + subfile->line_vector->nitems,

>  			      lte_is_less_than);

> +

> +	   for (int i = 0; i < m_inline_end_vector_nitems; i++)

> +	     patch_inline_end_pos (subfile->line_vector,

> +				   m_inline_end_vector[i]);

>  	}

>  

>        /* Allocate a symbol table if necessary.  */

> diff --git a/gdb/buildsym.h b/gdb/buildsym.h

> index c768a4c..2845789 100644

> --- a/gdb/buildsym.h

> +++ b/gdb/buildsym.h

> @@ -190,6 +190,8 @@ struct buildsym_compunit

>    void record_line (struct subfile *subfile, int line, CORE_ADDR pc,

>  		    bool is_stmt);

>  

> +  void record_inline_range_end (CORE_ADDR end);

> +

>    struct compunit_symtab *get_compunit_symtab ()

>    {

>      return m_compunit_symtab;

> @@ -397,6 +399,15 @@ struct buildsym_compunit

>  

>    /* Pending symbols that are local to the lexical context.  */

>    struct pending *m_local_symbols = nullptr;

> +

> +  /* Pending inline end range addresses.  */

> +  CORE_ADDR * m_inline_end_vector = nullptr;

> +

> +  /* Number of allocated inline end range addresses.  */

> +  int m_inline_end_vector_length = 0;

> +

> +  /* Number of pending inline end range addresses.  */

> +  int m_inline_end_vector_nitems = 0;

>  };

>  

>  

> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c

> index bcc3116..321de93 100644

> --- a/gdb/dwarf2/read.c

> +++ b/gdb/dwarf2/read.c

> @@ -13527,10 +13527,6 @@ class process_die_scope

>  	  return false;

>  	}

>  

> -      /* Empty range entries have no effect.  */

> -      if (range_beginning == range_end)

> -	continue;


I don't think we should be doing this.  This is defined quite clearly
in the DWARF spec as being an empty range.  No code is associated with
this range.  As such, it really shouldn't have an impact on how we
interpret the rest of the DWARF.

Again, I think you're trying too hard to work around GCC's broken
DWARF output.

> -

>        range_beginning += *base;

>        range_end += *base;

>  

> @@ -13638,10 +13634,6 @@ class process_die_scope

>  	  return 0;

>  	}

>  

> -      /* Empty range entries have no effect.  */

> -      if (range_beginning == range_end)

> -	continue;


Same.

> -

>        range_beginning += *base;

>        range_end += *base;

>  

> @@ -13681,6 +13673,10 @@ class process_die_scope

>    retval = dwarf2_ranges_process (offset, cu,

>      [&] (CORE_ADDR range_beginning, CORE_ADDR range_end)

>      {

> +      /* Empty range entries have no effect.  */

> +      if (range_beginning == range_end)

> +	return;


Same.

> +

>        if (ranges_pst != NULL)

>  	{

>  	  CORE_ADDR lowpc;

> @@ -13918,6 +13914,7 @@ class process_die_scope

>    struct gdbarch *gdbarch = get_objfile_arch (objfile);

>    struct attribute *attr;

>    struct attribute *attr_high;

> +  bool inlined_subroutine = (die->tag == DW_TAG_inlined_subroutine);

>  

>    attr_high = dwarf2_attr (die, DW_AT_high_pc, cu);

>    if (attr_high)

> @@ -13933,7 +13930,10 @@ class process_die_scope

>  

>  	  low = gdbarch_adjust_dwarf2_addr (gdbarch, low + baseaddr);

>  	  high = gdbarch_adjust_dwarf2_addr (gdbarch, high + baseaddr);

> -	  cu->get_builder ()->record_block_range (block, low, high - 1);

> +	  if (inlined_subroutine)

> +	    cu->get_builder ()->record_inline_range_end (high);

> +	  if (low < high)

> +	    cu->get_builder ()->record_block_range (block, low, high - 1);

>          }

>      }

>  

> @@ -13958,6 +13958,10 @@ class process_die_scope

>  	  end += baseaddr;

>  	  start = gdbarch_adjust_dwarf2_addr (gdbarch, start);

>  	  end = gdbarch_adjust_dwarf2_addr (gdbarch, end);

> +	  if (inlined_subroutine)

> +	    cu->get_builder ()->record_inline_range_end (end);

> +	  if (start == end)

> +	    return;

>  	  cu->get_builder ()->record_block_range (block, start, end - 1);

>  	  blockvec.emplace_back (start, end);

>  	});

> diff --git a/gdb/testsuite/gdb.cp/step-and-next-inline.exp b/gdb/testsuite/gdb.cp/step-and-next-inline.exp

> index 3733fa7..e3a5793 100644

> --- a/gdb/testsuite/gdb.cp/step-and-next-inline.exp

> +++ b/gdb/testsuite/gdb.cp/step-and-next-inline.exp

> @@ -52,37 +52,20 @@ proc do_test { use_header } {

>      gdb_test "step" ".*" "step into get_alias_set"

>      gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \

>  	"not in inline 1"

> -    # It's possible that this first failure (when not using a header

> -    # file) is GCC's fault, though the remaining failures would best

> -    # be fixed by adding location views support (though it could be

> -    # that some easier heuristic could be figured out).  Still, it is

> -    # not certain that the first failure wouldn't also be fixed by

> -    # having location view support, so for now it is tagged as such.

> -    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }

>      gdb_test "next" ".*TREE_TYPE.*" "next step 1"

>      gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \

>  	"not in inline 2"

>      gdb_test "next" ".*TREE_TYPE.*" "next step 2"

>      gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \

>  	"not in inline 3"

> -    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }

>      gdb_test "next" ".*TREE_TYPE.*" "next step 3"

>      gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \

>  	"not in inline 4"

> -    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }

>      gdb_test "next" "return 0.*" "next step 4"

>      gdb_test "bt" \

>  	"\\s*\\#0\\s+(main|get_alias_set)\[^\r\]*${srcfile}:.*" \

>  	"not in inline 5"

>  

> -    if {!$use_header} {

> -	# With the debug from GCC 10.x (and earlier) GDB is currently

> -	# unable to successfully complete the following tests when we

> -	# are not using a header file.

> -	kfail symtab/25507 "stepping tests"

> -	return

> -    }

> -


In summary:

If we can convince ourselves that the removal of the special END
handling block really is OK, then I'm happy for _that_ part of this
patch to replace my proposed patch.  Does this part have to be bundled
with the second part?

I don't think the rest of the patch should be merged at the moment.  I
think we should wait to see if GCC can fix their broken DWARF
production.  If this issue is resolved, then possibly we could
maintain this, or some variation of this guarded with a producer
check, so only known bad versions of GCC are modified.  But I don't
think we should assume that all producers get their range information
wrong.

Thanks,
Andrew
Keith Seitz via Gdb-patches April 4, 2020, 10:27 p.m. | #2
Hi,

On 4/4/20 4:50 PM, Bernd Edlinger wrote:
> Hi,

> 

> this is an updated version of my patch that was originally only intended

> to fix the issues with inline functions in the same file.  That needed

> re-basing anyway, because of a merge conflict.

> 

> I removed the code that does the special handling of end sequence

> markers in record_line now, since it seems to cause more problems than

> it solves.

> 

> I believe it will fix the regression that Louis pointed out, and

> should fix the regression that Andrew wanted to fix with his

> patch:


I gave this a try and the FAIL's are still there unfortunately.
Keith Seitz via Gdb-patches April 4, 2020, 10:30 p.m. | #3
On 4/4/20 7:27 PM, Luis Machado wrote:
> Hi,

> 

> On 4/4/20 4:50 PM, Bernd Edlinger wrote:

>> Hi,

>>

>> this is an updated version of my patch that was originally only intended

>> to fix the issues with inline functions in the same file.  That needed

>> re-basing anyway, because of a merge conflict.

>>

>> I removed the code that does the special handling of end sequence

>> markers in record_line now, since it seems to cause more problems than

>> it solves.

>>

>> I believe it will fix the regression that Louis pointed out, and

>> should fix the regression that Andrew wanted to fix with his

>> patch:

> 

> I gave this a try and the FAIL's are still there unfortunately.


Actually, there are 6 more of them, with the patch. I'll read through 
the thread to understand what these changes are aiming to fix.

I wouldn't be surprised if GDB is covering some known compiler debug 
info generation hiccup.
Bernd Edlinger April 4, 2020, 11:38 p.m. | #4
On 4/5/20 12:27 AM, Luis Machado wrote:
> Hi,

> 

> On 4/4/20 4:50 PM, Bernd Edlinger wrote:

>> Hi,

>>

>> this is an updated version of my patch that was originally only intended

>> to fix the issues with inline functions in the same file.  That needed

>> re-basing anyway, because of a merge conflict.

>>

>> I removed the code that does the special handling of end sequence

>> markers in record_line now, since it seems to cause more problems than

>> it solves.

>>

>> I believe it will fix the regression that Louis pointed out, and

>> should fix the regression that Andrew wanted to fix with his

>> patch:

> 

> I gave this a try and the FAIL's are still there unfortunately.


Yes, I already am aware, short after I wrote the mail,
I think I have now the right idea, I will probably post another
version of the patch....

For your test case we need to make sure that the line entries
at the end of a function are acutually deleted.
That is easy, writing the right thing in the comment will be 
hard.


Bernd.
Bernd Edlinger April 4, 2020, 11:59 p.m. | #5
On 4/5/20 12:07 AM, Andrew Burgess wrote:
> * Bernd Edlinger <bernd.edlinger@hotmail.de> [2020-04-04 21:50:21 +0200]:

> 

>> Hi,

>>

>> this is an updated version of my patch that was originally only intended

>> to fix the issues with inline functions in the same file.  That needed

>> re-basing anyway, because of a merge conflict.

>>

>> I removed the code that does the special handling of end sequence

>> markers in record_line now, since it seems to cause more problems than

>> it solves.

>>

>> I believe it will fix the regression that Louis pointed out,

> 

> Do you have a link or any extra information for this?  I'd like to

> understand what the actual regression was.  I noticed there didn't

> seem to be any extra tests added over previous versions of the patch,

> is this regression covered by an existing test?

> 


Yup.  Please have a look at
https://marc.info/?l=gdb-patches&m=158601805302656&w=2

Here we see the beginning of main i assume:
  [0x000000b4]  Extended opcode 2: set Address to 0x73c
  [0x000000bf]  Advance Line by 75 to 76
  [0x000000c2]  Copy


And somewere else we see the end of another function:

  [0x00000129]  Extended opcode 2: set Address to 0x73c
  [0x00000134]  Advance Line by 1 to 73
  [0x00000136]  Copy
  [0x00000137]  Extended opcode 1: End of Sequence

So by the stable sorting of the line table,

we should have 

line 76 is-stmt
line 73 !is_stmt
line 0  is_stmt

which is the state, what is in master where the test fail for Luis
but the step-over-test case works.

with my update patch of today

we should have

line 76 is-stmt
lint 73 is-stmt
line 0 is-stmt

which also fails for Luis, and has 6 additional failures.
but the step-over-test case works.

So I start to think there is no way how I can leave line 73
in the line table, I must delete it, when there is an explicit
end of sequence.  But not when there is no explicit end.

Previously we had

line 76 is-stmt < from main
line 0 is-stmt < end of another function

which works, but is also not very nice.

I would like to remove the lines at a true end of sequence,
but not remove lines at a fake end of sequence that we emit
when the cu changes, the idea, that I have right now is that
is pass line = -1 to the record_line and do not touch any
lines at that PC, change that to line = 0 internally, record
that. so the line table looks like it was always, since
I cannot tell what breaks when that end sequence is not there.


Bernd.

>>                                                               and

>> should fix the regression that Andrew wanted to fix with his

>> patch:

> 

>>

>> [PATCH 2/2] gdb: Preserve is-stmt lines when switch between files

>> https://marc.info/?l=gdb-patches&m=158595247916817&w=2

>>

>> So I hope that will not be necessary after this.

> 

> It would be nice if you picked up the extra tests from that patch and

> included them here if this is a replacement solution.

> 

>>

>> There is a theoretic issue with line numbers at the end

>> of a function, these could coincidentally have the same

>> PC as the following function.  That might need more work

>> to solve that, in the moment I have not yet looked at

>> tracking that down.

>>

>>

>> Thanks

>> Bernd.

> 

>> From 330eadf4b42e44bfa82c30a6bda21393fa4a54c8 Mon Sep 17 00:00:00 2001

>> From: Bernd Edlinger <bernd.edlinger@hotmail.de>

>> Date: Sun, 9 Feb 2020 21:13:17 +0100

>> Subject: [PATCH] Fix range end handling of inlined subroutines

>>

>> Since the is_stmt is now handled, it becomes

>> possible to locate dubious is_stmt line entries

>> at the end of an inlined function, even if the

>> called inline function is in the same subfile.

>>

>> When there is a sequence of line entries at the

>> same address where an inline range ends, and the

>> last item has is_stmt = 0, we force all previous

>> items to have is_stmt = 0 as well.

>>

>> If the last line at that address has is_stmt = 1,

>> there is no need to change anything, since a step

>> over will always stop at that last line from the

>> same address, which is fine, since it is outside

>> the subroutine.

>>

>> With this change we loose the ability to set

>> breakpoints on some lines using file:line syntax,

>> but the data is not completely lost, as the

>> line table is still holding those lines, just

>> with the is_stmt flag reset.

> 

> This is a trade off that I don't think we should breeze over quite so

> easily.  Many, many users make use of an IDE to drive GDB, and in

> those cases they mostly use file:line syntax exclusively, so any loss

> in this area is something we should consider seriously.

> 

>>

>> This is necessary as breakpoints on these lines

>> are problematic, as the call stack is often

>> wrong.

> 

> Maybe.  Yes I absolutely agree that in the test case, when we stop at

> these line numbers, the call stack is wrong.  But, as GDB developers,

> we can have a slightly different take on things.  In this particular

> case the debug information we're supplied is just plain wrong.  This

> is documented in this GCC bug:

> 

>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474

> 

> ( Bernd, I know you've already seen that, I include the link for anyone else

>   following this thread. )

> 

> The lines that are part of the inline subroutine, in this test are

> marked as is-stmt _and_ are outside the range of the inline

> subroutine.  So the call stack GDB presents is not wrong w.r.t the

> debug information we're given, it's the debug information that is

> wrong in this case.

> 

> Now, there are many places where GDB tries to work around broken debug

> information, but I think that in this case we give up too much.  I

> would prefer to see if GCC can fix their DWARF creation first before

> we take something like this.

> 

>>        From the block info they appear to be

>> located in the caller, but they are actually meant

>> to be part of the subroutine, therefore usually one

>> frame is missing from the callstack when the

>> execution stops there.

> 

> The lines shouldn't be marked as is-stmt in the location they are, or

> GCC should extend the range slightly.

> 

>>

>> This is about the best we can do at the moment,

>> unless location view information are added to the

>> block ranges debug info structure, and location

>> views are implemented in gdb in general.

> 

> Please do correct me if I'm wrong, but I don't think GCC can yet

> produce view information for range endings, right?

> 

> I agree that the best solution here would be to have location views

> for range start (GCC can do this) and range end (I think this is

> missing), then scopes can merge over within a single address.

> 

>>

>> gdb:

>> 2020-04-04  Bernd Edlinger  <bernd.edlinger@hotmail.de>

>>

>> 	* buildsym.c (buildsym_compunit::record_line): Remove line deletion

>> 	at end sequence marker.

>> 	(buildsym_compunit::record_inline_range_end,

>> 	patch_inline_end_pos): New helper functions.

>> 	(buildsym_compunit::end_symtab_with_blockvector): Patch line table.

>> 	(buildsym_compunit::~buildsym_compunit): Cleanup m_inline_end_vector.

>> 	* buildsym.h (buildsym_compunit::record_inline_range_end): Declare.

>> 	(buildsym_compunit::m_inline_end_vector,

>> 	buildsym_compunit::m_inline_end_vector_length,

>> 	buildsym_compunit::m_inline_end_vector_nitems): New data items.

>> 	* dwarf2/read.c (dwarf2_rnglists_process,

>> 	dwarf2_ranges_process): Don't ignore empty ranges here.

>> 	(dwarf2_ranges_read): Ignore empty ranges here.

>> 	(dwarf2_record_block_ranges): Pass end of range PC to

>> 	record_inline_range_end for inline functions.

>>

>> gdb/testsuite:

>> 2020-04-04  Bernd Edlinger  <bernd.edlinger@hotmail.de>

>>

>> 	* gdb.cp/step-and-next-inline.exp: Adjust test.

>> ---

>>  gdb/buildsym.c                                | 115 ++++++++++++++++++++------

>>  gdb/buildsym.h                                |  11 +++

>>  gdb/dwarf2/read.c                             |  22 +++--

>>  gdb/testsuite/gdb.cp/step-and-next-inline.exp |  17 ----

>>  4 files changed, 114 insertions(+), 51 deletions(-)

>>

>> diff --git a/gdb/buildsym.c b/gdb/buildsym.c

>> index fe07103..e6e7437 100644

>> --- a/gdb/buildsym.c

>> +++ b/gdb/buildsym.c

>> @@ -113,6 +113,8 @@ struct pending_block

>>        next1 = next->next;

>>        xfree ((void *) next);

>>      }

>> +

>> +  xfree (m_inline_end_vector);

>>  }

>>  

>>  struct macro_table *

>> @@ -691,31 +693,6 @@ struct blockvector *

>>  		      * sizeof (struct linetable_entry))));

>>      }

>>  

>> -  /* The end of sequence marker is special.  We need to reset the

>> -     is_stmt flag on previous lines at the same PC, otherwise these

>> -     lines may cause problems since they might be at the same address

>> -     as the following function.  For instance suppose a function calls

>> -     abort there is no reason to emit a ret after that point (no joke).

>> -     So the label may be at the same address where the following

>> -     function begins.  A similar problem appears if a label is at the

>> -     same address where an inline function ends we cannot reliably tell

>> -     if this is considered part of the inline function or the calling

>> -     program or even the next inline function, so stack traces may

>> -     give surprising results.  Expect gdb.cp/step-and-next-inline.exp

>> -     to fail if these lines are not modified here.  */

>> -  if (line == 0 && subfile->line_vector->nitems > 0)

>> -    {

>> -      e = subfile->line_vector->item + subfile->line_vector->nitems;

>> -      do

>> -	{

>> -	  e--;

>> -	  if (e->pc != pc || e->line == 0)

>> -	    break;

>> -	  e->is_stmt = 0;

>> -	}

>> -      while (e > subfile->line_vector->item);

>> -    }

>> -

> 

> I'm slightly nervous with this proposed change.  Code like this

> existed before we started messing with is-stmt support, and I suspect

> it probably dates from a time before testing was as good as it is

> now.  I assume nothing regresses with this removed, but it would be

> nice to try a few experiments I think with different combinations of

> two files, empty line ranges, and having is-stmt before and after the

> transition between files, just to confirm that we really don't need

> this code any more.  I suspect this is the "line numbers at the end of

> a function" issue you mentioned in your introductory text.

> 

> That said, I see how removing this provides an alternative solution to

> the patch I posted.

> 

>>    e = subfile->line_vector->item + subfile->line_vector->nitems++;

>>    e->line = line;

>>    e->is_stmt = is_stmt ? 1 : 0;

>> @@ -723,6 +700,90 @@ struct blockvector *

>>  }

>>  

>>  

>> +/* Record a PC where a inlined subroutine ends.  */

>> +

>> +void

>> +buildsym_compunit::record_inline_range_end (CORE_ADDR end)

>> +{

>> +  /* The performance of this function is very important,

>> +     it shall be O(n*log(n)) therefore we do not use std::vector

>> +     here since some compilers, e.g. visual studio, do not

>> +     guarantee that for vector::push_back.  */

> 

> I think we're going to need more of a group discussion on this.

> Simply saying std::vector is too slow doesn't I'm afraid convince me.

> There seem to be lots of other places in GDB where both performance is

> critical, and we use ::push_back.

> 

> Is your claim that we should move away from std::vector in all these

> cases?  Is this case somehow special?

> 

> Obviously I've made this point before, and you're sticking to your

> position, which is fine, but I can't approve this patch without seeing

> a more compelling argument against std::vector.  That said, if some

> other maintainer is convinced, they're welcome to approve it.  Either

> way, the comment would be better placed at the declaration of the

> variable, rather than here I think.

> 

>> +  if (m_inline_end_vector == nullptr)

>> +    {

>> +      m_inline_end_vector_length = INITIAL_LINE_VECTOR_LENGTH;

>> +      m_inline_end_vector = (CORE_ADDR *)

>> +	xmalloc (sizeof (CORE_ADDR) * m_inline_end_vector_length);

>> +      m_inline_end_vector_nitems = 0;

>> +    }

>> +  else if (m_inline_end_vector_nitems == m_inline_end_vector_length)

>> +    {

>> +      m_inline_end_vector_length *= 2;

>> +      m_inline_end_vector = (CORE_ADDR *)

>> +	xrealloc ((char *) m_inline_end_vector,

>> +		  sizeof (CORE_ADDR) * m_inline_end_vector_length);

>> +    }

>> +

>> +  m_inline_end_vector[m_inline_end_vector_nitems++] = end;

>> +}

>> +

>> +

>> +/* Patch the is_stmt bits at the given inline end address.

>> +   The line table has to be already sorted.  */

>> +

>> +static void

>> +patch_inline_end_pos (struct linetable *table, CORE_ADDR end)

>> +{

>> +  int a = 2, b = table->nitems - 1;

>> +  struct linetable_entry *items = table->item;

>> +

>> +  /* We need at least two items with pc = end in the table.

>> +     The lowest usable items are at pos 0 and 1, the highest

>> +     usable items are at pos b - 2 and b - 1.  */

>> +  if (a > b || end < items[1].pc || end > items[b - 2].pc)

>> +    return;

>> +

>> +  /* Look for the first item with pc > end in the range [a,b].

>> +     The previous element has pc = end or there is no match.

>> +     We set a = 2, since we need at least two consecutive elements

>> +     with pc = end to do anything useful.

>> +     We set b = nitems - 1, since we are not interested in the last

>> +     element which should be an end of sequence marker with line = 0

>> +     and is_stmt = 1.  */

>> +  while (a < b)

>> +    {

>> +      int c = (a + b) / 2;

>> +

>> +      if (end < items[c].pc)

>> +	b = c;

>> +      else

>> +	a = c + 1;

>> +    }

>> +

>> +  a--;

>> +  if (items[a].pc != end || items[a].is_stmt)

>> +    return;

>> +

>> +  /* When there is a sequence of line entries at the same address

>> +     where an inline range ends, and the last item has is_stmt = 0,

>> +     we force all previous items to have is_stmt = 0 as well.

>> +     Setting breakpoints at those addresses is currently not

>> +     supported, since it is unclear if the previous addresses are

>> +     part of the subroutine or the calling program.  */

>> +  do

>> +    {

>> +      /* We stop at the first line entry with a different address,

>> +	 or when we see an end of sequence marker.  */

>> +      a--;

>> +      if (items[a].pc != end || items[a].line == 0)

>> +	break;

>> +

>> +      items[a].is_stmt = 0;

>> +    }

>> +  while (a > 0);

>> +}

>> +

>> +

>>  /* Subroutine of end_symtab to simplify it.  Look for a subfile that

>>     matches the main source file's basename.  If there is only one, and

>>     if the main source file doesn't have any symbol or line number

>> @@ -956,6 +1017,10 @@ struct compunit_symtab *

>>  			      subfile->line_vector->item

>>  			      + subfile->line_vector->nitems,

>>  			      lte_is_less_than);

>> +

>> +	   for (int i = 0; i < m_inline_end_vector_nitems; i++)

>> +	     patch_inline_end_pos (subfile->line_vector,

>> +				   m_inline_end_vector[i]);

>>  	}

>>  

>>        /* Allocate a symbol table if necessary.  */

>> diff --git a/gdb/buildsym.h b/gdb/buildsym.h

>> index c768a4c..2845789 100644

>> --- a/gdb/buildsym.h

>> +++ b/gdb/buildsym.h

>> @@ -190,6 +190,8 @@ struct buildsym_compunit

>>    void record_line (struct subfile *subfile, int line, CORE_ADDR pc,

>>  		    bool is_stmt);

>>  

>> +  void record_inline_range_end (CORE_ADDR end);

>> +

>>    struct compunit_symtab *get_compunit_symtab ()

>>    {

>>      return m_compunit_symtab;

>> @@ -397,6 +399,15 @@ struct buildsym_compunit

>>  

>>    /* Pending symbols that are local to the lexical context.  */

>>    struct pending *m_local_symbols = nullptr;

>> +

>> +  /* Pending inline end range addresses.  */

>> +  CORE_ADDR * m_inline_end_vector = nullptr;

>> +

>> +  /* Number of allocated inline end range addresses.  */

>> +  int m_inline_end_vector_length = 0;

>> +

>> +  /* Number of pending inline end range addresses.  */

>> +  int m_inline_end_vector_nitems = 0;

>>  };

>>  

>>  

>> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c

>> index bcc3116..321de93 100644

>> --- a/gdb/dwarf2/read.c

>> +++ b/gdb/dwarf2/read.c

>> @@ -13527,10 +13527,6 @@ class process_die_scope

>>  	  return false;

>>  	}

>>  

>> -      /* Empty range entries have no effect.  */

>> -      if (range_beginning == range_end)

>> -	continue;

> 

> I don't think we should be doing this.  This is defined quite clearly

> in the DWARF spec as being an empty range.  No code is associated with

> this range.  As such, it really shouldn't have an impact on how we

> interpret the rest of the DWARF.

> 

> Again, I think you're trying too hard to work around GCC's broken

> DWARF output.

> 

>> -

>>        range_beginning += *base;

>>        range_end += *base;

>>  

>> @@ -13638,10 +13634,6 @@ class process_die_scope

>>  	  return 0;

>>  	}

>>  

>> -      /* Empty range entries have no effect.  */

>> -      if (range_beginning == range_end)

>> -	continue;

> 

> Same.

> 

>> -

>>        range_beginning += *base;

>>        range_end += *base;

>>  

>> @@ -13681,6 +13673,10 @@ class process_die_scope

>>    retval = dwarf2_ranges_process (offset, cu,

>>      [&] (CORE_ADDR range_beginning, CORE_ADDR range_end)

>>      {

>> +      /* Empty range entries have no effect.  */

>> +      if (range_beginning == range_end)

>> +	return;

> 

> Same.

> 

>> +

>>        if (ranges_pst != NULL)

>>  	{

>>  	  CORE_ADDR lowpc;

>> @@ -13918,6 +13914,7 @@ class process_die_scope

>>    struct gdbarch *gdbarch = get_objfile_arch (objfile);

>>    struct attribute *attr;

>>    struct attribute *attr_high;

>> +  bool inlined_subroutine = (die->tag == DW_TAG_inlined_subroutine);

>>  

>>    attr_high = dwarf2_attr (die, DW_AT_high_pc, cu);

>>    if (attr_high)

>> @@ -13933,7 +13930,10 @@ class process_die_scope

>>  

>>  	  low = gdbarch_adjust_dwarf2_addr (gdbarch, low + baseaddr);

>>  	  high = gdbarch_adjust_dwarf2_addr (gdbarch, high + baseaddr);

>> -	  cu->get_builder ()->record_block_range (block, low, high - 1);

>> +	  if (inlined_subroutine)

>> +	    cu->get_builder ()->record_inline_range_end (high);

>> +	  if (low < high)

>> +	    cu->get_builder ()->record_block_range (block, low, high - 1);

>>          }

>>      }

>>  

>> @@ -13958,6 +13958,10 @@ class process_die_scope

>>  	  end += baseaddr;

>>  	  start = gdbarch_adjust_dwarf2_addr (gdbarch, start);

>>  	  end = gdbarch_adjust_dwarf2_addr (gdbarch, end);

>> +	  if (inlined_subroutine)

>> +	    cu->get_builder ()->record_inline_range_end (end);

>> +	  if (start == end)

>> +	    return;

>>  	  cu->get_builder ()->record_block_range (block, start, end - 1);

>>  	  blockvec.emplace_back (start, end);

>>  	});

>> diff --git a/gdb/testsuite/gdb.cp/step-and-next-inline.exp b/gdb/testsuite/gdb.cp/step-and-next-inline.exp

>> index 3733fa7..e3a5793 100644

>> --- a/gdb/testsuite/gdb.cp/step-and-next-inline.exp

>> +++ b/gdb/testsuite/gdb.cp/step-and-next-inline.exp

>> @@ -52,37 +52,20 @@ proc do_test { use_header } {

>>      gdb_test "step" ".*" "step into get_alias_set"

>>      gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \

>>  	"not in inline 1"

>> -    # It's possible that this first failure (when not using a header

>> -    # file) is GCC's fault, though the remaining failures would best

>> -    # be fixed by adding location views support (though it could be

>> -    # that some easier heuristic could be figured out).  Still, it is

>> -    # not certain that the first failure wouldn't also be fixed by

>> -    # having location view support, so for now it is tagged as such.

>> -    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }

>>      gdb_test "next" ".*TREE_TYPE.*" "next step 1"

>>      gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \

>>  	"not in inline 2"

>>      gdb_test "next" ".*TREE_TYPE.*" "next step 2"

>>      gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \

>>  	"not in inline 3"

>> -    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }

>>      gdb_test "next" ".*TREE_TYPE.*" "next step 3"

>>      gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \

>>  	"not in inline 4"

>> -    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }

>>      gdb_test "next" "return 0.*" "next step 4"

>>      gdb_test "bt" \

>>  	"\\s*\\#0\\s+(main|get_alias_set)\[^\r\]*${srcfile}:.*" \

>>  	"not in inline 5"

>>  

>> -    if {!$use_header} {

>> -	# With the debug from GCC 10.x (and earlier) GDB is currently

>> -	# unable to successfully complete the following tests when we

>> -	# are not using a header file.

>> -	kfail symtab/25507 "stepping tests"

>> -	return

>> -    }

>> -

> 

> In summary:

> 

> If we can convince ourselves that the removal of the special END

> handling block really is OK, then I'm happy for _that_ part of this

> patch to replace my proposed patch.  Does this part have to be bundled

> with the second part?

> 

> I don't think the rest of the patch should be merged at the moment.  I

> think we should wait to see if GCC can fix their broken DWARF

> production.  If this issue is resolved, then possibly we could

> maintain this, or some variation of this guarded with a producer

> check, so only known bad versions of GCC are modified.  But I don't

> think we should assume that all producers get their range information

> wrong.

> 

> Thanks,

> Andrew

>
Bernd Edlinger April 5, 2020, 12:05 a.m. | #6
On 4/5/20 12:30 AM, Luis Machado wrote:
> On 4/4/20 7:27 PM, Luis Machado wrote:

>> Hi,

>>

>> On 4/4/20 4:50 PM, Bernd Edlinger wrote:

>>> Hi,

>>>

>>> this is an updated version of my patch that was originally only intended

>>> to fix the issues with inline functions in the same file.  That needed

>>> re-basing anyway, because of a merge conflict.

>>>

>>> I removed the code that does the special handling of end sequence

>>> markers in record_line now, since it seems to cause more problems than

>>> it solves.

>>>

>>> I believe it will fix the regression that Louis pointed out, and

>>> should fix the regression that Andrew wanted to fix with his

>>> patch:

>>

>> I gave this a try and the FAIL's are still there unfortunately.

> 

> Actually, there are 6 more of them, with the patch. I'll read through the thread to understand what these changes are aiming to fix.

> 

> I wouldn't be surprised if GDB is covering some known compiler debug info generation hiccup.


Thanks Luis,

even a FAIL allows me to learn out of it :)


Bernd.
Tom Tromey April 22, 2020, 9:03 p.m. | #7
>>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:


>> +void

>> +buildsym_compunit::record_inline_range_end (CORE_ADDR end)

>> +{

>> +  /* The performance of this function is very important,

>> +     it shall be O(n*log(n)) therefore we do not use std::vector

>> +     here since some compilers, e.g. visual studio, do not

>> +     guarantee that for vector::push_back.  */


Andrew> I think we're going to need more of a group discussion on this.
Andrew> Simply saying std::vector is too slow doesn't I'm afraid convince me.
Andrew> There seem to be lots of other places in GDB where both performance is
Andrew> critical, and we use ::push_back.

Andrew> Is your claim that we should move away from std::vector in all these
Andrew> cases?  Is this case somehow special?

C++ documents push_back as having amortized constant complexity.  If
that's not the case for the MS compiler, that seems like a pretty
serious bug there... I guess I'd like some documentation of some kind (a
stackoverflow question, or maybe a test program that shows the vector
growing linearly, or something like that).

I didn't understand the relevance or target of the "n*log(n)" comment.

Andrew> I don't think we should be doing this.  This is defined quite clearly
Andrew> in the DWARF spec as being an empty range.  No code is associated with
Andrew> this range.  As such, it really shouldn't have an impact on how we
Andrew> interpret the rest of the DWARF.

Andrew> Again, I think you're trying too hard to work around GCC's broken
Andrew> DWARF output.

Do we know how long GCC has been generating this?
And whether anybody is investigating a fix?

I dislike adding workarounds for relatively modern versions of
GCC... I'd prefer these things be fixed in GCC if possible.  However, if
that's not possible, I'm also flexible about it.

Tom
Bernd Edlinger April 25, 2020, 7:29 a.m. | #8
On 4/22/20 11:03 PM, Tom Tromey wrote:
>>>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:

> 

>>> +void

>>> +buildsym_compunit::record_inline_range_end (CORE_ADDR end)

>>> +{

>>> +  /* The performance of this function is very important,

>>> +     it shall be O(n*log(n)) therefore we do not use std::vector

>>> +     here since some compilers, e.g. visual studio, do not

>>> +     guarantee that for vector::push_back.  */

> 

> Andrew> I think we're going to need more of a group discussion on this.

> Andrew> Simply saying std::vector is too slow doesn't I'm afraid convince me.

> Andrew> There seem to be lots of other places in GDB where both performance is

> Andrew> critical, and we use ::push_back.

> 

> Andrew> Is your claim that we should move away from std::vector in all these

> Andrew> cases?  Is this case somehow special?

> 

> C++ documents push_back as having amortized constant complexity.  If

> that's not the case for the MS compiler, that seems like a pretty

> serious bug there... I guess I'd like some documentation of some kind (a

> stackoverflow question, or maybe a test program that shows the vector

> growing linearly, or something like that).

> 


The problem is, this is in an inner loop where the overall system
performance is severely affected by this bug which is not our fault.

BUT for the user this is a GDB bug, and guess what the microsoft debugger
will load faster.  So why do we take that risk?

> I didn't understand the relevance or target of the "n*log(n)" comment.

> 


Maybe give more context, I do not understand your question when all the context
is removed.

> Andrew> I don't think we should be doing this.  This is defined quite clearly

> Andrew> in the DWARF spec as being an empty range.  No code is associated with

> Andrew> this range.  As such, it really shouldn't have an impact on how we

> Andrew> interpret the rest of the DWARF.

> 

> Andrew> Again, I think you're trying too hard to work around GCC's broken

> Andrew> DWARF output.

> 

> Do we know how long GCC has been generating this?

> And whether anybody is investigating a fix?

> 


Since gcc-7 IIRC.  So these versions can no longer be changed,
but they will be around for very long.  As I said the only possible outcome
is disabling this feature for GCC a back-port of anything more complicated
will be way too dangerous.  Even that is only gcc-10, gcc-9 and gcc-8.

So we missed gcc-7, everybody using that will be in trouble.

> I dislike adding workarounds for relatively modern versions of

> GCC... I'd prefer these things be fixed in GCC if possible.  However, if

> that's not possible, I'm also flexible about it.

> 

> Tom

> 


Tom, once a bug has been unfixed for so long, the damage can no longer
be undone.


So I think we should work around the failuer,
which is IMHO neither GCC nor GDB's fault, but a flaw in the DWARF-2 spec.

I think the best solution would be to fix the spec, and work around
the issue until we have a new dwarf-version.


Thanks,
Bernd.

Patch

From 330eadf4b42e44bfa82c30a6bda21393fa4a54c8 Mon Sep 17 00:00:00 2001
From: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date: Sun, 9 Feb 2020 21:13:17 +0100
Subject: [PATCH] Fix range end handling of inlined subroutines

Since the is_stmt is now handled, it becomes
possible to locate dubious is_stmt line entries
at the end of an inlined function, even if the
called inline function is in the same subfile.

When there is a sequence of line entries at the
same address where an inline range ends, and the
last item has is_stmt = 0, we force all previous
items to have is_stmt = 0 as well.

If the last line at that address has is_stmt = 1,
there is no need to change anything, since a step
over will always stop at that last line from the
same address, which is fine, since it is outside
the subroutine.

With this change we loose the ability to set
breakpoints on some lines using file:line syntax,
but the data is not completely lost, as the
line table is still holding those lines, just
with the is_stmt flag reset.

This is necessary as breakpoints on these lines
are problematic, as the call stack is often
wrong.  From the block info they appear to be
located in the caller, but they are actually meant
to be part of the subroutine, therefore usually one
frame is missing from the callstack when the
execution stops there.

This is about the best we can do at the moment,
unless location view information are added to the
block ranges debug info structure, and location
views are implemented in gdb in general.

gdb:
2020-04-04  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* buildsym.c (buildsym_compunit::record_line): Remove line deletion
	at end sequence marker.
	(buildsym_compunit::record_inline_range_end,
	patch_inline_end_pos): New helper functions.
	(buildsym_compunit::end_symtab_with_blockvector): Patch line table.
	(buildsym_compunit::~buildsym_compunit): Cleanup m_inline_end_vector.
	* buildsym.h (buildsym_compunit::record_inline_range_end): Declare.
	(buildsym_compunit::m_inline_end_vector,
	buildsym_compunit::m_inline_end_vector_length,
	buildsym_compunit::m_inline_end_vector_nitems): New data items.
	* dwarf2/read.c (dwarf2_rnglists_process,
	dwarf2_ranges_process): Don't ignore empty ranges here.
	(dwarf2_ranges_read): Ignore empty ranges here.
	(dwarf2_record_block_ranges): Pass end of range PC to
	record_inline_range_end for inline functions.

gdb/testsuite:
2020-04-04  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* gdb.cp/step-and-next-inline.exp: Adjust test.
---
 gdb/buildsym.c                                | 115 ++++++++++++++++++++------
 gdb/buildsym.h                                |  11 +++
 gdb/dwarf2/read.c                             |  22 +++--
 gdb/testsuite/gdb.cp/step-and-next-inline.exp |  17 ----
 4 files changed, 114 insertions(+), 51 deletions(-)

diff --git a/gdb/buildsym.c b/gdb/buildsym.c
index fe07103..e6e7437 100644
--- a/gdb/buildsym.c
+++ b/gdb/buildsym.c
@@ -113,6 +113,8 @@  struct pending_block
       next1 = next->next;
       xfree ((void *) next);
     }
+
+  xfree (m_inline_end_vector);
 }
 
 struct macro_table *
@@ -691,31 +693,6 @@  struct blockvector *
 		      * sizeof (struct linetable_entry))));
     }
 
-  /* The end of sequence marker is special.  We need to reset the
-     is_stmt flag on previous lines at the same PC, otherwise these
-     lines may cause problems since they might be at the same address
-     as the following function.  For instance suppose a function calls
-     abort there is no reason to emit a ret after that point (no joke).
-     So the label may be at the same address where the following
-     function begins.  A similar problem appears if a label is at the
-     same address where an inline function ends we cannot reliably tell
-     if this is considered part of the inline function or the calling
-     program or even the next inline function, so stack traces may
-     give surprising results.  Expect gdb.cp/step-and-next-inline.exp
-     to fail if these lines are not modified here.  */
-  if (line == 0 && subfile->line_vector->nitems > 0)
-    {
-      e = subfile->line_vector->item + subfile->line_vector->nitems;
-      do
-	{
-	  e--;
-	  if (e->pc != pc || e->line == 0)
-	    break;
-	  e->is_stmt = 0;
-	}
-      while (e > subfile->line_vector->item);
-    }
-
   e = subfile->line_vector->item + subfile->line_vector->nitems++;
   e->line = line;
   e->is_stmt = is_stmt ? 1 : 0;
@@ -723,6 +700,90 @@  struct blockvector *
 }
 
 
+/* Record a PC where a inlined subroutine ends.  */
+
+void
+buildsym_compunit::record_inline_range_end (CORE_ADDR end)
+{
+  /* The performance of this function is very important,
+     it shall be O(n*log(n)) therefore we do not use std::vector
+     here since some compilers, e.g. visual studio, do not
+     guarantee that for vector::push_back.  */
+  if (m_inline_end_vector == nullptr)
+    {
+      m_inline_end_vector_length = INITIAL_LINE_VECTOR_LENGTH;
+      m_inline_end_vector = (CORE_ADDR *)
+	xmalloc (sizeof (CORE_ADDR) * m_inline_end_vector_length);
+      m_inline_end_vector_nitems = 0;
+    }
+  else if (m_inline_end_vector_nitems == m_inline_end_vector_length)
+    {
+      m_inline_end_vector_length *= 2;
+      m_inline_end_vector = (CORE_ADDR *)
+	xrealloc ((char *) m_inline_end_vector,
+		  sizeof (CORE_ADDR) * m_inline_end_vector_length);
+    }
+
+  m_inline_end_vector[m_inline_end_vector_nitems++] = end;
+}
+
+
+/* Patch the is_stmt bits at the given inline end address.
+   The line table has to be already sorted.  */
+
+static void
+patch_inline_end_pos (struct linetable *table, CORE_ADDR end)
+{
+  int a = 2, b = table->nitems - 1;
+  struct linetable_entry *items = table->item;
+
+  /* We need at least two items with pc = end in the table.
+     The lowest usable items are at pos 0 and 1, the highest
+     usable items are at pos b - 2 and b - 1.  */
+  if (a > b || end < items[1].pc || end > items[b - 2].pc)
+    return;
+
+  /* Look for the first item with pc > end in the range [a,b].
+     The previous element has pc = end or there is no match.
+     We set a = 2, since we need at least two consecutive elements
+     with pc = end to do anything useful.
+     We set b = nitems - 1, since we are not interested in the last
+     element which should be an end of sequence marker with line = 0
+     and is_stmt = 1.  */
+  while (a < b)
+    {
+      int c = (a + b) / 2;
+
+      if (end < items[c].pc)
+	b = c;
+      else
+	a = c + 1;
+    }
+
+  a--;
+  if (items[a].pc != end || items[a].is_stmt)
+    return;
+
+  /* When there is a sequence of line entries at the same address
+     where an inline range ends, and the last item has is_stmt = 0,
+     we force all previous items to have is_stmt = 0 as well.
+     Setting breakpoints at those addresses is currently not
+     supported, since it is unclear if the previous addresses are
+     part of the subroutine or the calling program.  */
+  do
+    {
+      /* We stop at the first line entry with a different address,
+	 or when we see an end of sequence marker.  */
+      a--;
+      if (items[a].pc != end || items[a].line == 0)
+	break;
+
+      items[a].is_stmt = 0;
+    }
+  while (a > 0);
+}
+
+
 /* Subroutine of end_symtab to simplify it.  Look for a subfile that
    matches the main source file's basename.  If there is only one, and
    if the main source file doesn't have any symbol or line number
@@ -956,6 +1017,10 @@  struct compunit_symtab *
 			      subfile->line_vector->item
 			      + subfile->line_vector->nitems,
 			      lte_is_less_than);
+
+	   for (int i = 0; i < m_inline_end_vector_nitems; i++)
+	     patch_inline_end_pos (subfile->line_vector,
+				   m_inline_end_vector[i]);
 	}
 
       /* Allocate a symbol table if necessary.  */
diff --git a/gdb/buildsym.h b/gdb/buildsym.h
index c768a4c..2845789 100644
--- a/gdb/buildsym.h
+++ b/gdb/buildsym.h
@@ -190,6 +190,8 @@  struct buildsym_compunit
   void record_line (struct subfile *subfile, int line, CORE_ADDR pc,
 		    bool is_stmt);
 
+  void record_inline_range_end (CORE_ADDR end);
+
   struct compunit_symtab *get_compunit_symtab ()
   {
     return m_compunit_symtab;
@@ -397,6 +399,15 @@  struct buildsym_compunit
 
   /* Pending symbols that are local to the lexical context.  */
   struct pending *m_local_symbols = nullptr;
+
+  /* Pending inline end range addresses.  */
+  CORE_ADDR * m_inline_end_vector = nullptr;
+
+  /* Number of allocated inline end range addresses.  */
+  int m_inline_end_vector_length = 0;
+
+  /* Number of pending inline end range addresses.  */
+  int m_inline_end_vector_nitems = 0;
 };
 
 
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index bcc3116..321de93 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -13527,10 +13527,6 @@  class process_die_scope
 	  return false;
 	}
 
-      /* Empty range entries have no effect.  */
-      if (range_beginning == range_end)
-	continue;
-
       range_beginning += *base;
       range_end += *base;
 
@@ -13638,10 +13634,6 @@  class process_die_scope
 	  return 0;
 	}
 
-      /* Empty range entries have no effect.  */
-      if (range_beginning == range_end)
-	continue;
-
       range_beginning += *base;
       range_end += *base;
 
@@ -13681,6 +13673,10 @@  class process_die_scope
   retval = dwarf2_ranges_process (offset, cu,
     [&] (CORE_ADDR range_beginning, CORE_ADDR range_end)
     {
+      /* Empty range entries have no effect.  */
+      if (range_beginning == range_end)
+	return;
+
       if (ranges_pst != NULL)
 	{
 	  CORE_ADDR lowpc;
@@ -13918,6 +13914,7 @@  class process_die_scope
   struct gdbarch *gdbarch = get_objfile_arch (objfile);
   struct attribute *attr;
   struct attribute *attr_high;
+  bool inlined_subroutine = (die->tag == DW_TAG_inlined_subroutine);
 
   attr_high = dwarf2_attr (die, DW_AT_high_pc, cu);
   if (attr_high)
@@ -13933,7 +13930,10 @@  class process_die_scope
 
 	  low = gdbarch_adjust_dwarf2_addr (gdbarch, low + baseaddr);
 	  high = gdbarch_adjust_dwarf2_addr (gdbarch, high + baseaddr);
-	  cu->get_builder ()->record_block_range (block, low, high - 1);
+	  if (inlined_subroutine)
+	    cu->get_builder ()->record_inline_range_end (high);
+	  if (low < high)
+	    cu->get_builder ()->record_block_range (block, low, high - 1);
         }
     }
 
@@ -13958,6 +13958,10 @@  class process_die_scope
 	  end += baseaddr;
 	  start = gdbarch_adjust_dwarf2_addr (gdbarch, start);
 	  end = gdbarch_adjust_dwarf2_addr (gdbarch, end);
+	  if (inlined_subroutine)
+	    cu->get_builder ()->record_inline_range_end (end);
+	  if (start == end)
+	    return;
 	  cu->get_builder ()->record_block_range (block, start, end - 1);
 	  blockvec.emplace_back (start, end);
 	});
diff --git a/gdb/testsuite/gdb.cp/step-and-next-inline.exp b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
index 3733fa7..e3a5793 100644
--- a/gdb/testsuite/gdb.cp/step-and-next-inline.exp
+++ b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
@@ -52,37 +52,20 @@  proc do_test { use_header } {
     gdb_test "step" ".*" "step into get_alias_set"
     gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
 	"not in inline 1"
-    # It's possible that this first failure (when not using a header
-    # file) is GCC's fault, though the remaining failures would best
-    # be fixed by adding location views support (though it could be
-    # that some easier heuristic could be figured out).  Still, it is
-    # not certain that the first failure wouldn't also be fixed by
-    # having location view support, so for now it is tagged as such.
-    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
     gdb_test "next" ".*TREE_TYPE.*" "next step 1"
     gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
 	"not in inline 2"
     gdb_test "next" ".*TREE_TYPE.*" "next step 2"
     gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
 	"not in inline 3"
-    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
     gdb_test "next" ".*TREE_TYPE.*" "next step 3"
     gdb_test "bt" "\\s*\\#0\\s+get_alias_set\[^\r\]*${srcfile}:.*" \
 	"not in inline 4"
-    if {!$use_header} { setup_kfail "*-*-*" symtab/25507 }
     gdb_test "next" "return 0.*" "next step 4"
     gdb_test "bt" \
 	"\\s*\\#0\\s+(main|get_alias_set)\[^\r\]*${srcfile}:.*" \
 	"not in inline 5"
 
-    if {!$use_header} {
-	# With the debug from GCC 10.x (and earlier) GDB is currently
-	# unable to successfully complete the following tests when we
-	# are not using a header file.
-	kfail symtab/25507 "stepping tests"
-	return
-    }
-
     clean_restart ${executable}
 
     if ![runto_main] {
-- 
1.9.1