breakpoint: Make sure location types match before swapping

Message ID 20200401013813.GA27550@juliacomputing.com
State New
Headers show
Series
  • breakpoint: Make sure location types match before swapping
Related show

Commit Message

Keno Fischer April 1, 2020, 1:38 a.m.
This patch fixes PR gdb/25741 "GDB tries to set breakpoint using Z0, but remove it using z1".
In particular, what occurs in that case is that a hardware breakpoint is hit,
after which GDB removes it and establishes a single step breakpoint at the
same location. Afterwards, rather than simply removing this breakpoint and
re-enabling the hardware breakpoint, GDB simply swaps the activation, without
informing the server, leading to an inconsistency in GDB's view of the world
and the server's view of the world. To remidy this situation, this
patch adds a check that ensures two breakpoint locations have the
same type before they are considered equal and thus eligible for silent
swapping.

gdb/ChangeLog:
	* breakpoint.c (breakpoint_locations_match): Fix PR gdb/25741

Signed-off-by: Keno Fischer <keno@juliacomputing.com>

---
 gdb/breakpoint.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.24.0

Comments

Keno Fischer April 14, 2020, 7:01 a.m. | #1
Bump. It would be great to get this fixed.

On Tue, Mar 31, 2020 at 9:38 PM Keno Fischer <keno@juliacomputing.com> wrote:
>

> This patch fixes PR gdb/25741 "GDB tries to set breakpoint using Z0, but remove it using z1".

> In particular, what occurs in that case is that a hardware breakpoint is hit,

> after which GDB removes it and establishes a single step breakpoint at the

> same location. Afterwards, rather than simply removing this breakpoint and

> re-enabling the hardware breakpoint, GDB simply swaps the activation, without

> informing the server, leading to an inconsistency in GDB's view of the world

> and the server's view of the world. To remidy this situation, this

> patch adds a check that ensures two breakpoint locations have the

> same type before they are considered equal and thus eligible for silent

> swapping.

>

> gdb/ChangeLog:

>         * breakpoint.c (breakpoint_locations_match): Fix PR gdb/25741

>

> Signed-off-by: Keno Fischer <keno@juliacomputing.com>

> ---

>  gdb/breakpoint.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)

>

> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c

> index e49025461b..582dae1946 100644

> --- a/gdb/breakpoint.c

> +++ b/gdb/breakpoint.c

> @@ -6838,7 +6838,7 @@ breakpoint_locations_match (struct bp_location *loc1,

>      /* We compare bp_location.length in order to cover ranged breakpoints.  */

>      return (breakpoint_address_match (loc1->pspace->aspace, loc1->address,

>                                      loc2->pspace->aspace, loc2->address)

> -           && loc1->length == loc2->length);

> +           && loc1->length == loc2->length && loc1->loc_type == loc2->loc_type);

>  }

>

>  static void

> --

> 2.24.0

>
Simon Marchi April 14, 2020, 3:04 p.m. | #2
On 2020-04-14 3:01 a.m., Keno Fischer wrote:
> Bump. It would be great to get this fixed.

> 

> On Tue, Mar 31, 2020 at 9:38 PM Keno Fischer <keno@juliacomputing.com> wrote:

>>

>> This patch fixes PR gdb/25741 "GDB tries to set breakpoint using Z0, but remove it using z1".

>> In particular, what occurs in that case is that a hardware breakpoint is hit,

>> after which GDB removes it and establishes a single step breakpoint at the

>> same location. Afterwards, rather than simply removing this breakpoint and

>> re-enabling the hardware breakpoint, GDB simply swaps the activation, without

>> informing the server, leading to an inconsistency in GDB's view of the world

>> and the server's view of the world. To remidy this situation, this

>> patch adds a check that ensures two breakpoint locations have the

>> same type before they are considered equal and thus eligible for silent

>> swapping.

>>

>> gdb/ChangeLog:

>>         * breakpoint.c (breakpoint_locations_match): Fix PR gdb/25741

>>

>> Signed-off-by: Keno Fischer <keno@juliacomputing.com>

>> ---

>>  gdb/breakpoint.c | 2 +-

>>  1 file changed, 1 insertion(+), 1 deletion(-)

>>

>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c

>> index e49025461b..582dae1946 100644

>> --- a/gdb/breakpoint.c

>> +++ b/gdb/breakpoint.c

>> @@ -6838,7 +6838,7 @@ breakpoint_locations_match (struct bp_location *loc1,

>>      /* We compare bp_location.length in order to cover ranged breakpoints.  */

>>      return (breakpoint_address_match (loc1->pspace->aspace, loc1->address,

>>                                      loc2->pspace->aspace, loc2->address)

>> -           && loc1->length == loc2->length);

>> +           && loc1->length == loc2->length && loc1->loc_type == loc2->loc_type);

>>  }

>>

>>  static void

>> --

>> 2.24.0

>>


I think the change makes sense, but this is not an area I know well, and it's one that
is a bit sensitive.  I'll do a full test run and take a bit more time to look at it.

In the mean time, if anybody else wants to take a look, go for it.

Simon
Andrew Burgess April 14, 2020, 4 p.m. | #3
* Simon Marchi <simark@simark.ca> [2020-04-14 11:04:47 -0400]:

> On 2020-04-14 3:01 a.m., Keno Fischer wrote:

> > Bump. It would be great to get this fixed.

> > 

> > On Tue, Mar 31, 2020 at 9:38 PM Keno Fischer <keno@juliacomputing.com> wrote:

> >>

> >> This patch fixes PR gdb/25741 "GDB tries to set breakpoint using Z0, but remove it using z1".

> >> In particular, what occurs in that case is that a hardware breakpoint is hit,

> >> after which GDB removes it and establishes a single step breakpoint at the

> >> same location. Afterwards, rather than simply removing this breakpoint and

> >> re-enabling the hardware breakpoint, GDB simply swaps the activation, without

> >> informing the server, leading to an inconsistency in GDB's view of the world

> >> and the server's view of the world. To remidy this situation, this

> >> patch adds a check that ensures two breakpoint locations have the

> >> same type before they are considered equal and thus eligible for silent

> >> swapping.

> >>

> >> gdb/ChangeLog:

> >>         * breakpoint.c (breakpoint_locations_match): Fix PR gdb/25741

> >>

> >> Signed-off-by: Keno Fischer <keno@juliacomputing.com>

> >> ---

> >>  gdb/breakpoint.c | 2 +-

> >>  1 file changed, 1 insertion(+), 1 deletion(-)

> >>

> >> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c

> >> index e49025461b..582dae1946 100644

> >> --- a/gdb/breakpoint.c

> >> +++ b/gdb/breakpoint.c

> >> @@ -6838,7 +6838,7 @@ breakpoint_locations_match (struct bp_location *loc1,

> >>      /* We compare bp_location.length in order to cover ranged breakpoints.  */

> >>      return (breakpoint_address_match (loc1->pspace->aspace, loc1->address,

> >>                                      loc2->pspace->aspace, loc2->address)

> >> -           && loc1->length == loc2->length);

> >> +           && loc1->length == loc2->length && loc1->loc_type == loc2->loc_type);

> >>  }

> >>

> >>  static void

> >> --

> >> 2.24.0

> >>

> 

> I think the change makes sense, but this is not an area I know well, and it's one that

> is a bit sensitive.  I'll do a full test run and take a bit more time to look at it.

> 

> In the mean time, if anybody else wants to take a look, go for it.


I was able to reproduce this issue with a hacked up RISC-V target
(forced software single step on for bare-metal) running on a
development board I have access too.

I agree that this fix feels like the right thing to do, it's inline
with location checking done in watchpoint_locations_match.  The only
question I had when comparing to watchpoint_locations_match, is
whether we should be similarly comparing the types of the owner
breakpoint rather than the actual location.  However, the b/p type is
overloaded to contain more than just whether the breakpoint is h/w or
s/w, so in this case we have the user h/w breakpoint, its type is
bp_hardware_breakpoint, while the software single step breakpoint has
type bp_single_step.  The conclusion I came to is that it really is
the type of the location we need to compare here.

The other slight issue I had with the patch was that based on the
original description I still had to go and figure out the exact
conditions that would trigger this bug.  I believe that to trigger
this you need a h/w breakpoint placed on an instruction that loops to
itself, I don't see how else this could happen.

I took a crack at writing a more detailed break down of the conditions
that trigger this bug.

I'm still running this through testing here, but I'd be interested to
hear if you think the fuller description is helpful.

Thanks,
Andrew

---

From d2b719b0f4857461064ed7b1da744a01b2ad2c6d Mon Sep 17 00:00:00 2001
From: Andrew Burgess <andrew.burgess@embecosm.com>

Date: Tue, 14 Apr 2020 16:32:02 +0100
Subject: [PATCH] gdb: ensure location types match before swapping locations

In the following conditions:

  - A target with hardware breakpoints available, and
  - A target that uses software single stepping,
  - An instruction at ADDRESS loops back to itself,

Now consider the following steps:

  1. The user places a hardware breakpoint at ADDRESS (an instruction
  that loops to itself),

  2. The inferior runs and hits the breakpoint from #1,

  3. The user tells GDB to 'continue'.

In #3 when the user tells GDB to continue, GDB first disables the
hardware breakpoint at ADDRESS, and then inserts a software single
step breakpoint at ADDRESS.  The original user created breakpoint was
a hardware breakpoint, while the single step breakpoint will be a
software breakpoint.

GDB continues and immediately hits the software single step
breakpoint.

GDB then deletes the software single step breakpoint by calling
delete_single_step_breakpoints, which eventually calls
delete_breakpoint, which, once the breakpoint (and its locations) are
deleted, calls update_global_location_list.

During update_global_location_list GDB spots that we have an old
location (the software single step breakpoint location) that is
inserted, but being deleted, and a location at the same address which
we are keeping, but which is not currently inserted (the original
hardware breakpoint location), GDB then calls
breakpoint_locations_match on these two locations.  Currently the
locations do match, and so GDB calls swap_insertion which swaps the
"inserted" state of the two locations.  GDB finally returns through
the call stack and leaves delete_single_step_breakpoints.  After this
GDB continues with its normal "stopping" process, as part of this
stopping process GDB removes all the breakpoints from the target.  Due
to the swap the original hardware breakpoint location is removed.

The problem is that GDB inserted the software single step breakpoint
as a software breakpoint, and then, thanks to the swap, tries to
remove it as a hardware breakpoint.  This will leave the target in an
inconsistent state, and as in the original bug report (PR gdb/25741),
could result in the target throwing an error.

The solution for all this is to have two breakpoint locations of
different types (hardware breakpoint vs software breakpoint) not
match.  The result of this is that GDB will no longer swap the
inserted status of the two breakpoints.  The original call to
update_global_location_list (after deleting the software single step
breakpoint) will at this point trigger a call to remove the
breakpoint, something which will be done based on the type of that
location.  Later GDB will see that the original hardware breakpoint is
already not inserted, and so will leave it alone.

This patch was original proposed by Keno Fischer here:

  https://sourceware.org/pipermail/gdb-patches/2020-April/167202.html

gdb/ChangeLog:

	PR gdb/25741
	* breakpoint.c (breakpoint_locations_match): Compare location
	types.
---
 gdb/ChangeLog    | 6 ++++++
 gdb/breakpoint.c | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index e49025461ba..2ab40a8e40a 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -6838,7 +6838,8 @@ breakpoint_locations_match (struct bp_location *loc1,
     /* We compare bp_location.length in order to cover ranged breakpoints.  */
     return (breakpoint_address_match (loc1->pspace->aspace, loc1->address,
 				     loc2->pspace->aspace, loc2->address)
-	    && loc1->length == loc2->length);
+	    && loc1->length == loc2->length
+	    && loc1->loc_type == loc2->loc_type);
 }
 
 static void
-- 
2.25.2
Keno Fischer April 14, 2020, 4:26 p.m. | #4
Thanks for the detailed write up - I wasn't quite sure of the exact
circumstances that trigger it, since in my case it involved an external
signal handler that was rewriting register state. That did in effect lead
to the instruction looping back on itself.

On Tue, Apr 14, 2020, 12:01 Andrew Burgess <andrew.burgess@embecosm.com>
wrote:

> * Simon Marchi <simark@simark.ca> [2020-04-14 11:04:47 -0400]:

>

> > On 2020-04-14 3:01 a.m., Keno Fischer wrote:

> > > Bump. It would be great to get this fixed.

> > >

> > > On Tue, Mar 31, 2020 at 9:38 PM Keno Fischer <keno@juliacomputing.com>

> wrote:

> > >>

> > >> This patch fixes PR gdb/25741 "GDB tries to set breakpoint using Z0,

> but remove it using z1".

> > >> In particular, what occurs in that case is that a hardware breakpoint

> is hit,

> > >> after which GDB removes it and establishes a single step breakpoint

> at the

> > >> same location. Afterwards, rather than simply removing this

> breakpoint and

> > >> re-enabling the hardware breakpoint, GDB simply swaps the activation,

> without

> > >> informing the server, leading to an inconsistency in GDB's view of

> the world

> > >> and the server's view of the world. To remidy this situation, this

> > >> patch adds a check that ensures two breakpoint locations have the

> > >> same type before they are considered equal and thus eligible for

> silent

> > >> swapping.

> > >>

> > >> gdb/ChangeLog:

> > >>         * breakpoint.c (breakpoint_locations_match): Fix PR gdb/25741

> > >>

> > >> Signed-off-by: Keno Fischer <keno@juliacomputing.com>

> > >> ---

> > >>  gdb/breakpoint.c | 2 +-

> > >>  1 file changed, 1 insertion(+), 1 deletion(-)

> > >>

> > >> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c

> > >> index e49025461b..582dae1946 100644

> > >> --- a/gdb/breakpoint.c

> > >> +++ b/gdb/breakpoint.c

> > >> @@ -6838,7 +6838,7 @@ breakpoint_locations_match (struct bp_location

> *loc1,

> > >>      /* We compare bp_location.length in order to cover ranged

> breakpoints.  */

> > >>      return (breakpoint_address_match (loc1->pspace->aspace,

> loc1->address,

> > >>                                      loc2->pspace->aspace,

> loc2->address)

> > >> -           && loc1->length == loc2->length);

> > >> +           && loc1->length == loc2->length && loc1->loc_type ==

> loc2->loc_type);

> > >>  }

> > >>

> > >>  static void

> > >> --

> > >> 2.24.0

> > >>

> >

> > I think the change makes sense, but this is not an area I know well, and

> it's one that

> > is a bit sensitive.  I'll do a full test run and take a bit more time to

> look at it.

> >

> > In the mean time, if anybody else wants to take a look, go for it.

>

> I was able to reproduce this issue with a hacked up RISC-V target

> (forced software single step on for bare-metal) running on a

> development board I have access too.

>

> I agree that this fix feels like the right thing to do, it's inline

> with location checking done in watchpoint_locations_match.  The only

> question I had when comparing to watchpoint_locations_match, is

> whether we should be similarly comparing the types of the owner

> breakpoint rather than the actual location.  However, the b/p type is

> overloaded to contain more than just whether the breakpoint is h/w or

> s/w, so in this case we have the user h/w breakpoint, its type is

> bp_hardware_breakpoint, while the software single step breakpoint has

> type bp_single_step.  The conclusion I came to is that it really is

> the type of the location we need to compare here.

>

> The other slight issue I had with the patch was that based on the

> original description I still had to go and figure out the exact

> conditions that would trigger this bug.  I believe that to trigger

> this you need a h/w breakpoint placed on an instruction that loops to

> itself, I don't see how else this could happen.

>

> I took a crack at writing a more detailed break down of the conditions

> that trigger this bug.

>

> I'm still running this through testing here, but I'd be interested to

> hear if you think the fuller description is helpful.

>

> Thanks,

> Andrew

>

> ---

>

> From d2b719b0f4857461064ed7b1da744a01b2ad2c6d Mon Sep 17 00:00:00 2001

> From: Andrew Burgess <andrew.burgess@embecosm.com>

> Date: Tue, 14 Apr 2020 16:32:02 +0100

> Subject: [PATCH] gdb: ensure location types match before swapping locations

>

> In the following conditions:

>

>   - A target with hardware breakpoints available, and

>   - A target that uses software single stepping,

>   - An instruction at ADDRESS loops back to itself,

>

> Now consider the following steps:

>

>   1. The user places a hardware breakpoint at ADDRESS (an instruction

>   that loops to itself),

>

>   2. The inferior runs and hits the breakpoint from #1,

>

>   3. The user tells GDB to 'continue'.

>

> In #3 when the user tells GDB to continue, GDB first disables the

> hardware breakpoint at ADDRESS, and then inserts a software single

> step breakpoint at ADDRESS.  The original user created breakpoint was

> a hardware breakpoint, while the single step breakpoint will be a

> software breakpoint.

>

> GDB continues and immediately hits the software single step

> breakpoint.

>

> GDB then deletes the software single step breakpoint by calling

> delete_single_step_breakpoints, which eventually calls

> delete_breakpoint, which, once the breakpoint (and its locations) are

> deleted, calls update_global_location_list.

>

> During update_global_location_list GDB spots that we have an old

> location (the software single step breakpoint location) that is

> inserted, but being deleted, and a location at the same address which

> we are keeping, but which is not currently inserted (the original

> hardware breakpoint location), GDB then calls

> breakpoint_locations_match on these two locations.  Currently the

> locations do match, and so GDB calls swap_insertion which swaps the

> "inserted" state of the two locations.  GDB finally returns through

> the call stack and leaves delete_single_step_breakpoints.  After this

> GDB continues with its normal "stopping" process, as part of this

> stopping process GDB removes all the breakpoints from the target.  Due

> to the swap the original hardware breakpoint location is removed.

>

> The problem is that GDB inserted the software single step breakpoint

> as a software breakpoint, and then, thanks to the swap, tries to

> remove it as a hardware breakpoint.  This will leave the target in an

> inconsistent state, and as in the original bug report (PR gdb/25741),

> could result in the target throwing an error.

>

> The solution for all this is to have two breakpoint locations of

> different types (hardware breakpoint vs software breakpoint) not

> match.  The result of this is that GDB will no longer swap the

> inserted status of the two breakpoints.  The original call to

> update_global_location_list (after deleting the software single step

> breakpoint) will at this point trigger a call to remove the

> breakpoint, something which will be done based on the type of that

> location.  Later GDB will see that the original hardware breakpoint is

> already not inserted, and so will leave it alone.

>

> This patch was original proposed by Keno Fischer here:

>

>   https://sourceware.org/pipermail/gdb-patches/2020-April/167202.html

>

> gdb/ChangeLog:

>

>         PR gdb/25741

>         * breakpoint.c (breakpoint_locations_match): Compare location

>         types.

> ---

>  gdb/ChangeLog    | 6 ++++++

>  gdb/breakpoint.c | 3 ++-

>  2 files changed, 8 insertions(+), 1 deletion(-)

>

> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c

> index e49025461ba..2ab40a8e40a 100644

> --- a/gdb/breakpoint.c

> +++ b/gdb/breakpoint.c

> @@ -6838,7 +6838,8 @@ breakpoint_locations_match (struct bp_location *loc1,

>      /* We compare bp_location.length in order to cover ranged

> breakpoints.  */

>      return (breakpoint_address_match (loc1->pspace->aspace, loc1->address,

>                                      loc2->pspace->aspace, loc2->address)

> -           && loc1->length == loc2->length);

> +           && loc1->length == loc2->length

> +           && loc1->loc_type == loc2->loc_type);

>  }

>

>  static void

> --

> 2.25.2

>

>
Victor Collod via Gdb-patches April 14, 2020, 7:17 p.m. | #5
On 4/14/20 5:00 PM, Andrew Burgess wrote:

> The other slight issue I had with the patch was that based on the

> original description I still had to go and figure out the exact

> conditions that would trigger this bug.  I believe that to trigger

> this you need a h/w breakpoint placed on an instruction that loops to

> itself, I don't see how else this could happen.

> 

> I took a crack at writing a more detailed break down of the conditions

> that trigger this bug.

> 


> I'm still running this through testing here, but I'd be interested to

> hear if you think the fuller description is helpful.


It is.

> From d2b719b0f4857461064ed7b1da744a01b2ad2c6d Mon Sep 17 00:00:00 2001

> From: Andrew Burgess <andrew.burgess@embecosm.com>

> Date: Tue, 14 Apr 2020 16:32:02 +0100

> Subject: [PATCH] gdb: ensure location types match before swapping locations

> 

> In the following conditions:

> 

>   - A target with hardware breakpoints available, and

>   - A target that uses software single stepping,

>   - An instruction at ADDRESS loops back to itself,

> 

> Now consider the following steps:

> 

>   1. The user places a hardware breakpoint at ADDRESS (an instruction

>   that loops to itself),

> 

>   2. The inferior runs and hits the breakpoint from #1,

> 

>   3. The user tells GDB to 'continue'.

> 

> In #3 when the user tells GDB to continue, GDB first disables the

> hardware breakpoint at ADDRESS, and then inserts a software single

> step breakpoint at ADDRESS.  The original user created breakpoint was

> a hardware breakpoint, while the single step breakpoint will be a

> software breakpoint.

> 

> GDB continues and immediately hits the software single step

> breakpoint.

> 

> GDB then deletes the software single step breakpoint by calling

> delete_single_step_breakpoints, which eventually calls

> delete_breakpoint, which, once the breakpoint (and its locations) are

> deleted, calls update_global_location_list.

> 

> During update_global_location_list GDB spots that we have an old

> location (the software single step breakpoint location) that is

> inserted, but being deleted, and a location at the same address which

> we are keeping, but which is not currently inserted (the original

> hardware breakpoint location), GDB then calls

> breakpoint_locations_match on these two locations.  Currently the

> locations do match, and so GDB calls swap_insertion which swaps the

> "inserted" state of the two locations.  GDB finally returns through

> the call stack and leaves delete_single_step_breakpoints.  After this

> GDB continues with its normal "stopping" process, as part of this

> stopping process GDB removes all the breakpoints from the target.  Due

> to the swap the original hardware breakpoint location is removed.

> 

> The problem is that GDB inserted the software single step breakpoint

> as a software breakpoint, and then, thanks to the swap, tries to

> remove it as a hardware breakpoint.  This will leave the target in an

> inconsistent state, and as in the original bug report (PR gdb/25741),

> could result in the target throwing an error.

> 

> The solution for all this is to have two breakpoint locations of

> different types (hardware breakpoint vs software breakpoint) not

> match.  The result of this is that GDB will no longer swap the

> inserted status of the two breakpoints.  The original call to

> update_global_location_list (after deleting the software single step

> breakpoint) will at this point trigger a call to remove the

> breakpoint, something which will be done based on the type of that

> location.  Later GDB will see that the original hardware breakpoint is

> already not inserted, and so will leave it alone.

> 

> This patch was original proposed by Keno Fischer here:

> 

>   https://sourceware.org/pipermail/gdb-patches/2020-April/167202.html

> 

> gdb/ChangeLog:

> 

> 	PR gdb/25741

> 	* breakpoint.c (breakpoint_locations_match): Compare location

> 	types.


This seems right at first blush, though there are some things
that we need to look at.  Here are 3 cases that found while
looking for breakpoint_locations_match callers:

#1

E.g., with this, GDB will now install both a hw breakpoint location
and a software location at the same address.  E.g., a contrived case
to see it happen would be, with:

 (gdb) set breakpoint always-inserted on
 (gdb) set debug remote 1

before:

 (gdb) hbreak main
 Sending packet: $m400736,1#fe...Packet received: 48
 Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.
 Sending packet: $Z1,400736,1#48...Packet received: OK
 (gdb) b main
 Note: breakpoint 1 also set at pc 0x400736.
 Sending packet: $m400736,1#fe...Packet received: 48
 Breakpoint 2 at 0x400736: file threads.c, line 57.
 Sending packet: $Z1,400736,1#48...Packet received: OK

after:

 (gdb) hbreak main
 Sending packet: $m400736,1#fe...Packet received: 48
 Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.
 Sending packet: $Z1,400736,1#48...Packet received: OK
 (gdb) break main
 Note: breakpoint 1 also set at pc 0x400736.
 Sending packet: $m400736,1#fe...Packet received: 48
 Breakpoint 2 at 0x400736: file threads.c, line 57.
 Sending packet: $Z1,400736,1#48...Packet received: OK
 Sending packet: $Z0,400736,1#47...Packet received: OK

This is likely not to cause a problem, but it's worth it to consider.

#2

Another thing to consider is the automatic hardware breakpoints
support.  See insert_bp_location.  That means that breakpoint
locations can start as software breakpoints but still end up
inserted as hw breakpoints.  Similarly to #1 above, without the patch,
gdb is considering those locations as duplicates, and after the
patch it no longer will.  So we may end up with multiple insertions
for the same location.  Again, similar to #1.

#3

I suspect this the (automatic hardware breakpoints) can interfere with
e.g., update_breakpoint_locations 's logic of matching old locations
with new locations, in the have_ambiguous_names case.  I.e., after
insertion the locations are hw breakpoint locations, but after
a breakpoint re-set, the new locations will be software breakpoint
locations until they try to be inserted.  Before the patch, the
disabled state will still be carried over.  After the patch, they won't.
I think.

Maybe we need to explicitly consider the case in
update_breakpoint_locations.

Thanks,
Pedro Alves
Andrew Burgess April 15, 2020, 8:46 p.m. | #6
* Pedro Alves <palves@redhat.com> [2020-04-14 20:17:00 +0100]:

> On 4/14/20 5:00 PM, Andrew Burgess wrote:

> 

> > The other slight issue I had with the patch was that based on the

> > original description I still had to go and figure out the exact

> > conditions that would trigger this bug.  I believe that to trigger

> > this you need a h/w breakpoint placed on an instruction that loops to

> > itself, I don't see how else this could happen.

> > 

> > I took a crack at writing a more detailed break down of the conditions

> > that trigger this bug.

> > 

> 

> > I'm still running this through testing here, but I'd be interested to

> > hear if you think the fuller description is helpful.

> 

> It is.

> 

> > From d2b719b0f4857461064ed7b1da744a01b2ad2c6d Mon Sep 17 00:00:00 2001

> > From: Andrew Burgess <andrew.burgess@embecosm.com>

> > Date: Tue, 14 Apr 2020 16:32:02 +0100

> > Subject: [PATCH] gdb: ensure location types match before swapping locations

> > 

> > In the following conditions:

> > 

> >   - A target with hardware breakpoints available, and

> >   - A target that uses software single stepping,

> >   - An instruction at ADDRESS loops back to itself,

> > 

> > Now consider the following steps:

> > 

> >   1. The user places a hardware breakpoint at ADDRESS (an instruction

> >   that loops to itself),

> > 

> >   2. The inferior runs and hits the breakpoint from #1,

> > 

> >   3. The user tells GDB to 'continue'.

> > 

> > In #3 when the user tells GDB to continue, GDB first disables the

> > hardware breakpoint at ADDRESS, and then inserts a software single

> > step breakpoint at ADDRESS.  The original user created breakpoint was

> > a hardware breakpoint, while the single step breakpoint will be a

> > software breakpoint.

> > 

> > GDB continues and immediately hits the software single step

> > breakpoint.

> > 

> > GDB then deletes the software single step breakpoint by calling

> > delete_single_step_breakpoints, which eventually calls

> > delete_breakpoint, which, once the breakpoint (and its locations) are

> > deleted, calls update_global_location_list.

> > 

> > During update_global_location_list GDB spots that we have an old

> > location (the software single step breakpoint location) that is

> > inserted, but being deleted, and a location at the same address which

> > we are keeping, but which is not currently inserted (the original

> > hardware breakpoint location), GDB then calls

> > breakpoint_locations_match on these two locations.  Currently the

> > locations do match, and so GDB calls swap_insertion which swaps the

> > "inserted" state of the two locations.  GDB finally returns through

> > the call stack and leaves delete_single_step_breakpoints.  After this

> > GDB continues with its normal "stopping" process, as part of this

> > stopping process GDB removes all the breakpoints from the target.  Due

> > to the swap the original hardware breakpoint location is removed.

> > 

> > The problem is that GDB inserted the software single step breakpoint

> > as a software breakpoint, and then, thanks to the swap, tries to

> > remove it as a hardware breakpoint.  This will leave the target in an

> > inconsistent state, and as in the original bug report (PR gdb/25741),

> > could result in the target throwing an error.

> > 

> > The solution for all this is to have two breakpoint locations of

> > different types (hardware breakpoint vs software breakpoint) not

> > match.  The result of this is that GDB will no longer swap the

> > inserted status of the two breakpoints.  The original call to

> > update_global_location_list (after deleting the software single step

> > breakpoint) will at this point trigger a call to remove the

> > breakpoint, something which will be done based on the type of that

> > location.  Later GDB will see that the original hardware breakpoint is

> > already not inserted, and so will leave it alone.

> > 

> > This patch was original proposed by Keno Fischer here:

> > 

> >   https://sourceware.org/pipermail/gdb-patches/2020-April/167202.html

> > 

> > gdb/ChangeLog:

> > 

> > 	PR gdb/25741

> > 	* breakpoint.c (breakpoint_locations_match): Compare location

> > 	types.

> 

> This seems right at first blush, though there are some things

> that we need to look at.  Here are 3 cases that found while

> looking for breakpoint_locations_match callers:

> 

> #1

> 

> E.g., with this, GDB will now install both a hw breakpoint location

> and a software location at the same address.  E.g., a contrived case

> to see it happen would be, with:

> 

>  (gdb) set breakpoint always-inserted on

>  (gdb) set debug remote 1

> 

> before:

> 

>  (gdb) hbreak main

>  Sending packet: $m400736,1#fe...Packet received: 48

>  Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.

>  Sending packet: $Z1,400736,1#48...Packet received: OK

>  (gdb) b main

>  Note: breakpoint 1 also set at pc 0x400736.

>  Sending packet: $m400736,1#fe...Packet received: 48

>  Breakpoint 2 at 0x400736: file threads.c, line 57.

>  Sending packet: $Z1,400736,1#48...Packet received: OK

> 

> after:

> 

>  (gdb) hbreak main

>  Sending packet: $m400736,1#fe...Packet received: 48

>  Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.

>  Sending packet: $Z1,400736,1#48...Packet received: OK

>  (gdb) break main

>  Note: breakpoint 1 also set at pc 0x400736.

>  Sending packet: $m400736,1#fe...Packet received: 48

>  Breakpoint 2 at 0x400736: file threads.c, line 57.

>  Sending packet: $Z1,400736,1#48...Packet received: OK

>  Sending packet: $Z0,400736,1#47...Packet received: OK

> 

> This is likely not to cause a problem, but it's worth it to consider.

> 

> #2

> 

> Another thing to consider is the automatic hardware breakpoints

> support.  See insert_bp_location.  That means that breakpoint

> locations can start as software breakpoints but still end up

> inserted as hw breakpoints.  Similarly to #1 above, without the patch,

> gdb is considering those locations as duplicates, and after the

> patch it no longer will.  So we may end up with multiple insertions

> for the same location.  Again, similar to #1.

> 

> #3

> 

> I suspect this the (automatic hardware breakpoints) can interfere with

> e.g., update_breakpoint_locations 's logic of matching old locations

> with new locations, in the have_ambiguous_names case.  I.e., after

> insertion the locations are hw breakpoint locations, but after

> a breakpoint re-set, the new locations will be software breakpoint

> locations until they try to be inserted.  Before the patch, the

> disabled state will still be carried over.  After the patch, they won't.

> I think.

> 

> Maybe we need to explicitly consider the case in

> update_breakpoint_locations.


I'll take another look at this code, but it sounds almost like we
should split breakpoint_locations_match in two, we have:

  1. match (v1) - two breakpoints at the same address match regardless
  of type, this would be used when deciding if we should insert
  bp_location 1, 2, or both.

  2. match (v2) - when one breakpoint is already inserted, can another
  breakpoint "take over" its inserted status.

It's almost as though we want:

  bool
  breakpoint_locations_swapable ( .... )
  {
     return (breakpoint_locations_match ( .... )
             && loc1->type == loc2->type);
  }

Maybe?

I'll take another look at this when I can.

Thanks,
Andrew
Andrew Burgess April 17, 2020, 12:28 p.m. | #7
* Pedro Alves <palves@redhat.com> [2020-04-14 20:17:00 +0100]:

> On 4/14/20 5:00 PM, Andrew Burgess wrote:

> 

> > The other slight issue I had with the patch was that based on the

> > original description I still had to go and figure out the exact

> > conditions that would trigger this bug.  I believe that to trigger

> > this you need a h/w breakpoint placed on an instruction that loops to

> > itself, I don't see how else this could happen.

> > 

> > I took a crack at writing a more detailed break down of the conditions

> > that trigger this bug.

> > 

> 

> > I'm still running this through testing here, but I'd be interested to

> > hear if you think the fuller description is helpful.

> 

> It is.

> 

> > From d2b719b0f4857461064ed7b1da744a01b2ad2c6d Mon Sep 17 00:00:00 2001

> > From: Andrew Burgess <andrew.burgess@embecosm.com>

> > Date: Tue, 14 Apr 2020 16:32:02 +0100

> > Subject: [PATCH] gdb: ensure location types match before swapping locations

> > 

> > In the following conditions:

> > 

> >   - A target with hardware breakpoints available, and

> >   - A target that uses software single stepping,

> >   - An instruction at ADDRESS loops back to itself,

> > 

> > Now consider the following steps:

> > 

> >   1. The user places a hardware breakpoint at ADDRESS (an instruction

> >   that loops to itself),

> > 

> >   2. The inferior runs and hits the breakpoint from #1,

> > 

> >   3. The user tells GDB to 'continue'.

> > 

> > In #3 when the user tells GDB to continue, GDB first disables the

> > hardware breakpoint at ADDRESS, and then inserts a software single

> > step breakpoint at ADDRESS.  The original user created breakpoint was

> > a hardware breakpoint, while the single step breakpoint will be a

> > software breakpoint.

> > 

> > GDB continues and immediately hits the software single step

> > breakpoint.

> > 

> > GDB then deletes the software single step breakpoint by calling

> > delete_single_step_breakpoints, which eventually calls

> > delete_breakpoint, which, once the breakpoint (and its locations) are

> > deleted, calls update_global_location_list.

> > 

> > During update_global_location_list GDB spots that we have an old

> > location (the software single step breakpoint location) that is

> > inserted, but being deleted, and a location at the same address which

> > we are keeping, but which is not currently inserted (the original

> > hardware breakpoint location), GDB then calls

> > breakpoint_locations_match on these two locations.  Currently the

> > locations do match, and so GDB calls swap_insertion which swaps the

> > "inserted" state of the two locations.  GDB finally returns through

> > the call stack and leaves delete_single_step_breakpoints.  After this

> > GDB continues with its normal "stopping" process, as part of this

> > stopping process GDB removes all the breakpoints from the target.  Due

> > to the swap the original hardware breakpoint location is removed.

> > 

> > The problem is that GDB inserted the software single step breakpoint

> > as a software breakpoint, and then, thanks to the swap, tries to

> > remove it as a hardware breakpoint.  This will leave the target in an

> > inconsistent state, and as in the original bug report (PR gdb/25741),

> > could result in the target throwing an error.

> > 

> > The solution for all this is to have two breakpoint locations of

> > different types (hardware breakpoint vs software breakpoint) not

> > match.  The result of this is that GDB will no longer swap the

> > inserted status of the two breakpoints.  The original call to

> > update_global_location_list (after deleting the software single step

> > breakpoint) will at this point trigger a call to remove the

> > breakpoint, something which will be done based on the type of that

> > location.  Later GDB will see that the original hardware breakpoint is

> > already not inserted, and so will leave it alone.

> > 

> > This patch was original proposed by Keno Fischer here:

> > 

> >   https://sourceware.org/pipermail/gdb-patches/2020-April/167202.html

> > 

> > gdb/ChangeLog:

> > 

> > 	PR gdb/25741

> > 	* breakpoint.c (breakpoint_locations_match): Compare location

> > 	types.

> 

> This seems right at first blush, though there are some things

> that we need to look at.  Here are 3 cases that found while

> looking for breakpoint_locations_match callers:

> 

> #1

> 

> E.g., with this, GDB will now install both a hw breakpoint location

> and a software location at the same address.  E.g., a contrived case

> to see it happen would be, with:

> 

>  (gdb) set breakpoint always-inserted on

>  (gdb) set debug remote 1

> 

> before:

> 

>  (gdb) hbreak main

>  Sending packet: $m400736,1#fe...Packet received: 48

>  Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.

>  Sending packet: $Z1,400736,1#48...Packet received: OK

>  (gdb) b main

>  Note: breakpoint 1 also set at pc 0x400736.

>  Sending packet: $m400736,1#fe...Packet received: 48

>  Breakpoint 2 at 0x400736: file threads.c, line 57.

>  Sending packet: $Z1,400736,1#48...Packet received: OK

> 

> after:

> 

>  (gdb) hbreak main

>  Sending packet: $m400736,1#fe...Packet received: 48

>  Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.

>  Sending packet: $Z1,400736,1#48...Packet received: OK

>  (gdb) break main

>  Note: breakpoint 1 also set at pc 0x400736.

>  Sending packet: $m400736,1#fe...Packet received: 48

>  Breakpoint 2 at 0x400736: file threads.c, line 57.

>  Sending packet: $Z1,400736,1#48...Packet received: OK

>  Sending packet: $Z0,400736,1#47...Packet received: OK


While working on a revised patch I tried to reproduce this, and it's
most odd.  Notice that both before and after you have two Z1 packets,
you're inserting the hardware breakpoint twice with no intermediate
removal.

I don't see this behaviour, even on an unmodified GDB, and I know some
remove targets, for example OpenOCD, will sanity check against such
double insertions and throw an error.

Just though this was worth mentioning.

Anyway, here's a revised patch.  I've moved the location of the type
check so it is only done now for the swapping case.  This should
resolve all the concerns you raised, while still addressing the
original issue.  I updated the commit message a bit too.

What do you think?

Thanks,
Andrew

---

commit 10d62976088600ee39b9b828bae93f82a44a2974
Author: Andrew Burgess <andrew.burgess@embecosm.com>
Date:   Tue Apr 14 16:32:02 2020 +0100

    gdb: ensure location types match before swapping locations
    
    In the following conditions:
    
      - A target with hardware breakpoints available, and
      - A target that uses software single stepping,
      - An instruction at ADDRESS loops back to itself,
    
    Now consider the following steps:
    
      1. The user places a hardware breakpoint at ADDRESS (an instruction
      that loops to itself),
    
      2. The inferior runs and hits the breakpoint at ADDRESS,
    
      3. The user tells GDB to 'continue'.
    
    In #3 when the user tells GDB to continue, GDB first disables the
    hardware breakpoint at ADDRESS, and then inserts a software single
    step breakpoint at ADDRESS.  The original user created breakpoint was
    a hardware breakpoint, while the single step breakpoint will be a
    software breakpoint.
    
    GDB continues and immediately hits the software single step
    breakpoint.
    
    GDB then deletes the software single step breakpoint by calling
    delete_single_step_breakpoints, which eventually calls
    delete_breakpoint, which, once the breakpoint (and its locations) are
    deleted, calls update_global_location_list.
    
    During update_global_location_list GDB spots that we have an old
    location (the software single step breakpoint location) that is
    inserted, but being deleted, and a location (the original hardware
    breakpoint) at the same address which we are keeping, but which is not
    currently inserted, GDB then calls breakpoint_locations_match on these
    two locations.
    
    Currently the locations do match, and so GDB calls swap_insertion
    which swaps the "inserted" state of the two locations.  The user
    created hardware breakpoint is marked as inserted, while the GDB
    internal software single step breakpoint is now marked as not
    inserted.  After this GDB returns through the call stack and leaves
    delete_single_step_breakpoints.
    
    After this GDB continues with its normal "stopping" process, as part
    of this stopping process GDB removes all the breakpoints from the
    target.  Due to the swap it is now the user created hardware
    breakpoint that is marked as inserted, so it is this breakpoint GDB
    tries to remove.
    
    The problem is that GDB inserted the software single step breakpoint
    as a software breakpoint, but is now trying to remove the hardware
    breakpoint.  The problem is removing a software breakpoint is very
    different to removing a hardware breakpoint, this could result is some
    undetected undefined behaviour, or as in the original bug report (PR
    gdb/25741), could result in the target throwing an error.
    
    This original patch proposed by Keno Fischer is here:
    
      https://sourceware.org/pipermail/gdb-patches/2020-April/167202.html
    
    This patch modified breakpoint_locations_match so that two breakpoints
    of different types would no longer match.  This resolves the immediate
    problem described above, but introduces some other issues.
    
    The breakpoint_locations_match function is also used to control
    whether two breakpoints places at the same address should both be
    inserted or not.
    
    Consider this (slightly contrived) use case:
    
      (gdb) set debug remote 1
      (gdb) set breakpoint  always-inserted on
      (gdb) hbreak  MyFunc0
      Sending packet: $me0000202,2#84...Packet received: 0000
      Hardware assisted breakpoint 3 at 0xe0000202: file test.c, line 288.
      Sending packet: $m80000188,1#63...Packet received: 00
      Sending packet: $Z1,e0000202,2#ce...Packet received: OK
      (gdb) break MyFunc0
      Note: breakpoint 3 also set at pc 0xe0000202.
      Sending packet: $me0000202,2#84...Packet received: 0000
      Breakpoint 4 at 0xe0000202: file test.c, line 288.
      Sending packet: $m80000188,1#63...Packet received: 00
    
    Notice after setting the hardware breakpoint we see the Z1 packet
    sent, but after setting the software breakpoint there's no Z0 packet,
    GDB sees that the locations match and doesn't make the second
    insertion.  If we modify breakpoint_locations_match then we would
    insert both a hardware _and_ software breakpoint.  This probably
    doesn't matter, but is not ideal, so a solution that doesn't do this
    would be better.
    
    The proposal here is to focus specifically on the case where we are
    considering swapping the inserted status of two breakpoint locations,
    and moves the location type check out of breakpoint_locations_match,
    and into the caller.  This resolves the original issue, while avoiding
    the double insertion problem.
    
    gdb/ChangeLog:
    
            PR gdb/25741
            * breakpoint.c (breakpoint_locations_match): Compare location
            types.

diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index e49025461ba..afd6459a634 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -11786,6 +11786,17 @@ update_global_location_list (enum ugll_insert_mode insert_mode)
 			      gdb_assert (is_hardware_watchpoint (loc2->owner));
 			      loc2->watchpoint_type = old_loc->watchpoint_type;
 			    }
+			  else if (is_breakpoint (old_loc->owner))
+			    {
+			      /* If we swap the inserted status of a
+				 hardware and software breakpoint then GDB
+				 will insert the breakpoint as one type,
+				 and later try to remove the breakpoint as
+				 the other type.  This is not good.  */
+			      gdb_assert (is_breakpoint (loc2->owner));
+			      if (old_loc->loc_type != loc2->loc_type)
+				continue;
+			    }
 
 			  /* loc2 is a duplicated location. We need to check
 			     if it should be inserted in case it will be
Victor Collod via Gdb-patches April 17, 2020, 5:22 p.m. | #8
On 4/17/20 1:28 PM, Andrew Burgess wrote:
> * Pedro Alves <palves@redhat.com> [2020-04-14 20:17:00 +0100]:

> 

>> On 4/14/20 5:00 PM, Andrew Burgess wrote:

>>

>>> The other slight issue I had with the patch was that based on the

>>> original description I still had to go and figure out the exact

>>> conditions that would trigger this bug.  I believe that to trigger

>>> this you need a h/w breakpoint placed on an instruction that loops to

>>> itself, I don't see how else this could happen.

>>>

>>> I took a crack at writing a more detailed break down of the conditions

>>> that trigger this bug.

>>>

>>

>>> I'm still running this through testing here, but I'd be interested to

>>> hear if you think the fuller description is helpful.

>>

>> It is.

>>

>>> From d2b719b0f4857461064ed7b1da744a01b2ad2c6d Mon Sep 17 00:00:00 2001

>>> From: Andrew Burgess <andrew.burgess@embecosm.com>

>>> Date: Tue, 14 Apr 2020 16:32:02 +0100

>>> Subject: [PATCH] gdb: ensure location types match before swapping locations

>>>

>>> In the following conditions:

>>>

>>>   - A target with hardware breakpoints available, and

>>>   - A target that uses software single stepping,

>>>   - An instruction at ADDRESS loops back to itself,

>>>

>>> Now consider the following steps:

>>>

>>>   1. The user places a hardware breakpoint at ADDRESS (an instruction

>>>   that loops to itself),

>>>

>>>   2. The inferior runs and hits the breakpoint from #1,

>>>

>>>   3. The user tells GDB to 'continue'.

>>>

>>> In #3 when the user tells GDB to continue, GDB first disables the

>>> hardware breakpoint at ADDRESS, and then inserts a software single

>>> step breakpoint at ADDRESS.  The original user created breakpoint was

>>> a hardware breakpoint, while the single step breakpoint will be a

>>> software breakpoint.

>>>

>>> GDB continues and immediately hits the software single step

>>> breakpoint.

>>>

>>> GDB then deletes the software single step breakpoint by calling

>>> delete_single_step_breakpoints, which eventually calls

>>> delete_breakpoint, which, once the breakpoint (and its locations) are

>>> deleted, calls update_global_location_list.

>>>

>>> During update_global_location_list GDB spots that we have an old

>>> location (the software single step breakpoint location) that is

>>> inserted, but being deleted, and a location at the same address which

>>> we are keeping, but which is not currently inserted (the original

>>> hardware breakpoint location), GDB then calls

>>> breakpoint_locations_match on these two locations.  Currently the

>>> locations do match, and so GDB calls swap_insertion which swaps the

>>> "inserted" state of the two locations.  GDB finally returns through

>>> the call stack and leaves delete_single_step_breakpoints.  After this

>>> GDB continues with its normal "stopping" process, as part of this

>>> stopping process GDB removes all the breakpoints from the target.  Due

>>> to the swap the original hardware breakpoint location is removed.

>>>

>>> The problem is that GDB inserted the software single step breakpoint

>>> as a software breakpoint, and then, thanks to the swap, tries to

>>> remove it as a hardware breakpoint.  This will leave the target in an

>>> inconsistent state, and as in the original bug report (PR gdb/25741),

>>> could result in the target throwing an error.

>>>

>>> The solution for all this is to have two breakpoint locations of

>>> different types (hardware breakpoint vs software breakpoint) not

>>> match.  The result of this is that GDB will no longer swap the

>>> inserted status of the two breakpoints.  The original call to

>>> update_global_location_list (after deleting the software single step

>>> breakpoint) will at this point trigger a call to remove the

>>> breakpoint, something which will be done based on the type of that

>>> location.  Later GDB will see that the original hardware breakpoint is

>>> already not inserted, and so will leave it alone.

>>>

>>> This patch was original proposed by Keno Fischer here:

>>>

>>>   https://sourceware.org/pipermail/gdb-patches/2020-April/167202.html

>>>

>>> gdb/ChangeLog:

>>>

>>> 	PR gdb/25741

>>> 	* breakpoint.c (breakpoint_locations_match): Compare location

>>> 	types.

>>

>> This seems right at first blush, though there are some things

>> that we need to look at.  Here are 3 cases that found while

>> looking for breakpoint_locations_match callers:

>>

>> #1

>>

>> E.g., with this, GDB will now install both a hw breakpoint location

>> and a software location at the same address.  E.g., a contrived case

>> to see it happen would be, with:

>>

>>  (gdb) set breakpoint always-inserted on

>>  (gdb) set debug remote 1

>>

>> before:

>>

>>  (gdb) hbreak main

>>  Sending packet: $m400736,1#fe...Packet received: 48

>>  Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.

>>  Sending packet: $Z1,400736,1#48...Packet received: OK

>>  (gdb) b main

>>  Note: breakpoint 1 also set at pc 0x400736.

>>  Sending packet: $m400736,1#fe...Packet received: 48

>>  Breakpoint 2 at 0x400736: file threads.c, line 57.

>>  Sending packet: $Z1,400736,1#48...Packet received: OK

>>

>> after:

>>

>>  (gdb) hbreak main

>>  Sending packet: $m400736,1#fe...Packet received: 48

>>  Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.

>>  Sending packet: $Z1,400736,1#48...Packet received: OK

>>  (gdb) break main

>>  Note: breakpoint 1 also set at pc 0x400736.

>>  Sending packet: $m400736,1#fe...Packet received: 48

>>  Breakpoint 2 at 0x400736: file threads.c, line 57.

>>  Sending packet: $Z1,400736,1#48...Packet received: OK

>>  Sending packet: $Z0,400736,1#47...Packet received: OK

> 

> While working on a revised patch I tried to reproduce this, and it's

> most odd.  Notice that both before and after you have two Z1 packets,

> you're inserting the hardware breakpoint twice with no intermediate

> removal.


That happens because when a software breakpoint location is created,
it is created with condition_changed == condition_modified.
Then, in update_global_location_list, we end up in 
force_breakpoint_reinsertion:

	  /* Check if this is a new/duplicated location or a duplicated
	     location that had its condition modified.  If so, we want to send
	     its condition to the target if evaluation of conditions is taking
	     place there.  */
	  if ((*loc2p)->condition_changed == condition_modified
	      && (last_addr != old_loc->address
		  || last_pspace_num != old_loc->pspace->num))
	    {
	      force_breakpoint_reinsertion (*loc2p);
	      last_pspace_num = old_loc->pspace->num;
	    }

force_breakpoint_reinsertion forces reinsertion of all breakpoint
locations at the same address.

> 

> I don't see this behaviour, even on an unmodified GDB, 


I was debugging against pristine master x86-64 gdbserver.

And:

 (gdb) show breakpoint condition-evaluation 
 Breakpoint condition evaluation mode is auto (currently target).

Maybe you tested against a different remote server which doesn't
support target-side condition evaluation?

> and I know some remove targets, for example OpenOCD, will sanity check against such

> double insertions and throw an error.


If they do that, they should be fixed.  z/Z packets are supposed to be
idempotent.  Double insertions are to be expected and should not cause
an error.

 @emph{Implementation notes: A remote target shall return an empty string
 for an unrecognized breakpoint or watchpoint packet @var{type}.  A
 remote target shall support either both or neither of a given
 @samp{Z@var{type}@dots{}} and @samp{z@var{type}@dots{}} packet pair.  To
 avoid potential problems with duplicate packets, the operations should
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 be implemented in an idempotent way.}
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We took advantage of that when we added support for target-side
breakpoint conditions.

> 

> Just though this was worth mentioning.

> 

> Anyway, here's a revised patch.  I've moved the location of the type

> check so it is only done now for the swapping case.  This should

> resolve all the concerns you raised, while still addressing the

> original issue.  I updated the commit message a bit too.

> 

> What do you think?


Let me try to find a bit of time to give this a think.

Thanks,
Pedro Alves
Victor Collod via Gdb-patches April 19, 2020, 6:21 p.m. | #9
On 4/17/20 1:28 PM, Andrew Burgess wrote:
> @@ -11786,6 +11786,17 @@ update_global_location_list (enum ugll_insert_mode insert_mode)

>  			      gdb_assert (is_hardware_watchpoint (loc2->owner));

>  			      loc2->watchpoint_type = old_loc->watchpoint_type;

>  			    }

> +			  else if (is_breakpoint (old_loc->owner))

> +			    {

> +			      /* If we swap the inserted status of a

> +				 hardware and software breakpoint then GDB

> +				 will insert the breakpoint as one type,

> +				 and later try to remove the breakpoint as

> +				 the other type.  This is not good.  */

> +			      gdb_assert (is_breakpoint (loc2->owner));

> +			      if (old_loc->loc_type != loc2->loc_type)

> +				continue;

> +			    }


Unfortunately this doesn't actually work correctly.

For example:

 $ gdb ~/gdb/tests/threads -ex "set breakpoint always-inserted on" -ex "tar rem :9999" -ex "set debug remote 1"
 ...
 (gdb) hbreak main
 Sending packet: $m400700,40#28...Packet received:  89e58b...
 Sending packet: $m400736,1#fe...Packet received: 48
 Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.
 Sending packet: $Z1,400736,1#48...Packet received: OK

 (gdb) b main
 Note: breakpoint 1 also set at pc 0x400736.
 Sending packet: $m400736,1#fe...Packet received: 48
 Breakpoint 2 at 0x400736: file threads.c, line 57.
 Sending packet: $Z1,400736,1#48...Packet received: OK

Note no Z0 was sent after "b main".  That's because the sw and
hw breakpoints are still considered duplicates.

But now, let's delete the hw breakpoint:

(gdb) del 1
Sending packet: $z1,400736,1#68...Packet received: OK

Note z1 (remove hw break) was sent, but Z0 (insert sw break) was not.
At this point, the target side has no breakpoint installed!
That is very wrong, because we have "set breakpoint always-inserted on".
If the target was running at this point (e.g., non-stop mode),
even with "set breakpoint always-inserted off", GDB should keep
breakpoints installed, otherwise threads would miss the breakpoint.
"set breakpoint always-inserted on" is just a handy way to emulate
that.

If we do any other operation that forces installing breakpoints,
then finally the Z0 is sent:

 (gdb) b main
 Note: breakpoint 2 also set at pc 0x400736.
 Sending packet: $m400736,1#fe...Packet received: 48
 Breakpoint 3 at 0x400736: file threads.c, line 57.
 Sending packet: $Z0,400736,1#47...Packet received: OK
 (gdb) 

I spent a good while yesterday trying to make this all work
with GDB still considering hw breakpoints and sw breakpoints
as duplicates.

We also need to handle my concern about insert_bp_location
changing the type of breakpoint locations from sw breakpoint
to hw breakpoint, _after_ the update_global_location_list
sorted out which breakpoints are duplicates of which.

The result is that GDB behaves like this:

 (gdb) hbreak main
 Sending packet: $m400700,40#28...Packet received: 89e58b...
 Sending packet: $m400736,1#fe...Packet received: 48
 Hardware assisted breakpoint 1 at 0x400736: file threads.c, line 57.
 Sending packet: $Z1,400736,1#48...Packet received: OK

 (gdb) b main
 Note: breakpoint 1 also set at pc 0x400736.
 Sending packet: $m400736,1#fe...Packet received: 48
 Breakpoint 2 at 0x400736: file threads.c, line 57.
 Sending packet: $Z1,400736,1#48...Packet received: OK

 (gdb) del 1
 Sending packet: $Z0,400736,1#47...Packet received: OK
 Sending packet: $z1,400736,1#68...Packet received: OK
 Sending packet: $Z0,400736,1#47...Packet received: OK

Note how after deleting breakpoint 1 (the hw break),
GDB first installs the sw breakpoint, and then deletes
the hw breakpoint.  This order is important to ensures that
at no moment could a thread miss breakpoint 2.  The final
Z0, is an artifact of the target-side condition handling.
We could get rid of that but I didn't try it.

It was only after all the effort that I realized that it's
pointless to try to make hw and sw breakpoint locations
match _if we always need to install the new sw break before
deleting the hw break_.  I mean, we go through all that
effort, but then there's always going to be a small window
where the remote target needs to be able to handle
a sw breakpoint and a hw breakpoint at the same address
anyway (e.g., handle target-side breakpoint conditions
correctly, handle target-side commands correctly, etc.)

So the patch below is the one I wrote yesterday.  It
wasn't finished, but was well advanced.  I'm posting
it for the archives and for discussion.  At this point,
I don't believe we should follow this patch's approach.
I'll send another reply with today's new patch.

From e2fe5095cc2e136dadbec373ebaed2938e282145 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>

Date: Sat, 18 Apr 2020 23:28:46 +0100
Subject: [PATCH] Handle hw and sw breakpoints at the same address

---
 gdb/breakpoint.c | 314 +++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 236 insertions(+), 78 deletions(-)

diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index e49025461ba..25ce3b9d217 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -2405,6 +2405,8 @@ breakpoint_kind (struct bp_location *bl, CORE_ADDR *addr)
     return gdbarch_breakpoint_kind_from_pc (bl->gdbarch, addr);
 }
 
+static void handle_automatic_hardware_breakpoints (bp_location *bl);
+
 /* Insert a low-level "breakpoint" of some type.  BL is the breakpoint
    location.  Any error messages are printed to TMP_ERROR_STREAM; and
    DISABLED_BREAKS, and HW_BREAKPOINT_ERROR are used to report problems.
@@ -2454,69 +2456,6 @@ insert_bp_location (struct bp_location *bl,
   if (bl->loc_type == bp_loc_software_breakpoint
       || bl->loc_type == bp_loc_hardware_breakpoint)
     {
-      if (bl->owner->type != bp_hardware_breakpoint)
-	{
-	  /* If the explicitly specified breakpoint type
-	     is not hardware breakpoint, check the memory map to see
-	     if the breakpoint address is in read only memory or not.
-
-	     Two important cases are:
-	     - location type is not hardware breakpoint, memory
-	     is readonly.  We change the type of the location to
-	     hardware breakpoint.
-	     - location type is hardware breakpoint, memory is
-	     read-write.  This means we've previously made the
-	     location hardware one, but then the memory map changed,
-	     so we undo.
-	     
-	     When breakpoints are removed, remove_breakpoints will use
-	     location types we've just set here, the only possible
-	     problem is that memory map has changed during running
-	     program, but it's not going to work anyway with current
-	     gdb.  */
-	  struct mem_region *mr 
-	    = lookup_mem_region (bl->target_info.reqstd_address);
-	  
-	  if (mr)
-	    {
-	      if (automatic_hardware_breakpoints)
-		{
-		  enum bp_loc_type new_type;
-		  
-		  if (mr->attrib.mode != MEM_RW)
-		    new_type = bp_loc_hardware_breakpoint;
-		  else 
-		    new_type = bp_loc_software_breakpoint;
-		  
-		  if (new_type != bl->loc_type)
-		    {
-		      static int said = 0;
-
-		      bl->loc_type = new_type;
-		      if (!said)
-			{
-			  fprintf_filtered (gdb_stdout,
-					    _("Note: automatically using "
-					      "hardware breakpoints for "
-					      "read-only addresses.\n"));
-			  said = 1;
-			}
-		    }
-		}
-	      else if (bl->loc_type == bp_loc_software_breakpoint
-		       && mr->attrib.mode != MEM_RW)
-		{
-		  fprintf_unfiltered (tmp_error_stream,
-				      _("Cannot insert breakpoint %d.\n"
-					"Cannot set software breakpoint "
-					"at read-only address %s\n"),
-				      bl->owner->number,
-				      paddress (bl->gdbarch, bl->address));
-		  return 1;
-		}
-	    }
-	}
-        
       /* First check to see if we have to handle an overlay.  */
       if (overlay_debugging == ovly_off
 	  || bl->section == NULL
@@ -2830,7 +2769,11 @@ insert_breakpoints (void)
 
   /* Updating watchpoints creates new locations, so update the global
      location list.  Explicitly tell ugll to insert locations and
-     ignore breakpoints_always_inserted_mode.  */
+     ignore breakpoints_always_inserted_mode.  Also,
+     update_global_location_list tries to "upgrade" software
+     breakpoints to hardware breakpoints to handle "set breakpoint
+     auto-hw", so we need to call it even if we don't have new
+     locations.  */
   update_global_location_list (UGLL_INSERT);
 }
 
@@ -2905,6 +2848,22 @@ update_inserted_breakpoint_locations (void)
     }
 }
 
+static void
+breakpoint_error_stream (string_file *tmp_error_stream,
+			 int *hw_breakpoint_error,
+			 int *hw_bp_error_explained_already)
+{
+  /* If a hardware breakpoint or watchpoint was inserted, add a
+     message about possibly exhausted resources.  */
+  if (*hw_breakpoint_error && !*hw_bp_error_explained_already)
+    {
+      tmp_error_stream->printf ("Could not insert hardware breakpoints:\n\
+You may have requested too many hardware breakpoints/watchpoints.\n");
+    }
+  target_terminal::ours_for_output ();
+  error_stream (*tmp_error_stream);
+}
+
 /* Used when starting or continuing the program.  */
 
 static void
@@ -2991,17 +2950,9 @@ insert_breakpoint_locations (void)
     }
 
   if (error_flag)
-    {
-      /* If a hardware breakpoint or watchpoint was inserted, add a
-         message about possibly exhausted resources.  */
-      if (hw_breakpoint_error && !hw_bp_error_explained_already)
-	{
-	  tmp_error_stream.printf ("Could not insert hardware breakpoints:\n\
-You may have requested too many hardware breakpoints/watchpoints.\n");
-	}
-      target_terminal::ours_for_output ();
-      error_stream (tmp_error_stream);
-    }
+    breakpoint_error_stream (&tmp_error_stream,
+			     &hw_breakpoint_error,
+			     &hw_bp_error_explained_already);
 }
 
 /* Used when the program stops.
@@ -8515,8 +8466,93 @@ mention (struct breakpoint *b)
 }
 
 
+static int
+bp_location_number (bp_location *loc)
+{
+  int n = 1;
+  for (bp_location *l = loc->owner->loc; l != nullptr; l = l->next)
+    {
+      if (l == loc)
+	return n;
+      ++n;
+    }
+
+  gdb_assert_not_reached ("breakpoint location not found in owner breakpoint");
+}
+
 static bool bp_loc_is_permanent (struct bp_location *loc);
 
+static void
+handle_automatic_hardware_breakpoints (bp_location *bl)
+{
+  if (bl->loc_type == bp_loc_software_breakpoint
+      || bl->loc_type == bp_loc_hardware_breakpoint)
+    {
+      if (bl->owner->type != bp_hardware_breakpoint)
+	{
+	  /* If the explicitly specified breakpoint type is not
+	     hardware breakpoint, check the memory map to see if the
+	     breakpoint address is in read only memory or not.
+
+	     Two important cases are:
+	     - location type is not hardware breakpoint, memory is
+	     readonly.  We change the type of the location to hardware
+	     breakpoint.
+	     - location type is hardware breakpoint, memory is
+	     read-write.  This means we've previously made the
+	     location hardware one, but then the memory map changed,
+	     so we undo.
+
+	     When breakpoints are removed, remove_breakpoints will use
+	     location types we've just set here, the only possible
+	     problem is that memory map has changed during running
+	     program, but it's not going to work anyway with current
+	     gdb.  */
+	  struct mem_region *mr
+	    = lookup_mem_region (bl->target_info.reqstd_address);
+
+	  if (mr)
+	    {
+	      if (automatic_hardware_breakpoints)
+		{
+		  enum bp_loc_type new_type;
+
+		  if (mr->attrib.mode != MEM_RW)
+		    new_type = bp_loc_hardware_breakpoint;
+		  else
+		    new_type = bp_loc_software_breakpoint;
+
+		  if (new_type != bl->loc_type)
+		    {
+		      static int said = 0;
+
+		      bl->loc_type = new_type;
+		      if (!said)
+			{
+			  fprintf_filtered (gdb_stdout,
+					    _("Note: automatically using "
+					      "hardware breakpoints for "
+					      "read-only addresses.\n"));
+			  said = 1;
+			}
+		    }
+		}
+	      else if (bl->loc_type == bp_loc_software_breakpoint
+		       && mr->attrib.mode != MEM_RW)
+		{
+		  bl->enabled = false;
+
+		  warning (_("Breakpoint %d.%d disabled.\n"
+			    "Cannot set software breakpoint "
+			    "at read-only address %s\n"),
+			   bl->owner->number, bp_location_number (bl),
+			   paddress (bl->gdbarch, bl->address));
+		}
+	    }
+	}
+    }
+}
+
 static struct bp_location *
 add_location_to_breakpoint (struct breakpoint *b,
 			    const struct symtab_and_line *sal)
@@ -11451,6 +11487,29 @@ bp_location_is_less_than (const bp_location *a, const bp_location *b)
   if (a->permanent != b->permanent)
     return a->permanent > b->permanent;
 
+  /* Sort hardware breakpoint locations before software breakpoint
+     locations, in order to prefer inserting hardware breakpoint
+     locations.  */
+  auto type_order = [](bp_loc_type loc_type)
+    {
+      switch (loc_type)
+	{
+	case bp_loc_hardware_breakpoint:
+	return 0;
+	case bp_loc_software_breakpoint:
+	return 1;
+	case bp_loc_hardware_watchpoint:
+	return 2;
+	case bp_loc_other:
+	return 3;
+	}
+
+      gdb_assert_not_reached ("bad switch");
+    };
+
+  if (type_order (a->loc_type) < type_order (b->loc_type))
+    return true;
+
   /* Make the internal GDB representation stable across GDB runs
      where A and B memory inside GDB can differ.  Breakpoint locations of
      the same type at the same address can be sorted in arbitrary order.  */
@@ -11625,6 +11684,7 @@ force_breakpoint_reinsertion (struct bp_location *bl)
       loc->cond_bytecode.reset ();
     }
 }
+
 /* Called whether new breakpoints are created, or existing breakpoints
    deleted, to update the global location list and recompute which
    locations are duplicate of which.
@@ -11654,6 +11714,17 @@ update_global_location_list (enum ugll_insert_mode insert_mode)
   struct bp_location *awp_loc_first; /* access watchpoint */
   struct bp_location *rwp_loc_first; /* read watchpoint */
 
+  /* Used for the case we need to insert a sw breakpoint to replace a
+     hw breakpoint, and vice versa.  */
+  bool error_flag = false;
+  int disabled_breaks = 0;
+  int hw_breakpoint_error = 0;
+  int hw_bp_error_explained_already = 0;
+  string_file tmp_error_stream;
+  /* Explicitly mark the warning -- this will only
+     be printed if there was an error.  */
+  tmp_error_stream.puts ("Warning:\n");
+
   /* Saved former bp_locations array which we compare against the newly
      built bp_locations from the current state of ALL_BREAKPOINTS.  */
   struct bp_location **old_locp;
@@ -11673,6 +11744,20 @@ update_global_location_list (enum ugll_insert_mode insert_mode)
   ALL_BREAKPOINTS (b)
     for (loc = b->loc; loc; loc = loc->next)
       *locp++ = loc;
+
+  /* See if we need to "upgrade" a software breakpoint to a hardware
+     breakpoint.  Do this before deciding whether locations are
+     duplicates.  Also do this before sorting because sorting order
+     depends on location type.  */
+  for (locp = bp_locations;
+       locp < bp_locations + bp_locations_count;
+       locp++)
+    {
+      loc = *locp;
+      if (!loc->inserted)
+	handle_automatic_hardware_breakpoints (loc);
+    }
+
   std::sort (bp_locations, bp_locations + bp_locations_count,
 	     bp_location_is_less_than);
 
@@ -11769,6 +11854,8 @@ update_global_location_list (enum ugll_insert_mode insert_mode)
 	      /* OLD_LOC comes from existing struct breakpoint.  */
 	      if (bl_address_is_meaningful (old_loc))
 		{
+		  bp_location *replacement_loc = nullptr;
+
 		  for (loc2p = locp;
 		       (loc2p < bp_locations + bp_locations_count
 			&& (*loc2p)->address == old_loc->address);
@@ -11776,6 +11863,9 @@ update_global_location_list (enum ugll_insert_mode insert_mode)
 		    {
 		      struct bp_location *loc2 = *loc2p;
 
+		      if (loc2 == old_loc)
+			continue;
+
 		      if (breakpoint_locations_match (loc2, old_loc))
 			{
 			  /* Read watchpoint locations are switched to
@@ -11786,12 +11876,25 @@ update_global_location_list (enum ugll_insert_mode insert_mode)
 			      gdb_assert (is_hardware_watchpoint (loc2->owner));
 			      loc2->watchpoint_type = old_loc->watchpoint_type;
 			    }
+			  else if ((old_loc->loc_type == bp_loc_hardware_breakpoint
+				    || old_loc->loc_type == bp_loc_software_breakpoint)
+				   && loc2->loc_type != old_loc->loc_type)
+			    {
+			      /* A hardware breakpoint is being
+				 removed.  Don't consider software
+				 breakpoint locations as duplicates,
+				 to avoid consuming hardware resources
+				 unnecessarily without user
+				 request.  */
+			      if (unduplicated_should_be_inserted (loc2))
+				replacement_loc = loc2;
+			      continue;
+			    }
 
 			  /* loc2 is a duplicated location. We need to check
 			     if it should be inserted in case it will be
 			     unduplicated.  */
-			  if (loc2 != old_loc
-			      && unduplicated_should_be_inserted (loc2))
+			  if (unduplicated_should_be_inserted (loc2))
 			    {
 			      swap_insertion (old_loc, loc2);
 			      keep_in_target = 1;
@@ -11799,6 +11902,29 @@ update_global_location_list (enum ugll_insert_mode insert_mode)
 			    }
 			}
 		    }
+
+		  /* If we're deleting a harware breakpoint, and we
+		     have a replacement software breakpoint to install
+		     (or vice versa), we must install the new one
+		     before deleting the old one, so that the inferior
+		     doesn't miss a breakpoint in case it is running
+		     right now.  */
+		  if (!keep_in_target && replacement_loc != nullptr)
+		    {
+		      scoped_restore_current_pspace_and_thread restore_pspace_thread;
+
+		      switch_to_program_space_and_thread (replacement_loc->pspace);
+
+		      replacement_loc->duplicate = false;
+
+		      int val
+			= insert_bp_location (replacement_loc, &tmp_error_stream,
+					      &disabled_breaks,
+					      &hw_breakpoint_error,
+					      &hw_bp_error_explained_already);
+		      if (val)
+			error_flag = val;
+		    }
 		}
 	    }
 
@@ -11959,13 +12085,45 @@ update_global_location_list (enum ugll_insert_mode insert_mode)
 	 is not duplicated, and is the inserted one.
 	 All following are marked as duplicated, and are not inserted.  */
       if (loc->inserted)
-	swap_insertion (loc, *loc_first_p);
+	{
+	  if (loc->loc_type == bp_loc_software_breakpoint
+	      && (*loc_first_p)->loc_type == bp_loc_hardware_breakpoint)
+	    {
+	      /* Insert the replacement breakpoint first, so that
+		 there's never a time the inferior could miss a
+		 breakpoint, in case the inferior is running right
+		 now.  */
+	      scoped_restore_current_pspace_and_thread restore_pspace_thread;
+
+	      switch_to_program_space_and_thread ((*loc_first_p)->pspace);
+
+	      int val
+		= insert_bp_location ((*loc_first_p), &tmp_error_stream,
+				      &disabled_breaks,
+				      &hw_breakpoint_error,
+				      &hw_bp_error_explained_already);
+	      if (val)
+		error_flag = val;
+
+	      remove_breakpoint (loc);
+	    }
+	  else
+	    swap_insertion (loc, *loc_first_p);
+	}
       loc->duplicate = 1;
 
       /* Clear the condition modification flag.  */
       loc->condition_changed = condition_unchanged;
     }
 
+  /* Should this be propagated down to insert_breakpoint_locations if
+     it's going to be called, so that we have a single place that
+     throws?  */
+  if (error_flag)
+    breakpoint_error_stream (&tmp_error_stream,
+			     &hw_breakpoint_error,
+			     &hw_bp_error_explained_already);
+
   if (insert_mode == UGLL_INSERT || breakpoints_should_be_inserted_now ())
     {
       if (insert_mode != UGLL_DONT_INSERT)

base-commit: 8e4979ac1ea78147ecbcbf81af5e946873dda079
-- 
2.14.5

Patch

diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index e49025461b..582dae1946 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -6838,7 +6838,7 @@  breakpoint_locations_match (struct bp_location *loc1,
     /* We compare bp_location.length in order to cover ranged breakpoints.  */
     return (breakpoint_address_match (loc1->pspace->aspace, loc1->address,
 				     loc2->pspace->aspace, loc2->address)
-	    && loc1->length == loc2->length);
+	    && loc1->length == loc2->length && loc1->loc_type == loc2->loc_type);
 }
 
 static void