[28/28] Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412)

Message ID 20200414175434.8047-29-palves@redhat.com
State New
Headers show
Series
  • Decouple inferior_ptid/inferior_thread(); dup ptids in thread list (PR/25412)
Related show

Commit Message

Shahab Vahedi via Gdb-patches April 14, 2020, 5:54 p.m.
In PR/25412, Simon noticed that after the multi-target series, the
tid-reuse.exp testcase manages to create a duplicate thread in the
thread list.  Or rather, two threads with the same PTID.

add_thread_silent has code in place to detect the case of a new thread
reusing some older thread's ptid, but it doesn't work correctly
anymore when the old thread is NOT the current thread and it has a
refcount higher than 0.  Either condition prevents a thread from being
deleted, but the refcount case wasn't being considered.  I think the
reason that case wasn't considered is that that code predates
thread_info refcounting.  Back when it was originally written,
delete_thread always deleted the thread.

That add_thread_silent code in question has some now-unnecessary
warts, BTW.  For instance, this:

  /* Make switch_to_thread not read from the thread.  */
  new_thr->state = THREAD_EXITED;

... used to be required because switch_to_thread would update
'stop_pc' otherwise.  I.e., it would read registers from an exited
thread otherwise.  switch_to_thread no longer reads the stop_pc, since:

  commit f2ffa92bbce9dd5fbedc138ac2a3bc8a88327d09
  Author:     Pedro Alves <palves@redhat.com>
  AuthorDate: Thu Jun 28 20:18:24 2018 +0100

      gdb: Eliminate the 'stop_pc' global

Also, if the ptid of the now-gone current thread is reused, we
currently return from add_thread_silent with the current thread
pointing at the _new_ thread.  Either pointing at the old thread, or
at no thread selected would be reasonable.  But pointing at an
unrelated thread (the new thread that happens to reuse the ptid) is
just broken.  Seems like I was the one who wrote it like that but I
have no clue why, FWIW.

Currently, an exited thread kept in the thread list still holds its
original ptid.  The idea was that we need the ptid to be able to
temporarily switch to another thread and then switch back to the
original thread, because thread switching is really inferior_ptid
switching.  Switching back to the original thread requires a ptid
lookup.

Now, in order to avoid exited threads with the same ptid as a live
thread in the same thread list, one thing I considered (and tried) was
to change an exited thread's ptid to minus_one_ptid.  However, with
that, there's a case that we won't handle well, which is if we end up
with more than one exited thread in the list, since then all exited
threads will all have the same ptid.  Since inferior_thread() relies
on inferior_ptid, may well return the wrong thread.

My next attempt to address this, was to switch an exited thread's ptid
to a globally unique "exited" ptid, which is a ptid with pid == -1 and
tid == 'the thread's global GDB thread number'.  Note that GDB assumes
that the GDB global thread number is monotonically increasing and
doesn't wrap around.  (We should probably make GDB thread numbers
64-bit to prevent that happening in practice; they're currently signed
32-bit.)  This attempt went a long way, but still ran into a number of
issues.  It was a major hack too, obviously.

My next attempt is the one that I'm proposing, which is to bite the
bullet and break the connection between inferior_ptid and
inferior_thread(), aka the current thread.  I.e., make the current
thread be a global thread_info pointer that is written to directly by
switch_to_thread, etc., and making inferior_thread() return that
pointer, instead of having inferior_thread() lookup up the
inferior_ptid thread, by ptid_t.  You can look at this as a
continuation of the effort of using more thread_info pointers instead
of ptids when possible.

By making the current thread a global thread_info pointer, we can make
switch_to_thread simply write to the global thread pointer, which
makes scoped_restore_current_thread able to restore back to an exited
thread without relying on unrelyable ptid look ups.  I.e., this makes
it not a real problem to have more than one thread with the same ptid
in the thread list.  There will always be only one live thread with a
given ptid, so code that looks up a live thread by ptid will always be
able to find the right one.

This change required auditing the whole codebase for places where we
were writing to inferior_ptid directly to change the current thread,
and change them to use switch_to_thread instead or one of its
siblings, because otherwise inferior_thread() would return a thread
unrelated to the changed-to inferior_ptid.  That was all (hopefully)
done in previous patches.

After this, inferior_ptid is mainly used by target backend code.  It
is also relied on by a number of target methods.  E.g., the
target_resume interface and the memory reading routines -- we still
need it there because we need to be able to access memory off of
processes for which we don't have a corresponding inferior/thread
object, like when handling forks.  Maybe we could pass down a context
explicitly to target_read_memory, etc.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <palves@redhat.com>

	* gdbthread.h (delete_thread, delete_thread_silent)
	(find_thread_ptid): Update comments.
	* thread.c (current_thread_): New global.
	(is_current_thread): Move higher, and reimplement.
	(inferior_thread): Reimplement.
	(set_thread_exited): Use bool.  Add assertions.
	(add_thread_silent): Simplify thread-reuse handling by always
	calling delete_thread.
	(delete_thread): Remove intro comment.
	(find_thread_ptid): Skip exited threads.
	(switch_to_thread_no_regs): Write to current_thread_.
	(switch_to_no_thread): Check CURRENT_THREAD_ instead of
	INFERIOR_PTID.  Clear current_thread_.
---
 gdb/gdbthread.h | 17 +++++-----
 gdb/thread.c    | 97 ++++++++++++++++++++-------------------------------------
 2 files changed, 42 insertions(+), 72 deletions(-)

-- 
2.14.5

Comments

Simon Marchi April 16, 2020, 7:39 p.m. | #1
On 2020-04-14 1:54 p.m., Pedro Alves via Gdb-patches wrote:
> In PR/25412, Simon noticed that after the multi-target series, the

> tid-reuse.exp testcase manages to create a duplicate thread in the

> thread list.  Or rather, two threads with the same PTID.

> 

> add_thread_silent has code in place to detect the case of a new thread

> reusing some older thread's ptid, but it doesn't work correctly

> anymore when the old thread is NOT the current thread and it has a

> refcount higher than 0.  Either condition prevents a thread from being

> deleted, but the refcount case wasn't being considered.  I think the

> reason that case wasn't considered is that that code predates

> thread_info refcounting.  Back when it was originally written,

> delete_thread always deleted the thread.

> 

> That add_thread_silent code in question has some now-unnecessary

> warts, BTW.  For instance, this:

> 

>   /* Make switch_to_thread not read from the thread.  */

>   new_thr->state = THREAD_EXITED;

> 

> ... used to be required because switch_to_thread would update

> 'stop_pc' otherwise.  I.e., it would read registers from an exited

> thread otherwise.  switch_to_thread no longer reads the stop_pc, since:

> 

>   commit f2ffa92bbce9dd5fbedc138ac2a3bc8a88327d09

>   Author:     Pedro Alves <palves@redhat.com>

>   AuthorDate: Thu Jun 28 20:18:24 2018 +0100

> 

>       gdb: Eliminate the 'stop_pc' global

> 

> Also, if the ptid of the now-gone current thread is reused, we

> currently return from add_thread_silent with the current thread

> pointing at the _new_ thread.  Either pointing at the old thread, or

> at no thread selected would be reasonable.  But pointing at an

> unrelated thread (the new thread that happens to reuse the ptid) is

> just broken.  Seems like I was the one who wrote it like that but I

> have no clue why, FWIW.

> 

> Currently, an exited thread kept in the thread list still holds its

> original ptid.  The idea was that we need the ptid to be able to

> temporarily switch to another thread and then switch back to the

> original thread, because thread switching is really inferior_ptid

> switching.  Switching back to the original thread requires a ptid

> lookup.

> 

> Now, in order to avoid exited threads with the same ptid as a live

> thread in the same thread list, one thing I considered (and tried) was

> to change an exited thread's ptid to minus_one_ptid.  However, with

> that, there's a case that we won't handle well, which is if we end up

> with more than one exited thread in the list, since then all exited

> threads will all have the same ptid.  Since inferior_thread() relies

> on inferior_ptid, may well return the wrong thread.

> 

> My next attempt to address this, was to switch an exited thread's ptid

> to a globally unique "exited" ptid, which is a ptid with pid == -1 and

> tid == 'the thread's global GDB thread number'.  Note that GDB assumes

> that the GDB global thread number is monotonically increasing and

> doesn't wrap around.  (We should probably make GDB thread numbers

> 64-bit to prevent that happening in practice; they're currently signed

> 32-bit.)  This attempt went a long way, but still ran into a number of

> issues.  It was a major hack too, obviously.

> 

> My next attempt is the one that I'm proposing, which is to bite the

> bullet and break the connection between inferior_ptid and

> inferior_thread(), aka the current thread.  I.e., make the current

> thread be a global thread_info pointer that is written to directly by

> switch_to_thread, etc., and making inferior_thread() return that

> pointer, instead of having inferior_thread() lookup up the

> inferior_ptid thread, by ptid_t.  You can look at this as a

> continuation of the effort of using more thread_info pointers instead

> of ptids when possible.

> 

> By making the current thread a global thread_info pointer, we can make

> switch_to_thread simply write to the global thread pointer, which

> makes scoped_restore_current_thread able to restore back to an exited

> thread without relying on unrelyable ptid look ups.  I.e., this makes

> it not a real problem to have more than one thread with the same ptid

> in the thread list.  There will always be only one live thread with a

> given ptid, so code that looks up a live thread by ptid will always be

> able to find the right one.

> 

> This change required auditing the whole codebase for places where we

> were writing to inferior_ptid directly to change the current thread,

> and change them to use switch_to_thread instead or one of its

> siblings, because otherwise inferior_thread() would return a thread

> unrelated to the changed-to inferior_ptid.  That was all (hopefully)

> done in previous patches.

> 

> After this, inferior_ptid is mainly used by target backend code.  It

> is also relied on by a number of target methods.  E.g., the

> target_resume interface and the memory reading routines -- we still

> need it there because we need to be able to access memory off of

> processes for which we don't have a corresponding inferior/thread

> object, like when handling forks.  Maybe we could pass down a context

> explicitly to target_read_memory, etc.

> 

> gdb/ChangeLog:

> yyyy-mm-dd  Pedro Alves  <palves@redhat.com>

> 

> 	* gdbthread.h (delete_thread, delete_thread_silent)

> 	(find_thread_ptid): Update comments.

> 	* thread.c (current_thread_): New global.

> 	(is_current_thread): Move higher, and reimplement.

> 	(inferior_thread): Reimplement.

> 	(set_thread_exited): Use bool.  Add assertions.

> 	(add_thread_silent): Simplify thread-reuse handling by always

> 	calling delete_thread.

> 	(delete_thread): Remove intro comment.

> 	(find_thread_ptid): Skip exited threads.

> 	(switch_to_thread_no_regs): Write to current_thread_.

> 	(switch_to_no_thread): Check CURRENT_THREAD_ instead of

> 	INFERIOR_PTID.  Clear current_thread_.


I checked the rest of the series, I didn't find any obvious problem.

I have a patch [1] that I might consider sending upstream some day, which replaces the
inferior thread lists with maps, using the ptid as the key.  This was made for targets
that have a lot of threads (like, thousands), where looking up a thread object from a
ptid would take a bit of time, and that would add up.  If there can be two threads in
the list with the same ptid, that won't work with a map using the ptid as the key.

So I was wondering, when we want to delete a thread but its refcount is not 0, does it
need to stay in the thread list?  What if we remove it from the list, and whatever
decrements its refcount to 0 deletes it?  Do you think that could work?

It's possible, however, that with your series, the number of times we look up a thread
from its ptid is reduced enough that it makes my patch not really useful.

Simon

[1] https://github.com/simark/binutils-gdb/commits/inferior-thread-map
Shahab Vahedi via Gdb-patches April 16, 2020, 8:12 p.m. | #2
On 4/16/20 8:39 PM, Simon Marchi wrote:

> I checked the rest of the series, I didn't find any obvious problem.


Thanks.

> 

> I have a patch [1] that I might consider sending upstream some day, which replaces the

> inferior thread lists with maps, using the ptid as the key.  This was made for targets

> that have a lot of threads (like, thousands), where looking up a thread object from a

> ptid would take a bit of time, and that would add up.  If there can be two threads in

> the list with the same ptid, that won't work with a map using the ptid as the key.

> 

> So I was wondering, when we want to delete a thread but its refcount is not 0, does it

> need to stay in the thread list?  What if we remove it from the list, and whatever

> decrements its refcount to 0 deletes it?  Do you think that could work?


I think it could.  That's something that I considered, as something that is made
possible with this patchset, because, before we needed it to stay in the list so
that the lookup inside inferior_thread(), but after the series the lookup is gone.
We'd also need to look out for places that want to walk all threads, instead of
just non-exited ones, but on a quick look, I didn't find any left that couldn't
be converted to walk to non-exited threads only.

Another thing that makes it possible is that "info threads" doesn't show the
exited threads in the list.  Even if we wanted to show the current thread
in the list if it's exited (which is something I considered), we could still
easily do that, though.  Currently, we just show the
"The current thread <Thread ID %s> has terminated.  See `help thread'"
message at the end of the thread list.

> 

> It's possible, however, that with your series, the number of times we look up a thread

> from its ptid is reduced enough that it makes my patch not really useful.

> 

> Simon

> 

> [1] https://github.com/simark/binutils-gdb/commits/inferior-thread-map


How are you making sure that iterating the threads walks them in
creation/insertion order instead of ptid_t order?

Thanks,
Pedro Alves
Simon Marchi April 16, 2020, 8:38 p.m. | #3
On 2020-04-16 4:12 p.m., Pedro Alves via Gdb-patches wrote:
> On 4/16/20 8:39 PM, Simon Marchi wrote:

> 

>> I checked the rest of the series, I didn't find any obvious problem.

> 

> Thanks.

> 

>>

>> I have a patch [1] that I might consider sending upstream some day, which replaces the

>> inferior thread lists with maps, using the ptid as the key.  This was made for targets

>> that have a lot of threads (like, thousands), where looking up a thread object from a

>> ptid would take a bit of time, and that would add up.  If there can be two threads in

>> the list with the same ptid, that won't work with a map using the ptid as the key.

>>

>> So I was wondering, when we want to delete a thread but its refcount is not 0, does it

>> need to stay in the thread list?  What if we remove it from the list, and whatever

>> decrements its refcount to 0 deletes it?  Do you think that could work?

> 

> I think it could.  That's something that I considered, as something that is made

> possible with this patchset, because, before we needed it to stay in the list so

> that the lookup inside inferior_thread(), but after the series the lookup is gone.

> We'd also need to look out for places that want to walk all threads, instead of

> just non-exited ones, but on a quick look, I didn't find any left that couldn't

> be converted to walk to non-exited threads only.

> 

> Another thing that makes it possible is that "info threads" doesn't show the

> exited threads in the list.  Even if we wanted to show the current thread

> in the list if it's exited (which is something I considered), we could still

> easily do that, though.  Currently, we just show the

> "The current thread <Thread ID %s> has terminated.  See `help thread'"

> message at the end of the thread list.


Indeed, I checked "info threads" before posting this, because it's one case I
know the deleted thread was referred to.

>> It's possible, however, that with your series, the number of times we look up a thread

>> from its ptid is reduced enough that it makes my patch not really useful.

>>

>> Simon

>>

>> [1] https://github.com/simark/binutils-gdb/commits/inferior-thread-map

> 

> How are you making sure that iterating the threads walks them in

> creation/insertion order instead of ptid_t order?


The only spot that I found the order mattered was precisely for "info threads".
There, I gather the list of threads to display, sort them by per_inf_num, and
print them.

https://github.com/simark/binutils-gdb/commit/25c67729996dc83912d34a5901016404b3638bc1#diff-5479a5443dca2232f8552dfb5b30ac3dR1104

Simon
Shahab Vahedi via Gdb-patches April 17, 2020, 10:29 a.m. | #4
On 4/16/20 9:38 PM, Simon Marchi wrote:
> On 2020-04-16 4:12 p.m., Pedro Alves via Gdb-patches wrote:


>> How are you making sure that iterating the threads walks them in

>> creation/insertion order instead of ptid_t order?

> 

> The only spot that I found the order mattered was precisely for "info threads".

> There, I gather the list of threads to display, sort them by per_inf_num, and

> print them.

> 

> https://github.com/simark/binutils-gdb/commit/25c67729996dc83912d34a5901016404b3638bc1#diff-5479a5443dca2232f8552dfb5b30ac3dR1104

> 


Hmm, OK.  I'm not 100% sure that's a good idea; it makes me a bit nervous
because I'm not sure if this has an impact on run control.  Something to
think about.  We could instead have the new ptid_t map for ptid_t look ups, but
still also maintain the linked list in order to walk the thread list.  (That
would also be a much simpler patch, I think.)

Thanks,
Pedro Alves
Simon Marchi April 17, 2020, 2:06 p.m. | #5
On 2020-04-17 6:29 a.m., Pedro Alves wrote:
> On 4/16/20 9:38 PM, Simon Marchi wrote:

>> On 2020-04-16 4:12 p.m., Pedro Alves via Gdb-patches wrote:

> 

>>> How are you making sure that iterating the threads walks them in

>>> creation/insertion order instead of ptid_t order?

>>

>> The only spot that I found the order mattered was precisely for "info threads".

>> There, I gather the list of threads to display, sort them by per_inf_num, and

>> print them.

>>

>> https://github.com/simark/binutils-gdb/commit/25c67729996dc83912d34a5901016404b3638bc1#diff-5479a5443dca2232f8552dfb5b30ac3dR1104

>>

> 

> Hmm, OK.  I'm not 100% sure that's a good idea; it makes me a bit nervous

> because I'm not sure if this has an impact on run control.  Something to

> think about.  We could instead have the new ptid_t map for ptid_t look ups, but

> still also maintain the linked list in order to walk the thread list.  (That

> would also be a much simpler patch, I think.)


I also thought about that as a backup solution.  However, the testsuite didn't show
any more regressions, so I stayed with just the map.

Simon
Shahab Vahedi via Gdb-patches April 17, 2020, 4:46 p.m. | #6
On 4/17/20 3:06 PM, Simon Marchi wrote:
> On 2020-04-17 6:29 a.m., Pedro Alves wrote:

>> On 4/16/20 9:38 PM, Simon Marchi wrote:

>>> On 2020-04-16 4:12 p.m., Pedro Alves via Gdb-patches wrote:

>>

>>>> How are you making sure that iterating the threads walks them in

>>>> creation/insertion order instead of ptid_t order?

>>>

>>> The only spot that I found the order mattered was precisely for "info threads".

>>> There, I gather the list of threads to display, sort them by per_inf_num, and

>>> print them.

>>>

>>> https://github.com/simark/binutils-gdb/commit/25c67729996dc83912d34a5901016404b3638bc1#diff-5479a5443dca2232f8552dfb5b30ac3dR1104

>>>

>>

>> Hmm, OK.  I'm not 100% sure that's a good idea; it makes me a bit nervous

>> because I'm not sure if this has an impact on run control.  Something to

>> think about.  We could instead have the new ptid_t map for ptid_t look ups, but

>> still also maintain the linked list in order to walk the thread list.  (That

>> would also be a much simpler patch, I think.)

> 

> I also thought about that as a backup solution.  However, the testsuite didn't show

> any more regressions, so I stayed with just the map.


Maybe.  Though, this is likely to make debugging gdb harder, both
interactive debugging ("p thread->next->next->state" kind of things), and
also infrun logs -- The order which gdb resumes, suspends, etc. threads
will change as threads appear/disappear.  Sticking to creation order is
likely to avoid headaches, IMO.

Thanks,
Pedro Alves
Tom Tromey April 17, 2020, 6:53 p.m. | #7
>>>>> "Pedro" == Pedro Alves via Gdb-patches <gdb-patches@sourceware.org> writes:


Pedro> My next attempt is the one that I'm proposing, which is to bite the
Pedro> bullet and break the connection between inferior_ptid and
Pedro> inferior_thread(), aka the current thread.  I.e., make the current
Pedro> thread be a global thread_info pointer that is written to directly by
Pedro> switch_to_thread, etc., and making inferior_thread() return that
Pedro> pointer, instead of having inferior_thread() lookup up the
Pedro> inferior_ptid thread, by ptid_t.

Does this mean that now the current thread and current inferior can get
out of sync?  Or that these can be out of sync with inferior_ptid?

This patch looks like it tries to avoid that when writing to the current
thread -- but in the cover letter you mentioned that there are other
assignments to inferior_ptid.

I wouldn't block the patches based on this but I'd like to understand
the direction.  I guess I'd prefer to remove globals and possibilities
for divergence if it's possible.

Tom
Shahab Vahedi via Gdb-patches June 18, 2020, 7:59 p.m. | #8
Hi Tom,

Sorry for the delay in responding.  I hate when I end up doing this.

On 4/17/20 7:53 PM, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves via Gdb-patches <gdb-patches@sourceware.org> writes:

> 

> Pedro> My next attempt is the one that I'm proposing, which is to bite the

> Pedro> bullet and break the connection between inferior_ptid and

> Pedro> inferior_thread(), aka the current thread.  I.e., make the current

> Pedro> thread be a global thread_info pointer that is written to directly by

> Pedro> switch_to_thread, etc., and making inferior_thread() return that

> Pedro> pointer, instead of having inferior_thread() lookup up the

> Pedro> inferior_ptid thread, by ptid_t.

> 

> Does this mean that now the current thread and current inferior can get

> out of sync?  


They already could, because the current thread was based on inferior_ptid,
while the current inferior is stored in a global "inferior *current_inferior_"
pointer.

> Or that these can be out of sync with inferior_ptid?


Yes, before, inferior_ptid could get out of sync with the current
inferior, but not with the current thread.  Afterwards, the current
"thread_info *" can also get out of sync with inferior_ptid.  That's
why most of the series eliminates writes to inferior_ptid directly.

Over time, inferior_ptid usages have been disappearing, replaced by
references to "thread_info *" or "inferior *" objects directly.  

I think that's much better than referring to objects via integers that may
be ambiguous between targets, and worrying about what target_ops owns what
in the ptid_t fields, how to extend ptid for different types of execution
objects, etc.  I think that ideally, in the GDB core, references to
inferior_ptid would virtually disappear.  I see this series as another step
in that direction.

> 

> This patch looks like it tries to avoid that when writing to the current

> thread -- but in the cover letter you mentioned that there are other

> assignments to inferior_ptid.


I think you are referring to this comment:

> "After this, inferior_ptid still exists, but it is mostly read-only and

> mainly used by target backend code.  It is also relied on by a number

> of target methods as a global input argument.  E.g., the target_resume

> interface and the memory reading routines -- we still need it there

> because we need to be able to access memory off of processes for which

> we don't have a corresponding inferior/thread object, like when

> handling forks.  Maybe we could pass down a context explicitly to

> target_read_memory, etc."


In the target_resume case, we always switch inferior_ptid using
switch_to_thread or similar, so not a big deal.

The worrying case I found is the memory access routines.  TBC, that's a case
where the connection between current thread and inferior_ptid was already
being broken previously.  Let me expand on the "handling forks" bit:

When we detach breakpoints from a fork child that we're about to
detach from due to "set detach-on-fork off", we need to uninstall memory
breakpoints from the child, because breakpoint instructions that were
inserted on the parent end up in the child as well, because the child's
address space is just a copy of the parent's.  So to do that, we switch
inferior_ptid to the fork child, and call the breakpoint removal target
method, so that it applies to the fork child.  Note that there are no
thread_info or inferior objects for that fork child in the thread/inferior
lists at all.  So if something in those code breakpoint removal paths (all
the way down the the memory xfer routines) calls inferior_thread() it would
already assert for not finding the thread in the list.  The patch series
doesn't change that.

> I wouldn't block the patches based on this but I'd like to understand

> the direction.  I guess I'd prefer to remove globals and possibilities

> for divergence if it's possible.


Me too, but it's just that inferior_ptid is so pervasive that it's
impossible to eliminate all in one go.

Thanks,
Pedro Alves
Andrew Burgess June 23, 2020, 1:37 p.m. | #9
* Pedro Alves via Gdb-patches <gdb-patches@sourceware.org> [2020-04-14 18:54:34 +0100]:

> In PR/25412, Simon noticed that after the multi-target series, the

> tid-reuse.exp testcase manages to create a duplicate thread in the

> thread list.  Or rather, two threads with the same PTID.

> 

> add_thread_silent has code in place to detect the case of a new thread

> reusing some older thread's ptid, but it doesn't work correctly

> anymore when the old thread is NOT the current thread and it has a

> refcount higher than 0.  Either condition prevents a thread from being

> deleted, but the refcount case wasn't being considered.  I think the

> reason that case wasn't considered is that that code predates

> thread_info refcounting.  Back when it was originally written,

> delete_thread always deleted the thread.

> 

> That add_thread_silent code in question has some now-unnecessary

> warts, BTW.  For instance, this:

> 

>   /* Make switch_to_thread not read from the thread.  */

>   new_thr->state = THREAD_EXITED;

> 

> ... used to be required because switch_to_thread would update

> 'stop_pc' otherwise.  I.e., it would read registers from an exited

> thread otherwise.  switch_to_thread no longer reads the stop_pc, since:

> 

>   commit f2ffa92bbce9dd5fbedc138ac2a3bc8a88327d09

>   Author:     Pedro Alves <palves@redhat.com>

>   AuthorDate: Thu Jun 28 20:18:24 2018 +0100

> 

>       gdb: Eliminate the 'stop_pc' global

> 

> Also, if the ptid of the now-gone current thread is reused, we

> currently return from add_thread_silent with the current thread

> pointing at the _new_ thread.  Either pointing at the old thread, or

> at no thread selected would be reasonable.  But pointing at an

> unrelated thread (the new thread that happens to reuse the ptid) is

> just broken.  Seems like I was the one who wrote it like that but I

> have no clue why, FWIW.

> 

> Currently, an exited thread kept in the thread list still holds its

> original ptid.  The idea was that we need the ptid to be able to

> temporarily switch to another thread and then switch back to the

> original thread, because thread switching is really inferior_ptid

> switching.  Switching back to the original thread requires a ptid

> lookup.

> 

> Now, in order to avoid exited threads with the same ptid as a live

> thread in the same thread list, one thing I considered (and tried) was

> to change an exited thread's ptid to minus_one_ptid.  However, with

> that, there's a case that we won't handle well, which is if we end up

> with more than one exited thread in the list, since then all exited

> threads will all have the same ptid.  Since inferior_thread() relies

> on inferior_ptid, may well return the wrong thread.

> 

> My next attempt to address this, was to switch an exited thread's ptid

> to a globally unique "exited" ptid, which is a ptid with pid == -1 and

> tid == 'the thread's global GDB thread number'.  Note that GDB assumes

> that the GDB global thread number is monotonically increasing and

> doesn't wrap around.  (We should probably make GDB thread numbers

> 64-bit to prevent that happening in practice; they're currently signed

> 32-bit.)  This attempt went a long way, but still ran into a number of

> issues.  It was a major hack too, obviously.

> 

> My next attempt is the one that I'm proposing, which is to bite the

> bullet and break the connection between inferior_ptid and

> inferior_thread(), aka the current thread.  I.e., make the current

> thread be a global thread_info pointer that is written to directly by

> switch_to_thread, etc., and making inferior_thread() return that

> pointer, instead of having inferior_thread() lookup up the

> inferior_ptid thread, by ptid_t.  You can look at this as a

> continuation of the effort of using more thread_info pointers instead

> of ptids when possible.

> 

> By making the current thread a global thread_info pointer, we can make

> switch_to_thread simply write to the global thread pointer, which

> makes scoped_restore_current_thread able to restore back to an exited

> thread without relying on unrelyable ptid look ups.  I.e., this makes

> it not a real problem to have more than one thread with the same ptid

> in the thread list.  There will always be only one live thread with a

> given ptid, so code that looks up a live thread by ptid will always be

> able to find the right one.

> 

> This change required auditing the whole codebase for places where we

> were writing to inferior_ptid directly to change the current thread,

> and change them to use switch_to_thread instead or one of its

> siblings, because otherwise inferior_thread() would return a thread

> unrelated to the changed-to inferior_ptid.  That was all (hopefully)

> done in previous patches.

> 

> After this, inferior_ptid is mainly used by target backend code.  It

> is also relied on by a number of target methods.  E.g., the

> target_resume interface and the memory reading routines -- we still

> need it there because we need to be able to access memory off of

> processes for which we don't have a corresponding inferior/thread

> object, like when handling forks.  Maybe we could pass down a context

> explicitly to target_read_memory, etc.

> 

> gdb/ChangeLog:

> yyyy-mm-dd  Pedro Alves  <palves@redhat.com>

> 

> 	* gdbthread.h (delete_thread, delete_thread_silent)

> 	(find_thread_ptid): Update comments.

> 	* thread.c (current_thread_): New global.

> 	(is_current_thread): Move higher, and reimplement.

> 	(inferior_thread): Reimplement.

> 	(set_thread_exited): Use bool.  Add assertions.

> 	(add_thread_silent): Simplify thread-reuse handling by always

> 	calling delete_thread.

> 	(delete_thread): Remove intro comment.

> 	(find_thread_ptid): Skip exited threads.

> 	(switch_to_thread_no_regs): Write to current_thread_.

> 	(switch_to_no_thread): Check CURRENT_THREAD_ instead of

> 	INFERIOR_PTID.  Clear current_thread_.

> ---

>  gdb/gdbthread.h | 17 +++++-----

>  gdb/thread.c    | 97 ++++++++++++++++++++-------------------------------------

>  2 files changed, 42 insertions(+), 72 deletions(-)


This commit causes a regression for gdb.gdb/unittest.exp when GDB is
configured with --enable-targets=all.

The failure is:

  gdb/thread.c:95: internal-error: thread_info* inferior_thread(): Assertion `current_thread_ != nullptr' failed.

Here's a partial backtrace:

  #8  0x00000000013e4d0f in internal_error (file=0x16c55c7 "../../src.dev-1/gdb/thread.c", line=95, 
      fmt=0x16c54d9 "%s: Assertion `%s' failed.") at ../../src.dev-1/gdbsupport/errors.cc:55
  #9  0x0000000000c34e63 in inferior_thread () at ../../src.dev-1/gdb/thread.c:95
  #10 0x0000000000a73d62 in get_current_regcache () at ../../src.dev-1/gdb/regcache.c:391
  #11 0x00000000009051c0 in current_options () at ../../src.dev-1/gdb/mep-tdep.c:873
  #12 0x00000000009052eb in mep_register_name (gdbarch=0x33862a0, regnr=152) at ../../src.dev-1/gdb/mep-tdep.c:958
  #13 0x00000000009054aa in mep_register_reggroup_p (gdbarch=0x33862a0, regnum=152, group=0x26334c0 <save_group>)
      at ../../src.dev-1/gdb/mep-tdep.c:1029
  #14 0x00000000007975e8 in gdbarch_register_reggroup_p (gdbarch=0x33862a0, regnum=152, reggroup=0x26334c0 <save_group>)
      at ../../src.dev-1/gdb/gdbarch.c:3595
  #15 0x0000000000a7355a in reg_buffer::save(gdb::function_view<register_status (int, unsigned char*)>) (
      this=0x7fffffffad00, cooked_read=...) at ../../src.dev-1/gdb/regcache.c:248
  #16 0x0000000000780b36 in readonly_detached_regcache::readonly_detached_regcache(gdbarch*, gdb::function_view<register_status (int, unsigned char*)>) (this=0x7fffffffad00, gdbarch=0x33862a0, cooked_read=...)
      at ../../src.dev-1/gdb/regcache.h:455
  #17 0x0000000000a73409 in readonly_detached_regcache::readonly_detached_regcache (this=0x7fffffffad00, src=...)
      at ../../src.dev-1/gdb/regcache.c:213
  #18 0x0000000000a773b5 in selftests::cooked_read_test (gdbarch=0x33862a0) at ../../src.dev-1/gdb/regcache.c:1690
  #19 0x0000000000b48c29 in selftests::gdbarch_selftest::operator() (this=0x320fef0)
      at ../../src.dev-1/gdb/selftest-arch.c:73
  #20 0x00000000013fe5ac in selftests::run_tests (filter=0x0) at ../../src.dev-1/gdbsupport/selftest.cc:88
  #21 0x00000000008e78ba in maintenance_selftest (args=0x0, from_tty=1) at ../../src.dev-1/gdb/maint.c:1044
  #22 0x00000000005b0557 in do_const_cfunc (c=0x32004f0, args=0x0, from_tty=1)
      at ../../src.dev-1/gdb/cli/cli-decode.c:95
  #23 0x00000000005b3839 in cmd_func (cmd=0x32004f0, args=0x0, from_tty=1) at ../../src.dev-1/gdb/cli/cli-decode.c:2113
  #24 0x0000000000c4953b in execute_command (p=0x2924895 "", from_tty=1) at ../../src.dev-1/gdb/top.c:655
  #25 0x0000000000759c26 in command_handler (command=0x2924880 "maintenance selftest ")
      at ../../src.dev-1/gdb/event-top.c:588
  #26 0x000000000075a052 in command_line_handler (rl=...) at ../../src.dev-1/gdb/event-top.c:773

I suspect that the problem might be this line in regcache.c:cooked_read_test:

  /* Switch to the mock thread.  */
  scoped_restore restore_inferior_ptid
    = make_scoped_restore (&inferior_ptid, mock_ptid);

My suspicion from a quick read of your patch above is that we need to
do more than just set inferior_ptid here now.

Thanks,
Andrew
Shahab Vahedi via Gdb-patches June 23, 2020, 2:26 p.m. | #10
On 6/23/20 2:37 PM, Andrew Burgess wrote:

> This commit causes a regression for gdb.gdb/unittest.exp when GDB is

> configured with --enable-targets=all.

> 

> The failure is:

> 

>   gdb/thread.c:95: internal-error: thread_info* inferior_thread(): Assertion `current_thread_ != nullptr' failed.

> 

> Here's a partial backtrace:

> 

>   #8  0x00000000013e4d0f in internal_error (file=0x16c55c7 "../../src.dev-1/gdb/thread.c", line=95, 

>       fmt=0x16c54d9 "%s: Assertion `%s' failed.") at ../../src.dev-1/gdbsupport/errors.cc:55

>   #9  0x0000000000c34e63 in inferior_thread () at ../../src.dev-1/gdb/thread.c:95

>   #10 0x0000000000a73d62 in get_current_regcache () at ../../src.dev-1/gdb/regcache.c:391

>   #11 0x00000000009051c0 in current_options () at ../../src.dev-1/gdb/mep-tdep.c:873

>   #12 0x00000000009052eb in mep_register_name (gdbarch=0x33862a0, regnr=152) at ../../src.dev-1/gdb/mep-tdep.c:958

>   #13 0x00000000009054aa in mep_register_reggroup_p (gdbarch=0x33862a0, regnum=152, group=0x26334c0 <save_group>)

>       at ../../src.dev-1/gdb/mep-tdep.c:1029

>   #14 0x00000000007975e8 in gdbarch_register_reggroup_p (gdbarch=0x33862a0, regnum=152, reggroup=0x26334c0 <save_group>)

>       at ../../src.dev-1/gdb/gdbarch.c:3595

>   #15 0x0000000000a7355a in reg_buffer::save(gdb::function_view<register_status (int, unsigned char*)>) (

>       this=0x7fffffffad00, cooked_read=...) at ../../src.dev-1/gdb/regcache.c:248

>   #16 0x0000000000780b36 in readonly_detached_regcache::readonly_detached_regcache(gdbarch*, gdb::function_view<register_status (int, unsigned char*)>) (this=0x7fffffffad00, gdbarch=0x33862a0, cooked_read=...)

>       at ../../src.dev-1/gdb/regcache.h:455

>   #17 0x0000000000a73409 in readonly_detached_regcache::readonly_detached_regcache (this=0x7fffffffad00, src=...)

>       at ../../src.dev-1/gdb/regcache.c:213

>   #18 0x0000000000a773b5 in selftests::cooked_read_test (gdbarch=0x33862a0) at ../../src.dev-1/gdb/regcache.c:1690

>   #19 0x0000000000b48c29 in selftests::gdbarch_selftest::operator() (this=0x320fef0)

>       at ../../src.dev-1/gdb/selftest-arch.c:73

>   #20 0x00000000013fe5ac in selftests::run_tests (filter=0x0) at ../../src.dev-1/gdbsupport/selftest.cc:88

>   #21 0x00000000008e78ba in maintenance_selftest (args=0x0, from_tty=1) at ../../src.dev-1/gdb/maint.c:1044

>   #22 0x00000000005b0557 in do_const_cfunc (c=0x32004f0, args=0x0, from_tty=1)

>       at ../../src.dev-1/gdb/cli/cli-decode.c:95

>   #23 0x00000000005b3839 in cmd_func (cmd=0x32004f0, args=0x0, from_tty=1) at ../../src.dev-1/gdb/cli/cli-decode.c:2113

>   #24 0x0000000000c4953b in execute_command (p=0x2924895 "", from_tty=1) at ../../src.dev-1/gdb/top.c:655

>   #25 0x0000000000759c26 in command_handler (command=0x2924880 "maintenance selftest ")

>       at ../../src.dev-1/gdb/event-top.c:588

>   #26 0x000000000075a052 in command_line_handler (rl=...) at ../../src.dev-1/gdb/event-top.c:773

> 

> I suspect that the problem might be this line in regcache.c:cooked_read_test:

> 

>   /* Switch to the mock thread.  */

>   scoped_restore restore_inferior_ptid

>     = make_scoped_restore (&inferior_ptid, mock_ptid);

> 

> My suspicion from a quick read of your patch above is that we need to

> do more than just set inferior_ptid here now.


Indeed.  Thanks for reporting it.  

I fixed this in gdbarch-selftests.c, where this code is duplicated,
but missed regcache.c.  I'll fix it.

Thanks,
Pedro Alves

Patch

diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
index 717a2ad08c..6764c8fc49 100644
--- a/gdb/gdbthread.h
+++ b/gdb/gdbthread.h
@@ -433,12 +433,13 @@  extern struct thread_info *add_thread_with_info (process_stratum_target *targ,
 						 ptid_t ptid,
 						 private_thread_info *);
 
-/* Delete an existing thread list entry.  */
+/* Delete thread THREAD and notify of thread exit.  If the thread is
+   currently not deletable, don't actually delete it but still tag it
+   as exited and do the notification.  */
 extern void delete_thread (struct thread_info *thread);
 
-/* Delete an existing thread list entry, and be quiet about it.  Used
-   after the process this thread having belonged to having already
-   exited, for example.  */
+/* Like delete_thread, but be quiet about it.  Used when the process
+   this thread belonged to has already exited, for example.  */
 extern void delete_thread_silent (struct thread_info *thread);
 
 /* Delete a step_resume_breakpoint from the thread database.  */
@@ -478,15 +479,15 @@  extern bool in_thread_list (process_stratum_target *targ, ptid_t ptid);
    global id, not the system's).  */
 extern int valid_global_thread_id (int global_id);
 
-/* Find thread PTID of inferior INF.  */
+/* Find (non-exited) thread PTID of inferior INF.  */
 extern thread_info *find_thread_ptid (inferior *inf, ptid_t ptid);
 
-/* Search function to lookup a thread by 'pid'.  */
+/* Search function to lookup a (non-exited) thread by 'ptid'.  */
 extern struct thread_info *find_thread_ptid (process_stratum_target *targ,
 					     ptid_t ptid);
 
-/* Search function to lookup a thread by 'ptid'.  Only searches in
-   threads of INF.  */
+/* Search function to lookup a (non-exited) thread by 'ptid'.  Only
+   searches in threads of INF.  */
 extern struct thread_info *find_thread_ptid (inferior *inf, ptid_t ptid);
 
 /* Find thread by GDB global thread ID.  */
diff --git a/gdb/thread.c b/gdb/thread.c
index c6e3d356a5..c233995f76 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -55,6 +55,9 @@ 
 
 static int highest_thread_num;
 
+/* The current/selected thread.  */
+static thread_info *current_thread_;
+
 /* RAII type used to increase / decrease the refcount of each thread
    in a given list of threads.  */
 
@@ -78,13 +81,19 @@  class scoped_inc_dec_ref
   const std::vector<thread_info *> &m_thrds;
 };
 
+/* Returns true if THR is the current thread.  */
+
+static bool
+is_current_thread (const thread_info *thr)
+{
+  return thr == current_thread_;
+}
 
 struct thread_info*
 inferior_thread (void)
 {
-  struct thread_info *tp = find_thread_ptid (current_inferior (), inferior_ptid);
-  gdb_assert (tp);
-  return tp;
+  gdb_assert (current_thread_ != nullptr);
+  return current_thread_;
 }
 
 /* Delete the breakpoint pointed at by BP_P, if there's one.  */
@@ -194,7 +203,7 @@  clear_thread_inferior_resources (struct thread_info *tp)
 /* Set the TP's state as exited.  */
 
 static void
-set_thread_exited (thread_info *tp, int silent)
+set_thread_exited (thread_info *tp, bool silent)
 {
   /* Dead threads don't need to step-over.  Remove from queue.  */
   if (tp->step_over_next != NULL)
@@ -245,7 +254,12 @@  new_thread (struct inferior *inf, ptid_t ptid)
       struct thread_info *last;
 
       for (last = inf->thread_list; last->next != NULL; last = last->next)
-	;
+	gdb_assert (ptid != last->ptid
+		    || last->state == THREAD_EXITED);
+
+      gdb_assert (ptid != last->ptid
+		  || last->state == THREAD_EXITED);
+
       last->next = tp;
     }
 
@@ -255,51 +269,15 @@  new_thread (struct inferior *inf, ptid_t ptid)
 struct thread_info *
 add_thread_silent (process_stratum_target *targ, ptid_t ptid)
 {
-  inferior *inf;
-
-  thread_info *tp = find_thread_ptid (targ, ptid);
-  if (tp)
-    /* Found an old thread with the same id.  It has to be dead,
-       otherwise we wouldn't be adding a new thread with the same id.
-       The OS is reusing this id --- delete it, and recreate a new
-       one.  */
-    {
-      /* In addition to deleting the thread, if this is the current
-	 thread, then we need to take care that delete_thread doesn't
-	 really delete the thread if it is inferior_ptid.  Create a
-	 new template thread in the list with an invalid ptid, switch
-	 to it, delete the original thread, reset the new thread's
-	 ptid, and switch to it.  */
-
-      if (inferior_ptid == ptid)
-	{
-	  thread_info *new_thr = new_thread (tp->inf, null_ptid);
-
-	  /* Make switch_to_thread not read from the thread.  */
-	  new_thr->state = THREAD_EXITED;
-	  switch_to_no_thread ();
-
-	  /* Now we can delete it.  */
-	  delete_thread (tp);
-
-	  /* Now reset its ptid, and reswitch inferior_ptid to it.  */
-	  new_thr->ptid = ptid;
-	  new_thr->state = THREAD_STOPPED;
-	  switch_to_thread (new_thr);
-
-	  gdb::observers::new_thread.notify (new_thr);
-
-	  /* All done.  */
-	  return new_thr;
-	}
+  inferior *inf = find_inferior_ptid (targ, ptid);
 
-      inf = tp->inf;
-
-      /* Just go ahead and delete it.  */
-      delete_thread (tp);
-    }
-  else
-    inf = find_inferior_ptid (targ, ptid);
+  /* We may have an old thread with the same id in the thread list.
+     If we do, it must be dead, otherwise we wouldn't be adding a new
+     thread with the same id.  The OS is reusing this id --- delete
+     the old thread, and create a new one.  */
+  thread_info *tp = find_thread_ptid (inf, ptid);
+  if (tp != nullptr)
+    delete_thread (tp);
 
   tp = new_thread (inf, ptid);
   gdb::observers::new_thread.notify (tp);
@@ -349,14 +327,6 @@  thread_info::~thread_info ()
   xfree (this->name);
 }
 
-/* Returns true if THR is the current thread.  */
-
-static bool
-is_current_thread (const thread_info *thr)
-{
-  return thr->inf == current_inferior () && thr->ptid == inferior_ptid;
-}
-
 /* See gdbthread.h.  */
 
 bool
@@ -482,10 +452,7 @@  delete_thread_1 (thread_info *thr, bool silent)
   delete tp;
 }
 
-/* Delete thread THREAD and notify of thread exit.  If this is the
-   current thread, don't actually delete it, but tag it as exited and
-   do the notification.  If this is the user selected thread, clear
-   it.  */
+/* See gdbthread.h.  */
 
 void
 delete_thread (thread_info *thread)
@@ -535,7 +502,7 @@  find_thread_ptid (process_stratum_target *targ, ptid_t ptid)
 struct thread_info *
 find_thread_ptid (inferior *inf, ptid_t ptid)
 {
-  for (thread_info *tp : inf->threads ())
+  for (thread_info *tp : inf->non_exited_threads ())
     if (tp->ptid == ptid)
       return tp;
 
@@ -1317,7 +1284,8 @@  switch_to_thread_no_regs (struct thread_info *thread)
   set_current_program_space (inf->pspace);
   set_current_inferior (inf);
 
-  inferior_ptid = thread->ptid;
+  current_thread_ = thread;
+  inferior_ptid = current_thread_->ptid;
 }
 
 /* See gdbthread.h.  */
@@ -1325,9 +1293,10 @@  switch_to_thread_no_regs (struct thread_info *thread)
 void
 switch_to_no_thread ()
 {
-  if (inferior_ptid == null_ptid)
+  if (current_thread_ == nullptr)
     return;
 
+  current_thread_ = nullptr;
   inferior_ptid = null_ptid;
   reinit_frame_cache ();
 }