[4/4] gdb/remote: error if a stop arrives for a non-resumed thread

Message ID 3768e00eacad6fed0539fe30669483f4f56bc0ce.1623268999.git.andrew.burgess@embecosm.com
State New
Headers show
  • gdb/remote: spot stop packets sent for the wrong thread
Related show

Commit Message

Andrew Burgess June 9, 2021, 8:06 p.m.
When looking at bug gdb/26819 I spotted that due to the configuration
of OpenOCD, OpenOCD was sending GDB a stop reply for a thread that GDB
was not expecting to be executing.

If GDB was doing a non-displaced step-over, and so, had only sent a
single step request to one specific thread, GDB would, assert that the
thread that stopped should be in the process of completing a
step-over.  This would result in this assertion:

  infrun.c:5672: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.

This commit is an attempt to have GDB spot when the incoming stop
packet from the target is invalid and throw an error before we hit the
above assert.

So, in this commit, when processing the stop in the remote_target
code, if the stop is for a thread that is not resumed, throw an error.

There's no test for this, as it requires that a target send back an
invalid stop packet, one that includes the wrong thread-id, I don't
currently have any good ideas for how I would trigger such a remote

When looking at the original example from the bug report (which uses
the RISC-V spike simulator and OpenOCD), when the invalid packet
arrives the GDB session now looks like this:

  (gdb) continue
  stop received from remote for thread that should not be executing
  (gdb) info threads
    Id   Target Id                                       Frame
    1    Thread 1 (Name: riscv.cpu0, state: single-step) (running)
  * 2    Thread 2 (Name: riscv.cpu1, state: single-step) (running)

Unfortunately, at this point the session is effectively dead as the
threads are still in the running state, but GDB is in all-stop mode,
so isn't really expecting to be in this state.

Given that, should we get to this point, we know that the remote is
not following the rules, I'm not hugely worried.  Even if we could put
GDB back into a usable state, we couldn't reasonable expect to
continue working with the remote when we know it's broken.


	PR gdb/26819
	* remote.c (remote_target::process_stop_reply): Catch the case
	where a stop reply arrives for a thread that was not resumed.
 gdb/ChangeLog |  6 ++++++
 gdb/remote.c  | 21 +++++++++++++++++++++
 2 files changed, 27 insertions(+)



diff --git a/gdb/remote.c b/gdb/remote.c
index 85ab16e894e..1bb6f0303a3 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -8057,6 +8057,27 @@  remote_target::process_stop_reply (struct stop_reply *stop_reply,
       remote_notice_new_inferior (ptid, false);
       remote_thread_info *remote_thr = get_remote_thread_info (this, ptid);
+      /* Check that the thread that is reported stopped is one that was
+	 previously running.  This catches the case where a badly behaving
+	 target reports a stop for a thread that GDB thinks should already
+	 be stopped.
+	 In many cases, reporting a stop for a thread that GDB thinks is
+	 already stopped, turns out to be harmless, however, there are
+	 cases, (e.g. when single stepping), where reporting a stop against
+	 an unexpected thread will trigger assertion failures from within
+	 infrun.
+	 The one exception to this logic is when we stop after an exec
+	 call.  The exec will replace the main thread of the inferior, even
+	 if the exec is performed in some other thread.  Thus, GDB might
+	 step thread 2, but the exec-stop is reported in thread 1.  */
+      if (remote_thr->get_resume_state () != resume_state::RESUMED
+	  && status->kind != TARGET_WAITKIND_EXECD)
+	error (_("stop received from remote for thread that should "
+		 "not be executing"));
       remote_thr->core = stop_reply->core;
       remote_thr->stop_reason = stop_reply->stop_reason;
       remote_thr->watch_data_address = stop_reply->watch_data_address;