[v4,2/2] gdb: defer commit resume until all available events are consumed

Message ID 20210321000218.107375-3-pedro@palves.net
State New
Headers show
Series
  • Reduce back and forth with target when there are pending statuses
Related show

Commit Message

Pedro Alves March 21, 2021, 12:02 a.m.
From: Simon Marchi <simon.marchi@efficios.com>


New in v4 (Pedro):

 - Made target_has_events a regular delegating target method.

 - Rename target_has_events -> target_has_pending_events

 - No longer check target_has_events in do_target_wait (that caused
   regressions).

 - No longer switch linux-nat to use an async event source.  Simply
   make target_has_pending_events consult whether the pipe is
   readable.  I ended up disliking that a SIGCHLD results in two round
   trips via the event loop.  Also results in less code and a "single
   source of truth".  And also some divergence from gdbserver.  We can
   always revisit, but this way it doesn't have to be a dependency, it
   can be done afterwards.

 - Removed the sprinkled gdb_assert implementations throughout,
   letting the default of false kick in.

 - No longer regresses against gdbserver, so updated commit log to no
   longer say it does.

 - Misc cleanups, formattings, etc.

Rationale
---------

Let's say you have multiple threads hitting a conditional breakpoint
at the same time, and all of these are going to evaluate to false.
All these threads will need to be resumed.

Currently, GDB fetches one target event (one SIGTRAP representing the
breakpoint hit) and decides that the thread should be resumed.  It
calls resume and commit_resume immediately.  It then fetches the
second target event, and does the same, until it went through all
threads.

The result is therefore something like:

  - consume event for thread A
  - resume thread A
  - commit resume (affects thread A)
  - consume event for thread B
  - resume thread B
  - commit resume (affects thread B)
  - consume event for thread C
  - resume thread C
  - commit resume (affects thread C)

For targets where it's beneficial to group resumptions requests (most
likely those that implement target_ops::commit_resume), it would be
much better to have:

  - consume event for thread A
  - resume thread A
  - consume event for thread B
  - resume thread B
  - consume event for thread C
  - resume thread C
  - commit resume (affects threads A, B and C)

Implementation details
----------------------

To achieve this, this patch adds another check in
maybe_set_commit_resumed_all_targets to avoid setting the
commit-resumed flag of targets that readily have events to provide to
infrun.

To determine if a target has events readily available to report, this
patch adds an `has_pending_events` target_ops method.  The method
returns a simple bool to say whether or not it has pending events to
report.

Testing
-- 
2.26.2

Comments

Lancelot SIX via Gdb-patches March 25, 2021, 4 p.m. | #1
On 2021-03-20 8:02 p.m., Pedro Alves wrote:
> From: Simon Marchi <simon.marchi@efficios.com>

> 

> New in v4 (Pedro):

> 

>  - Made target_has_events a regular delegating target method.

> 

>  - Rename target_has_events -> target_has_pending_events

> 

>  - No longer check target_has_events in do_target_wait (that caused

>    regressions).

> 

>  - No longer switch linux-nat to use an async event source.  Simply

>    make target_has_pending_events consult whether the pipe is

>    readable.  I ended up disliking that a SIGCHLD results in two round

>    trips via the event loop.  Also results in less code and a "single

>    source of truth".  And also some divergence from gdbserver.  We can

>    always revisit, but this way it doesn't have to be a dependency, it

>    can be done afterwards.


Since linux-nat doesn't benefit from commit_resumed, we could also just
not implement has_pending_events in linux_nat_target, and let it default
to "return false".

> 

>  - Removed the sprinkled gdb_assert implementations throughout,

>    letting the default of false kick in.

> 

>  - No longer regresses against gdbserver, so updated commit log to no

>    longer say it does.

> 

>  - Misc cleanups, formattings, etc.


I don't think I previously submitted this patch upstream, I just had it
as a "draft".  But this is fine, the goal was to submit it eventually.
I agree with all of the above.

> Rationale

> ---------

> 

> Let's say you have multiple threads hitting a conditional breakpoint

> at the same time, and all of these are going to evaluate to false.

> All these threads will need to be resumed.

> 

> Currently, GDB fetches one target event (one SIGTRAP representing the

> breakpoint hit) and decides that the thread should be resumed.  It

> calls resume and commit_resume immediately.  It then fetches the


commit_resume -> commit_resumed

> second target event, and does the same, until it went through all

> threads.

> 

> The result is therefore something like:

> 

>   - consume event for thread A

>   - resume thread A

>   - commit resume (affects thread A)

>   - consume event for thread B

>   - resume thread B

>   - commit resume (affects thread B)

>   - consume event for thread C

>   - resume thread C

>   - commit resume (affects thread C)

> 

> For targets where it's beneficial to group resumptions requests (most

> likely those that implement target_ops::commit_resume), it would be

> much better to have:

> 

>   - consume event for thread A

>   - resume thread A

>   - consume event for thread B

>   - resume thread B

>   - consume event for thread C

>   - resume thread C

>   - commit resume (affects threads A, B and C)


Should be "commit resumed" in all of the above.

Otherwise, this all LGTM.

Simon
Pedro Alves March 26, 2021, 4:16 p.m. | #2
On 25/03/21 16:00, Simon Marchi wrote:
> On 2021-03-20 8:02 p.m., Pedro Alves wrote:

>> From: Simon Marchi <simon.marchi@efficios.com>

>>

>> New in v4 (Pedro):

>>

>>  - Made target_has_events a regular delegating target method.

>>

>>  - Rename target_has_events -> target_has_pending_events

>>

>>  - No longer check target_has_events in do_target_wait (that caused

>>    regressions).

>>

>>  - No longer switch linux-nat to use an async event source.  Simply

>>    make target_has_pending_events consult whether the pipe is

>>    readable.  I ended up disliking that a SIGCHLD results in two round

>>    trips via the event loop.  Also results in less code and a "single

>>    source of truth".  And also some divergence from gdbserver.  We can

>>    always revisit, but this way it doesn't have to be a dependency, it

>>    can be done afterwards.

> 

> Since linux-nat doesn't benefit from commit_resumed, we could also just

> not implement has_pending_events in linux_nat_target, and let it default

> to "return false".


Hmm, indeed.  It was necessary before I removed the check for target_has_events
from do_target_wait, otherwise we'd never consult the target.  But without that call,
indeed we don't need it.  I removed it.

I also noticed that the infrun.h include in async-event.c isn't needed,
so I removed that.

>> Rationale

>> ---------

>>

>> Let's say you have multiple threads hitting a conditional breakpoint

>> at the same time, and all of these are going to evaluate to false.

>> All these threads will need to be resumed.

>>

>> Currently, GDB fetches one target event (one SIGTRAP representing the

>> breakpoint hit) and decides that the thread should be resumed.  It

>> calls resume and commit_resume immediately.  It then fetches the

> 

> commit_resume -> commit_resumed

> 


Argh, I thought I had adjusted this before pushing, but I didn't.  Too late now.

Sorry about that!

> Otherwise, this all LGTM.


Great, I merged it to master now.

Thanks,
Pedro Alves
Lancelot SIX via Gdb-patches March 26, 2021, 4:18 p.m. | #3
>>> Rationale

>>> ---------

>>>

>>> Let's say you have multiple threads hitting a conditional breakpoint

>>> at the same time, and all of these are going to evaluate to false.

>>> All these threads will need to be resumed.

>>>

>>> Currently, GDB fetches one target event (one SIGTRAP representing the

>>> breakpoint hit) and decides that the thread should be resumed.  It

>>> calls resume and commit_resume immediately.  It then fetches the

>>

>> commit_resume -> commit_resumed

>>

> 

> Argh, I thought I had adjusted this before pushing, but I didn't.  Too late now.

> 

> Sorry about that!


Bah, really not a big deal!

> 

>> Otherwise, this all LGTM.

> 

> Great, I merged it to master now.


Thanks!

Simon

Patch

=======

To test this, I start GDBserver with a program that spawns multiple
threads:

 $ ../gdbserver/gdbserver --once :1234 ~/src/many-threads-stepping-over-breakpoints/many-threads-stepping-over-breakpoints

I then connect with GDB and install a conditional breakpoint that always
evaluates to false (and force the evaluation to be done by GDB):

 $ ./gdb -nx --data-directory=data-directory \
     /home/simark/src/many-threads-stepping-over-breakpoints/many-threads-stepping-over-breakpoints \
     -ex "set breakpoint condition-evaluation host" \
     -ex "set pag off" \
     -ex "set confirm off" \
     -ex "maint set target-non-stop on" \
     -ex "tar rem :1234" \
     -ex "tb main" \
     -ex "b 13 if 0" \
     -ex c \
     -ex "set debug infrun" \
     -ex "set debug remote 1" \
     -ex "set debug displaced"

I then do "continue" and look at the log.

The remote target receives a bunch of stop notifications for all
threads that have hit the breakpoint.  infrun consumes and processes
one event, decides it should not cause a stop, prepares a displaced
step, after which we should see:

 [infrun] maybe_set_commit_resumed_all_process_targets: not requesting commit-resumed for target remote, target has pending events

Same for a second thread (since we have 2 displaced step buffers).
For the following threads, their displaced step is deferred since
there are no more buffers available.

After consuming the last event the remote target has to offer, we get:

 [infrun] maybe_set_commit_resumed_all_process_targets: enabling commit-resumed for target remote
 [infrun] maybe_call_commit_resumed_all_process_targets: calling commit_resumed for target remote
 [remote] Sending packet: $vCont;s:p14d16b.14d1b1;s:p14d16b.14d1b2#55
 [remote] Packet received: OK

Without the patch, there would have been one vCont;s just after each
prepared displaced step.

gdb/ChangeLog:

	* async-event.c: Include "infrun.h".
	(async_event_handler_marked): New.
	* async-event.h (async_event_handler_marked): Declare.
	* infrun.c (maybe_set_commit_resumed_all_targets): Switch to
	inferior before calling target method.  Don't commit-resumed if
	target_has_pending_events is true.
	* linux-nat.c (async_file_is_marked): New.
	(linux_nat_target::has_pending_events): New.
	* linux-nat.h (linux_nat_target) <has_pending_events>: Declare.
	* remote.c (remote_target::has_pending_events): New.
	* target-delegates.c: Regenerate.
	* target.h (target_ops::has_pending_events): New target method.
	(target_has_pending_events): New.

Change-Id: I18112ba19a1ff4986530c660f530d847bb4a1f1d
---
 gdb/async-event.c      |  9 +++++++++
 gdb/async-event.h      |  3 +++
 gdb/infrun.c           | 12 ++++++++++++
 gdb/linux-nat.c        | 30 ++++++++++++++++++++++++++++++
 gdb/linux-nat.h        |  1 +
 gdb/remote.c           | 21 +++++++++++++++++++++
 gdb/target-delegates.c | 27 +++++++++++++++++++++++++++
 gdb/target.h           | 18 ++++++++++++++++++
 8 files changed, 121 insertions(+)

diff --git a/gdb/async-event.c b/gdb/async-event.c
index c4ab7318179..5cc518467b0 100644
--- a/gdb/async-event.c
+++ b/gdb/async-event.c
@@ -21,6 +21,7 @@ 
 
 #include "ser-event.h"
 #include "top.h"
+#include "infrun.h"
 
 /* PROC is a function to be invoked when the READY flag is set.  This
    happens when there has been a signal and the corresponding signal
@@ -308,6 +309,14 @@  clear_async_event_handler (async_event_handler *async_handler_ptr)
   async_handler_ptr->ready = 0;
 }
 
+/* See event-loop.h.  */
+
+bool
+async_event_handler_marked (async_event_handler *handler)
+{
+  return handler->ready;
+}
+
 /* Check if asynchronous event handlers are ready, and call the
    handler function for one that is.  */
 
diff --git a/gdb/async-event.h b/gdb/async-event.h
index 9d96235b38d..47759d5c2d3 100644
--- a/gdb/async-event.h
+++ b/gdb/async-event.h
@@ -78,6 +78,9 @@  extern void
    loop.  */
 extern void mark_async_event_handler (struct async_event_handler *handler);
 
+/* Return true if HANDLER is marked.  */
+extern bool async_event_handler_marked (async_event_handler *handler);
+
 /* Mark the handler (ASYNC_HANDLER_PTR) as NOT ready.  */
 
 extern void clear_async_event_handler (struct async_event_handler *handler);
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 0af2f940105..195cd4176e3 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2765,6 +2765,8 @@  schedlock_applies (struct thread_info *tp)
 static void
 maybe_set_commit_resumed_all_targets ()
 {
+  scoped_restore_current_thread restore_thread;
+
   for (inferior *inf : all_non_exited_inferiors ())
     {
       process_stratum_target *proc_target = inf->process_target ();
@@ -2806,6 +2808,16 @@  maybe_set_commit_resumed_all_targets ()
 	  continue;
 	}
 
+      switch_to_inferior_no_thread (inf);
+
+      if (target_has_pending_events ())
+	{
+	  infrun_debug_printf ("not requesting commit-resumed for target %s, "
+			       "target has pending events",
+			       proc_target->shortname ());
+	  continue;
+	}
+
       infrun_debug_printf ("enabling commit-resumed for target %s",
 			   proc_target->shortname ());
 
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index ccfd3c1320c..1186e10b925 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -263,6 +263,28 @@  async_file_mark (void)
      be awakened anyway.  */
 }
 
+/* Return true if the event pipe is marked.  */
+
+static bool
+async_file_is_marked ()
+{
+  int fd = linux_nat_event_pipe[0];
+  if (fd != -1)
+    {
+      struct timeval timeout {};
+
+      fd_set set;
+      FD_ZERO (&set);
+      FD_SET (fd, &set);
+
+      int n = select (fd + 1, &set, NULL, NULL, &timeout);
+      if (n > 0)
+       return true;
+    }
+
+  return false;
+}
+
 static int kill_lwp (int lwpid, int signo);
 
 static int stop_callback (struct lwp_info *lp);
@@ -4104,6 +4126,14 @@  linux_nat_target::async_wait_fd ()
   return linux_nat_event_pipe[0];
 }
 
+/* target_has_pending_events implementation.  */
+
+bool
+linux_nat_target::has_pending_events ()
+{
+  return async_file_is_marked ();
+}
+
 /* target_async implementation.  */
 
 void
diff --git a/gdb/linux-nat.h b/gdb/linux-nat.h
index ff4d753422d..fb731c7f889 100644
--- a/gdb/linux-nat.h
+++ b/gdb/linux-nat.h
@@ -90,6 +90,7 @@  class linux_nat_target : public inf_ptrace_target
 
   int async_wait_fd () override;
   void async (int) override;
+  bool has_pending_events () override;
 
   void close () override;
 
diff --git a/gdb/remote.c b/gdb/remote.c
index 5b2bc240bcb..893f0c1399d 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -428,6 +428,7 @@  class remote_target : public process_stratum_target
   void commit_resumed () override;
   void resume (ptid_t, int, enum gdb_signal) override;
   ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
+  bool has_pending_events () override;
 
   void fetch_registers (struct regcache *, int) override;
   void store_registers (struct regcache *, int) override;
@@ -6794,6 +6795,26 @@  remote_target::commit_resumed ()
   vcont_builder.flush ();
 }
 
+/* Implementation of target_has_pending_events.  */
+
+bool
+remote_target::has_pending_events ()
+{
+  if (target_can_async_p ())
+    {
+      remote_state *rs = get_remote_state ();
+
+      if (async_event_handler_marked (rs->remote_async_inferior_event_token))
+	return true;
+
+      /* Note that BUFCNT can be negative, indicating sticky
+	 error.  */
+      if (rs->remote_desc->bufcnt != 0)
+	return true;
+    }
+  return false;
+}
+
 
 
 /* Non-stop version of target_stop.  Uses `vCont;t' to stop a remote
diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
index c0b3129afad..1b2400175de 100644
--- a/gdb/target-delegates.c
+++ b/gdb/target-delegates.c
@@ -84,6 +84,7 @@  struct dummy_target : public target_ops
   bool is_async_p () override;
   void async (int arg0) override;
   int async_wait_fd () override;
+  bool has_pending_events () override;
   void thread_events (int arg0) override;
   bool supports_non_stop () override;
   bool always_non_stop_p () override;
@@ -255,6 +256,7 @@  struct debug_target : public target_ops
   bool is_async_p () override;
   void async (int arg0) override;
   int async_wait_fd () override;
+  bool has_pending_events () override;
   void thread_events (int arg0) override;
   bool supports_non_stop () override;
   bool always_non_stop_p () override;
@@ -2193,6 +2195,31 @@  debug_target::async_wait_fd ()
   return result;
 }
 
+bool
+target_ops::has_pending_events ()
+{
+  return this->beneath ()->has_pending_events ();
+}
+
+bool
+dummy_target::has_pending_events ()
+{
+  return false;
+}
+
+bool
+debug_target::has_pending_events ()
+{
+  bool result;
+  fprintf_unfiltered (gdb_stdlog, "-> %s->has_pending_events (...)\n", this->beneath ()->shortname ());
+  result = this->beneath ()->has_pending_events ();
+  fprintf_unfiltered (gdb_stdlog, "<- %s->has_pending_events (", this->beneath ()->shortname ());
+  fputs_unfiltered (") = ", gdb_stdlog);
+  target_debug_print_bool (result);
+  fputs_unfiltered ("\n", gdb_stdlog);
+  return result;
+}
+
 void
 target_ops::thread_events (int arg0)
 {
diff --git a/gdb/target.h b/gdb/target.h
index 450a6c7d46b..3be49c06d33 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -717,6 +717,15 @@  struct target_ops
       TARGET_DEFAULT_NORETURN (tcomplain ());
     virtual int async_wait_fd ()
       TARGET_DEFAULT_NORETURN (noprocess ());
+    /* Return true if the target has pending events to report to the
+       core.  If true, then GDB avoids resuming the target until all
+       pending events are consumed, so that multiple resumptions can
+       be coalesced as an optimization.  Most targets can't tell
+       whether they have pending events without calling target_wait,
+       so we default to returning false.  The only downside is that a
+       potential optimization is missed.  */
+    virtual bool has_pending_events ()
+      TARGET_DEFAULT_RETURN (false);
     virtual void thread_events (int)
       TARGET_DEFAULT_IGNORE ();
     /* This method must be implemented in some situations.  See the
@@ -1462,6 +1471,15 @@  extern ptid_t default_target_wait (struct target_ops *ops,
 				   struct target_waitstatus *status,
 				   target_wait_flags options);
 
+/* Return true if the target has pending events to report to the core.
+   See target_ops::has_pending_events().  */
+
+static inline bool
+target_has_pending_events ()
+{
+  return current_top_target ()->has_pending_events ();
+}
+
 /* Fetch at least register REGNO, or all regs if regno == -1.  No result.  */
 
 extern void target_fetch_registers (struct regcache *regcache, int regno);