[0/4] Micro-optimize DWARF partial symbol reading

Message ID 20200520174032.9525-1-tromey@adacore.com

Message

Tom Tromey May 20, 2020, 5:40 p.m.
A personal goal of mine is to improve the startup time of gdb.  In the
long run, I think the answer lies partly with threading, and perhaps
with a more radical rewrite of the DWARF psymbol reader.  However,
those are difficult goals; and in the short term, I found that just
profiling the reader and making small improvements can make a
difference.

This series improves the performance of the DWARF partial symbol
reader about 10% (more in one case) on some real-world executables.
See the first patch for details (I chose to put the details there so
they would end up in the eventual git log).

Regression tested on x86-64 Fedora 30.

Let me know what you think.

Tom

Comments

Kevin Buettner via Gdb-patches May 20, 2020, 7:30 p.m. | #1
On Wed, May 20, 2020 at 12:40 PM Tom Tromey <tromey@adacore.com> wrote:
>
> A personal goal of mine is to improve the startup time of gdb.  In the
> long run, I think the answer lies partly with threading, and perhaps
> with a more radical rewrite of the DWARF psymbol reader.  However,
> those are difficult goals; and in the short term, I found that just
> profiling the reader and making small improvements can make a
> difference.
>
> This series improves the performance of the DWARF partial symbol
> reader about 10% (more in one case) on some real-world executables.
> See the first patch for details (I chose to put the details there so
> they would end up in the eventual git log).
>
> Regression tested on x86-64 Fedora 30.
>
> Let me know what you think.

These look good to me, for what it's worth.

Christian
Simon Marchi May 20, 2020, 9:08 p.m. | #2
On 2020-05-20 1:40 p.m., Tom Tromey wrote:
> A personal goal of mine is to improve the startup time of gdb.  In the
> long run, I think the answer lies partly with threading, and perhaps
> with a more radical rewrite of the DWARF psymbol reader.  However,
> those are difficult goals; and in the short term, I found that just
> profiling the reader and making small improvements can make a
> difference.
>
> This series improves the performance of the DWARF partial symbol
> reader about 10% (more in one case) on some real-world executables.
> See the first patch for details (I chose to put the details there so
> they would end up in the eventual git log).
>
> Regression tested on x86-64 Fedora 30.
>
> Let me know what you think.
>
> Tom

I tried the series as a whole, with these two files, libxul.so, which reads this debug info file:

$ l /usr/lib/debug/.build-id/06/bc3dd11d2331977ff78ce8e18c59216a8b9a61.debug
-rwxrwxr-x 1 root root 1.5G May  8 12:21 /usr/lib/debug/.build-id/06/bc3dd11d2331977ff78ce8e18c59216a8b9a61.debug

and libwebkit2gtk-4.0.so.37.28.5, which reads this debug info file:

$ l /usr/lib/debug/.build-id/77/5b4022ee4a85d12697b8791001b40570c25f98.debug
-rwxrwxr-x 1 root root 1.4G Aug 15  2018 /usr/lib/debug/.build-id/77/5b4022ee4a85d12697b8791001b40570c25f98.debug

So both are about the same size.  This is without the patchset applied:

$ for i in 1 2 3 4 5; do time ./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch; done
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  97.10s user 1.81s system 102% cpu 1:36.94 total
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  97.61s user 1.96s system 102% cpu 1:37.55 total
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  99.33s user 1.90s system 101% cpu 1:39.34 total
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  96.87s user 1.95s system 101% cpu 1:36.92 total
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  97.19s user 1.94s system 102% cpu 1:37.10 total

$ for i in 1 2 3 4 5; do time ./gdb -nx --data-directory=data-directory /usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37.28.5 -batch; done
./gdb -nx --data-directory=data-directory  -batch  96.66s user 1.27s system 101% cpu 1:36.76 total
./gdb -nx --data-directory=data-directory  -batch  95.63s user 1.45s system 101% cpu 1:35.92 total
./gdb -nx --data-directory=data-directory  -batch  92.45s user 1.24s system 101% cpu 1:32.62 total
./gdb -nx --data-directory=data-directory  -batch  96.55s user 1.45s system 101% cpu 1:36.85 total
./gdb -nx --data-directory=data-directory  -batch  92.75s user 1.34s system 101% cpu 1:32.93 total

And this is with the patchset applied:

$ for i in 1 2 3 4 5; do time ./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch; done
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  58.08s user 1.71s system 103% cpu 57.780 total
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  57.89s user 1.75s system 103% cpu 57.618 total
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  57.85s user 1.67s system 103% cpu 57.492 total
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  58.03s user 1.85s system 103% cpu 57.883 total
./gdb -nx --data-directory=data-directory /usr/lib/firefox/libxul.so -batch  58.16s user 1.73s system 103% cpu 57.833 total

$ for i in 1 2 3 4 5; do time ./gdb -nx --data-directory=data-directory /usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37.28.5 -batch; done
./gdb -nx --data-directory=data-directory  -batch  57.81s user 1.17s system 102% cpu 57.788 total
./gdb -nx --data-directory=data-directory  -batch  57.60s user 1.27s system 101% cpu 57.728 total
./gdb -nx --data-directory=data-directory  -batch  57.75s user 1.18s system 101% cpu 57.847 total
./gdb -nx --data-directory=data-directory  -batch  57.33s user 1.19s system 102% cpu 57.318 total
./gdb -nx --data-directory=data-directory  -batch  57.95s user 1.17s system 101% cpu 57.967 total

It's still a bit too long for an interactive user to wait, but it's quite an improvement!
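
As a rough check on the timings above, the sketch below averages the wall-clock "total" column for each file and works out the relative reduction.  The helper names (to_seconds, reduction) are made up for illustration, and the figures are copied by hand from the runs above, so treat it as a back-of-the-envelope calculation rather than a proper benchmark harness.

```python
from statistics import mean


def to_seconds(total):
    """Convert a `time` total such as '1:36.94' or '57.780' to seconds."""
    seconds = 0.0
    for part in total.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def reduction(before, after):
    """Percent reduction in mean wall-clock time from before to after."""
    return 100.0 * (1.0 - mean(after) / mean(before))


# Wall-clock totals copied from the five runs of each configuration above.
LIBXUL_BEFORE = [to_seconds(t) for t in
                 ["1:36.94", "1:37.55", "1:39.34", "1:36.92", "1:37.10"]]
LIBXUL_AFTER = [to_seconds(t) for t in
                ["57.780", "57.618", "57.492", "57.883", "57.833"]]
WEBKIT_BEFORE = [to_seconds(t) for t in
                 ["1:36.76", "1:35.92", "1:32.62", "1:36.85", "1:32.93"]]
WEBKIT_AFTER = [to_seconds(t) for t in
                ["57.788", "57.728", "57.847", "57.318", "57.967"]]

for name, before, after in [
    ("libxul.so", LIBXUL_BEFORE, LIBXUL_AFTER),
    ("libwebkit2gtk", WEBKIT_BEFORE, WEBKIT_AFTER),
]:
    print(f"{name}: {mean(before):.2f}s -> {mean(after):.2f}s "
          f"({reduction(before, after):.1f}% less wall-clock time)")
```

With these inputs both files come out at roughly a 40% reduction in mean wall-clock time.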

Simon
Tom Tromey May 27, 2020, 5:48 p.m. | #3
>>>>> "Tom" == Tom Tromey <tromey@adacore.com> writes:


Tom> This series improves the performance of the DWARF partial symbol
Tom> reader about 10% (more in one case) on some real-world executables.
Tom> See the first patch for details (I chose to put the details there so
Tom> they would end up in the eventual git log).

I've rebased these and re-run the performance tests to make sure they
are still improvements.  So, I'm checking them in.

Tom