[review] Precompute hash value for symbol_set_names

Message ID gerrit.1572031795000.I044449e7eb60cffc1c43efd3412f2b485bd9faac@gnutoolchain-gerrit.osci.io
State Superseded
Headers show
Series
  • [review] Precompute hash value for symbol_set_names
Related show

Commit Message

Simon Marchi (Code Review) Oct. 25, 2019, 7:29 p.m.
Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307
......................................................................

Precompute hash value for symbol_set_names

We can also compute the hash for the mangled name on a background
thread so make this function even faster (about a 7% speedup).

gdb/ChangeLog:

2019-10-03  Christian Biesinger  <cbiesinger@google.com>

	* minsyms.c (minimal_symbol_reader::install): Also compute the hash
	of the mangled name on the background thread.
	* symtab.c (symbol_set_names): Allow passing in the hash of the
	linkage_name.
	* symtab.h (symbol_set_names): Likewise.

Change-Id: I044449e7eb60cffc1c43efd3412f2b485bd9faac
---
M gdb/minsyms.c
M gdb/symtab.c
M gdb/symtab.h
3 files changed, 22 insertions(+), 6 deletions(-)




-- 
Gerrit-Project: binutils-gdb
Gerrit-Branch: master
Gerrit-Change-Id: I044449e7eb60cffc1c43efd3412f2b485bd9faac
Gerrit-Change-Number: 307
Gerrit-PatchSet: 1
Gerrit-Owner: Christian Biesinger <cbiesinger@google.com>
Gerrit-MessageType: newchange

Comments

Simon Marchi (Code Review) Oct. 29, 2019, 7:28 p.m. | #1
Tom Tromey has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307
......................................................................


Patch Set 1: Code-Review+1

(4 comments)

This looks good.  I found a few nits, see below.
I'll probably put in the threading patches "soon", they've
been few a through rounds of review and have been sitting for
a while now.

https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/minsyms.c 
File gdb/minsyms.c:

https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/minsyms.c@1252 
PS1, Line 1252: 
1243 | 	  else
1244 | 	    *copyto++ = *copyfrom++;
1245 | 	}
1246 |       *copyto++ = *copyfrom++;
1247 |       mcount = copyto - msymbol;
1248 |     }
1249 |   return (mcount);
1250 | }
1251 | 
1252 | struct computed_hash_values {

{ on next line.


https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/minsyms.c@1392 
PS1, Line 1392: 
1370 | #endif
     | ...
1383 | 		   /* This will be freed later, by symbol_set_names.  */
1384 | 		   char *demangled_name
1385 | 		     = symbol_find_demangled_name (msym, msym->name);
1386 | 		   symbol_set_demangled_name
1387 | 		     (msym, demangled_name,
1388 | 		      &m_objfile->per_bfd->storage_obstack);
1389 | 		   msym->name_set = 1;
1390 | 
1391 | 		   size_t idx = msym - msymbols;
1392 | 		   hash_values[idx].mangled_name_hash = htab_hash_string (msym->name);

over-long line?  Also, fast_hash


https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/symtab.h 
File gdb/symtab.h:

https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/symtab.h@518 
PS1, Line 518: 
509 |    it.  Used for constructs which do not have a mangled name,
510 |    e.g. struct tags.  Unlike SYMBOL_SET_NAMES, linkage_name must
511 |    be terminated and either already on the objfile's obstack or
512 |    permanently allocated.  */
513 | #define SYMBOL_SET_LINKAGE_NAME(symbol,linkage_name) \
514 |   (symbol)->ginfo.name = (linkage_name)
515 | 
516 | /* Set the linkage and natural names of a symbol, by demangling
517 |    the linkage name.  Optionally, HASH can be set to the value
518 |    of htab_hash_string (linkage_name) (if nullterminated), to

This has to refer to `fast_hash` now.


https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/symtab.h@526 
PS1, Line 526: 
517 |    the linkage name.  Optionally, HASH can be set to the value
518 |    of htab_hash_string (linkage_name) (if nullterminated), to
519 |    speed up this function.  */
520 | #define SYMBOL_SET_NAMES(symbol,linkage_name,len,copy_name,objfile)	\
521 |   symbol_set_names (&(symbol)->ginfo, linkage_name, len, copy_name, \
522 | 		    (objfile)->per_bfd)
523 | extern void symbol_set_names (struct general_symbol_info *symbol,
524 | 			      const char *linkage_name, int len, bool copy_name,
525 | 			      struct objfile_per_bfd_storage *per_bfd,
526 | 			      hashval_t hash = 0);

I suppose an optional<hashval_t> would avoid the problem
when the hash value is actually 0



-- 
Gerrit-Project: binutils-gdb
Gerrit-Branch: master
Gerrit-Change-Id: I044449e7eb60cffc1c43efd3412f2b485bd9faac
Gerrit-Change-Number: 307
Gerrit-PatchSet: 1
Gerrit-Owner: Christian Biesinger <cbiesinger@google.com>
Gerrit-Reviewer: Tom Tromey <tromey@sourceware.org>
Gerrit-Comment-Date: Tue, 29 Oct 2019 19:28:03 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment
Simon Marchi (Code Review) Oct. 29, 2019, 9:56 p.m. | #2
Christian Biesinger has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307
......................................................................


Uploaded patch set 2.

(4 comments)

https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/minsyms.c 
File gdb/minsyms.c:

https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/minsyms.c@1252 
PS1, Line 1252: 
1243 | 	  else
1244 | 	    *copyto++ = *copyfrom++;
1245 | 	}
1246 |       *copyto++ = *copyfrom++;
1247 |       mcount = copyto - msymbol;
1248 |     }
1249 |   return (mcount);
1250 | }
1251 | 
1252 | struct computed_hash_values {

> { on next line.


Done


https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/minsyms.c@1392 
PS1, Line 1392: 
1370 | #endif
     | ...
1383 | 		   /* This will be freed later, by symbol_set_names.  */
1384 | 		   char *demangled_name
1385 | 		     = symbol_find_demangled_name (msym, msym->name);
1386 | 		   symbol_set_demangled_name
1387 | 		     (msym, demangled_name,
1388 | 		      &m_objfile->per_bfd->storage_obstack);
1389 | 		   msym->name_set = 1;
1390 | 
1391 | 		   size_t idx = msym - msymbols;
1392 | 		   hash_values[idx].mangled_name_hash = htab_hash_string (msym->name);

> over-long line?  Also, fast_hash


Great catch, thanks!

Since I now need strlen (msym->name) in two places, I decided to store that in hash_values[idx] as well. Let me know if you dislike that.


https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/symtab.h 
File gdb/symtab.h:

https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/symtab.h@518 
PS1, Line 518: 
509 |    it.  Used for constructs which do not have a mangled name,
510 |    e.g. struct tags.  Unlike SYMBOL_SET_NAMES, linkage_name must
511 |    be terminated and either already on the objfile's obstack or
512 |    permanently allocated.  */
513 | #define SYMBOL_SET_LINKAGE_NAME(symbol,linkage_name) \
514 |   (symbol)->ginfo.name = (linkage_name)
515 | 
516 | /* Set the linkage and natural names of a symbol, by demangling
517 |    the linkage name.  Optionally, HASH can be set to the value
518 |    of htab_hash_string (linkage_name) (if nullterminated), to

> This has to refer to `fast_hash` now.


Done


https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307/1/gdb/symtab.h@526 
PS1, Line 526: 
517 |    the linkage name.  Optionally, HASH can be set to the value
518 |    of htab_hash_string (linkage_name) (if nullterminated), to
519 |    speed up this function.  */
520 | #define SYMBOL_SET_NAMES(symbol,linkage_name,len,copy_name,objfile)	\
521 |   symbol_set_names (&(symbol)->ginfo, linkage_name, len, copy_name, \
522 | 		    (objfile)->per_bfd)
523 | extern void symbol_set_names (struct general_symbol_info *symbol,
524 | 			      const char *linkage_name, int len, bool copy_name,
525 | 			      struct objfile_per_bfd_storage *per_bfd,
526 | 			      hashval_t hash = 0);

> I suppose an optional<hashval_t> would avoid the problem […]


Done



-- 
Gerrit-Project: binutils-gdb
Gerrit-Branch: master
Gerrit-Change-Id: I044449e7eb60cffc1c43efd3412f2b485bd9faac
Gerrit-Change-Number: 307
Gerrit-PatchSet: 2
Gerrit-Owner: Christian Biesinger <cbiesinger@google.com>
Gerrit-Reviewer: Tom Tromey <tromey@sourceware.org>
Gerrit-Comment-Date: Tue, 29 Oct 2019 21:56:00 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: Tom Tromey <tromey@sourceware.org>
Gerrit-MessageType: comment
Simon Marchi (Code Review) Nov. 26, 2019, 10:08 p.m. | #3
Tom Tromey has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307
......................................................................


Patch Set 4: Code-Review+2

(1 comment)

Thanks for doing this.  This looks good to me.  I have one nit;
this is ok with that fixed.

| --- gdb/minsyms.c
| +++ gdb/minsyms.c
| @@ -1252,13 +1252,19 @@ static void
|  clear_minimal_symbol_hash_tables (struct objfile *objfile)
|  {
|    for (size_t i = 0; i < MINIMAL_SYMBOL_HASH_SIZE; i++)
|      {
|        objfile->per_bfd->msymbol_hash[i] = 0;
|        objfile->per_bfd->msymbol_demangled_hash[i] = 0;
|      }
|  }
|  
| +struct computed_hash_values

PS4, Line 1261:

This should have a comment explaining its purpose.
Also the fields should have comments.

| +{
| +  size_t name_length;
| +  hashval_t mangled_name_hash;
| +};
| +
|  /* Build (or rebuild) the minimal symbol hash tables.  This is necessary
|     after compacting or sorting the table since the entries move around
|     thus causing the internal minimal_symbol pointers to become jumbled.  */
|    

-- 
Gerrit-Project: binutils-gdb
Gerrit-Branch: master
Gerrit-Change-Id: I044449e7eb60cffc1c43efd3412f2b485bd9faac
Gerrit-Change-Number: 307
Gerrit-PatchSet: 4
Gerrit-Owner: Christian Biesinger <cbiesinger@google.com>
Gerrit-Reviewer: Tom Tromey <tromey@sourceware.org>
Gerrit-Comment-Date: Tue, 26 Nov 2019 22:08:35 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment
Simon Marchi (Code Review) Nov. 26, 2019, 10:23 p.m. | #4
Christian Biesinger has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307
......................................................................


Patch Set 5:

(1 comment)

| --- gdb/minsyms.c
| +++ gdb/minsyms.c
| @@ -1252,13 +1252,19 @@ static void
|  clear_minimal_symbol_hash_tables (struct objfile *objfile)
|  {
|    for (size_t i = 0; i < MINIMAL_SYMBOL_HASH_SIZE; i++)
|      {
|        objfile->per_bfd->msymbol_hash[i] = 0;
|        objfile->per_bfd->msymbol_demangled_hash[i] = 0;
|      }
|  }
|  
| +struct computed_hash_values

PS4, Line 1261:

Done

| +{
| +  size_t name_length;
| +  hashval_t mangled_name_hash;
| +};
| +
|  /* Build (or rebuild) the minimal symbol hash tables.  This is necessary
|     after compacting or sorting the table since the entries move around
|     thus causing the internal minimal_symbol pointers to become jumbled.  */
|    

-- 
Gerrit-Project: binutils-gdb
Gerrit-Branch: master
Gerrit-Change-Id: I044449e7eb60cffc1c43efd3412f2b485bd9faac
Gerrit-Change-Number: 307
Gerrit-PatchSet: 5
Gerrit-Owner: Christian Biesinger <cbiesinger@google.com>
Gerrit-Reviewer: Christian Biesinger <cbiesinger@google.com>
Gerrit-Reviewer: Tom Tromey <tromey@sourceware.org>
Gerrit-Comment-Date: Tue, 26 Nov 2019 22:23:59 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: Tom Tromey <tromey@sourceware.org>
Gerrit-MessageType: comment
Simon Marchi (Code Review) Nov. 27, 2019, 6:03 p.m. | #5
Tom Tromey has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/307
......................................................................


Patch Set 5: Code-Review+2

Thank you.


-- 
Gerrit-Project: binutils-gdb
Gerrit-Branch: master
Gerrit-Change-Id: I044449e7eb60cffc1c43efd3412f2b485bd9faac
Gerrit-Change-Number: 307
Gerrit-PatchSet: 5
Gerrit-Owner: Christian Biesinger <cbiesinger@google.com>
Gerrit-Reviewer: Christian Biesinger <cbiesinger@google.com>
Gerrit-Reviewer: Tom Tromey <tromey@sourceware.org>
Gerrit-Comment-Date: Wed, 27 Nov 2019 18:03:52 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment

Patch

diff --git a/gdb/minsyms.c b/gdb/minsyms.c
index bfcd5d5..51894b2 100644
--- a/gdb/minsyms.c
+++ b/gdb/minsyms.c
@@ -1249,6 +1249,10 @@ 
   return (mcount);
 }
 
+struct computed_hash_values {
+  hashval_t mangled_name_hash;
+};
+
 /* Build (or rebuild) the minimal symbol hash tables.  This is necessary
    after compacting or sorting the table since the entries move around
    thus causing the internal minimal_symbol pointers to become jumbled.  */
@@ -1365,6 +1369,8 @@ 
       std::mutex demangled_mutex;
 #endif
 
+      std::vector<computed_hash_values> hash_values (mcount);
+
       msymbols = m_objfile->per_bfd->msymbols.get ();
       gdb::parallel_for_each
 	(&msymbols[0], &msymbols[mcount],
@@ -1381,6 +1387,9 @@ 
 		     (msym, demangled_name,
 		      &m_objfile->per_bfd->storage_obstack);
 		   msym->name_set = 1;
+
+		   size_t idx = msym - msymbols;
+		   hash_values[idx].mangled_name_hash = htab_hash_string (msym->name);
 		 }
 	     }
 	   {
@@ -1391,9 +1400,11 @@ 
 #endif
 	     for (minimal_symbol *msym = start; msym < end; ++msym)
 	       {
+		 size_t idx = msym - msymbols;
 		 symbol_set_names (msym, msym->name,
 				   strlen (msym->name), 0,
-				   m_objfile->per_bfd);
+				   m_objfile->per_bfd,
+				   hash_values[idx].mangled_name_hash);
 	       }
 	   }
 	 });
diff --git a/gdb/symtab.c b/gdb/symtab.c
index abc6a77..06d8370 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -825,7 +825,7 @@ 
 void
 symbol_set_names (struct general_symbol_info *gsymbol,
 		  const char *linkage_name, int len, bool copy_name,
-		  struct objfile_per_bfd_storage *per_bfd)
+		  struct objfile_per_bfd_storage *per_bfd, hashval_t hash)
 {
   struct demangled_name_entry **slot;
   /* A 0-terminated copy of the linkage name.  */
@@ -868,9 +868,11 @@ 
     linkage_name_copy = linkage_name;
 
   struct demangled_name_entry entry (gdb::string_view (linkage_name_copy, len));
+  if (hash == 0)
+    hash = hash_demangled_name_entry (&entry);
   slot = ((struct demangled_name_entry **)
-	  htab_find_slot (per_bfd->demangled_names_hash.get (),
-			  &entry, INSERT));
+          htab_find_slot_with_hash (per_bfd->demangled_names_hash.get (),
+				    &entry, hash, INSERT));
 
   /* If this name is not in the hash table, add it.  */
   if (*slot == NULL
diff --git a/gdb/symtab.h b/gdb/symtab.h
index 7d41de9..48392b9 100644
--- a/gdb/symtab.h
+++ b/gdb/symtab.h
@@ -514,13 +514,16 @@ 
   (symbol)->ginfo.name = (linkage_name)
 
 /* Set the linkage and natural names of a symbol, by demangling
-   the linkage name.  */
+   the linkage name.  Optionally, HASH can be set to the value
+   of htab_hash_string (linkage_name) (if nullterminated), to
+   speed up this function.  */
 #define SYMBOL_SET_NAMES(symbol,linkage_name,len,copy_name,objfile)	\
   symbol_set_names (&(symbol)->ginfo, linkage_name, len, copy_name, \
 		    (objfile)->per_bfd)
 extern void symbol_set_names (struct general_symbol_info *symbol,
 			      const char *linkage_name, int len, bool copy_name,
-			      struct objfile_per_bfd_storage *per_bfd);
+			      struct objfile_per_bfd_storage *per_bfd,
+			      hashval_t hash = 0);
 
 /* Now come lots of name accessor macros.  Short version as to when to
    use which: Use SYMBOL_NATURAL_NAME to refer to the name of the