Fix buffer overflow in ada-lang.c:move_bits

Message ID 20181024162037.21024-1-tom@tromey.com
State New
Headers show
Series
  • Fix buffer overflow in ada-lang.c:move_bits
Related show

Commit Message

Tom Tromey Oct. 24, 2018, 4:20 p.m.
-fsanitize=address showed that ada-lang.c:move_bits can run off the
end of the source buffer.  I believe this patch fixes the problem, by
arranging not to read from the source buffer once there are sufficient
bits in the accumulator.

gdb/ChangeLog
2018-10-23  Tom Tromey  <tom@tromey.com>

	* ada-lang.c (move_bits): Don't run off the end of the source
	buffer.
---
 gdb/ChangeLog  |  5 +++++
 gdb/ada-lang.c | 18 ++++++++++++------
 2 files changed, 17 insertions(+), 6 deletions(-)

-- 
2.17.1

Comments

Joel Brobecker Nov. 1, 2018, 3:35 p.m. | #1
Hi Tom,

> -fsanitize=address showed that ada-lang.c:move_bits can run off the

> end of the source buffer.  I believe this patch fixes the problem, by

> arranging not to read from the source buffer once there are sufficient

> bits in the accumulator.

> 

> gdb/ChangeLog

> 2018-10-23  Tom Tromey  <tom@tromey.com>

> 

> 	* ada-lang.c (move_bits): Don't run off the end of the source

> 	buffer.


Thanks for the patch!

This is a part of the code that always forces me to think twice
(or ten times), each time I try to touch it. I should really start
adding comments to this code that detail what we are trying to do
as we do it.

I tested your change through our testsuite on the various baremetal
targets we have, and noticed that it causes regressions on ppc and arm
targets. It's hopefully something small, but just being back from
a holiday, I'm a bit tied up at work; I'll put that issue on my TODO
list to look at further.

> ---

>  gdb/ChangeLog  |  5 +++++

>  gdb/ada-lang.c | 18 ++++++++++++------

>  2 files changed, 17 insertions(+), 6 deletions(-)

> 

> diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c

> index 1462271a71..7288d65df6 100644

> --- a/gdb/ada-lang.c

> +++ b/gdb/ada-lang.c

> @@ -2682,9 +2682,12 @@ move_bits (gdb_byte *target, int targ_offset, const gdb_byte *source,

>          {

>            int unused_right;

>  

> -          accum = (accum << HOST_CHAR_BIT) + (unsigned char) *source;

> -          accum_bits += HOST_CHAR_BIT;

> -          source += 1;

> +	  if (n > accum_bits)

> +	    {

> +	      accum = (accum << HOST_CHAR_BIT) + (unsigned char) *source;

> +	      accum_bits += HOST_CHAR_BIT;

> +	      source += 1;

> +	    }

>            chunk_size = HOST_CHAR_BIT - targ_offset;

>            if (chunk_size > n)

>              chunk_size = n;

> @@ -2707,9 +2710,12 @@ move_bits (gdb_byte *target, int targ_offset, const gdb_byte *source,

>  

>        while (n > 0)

>          {

> -          accum = accum + ((unsigned char) *source << accum_bits);

> -          accum_bits += HOST_CHAR_BIT;

> -          source += 1;

> +	  if (n > accum_bits)

> +	    {

> +	      accum = accum + ((unsigned char) *source << accum_bits);

> +	      accum_bits += HOST_CHAR_BIT;

> +	      source += 1;

> +	    }

>            chunk_size = HOST_CHAR_BIT - targ_offset;

>            if (chunk_size > n)

>              chunk_size = n;

> -- 

> 2.17.1


-- 
Joel
Tom Tromey Nov. 1, 2018, 10:11 p.m. | #2
>>>>> "Joel" == Joel Brobecker <brobecker@adacore.com> writes:


Joel> I tested your change through our testsuite on the various baremetal
Joel> targets we have, and noticed that it causes regressions on ppc and arm
Joel> targets. It's hopefully something small, but just being back from
Joel> a holiday, I'm a bit tied up at work; I'll put that issue on my TODO
Joel> list to look at further.

Thanks.  To reproduce the problem I saw, just rebuild with
-fsanitize=address and run the gdb.ada tests.  I don't recall exactly
which ones failed, but you should definitely see a read off the end of
the source buffer.

Tom
Pedro Alves Nov. 8, 2018, 7:11 p.m. | #3
On 11/01/2018 03:35 PM, Joel Brobecker wrote:
> Hi Tom,

> 

>> -fsanitize=address showed that ada-lang.c:move_bits can run off the

>> end of the source buffer.  I believe this patch fixes the problem, by

>> arranging not to read from the source buffer once there are sufficient

>> bits in the accumulator.

>>

>> gdb/ChangeLog

>> 2018-10-23  Tom Tromey  <tom@tromey.com>

>>

>> 	* ada-lang.c (move_bits): Don't run off the end of the source

>> 	buffer.

> 

> Thanks for the patch!

> 

> This is a part of the code that always forces me to think twice

> (or ten times), each time I try to touch it. I should really start

> adding comments to this code that detail what we are trying to do

> as we do it.

> 

> I tested your change through our testsuite on the various baremetal

> targets we have, and noticed that it causes regressions on ppc and arm

> targets. It's hopefully something small, but just being back from

> a holiday, I'm a bit tied up at work; I'll put that issue on my TODO

> list to look at further.


I was going to suggest that this would benefit from unit tests in
the style of dwarf2read.c:copy_bitwise's, but, actually, isn't this
exactly the same as copy_bitwise?  Can we get rid of ada-lang.c:move_bits?
(And maybe move copy_bitwise elsewhere?)

Thanks,
Pedro Alves
Pedro Alves Nov. 8, 2018, 7:12 p.m. | #4
On 11/08/2018 07:11 PM, Pedro Alves wrote:
> On 11/01/2018 03:35 PM, Joel Brobecker wrote:

>> Hi Tom,

>>

>>> -fsanitize=address showed that ada-lang.c:move_bits can run off the

>>> end of the source buffer.  I believe this patch fixes the problem, by

>>> arranging not to read from the source buffer once there are sufficient

>>> bits in the accumulator.

>>>

>>> gdb/ChangeLog

>>> 2018-10-23  Tom Tromey  <tom@tromey.com>

>>>

>>> 	* ada-lang.c (move_bits): Don't run off the end of the source

>>> 	buffer.

>>

>> Thanks for the patch!

>>

>> This is a part of the code that always forces me to think twice

>> (or ten times), each time I try to touch it. I should really start

>> adding comments to this code that detail what we are trying to do

>> as we do it.

>>

>> I tested your change through our testsuite on the various baremetal

>> targets we have, and noticed that it causes regressions on ppc and arm

>> targets. It's hopefully something small, but just being back from

>> a holiday, I'm a bit tied up at work; I'll put that issue on my TODO

>> list to look at further.

> 

> I was going to suggest that this would benefit from unit tests in

> the style of dwarf2read.c:copy_bitwise's, but, actually, isn't this

> exactly the same as copy_bitwise?  Can we get rid of ada-lang.c:move_bits?

> (And maybe move copy_bitwise elsewhere?)


I meant to say dwarf2loc.c instead of dwarf2read.c.

Thanks,
Pedro Alves
Joel Brobecker Nov. 9, 2018, 5:16 p.m. | #5
> > I was going to suggest that this would benefit from unit tests in

> > the style of dwarf2read.c:copy_bitwise's, but, actually, isn't this

> > exactly the same as copy_bitwise?  Can we get rid of ada-lang.c:move_bits?

> > (And maybe move copy_bitwise elsewhere?)

> 

> I meant to say dwarf2loc.c instead of dwarf2read.c.


It does look exactly the same, doesn't it? I'll see if we can just
re-use dwarf2loc's copy_bitwise. Thanks for the suggestion!

-- 
Joel
Joel Brobecker Nov. 14, 2018, 5:11 p.m. | #6
Hello,

> > > I was going to suggest that this would benefit from unit tests in

> > > the style of dwarf2read.c:copy_bitwise's, but, actually, isn't this

> > > exactly the same as copy_bitwise?  Can we get rid of ada-lang.c:move_bits?

> > > (And maybe move copy_bitwise elsewhere?)

> > 

> > I meant to say dwarf2loc.c instead of dwarf2read.c.

> 

> It does look exactly the same, doesn't it? I'll see if we can just

> re-use dwarf2loc's copy_bitwise. Thanks for the suggestion!


How about the attached? I ran it through AdaCore's testsuite on
all the platforms we support as well as the official testsuite on
x86_64-linux. No regression.

gdb/ChangeLog:

        * ada-lang.c (move_bits): Delete. Update all callers to use
        copy_bitwise instead.
        * dwarf2loc.c (copy_bitwise, bits_to_str::bits_to_str)
        (selftests::check_copy_bitwise, selftests::copy_bitwise_tests):
        Move from here to utils.c.
        (_initialize_dwarf2loc): Remove call to register copy_bitwise
        selftests.
        * utils.h (copy_bitwise): Add declaration.
        * utils.c (copy_bitwise, bits_to_str::bits_to_str)
        (selftests::check_copy_bitwise, selftests::copy_bitwise_tests):
        Moved here from dwarf2loc.c.
        (_initialize_utils): Register copy_bitwise selftests.

Thank you!
-- 
Joel
From 888fa6e7624bdee066d0e493d4fef108462cfe66 Mon Sep 17 00:00:00 2001
From: Joel Brobecker <brobecker@adacore.com>

Date: Fri, 9 Nov 2018 12:39:12 -0500
Subject: [PATCH] delete ada-lang.c::move_bits, sharing and re-using
 copy_bitwise instead

This patch deletes ada-lang.c's move_bits function entirely, and
replaces all calls to it by calls to copy_bitwise instead. Because
the latter function was declared locally inside dwarf2loc.c, this
patch also move the function to a common area, and makes it non-static.

gdb/ChangeLog:

        * ada-lang.c (move_bits): Delete. Update all callers to use
        copy_bitwise instead.
        * dwarf2loc.c (copy_bitwise, bits_to_str::bits_to_str)
        (selftests::check_copy_bitwise, selftests::copy_bitwise_tests):
        Move from here to utils.c.
        (_initialize_dwarf2loc): Remove call to register copy_bitwise
        selftests.
        * utils.h (copy_bitwise): Add declaration.
        * utils.c (copy_bitwise, bits_to_str::bits_to_str)
        (selftests::check_copy_bitwise, selftests::copy_bitwise_tests):
        Moved here from dwarf2loc.c.
        (_initialize_utils): Register copy_bitwise selftests.

Tested on x86_64-linux, no regression. Also tested using AdaCore's
testsuite on a collection of small endian and big endian platforms.
---
 gdb/ada-lang.c  |  88 +++------------------
 gdb/dwarf2loc.c | 234 --------------------------------------------------------
 gdb/utils.c     | 229 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 gdb/utils.h     |   8 ++
 4 files changed, 247 insertions(+), 312 deletions(-)

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index b7cb122..8967cf1 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -190,8 +190,6 @@ static int ada_is_unconstrained_packed_array_type (struct type *);
 static struct value *value_subscript_packed (struct value *, int,
                                              struct value **);
 
-static void move_bits (gdb_byte *, int, const gdb_byte *, int, int, int);
-
 static struct value *coerce_unspec_val_to_type (struct value *,
                                                 struct type *);
 
@@ -2669,72 +2667,6 @@ ada_value_primitive_packed_val (struct value *obj, const gdb_byte *valaddr,
   return v;
 }
 
-/* Move N bits from SOURCE, starting at bit offset SRC_OFFSET to
-   TARGET, starting at bit offset TARG_OFFSET.  SOURCE and TARGET must
-   not overlap.  */
-static void
-move_bits (gdb_byte *target, int targ_offset, const gdb_byte *source,
-	   int src_offset, int n, int bits_big_endian_p)
-{
-  unsigned int accum, mask;
-  int accum_bits, chunk_size;
-
-  target += targ_offset / HOST_CHAR_BIT;
-  targ_offset %= HOST_CHAR_BIT;
-  source += src_offset / HOST_CHAR_BIT;
-  src_offset %= HOST_CHAR_BIT;
-  if (bits_big_endian_p)
-    {
-      accum = (unsigned char) *source;
-      source += 1;
-      accum_bits = HOST_CHAR_BIT - src_offset;
-
-      while (n > 0)
-        {
-          int unused_right;
-
-          accum = (accum << HOST_CHAR_BIT) + (unsigned char) *source;
-          accum_bits += HOST_CHAR_BIT;
-          source += 1;
-          chunk_size = HOST_CHAR_BIT - targ_offset;
-          if (chunk_size > n)
-            chunk_size = n;
-          unused_right = HOST_CHAR_BIT - (chunk_size + targ_offset);
-          mask = ((1 << chunk_size) - 1) << unused_right;
-          *target =
-            (*target & ~mask)
-            | ((accum >> (accum_bits - chunk_size - unused_right)) & mask);
-          n -= chunk_size;
-          accum_bits -= chunk_size;
-          target += 1;
-          targ_offset = 0;
-        }
-    }
-  else
-    {
-      accum = (unsigned char) *source >> src_offset;
-      source += 1;
-      accum_bits = HOST_CHAR_BIT - src_offset;
-
-      while (n > 0)
-        {
-          accum = accum + ((unsigned char) *source << accum_bits);
-          accum_bits += HOST_CHAR_BIT;
-          source += 1;
-          chunk_size = HOST_CHAR_BIT - targ_offset;
-          if (chunk_size > n)
-            chunk_size = n;
-          mask = ((1 << chunk_size) - 1) << targ_offset;
-          *target = (*target & ~mask) | ((accum << targ_offset) & mask);
-          n -= chunk_size;
-          accum_bits -= chunk_size;
-          accum >>= chunk_size;
-          target += 1;
-          targ_offset = 0;
-        }
-    }
-}
-
 /* Store the contents of FROMVAL into the location of TOVAL.
    Return a new value with the location of TOVAL and contents of
    FROMVAL.   Handles assignment into packed fields that have
@@ -2777,11 +2709,11 @@ ada_value_assign (struct value *toval, struct value *fromval)
       if (from_size == 0)
 	from_size = TYPE_LENGTH (value_type (fromval)) * TARGET_CHAR_BIT;
       if (gdbarch_bits_big_endian (get_type_arch (type)))
-        move_bits (buffer, value_bitpos (toval),
-		   value_contents (fromval), from_size - bits, bits, 1);
+        copy_bitwise (buffer, value_bitpos (toval),
+		      value_contents (fromval), from_size - bits, bits, 1);
       else
-        move_bits (buffer, value_bitpos (toval),
-		   value_contents (fromval), 0, bits, 0);
+        copy_bitwise (buffer, value_bitpos (toval),
+		      value_contents (fromval), 0, bits, 0);
       write_memory_with_notification (to_addr, buffer, len);
 
       val = value_copy (toval);
@@ -2833,14 +2765,14 @@ value_assign_to_component (struct value *container, struct value *component,
 	  = TYPE_LENGTH (value_type (component)) * TARGET_CHAR_BIT - bits;
       else
 	src_offset = 0;
-      move_bits (value_contents_writeable (container) + offset_in_container,
-		 value_bitpos (container) + bit_offset_in_container,
-		 value_contents (val), src_offset, bits, 1);
+      copy_bitwise (value_contents_writeable (container) + offset_in_container,
+		    value_bitpos (container) + bit_offset_in_container,
+		    value_contents (val), src_offset, bits, 1);
     }
   else
-    move_bits (value_contents_writeable (container) + offset_in_container,
-	       value_bitpos (container) + bit_offset_in_container,
-	       value_contents (val), 0, bits, 0);
+    copy_bitwise (value_contents_writeable (container) + offset_in_container,
+		  value_bitpos (container) + bit_offset_in_container,
+		  value_contents (val), 0, bits, 0);
 }
 
 /* Determine if TYPE is an access to an unconstrained array.  */
diff --git a/gdb/dwarf2loc.c b/gdb/dwarf2loc.c
index ee6a8e2..caa3ab7 100644
--- a/gdb/dwarf2loc.c
+++ b/gdb/dwarf2loc.c
@@ -1555,236 +1555,6 @@ allocate_piece_closure (struct dwarf2_per_cu_data *per_cu,
   return c;
 }
 
-/* Copy NBITS bits from SOURCE to DEST starting at the given bit
-   offsets.  Use the bit order as specified by BITS_BIG_ENDIAN.
-   Source and destination buffers must not overlap.  */
-
-static void
-copy_bitwise (gdb_byte *dest, ULONGEST dest_offset,
-	      const gdb_byte *source, ULONGEST source_offset,
-	      ULONGEST nbits, int bits_big_endian)
-{
-  unsigned int buf, avail;
-
-  if (nbits == 0)
-    return;
-
-  if (bits_big_endian)
-    {
-      /* Start from the end, then work backwards.  */
-      dest_offset += nbits - 1;
-      dest += dest_offset / 8;
-      dest_offset = 7 - dest_offset % 8;
-      source_offset += nbits - 1;
-      source += source_offset / 8;
-      source_offset = 7 - source_offset % 8;
-    }
-  else
-    {
-      dest += dest_offset / 8;
-      dest_offset %= 8;
-      source += source_offset / 8;
-      source_offset %= 8;
-    }
-
-  /* Fill BUF with DEST_OFFSET bits from the destination and 8 -
-     SOURCE_OFFSET bits from the source.  */
-  buf = *(bits_big_endian ? source-- : source++) >> source_offset;
-  buf <<= dest_offset;
-  buf |= *dest & ((1 << dest_offset) - 1);
-
-  /* NBITS: bits yet to be written; AVAIL: BUF's fill level.  */
-  nbits += dest_offset;
-  avail = dest_offset + 8 - source_offset;
-
-  /* Flush 8 bits from BUF, if appropriate.  */
-  if (nbits >= 8 && avail >= 8)
-    {
-      *(bits_big_endian ? dest-- : dest++) = buf;
-      buf >>= 8;
-      avail -= 8;
-      nbits -= 8;
-    }
-
-  /* Copy the middle part.  */
-  if (nbits >= 8)
-    {
-      size_t len = nbits / 8;
-
-      /* Use a faster method for byte-aligned copies.  */
-      if (avail == 0)
-	{
-	  if (bits_big_endian)
-	    {
-	      dest -= len;
-	      source -= len;
-	      memcpy (dest + 1, source + 1, len);
-	    }
-	  else
-	    {
-	      memcpy (dest, source, len);
-	      dest += len;
-	      source += len;
-	    }
-	}
-      else
-	{
-	  while (len--)
-	    {
-	      buf |= *(bits_big_endian ? source-- : source++) << avail;
-	      *(bits_big_endian ? dest-- : dest++) = buf;
-	      buf >>= 8;
-	    }
-	}
-      nbits %= 8;
-    }
-
-  /* Write the last byte.  */
-  if (nbits)
-    {
-      if (avail < nbits)
-	buf |= *source << avail;
-
-      buf &= (1 << nbits) - 1;
-      *dest = (*dest & (~0 << nbits)) | buf;
-    }
-}
-
-#if GDB_SELF_TEST
-
-namespace selftests {
-
-/* Helper function for the unit test of copy_bitwise.  Convert NBITS bits
-   out of BITS, starting at OFFS, to the respective '0'/'1'-string.  MSB0
-   specifies whether to assume big endian bit numbering.  Store the
-   resulting (not null-terminated) string at STR.  */
-
-static void
-bits_to_str (char *str, const gdb_byte *bits, ULONGEST offs,
-	     ULONGEST nbits, int msb0)
-{
-  unsigned int j;
-  size_t i;
-
-  for (i = offs / 8, j = offs % 8; nbits; i++, j = 0)
-    {
-      unsigned int ch = bits[i];
-      for (; j < 8 && nbits; j++, nbits--)
-	*str++ = (ch & (msb0 ? (1 << (7 - j)) : (1 << j))) ? '1' : '0';
-    }
-}
-
-/* Check one invocation of copy_bitwise with the given parameters.  */
-
-static void
-check_copy_bitwise (const gdb_byte *dest, unsigned int dest_offset,
-		    const gdb_byte *source, unsigned int source_offset,
-		    unsigned int nbits, int msb0)
-{
-  size_t len = align_up (dest_offset + nbits, 8);
-  char *expected = (char *) alloca (len + 1);
-  char *actual = (char *) alloca (len + 1);
-  gdb_byte *buf = (gdb_byte *) alloca (len / 8);
-
-  /* Compose a '0'/'1'-string that represents the expected result of
-     copy_bitwise below:
-      Bits from [0, DEST_OFFSET) are filled from DEST.
-      Bits from [DEST_OFFSET, DEST_OFFSET + NBITS) are filled from SOURCE.
-      Bits from [DEST_OFFSET + NBITS, LEN) are filled from DEST.
-
-     E.g., with:
-      dest_offset: 4
-      nbits:       2
-      len:         8
-      dest:        00000000
-      source:      11111111
-
-     We should end up with:
-      buf:         00001100
-                   DDDDSSDD (D=dest, S=source)
-  */
-  bits_to_str (expected, dest, 0, len, msb0);
-  bits_to_str (expected + dest_offset, source, source_offset, nbits, msb0);
-
-  /* Fill BUF with data from DEST, apply copy_bitwise, and convert the
-     result to a '0'/'1'-string.  */
-  memcpy (buf, dest, len / 8);
-  copy_bitwise (buf, dest_offset, source, source_offset, nbits, msb0);
-  bits_to_str (actual, buf, 0, len, msb0);
-
-  /* Compare the resulting strings.  */
-  expected[len] = actual[len] = '\0';
-  if (strcmp (expected, actual) != 0)
-    error (_("copy_bitwise %s != %s (%u+%u -> %u)"),
-	   expected, actual, source_offset, nbits, dest_offset);
-}
-
-/* Unit test for copy_bitwise.  */
-
-static void
-copy_bitwise_tests (void)
-{
-  /* Data to be used as both source and destination buffers.  The two
-     arrays below represent the lsb0- and msb0- encoded versions of the
-     following bit string, respectively:
-       00000000 00011111 11111111 01001000 10100101 11110010
-     This pattern is chosen such that it contains:
-     - constant 0- and 1- chunks of more than a full byte;
-     - 0/1- and 1/0 transitions on all bit positions within a byte;
-     - several sufficiently asymmetric bytes.
-  */
-  static const gdb_byte data_lsb0[] = {
-    0x00, 0xf8, 0xff, 0x12, 0xa5, 0x4f
-  };
-  static const gdb_byte data_msb0[] = {
-    0x00, 0x1f, 0xff, 0x48, 0xa5, 0xf2
-  };
-
-  constexpr size_t data_nbits = 8 * sizeof (data_lsb0);
-  constexpr unsigned max_nbits = 24;
-
-  /* Try all combinations of:
-      lsb0/msb0 bit order (using the respective data array)
-       X [0, MAX_NBITS] copy bit width
-       X feasible source offsets for the given copy bit width
-       X feasible destination offsets
-  */
-  for (int msb0 = 0; msb0 < 2; msb0++)
-    {
-      const gdb_byte *data = msb0 ? data_msb0 : data_lsb0;
-
-      for (unsigned int nbits = 1; nbits <= max_nbits; nbits++)
-	{
-	  const unsigned int max_offset = data_nbits - nbits;
-
-	  for (unsigned source_offset = 0;
-	       source_offset <= max_offset;
-	       source_offset++)
-	    {
-	      for (unsigned dest_offset = 0;
-		   dest_offset <= max_offset;
-		   dest_offset++)
-		{
-		  check_copy_bitwise (data + dest_offset / 8,
-				      dest_offset % 8,
-				      data + source_offset / 8,
-				      source_offset % 8,
-				      nbits, msb0);
-		}
-	    }
-	}
-
-      /* Special cases: copy all, copy nothing.  */
-      check_copy_bitwise (data_lsb0, 0, data_msb0, 0, data_nbits, msb0);
-      check_copy_bitwise (data_msb0, 0, data_lsb0, 0, data_nbits, msb0);
-      check_copy_bitwise (data, data_nbits - 7, data, 9, 0, msb0);
-    }
-}
-
-} /* namespace selftests */
-
-#endif /* GDB_SELF_TEST */
-
 /* Return the number of bytes overlapping a contiguous chunk of N_BITS
    bits whose first bit is located at bit offset START.  */
 
@@ -4737,8 +4507,4 @@ _initialize_dwarf2loc (void)
 			     NULL,
 			     show_entry_values_debug,
 			     &setdebuglist, &showdebuglist);
-
-#if GDB_SELF_TEST
-  selftests::register_test ("copy_bitwise", selftests::copy_bitwise_tests);
-#endif
 }
diff --git a/gdb/utils.c b/gdb/utils.c
index 8d4a744..c088d8b 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -3219,6 +3219,234 @@ strip_leading_path_elements (const char *path, int n)
   return p;
 }
 
+/* See utils.h.  */
+
+void
+copy_bitwise (gdb_byte *dest, ULONGEST dest_offset,
+	      const gdb_byte *source, ULONGEST source_offset,
+	      ULONGEST nbits, int bits_big_endian)
+{
+  unsigned int buf, avail;
+
+  if (nbits == 0)
+    return;
+
+  if (bits_big_endian)
+    {
+      /* Start from the end, then work backwards.  */
+      dest_offset += nbits - 1;
+      dest += dest_offset / 8;
+      dest_offset = 7 - dest_offset % 8;
+      source_offset += nbits - 1;
+      source += source_offset / 8;
+      source_offset = 7 - source_offset % 8;
+    }
+  else
+    {
+      dest += dest_offset / 8;
+      dest_offset %= 8;
+      source += source_offset / 8;
+      source_offset %= 8;
+    }
+
+  /* Fill BUF with DEST_OFFSET bits from the destination and 8 -
+     SOURCE_OFFSET bits from the source.  */
+  buf = *(bits_big_endian ? source-- : source++) >> source_offset;
+  buf <<= dest_offset;
+  buf |= *dest & ((1 << dest_offset) - 1);
+
+  /* NBITS: bits yet to be written; AVAIL: BUF's fill level.  */
+  nbits += dest_offset;
+  avail = dest_offset + 8 - source_offset;
+
+  /* Flush 8 bits from BUF, if appropriate.  */
+  if (nbits >= 8 && avail >= 8)
+    {
+      *(bits_big_endian ? dest-- : dest++) = buf;
+      buf >>= 8;
+      avail -= 8;
+      nbits -= 8;
+    }
+
+  /* Copy the middle part.  */
+  if (nbits >= 8)
+    {
+      size_t len = nbits / 8;
+
+      /* Use a faster method for byte-aligned copies.  */
+      if (avail == 0)
+	{
+	  if (bits_big_endian)
+	    {
+	      dest -= len;
+	      source -= len;
+	      memcpy (dest + 1, source + 1, len);
+	    }
+	  else
+	    {
+	      memcpy (dest, source, len);
+	      dest += len;
+	      source += len;
+	    }
+	}
+      else
+	{
+	  while (len--)
+	    {
+	      buf |= *(bits_big_endian ? source-- : source++) << avail;
+	      *(bits_big_endian ? dest-- : dest++) = buf;
+	      buf >>= 8;
+	    }
+	}
+      nbits %= 8;
+    }
+
+  /* Write the last byte.  */
+  if (nbits)
+    {
+      if (avail < nbits)
+	buf |= *source << avail;
+
+      buf &= (1 << nbits) - 1;
+      *dest = (*dest & (~0 << nbits)) | buf;
+    }
+}
+
+#if GDB_SELF_TEST
+
+namespace selftests {
+
+/* Helper function for the unit test of copy_bitwise.  Convert NBITS bits
+   out of BITS, starting at OFFS, to the respective '0'/'1'-string.  MSB0
+   specifies whether to assume big endian bit numbering.  Store the
+   resulting (not null-terminated) string at STR.  */
+
+static void
+bits_to_str (char *str, const gdb_byte *bits, ULONGEST offs,
+	     ULONGEST nbits, int msb0)
+{
+  unsigned int j;
+  size_t i;
+
+  for (i = offs / 8, j = offs % 8; nbits; i++, j = 0)
+    {
+      unsigned int ch = bits[i];
+      for (; j < 8 && nbits; j++, nbits--)
+	*str++ = (ch & (msb0 ? (1 << (7 - j)) : (1 << j))) ? '1' : '0';
+    }
+}
+
+/* Check one invocation of copy_bitwise with the given parameters.  */
+
+static void
+check_copy_bitwise (const gdb_byte *dest, unsigned int dest_offset,
+		    const gdb_byte *source, unsigned int source_offset,
+		    unsigned int nbits, int msb0)
+{
+  size_t len = align_up (dest_offset + nbits, 8);
+  char *expected = (char *) alloca (len + 1);
+  char *actual = (char *) alloca (len + 1);
+  gdb_byte *buf = (gdb_byte *) alloca (len / 8);
+
+  /* Compose a '0'/'1'-string that represents the expected result of
+     copy_bitwise below:
+      Bits from [0, DEST_OFFSET) are filled from DEST.
+      Bits from [DEST_OFFSET, DEST_OFFSET + NBITS) are filled from SOURCE.
+      Bits from [DEST_OFFSET + NBITS, LEN) are filled from DEST.
+
+     E.g., with:
+      dest_offset: 4
+      nbits:       2
+      len:         8
+      dest:        00000000
+      source:      11111111
+
+     We should end up with:
+      buf:         00001100
+                   DDDDSSDD (D=dest, S=source)
+  */
+  bits_to_str (expected, dest, 0, len, msb0);
+  bits_to_str (expected + dest_offset, source, source_offset, nbits, msb0);
+
+  /* Fill BUF with data from DEST, apply copy_bitwise, and convert the
+     result to a '0'/'1'-string.  */
+  memcpy (buf, dest, len / 8);
+  copy_bitwise (buf, dest_offset, source, source_offset, nbits, msb0);
+  bits_to_str (actual, buf, 0, len, msb0);
+
+  /* Compare the resulting strings.  */
+  expected[len] = actual[len] = '\0';
+  if (strcmp (expected, actual) != 0)
+    error (_("copy_bitwise %s != %s (%u+%u -> %u)"),
+	   expected, actual, source_offset, nbits, dest_offset);
+}
+
+/* Unit test for copy_bitwise.  */
+
+static void
+copy_bitwise_tests (void)
+{
+  /* Data to be used as both source and destination buffers.  The two
+     arrays below represent the lsb0- and msb0- encoded versions of the
+     following bit string, respectively:
+       00000000 00011111 11111111 01001000 10100101 11110010
+     This pattern is chosen such that it contains:
+     - constant 0- and 1- chunks of more than a full byte;
+     - 0/1- and 1/0 transitions on all bit positions within a byte;
+     - several sufficiently asymmetric bytes.
+  */
+  static const gdb_byte data_lsb0[] = {
+    0x00, 0xf8, 0xff, 0x12, 0xa5, 0x4f
+  };
+  static const gdb_byte data_msb0[] = {
+    0x00, 0x1f, 0xff, 0x48, 0xa5, 0xf2
+  };
+
+  constexpr size_t data_nbits = 8 * sizeof (data_lsb0);
+  constexpr unsigned max_nbits = 24;
+
+  /* Try all combinations of:
+      lsb0/msb0 bit order (using the respective data array)
+       X [0, MAX_NBITS] copy bit width
+       X feasible source offsets for the given copy bit width
+       X feasible destination offsets
+  */
+  for (int msb0 = 0; msb0 < 2; msb0++)
+    {
+      const gdb_byte *data = msb0 ? data_msb0 : data_lsb0;
+
+      for (unsigned int nbits = 1; nbits <= max_nbits; nbits++)
+	{
+	  const unsigned int max_offset = data_nbits - nbits;
+
+	  for (unsigned source_offset = 0;
+	       source_offset <= max_offset;
+	       source_offset++)
+	    {
+	      for (unsigned dest_offset = 0;
+		   dest_offset <= max_offset;
+		   dest_offset++)
+		{
+		  check_copy_bitwise (data + dest_offset / 8,
+				      dest_offset % 8,
+				      data + source_offset / 8,
+				      source_offset % 8,
+				      nbits, msb0);
+		}
+	    }
+	}
+
+      /* Special cases: copy all, copy nothing.  */
+      check_copy_bitwise (data_lsb0, 0, data_msb0, 0, data_nbits, msb0);
+      check_copy_bitwise (data_msb0, 0, data_lsb0, 0, data_nbits, msb0);
+      check_copy_bitwise (data, data_nbits - 7, data, 9, 0, msb0);
+    }
+}
+
+} /* namespace selftests */
+
+#endif /* GDB_SELF_TEST */
+
 void
 _initialize_utils (void)
 {
@@ -3228,5 +3456,6 @@ _initialize_utils (void)
 
 #if GDB_SELF_TEST
   selftests::register_test ("gdb_realpath", gdb_realpath_tests);
+  selftests::register_test ("copy_bitwise", selftests::copy_bitwise_tests);
 #endif
 }
diff --git a/gdb/utils.h b/gdb/utils.h
index fa9a590..08a29af 100644
--- a/gdb/utils.h
+++ b/gdb/utils.h
@@ -519,4 +519,12 @@ extern void dump_core (void);
 
 extern char *make_hex_string (const gdb_byte *data, size_t length);
 
+/* Copy NBITS bits from SOURCE to DEST starting at the given bit
+   offsets.  Use the bit order as specified by BITS_BIG_ENDIAN.
+   Source and destination buffers must not overlap.  */
+
+extern void copy_bitwise (gdb_byte *dest, ULONGEST dest_offset,
+			  const gdb_byte *source, ULONGEST source_offset,
+			  ULONGEST nbits, int bits_big_endian);
+
 #endif /* UTILS_H */
-- 
2.1.4
Pedro Alves Nov. 14, 2018, 5:23 p.m. | #7
On 11/14/2018 05:11 PM, Joel Brobecker wrote:
> Hello,

> 

>>>> I was going to suggest that this would benefit from unit tests in

>>>> the style of dwarf2read.c:copy_bitwise's, but, actually, isn't this

>>>> exactly the same as copy_bitwise?  Can we get rid of ada-lang.c:move_bits?

>>>> (And maybe move copy_bitwise elsewhere?)

>>> I meant to say dwarf2loc.c instead of dwarf2read.c.

>> It does look exactly the same, doesn't it? I'll see if we can just

>> re-use dwarf2loc's copy_bitwise. Thanks for the suggestion!

> How about the attached? I ran it through AdaCore's testsuite on

> all the platforms we support as well as the official testsuite on

> x86_64-linux. No regression.

> 

> gdb/ChangeLog:

> 

>         * ada-lang.c (move_bits): Delete. Update all callers to use

>         copy_bitwise instead.

>         * dwarf2loc.c (copy_bitwise, bits_to_str::bits_to_str)

>         (selftests::check_copy_bitwise, selftests::copy_bitwise_tests):

>         Move from here to utils.c.

>         (_initialize_dwarf2loc): Remove call to register copy_bitwise

>         selftests.

>         * utils.h (copy_bitwise): Add declaration.

>         * utils.c (copy_bitwise, bits_to_str::bits_to_str)

>         (selftests::check_copy_bitwise, selftests::copy_bitwise_tests):

>         Moved here from dwarf2loc.c.

>         (_initialize_utils): Register copy_bitwise selftests.

> 

> Thank you!

> -- Joel

> 

> 


Great, thanks!

Nit, since the function is now public, I'd consider moving the unit
tests to under gdb/unittests/ instead, like, to a new
copy_bitwise-selftests.c file.  (I'm mildly thinking that'd be a better
filename than utils-selftest.c because the function may well
move again in the future.  Notice how gdb_realpath's unit tests
were left behind in gdb/utils.c even though gdb_realpath moved to 
common/pathstuff.c.)

If you do that, you can drop the
'#if GDB_SELF_TEST' around the tests, since files in that
directory are not compiled if unit tests are disabled.

Regardless, LGTM.

Thanks,
Pedro Alves
Joel Brobecker Nov. 14, 2018, 11:17 p.m. | #8
> > gdb/ChangeLog:

> > 

> >         * ada-lang.c (move_bits): Delete. Update all callers to use

> >         copy_bitwise instead.

> >         * dwarf2loc.c (copy_bitwise, bits_to_str::bits_to_str)

> >         (selftests::check_copy_bitwise, selftests::copy_bitwise_tests):

> >         Move from here to utils.c.

> >         (_initialize_dwarf2loc): Remove call to register copy_bitwise

> >         selftests.

> >         * utils.h (copy_bitwise): Add declaration.

> >         * utils.c (copy_bitwise, bits_to_str::bits_to_str)

> >         (selftests::check_copy_bitwise, selftests::copy_bitwise_tests):

> >         Moved here from dwarf2loc.c.

> >         (_initialize_utils): Register copy_bitwise selftests.

> > 

> > Thank you!

> > -- Joel

> > 

> > 

> 

> Great, thanks!

> 

> Nit, since the function is now public, I'd consider moving the unit

> tests to under gdb/unittests/ instead, like, to a new

> copy_bitwise-selftests.c file.  (I'm mildly thinking that'd be a better

> filename than utils-selftest.c because the function may well

> move again in the future.  Notice how gdb_realpath's unit tests

> were left behind in gdb/utils.c even though gdb_realpath moved to 

> common/pathstuff.c.)

> 

> If you do that, you can drop the

> '#if GDB_SELF_TEST' around the tests, since files in that

> directory are not compiled if unit tests are disabled.


I can do that. Since you said you're file reguardless, it's a little
easier for me to do it in two stages, so I'll go ahead and push this
one. I'll start on moving the unit tests again right after, and
will finish ASAP if it's not finished by end of today.

Thanks Pedro!
-- 
Joel

Patch

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index 1462271a71..7288d65df6 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -2682,9 +2682,12 @@  move_bits (gdb_byte *target, int targ_offset, const gdb_byte *source,
         {
           int unused_right;
 
-          accum = (accum << HOST_CHAR_BIT) + (unsigned char) *source;
-          accum_bits += HOST_CHAR_BIT;
-          source += 1;
+	  if (n > accum_bits)
+	    {
+	      accum = (accum << HOST_CHAR_BIT) + (unsigned char) *source;
+	      accum_bits += HOST_CHAR_BIT;
+	      source += 1;
+	    }
           chunk_size = HOST_CHAR_BIT - targ_offset;
           if (chunk_size > n)
             chunk_size = n;
@@ -2707,9 +2710,12 @@  move_bits (gdb_byte *target, int targ_offset, const gdb_byte *source,
 
       while (n > 0)
         {
-          accum = accum + ((unsigned char) *source << accum_bits);
-          accum_bits += HOST_CHAR_BIT;
-          source += 1;
+	  if (n > accum_bits)
+	    {
+	      accum = accum + ((unsigned char) *source << accum_bits);
+	      accum_bits += HOST_CHAR_BIT;
+	      source += 1;
+	    }
           chunk_size = HOST_CHAR_BIT - targ_offset;
           if (chunk_size > n)
             chunk_size = n;