Make arm_record_test work on big-endian machines

Message ID 20180312030943.32669-1-simon.marchi@polymtl.ca
State New
Headers show
Series
  • Make arm_record_test work on big-endian machines
Related show

Commit Message

Simon Marchi March 12, 2018, 3:09 a.m.
While running selftests on a big-endian PPC, I noticed arm_record_test
failed a test.  The reason is that the gdbarch_find_by_info call is done
with BFD_ENDIAN_UNKNOWN byte order in the info structure.  In that case,
it uses the byte order of the current default BFD, which happened to be
big-endian on that GDB build, and the gdbarch returned is a big-endian
one.  The instruction used for the 32-bits part of the test is written
in little-endian form, so GDB fails to decode the instruction properly.

Since ARM supports both endiannesses, and it should be possible to debug
using an host of both endiannesses, I changed the test to check with
gdbarches of both endiannesses.

gdb/ChangeLog:

	* arm-tdep.c (arm_record_test): Test with big and little endian
	gdbarch.
---
 gdb/arm-tdep.c | 106 ++++++++++++++++++++++++++++++---------------------------
 1 file changed, 56 insertions(+), 50 deletions(-)

-- 
2.16.2

Comments

Yao Qi March 12, 2018, 2:15 p.m. | #1
Simon Marchi <simon.marchi@polymtl.ca> writes:

> While running selftests on a big-endian PPC, I noticed arm_record_test

> failed a test.  The reason is that the gdbarch_find_by_info call is done

> with BFD_ENDIAN_UNKNOWN byte order in the info structure.  In that case,

> it uses the byte order of the current default BFD, which happened to be

> big-endian on that GDB build, and the gdbarch returned is a big-endian

> one.  The instruction used for the 32-bits part of the test is written

> in little-endian form, so GDB fails to decode the instruction

> properly.


The test instructions are endianess-neutral, because they are written as
uint16_t, and that is why you didn't have to adjust 16-bit thumb
instructions in the tests.

>

> Since ARM supports both endiannesses, and it should be possible to debug

> using an host of both endiannesses, I changed the test to check with

> gdbarches of both endiannesses.


Although ARM supports both endianesses, the big endian has two variants,
BE-32 and BE-8.  Before ARMv6, it was BE-32.  In ARMv6, BE-8 must be
supported, but support of BE-32 is implementation defined.  ARMv7 drops
the BE-32 support.  With BE-8, data is big endian and code is little
endian.  Thumb-2 instruction was introduced in armv6t2, so it is still
likely to have both big endian and little endian thumb-2 instructions
code.

As I said, the test case is written in an endianess-neutral way,

    static const uint16_t insns[] = {
      /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
      0xee1d, 0x7f70,
    };

but the arm process recording processes thumb-2 instruction as one
32-bit instruction, so that it has this,

	  /* Swap first half of 32bit thumb instruction with second half.  */
	  arm_record->arm_insn
	    = (arm_record->arm_insn >> 16) | (arm_record->arm_insn << 16);

I wonder this is the cause of the test fail, that is, we don't need to
swap them in big endian.  Since I've never run GDB tests on arm big
endian target, I am not very confident on my fix below.  It fixes the
test fail on ppc64, and also the fail with your patch to change the test
with two different endianess.

-- 
Yao (齐尧)
From 4f1aa44d6688ab2bc8885e3c0b02c49c15eb4bb7 Mon Sep 17 00:00:00 2001
From: Yao Qi <yao.qi@linaro.org>

Date: Mon, 12 Mar 2018 14:12:58 +0000
Subject: [PATCH] Fix process record on big endian

gdb:

2018-03-12  Yao Qi  <yao.qi@linaro.org>

	* arm-tdep.c (decode_insn): Swap thumb-2 instruction on little
	endian.

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index ef7e66b..153f568 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -13188,10 +13188,13 @@ decode_insn (abstract_memory_reader &reader, insn_decode_record *arm_record,
       /* As thumb does not have condition codes, we set negative.  */
       arm_record->cond = -1;
 
-      /* Swap first half of 32bit thumb instruction with second half.  */
-      arm_record->arm_insn
-	= (arm_record->arm_insn >> 16) | (arm_record->arm_insn << 16);
-
+      if (gdbarch_byte_order_for_code (arm_record->gdbarch)
+	  == BFD_ENDIAN_LITTLE)
+	{
+	  /* Swap first half of 32bit thumb instruction with second half.  */
+	  arm_record->arm_insn
+	    = (arm_record->arm_insn >> 16) | (arm_record->arm_insn << 16);
+	}
       ret = thumb2_record_decode_insn_handler (arm_record);
 
       if (ret != ARM_RECORD_SUCCESS)
Simon Marchi March 12, 2018, 3:41 p.m. | #2
On 2018-03-12 10:15, Yao Qi wrote:
> Simon Marchi <simon.marchi@polymtl.ca> writes:

>

>> While running selftests on a big-endian PPC, I noticed arm_record_test

>> failed a test.  The reason is that the gdbarch_find_by_info call is done

>> with BFD_ENDIAN_UNKNOWN byte order in the info structure.  In that case,

>> it uses the byte order of the current default BFD, which happened to be

>> big-endian on that GDB build, and the gdbarch returned is a big-endian

>> one.  The instruction used for the 32-bits part of the test is written

>> in little-endian form, so GDB fails to decode the instruction

>> properly.

>

> The test instructions are endianess-neutral, because they are written as

> uint16_t, and that is why you didn't have to adjust 16-bit thumb

> instructions in the tests.


Ok, and I think there was something confusing me with the order of the Thumb2 instructions
(maybe it's the order the bytes are written in the comment).  From what I understand now
after playing with objdump, the two 16-bit parts of a Thumb2 instructions are not swapped
on little endian, they are considered as two separate instructions (I guess it makes sense,
otherwise it would be impossible to determine whether the next instruction is 16-bit or
32-bit long).  So only the two bytes of each 16-bit chunk are swapped when switching
between big and little endian.  In memory, the bytes for that instruction should look like
this:

  little endian: 0x1d, 0xee, 0x70, 0x7f
  big endian:    0xee, 0x1d, 0x7f, 0x70

And in both cases, we want the value in arm_record->arm_insn to end up containing 0xee1d7f70.  Is that right?

>>

>> Since ARM supports both endiannesses, and it should be possible to debug

>> using an host of both endiannesses, I changed the test to check with

>> gdbarches of both endiannesses.

>

> Although ARM supports both endianesses, the big endian has two variants,

> BE-32 and BE-8.  Before ARMv6, it was BE-32.  In ARMv6, BE-8 must be

> supported, but support of BE-32 is implementation defined.  ARMv7 drops

> the BE-32 support.  With BE-8, data is big endian and code is little

> endian.  Thumb-2 instruction was introduced in armv6t2, so it is still

> likely to have both big endian and little endian thumb-2 instructions

> code.


Ok, thanks for the info.

> As I said, the test case is written in an endianess-neutral way,

>

>     static const uint16_t insns[] = {

>       /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */

>       0xee1d, 0x7f70,

>     };


If my assumption above is right, everything looks right until extract_arm_insn.
The 4-byte instruction is extracted with

  insn_record->arm_insn = (uint32_t) extract_unsigned_integer (&buf[0],
                           insn_size,
			   gdbarch_byte_order_for_code (insn_record->gdbarch));

Since the instruction is extracted as a single 4-byte value, on little endian,
extract_arm_insn leaves insn_record->arm_insn equal to 0x7f70ee1d.  On big
endian, it leaves it equal to 0xee1d7f70.  Hence the need for the word swapping
you show below.

In my opinion, it should be extract_arm_insn's responsibility to leave
insn_record->arm_insn with a consistent value regardless of the target/host
endiannesses.  The easiest way would be to read the instruction as two 16-bit
integers instead of one 32-bit one.  Or we can still do the word swap shown below
(only if the code is little endian), but at least I think it would make more sense
to do it in extract_arm_insn.

>

> but the arm process recording processes thumb-2 instruction as one

> 32-bit instruction, so that it has this,

>

> 	  /* Swap first half of 32bit thumb instruction with second half.  */

> 	  arm_record->arm_insn

> 	    = (arm_record->arm_insn >> 16) | (arm_record->arm_insn << 16);


> I wonder this is the cause of the test fail, that is, we don't need to

> swap them in big endian.  Since I've never run GDB tests on arm big

> endian target, I am not very confident on my fix below.  It fixes the

> test fail on ppc64, and also the fail with your patch to change the test

> with two different endianess.


What do you think about the patch below?  Functionally, it should be identical
to what you suggested, but I think it's clear to fix extract_arm_insn instead.


From 925b2f54e53f39b8713dccc7788b3f10fccc374c Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@ericsson.com>

Date: Mon, 12 Mar 2018 11:28:29 -0400
Subject: [PATCH] extract_arm_insn: Read 32-bit instruction as two 16-bit words

---
 gdb/arm-tdep.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index ef7e66b..e92ac13 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -13089,14 +13089,25 @@ extract_arm_insn (abstract_memory_reader& reader,
 		  insn_decode_record *insn_record, uint32_t insn_size)
 {
   gdb_byte buf[insn_size];
-
   memset (&buf[0], 0, insn_size);

+  gdb_assert (insn_size == 2 || insn_size == 4);
+
   if (!reader.read (insn_record->this_addr, buf, insn_size))
     return 1;
-  insn_record->arm_insn = (uint32_t) extract_unsigned_integer (&buf[0],
-                           insn_size,
-			   gdbarch_byte_order_for_code (insn_record->gdbarch));
+
+  bfd_endian endian = gdbarch_byte_order_for_code (insn_record->gdbarch);
+
+  insn_record->arm_insn
+    = (uint32_t) extract_unsigned_integer (&buf[0], 2, endian);
+
+  if (insn_size == 4)
+    {
+      insn_record->arm_insn <<= 16;
+      insn_record->arm_insn
+	|= (uint32_t) extract_unsigned_integer (&buf[2], 2, endian);
+    }
+
   return 0;
 }

@@ -13188,10 +13199,6 @@ decode_insn (abstract_memory_reader &reader, insn_decode_record *arm_record,
       /* As thumb does not have condition codes, we set negative.  */
       arm_record->cond = -1;

-      /* Swap first half of 32bit thumb instruction with second half.  */
-      arm_record->arm_insn
-	= (arm_record->arm_insn >> 16) | (arm_record->arm_insn << 16);
-
       ret = thumb2_record_decode_insn_handler (arm_record);

       if (ret != ARM_RECORD_SUCCESS)
-- 
2.7.4
Yao Qi March 12, 2018, 5:09 p.m. | #3
Simon Marchi <simon.marchi@ericsson.com> writes:

> What do you think about the patch below?  Functionally, it should be identical

> to what you suggested, but I think it's clear to fix extract_arm_insn instead.

>


Patch looks good to me.  Thanks.

> +

> +  insn_record->arm_insn

> +    = (uint32_t) extract_unsigned_integer (&buf[0], 2, endian);

> +

> +  if (insn_size == 4)

> +    {

> +      insn_record->arm_insn <<= 16;

> +      insn_record->arm_insn

> +	|= (uint32_t) extract_unsigned_integer (&buf[2], 2, endian);


Better to add some comments here.

-- 
Yao (齐尧)
Simon Marchi March 12, 2018, 6:53 p.m. | #4
On 2018-03-12 13:09, Yao Qi wrote:
> Simon Marchi <simon.marchi@ericsson.com> writes:

> 

>> What do you think about the patch below?  Functionally, it should be 

>> identical

>> to what you suggested, but I think it's clear to fix extract_arm_insn 

>> instead.

>> 

> 

> Patch looks good to me.  Thanks.

> 

>> +

>> +  insn_record->arm_insn

>> +    = (uint32_t) extract_unsigned_integer (&buf[0], 2, endian);

>> +

>> +  if (insn_size == 4)

>> +    {

>> +      insn_record->arm_insn <<= 16;

>> +      insn_record->arm_insn

>> +	|= (uint32_t) extract_unsigned_integer (&buf[2], 2, endian);

> 

> Better to add some comments here.


Oh, actually, does my patch get it wrong for ARM instructions?  For 
them, it seems like in little endian, the four bytes of an instruction 
are reversed (as if it was a 4-byte value).  So that would explain why 
the word swap would only occur under a if (THUMB2_RECORD == 
record_type).  There is no test with ARM instructions in 
arm_record_test... should there be one?

Simon

Patch

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index ef7e66b36a..840b82b57c 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -13247,70 +13247,76 @@  private:
 static void
 arm_record_test (void)
 {
-  struct gdbarch_info info;
-  gdbarch_info_init (&info);
-  info.bfd_arch_info = bfd_scan_arch ("arm");
+  std::array<bfd_endian, 2> endiannesses{{ BFD_ENDIAN_BIG, BFD_ENDIAN_LITTLE }};
 
-  struct gdbarch *gdbarch = gdbarch_find_by_info (info);
+  for (bfd_endian endian : endiannesses)
+    {
+      struct gdbarch_info info;
+      gdbarch_info_init (&info);
+      info.bfd_arch_info = bfd_scan_arch ("arm");
+      info.byte_order = endian;
+      info.byte_order_for_code = endian;
 
-  SELF_CHECK (gdbarch != NULL);
+      struct gdbarch *gdbarch = gdbarch_find_by_info (info);
 
-  /* 16-bit Thumb instructions.  */
-  {
-    insn_decode_record arm_record;
+      SELF_CHECK (gdbarch != NULL);
 
-    memset (&arm_record, 0, sizeof (insn_decode_record));
-    arm_record.gdbarch = gdbarch;
+      /* 16-bit Thumb instructions.  */
+      {
+	insn_decode_record arm_record;
 
-    static const uint16_t insns[] = {
-      /* db b2	uxtb	r3, r3 */
-      0xb2db,
-      /* cd 58	ldr	r5, [r1, r3] */
-      0x58cd,
-    };
+	memset (&arm_record, 0, sizeof (insn_decode_record));
+	arm_record.gdbarch = gdbarch;
 
-    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
-    instruction_reader_thumb reader (endian, insns);
-    int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
-			   THUMB_INSN_SIZE_BYTES);
+	static const uint16_t insns[] = {
+	  /* db b2	uxtb	r3, r3 */
+	  0xb2db,
+	  /* cd 58	ldr	r5, [r1, r3] */
+	  0x58cd,
+	};
 
-    SELF_CHECK (ret == 0);
-    SELF_CHECK (arm_record.mem_rec_count == 0);
-    SELF_CHECK (arm_record.reg_rec_count == 1);
-    SELF_CHECK (arm_record.arm_regs[0] == 3);
+	instruction_reader_thumb reader (endian, insns);
+	int ret = decode_insn (reader, &arm_record, THUMB_RECORD,
+			       THUMB_INSN_SIZE_BYTES);
 
-    arm_record.this_addr += 2;
-    ret = decode_insn (reader, &arm_record, THUMB_RECORD,
-		       THUMB_INSN_SIZE_BYTES);
+	SELF_CHECK (ret == 0);
+	SELF_CHECK (arm_record.mem_rec_count == 0);
+	SELF_CHECK (arm_record.reg_rec_count == 1);
+	SELF_CHECK (arm_record.arm_regs[0] == 3);
 
-    SELF_CHECK (ret == 0);
-    SELF_CHECK (arm_record.mem_rec_count == 0);
-    SELF_CHECK (arm_record.reg_rec_count == 1);
-    SELF_CHECK (arm_record.arm_regs[0] == 5);
-  }
+	arm_record.this_addr += 2;
+	ret = decode_insn (reader, &arm_record, THUMB_RECORD,
+			   THUMB_INSN_SIZE_BYTES);
 
-  /* 32-bit Thumb-2 instructions.  */
-  {
-    insn_decode_record arm_record;
+	SELF_CHECK (ret == 0);
+	SELF_CHECK (arm_record.mem_rec_count == 0);
+	SELF_CHECK (arm_record.reg_rec_count == 1);
+	SELF_CHECK (arm_record.arm_regs[0] == 5);
+      }
+
+      /* 32-bit Thumb-2 instructions.  */
+      {
+	insn_decode_record arm_record;
 
-    memset (&arm_record, 0, sizeof (insn_decode_record));
-    arm_record.gdbarch = gdbarch;
+	memset (&arm_record, 0, sizeof (insn_decode_record));
+	arm_record.gdbarch = gdbarch;
 
-    static const uint16_t insns[] = {
-      /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
-      0xee1d, 0x7f70,
-    };
+	/* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
+	static const uint16_t insns[2][2] = {
+	  { 0x7f70, 0xee1d }, /* big */
+	  { 0xee1d, 0x7f70 }, /* little */
+	};
 
-    enum bfd_endian endian = gdbarch_byte_order_for_code (arm_record.gdbarch);
-    instruction_reader_thumb reader (endian, insns);
-    int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
-			   THUMB2_INSN_SIZE_BYTES);
+	instruction_reader_thumb reader (endian, insns[endian]);
+	int ret = decode_insn (reader, &arm_record, THUMB2_RECORD,
+			       THUMB2_INSN_SIZE_BYTES);
 
-    SELF_CHECK (ret == 0);
-    SELF_CHECK (arm_record.mem_rec_count == 0);
-    SELF_CHECK (arm_record.reg_rec_count == 1);
-    SELF_CHECK (arm_record.arm_regs[0] == 7);
-  }
+	SELF_CHECK (ret == 0);
+	SELF_CHECK (arm_record.mem_rec_count == 0);
+	SELF_CHECK (arm_record.reg_rec_count == 1);
+	SELF_CHECK (arm_record.arm_regs[0] == 7);
+      }
+    }
 }
 } // namespace selftests
 #endif /* GDB_SELF_TEST */