ping: [patch] aarch64: PR 19806: watchpoints: false negatives + PR 20207 contiguous ones

Message ID 20180321190316.GA32598@host1.jankratochvil.net
State New
Headers show
Series
  • ping: [patch] aarch64: PR 19806: watchpoints: false negatives + PR 20207 contiguous ones
Related show

Commit Message

Jan Kratochvil March 21, 2018, 7:03 p.m.
https://sourceware.org/ml/gdb-patches/2017-11/msg00090.html
Message-ID: <20171103220437.GA13979@host1.jankratochvil.net>

On Fri, 03 Nov 2017 23:04:37 +0100, Jan Kratochvil wrote:
On Mon, 30 Oct 2017 12:27:15 +0100, Yao Qi wrote:
> Jan Kratochvil <jan.kratochvil@redhat.com> writes:

> > Therefore on old kernels (see below) it will now rather report watchpoint

> > which was hit only next to it (for rwatch/awatch).  But it will never miss

> > a watchpoint.

> 

> I am fine with the change.  Does your case watchpoint-unaligned.c cover

> this?  We need to have a case that show GDB misses a watchpoint (false

> negative), and with your patch applied, it becomes false positive.

> IIUC, all these change happen on the old kernel.


It is handled by the new testcase:

                    # We do not know if we run on a fixed kernel or not.
                    # Report XFAIL only in the FAIL case.
                    if {$expect_hit == 0 && $rdstart < $wpendaligned} {
                        setup_xfail external/20207 "aarch64-*-*"
                    }
                    fail $test

Sure an improvement of the testcase would be to make GDB display its internal
'have_any_contiguous' variable by some maintenance/info/debug command so that
it could display even XPASS for example.  But I do not think the testcase
could be mistaken by that.


> We need a NEWS entry, since the change is user-visible.


Done.


> > There may be a regresion that it now less merges watchpoints so that with

> > multiple overlapping watchpoints it may run out of the 4 hardware watchpoint

> > registers.  But as discussed in the original thread GDB needs some generic

> 

> Is it a regression in your mind or a regression in some existing test

> case?


In my mind.

> If we don't have case, can you write one case to show that, we'll

> run out of watchpoint registers with this fix, but don't run out of them

> before.  We can kfail the failure, which can be addressed by watchpoint

> merging stuff as you mentioned later.


Extended the attached *.exp testfile and filed for it tracker:
	aarch64: watchpoints are not merged regression
	https://sourceware.org/bugzilla/show_bug.cgi?id=22389

Therefore with current FSF GDB it does KPASS.


> Can you elaborate?  watchpoint is quite target specific, and each

> cpu/arch has completely different watchpoint registers.  What do you

> expect watchpoint merging framework do? in a target independent way, I suppose?


Some limitations are specific to arches but most of the logic would be the
same.  There could be some generic merging engine with some arch-specific
parameters.  Such as adjacent watchpoints (watch, not rwatch/awatch) can be
merged into a single larger watchpoint.  Or one could even place there one
bigger watchpoints with possible false positives etc.

But I am not going to implement that.  I consider it only as an incremental
improvement (although it would prevent the regression PR 22389).

I (also) find the code messy but I would have to rewrite it first as there are
member variables instead of accessors, it is not C++, there is missing the
arch-agnostic watchpoint splitting+merging etc.  I coded the patch driven by
the testcase given it tests all the combinations (it does not test splitting of
the watchpoints which I believe is simple enough but maybe that could also be
tested).


> >  aarch64_watchpoint_length (unsigned int ctrl)

> 

> The comments to this function should be updated too.


Done.


> Could you split the patch, that is, patch 1 can be about changing

> aarch64_watchpoint_length and adding aarch64_watchpoint_offset?  That is

> helpful to me to understand the patch.


Sorry I was looking at it for a few evenings and I do not really understand how
you wish the split to be done.  Should it just add the
aarch64_watchpoint_offset function still not supporting the new kernel feature?
Then what it will do there?  It will just always return 0.
If it already should support the new kernels then all the patch needs to be there.
Maybe except aarch64_downgrade_regs but that is a separate function anyway.



> > @@ -148,6 +148,7 @@ struct aarch64_debug_reg_state

> >  

> >    /* hardware watchpoint */

> >    CORE_ADDR dr_addr_wp[AARCH64_HWP_MAX_NUM];

> > +  CORE_ADDR dr_addr_orig_wp[AARCH64_HWP_MAX_NUM];

> 

> Could you add comments on how these two fields are used?


Done.
  // Address aligned down to AARCH64_HWP_ALIGNMENT.
  CORE_ADDR dr_addr_wp[AARCH64_HWP_MAX_NUM];
  // Address as entered by user without any alignment.
  CORE_ADDR dr_addr_orig_wp[AARCH64_HWP_MAX_NUM];


> > +if {![is_aarch64_target] && ![istarget "x86_64-*-*"] && ![istarget i?86-*]} {

> > +    verbose "Skipping ${gdb_test_file_name}."

> > +    return

> > +}

> 

> Any reason to only run this test only on these targets?  Better to run

> this test on every target if possible.


OK.


Thanks,
Jan

gdb/ChangeLog
2017-03-27  Jan Kratochvil  <jan.kratochvil@redhat.com>

	PR breakpoints/19806 and support for PR external/20207.
	* NEWS (Changes since GDB 8.0): Mention unaligned hardware watchpoints.
	* aarch64-linux-nat.c (aarch64_linux_stopped_data_address): Fix missed
	watchpoints and PR external/20207 watchpoints.
	* gdbserver/linux-aarch64-low.c (aarch64_stopped_data_address):
	Likewise.
	* nat/aarch64-linux-hw-point.c (have_any_contiguous): New.
	(aarch64_watchpoint_offset): New.
	(aarch64_watchpoint_length): Support PR external/20207 watchpoints.
	(aarch64_point_encode_ctrl_reg): New parameter offset, new asserts.
	(aarch64_point_is_aligned): Support PR external/20207 watchpoints.
	(aarch64_align_watchpoint): New parameters aligned_offset_p and
	next_addr_orig_p.  Support PR external/20207 watchpoints.
	(aarch64_downgrade_regs): New.
	(aarch64_dr_state_insert_one_point): New parameters offset and
	addr_orig.
	(aarch64_dr_state_remove_one_point): Likewise.
	(aarch64_handle_breakpoint): Update caller.
	(aarch64_handle_aligned_watchpoint): Likewise.
	(aarch64_handle_unaligned_watchpoint): Support addr_orig and
	aligned_offset.
	(aarch64_linux_set_debug_regs): Remove const from state.  Call
	aarch64_downgrade_regs.
	(aarch64_show_debug_reg_state): Print also dr_addr_orig_wp.
	* nat/aarch64-linux-hw-point.h (DR_CONTROL_LENGTH): Rename to ...
	(DR_CONTROL_MASK): ... here.
	(struct aarch64_debug_reg_state): New field dr_addr_orig_wp.
	(unsigned int aarch64_watchpoint_offset): New prototype.
	(aarch64_linux_set_debug_regs): Remove const from state.

gdb/testsuite/ChangeLog
2017-03-27  Jan Kratochvil  <jan.kratochvil@redhat.com>

	PR breakpoints/19806 and support for PR external/20207.
	* gdb.base/watchpoint-unaligned.c: New file.
	* gdb.base/watchpoint-unaligned.exp: New file.

Comments

Pedro Alves April 20, 2018, 2:49 p.m. | #1
Hi,

On 03/21/2018 07:03 PM, Jan Kratochvil wrote:

> gdb/ChangeLog

> 2017-03-27  Jan Kratochvil  <jan.kratochvil@redhat.com>

> 

> 	PR breakpoints/19806 and support for PR external/20207.

> 	* NEWS (Changes since GDB 8.0): Mention unaligned hardware watchpoints.

> 	* aarch64-linux-nat.c (aarch64_linux_stopped_data_address): Fix missed

> 	watchpoints and PR external/20207 watchpoints.

> 	* gdbserver/linux-aarch64-low.c (aarch64_stopped_data_address):

> 	Likewise.

> 	* nat/aarch64-linux-hw-point.c (have_any_contiguous): New.

> 	(aarch64_watchpoint_offset): New.

> 	(aarch64_watchpoint_length): Support PR external/20207 watchpoints.

> 	(aarch64_point_encode_ctrl_reg): New parameter offset, new asserts.

> 	(aarch64_point_is_aligned): Support PR external/20207 watchpoints.

> 	(aarch64_align_watchpoint): New parameters aligned_offset_p and

> 	next_addr_orig_p.  Support PR external/20207 watchpoints.

> 	(aarch64_downgrade_regs): New.

> 	(aarch64_dr_state_insert_one_point): New parameters offset and

> 	addr_orig.

> 	(aarch64_dr_state_remove_one_point): Likewise.

> 	(aarch64_handle_breakpoint): Update caller.

> 	(aarch64_handle_aligned_watchpoint): Likewise.

> 	(aarch64_handle_unaligned_watchpoint): Support addr_orig and

> 	aligned_offset.

> 	(aarch64_linux_set_debug_regs): Remove const from state.  Call

> 	aarch64_downgrade_regs.

> 	(aarch64_show_debug_reg_state): Print also dr_addr_orig_wp.

> 	* nat/aarch64-linux-hw-point.h (DR_CONTROL_LENGTH): Rename to ...

> 	(DR_CONTROL_MASK): ... here.

> 	(struct aarch64_debug_reg_state): New field dr_addr_orig_wp.

> 	(unsigned int aarch64_watchpoint_offset): New prototype.

> 	(aarch64_linux_set_debug_regs): Remove const from state.

> 

> gdb/testsuite/ChangeLog

> 2017-03-27  Jan Kratochvil  <jan.kratochvil@redhat.com>

> 

> 	PR breakpoints/19806 and support for PR external/20207.

> 	* gdb.base/watchpoint-unaligned.c: New file.

> 	* gdb.base/watchpoint-unaligned.exp: New file.


So I spent a couple days this week working through this.

The patch looks largely good to me, though as I was going
through this, I was having trouble understanding some of the
details, and kept wishing for clearer comments, so I tried to
clarify and copy/edit the comments as I went.  I also noticed a
few places more where we should update the comments to newer
reality, like the comments on top of aarch64_align_watchpoint.

Other copy/edits:

- gdbserver has its own ChangeLog file
- move align_down (and align_up) to common/, so we
  can use it in gdbserver too, and use it to make
  code a little bit clearer.
- rename the have_any_contiguous global as it wasn't
  immediately clear what it meant when I started
  reading this.  Use copy-init while at it.
- // -> /**/ comments.
- Misc formatting.
- Expanded/clarified the git commit log entry.

Here's the patch that I intend to squash with yours before
merging (keeping you as --author).

Could you double-check to see if I missed something?

I have one question though.  In aarch64_linux_stopped_data_address,
where we match ADDR_TRAP from siginfo to a watchpoint, we check
whether that address is within the aligned watch regions:

      if (state->dr_ref_count_wp[i]
	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
	  && addr_trap >= addr_watch_aligned
	  && addr_trap < addr_watch + len)
	{

However, by reading
<http://lkml.iu.edu/hypermail/linux/kernel/1611.1/05226.html>:

~~~~~~~~~~~~
Previously, when the hardware reported a watchpoint hit on an address
that did not match our watchpoint (this happens in case of instructions
which access large chunks of memory such as "stp") the process would
enter a loop where we would be continually resuming it (because we did
not recognise that watchpoint hit) and it would keep hitting the
watchpoint again and again. The tracing process would never get
notified of the watchpoint hit.
~~~~~~~~~~~~

... I'm left with the impression that ADDR_TRAP could be even
lower than addr_watch_aligned, in which case we'll still miss
watchpoints.  I wondering whether GDB should be using a similar
trick as that kernel patch does.  I may well be missing something,
though, as I only tried the patch on an older kernel.  WDYT?

From f265a8e13984aaa40a5ed59913ed14923bc67d9d Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>

Date: Fri, 20 Apr 2018 14:55:03 +0100
Subject: [PATCH] Aarch64: Fix watchpoints set on non-8-byte-aligned addresses
 are always missed (PR 19806)

Ref: https://sourceware.org/bugzilla/show_bug.cgi?id=19806

As described in detail on the bug report, on Aarch64, some unaligned
watchpoints are currently missed.  For example, with:

 union
 {
   char buf[4];
   unsigned int ul;
 } u;

 int
 main ()
 {
   u.ul = 0xffffffff;
   return 0;
 }

on x86-64, we get:

 (gdb) watch u.buf[1]
 Hardware watchpoint 1: u.buf[1]
 (gdb) c
 Continuing.
 Hardware watchpoint 1: u.buf[1]

 Old value = 0 '\000'
 New value = -1 '\377'
 main () at watch.c:11
 11              return 0;
 (gdb)

While on Aarch64, GDB misses the watchpoint hit.

Actually, the kernel reports the hit to gdb, and linux-nat.c forwards
the event to infrun.c.  However, it doesn't work as expected because
the Aarch64 backend code aligns the inserted watchpoint's address to
an 8-byte boundary, and then when the watchpoint triggers, the backend
reports that aligned address as the watchpoint stop address.  Since
that address falls out of any memory range covered by watchpoints that
the core of GDB knows about, the watchpoint hit is not reported to the
user (it's considered a spurious/moribund watchpoint trap, and
ignored).

This patch fixes it, by trying to match the kernel-reported trapped
address (ADDR_TRAP) with a watched region, so that the core of gdb
figures out which watchpoint triggered.

Additionally this patch makes GDB support any watchpoint masks as
described here:

 kernel RFE: aarch64: ptrace: BAS: Support any contiguous range
 https://sourceware.org/bugzilla/show_bug.cgi?id=20207

The above is already fixed in current Linux kernels.

The patch trades missing watchpoints (false negatives) for the
occasional rwatch/awatch false positive on unfixed kernels.  The
latter can happen if you have watchpoints set in near addresses; gdb
may report the wrong watchpoint being hit.

Note the change causes the aarch64 backend to merge fewer watchpoints,
since it now only considers the requested ranges for merging instead
of the enlarged/aligned ranges.  I.e., with multiple overlapping
watchpoints, gdb may run out of the 4 hardware watchpoint registers
earlier than before.  While we could make gdb considered the enlarged
ranges when running on an older kernel, I do not think it's worth the
bother, since older kernels will eventually be phased out.

Tested on RHEL-7.{3,4} for no regressions on:
	kernel-4.10.0-6.el7.aarch64 (contiguous watchpoints supported)
	kernel-4.5.0-15.el7.aarch64 (contiguous watchpoints unsupported)

gdb/ChangeLog:
yyyy-mm-dd  Jan Kratochvil  <jan.kratochvil@redhat.com>

	PR breakpoints/19806 and support for PR external/20207.

	* NEWS: Mention Aarch64 watchpoint improvements.

	* aarch64-linux-nat.c (aarch64_linux_stopped_data_address): Fix missed
	watchpoints and PR external/20207 watchpoints.
	* nat/aarch64-linux-hw-point.c
	(kernel_supports_any_contiguous_range): New.
	(aarch64_watchpoint_offset): New.
	(aarch64_watchpoint_length): Support PR external/20207 watchpoints.
	(aarch64_point_encode_ctrl_reg): New parameter offset, new asserts.
	(aarch64_point_is_aligned): Support PR external/20207 watchpoints.
	(aarch64_align_watchpoint): New parameters aligned_offset_p and
	next_addr_orig_p.  Support PR external/20207 watchpoints.
	(aarch64_downgrade_regs): New.
	(aarch64_dr_state_insert_one_point): New parameters offset and
	addr_orig.
	(aarch64_dr_state_remove_one_point): Likewise.
	(aarch64_handle_breakpoint): Update caller.
	(aarch64_handle_aligned_watchpoint): Likewise.
	(aarch64_handle_unaligned_watchpoint): Support addr_orig and
	aligned_offset.
	(aarch64_linux_set_debug_regs): Remove const from state.  Call
	aarch64_downgrade_regs.
	(aarch64_show_debug_reg_state): Print also dr_addr_orig_wp.
	* nat/aarch64-linux-hw-point.h (DR_CONTROL_LENGTH): Rename to ...
	(DR_CONTROL_MASK): ... this.
	(struct aarch64_debug_reg_state): New field dr_addr_orig_wp.
	(unsigned int aarch64_watchpoint_offset): New prototype.
	(aarch64_linux_set_debug_regs): Remove const from state.
	* utils.c (align_up, align_down): Move to ...
	* common/common-utils.c (align_up, align_down): ... here.
	* utils.h (align_up, align_down): Move to ...
	* common/common-utils.h (align_up, align_down): ... here.

gdb/gdbserver/ChangeLog:
yyyy-mm-dd  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* linux-aarch64-low.c (aarch64_stopped_data_address):
	Likewise.

gdb/testsuite/ChangeLog:
yyyy-mm-dd  Jan Kratochvil  <jan.kratochvil@redhat.com>

	PR breakpoints/19806 and support for PR external/20207.
	* gdb.base/watchpoint-unaligned.c: New file.
	* gdb.base/watchpoint-unaligned.exp: New file.
---
 gdb/NEWS                                        |  15 ++-
 gdb/aarch64-linux-nat.c                         |  33 +++---
 gdb/common/common-utils.c                       |  20 ++++
 gdb/common/common-utils.h                       |  32 ++++++
 gdb/gdbserver/linux-aarch64-low.c               |  33 +++---
 gdb/nat/aarch64-linux-hw-point.c                | 129 +++++++++++++-----------
 gdb/nat/aarch64-linux-hw-point.h                |   4 +-
 gdb/testsuite/gdb.base/watchpoint-unaligned.c   |   4 +-
 gdb/testsuite/gdb.base/watchpoint-unaligned.exp |  41 ++++----
 gdb/utils.c                                     |  16 ---
 gdb/utils.h                                     |  32 ------
 11 files changed, 198 insertions(+), 161 deletions(-)

diff --git a/gdb/NEWS b/gdb/NEWS
index 25c404bfc37..38043c4ff2e 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -35,6 +35,16 @@ SH-5/SH64 ELF			sh64-*-elf*, SH-5/SH64 support in sh*
 SH-5/SH64 running GNU/Linux	SH-5/SH64 support in sh*-*-linux*
 SH-5/SH64 running OpenBSD 	SH-5/SH64 support in sh*-*-openbsd*
 
+* Aarch64/Linux hardware watchpoints improvements
+
+  Hardware watchpoints on unaligned addresses are now properly
+  supported when running Linux kernel 4.10 or higher: read and access
+  watchpoints are no longer spuriously missed, and all watchpoints
+  lengths between 1 and 8 bytes are supported.  On older kernels,
+  watchpoints set on unaligned addresses are no longer missed, with
+  the tradeoff that there is a possibility of false hits being
+  reported.
+
 *** Changes in GDB 8.1
 
 * GDB now supports dynamically creating arbitrary register groups specified
@@ -148,11 +158,6 @@ SH-5/SH64 running OpenBSD 	SH-5/SH64 support in sh*-*-openbsd*
 
   Tab completion was adjusted accordingly as well.
 
-* Unaligned hardware watchpoints on aarch64-linux are now properly supported
-  with Linux kernel 4.10 and higher.  On older kernels some unaligned
-  watchpoints are no longer missed - but other unaligned watchpoints may make
-  false hits now.
-
 * Python Scripting
 
   ** New events gdb.new_inferior, gdb.inferior_deleted, and
diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index b69dfb2fce7..421d044e108 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -735,13 +735,12 @@ aarch64_linux_stopped_data_address (struct target_ops *target,
   state = aarch64_get_debug_reg_state (ptid_get_pid (inferior_ptid));
   for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
     {
-      const unsigned int offset = aarch64_watchpoint_offset
-							 (state->dr_ctrl_wp[i]);
+      const unsigned int offset
+	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
       const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
       const CORE_ADDR addr_trap = (CORE_ADDR) siginfo.si_addr;
       const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
-      const CORE_ADDR addr_watch_aligned = (state->dr_addr_wp[i]
-					    & -(CORE_ADDR) 8);
+      const CORE_ADDR addr_watch_aligned = align_down (state->dr_addr_wp[i], 8);
       const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
 
       if (state->dr_ref_count_wp[i]
@@ -749,14 +748,24 @@ aarch64_linux_stopped_data_address (struct target_ops *target,
 	  && addr_trap >= addr_watch_aligned
 	  && addr_trap < addr_watch + len)
 	{
-	  /* ADDR_TRAP reports first address of the range touched by CPU.
-	     ADDR_TRAP may be before the ADDR_WATCH..ADDR_WATCH+LEN range
-	     for larger CPU access hitting the watchpoint by its tail part.
-	     ADDR_TRAP must be in the ADDR_WATCH_ALIGNED..ADDR_WATCH+LEN range.
-	     Never report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
-	     range (not matching any known watchpoint range).
-	     ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false positive due to
-	     PR external/20207 buggy kernels.  */
+	  /* ADDR_TRAP reports the first address of the memory range
+	     accessed by the CPU, regardless of what was the memory
+	     range watched.  Thus, a large CPU access that straddles
+	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
+	     ADDR_TRAP that is lower than the
+	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
+
+	     addr: |   4   |   5   |   6   |   7   |   8   |
+				   |---- range watched ----|
+		   |----------- range accessed ------------|
+
+	     In this case, ADDR_TRAP will be 4.
+
+	     To match a watchpoint known to GDB core, we must never
+	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
+	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
+	     positive on kernels older than 4.10.  See PR
+	     external/20207.  */
 	  *addr_p = addr_orig;
 	  return 1;
 	}
diff --git a/gdb/common/common-utils.c b/gdb/common/common-utils.c
index 80de826ba78..8d839d10fa8 100644
--- a/gdb/common/common-utils.c
+++ b/gdb/common/common-utils.c
@@ -440,3 +440,23 @@ is_regular_file (const char *name, int *errno_ptr)
     *errno_ptr = EINVAL;
   return false;
 }
+
+/* See common/common-utils.h.  */
+
+ULONGEST
+align_up (ULONGEST v, int n)
+{
+  /* Check that N is really a power of two.  */
+  gdb_assert (n && (n & (n-1)) == 0);
+  return (v + n - 1) & -n;
+}
+
+/* See common/common-utils.h.  */
+
+ULONGEST
+align_down (ULONGEST v, int n)
+{
+  /* Check that N is really a power of two.  */
+  gdb_assert (n && (n & (n-1)) == 0);
+  return (v & -n);
+}
diff --git a/gdb/common/common-utils.h b/gdb/common/common-utils.h
index 5408c354693..7bc6e90f05c 100644
--- a/gdb/common/common-utils.h
+++ b/gdb/common/common-utils.h
@@ -151,4 +151,36 @@ in_inclusive_range (T value, T low, T high)
    we're expecting a regular file.  */
 extern bool is_regular_file (const char *name, int *errno_ptr);
 
+/* Ensure that V is aligned to an N byte boundary (B's assumed to be a
+   power of 2).  Round up/down when necessary.  Examples of correct
+   use include:
+
+    addr = align_up (addr, 8); -- VALUE needs 8 byte alignment
+    write_memory (addr, value, len);
+    addr += len;
+
+   and:
+
+    sp = align_down (sp - len, 16); -- Keep SP 16 byte aligned
+    write_memory (sp, value, len);
+
+   Note that uses such as:
+
+    write_memory (addr, value, len);
+    addr += align_up (len, 8);
+
+   and:
+
+    sp -= align_up (len, 8);
+    write_memory (sp, value, len);
+
+   are typically not correct as they don't ensure that the address (SP
+   or ADDR) is correctly aligned (relying on previous alignment to
+   keep things right).  This is also why the methods are called
+   "align_..." instead of "round_..." as the latter reads better with
+   this incorrect coding style.  */
+
+extern ULONGEST align_up (ULONGEST v, int n);
+extern ULONGEST align_down (ULONGEST v, int n);
+
 #endif
diff --git a/gdb/gdbserver/linux-aarch64-low.c b/gdb/gdbserver/linux-aarch64-low.c
index 133572ef0c5..7ea24c23634 100644
--- a/gdb/gdbserver/linux-aarch64-low.c
+++ b/gdb/gdbserver/linux-aarch64-low.c
@@ -360,13 +360,12 @@ aarch64_stopped_data_address (void)
   state = aarch64_get_debug_reg_state (pid_of (current_thread));
   for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
     {
-      const unsigned int offset = aarch64_watchpoint_offset
-							 (state->dr_ctrl_wp[i]);
+      const unsigned int offset
+	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
       const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
       const CORE_ADDR addr_trap = (CORE_ADDR) siginfo.si_addr;
       const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
-      const CORE_ADDR addr_watch_aligned = (state->dr_addr_wp[i]
-					    & -(CORE_ADDR) 8);
+      const CORE_ADDR addr_watch_aligned = align_down (state->dr_addr_wp[i], 8);
       const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
 
       if (state->dr_ref_count_wp[i]
@@ -374,14 +373,24 @@ aarch64_stopped_data_address (void)
 	  && addr_trap >= addr_watch_aligned
 	  && addr_trap < addr_watch + len)
 	{
-	  /* ADDR_TRAP reports first address of the range touched by CPU.
-	     ADDR_TRAP may be before the ADDR_WATCH..ADDR_WATCH+LEN range
-	     for larger CPU access hitting the watchpoint by its tail part.
-	     ADDR_TRAP must be in the ADDR_WATCH_ALIGNED..ADDR_WATCH+LEN range.
-	     Never report RETVAL outside of any ADDR_WATCH..ADDR_WATCH+LEN
-	     range (not matching any known watchpoint range).
-	     ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false positive due to
-	     PR external/20207 buggy kernels.  */
+	  /* ADDR_TRAP reports the first address of the memory range
+	     accessed by the CPU, regardless of what was the memory
+	     range watched.  Thus, a large CPU access that straddles
+	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
+	     ADDR_TRAP that is lower than the
+	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
+
+	     addr: |   4   |   5   |   6   |   7   |   8   |
+				   |---- range watched ----|
+		   |----------- range accessed ------------|
+
+	     In this case, ADDR_TRAP will be 4.
+
+	     To match a watchpoint known to GDB core, we must never
+	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
+	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
+	     positive on kernels older than 4.10.  See PR
+	     external/20207.  */
 	  return addr_orig;
 	}
     }
diff --git a/gdb/nat/aarch64-linux-hw-point.c b/gdb/nat/aarch64-linux-hw-point.c
index ef0022ac0e1..a3931ea6a94 100644
--- a/gdb/nat/aarch64-linux-hw-point.c
+++ b/gdb/nat/aarch64-linux-hw-point.c
@@ -34,10 +34,13 @@
 int aarch64_num_bp_regs;
 int aarch64_num_wp_regs;
 
-/* TRUE means this kernel has fixed PR external/20207.
-   Fixed kernel supports any contiguous range of bits in 8-bit byte
-   DR_CONTROL_MASK.  Buggy kernel supports only 0x01, 0x03, 0x0f and 0xff.  */
-static bool have_any_contiguous (true);
+/* True if this kernel does not have the bug described by PR
+   external/20207 (Linux >= 4.10).  A fixed kernel supports any
+   contiguous range of bits in 8-bit byte DR_CONTROL_MASK.  A buggy
+   kernel supports only 0x01, 0x03, 0x0f and 0xff.  We start by
+   assuming the bug is fixed, and then detect the bug at
+   PTRACE_SETREGSET time.  */
+static bool kernel_supports_any_contiguous_range = true;
 
 /* Return starting byte 0..7 incl. of a watchpoint encoded by CTRL.  */
 
@@ -47,7 +50,7 @@ aarch64_watchpoint_offset (unsigned int ctrl)
   uint8_t mask = DR_CONTROL_MASK (ctrl);
   unsigned retval;
 
-  // Shift out bottom zeroes.
+  /* Shift out bottom zeros.  */
   for (retval = 0; mask && (mask & 1) == 0; ++retval)
     mask >>= 1;
 
@@ -56,8 +59,8 @@ aarch64_watchpoint_offset (unsigned int ctrl)
 
 /* Utility function that returns the length in bytes of a watchpoint
    according to the content of a hardware debug control register CTRL.
-   Any contiguous range of bytes in CTRL is supported.  Returned value
-   can be between 0..8 incl.  */
+   Any contiguous range of bytes in CTRL is supported.  The returned
+   value can be between 0..8 (inclusive).  */
 
 unsigned int
 aarch64_watchpoint_length (unsigned int ctrl)
@@ -65,16 +68,16 @@ aarch64_watchpoint_length (unsigned int ctrl)
   uint8_t mask = DR_CONTROL_MASK (ctrl);
   unsigned retval;
 
-  // Shift out bottom zeroes.
+  /* Shift out bottom zeros.  */
   mask >>= aarch64_watchpoint_offset (ctrl);
 
-  // Count bottom ones;
+  /* Count bottom ones.  */
   for (retval = 0; (mask & 1) != 0; ++retval)
     mask >>= 1;
 
-  if (mask)
+  if (mask != 0)
     error (_("Unexpected hardware watchpoint length register value 0x%x"),
-      DR_CONTROL_MASK (ctrl));
+	   DR_CONTROL_MASK (ctrl));
 
   return retval;
 }
@@ -88,7 +91,7 @@ aarch64_point_encode_ctrl_reg (enum target_hw_bp_type type, int offset, int len)
 {
   unsigned int ctrl, ttype;
 
-  gdb_assert (offset == 0 || have_any_contiguous);
+  gdb_assert (offset == 0 || kernel_supports_any_contiguous_range);
   gdb_assert (offset + len <= AARCH64_HWP_MAX_LEN_PER_REG);
 
   /* type */
@@ -157,51 +160,56 @@ aarch64_point_is_aligned (int is_watchpoint, CORE_ADDR addr, int len)
   if (addr & (alignment - 1))
     return 0;
 
-  if ((!have_any_contiguous && len != 8 && len != 4 && len != 2 && len != 1)
-      || (have_any_contiguous && (len < 1 || len > 8)))
+  if ((!kernel_supports_any_contiguous_range
+       && len != 8 && len != 4 && len != 2 && len != 1)
+      || (kernel_supports_any_contiguous_range
+	  && (len < 1 || len > 8)))
     return 0;
 
   return 1;
 }
 
 /* Given the (potentially unaligned) watchpoint address in ADDR and
-   length in LEN, return the aligned address and aligned length in
-   *ALIGNED_ADDR_P and *ALIGNED_LEN_P, respectively.  The returned
-   aligned address and length will be valid values to write to the
-   hardware watchpoint value and control registers.
+   length in LEN, return the aligned address, offset from that base
+   address, and aligned length in *ALIGNED_ADDR_P, *ALIGNED_OFFSET_P
+   and *ALIGNED_LEN_P, respectively.  The returned values will be
+   valid values to write to the hardware watchpoint value and control
+   registers.
 
    The given watchpoint may get truncated if more than one hardware
    register is needed to cover the watched region.  *NEXT_ADDR_P
    and *NEXT_LEN_P, if non-NULL, will return the address and length
    of the remaining part of the watchpoint (which can be processed
-   by calling this routine again to generate another aligned address
-   and length pair.
+   by calling this routine again to generate another aligned address,
+   offset and length tuple.
 
    Essentially, unaligned watchpoint is achieved by minimally
    enlarging the watched area to meet the alignment requirement, and
    if necessary, splitting the watchpoint over several hardware
-   watchpoint registers.  The trade-off is that there will be
-   false-positive hits for the read-type or the access-type hardware
-   watchpoints; for the write type, which is more commonly used, there
-   will be no such issues, as the higher-level breakpoint management
-   in gdb always examines the exact watched region for any content
-   change, and transparently resumes a thread from a watchpoint trap
-   if there is no change to the watched region.
+   watchpoint registers.
+
+   On kernels that predate the support for Byte Address Select (BAS)
+   in the hardware watchpoint control register, the offset from the
+   base address is always zero, and so in that case the trade-off is
+   that there will be false-positive hits for the read-type or the
+   access-type hardware watchpoints; for the write type, which is more
+   commonly used, there will be no such issues, as the higher-level
+   breakpoint management in gdb always examines the exact watched
+   region for any content change, and transparently resumes a thread
+   from a watchpoint trap if there is no change to the watched region.
 
    Another limitation is that because the watched region is enlarged,
-   the watchpoint fault address returned by
+   the watchpoint fault address discovered by
    aarch64_stopped_data_address may be outside of the original watched
    region, especially when the triggering instruction is accessing a
    larger region.  When the fault address is not within any known
    range, watchpoints_triggered in gdb will get confused, as the
    higher-level watchpoint management is only aware of original
    watched regions, and will think that some unknown watchpoint has
-   been triggered.  In such a case, gdb may stop without displaying
-   any detailed information.
-
-   Once the kernel provides the full support for Byte Address Select
-   (BAS) in the hardware watchpoint control register, these
-   limitations can be largely relaxed with some further work.  */
+   been triggered.  To prevent such a case,
+   aarch64_stopped_data_address implementations in gdb and gdbserver
+   try to match the trapped address with a watched region, and return
+   an address within the latter. */
 
 static void
 aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
@@ -221,11 +229,12 @@ aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
   if (len <= 0)
     return;
 
-  /* Address to be put into the hardware watchpoint value register
-     must be aligned.  */
+  /* The address put into the hardware watchpoint value register must
+     be aligned.  */
   offset = addr & (alignment - 1);
   aligned_addr = addr - offset;
-  aligned_offset = have_any_contiguous ? addr & (alignment - 1) : 0;
+  aligned_offset
+    = kernel_supports_any_contiguous_range ? addr & (alignment - 1) : 0;
 
   gdb_assert (offset >= 0 && offset < alignment);
   gdb_assert (aligned_addr >= 0 && aligned_addr <= addr);
@@ -233,9 +242,10 @@ aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
 
   if (offset + len >= max_wp_len)
     {
-      /* Need more than one watchpoint registers; truncate it at the
+      /* Need more than one watchpoint register; truncate at the
 	 alignment boundary.  */
-      aligned_len = max_wp_len - (have_any_contiguous ? offset : 0);
+      aligned_len
+	= max_wp_len - (kernel_supports_any_contiguous_range ? offset : 0);
       len -= (max_wp_len - offset);
       addr += (max_wp_len - offset);
       gdb_assert ((addr & (alignment - 1)) == 0);
@@ -248,7 +258,7 @@ aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
 	aligned_len_array[AARCH64_HWP_MAX_LEN_PER_REG] =
 	{ 1, 2, 4, 4, 8, 8, 8, 8 };
 
-      aligned_len = (have_any_contiguous
+      aligned_len = (kernel_supports_any_contiguous_range
 		     ? len : aligned_len_array[offset + len - 1]);
       addr += len;
       len = 0;
@@ -265,8 +275,7 @@ aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
   if (next_len_p)
     *next_len_p = len;
   if (next_addr_orig_p)
-    *next_addr_orig_p = (((*next_addr_orig_p) + alignment)
-			 & -(CORE_ADDR) alignment);
+    *next_addr_orig_p = align_down (*next_addr_orig_p + alignment, alignment);
 }
 
 struct aarch64_dr_update_callback_param
@@ -356,13 +365,15 @@ aarch64_notify_debug_reg_change (const struct aarch64_debug_reg_state *state,
   iterate_over_lwps (pid_ptid, debug_reg_change_callback, (void *) &param);
 }
 
-/* Reconfigure STATE to be compatible with Linux kernel with buggy
-   PR external/20207 during HAVE_ANY_CONTIGUOUS true->false change.
-   A regression for buggy kernels is that originally GDB could support
-   more watchpoint combinations that had matching (and thus shared)
-   masks.  On such buggy kernels new GDB will try to first setup the
-   perfect matching ranges which will run out of registers (before this
-   function can merge them).  */
+/* Reconfigure STATE to be compatible with Linux kernels with the PR
+   external/20207 bug.  This is called when
+   KERNEL_SUPPORTS_ANY_CONTIGUOUS_RANGE transitions to false.  Note we
+   don't try to support combining watchpoints with matching (and thus
+   shared) masks, as it's too late when we get here.  On buggy
+   kernels, GDB will try to first setup the perfect matching ranges,
+   which will run out of registers before this function can merge
+   them.  It doesn't look like worth the effort to improve that, given
+   eventually buggy kernels will be phased out.  */
 
 static void
 aarch64_downgrade_regs (struct aarch64_debug_reg_state *state)
@@ -373,9 +384,9 @@ aarch64_downgrade_regs (struct aarch64_debug_reg_state *state)
 	gdb_assert (state->dr_ref_count_wp[i] != 0);
 	uint8_t mask_orig = (state->dr_ctrl_wp[i] >> 5) & 0xff;
 	gdb_assert (mask_orig != 0);
-	const std::array<uint8_t, 4> old_valid ({ 0x01, 0x03, 0x0f, 0xff });
-	uint8_t mask (0);
-	for (const uint8_t old_mask:old_valid)
+	static const uint8_t old_valid[] = { 0x01, 0x03, 0x0f, 0xff };
+	uint8_t mask = 0;
+	for (const uint8_t old_mask : old_valid)
 	  if (mask_orig <= old_mask)
 	    {
 	      mask = old_mask;
@@ -383,13 +394,14 @@ aarch64_downgrade_regs (struct aarch64_debug_reg_state *state)
 	    }
 	gdb_assert (mask != 0);
 
-	// No update needed for this watchpoint?
+	/* No update needed for this watchpoint?  */
 	if (mask == mask_orig)
 	  continue;
 	state->dr_ctrl_wp[i] |= mask << 5;
-	state->dr_addr_wp[i] &= -(CORE_ADDR) AARCH64_HWP_ALIGNMENT;
+	state->dr_addr_wp[i]
+	  = align_down (state->dr_addr_wp[i], AARCH64_HWP_ALIGNMENT);
 
-	// Try to match duplicate entries.
+	/* Try to match duplicate entries.  */
 	for (int j = 0; j < i; ++j)
 	  if ((state->dr_ctrl_wp[j] & 1) != 0
 	      && state->dr_addr_wp[j] == state->dr_addr_wp[i]
@@ -692,10 +704,11 @@ aarch64_linux_set_debug_regs (struct aarch64_debug_reg_state *state,
 	      watchpoint ? NT_ARM_HW_WATCH : NT_ARM_HW_BREAK,
 	      (void *) &iov))
     {
-      // Handle PR external/20207 buggy kernels?
-      if (watchpoint && errno == EINVAL && have_any_contiguous)
+      /* Handle Linux kernels with the PR external/20207 bug.  */
+      if (watchpoint && errno == EINVAL
+	  && kernel_supports_any_contiguous_range)
 	{
-	  have_any_contiguous = false;
+	  kernel_supports_any_contiguous_range = false;
 	  aarch64_downgrade_regs (state);
 	  aarch64_linux_set_debug_regs (state, tid, watchpoint);
 	  return;
diff --git a/gdb/nat/aarch64-linux-hw-point.h b/gdb/nat/aarch64-linux-hw-point.h
index 7c3513e301e..1940b06a89e 100644
--- a/gdb/nat/aarch64-linux-hw-point.h
+++ b/gdb/nat/aarch64-linux-hw-point.h
@@ -147,9 +147,9 @@ struct aarch64_debug_reg_state
   unsigned int dr_ref_count_bp[AARCH64_HBP_MAX_NUM];
 
   /* hardware watchpoint */
-  // Address aligned down to AARCH64_HWP_ALIGNMENT.
+  /* Address aligned down to AARCH64_HWP_ALIGNMENT.  */
   CORE_ADDR dr_addr_wp[AARCH64_HWP_MAX_NUM];
-  // Address as entered by user without any alignment.
+  /* Address as entered by user without any forced alignment.  */
   CORE_ADDR dr_addr_orig_wp[AARCH64_HWP_MAX_NUM];
   unsigned int dr_ctrl_wp[AARCH64_HWP_MAX_NUM];
   unsigned int dr_ref_count_wp[AARCH64_HWP_MAX_NUM];
diff --git a/gdb/testsuite/gdb.base/watchpoint-unaligned.c b/gdb/testsuite/gdb.base/watchpoint-unaligned.c
index f653e77529d..ea844e947f6 100644
--- a/gdb/testsuite/gdb.base/watchpoint-unaligned.c
+++ b/gdb/testsuite/gdb.base/watchpoint-unaligned.c
@@ -1,6 +1,6 @@
 /* This testcase is part of GDB, the GNU debugger.
 
-   Copyright 2017 Free Software Foundation, Inc.
+   Copyright 2017-2018 Free Software Foundation, Inc.
 
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -62,7 +62,7 @@ main (void)
 	  assert (0);
 	}
       size = 0;
-      size = size; /* again_start */
+      size = size; /* start_again */
     }
   return 0;
 }
diff --git a/gdb/testsuite/gdb.base/watchpoint-unaligned.exp b/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
index 8d463b12fad..833997b04d7 100644
--- a/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
+++ b/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
@@ -1,4 +1,4 @@
-# Copyright 2017 Free Software Foundation, Inc.
+# Copyright 2017-2018 Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -11,11 +11,12 @@
 # GNU General Public License for more details.
 #
 # You should have received a copy of the GNU General Public License
-# along with this program; if not, write to the Free Software
-# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
 #
 # This file is part of the gdb testsuite.
 
+# Test inserting read watchpoints on unaligned addresses.
+
 standard_testfile
 if { [prepare_for_testing ${testfile}.exp ${testfile} ${srcfile}] } {
     return -1
@@ -26,21 +27,21 @@ if ![runto_main] {
     return -1
 }
 
-gdb_breakpoint "[gdb_get_line_number "again_start"]" "Breakpoint $decimal at $hex" "again_start"
+gdb_breakpoint [gdb_get_line_number "start_again"] "Breakpoint $decimal at $hex" "start_again"
 
 set sizes {1 2 4 8}
 array set alignedend {1 1  2 2  3 4  4 4  5 8  6 8  7 8  8 8}
 
 foreach wpsize $sizes {
-    for {set wpoffset 0} {$wpoffset < 8/$wpsize} {incr wpoffset} {
+    for {set wpoffset 0} {$wpoffset < 8 / $wpsize} {incr wpoffset} {
 	set wpstart [expr $wpoffset * $wpsize]
 	set wpend [expr ($wpoffset + 1) * $wpsize]
 	set wpendaligned $alignedend($wpend)
 	foreach rdsize $sizes {
-	    for {set rdoffset 0} {$rdoffset < 8/$rdsize} {incr rdoffset} {
+	    for {set rdoffset 0} {$rdoffset < 8 / $rdsize} {incr rdoffset} {
 		set rdstart [expr $rdoffset * $rdsize]
 		set rdend [expr ($rdoffset + 1) * $rdsize]
-		set expect_hit [expr max($wpstart,$rdstart) < min($wpend,$rdend)]
+		set expect_hit [expr max ($wpstart, $rdstart) < min ($wpend, $rdend)]
 		set test "rwatch data.u.size$wpsize\[$wpoffset\]"
 		set wpnum ""
 		gdb_test_multiple $test $test {
@@ -51,25 +52,25 @@ foreach wpsize $sizes {
 		gdb_test_no_output "set variable size = $rdsize" ""
 		gdb_test_no_output "set variable offset = $rdoffset" ""
 		set test "continue"
-		set did_hit 0
+		set got_hit 0
 		gdb_test_multiple $test $test {
 		    -re "Hardware read watchpoint $wpnum:.*Value = .*\r\n$gdb_prompt $" {
-			set did_hit 1
+			set got_hit 1
 			send_gdb "continue\n"
 			exp_continue
 		    }
-		    -re " again_start .*\r\n$gdb_prompt $" {
+		    -re " start_again .*\r\n$gdb_prompt $" {
 		    }
 		}
 		gdb_test_no_output "delete $wpnum" ""
 		set test "wp(size=$wpsize offset=$wpoffset) rd(size=$rdsize offset=$rdoffset) expect=$expect_hit"
-		if {$expect_hit == $did_hit} {
+		if {$expect_hit == $got_hit} {
 		    pass $test
 		} else {
-		    # We do not know if we run on a fixed kernel or not.
-		    # Report XFAIL only in the FAIL case.
+		    # We do not know if we run on a fixed Linux kernel
+		    # or not.  Report XFAIL only in the FAIL case.
 		    if {$expect_hit == 0 && $rdstart < $wpendaligned} {
-			setup_xfail external/20207 "aarch64-*-*"
+			setup_xfail external/20207 "aarch64*-*-linux*"
 		    }
 		    fail $test
 		}
@@ -92,16 +93,16 @@ foreach wpcount {4 7} {
     gdb_test_no_output "set variable size = 1" ""
     gdb_test_no_output "set variable offset = 1" ""
     set test "continue"
-    set did_hit 0
+    set got_hit 0
     gdb_test_multiple $test $test {
 	-re "\r\nCould not insert hardware watchpoint .*\r\n$gdb_prompt $" {
 	}
 	-re "Hardware read watchpoint $wpoffset_to_wpnum(1):.*Value = .*\r\n$gdb_prompt $" {
-	    set did_hit 1
+	    set got_hit 1
 	    send_gdb "continue\n"
 	    exp_continue
 	}
-	-re " again_start .*\r\n$gdb_prompt $" {
+	-re " start_again .*\r\n$gdb_prompt $" {
 	}
     }
     for {set wpoffset 1} {$wpoffset <= $wpcount} {incr wpoffset} {
@@ -111,9 +112,5 @@ foreach wpcount {4 7} {
     if {$wpcount > 4} {
 	setup_kfail tdep/22389 *-*-*
     }
-    if {$did_hit} {
-	pass $test
-    } else {
-	fail $test
-    }
+    gdb_assert $got_hit $test
 }
diff --git a/gdb/utils.c b/gdb/utils.c
index b957b0dc5c2..e274f026764 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -2849,22 +2849,6 @@ gdb_realpath_tests ()
 
 #endif /* GDB_SELF_TEST */
 
-ULONGEST
-align_up (ULONGEST v, int n)
-{
-  /* Check that N is really a power of two.  */
-  gdb_assert (n && (n & (n-1)) == 0);
-  return (v + n - 1) & -n;
-}
-
-ULONGEST
-align_down (ULONGEST v, int n)
-{
-  /* Check that N is really a power of two.  */
-  gdb_assert (n && (n & (n-1)) == 0);
-  return (v & -n);
-}
-
 /* Allocation function for the libiberty hash table which uses an
    obstack.  The obstack is passed as DATA.  */
 
diff --git a/gdb/utils.h b/gdb/utils.h
index 4dec889d83e..c728449429e 100644
--- a/gdb/utils.h
+++ b/gdb/utils.h
@@ -494,38 +494,6 @@ extern pid_t wait_to_die_with_timeout (pid_t pid, int *status, int timeout);
 
 extern int myread (int, char *, int);
 
-/* Ensure that V is aligned to an N byte boundary (B's assumed to be a
-   power of 2).  Round up/down when necessary.  Examples of correct
-   use include:
-
-   addr = align_up (addr, 8); -- VALUE needs 8 byte alignment
-   write_memory (addr, value, len);
-   addr += len;
-
-   and:
-
-   sp = align_down (sp - len, 16); -- Keep SP 16 byte aligned
-   write_memory (sp, value, len);
-
-   Note that uses such as:
-
-   write_memory (addr, value, len);
-   addr += align_up (len, 8);
-
-   and:
-
-   sp -= align_up (len, 8);
-   write_memory (sp, value, len);
-
-   are typically not correct as they don't ensure that the address (SP
-   or ADDR) is correctly aligned (relying on previous alignment to
-   keep things right).  This is also why the methods are called
-   "align_..." instead of "round_..." as the latter reads better with
-   this incorrect coding style.  */
-
-extern ULONGEST align_up (ULONGEST v, int n);
-extern ULONGEST align_down (ULONGEST v, int n);
-
 /* Resource limits used by getrlimit and setrlimit.  */
 
 enum resource_limit_kind
-- 
2.14.3
Pedro Alves April 20, 2018, 2:55 p.m. | #2
On 04/20/2018 03:49 PM, Pedro Alves wrote:

> Here's the patch that I intend to squash with yours before

> merging (keeping you as --author).

> 

> Could you double-check to see if I missed something?


Like this.

From 684b840e7cf9771d3b4d06529200048e2731a410 Mon Sep 17 00:00:00 2001
From: Jan Kratochvil <jan.kratochvil@redhat.com>

Date: Fri, 20 Apr 2018 15:51:57 +0100
Subject: [PATCH] Aarch64: Fix watchpoints set on non-8-byte-aligned addresses
 are always missed (PR 19806)

Ref: https://sourceware.org/bugzilla/show_bug.cgi?id=19806

As described in detail on the bug report, on Aarch64, some unaligned
watchpoints are currently missed.  For example, with:

 union
 {
   char buf[4];
   unsigned int ul;
 } u;

 int
 main ()
 {
   u.ul = 0xffffffff;
   return 0;
 }

on x86-64, we get:

 (gdb) watch u.buf[1]
 Hardware watchpoint 1: u.buf[1]
 (gdb) c
 Continuing.
 Hardware watchpoint 1: u.buf[1]

 Old value = 0 '\000'
 New value = -1 '\377'
 main () at watch.c:11
 11              return 0;
 (gdb)

While on Aarch64, GDB misses the watchpoint hit.

Actually, the kernel reports the hit to gdb, and linux-nat.c forwards
the event to infrun.c.  However, it doesn't work as expected because
the Aarch64 backend code aligns the inserted watchpoint's address to
an 8-byte boundary, and then when the watchpoint triggers, the backend
reports that aligned address as the watchpoint stop address.  Since
that address falls out of any memory range covered by watchpoints that
the core of GDB knows about, the watchpoint hit is not reported to the
user (it's considered a spurious/moribund watchpoint trap, and
ignored).

This patch fixes it, by trying to match the kernel-reported trapped
address (ADDR_TRAP) with a watched region, so that the core of gdb
figures out which watchpoint triggered.

Additionally this patch makes GDB support any watchpoint masks as
described here:

 kernel RFE: aarch64: ptrace: BAS: Support any contiguous range
 https://sourceware.org/bugzilla/show_bug.cgi?id=20207

The above is already fixed in current Linux kernels.

The patch trades missing watchpoints (false negatives) for the
occasional rwatch/awatch false positive on unfixed kernels.  The
latter can happen if you have watchpoints set in near addresses; gdb
may report the wrong watchpoint being hit.

Note the change causes the aarch64 backend to merge fewer watchpoints,
since it now only considers the requested ranges for merging instead
of the enlarged/aligned ranges.  I.e., with multiple overlapping
watchpoints, gdb may run out of the 4 hardware watchpoint registers
earlier than before.  While we could make gdb considered the enlarged
ranges when running on an older kernel, I do not think it's worth the
bother, since older kernels will eventually be phased out.

Tested on RHEL-7.{3,4} for no regressions on:
	kernel-4.10.0-6.el7.aarch64 (contiguous watchpoints supported)
	kernel-4.5.0-15.el7.aarch64 (contiguous watchpoints unsupported)

gdb/ChangeLog:
yyyy-mm-dd  Jan Kratochvil  <jan.kratochvil@redhat.com>
	    Pedro Alves <palves@redhat.com>

	PR breakpoints/19806 and support for PR external/20207.
	* NEWS: Mention Aarch64 watchpoint improvements.
	* aarch64-linux-nat.c (aarch64_linux_stopped_data_address): Fix missed
	watchpoints and PR external/20207 watchpoints.
	* nat/aarch64-linux-hw-point.c
	(kernel_supports_any_contiguous_range): New.
	(aarch64_watchpoint_offset): New.
	(aarch64_watchpoint_length): Support PR external/20207 watchpoints.
	(aarch64_point_encode_ctrl_reg): New parameter offset, new asserts.
	(aarch64_point_is_aligned): Support PR external/20207 watchpoints.
	(aarch64_align_watchpoint): New parameters aligned_offset_p and
	next_addr_orig_p.  Support PR external/20207 watchpoints.
	(aarch64_downgrade_regs): New.
	(aarch64_dr_state_insert_one_point): New parameters offset and
	addr_orig.
	(aarch64_dr_state_remove_one_point): Likewise.
	(aarch64_handle_breakpoint): Update caller.
	(aarch64_handle_aligned_watchpoint): Likewise.
	(aarch64_handle_unaligned_watchpoint): Support addr_orig and
	aligned_offset.
	(aarch64_linux_set_debug_regs): Remove const from state.  Call
	aarch64_downgrade_regs.
	(aarch64_show_debug_reg_state): Print also dr_addr_orig_wp.
	* nat/aarch64-linux-hw-point.h (DR_CONTROL_LENGTH): Rename to ...
	(DR_CONTROL_MASK): ... this.
	(struct aarch64_debug_reg_state): New field dr_addr_orig_wp.
	(unsigned int aarch64_watchpoint_offset): New prototype.
	(aarch64_linux_set_debug_regs): Remove const from state.
	* utils.c (align_up, align_down): Move to ...
	* common/common-utils.c (align_up, align_down): ... here.
	* utils.h (align_up, align_down): Move to ...
	* common/common-utils.h (align_up, align_down): ... here.

gdb/gdbserver/ChangeLog:
yyyy-mm-dd  Jan Kratochvil  <jan.kratochvil@redhat.com>
	    Pedro Alves <palves@redhat.com>

	* linux-aarch64-low.c (aarch64_stopped_data_address):
	Likewise.

gdb/testsuite/ChangeLog:
yyyy-mm-dd  Jan Kratochvil  <jan.kratochvil@redhat.com>
	    Pedro Alves <palves@redhat.com>

	PR breakpoints/19806 and support for PR external/20207.
	* gdb.base/watchpoint-unaligned.c: New file.
	* gdb.base/watchpoint-unaligned.exp: New file.
---
 gdb/NEWS                                        |  10 +
 gdb/aarch64-linux-nat.c                         |  28 ++-
 gdb/common/common-utils.c                       |  20 ++
 gdb/common/common-utils.h                       |  32 +++
 gdb/gdbserver/linux-aarch64-low.c               |  31 ++-
 gdb/nat/aarch64-linux-hw-point.c                | 277 ++++++++++++++++++------
 gdb/nat/aarch64-linux-hw-point.h                |  10 +-
 gdb/testsuite/gdb.base/watchpoint-unaligned.c   |  68 ++++++
 gdb/testsuite/gdb.base/watchpoint-unaligned.exp | 116 ++++++++++
 gdb/utils.c                                     |  16 --
 gdb/utils.h                                     |  32 ---
 11 files changed, 512 insertions(+), 128 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/watchpoint-unaligned.c
 create mode 100644 gdb/testsuite/gdb.base/watchpoint-unaligned.exp

diff --git a/gdb/NEWS b/gdb/NEWS
index 63fe30d1754..38043c4ff2e 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -35,6 +35,16 @@ SH-5/SH64 ELF			sh64-*-elf*, SH-5/SH64 support in sh*
 SH-5/SH64 running GNU/Linux	SH-5/SH64 support in sh*-*-linux*
 SH-5/SH64 running OpenBSD 	SH-5/SH64 support in sh*-*-openbsd*
 
+* Aarch64/Linux hardware watchpoints improvements
+
+  Hardware watchpoints on unaligned addresses are now properly
+  supported when running Linux kernel 4.10 or higher: read and access
+  watchpoints are no longer spuriously missed, and all watchpoints
+  lengths between 1 and 8 bytes are supported.  On older kernels,
+  watchpoints set on unaligned addresses are no longer missed, with
+  the tradeoff that there is a possibility of false hits being
+  reported.
+
 *** Changes in GDB 8.1
 
 * GDB now supports dynamically creating arbitrary register groups specified
diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index 9385659f140..421d044e108 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -735,16 +735,38 @@ aarch64_linux_stopped_data_address (struct target_ops *target,
   state = aarch64_get_debug_reg_state (ptid_get_pid (inferior_ptid));
   for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
     {
+      const unsigned int offset
+	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
       const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
       const CORE_ADDR addr_trap = (CORE_ADDR) siginfo.si_addr;
-      const CORE_ADDR addr_watch = state->dr_addr_wp[i];
+      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
+      const CORE_ADDR addr_watch_aligned = align_down (state->dr_addr_wp[i], 8);
+      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
 
       if (state->dr_ref_count_wp[i]
 	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
-	  && addr_trap >= addr_watch
+	  && addr_trap >= addr_watch_aligned
 	  && addr_trap < addr_watch + len)
 	{
-	  *addr_p = addr_trap;
+	  /* ADDR_TRAP reports the first address of the memory range
+	     accessed by the CPU, regardless of what was the memory
+	     range watched.  Thus, a large CPU access that straddles
+	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
+	     ADDR_TRAP that is lower than the
+	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
+
+	     addr: |   4   |   5   |   6   |   7   |   8   |
+				   |---- range watched ----|
+		   |----------- range accessed ------------|
+
+	     In this case, ADDR_TRAP will be 4.
+
+	     To match a watchpoint known to GDB core, we must never
+	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
+	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
+	     positive on kernels older than 4.10.  See PR
+	     external/20207.  */
+	  *addr_p = addr_orig;
 	  return 1;
 	}
     }
diff --git a/gdb/common/common-utils.c b/gdb/common/common-utils.c
index 80de826ba78..8d839d10fa8 100644
--- a/gdb/common/common-utils.c
+++ b/gdb/common/common-utils.c
@@ -440,3 +440,23 @@ is_regular_file (const char *name, int *errno_ptr)
     *errno_ptr = EINVAL;
   return false;
 }
+
+/* See common/common-utils.h.  */
+
+ULONGEST
+align_up (ULONGEST v, int n)
+{
+  /* Check that N is really a power of two.  */
+  gdb_assert (n && (n & (n-1)) == 0);
+  return (v + n - 1) & -n;
+}
+
+/* See common/common-utils.h.  */
+
+ULONGEST
+align_down (ULONGEST v, int n)
+{
+  /* Check that N is really a power of two.  */
+  gdb_assert (n && (n & (n-1)) == 0);
+  return (v & -n);
+}
diff --git a/gdb/common/common-utils.h b/gdb/common/common-utils.h
index 5408c354693..7bc6e90f05c 100644
--- a/gdb/common/common-utils.h
+++ b/gdb/common/common-utils.h
@@ -151,4 +151,36 @@ in_inclusive_range (T value, T low, T high)
    we're expecting a regular file.  */
 extern bool is_regular_file (const char *name, int *errno_ptr);
 
+/* Ensure that V is aligned to an N byte boundary (B's assumed to be a
+   power of 2).  Round up/down when necessary.  Examples of correct
+   use include:
+
+    addr = align_up (addr, 8); -- VALUE needs 8 byte alignment
+    write_memory (addr, value, len);
+    addr += len;
+
+   and:
+
+    sp = align_down (sp - len, 16); -- Keep SP 16 byte aligned
+    write_memory (sp, value, len);
+
+   Note that uses such as:
+
+    write_memory (addr, value, len);
+    addr += align_up (len, 8);
+
+   and:
+
+    sp -= align_up (len, 8);
+    write_memory (sp, value, len);
+
+   are typically not correct as they don't ensure that the address (SP
+   or ADDR) is correctly aligned (relying on previous alignment to
+   keep things right).  This is also why the methods are called
+   "align_..." instead of "round_..." as the latter reads better with
+   this incorrect coding style.  */
+
+extern ULONGEST align_up (ULONGEST v, int n);
+extern ULONGEST align_down (ULONGEST v, int n);
+
 #endif
diff --git a/gdb/gdbserver/linux-aarch64-low.c b/gdb/gdbserver/linux-aarch64-low.c
index eccac4da138..7ea24c23634 100644
--- a/gdb/gdbserver/linux-aarch64-low.c
+++ b/gdb/gdbserver/linux-aarch64-low.c
@@ -360,14 +360,39 @@ aarch64_stopped_data_address (void)
   state = aarch64_get_debug_reg_state (pid_of (current_thread));
   for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
     {
+      const unsigned int offset
+	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
       const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
       const CORE_ADDR addr_trap = (CORE_ADDR) siginfo.si_addr;
-      const CORE_ADDR addr_watch = state->dr_addr_wp[i];
+      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
+      const CORE_ADDR addr_watch_aligned = align_down (state->dr_addr_wp[i], 8);
+      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
+
       if (state->dr_ref_count_wp[i]
 	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
-	  && addr_trap >= addr_watch
+	  && addr_trap >= addr_watch_aligned
 	  && addr_trap < addr_watch + len)
-	return addr_trap;
+	{
+	  /* ADDR_TRAP reports the first address of the memory range
+	     accessed by the CPU, regardless of what was the memory
+	     range watched.  Thus, a large CPU access that straddles
+	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
+	     ADDR_TRAP that is lower than the
+	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
+
+	     addr: |   4   |   5   |   6   |   7   |   8   |
+				   |---- range watched ----|
+		   |----------- range accessed ------------|
+
+	     In this case, ADDR_TRAP will be 4.
+
+	     To match a watchpoint known to GDB core, we must never
+	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
+	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
+	     positive on kernels older than 4.10.  See PR
+	     external/20207.  */
+	  return addr_orig;
+	}
     }
 
   return (CORE_ADDR) 0;
diff --git a/gdb/nat/aarch64-linux-hw-point.c b/gdb/nat/aarch64-linux-hw-point.c
index ce26f28fad1..a3931ea6a94 100644
--- a/gdb/nat/aarch64-linux-hw-point.c
+++ b/gdb/nat/aarch64-linux-hw-point.c
@@ -34,29 +34,52 @@
 int aarch64_num_bp_regs;
 int aarch64_num_wp_regs;
 
+/* True if this kernel does not have the bug described by PR
+   external/20207 (Linux >= 4.10).  A fixed kernel supports any
+   contiguous range of bits in 8-bit byte DR_CONTROL_MASK.  A buggy
+   kernel supports only 0x01, 0x03, 0x0f and 0xff.  We start by
+   assuming the bug is fixed, and then detect the bug at
+   PTRACE_SETREGSET time.  */
+static bool kernel_supports_any_contiguous_range = true;
+
+/* Return starting byte 0..7 incl. of a watchpoint encoded by CTRL.  */
+
+unsigned int
+aarch64_watchpoint_offset (unsigned int ctrl)
+{
+  uint8_t mask = DR_CONTROL_MASK (ctrl);
+  unsigned retval;
+
+  /* Shift out bottom zeros.  */
+  for (retval = 0; mask && (mask & 1) == 0; ++retval)
+    mask >>= 1;
+
+  return retval;
+}
+
 /* Utility function that returns the length in bytes of a watchpoint
    according to the content of a hardware debug control register CTRL.
-   Note that the kernel currently only supports the following Byte
-   Address Select (BAS) values: 0x1, 0x3, 0xf and 0xff, which means
-   that for a hardware watchpoint, its valid length can only be 1
-   byte, 2 bytes, 4 bytes or 8 bytes.  */
+   Any contiguous range of bytes in CTRL is supported.  The returned
+   value can be between 0..8 (inclusive).  */
 
 unsigned int
 aarch64_watchpoint_length (unsigned int ctrl)
 {
-  switch (DR_CONTROL_LENGTH (ctrl))
-    {
-    case 0x01:
-      return 1;
-    case 0x03:
-      return 2;
-    case 0x0f:
-      return 4;
-    case 0xff:
-      return 8;
-    default:
-      return 0;
-    }
+  uint8_t mask = DR_CONTROL_MASK (ctrl);
+  unsigned retval;
+
+  /* Shift out bottom zeros.  */
+  mask >>= aarch64_watchpoint_offset (ctrl);
+
+  /* Count bottom ones.  */
+  for (retval = 0; (mask & 1) != 0; ++retval)
+    mask >>= 1;
+
+  if (mask != 0)
+    error (_("Unexpected hardware watchpoint length register value 0x%x"),
+	   DR_CONTROL_MASK (ctrl));
+
+  return retval;
 }
 
 /* Given the hardware breakpoint or watchpoint type TYPE and its
@@ -64,10 +87,13 @@ aarch64_watchpoint_length (unsigned int ctrl)
    breakpoint/watchpoint control register.  */
 
 static unsigned int
-aarch64_point_encode_ctrl_reg (enum target_hw_bp_type type, int len)
+aarch64_point_encode_ctrl_reg (enum target_hw_bp_type type, int offset, int len)
 {
   unsigned int ctrl, ttype;
 
+  gdb_assert (offset == 0 || kernel_supports_any_contiguous_range);
+  gdb_assert (offset + len <= AARCH64_HWP_MAX_LEN_PER_REG);
+
   /* type */
   switch (type)
     {
@@ -89,8 +115,8 @@ aarch64_point_encode_ctrl_reg (enum target_hw_bp_type type, int len)
 
   ctrl = ttype << 3;
 
-  /* length bitmask */
-  ctrl |= ((1 << len) - 1) << 5;
+  /* offset and length bitmask */
+  ctrl |= ((1 << len) - 1) << (5 + offset);
   /* enabled at el0 */
   ctrl |= (2 << 1) | 1;
 
@@ -134,58 +160,65 @@ aarch64_point_is_aligned (int is_watchpoint, CORE_ADDR addr, int len)
   if (addr & (alignment - 1))
     return 0;
 
-  if (len != 8 && len != 4 && len != 2 && len != 1)
+  if ((!kernel_supports_any_contiguous_range
+       && len != 8 && len != 4 && len != 2 && len != 1)
+      || (kernel_supports_any_contiguous_range
+	  && (len < 1 || len > 8)))
     return 0;
 
   return 1;
 }
 
 /* Given the (potentially unaligned) watchpoint address in ADDR and
-   length in LEN, return the aligned address and aligned length in
-   *ALIGNED_ADDR_P and *ALIGNED_LEN_P, respectively.  The returned
-   aligned address and length will be valid values to write to the
-   hardware watchpoint value and control registers.
+   length in LEN, return the aligned address, offset from that base
+   address, and aligned length in *ALIGNED_ADDR_P, *ALIGNED_OFFSET_P
+   and *ALIGNED_LEN_P, respectively.  The returned values will be
+   valid values to write to the hardware watchpoint value and control
+   registers.
 
    The given watchpoint may get truncated if more than one hardware
    register is needed to cover the watched region.  *NEXT_ADDR_P
    and *NEXT_LEN_P, if non-NULL, will return the address and length
    of the remaining part of the watchpoint (which can be processed
-   by calling this routine again to generate another aligned address
-   and length pair.
+   by calling this routine again to generate another aligned address,
+   offset and length tuple.
 
    Essentially, unaligned watchpoint is achieved by minimally
    enlarging the watched area to meet the alignment requirement, and
    if necessary, splitting the watchpoint over several hardware
-   watchpoint registers.  The trade-off is that there will be
-   false-positive hits for the read-type or the access-type hardware
-   watchpoints; for the write type, which is more commonly used, there
-   will be no such issues, as the higher-level breakpoint management
-   in gdb always examines the exact watched region for any content
-   change, and transparently resumes a thread from a watchpoint trap
-   if there is no change to the watched region.
+   watchpoint registers.
+
+   On kernels that predate the support for Byte Address Select (BAS)
+   in the hardware watchpoint control register, the offset from the
+   base address is always zero, and so in that case the trade-off is
+   that there will be false-positive hits for the read-type or the
+   access-type hardware watchpoints; for the write type, which is more
+   commonly used, there will be no such issues, as the higher-level
+   breakpoint management in gdb always examines the exact watched
+   region for any content change, and transparently resumes a thread
+   from a watchpoint trap if there is no change to the watched region.
 
    Another limitation is that because the watched region is enlarged,
-   the watchpoint fault address returned by
+   the watchpoint fault address discovered by
    aarch64_stopped_data_address may be outside of the original watched
    region, especially when the triggering instruction is accessing a
    larger region.  When the fault address is not within any known
    range, watchpoints_triggered in gdb will get confused, as the
    higher-level watchpoint management is only aware of original
    watched regions, and will think that some unknown watchpoint has
-   been triggered.  In such a case, gdb may stop without displaying
-   any detailed information.
-
-   Once the kernel provides the full support for Byte Address Select
-   (BAS) in the hardware watchpoint control register, these
-   limitations can be largely relaxed with some further work.  */
+   been triggered.  To prevent such a case,
+   aarch64_stopped_data_address implementations in gdb and gdbserver
+   try to match the trapped address with a watched region, and return
+   an address within the latter. */
 
 static void
 aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
-			  int *aligned_len_p, CORE_ADDR *next_addr_p,
-			  int *next_len_p)
+			  int *aligned_offset_p, int *aligned_len_p,
+			  CORE_ADDR *next_addr_p, int *next_len_p,
+			  CORE_ADDR *next_addr_orig_p)
 {
   int aligned_len;
-  unsigned int offset;
+  unsigned int offset, aligned_offset;
   CORE_ADDR aligned_addr;
   const unsigned int alignment = AARCH64_HWP_ALIGNMENT;
   const unsigned int max_wp_len = AARCH64_HWP_MAX_LEN_PER_REG;
@@ -196,10 +229,12 @@ aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
   if (len <= 0)
     return;
 
-  /* Address to be put into the hardware watchpoint value register
-     must be aligned.  */
+  /* The address put into the hardware watchpoint value register must
+     be aligned.  */
   offset = addr & (alignment - 1);
   aligned_addr = addr - offset;
+  aligned_offset
+    = kernel_supports_any_contiguous_range ? addr & (alignment - 1) : 0;
 
   gdb_assert (offset >= 0 && offset < alignment);
   gdb_assert (aligned_addr >= 0 && aligned_addr <= addr);
@@ -207,9 +242,10 @@ aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
 
   if (offset + len >= max_wp_len)
     {
-      /* Need more than one watchpoint registers; truncate it at the
+      /* Need more than one watchpoint register; truncate at the
 	 alignment boundary.  */
-      aligned_len = max_wp_len;
+      aligned_len
+	= max_wp_len - (kernel_supports_any_contiguous_range ? offset : 0);
       len -= (max_wp_len - offset);
       addr += (max_wp_len - offset);
       gdb_assert ((addr & (alignment - 1)) == 0);
@@ -222,19 +258,24 @@ aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
 	aligned_len_array[AARCH64_HWP_MAX_LEN_PER_REG] =
 	{ 1, 2, 4, 4, 8, 8, 8, 8 };
 
-      aligned_len = aligned_len_array[offset + len - 1];
+      aligned_len = (kernel_supports_any_contiguous_range
+		     ? len : aligned_len_array[offset + len - 1]);
       addr += len;
       len = 0;
     }
 
   if (aligned_addr_p)
     *aligned_addr_p = aligned_addr;
+  if (aligned_offset_p)
+    *aligned_offset_p = aligned_offset;
   if (aligned_len_p)
     *aligned_len_p = aligned_len;
   if (next_addr_p)
     *next_addr_p = addr;
   if (next_len_p)
     *next_len_p = len;
+  if (next_addr_orig_p)
+    *next_addr_orig_p = align_down (*next_addr_orig_p + alignment, alignment);
 }
 
 struct aarch64_dr_update_callback_param
@@ -324,17 +365,73 @@ aarch64_notify_debug_reg_change (const struct aarch64_debug_reg_state *state,
   iterate_over_lwps (pid_ptid, debug_reg_change_callback, (void *) &param);
 }
 
+/* Reconfigure STATE to be compatible with Linux kernels with the PR
+   external/20207 bug.  This is called when
+   KERNEL_SUPPORTS_ANY_CONTIGUOUS_RANGE transitions to false.  Note we
+   don't try to support combining watchpoints with matching (and thus
+   shared) masks, as it's too late when we get here.  On buggy
+   kernels, GDB will try to first setup the perfect matching ranges,
+   which will run out of registers before this function can merge
+   them.  It doesn't look like worth the effort to improve that, given
+   eventually buggy kernels will be phased out.  */
+
+static void
+aarch64_downgrade_regs (struct aarch64_debug_reg_state *state)
+{
+  for (int i = 0; i < aarch64_num_wp_regs; ++i)
+    if ((state->dr_ctrl_wp[i] & 1) != 0)
+      {
+	gdb_assert (state->dr_ref_count_wp[i] != 0);
+	uint8_t mask_orig = (state->dr_ctrl_wp[i] >> 5) & 0xff;
+	gdb_assert (mask_orig != 0);
+	static const uint8_t old_valid[] = { 0x01, 0x03, 0x0f, 0xff };
+	uint8_t mask = 0;
+	for (const uint8_t old_mask : old_valid)
+	  if (mask_orig <= old_mask)
+	    {
+	      mask = old_mask;
+	      break;
+	    }
+	gdb_assert (mask != 0);
+
+	/* No update needed for this watchpoint?  */
+	if (mask == mask_orig)
+	  continue;
+	state->dr_ctrl_wp[i] |= mask << 5;
+	state->dr_addr_wp[i]
+	  = align_down (state->dr_addr_wp[i], AARCH64_HWP_ALIGNMENT);
+
+	/* Try to match duplicate entries.  */
+	for (int j = 0; j < i; ++j)
+	  if ((state->dr_ctrl_wp[j] & 1) != 0
+	      && state->dr_addr_wp[j] == state->dr_addr_wp[i]
+	      && state->dr_addr_orig_wp[j] == state->dr_addr_orig_wp[i]
+	      && state->dr_ctrl_wp[j] == state->dr_ctrl_wp[i])
+	    {
+	      state->dr_ref_count_wp[j] += state->dr_ref_count_wp[i];
+	      state->dr_ref_count_wp[i] = 0;
+	      state->dr_addr_wp[i] = 0;
+	      state->dr_addr_orig_wp[i] = 0;
+	      state->dr_ctrl_wp[i] &= ~1;
+	      break;
+	    }
+
+	aarch64_notify_debug_reg_change (state, 1 /* is_watchpoint */, i);
+      }
+}
+
 /* Record the insertion of one breakpoint/watchpoint, as represented
    by ADDR and CTRL, in the process' arch-specific data area *STATE.  */
 
 static int
 aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
 				   enum target_hw_bp_type type,
-				   CORE_ADDR addr, int len)
+				   CORE_ADDR addr, int offset, int len,
+				   CORE_ADDR addr_orig)
 {
   int i, idx, num_regs, is_watchpoint;
   unsigned int ctrl, *dr_ctrl_p, *dr_ref_count;
-  CORE_ADDR *dr_addr_p;
+  CORE_ADDR *dr_addr_p, *dr_addr_orig_p;
 
   /* Set up state pointers.  */
   is_watchpoint = (type != hw_execute);
@@ -343,6 +440,7 @@ aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
     {
       num_regs = aarch64_num_wp_regs;
       dr_addr_p = state->dr_addr_wp;
+      dr_addr_orig_p = state->dr_addr_orig_wp;
       dr_ctrl_p = state->dr_ctrl_wp;
       dr_ref_count = state->dr_ref_count_wp;
     }
@@ -350,11 +448,12 @@ aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
     {
       num_regs = aarch64_num_bp_regs;
       dr_addr_p = state->dr_addr_bp;
+      dr_addr_orig_p = nullptr;
       dr_ctrl_p = state->dr_ctrl_bp;
       dr_ref_count = state->dr_ref_count_bp;
     }
 
-  ctrl = aarch64_point_encode_ctrl_reg (type, len);
+  ctrl = aarch64_point_encode_ctrl_reg (type, offset, len);
 
   /* Find an existing or free register in our cache.  */
   idx = -1;
@@ -366,7 +465,9 @@ aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
 	  idx = i;
 	  /* no break; continue hunting for an exising one.  */
 	}
-      else if (dr_addr_p[i] == addr && dr_ctrl_p[i] == ctrl)
+      else if (dr_addr_p[i] == addr
+	       && (dr_addr_orig_p == nullptr || dr_addr_orig_p[i] == addr_orig)
+	       && dr_ctrl_p[i] == ctrl)
 	{
 	  gdb_assert (dr_ref_count[i] != 0);
 	  idx = i;
@@ -383,6 +484,8 @@ aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
     {
       /* new entry */
       dr_addr_p[idx] = addr;
+      if (dr_addr_orig_p != nullptr)
+	dr_addr_orig_p[idx] = addr_orig;
       dr_ctrl_p[idx] = ctrl;
       dr_ref_count[idx] = 1;
       /* Notify the change.  */
@@ -403,11 +506,12 @@ aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
 static int
 aarch64_dr_state_remove_one_point (struct aarch64_debug_reg_state *state,
 				   enum target_hw_bp_type type,
-				   CORE_ADDR addr, int len)
+				   CORE_ADDR addr, int offset, int len,
+				   CORE_ADDR addr_orig)
 {
   int i, num_regs, is_watchpoint;
   unsigned int ctrl, *dr_ctrl_p, *dr_ref_count;
-  CORE_ADDR *dr_addr_p;
+  CORE_ADDR *dr_addr_p, *dr_addr_orig_p;
 
   /* Set up state pointers.  */
   is_watchpoint = (type != hw_execute);
@@ -415,6 +519,7 @@ aarch64_dr_state_remove_one_point (struct aarch64_debug_reg_state *state,
     {
       num_regs = aarch64_num_wp_regs;
       dr_addr_p = state->dr_addr_wp;
+      dr_addr_orig_p = state->dr_addr_orig_wp;
       dr_ctrl_p = state->dr_ctrl_wp;
       dr_ref_count = state->dr_ref_count_wp;
     }
@@ -422,15 +527,18 @@ aarch64_dr_state_remove_one_point (struct aarch64_debug_reg_state *state,
     {
       num_regs = aarch64_num_bp_regs;
       dr_addr_p = state->dr_addr_bp;
+      dr_addr_orig_p = nullptr;
       dr_ctrl_p = state->dr_ctrl_bp;
       dr_ref_count = state->dr_ref_count_bp;
     }
 
-  ctrl = aarch64_point_encode_ctrl_reg (type, len);
+  ctrl = aarch64_point_encode_ctrl_reg (type, offset, len);
 
   /* Find the entry that matches the ADDR and CTRL.  */
   for (i = 0; i < num_regs; ++i)
-    if (dr_addr_p[i] == addr && dr_ctrl_p[i] == ctrl)
+    if (dr_addr_p[i] == addr
+	&& (dr_addr_orig_p == nullptr || dr_addr_orig_p[i] == addr_orig)
+	&& dr_ctrl_p[i] == ctrl)
       {
 	gdb_assert (dr_ref_count[i] != 0);
 	break;
@@ -446,6 +554,8 @@ aarch64_dr_state_remove_one_point (struct aarch64_debug_reg_state *state,
       /* Clear the enable bit.  */
       ctrl &= ~1;
       dr_addr_p[i] = 0;
+      if (dr_addr_orig_p != nullptr)
+	dr_addr_orig_p[i] = 0;
       dr_ctrl_p[i] = ctrl;
       /* Notify the change.  */
       aarch64_notify_debug_reg_change (state, is_watchpoint, i);
@@ -472,10 +582,10 @@ aarch64_handle_breakpoint (enum target_hw_bp_type type, CORE_ADDR addr,
       if (!aarch64_point_is_aligned (0 /* is_watchpoint */ , addr, len))
 	return -1;
 
-      return aarch64_dr_state_insert_one_point (state, type, addr, len);
+      return aarch64_dr_state_insert_one_point (state, type, addr, 0, len, -1);
     }
   else
-    return aarch64_dr_state_remove_one_point (state, type, addr, len);
+    return aarch64_dr_state_remove_one_point (state, type, addr, 0, len, -1);
 }
 
 /* This is essentially the same as aarch64_handle_breakpoint, apart
@@ -487,9 +597,9 @@ aarch64_handle_aligned_watchpoint (enum target_hw_bp_type type,
 				   struct aarch64_debug_reg_state *state)
 {
   if (is_insert)
-    return aarch64_dr_state_insert_one_point (state, type, addr, len);
+    return aarch64_dr_state_insert_one_point (state, type, addr, 0, len, addr);
   else
-    return aarch64_dr_state_remove_one_point (state, type, addr, len);
+    return aarch64_dr_state_remove_one_point (state, type, addr, 0, len, addr);
 }
 
 /* Insert/remove unaligned watchpoint by calling
@@ -504,29 +614,42 @@ aarch64_handle_unaligned_watchpoint (enum target_hw_bp_type type,
 				     CORE_ADDR addr, int len, int is_insert,
 				     struct aarch64_debug_reg_state *state)
 {
+  CORE_ADDR addr_orig = addr;
+
   while (len > 0)
     {
       CORE_ADDR aligned_addr;
-      int aligned_len, ret;
+      int aligned_offset, aligned_len, ret;
+      CORE_ADDR addr_orig_next = addr_orig;
 
-      aarch64_align_watchpoint (addr, len, &aligned_addr, &aligned_len,
-				&addr, &len);
+      aarch64_align_watchpoint (addr, len, &aligned_addr, &aligned_offset,
+				&aligned_len, &addr, &len, &addr_orig_next);
 
       if (is_insert)
 	ret = aarch64_dr_state_insert_one_point (state, type, aligned_addr,
-						 aligned_len);
+						 aligned_offset,
+						 aligned_len, addr_orig);
       else
 	ret = aarch64_dr_state_remove_one_point (state, type, aligned_addr,
-						 aligned_len);
+						 aligned_offset,
+						 aligned_len, addr_orig);
 
       if (show_debug_regs)
 	debug_printf ("handle_unaligned_watchpoint: is_insert: %d\n"
 		      "                             "
 		      "aligned_addr: %s, aligned_len: %d\n"
 		      "                                "
-		      "next_addr: %s,    next_len: %d\n",
+		      "addr_orig: %s\n"
+		      "                                "
+		      "next_addr: %s,    next_len: %d\n"
+		      "                           "
+		      "addr_orig_next: %s\n",
 		      is_insert, core_addr_to_string_nz (aligned_addr),
-		      aligned_len, core_addr_to_string_nz (addr), len);
+		      aligned_len, core_addr_to_string_nz (addr_orig),
+		      core_addr_to_string_nz (addr), len,
+		      core_addr_to_string_nz (addr_orig_next));
+
+      addr_orig = addr_orig_next;
 
       if (ret != 0)
 	return ret;
@@ -552,7 +675,7 @@ aarch64_handle_watchpoint (enum target_hw_bp_type type, CORE_ADDR addr,
    registers with data from *STATE.  */
 
 void
-aarch64_linux_set_debug_regs (const struct aarch64_debug_reg_state *state,
+aarch64_linux_set_debug_regs (struct aarch64_debug_reg_state *state,
 			      int tid, int watchpoint)
 {
   int i, count;
@@ -580,7 +703,18 @@ aarch64_linux_set_debug_regs (const struct aarch64_debug_reg_state *state,
   if (ptrace (PTRACE_SETREGSET, tid,
 	      watchpoint ? NT_ARM_HW_WATCH : NT_ARM_HW_BREAK,
 	      (void *) &iov))
-    error (_("Unexpected error setting hardware debug registers"));
+    {
+      /* Handle Linux kernels with the PR external/20207 bug.  */
+      if (watchpoint && errno == EINVAL
+	  && kernel_supports_any_contiguous_range)
+	{
+	  kernel_supports_any_contiguous_range = false;
+	  aarch64_downgrade_regs (state);
+	  aarch64_linux_set_debug_regs (state, tid, watchpoint);
+	  return;
+	}
+      error (_("Unexpected error setting hardware debug registers"));
+    }
 }
 
 /* Print the values of the cached breakpoint/watchpoint registers.  */
@@ -611,8 +745,9 @@ aarch64_show_debug_reg_state (struct aarch64_debug_reg_state *state,
 
   debug_printf ("\tWATCHPOINTs:\n");
   for (i = 0; i < aarch64_num_wp_regs; i++)
-    debug_printf ("\tWP%d: addr=%s, ctrl=0x%08x, ref.count=%d\n",
+    debug_printf ("\tWP%d: addr=%s (orig=%s), ctrl=0x%08x, ref.count=%d\n",
 		  i, core_addr_to_string_nz (state->dr_addr_wp[i]),
+		  core_addr_to_string_nz (state->dr_addr_orig_wp[i]),
 		  state->dr_ctrl_wp[i], state->dr_ref_count_wp[i]);
 }
 
diff --git a/gdb/nat/aarch64-linux-hw-point.h b/gdb/nat/aarch64-linux-hw-point.h
index 7c42b96d1b0..1940b06a89e 100644
--- a/gdb/nat/aarch64-linux-hw-point.h
+++ b/gdb/nat/aarch64-linux-hw-point.h
@@ -77,13 +77,13 @@
 
    31                             13          5      3      1     0
    +--------------------------------+----------+------+------+----+
-   |         RESERVED (SBZ)         |  LENGTH  | TYPE | PRIV | EN |
+   |         RESERVED (SBZ)         |   MASK   | TYPE | PRIV | EN |
    +--------------------------------+----------+------+------+----+
 
    The TYPE field is ignored for breakpoints.  */
 
 #define DR_CONTROL_ENABLED(ctrl)	(((ctrl) & 0x1) == 1)
-#define DR_CONTROL_LENGTH(ctrl)		(((ctrl) >> 5) & 0xff)
+#define DR_CONTROL_MASK(ctrl)		(((ctrl) >> 5) & 0xff)
 
 /* Each bit of a variable of this type is used to indicate whether a
    hardware breakpoint or watchpoint setting has been changed since
@@ -147,7 +147,10 @@ struct aarch64_debug_reg_state
   unsigned int dr_ref_count_bp[AARCH64_HBP_MAX_NUM];
 
   /* hardware watchpoint */
+  /* Address aligned down to AARCH64_HWP_ALIGNMENT.  */
   CORE_ADDR dr_addr_wp[AARCH64_HWP_MAX_NUM];
+  /* Address as entered by user without any forced alignment.  */
+  CORE_ADDR dr_addr_orig_wp[AARCH64_HWP_MAX_NUM];
   unsigned int dr_ctrl_wp[AARCH64_HWP_MAX_NUM];
   unsigned int dr_ref_count_wp[AARCH64_HWP_MAX_NUM];
 };
@@ -166,6 +169,7 @@ struct arch_lwp_info
 extern int aarch64_num_bp_regs;
 extern int aarch64_num_wp_regs;
 
+unsigned int aarch64_watchpoint_offset (unsigned int ctrl);
 unsigned int aarch64_watchpoint_length (unsigned int ctrl);
 
 int aarch64_handle_breakpoint (enum target_hw_bp_type type, CORE_ADDR addr,
@@ -175,7 +179,7 @@ int aarch64_handle_watchpoint (enum target_hw_bp_type type, CORE_ADDR addr,
 			       int len, int is_insert,
 			       struct aarch64_debug_reg_state *state);
 
-void aarch64_linux_set_debug_regs (const struct aarch64_debug_reg_state *state,
+void aarch64_linux_set_debug_regs (struct aarch64_debug_reg_state *state,
 				   int tid, int watchpoint);
 
 void aarch64_show_debug_reg_state (struct aarch64_debug_reg_state *state,
diff --git a/gdb/testsuite/gdb.base/watchpoint-unaligned.c b/gdb/testsuite/gdb.base/watchpoint-unaligned.c
new file mode 100644
index 00000000000..ea844e947f6
--- /dev/null
+++ b/gdb/testsuite/gdb.base/watchpoint-unaligned.c
@@ -0,0 +1,68 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017-2018 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdint.h>
+#include <assert.h>
+
+static int again;
+
+static volatile struct
+{
+  uint64_t alignment;
+  union
+    {
+      uint64_t size8[1];
+      uint32_t size4[2];
+      uint16_t size2[4];
+      uint8_t size1[8];
+    }
+  u;
+} data;
+
+static int size = 0;
+static int offset;
+
+int
+main (void)
+{
+  volatile uint64_t local;
+
+  assert (sizeof (data) == 8 + 8);
+  while (size)
+    {
+      switch (size)
+	{
+	case 8:
+	  local = data.u.size8[offset];
+	  break;
+	case 4:
+	  local = data.u.size4[offset];
+	  break;
+	case 2:
+	  local = data.u.size2[offset];
+	  break;
+	case 1:
+	  local = data.u.size1[offset];
+	  break;
+	default:
+	  assert (0);
+	}
+      size = 0;
+      size = size; /* start_again */
+    }
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/watchpoint-unaligned.exp b/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
new file mode 100644
index 00000000000..833997b04d7
--- /dev/null
+++ b/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
@@ -0,0 +1,116 @@
+# Copyright 2017-2018 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+# This file is part of the gdb testsuite.
+
+# Test inserting read watchpoints on unaligned addresses.
+
+standard_testfile
+if { [prepare_for_testing ${testfile}.exp ${testfile} ${srcfile}] } {
+    return -1
+}
+
+if ![runto_main] {
+    untested "could not run to main"
+    return -1
+}
+
+gdb_breakpoint [gdb_get_line_number "start_again"] "Breakpoint $decimal at $hex" "start_again"
+
+set sizes {1 2 4 8}
+array set alignedend {1 1  2 2  3 4  4 4  5 8  6 8  7 8  8 8}
+
+foreach wpsize $sizes {
+    for {set wpoffset 0} {$wpoffset < 8 / $wpsize} {incr wpoffset} {
+	set wpstart [expr $wpoffset * $wpsize]
+	set wpend [expr ($wpoffset + 1) * $wpsize]
+	set wpendaligned $alignedend($wpend)
+	foreach rdsize $sizes {
+	    for {set rdoffset 0} {$rdoffset < 8 / $rdsize} {incr rdoffset} {
+		set rdstart [expr $rdoffset * $rdsize]
+		set rdend [expr ($rdoffset + 1) * $rdsize]
+		set expect_hit [expr max ($wpstart, $rdstart) < min ($wpend, $rdend)]
+		set test "rwatch data.u.size$wpsize\[$wpoffset\]"
+		set wpnum ""
+		gdb_test_multiple $test $test {
+		    -re "Hardware read watchpoint (\[0-9\]+): .*\r\n$gdb_prompt $" {
+			set wpnum $expect_out(1,string)
+		    }
+		}
+		gdb_test_no_output "set variable size = $rdsize" ""
+		gdb_test_no_output "set variable offset = $rdoffset" ""
+		set test "continue"
+		set got_hit 0
+		gdb_test_multiple $test $test {
+		    -re "Hardware read watchpoint $wpnum:.*Value = .*\r\n$gdb_prompt $" {
+			set got_hit 1
+			send_gdb "continue\n"
+			exp_continue
+		    }
+		    -re " start_again .*\r\n$gdb_prompt $" {
+		    }
+		}
+		gdb_test_no_output "delete $wpnum" ""
+		set test "wp(size=$wpsize offset=$wpoffset) rd(size=$rdsize offset=$rdoffset) expect=$expect_hit"
+		if {$expect_hit == $got_hit} {
+		    pass $test
+		} else {
+		    # We do not know if we run on a fixed Linux kernel
+		    # or not.  Report XFAIL only in the FAIL case.
+		    if {$expect_hit == 0 && $rdstart < $wpendaligned} {
+			setup_xfail external/20207 "aarch64*-*-linux*"
+		    }
+		    fail $test
+		}
+	    }
+	}
+    }
+}
+
+foreach wpcount {4 7} {
+    array set wpoffset_to_wpnum {}
+    for {set wpoffset 1} {$wpoffset <= $wpcount} {incr wpoffset} {
+	set test "rwatch data.u.size1\[$wpoffset\]"
+	set wpnum ""
+	gdb_test_multiple $test $test {
+	    -re "Hardware read watchpoint (\[0-9\]+): .*\r\n$gdb_prompt $" {
+		set wpoffset_to_wpnum($wpoffset) $expect_out(1,string)
+	    }
+	}
+    }
+    gdb_test_no_output "set variable size = 1" ""
+    gdb_test_no_output "set variable offset = 1" ""
+    set test "continue"
+    set got_hit 0
+    gdb_test_multiple $test $test {
+	-re "\r\nCould not insert hardware watchpoint .*\r\n$gdb_prompt $" {
+	}
+	-re "Hardware read watchpoint $wpoffset_to_wpnum(1):.*Value = .*\r\n$gdb_prompt $" {
+	    set got_hit 1
+	    send_gdb "continue\n"
+	    exp_continue
+	}
+	-re " start_again .*\r\n$gdb_prompt $" {
+	}
+    }
+    for {set wpoffset 1} {$wpoffset <= $wpcount} {incr wpoffset} {
+	gdb_test_no_output "delete $wpoffset_to_wpnum($wpoffset)" ""
+    }
+    set test "wpcount($wpcount)"
+    if {$wpcount > 4} {
+	setup_kfail tdep/22389 *-*-*
+    }
+    gdb_assert $got_hit $test
+}
diff --git a/gdb/utils.c b/gdb/utils.c
index b957b0dc5c2..e274f026764 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -2849,22 +2849,6 @@ gdb_realpath_tests ()
 
 #endif /* GDB_SELF_TEST */
 
-ULONGEST
-align_up (ULONGEST v, int n)
-{
-  /* Check that N is really a power of two.  */
-  gdb_assert (n && (n & (n-1)) == 0);
-  return (v + n - 1) & -n;
-}
-
-ULONGEST
-align_down (ULONGEST v, int n)
-{
-  /* Check that N is really a power of two.  */
-  gdb_assert (n && (n & (n-1)) == 0);
-  return (v & -n);
-}
-
 /* Allocation function for the libiberty hash table which uses an
    obstack.  The obstack is passed as DATA.  */
 
diff --git a/gdb/utils.h b/gdb/utils.h
index 4dec889d83e..c728449429e 100644
--- a/gdb/utils.h
+++ b/gdb/utils.h
@@ -494,38 +494,6 @@ extern pid_t wait_to_die_with_timeout (pid_t pid, int *status, int timeout);
 
 extern int myread (int, char *, int);
 
-/* Ensure that V is aligned to an N byte boundary (B's assumed to be a
-   power of 2).  Round up/down when necessary.  Examples of correct
-   use include:
-
-   addr = align_up (addr, 8); -- VALUE needs 8 byte alignment
-   write_memory (addr, value, len);
-   addr += len;
-
-   and:
-
-   sp = align_down (sp - len, 16); -- Keep SP 16 byte aligned
-   write_memory (sp, value, len);
-
-   Note that uses such as:
-
-   write_memory (addr, value, len);
-   addr += align_up (len, 8);
-
-   and:
-
-   sp -= align_up (len, 8);
-   write_memory (sp, value, len);
-
-   are typically not correct as they don't ensure that the address (SP
-   or ADDR) is correctly aligned (relying on previous alignment to
-   keep things right).  This is also why the methods are called
-   "align_..." instead of "round_..." as the latter reads better with
-   this incorrect coding style.  */
-
-extern ULONGEST align_up (ULONGEST v, int n);
-extern ULONGEST align_down (ULONGEST v, int n);
-
 /* Resource limits used by getrlimit and setrlimit.  */
 
 enum resource_limit_kind
-- 
2.14.3
Eli Zaretskii April 20, 2018, 3:54 p.m. | #3
> Cc: gdb-patches@sourceware.org

> From: Pedro Alves <palves@redhat.com>

> Date: Fri, 20 Apr 2018 15:49:39 +0100

> 

> diff --git a/gdb/NEWS b/gdb/NEWS

> index 25c404bfc37..38043c4ff2e 100644

> --- a/gdb/NEWS

> +++ b/gdb/NEWS

> @@ -35,6 +35,16 @@ SH-5/SH64 ELF			sh64-*-elf*, SH-5/SH64 support in sh*

>  SH-5/SH64 running GNU/Linux	SH-5/SH64 support in sh*-*-linux*

>  SH-5/SH64 running OpenBSD 	SH-5/SH64 support in sh*-*-openbsd*

>  

> +* Aarch64/Linux hardware watchpoints improvements

> +

> +  Hardware watchpoints on unaligned addresses are now properly

> +  supported when running Linux kernel 4.10 or higher: read and access

> +  watchpoints are no longer spuriously missed, and all watchpoints

> +  lengths between 1 and 8 bytes are supported.  On older kernels,

> +  watchpoints set on unaligned addresses are no longer missed, with

> +  the tradeoff that there is a possibility of false hits being

> +  reported.


I wonder whether we need this NEWS entry.  We don't normally call out
bugfixes there, do we?

Thanks.
Pedro Alves April 20, 2018, 3:59 p.m. | #4
On 04/20/2018 04:54 PM, Eli Zaretskii wrote:

>> diff --git a/gdb/NEWS b/gdb/NEWS

>> index 25c404bfc37..38043c4ff2e 100644

>> --- a/gdb/NEWS

>> +++ b/gdb/NEWS

>> @@ -35,6 +35,16 @@ SH-5/SH64 ELF			sh64-*-elf*, SH-5/SH64 support in sh*

>>  SH-5/SH64 running GNU/Linux	SH-5/SH64 support in sh*-*-linux*

>>  SH-5/SH64 running OpenBSD 	SH-5/SH64 support in sh*-*-openbsd*

>>  

>> +* Aarch64/Linux hardware watchpoints improvements

>> +

>> +  Hardware watchpoints on unaligned addresses are now properly

>> +  supported when running Linux kernel 4.10 or higher: read and access

>> +  watchpoints are no longer spuriously missed, and all watchpoints

>> +  lengths between 1 and 8 bytes are supported.  On older kernels,

>> +  watchpoints set on unaligned addresses are no longer missed, with

>> +  the tradeoff that there is a possibility of false hits being

>> +  reported.

> 

> I wonder whether we need this NEWS entry.  We don't normally call out

> bugfixes there, do we?


Yao requested one in an earlier review.  I'm borderline about it
myself, with no strong opinion.
There's a tradeoff here, which may be worth advertising, and the
mention of kernel version might be useful too.

Thanks,
Pedro Alves
Jan Kratochvil April 26, 2018, 8:12 p.m. | #5
On Fri, 20 Apr 2018 16:49:39 +0200, Pedro Alves wrote:
> ~~~~~~~~~~~~

> Previously, when the hardware reported a watchpoint hit on an address

> that did not match our watchpoint (this happens in case of instructions

> which access large chunks of memory such as "stp") the process would

> enter a loop where we would be continually resuming it (because we did

> not recognise that watchpoint hit) and it would keep hitting the

> watchpoint again and again. The tracing process would never get

> notified of the watchpoint hit.

> ~~~~~~~~~~~~

> 

> ... I'm left with the impression that ADDR_TRAP could be even

> lower than addr_watch_aligned, in which case we'll still miss

> watchpoints.  I wondering whether GDB should be using a similar

> trick as that kernel patch does.


This is new for me what you found.  I just did not expect the changed region
region could be larger than aligned 8 bytes.

Unfortunately I cannot reproduce that so I cannot do much with that.
Does anyone know how to reproduce it?


Thanks,
Jan


I was unable to make GCC use the "stp" instruction so I hand-edited it there:
# gcc -o w2.S w2.c -Wall -g -O3 -S -dA
# output attached+edited
gcc -o w2 w2.S -Wall 

aarch64-7s-rhel-alt-v1.ss.eng.rdu.redhat.com
kernel-4.14.0-49.el7a.aarch64

../gdb -data-directory ../data-directory/ ./w2 -batch -ex 'b main' -ex r -ex 'p &g1' -ex 'p &g2' -ex 'b 8' -ex 'watch g2' -ex 'set debug infrun 1' -ex c -ex disas
Breakpoint 1 at 0x400458: file w2.c, line 7.
Breakpoint 1, main () at w2.c:7
7	  f(1,2);
$1 = (long *) 0x420030 <g1>
$2 = (long *) 0x420038 <g2>
Breakpoint 2 at 0x400464: file w2.c, line 9.
Hardware watchpoint 3: g2
infrun: clear_proceed_status_thread (process 30847)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: step-over queue now empty
infrun: resuming [process 30847] for step-over
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [process 30847] at 0x400458
infrun: infrun_async(1)
infrun: prepare_to_wait
infrun: proceed: [process 30847] resumed
infrun: target_wait (-1.0.0, status) =
infrun:   30847.30847.0 [process 30847],
infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x40045c
infrun: no stepping, continue
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 30847] at 0x40045c
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun:   30847.30847.0 [process 30847],
infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x4005d8
infrun: stopped by watchpoint
infrun: stopped data address = 0x420038
infrun: stop_all_threads
infrun: stop_all_threads, pass=0, iterations=0
infrun:   process 30847 not executing
infrun: stop_all_threads, pass=1, iterations=1
infrun:   process 30847 not executing
infrun: stop_all_threads done
infrun: stepping past non-steppable watchpoint. skipping watchpoint at 0x420038:8
infrun: stepping past non-steppable watchpoint. skipping watchpoint at 0x420038:8
infrun: stepping past non-steppable watchpoint. skipping watchpoint at 0x420038:8
infrun: stepping past non-steppable watchpoint. skipping watchpoint at 0x420038:8
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [process 30847] at 0x4005d8
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun:   30847.30847.0 [process 30847],
infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: TARGET_WAITKIND_STOPPED
infrun: clear_step_over_info
infrun: restart threads: [process 30847] is event thread
infrun: stop_pc = 0x4005dc
infrun: BPSTAT_WHAT_STOP_NOISY
infrun: stop_waiting
infrun: stop_all_threads
infrun: stop_all_threads, pass=0, iterations=0
infrun:   process 30847 not executing
infrun: stop_all_threads, pass=1, iterations=1
infrun:   process 30847 not executing
infrun: stop_all_threads done
Hardware watchpoint 3: g2
Old value = 0
New value = 2
f (p1=p1@entry=1, p2=p2@entry=2) at w2.c:5
5	}
infrun: infrun_async(0)
Dump of assembler code for function f:
   0x00000000004005d0 <+0>:	adrp	x2, 0x420000 <__libc_start_main@got.plt>
   0x00000000004005d4 <+4>:	add	x2, x2, #0x30
   0x00000000004005d8 <+8>:	stp	x0, x1, [x2]
=> 0x00000000004005dc <+12>:	ret
End of assembler dump.
.cpu generic+fp+simd
	.file	"w2.c"
	.text
.Ltext0:
	.align	2
	.global	f
	.type	f, %function
f:
.LFB0:
	.file 1 "w2.c"
	// w2.c:2
	.loc 1 2 0
	.cfi_startproc
.LVL0:
// BLOCK 2 freq:10000 seq:0
// PRED: ENTRY [100.0%]  (FALLTHRU)
	// w2.c:3
	.loc 1 3 0
	adrp	x2, .LANCHOR0
	add	x2, x2, :lo12:.LANCHOR0
#if 0
	str	x0, [x2]
	// w2.c:4
	.loc 1 4 0
	str	x1, [x2,8]
#else
	stp	x0, x1, [x2]
#endif
// SUCC: EXIT [100.0%] 
	// w2.c:5
	.loc 1 5 0
	ret
	.cfi_endproc
.LFE0:
	.size	f, .-f
	.section	.text.startup,"ax",%progbits
	.align	2
	.global	main
	.type	main, %function
main:
.LFB1:
	// w2.c:6
	.loc 1 6 0
	.cfi_startproc
// BLOCK 2 freq:10000 seq:0
// PRED: ENTRY [100.0%]  (FALLTHRU)
	stp	x29, x30, [sp, -16]!
	.cfi_def_cfa_offset 16
	.cfi_offset 29, -16
	.cfi_offset 30, -8
	add	x29, sp, 0
	.cfi_def_cfa_register 29
	// w2.c:7
	.loc 1 7 0
	mov	x0, 1
	mov	x1, 2
	bl	f
.LVL1:
	// w2.c:9
	.loc 1 9 0
	mov	w0, 0
	ldp	x29, x30, [sp], 16
	.cfi_restore 30
	.cfi_restore 29
	.cfi_def_cfa 31, 0
// SUCC: EXIT [100.0%] 
	ret
	.cfi_endproc
.LFE1:
	.size	main, .-main
	.bss
	.align	3
	.zero	8
.LANCHOR0 = . + 0
	.type	g1, %object
	.size	g1, 8
g1:
	.zero	8
	.type	g2, %object
	.size	g2, 8
g2:
	.zero	8
	.text
.Letext0:
	.section	.debug_info,"",%progbits
.Ldebug_info0:
	.4byte	0xca	// Length of Compilation Unit Info
	.2byte	0x4	// DWARF version number
	.4byte	.Ldebug_abbrev0	// Offset Into Abbrev. Section
	.byte	0x8	// Pointer Size (in bytes)
	.uleb128 0x1	// (DIE (0xb) DW_TAG_compile_unit)
	.4byte	.LASF1	// DW_AT_producer: "GNU C 4.8.5 20150623 (Red Hat 4.8.5-28) -g -O3"
	.byte	0x1	// DW_AT_language
	.4byte	.LASF2	// DW_AT_name: "w2.c"
	.4byte	.LASF3	// DW_AT_comp_dir: "/root/jkratoch/redhat/gdb-git/gdb/testsuite"
	.4byte	.Ldebug_ranges0+0	// DW_AT_ranges
	.8byte	0	// DW_AT_low_pc
	.4byte	.Ldebug_line0	// DW_AT_stmt_list
	.uleb128 0x2	// (DIE (0x29) DW_TAG_subprogram)
			// DW_AT_external
	.ascii "f\0"	// DW_AT_name
	.byte	0x1	// DW_AT_decl_file (w2.c)
	.byte	0x2	// DW_AT_decl_line
			// DW_AT_prototyped
	.8byte	.LFB0	// DW_AT_low_pc
	.8byte	.LFE0-.LFB0	// DW_AT_high_pc
	.uleb128 0x1	// DW_AT_frame_base
	.byte	0x9c	// DW_OP_call_frame_cfa
			// DW_AT_GNU_all_call_sites
	.4byte	0x5d	// DW_AT_sibling
	.uleb128 0x3	// (DIE (0x44) DW_TAG_formal_parameter)
	.ascii "p1\0"	// DW_AT_name
	.byte	0x1	// DW_AT_decl_file (w2.c)
	.byte	0x2	// DW_AT_decl_line
	.4byte	0x5d	// DW_AT_type
	.uleb128 0x1	// DW_AT_location
	.byte	0x50	// DW_OP_reg0
	.uleb128 0x3	// (DIE (0x50) DW_TAG_formal_parameter)
	.ascii "p2\0"	// DW_AT_name
	.byte	0x1	// DW_AT_decl_file (w2.c)
	.byte	0x2	// DW_AT_decl_line
	.4byte	0x5d	// DW_AT_type
	.uleb128 0x1	// DW_AT_location
	.byte	0x51	// DW_OP_reg1
	.byte	0	// end of children of DIE 0x29
	.uleb128 0x4	// (DIE (0x5d) DW_TAG_base_type)
	.byte	0x8	// DW_AT_byte_size
	.byte	0x5	// DW_AT_encoding
	.4byte	.LASF0	// DW_AT_name: "long int"
	.uleb128 0x5	// (DIE (0x64) DW_TAG_subprogram)
			// DW_AT_external
	.4byte	.LASF4	// DW_AT_name: "main"
	.byte	0x1	// DW_AT_decl_file (w2.c)
	.byte	0x6	// DW_AT_decl_line
			// DW_AT_prototyped
	.4byte	0x9e	// DW_AT_type
	.8byte	.LFB1	// DW_AT_low_pc
	.8byte	.LFE1-.LFB1	// DW_AT_high_pc
	.uleb128 0x1	// DW_AT_frame_base
	.byte	0x9c	// DW_OP_call_frame_cfa
			// DW_AT_GNU_all_call_sites
	.4byte	0x9e	// DW_AT_sibling
	.uleb128 0x6	// (DIE (0x85) DW_TAG_GNU_call_site)
	.8byte	.LVL1	// DW_AT_low_pc
	.4byte	0x29	// DW_AT_abstract_origin
	.uleb128 0x7	// (DIE (0x92) DW_TAG_GNU_call_site_parameter)
	.uleb128 0x1	// DW_AT_location
	.byte	0x51	// DW_OP_reg1
	.uleb128 0x1	// DW_AT_GNU_call_site_value
	.byte	0x32	// DW_OP_lit2
	.uleb128 0x7	// (DIE (0x97) DW_TAG_GNU_call_site_parameter)
	.uleb128 0x1	// DW_AT_location
	.byte	0x50	// DW_OP_reg0
	.uleb128 0x1	// DW_AT_GNU_call_site_value
	.byte	0x31	// DW_OP_lit1
	.byte	0	// end of children of DIE 0x85
	.byte	0	// end of children of DIE 0x64
	.uleb128 0x8	// (DIE (0x9e) DW_TAG_base_type)
	.byte	0x4	// DW_AT_byte_size
	.byte	0x5	// DW_AT_encoding
	.ascii "int\0"	// DW_AT_name
	.uleb128 0x9	// (DIE (0xa5) DW_TAG_variable)
	.ascii "g1\0"	// DW_AT_name
	.byte	0x1	// DW_AT_decl_file (w2.c)
	.byte	0x1	// DW_AT_decl_line
	.4byte	0x5d	// DW_AT_type
	.uleb128 0x9	// DW_AT_location
	.byte	0x3	// DW_OP_addr
	.8byte	g1
	.uleb128 0x9	// (DIE (0xb9) DW_TAG_variable)
	.ascii "g2\0"	// DW_AT_name
	.byte	0x1	// DW_AT_decl_file (w2.c)
	.byte	0x1	// DW_AT_decl_line
	.4byte	0x5d	// DW_AT_type
	.uleb128 0x9	// DW_AT_location
	.byte	0x3	// DW_OP_addr
	.8byte	g2
	.byte	0	// end of children of DIE 0xb
	.section	.debug_abbrev,"",%progbits
.Ldebug_abbrev0:
	.uleb128 0x1	// (abbrev code)
	.uleb128 0x11	// (TAG: DW_TAG_compile_unit)
	.byte	0x1	// DW_children_yes
	.uleb128 0x25	// (DW_AT_producer)
	.uleb128 0xe	// (DW_FORM_strp)
	.uleb128 0x13	// (DW_AT_language)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3	// (DW_AT_name)
	.uleb128 0xe	// (DW_FORM_strp)
	.uleb128 0x1b	// (DW_AT_comp_dir)
	.uleb128 0xe	// (DW_FORM_strp)
	.uleb128 0x55	// (DW_AT_ranges)
	.uleb128 0x17	// (DW_FORM_sec_offset)
	.uleb128 0x11	// (DW_AT_low_pc)
	.uleb128 0x1	// (DW_FORM_addr)
	.uleb128 0x10	// (DW_AT_stmt_list)
	.uleb128 0x17	// (DW_FORM_sec_offset)
	.byte	0
	.byte	0
	.uleb128 0x2	// (abbrev code)
	.uleb128 0x2e	// (TAG: DW_TAG_subprogram)
	.byte	0x1	// DW_children_yes
	.uleb128 0x3f	// (DW_AT_external)
	.uleb128 0x19	// (DW_FORM_flag_present)
	.uleb128 0x3	// (DW_AT_name)
	.uleb128 0x8	// (DW_FORM_string)
	.uleb128 0x3a	// (DW_AT_decl_file)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3b	// (DW_AT_decl_line)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x27	// (DW_AT_prototyped)
	.uleb128 0x19	// (DW_FORM_flag_present)
	.uleb128 0x11	// (DW_AT_low_pc)
	.uleb128 0x1	// (DW_FORM_addr)
	.uleb128 0x12	// (DW_AT_high_pc)
	.uleb128 0x7	// (DW_FORM_data8)
	.uleb128 0x40	// (DW_AT_frame_base)
	.uleb128 0x18	// (DW_FORM_exprloc)
	.uleb128 0x2117	// (DW_AT_GNU_all_call_sites)
	.uleb128 0x19	// (DW_FORM_flag_present)
	.uleb128 0x1	// (DW_AT_sibling)
	.uleb128 0x13	// (DW_FORM_ref4)
	.byte	0
	.byte	0
	.uleb128 0x3	// (abbrev code)
	.uleb128 0x5	// (TAG: DW_TAG_formal_parameter)
	.byte	0	// DW_children_no
	.uleb128 0x3	// (DW_AT_name)
	.uleb128 0x8	// (DW_FORM_string)
	.uleb128 0x3a	// (DW_AT_decl_file)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3b	// (DW_AT_decl_line)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x49	// (DW_AT_type)
	.uleb128 0x13	// (DW_FORM_ref4)
	.uleb128 0x2	// (DW_AT_location)
	.uleb128 0x18	// (DW_FORM_exprloc)
	.byte	0
	.byte	0
	.uleb128 0x4	// (abbrev code)
	.uleb128 0x24	// (TAG: DW_TAG_base_type)
	.byte	0	// DW_children_no
	.uleb128 0xb	// (DW_AT_byte_size)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3e	// (DW_AT_encoding)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3	// (DW_AT_name)
	.uleb128 0xe	// (DW_FORM_strp)
	.byte	0
	.byte	0
	.uleb128 0x5	// (abbrev code)
	.uleb128 0x2e	// (TAG: DW_TAG_subprogram)
	.byte	0x1	// DW_children_yes
	.uleb128 0x3f	// (DW_AT_external)
	.uleb128 0x19	// (DW_FORM_flag_present)
	.uleb128 0x3	// (DW_AT_name)
	.uleb128 0xe	// (DW_FORM_strp)
	.uleb128 0x3a	// (DW_AT_decl_file)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3b	// (DW_AT_decl_line)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x27	// (DW_AT_prototyped)
	.uleb128 0x19	// (DW_FORM_flag_present)
	.uleb128 0x49	// (DW_AT_type)
	.uleb128 0x13	// (DW_FORM_ref4)
	.uleb128 0x11	// (DW_AT_low_pc)
	.uleb128 0x1	// (DW_FORM_addr)
	.uleb128 0x12	// (DW_AT_high_pc)
	.uleb128 0x7	// (DW_FORM_data8)
	.uleb128 0x40	// (DW_AT_frame_base)
	.uleb128 0x18	// (DW_FORM_exprloc)
	.uleb128 0x2117	// (DW_AT_GNU_all_call_sites)
	.uleb128 0x19	// (DW_FORM_flag_present)
	.uleb128 0x1	// (DW_AT_sibling)
	.uleb128 0x13	// (DW_FORM_ref4)
	.byte	0
	.byte	0
	.uleb128 0x6	// (abbrev code)
	.uleb128 0x4109	// (TAG: DW_TAG_GNU_call_site)
	.byte	0x1	// DW_children_yes
	.uleb128 0x11	// (DW_AT_low_pc)
	.uleb128 0x1	// (DW_FORM_addr)
	.uleb128 0x31	// (DW_AT_abstract_origin)
	.uleb128 0x13	// (DW_FORM_ref4)
	.byte	0
	.byte	0
	.uleb128 0x7	// (abbrev code)
	.uleb128 0x410a	// (TAG: DW_TAG_GNU_call_site_parameter)
	.byte	0	// DW_children_no
	.uleb128 0x2	// (DW_AT_location)
	.uleb128 0x18	// (DW_FORM_exprloc)
	.uleb128 0x2111	// (DW_AT_GNU_call_site_value)
	.uleb128 0x18	// (DW_FORM_exprloc)
	.byte	0
	.byte	0
	.uleb128 0x8	// (abbrev code)
	.uleb128 0x24	// (TAG: DW_TAG_base_type)
	.byte	0	// DW_children_no
	.uleb128 0xb	// (DW_AT_byte_size)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3e	// (DW_AT_encoding)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3	// (DW_AT_name)
	.uleb128 0x8	// (DW_FORM_string)
	.byte	0
	.byte	0
	.uleb128 0x9	// (abbrev code)
	.uleb128 0x34	// (TAG: DW_TAG_variable)
	.byte	0	// DW_children_no
	.uleb128 0x3	// (DW_AT_name)
	.uleb128 0x8	// (DW_FORM_string)
	.uleb128 0x3a	// (DW_AT_decl_file)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x3b	// (DW_AT_decl_line)
	.uleb128 0xb	// (DW_FORM_data1)
	.uleb128 0x49	// (DW_AT_type)
	.uleb128 0x13	// (DW_FORM_ref4)
	.uleb128 0x2	// (DW_AT_location)
	.uleb128 0x18	// (DW_FORM_exprloc)
	.byte	0
	.byte	0
	.byte	0
	.section	.debug_aranges,"",%progbits
	.4byte	0x3c	// Length of Address Ranges Info
	.2byte	0x2	// DWARF Version
	.4byte	.Ldebug_info0	// Offset of Compilation Unit Info
	.byte	0x8	// Size of Address
	.byte	0	// Size of Segment Descriptor
	.2byte	0	// Pad to 16 byte boundary
	.2byte	0
	.8byte	.Ltext0	// Address
	.8byte	.Letext0-.Ltext0	// Length
	.8byte	.LFB1	// Address
	.8byte	.LFE1-.LFB1	// Length
	.8byte	0
	.8byte	0
	.section	.debug_ranges,"",%progbits
.Ldebug_ranges0:
	.8byte	.Ltext0	// Offset 0
	.8byte	.Letext0
	.8byte	.LFB1	// Offset 0x10
	.8byte	.LFE1
	.8byte	0
	.8byte	0
	.section	.debug_line,"",%progbits
.Ldebug_line0:
	.section	.debug_str,"MS",%progbits,1
.LASF3:
	.string	"/root/jkratoch/redhat/gdb-git/gdb/testsuite"
.LASF2:
	.string	"w2.c"
.LASF0:
	.string	"long int"
.LASF4:
	.string	"main"
.LASF1:
	.string	"GNU C 4.8.5 20150623 (Red Hat 4.8.5-28) -g -O3"
	.ident	"GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-28)"
	.section	.note.GNU-stack,"",%progbits
Jan Kratochvil May 1, 2018, 11:03 p.m. | #6
On Thu, 26 Apr 2018 22:12:16 +0200, Jan Kratochvil wrote:
> On Fri, 20 Apr 2018 16:49:39 +0200, Pedro Alves wrote:

> > ~~~~~~~~~~~~

> > Previously, when the hardware reported a watchpoint hit on an address

> > that did not match our watchpoint (this happens in case of instructions

> > which access large chunks of memory such as "stp") the process would

> > enter a loop where we would be continually resuming it (because we did

> > not recognise that watchpoint hit) and it would keep hitting the

> > watchpoint again and again. The tracing process would never get

> > notified of the watchpoint hit.

> > ~~~~~~~~~~~~

> > 

> > ... I'm left with the impression that ADDR_TRAP could be even

> > lower than addr_watch_aligned, in which case we'll still miss

> > watchpoints.  I wondering whether GDB should be using a similar

> > trick as that kernel patch does.

> 

> This is new for me what you found.  I just did not expect the changed region

> region could be larger than aligned 8 bytes.

> 

> Unfortunately I cannot reproduce that so I cannot do much with that.

> Does anyone know how to reproduce it?


Attaching the change I made, I will repost the patch in a next mail.


Jan
diff --git a/gdb/testsuite/gdb.base/watchpoint-unaligned.c b/gdb/testsuite/gdb.base/watchpoint-unaligned.c
index ea844e9..97832a0 100644
--- a/gdb/testsuite/gdb.base/watchpoint-unaligned.c
+++ b/gdb/testsuite/gdb.base/watchpoint-unaligned.c
@@ -29,6 +29,7 @@ static volatile struct
       uint32_t size4[2];
       uint16_t size2[4];
       uint8_t size1[8];
+      uint64_t size8twice[2];
     }
   u;
 } data;
@@ -36,12 +37,32 @@ static volatile struct
 static int size = 0;
 static int offset;
 
+static void
+write_size8twice (void)
+{
+  static const uint64_t first = 1;
+  static const uint64_t second = 2;
+
+#ifdef __aarch64__
+  asm volatile ("stp %1, %2, [%0]"
+		: /* output */
+		: "r" (data.u.size8twice), "r" (first), "r" (second) /* input */
+		: "memory" /* clobber */);
+#else
+  data.u.size8twice[0] = first;
+  data.u.size8twice[1] = second;
+#endif
+}
+
 int
 main (void)
 {
   volatile uint64_t local;
 
-  assert (sizeof (data) == 8 + 8);
+  assert (sizeof (data) == 8 + 2 * 8);
+
+  write_size8twice ();
+
   while (size)
     {
       switch (size)
@@ -64,5 +85,5 @@ main (void)
       size = 0;
       size = size; /* start_again */
     }
-  return 0;
+  return 0; /* final_return */
 }
diff --git a/gdb/testsuite/gdb.base/watchpoint-unaligned.exp b/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
index 833997b..25a15a8 100644
--- a/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
+++ b/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
@@ -114,3 +114,29 @@ foreach wpcount {4 7} {
     }
     gdb_assert $got_hit $test
 }
+
+if ![runto_main] {
+    return -1
+}
+gdb_breakpoint [gdb_get_line_number "final_return"] "Breakpoint $decimal at $hex" "final_return"
+set test {watch data.u.size8twice[1]}
+set wpnum ""
+gdb_test_multiple $test $test {
+    -re "Hardware watchpoint (\[0-9\]+): .*\r\n$gdb_prompt $" {
+	set wpnum $expect_out(1,string)
+    }
+}
+set test "continue"
+set got_hit 0
+gdb_test_multiple $test $test {
+    -re "\r\nCould not insert hardware watchpoint .*\r\n$gdb_prompt $" {
+    }
+    -re "Hardware watchpoint $wpnum:.*New value = .*\r\n$gdb_prompt $" {
+	set got_hit 1
+	send_gdb "continue\n"
+	exp_continue
+    }
+    -re " final_return .*\r\n$gdb_prompt $" {
+    }
+}
+gdb_assert $got_hit "size8twice write"

Patch

diff --git a/gdb/NEWS b/gdb/NEWS
index fbf5591..fd1d664 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -18,6 +18,11 @@ 
   the remote inferior is started by the GDBserver, use the "unset
   environment" command.
 
+* Unaligned hardware watchpoints on aarch64-linux are now properly supported
+  with Linux kernel 4.10 and higher.  On older kernels some unaligned
+  watchpoints are no longer missed - but other unaligned watchpoints may make
+  false hits now.
+
 * Python Scripting
 
   ** New events gdb.new_inferior, gdb.inferior_deleted, and
diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index de18edd..5cfb615 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -735,16 +735,29 @@  aarch64_linux_stopped_data_address (struct target_ops *target,
   state = aarch64_get_debug_reg_state (ptid_get_pid (inferior_ptid));
   for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
     {
+      const unsigned int offset = aarch64_watchpoint_offset
+							 (state->dr_ctrl_wp[i]);
       const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
       const CORE_ADDR addr_trap = (CORE_ADDR) siginfo.si_addr;
-      const CORE_ADDR addr_watch = state->dr_addr_wp[i];
+      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
+      const CORE_ADDR addr_watch_aligned = (state->dr_addr_wp[i]
+					    & -(CORE_ADDR) 8);
+      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
 
       if (state->dr_ref_count_wp[i]
 	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
-	  && addr_trap >= addr_watch
+	  && addr_trap >= addr_watch_aligned
 	  && addr_trap < addr_watch + len)
 	{
-	  *addr_p = addr_trap;
+	  /* ADDR_TRAP reports first address of the range touched by CPU.
+	     ADDR_TRAP may be before the ADDR_WATCH..ADDR_WATCH+LEN range
+	     for larger CPU access hitting the watchpoint by its tail part.
+	     ADDR_TRAP must be in the ADDR_WATCH_ALIGNED..ADDR_WATCH+LEN range.
+	     Never report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
+	     range (not matching any known watchpoint range).
+	     ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false positive due to
+	     PR external/20207 buggy kernels.  */
+	  *addr_p = addr_orig;
 	  return 1;
 	}
     }
diff --git a/gdb/gdbserver/linux-aarch64-low.c b/gdb/gdbserver/linux-aarch64-low.c
index b00d5c5..086db8e 100644
--- a/gdb/gdbserver/linux-aarch64-low.c
+++ b/gdb/gdbserver/linux-aarch64-low.c
@@ -363,14 +363,30 @@  aarch64_stopped_data_address (void)
   state = aarch64_get_debug_reg_state (pid_of (current_thread));
   for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
     {
+      const unsigned int offset = aarch64_watchpoint_offset
+							 (state->dr_ctrl_wp[i]);
       const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
       const CORE_ADDR addr_trap = (CORE_ADDR) siginfo.si_addr;
-      const CORE_ADDR addr_watch = state->dr_addr_wp[i];
+      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
+      const CORE_ADDR addr_watch_aligned = (state->dr_addr_wp[i]
+					    & -(CORE_ADDR) 8);
+      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
+
       if (state->dr_ref_count_wp[i]
 	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
-	  && addr_trap >= addr_watch
+	  && addr_trap >= addr_watch_aligned
 	  && addr_trap < addr_watch + len)
-	return addr_trap;
+	{
+	  /* ADDR_TRAP reports first address of the range touched by CPU.
+	     ADDR_TRAP may be before the ADDR_WATCH..ADDR_WATCH+LEN range
+	     for larger CPU access hitting the watchpoint by its tail part.
+	     ADDR_TRAP must be in the ADDR_WATCH_ALIGNED..ADDR_WATCH+LEN range.
+	     Never report RETVAL outside of any ADDR_WATCH..ADDR_WATCH+LEN
+	     range (not matching any known watchpoint range).
+	     ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false positive due to
+	     PR external/20207 buggy kernels.  */
+	  return addr_orig;
+	}
     }
 
   return (CORE_ADDR) 0;
diff --git a/gdb/nat/aarch64-linux-hw-point.c b/gdb/nat/aarch64-linux-hw-point.c
index 9800d9a..7a5b657 100644
--- a/gdb/nat/aarch64-linux-hw-point.c
+++ b/gdb/nat/aarch64-linux-hw-point.c
@@ -34,29 +34,49 @@ 
 int aarch64_num_bp_regs;
 int aarch64_num_wp_regs;
 
+/* TRUE means this kernel has fixed PR external/20207.
+   Fixed kernel supports any contiguous range of bits in 8-bit byte
+   DR_CONTROL_MASK.  Buggy kernel supports only 0x01, 0x03, 0x0f and 0xff.  */
+static bool have_any_contiguous (true);
+
+/* Return starting byte 0..7 incl. of a watchpoint encoded by CTRL.  */
+
+unsigned int
+aarch64_watchpoint_offset (unsigned int ctrl)
+{
+  uint8_t mask = DR_CONTROL_MASK (ctrl);
+  unsigned retval;
+
+  // Shift out bottom zeroes.
+  for (retval = 0; mask && (mask & 1) == 0; ++retval)
+    mask >>= 1;
+
+  return retval;
+}
+
 /* Utility function that returns the length in bytes of a watchpoint
    according to the content of a hardware debug control register CTRL.
-   Note that the kernel currently only supports the following Byte
-   Address Select (BAS) values: 0x1, 0x3, 0xf and 0xff, which means
-   that for a hardware watchpoint, its valid length can only be 1
-   byte, 2 bytes, 4 bytes or 8 bytes.  */
+   Any contiguous range of bytes in CTRL is supported.  Returned value
+   can be between 0..8 incl.  */
 
 unsigned int
 aarch64_watchpoint_length (unsigned int ctrl)
 {
-  switch (DR_CONTROL_LENGTH (ctrl))
-    {
-    case 0x01:
-      return 1;
-    case 0x03:
-      return 2;
-    case 0x0f:
-      return 4;
-    case 0xff:
-      return 8;
-    default:
-      return 0;
-    }
+  uint8_t mask = DR_CONTROL_MASK (ctrl);
+  unsigned retval;
+
+  // Shift out bottom zeroes.
+  mask >>= aarch64_watchpoint_offset (ctrl);
+
+  // Count bottom ones;
+  for (retval = 0; (mask & 1) != 0; ++retval)
+    mask >>= 1;
+
+  if (mask)
+    error (_("Unexpected hardware watchpoint length register value 0x%x"),
+      DR_CONTROL_MASK (ctrl));
+
+  return retval;
 }
 
 /* Given the hardware breakpoint or watchpoint type TYPE and its
@@ -64,10 +84,13 @@  aarch64_watchpoint_length (unsigned int ctrl)
    breakpoint/watchpoint control register.  */
 
 static unsigned int
-aarch64_point_encode_ctrl_reg (enum target_hw_bp_type type, int len)
+aarch64_point_encode_ctrl_reg (enum target_hw_bp_type type, int offset, int len)
 {
   unsigned int ctrl, ttype;
 
+  gdb_assert (offset == 0 || have_any_contiguous);
+  gdb_assert (offset + len <= AARCH64_HWP_MAX_LEN_PER_REG);
+
   /* type */
   switch (type)
     {
@@ -89,8 +112,8 @@  aarch64_point_encode_ctrl_reg (enum target_hw_bp_type type, int len)
 
   ctrl = ttype << 3;
 
-  /* length bitmask */
-  ctrl |= ((1 << len) - 1) << 5;
+  /* offset and length bitmask */
+  ctrl |= ((1 << len) - 1) << (5 + offset);
   /* enabled at el0 */
   ctrl |= (2 << 1) | 1;
 
@@ -134,7 +157,8 @@  aarch64_point_is_aligned (int is_watchpoint, CORE_ADDR addr, int len)
   if (addr & (alignment - 1))
     return 0;
 
-  if (len != 8 && len != 4 && len != 2 && len != 1)
+  if ((!have_any_contiguous && len != 8 && len != 4 && len != 2 && len != 1)
+      || (have_any_contiguous && (len < 1 || len > 8)))
     return 0;
 
   return 1;
@@ -181,11 +205,12 @@  aarch64_point_is_aligned (int is_watchpoint, CORE_ADDR addr, int len)
 
 static void
 aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
-			  int *aligned_len_p, CORE_ADDR *next_addr_p,
-			  int *next_len_p)
+			  int *aligned_offset_p, int *aligned_len_p,
+			  CORE_ADDR *next_addr_p, int *next_len_p,
+			  CORE_ADDR *next_addr_orig_p)
 {
   int aligned_len;
-  unsigned int offset;
+  unsigned int offset, aligned_offset;
   CORE_ADDR aligned_addr;
   const unsigned int alignment = AARCH64_HWP_ALIGNMENT;
   const unsigned int max_wp_len = AARCH64_HWP_MAX_LEN_PER_REG;
@@ -200,6 +225,7 @@  aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
      must be aligned.  */
   offset = addr & (alignment - 1);
   aligned_addr = addr - offset;
+  aligned_offset = have_any_contiguous ? addr & (alignment - 1) : 0;
 
   gdb_assert (offset >= 0 && offset < alignment);
   gdb_assert (aligned_addr >= 0 && aligned_addr <= addr);
@@ -209,7 +235,7 @@  aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
     {
       /* Need more than one watchpoint registers; truncate it at the
 	 alignment boundary.  */
-      aligned_len = max_wp_len;
+      aligned_len = max_wp_len - (have_any_contiguous ? offset : 0);
       len -= (max_wp_len - offset);
       addr += (max_wp_len - offset);
       gdb_assert ((addr & (alignment - 1)) == 0);
@@ -222,19 +248,25 @@  aarch64_align_watchpoint (CORE_ADDR addr, int len, CORE_ADDR *aligned_addr_p,
 	aligned_len_array[AARCH64_HWP_MAX_LEN_PER_REG] =
 	{ 1, 2, 4, 4, 8, 8, 8, 8 };
 
-      aligned_len = aligned_len_array[offset + len - 1];
+      aligned_len = (have_any_contiguous
+		     ? len : aligned_len_array[offset + len - 1]);
       addr += len;
       len = 0;
     }
 
   if (aligned_addr_p)
     *aligned_addr_p = aligned_addr;
+  if (aligned_offset_p)
+    *aligned_offset_p = aligned_offset;
   if (aligned_len_p)
     *aligned_len_p = aligned_len;
   if (next_addr_p)
     *next_addr_p = addr;
   if (next_len_p)
     *next_len_p = len;
+  if (next_addr_orig_p)
+    *next_addr_orig_p = (((*next_addr_orig_p) + alignment)
+			 & -(CORE_ADDR) alignment);
 }
 
 struct aarch64_dr_update_callback_param
@@ -324,17 +356,70 @@  aarch64_notify_debug_reg_change (const struct aarch64_debug_reg_state *state,
   iterate_over_lwps (pid_ptid, debug_reg_change_callback, (void *) &param);
 }
 
+/* Reconfigure STATE to be compatible with Linux kernel with buggy
+   PR external/20207 during HAVE_ANY_CONTIGUOUS true->false change.
+   A regression for buggy kernels is that originally GDB could support
+   more watchpoint combinations that had matching (and thus shared)
+   masks.  On such buggy kernels new GDB will try to first setup the
+   perfect matching ranges which will run out of registers (before this
+   function can merge them).  */
+
+static void
+aarch64_downgrade_regs (struct aarch64_debug_reg_state *state)
+{
+  for (int i = 0; i < aarch64_num_wp_regs; ++i)
+    if ((state->dr_ctrl_wp[i] & 1) != 0)
+      {
+	gdb_assert (state->dr_ref_count_wp[i] != 0);
+	uint8_t mask_orig = (state->dr_ctrl_wp[i] >> 5) & 0xff;
+	gdb_assert (mask_orig != 0);
+	const std::array<uint8_t, 4> old_valid ({ 0x01, 0x03, 0x0f, 0xff });
+	uint8_t mask (0);
+	for (const uint8_t old_mask:old_valid)
+	  if (mask_orig <= old_mask)
+	    {
+	      mask = old_mask;
+	      break;
+	    }
+	gdb_assert (mask != 0);
+
+	// No update needed for this watchpoint?
+	if (mask == mask_orig)
+	  continue;
+	state->dr_ctrl_wp[i] |= mask << 5;
+	state->dr_addr_wp[i] &= -(CORE_ADDR) AARCH64_HWP_ALIGNMENT;
+
+	// Try to match duplicate entries.
+	for (int j = 0; j < i; ++j)
+	  if ((state->dr_ctrl_wp[j] & 1) != 0
+	      && state->dr_addr_wp[j] == state->dr_addr_wp[i]
+	      && state->dr_addr_orig_wp[j] == state->dr_addr_orig_wp[i]
+	      && state->dr_ctrl_wp[j] == state->dr_ctrl_wp[i])
+	    {
+	      state->dr_ref_count_wp[j] += state->dr_ref_count_wp[i];
+	      state->dr_ref_count_wp[i] = 0;
+	      state->dr_addr_wp[i] = 0;
+	      state->dr_addr_orig_wp[i] = 0;
+	      state->dr_ctrl_wp[i] &= ~1;
+	      break;
+	    }
+
+	aarch64_notify_debug_reg_change (state, 1 /* is_watchpoint */, i);
+      }
+}
+
 /* Record the insertion of one breakpoint/watchpoint, as represented
    by ADDR and CTRL, in the process' arch-specific data area *STATE.  */
 
 static int
 aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
 				   enum target_hw_bp_type type,
-				   CORE_ADDR addr, int len)
+				   CORE_ADDR addr, int offset, int len,
+				   CORE_ADDR addr_orig)
 {
   int i, idx, num_regs, is_watchpoint;
   unsigned int ctrl, *dr_ctrl_p, *dr_ref_count;
-  CORE_ADDR *dr_addr_p;
+  CORE_ADDR *dr_addr_p, *dr_addr_orig_p;
 
   /* Set up state pointers.  */
   is_watchpoint = (type != hw_execute);
@@ -343,6 +428,7 @@  aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
     {
       num_regs = aarch64_num_wp_regs;
       dr_addr_p = state->dr_addr_wp;
+      dr_addr_orig_p = state->dr_addr_orig_wp;
       dr_ctrl_p = state->dr_ctrl_wp;
       dr_ref_count = state->dr_ref_count_wp;
     }
@@ -350,11 +436,12 @@  aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
     {
       num_regs = aarch64_num_bp_regs;
       dr_addr_p = state->dr_addr_bp;
+      dr_addr_orig_p = nullptr;
       dr_ctrl_p = state->dr_ctrl_bp;
       dr_ref_count = state->dr_ref_count_bp;
     }
 
-  ctrl = aarch64_point_encode_ctrl_reg (type, len);
+  ctrl = aarch64_point_encode_ctrl_reg (type, offset, len);
 
   /* Find an existing or free register in our cache.  */
   idx = -1;
@@ -366,7 +453,9 @@  aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
 	  idx = i;
 	  /* no break; continue hunting for an exising one.  */
 	}
-      else if (dr_addr_p[i] == addr && dr_ctrl_p[i] == ctrl)
+      else if (dr_addr_p[i] == addr
+	       && (dr_addr_orig_p == nullptr || dr_addr_orig_p[i] == addr_orig)
+	       && dr_ctrl_p[i] == ctrl)
 	{
 	  gdb_assert (dr_ref_count[i] != 0);
 	  idx = i;
@@ -383,6 +472,8 @@  aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
     {
       /* new entry */
       dr_addr_p[idx] = addr;
+      if (dr_addr_orig_p != nullptr)
+	dr_addr_orig_p[idx] = addr_orig;
       dr_ctrl_p[idx] = ctrl;
       dr_ref_count[idx] = 1;
       /* Notify the change.  */
@@ -403,11 +494,12 @@  aarch64_dr_state_insert_one_point (struct aarch64_debug_reg_state *state,
 static int
 aarch64_dr_state_remove_one_point (struct aarch64_debug_reg_state *state,
 				   enum target_hw_bp_type type,
-				   CORE_ADDR addr, int len)
+				   CORE_ADDR addr, int offset, int len,
+				   CORE_ADDR addr_orig)
 {
   int i, num_regs, is_watchpoint;
   unsigned int ctrl, *dr_ctrl_p, *dr_ref_count;
-  CORE_ADDR *dr_addr_p;
+  CORE_ADDR *dr_addr_p, *dr_addr_orig_p;
 
   /* Set up state pointers.  */
   is_watchpoint = (type != hw_execute);
@@ -415,6 +507,7 @@  aarch64_dr_state_remove_one_point (struct aarch64_debug_reg_state *state,
     {
       num_regs = aarch64_num_wp_regs;
       dr_addr_p = state->dr_addr_wp;
+      dr_addr_orig_p = state->dr_addr_orig_wp;
       dr_ctrl_p = state->dr_ctrl_wp;
       dr_ref_count = state->dr_ref_count_wp;
     }
@@ -422,15 +515,18 @@  aarch64_dr_state_remove_one_point (struct aarch64_debug_reg_state *state,
     {
       num_regs = aarch64_num_bp_regs;
       dr_addr_p = state->dr_addr_bp;
+      dr_addr_orig_p = nullptr;
       dr_ctrl_p = state->dr_ctrl_bp;
       dr_ref_count = state->dr_ref_count_bp;
     }
 
-  ctrl = aarch64_point_encode_ctrl_reg (type, len);
+  ctrl = aarch64_point_encode_ctrl_reg (type, offset, len);
 
   /* Find the entry that matches the ADDR and CTRL.  */
   for (i = 0; i < num_regs; ++i)
-    if (dr_addr_p[i] == addr && dr_ctrl_p[i] == ctrl)
+    if (dr_addr_p[i] == addr
+	&& (dr_addr_orig_p == nullptr || dr_addr_orig_p[i] == addr_orig)
+	&& dr_ctrl_p[i] == ctrl)
       {
 	gdb_assert (dr_ref_count[i] != 0);
 	break;
@@ -446,6 +542,8 @@  aarch64_dr_state_remove_one_point (struct aarch64_debug_reg_state *state,
       /* Clear the enable bit.  */
       ctrl &= ~1;
       dr_addr_p[i] = 0;
+      if (dr_addr_orig_p != nullptr)
+	dr_addr_orig_p[i] = 0;
       dr_ctrl_p[i] = ctrl;
       /* Notify the change.  */
       aarch64_notify_debug_reg_change (state, is_watchpoint, i);
@@ -472,10 +570,10 @@  aarch64_handle_breakpoint (enum target_hw_bp_type type, CORE_ADDR addr,
       if (!aarch64_point_is_aligned (0 /* is_watchpoint */ , addr, len))
 	return -1;
 
-      return aarch64_dr_state_insert_one_point (state, type, addr, len);
+      return aarch64_dr_state_insert_one_point (state, type, addr, 0, len, -1);
     }
   else
-    return aarch64_dr_state_remove_one_point (state, type, addr, len);
+    return aarch64_dr_state_remove_one_point (state, type, addr, 0, len, -1);
 }
 
 /* This is essentially the same as aarch64_handle_breakpoint, apart
@@ -487,9 +585,9 @@  aarch64_handle_aligned_watchpoint (enum target_hw_bp_type type,
 				   struct aarch64_debug_reg_state *state)
 {
   if (is_insert)
-    return aarch64_dr_state_insert_one_point (state, type, addr, len);
+    return aarch64_dr_state_insert_one_point (state, type, addr, 0, len, addr);
   else
-    return aarch64_dr_state_remove_one_point (state, type, addr, len);
+    return aarch64_dr_state_remove_one_point (state, type, addr, 0, len, addr);
 }
 
 /* Insert/remove unaligned watchpoint by calling
@@ -504,29 +602,42 @@  aarch64_handle_unaligned_watchpoint (enum target_hw_bp_type type,
 				     CORE_ADDR addr, int len, int is_insert,
 				     struct aarch64_debug_reg_state *state)
 {
+  CORE_ADDR addr_orig = addr;
+
   while (len > 0)
     {
       CORE_ADDR aligned_addr;
-      int aligned_len, ret;
+      int aligned_offset, aligned_len, ret;
+      CORE_ADDR addr_orig_next = addr_orig;
 
-      aarch64_align_watchpoint (addr, len, &aligned_addr, &aligned_len,
-				&addr, &len);
+      aarch64_align_watchpoint (addr, len, &aligned_addr, &aligned_offset,
+				&aligned_len, &addr, &len, &addr_orig_next);
 
       if (is_insert)
 	ret = aarch64_dr_state_insert_one_point (state, type, aligned_addr,
-						 aligned_len);
+						 aligned_offset,
+						 aligned_len, addr_orig);
       else
 	ret = aarch64_dr_state_remove_one_point (state, type, aligned_addr,
-						 aligned_len);
+						 aligned_offset,
+						 aligned_len, addr_orig);
 
       if (show_debug_regs)
 	debug_printf ("handle_unaligned_watchpoint: is_insert: %d\n"
 		      "                             "
 		      "aligned_addr: %s, aligned_len: %d\n"
 		      "                                "
-		      "next_addr: %s,    next_len: %d\n",
+		      "addr_orig: %s\n"
+		      "                                "
+		      "next_addr: %s,    next_len: %d\n"
+		      "                           "
+		      "addr_orig_next: %s\n",
 		      is_insert, core_addr_to_string_nz (aligned_addr),
-		      aligned_len, core_addr_to_string_nz (addr), len);
+		      aligned_len, core_addr_to_string_nz (addr_orig),
+		      core_addr_to_string_nz (addr), len,
+		      core_addr_to_string_nz (addr_orig_next));
+
+      addr_orig = addr_orig_next;
 
       if (ret != 0)
 	return ret;
@@ -552,7 +663,7 @@  aarch64_handle_watchpoint (enum target_hw_bp_type type, CORE_ADDR addr,
    registers with data from *STATE.  */
 
 void
-aarch64_linux_set_debug_regs (const struct aarch64_debug_reg_state *state,
+aarch64_linux_set_debug_regs (struct aarch64_debug_reg_state *state,
 			      int tid, int watchpoint)
 {
   int i, count;
@@ -580,7 +691,17 @@  aarch64_linux_set_debug_regs (const struct aarch64_debug_reg_state *state,
   if (ptrace (PTRACE_SETREGSET, tid,
 	      watchpoint ? NT_ARM_HW_WATCH : NT_ARM_HW_BREAK,
 	      (void *) &iov))
-    error (_("Unexpected error setting hardware debug registers"));
+    {
+      // Handle PR external/20207 buggy kernels?
+      if (watchpoint && errno == EINVAL && have_any_contiguous)
+	{
+	  have_any_contiguous = false;
+	  aarch64_downgrade_regs (state);
+	  aarch64_linux_set_debug_regs (state, tid, watchpoint);
+	  return;
+	}
+      error (_("Unexpected error setting hardware debug registers"));
+    }
 }
 
 /* Print the values of the cached breakpoint/watchpoint registers.  */
@@ -611,8 +732,9 @@  aarch64_show_debug_reg_state (struct aarch64_debug_reg_state *state,
 
   debug_printf ("\tWATCHPOINTs:\n");
   for (i = 0; i < aarch64_num_wp_regs; i++)
-    debug_printf ("\tWP%d: addr=%s, ctrl=0x%08x, ref.count=%d\n",
+    debug_printf ("\tWP%d: addr=%s (orig=%s), ctrl=0x%08x, ref.count=%d\n",
 		  i, core_addr_to_string_nz (state->dr_addr_wp[i]),
+		  core_addr_to_string_nz (state->dr_addr_orig_wp[i]),
 		  state->dr_ctrl_wp[i], state->dr_ref_count_wp[i]);
 }
 
diff --git a/gdb/nat/aarch64-linux-hw-point.h b/gdb/nat/aarch64-linux-hw-point.h
index 610a5f1..7c323a2 100644
--- a/gdb/nat/aarch64-linux-hw-point.h
+++ b/gdb/nat/aarch64-linux-hw-point.h
@@ -77,13 +77,13 @@ 
 
    31                             13          5      3      1     0
    +--------------------------------+----------+------+------+----+
-   |         RESERVED (SBZ)         |  LENGTH  | TYPE | PRIV | EN |
+   |         RESERVED (SBZ)         |   MASK   | TYPE | PRIV | EN |
    +--------------------------------+----------+------+------+----+
 
    The TYPE field is ignored for breakpoints.  */
 
 #define DR_CONTROL_ENABLED(ctrl)	(((ctrl) & 0x1) == 1)
-#define DR_CONTROL_LENGTH(ctrl)		(((ctrl) >> 5) & 0xff)
+#define DR_CONTROL_MASK(ctrl)		(((ctrl) >> 5) & 0xff)
 
 /* Each bit of a variable of this type is used to indicate whether a
    hardware breakpoint or watchpoint setting has been changed since
@@ -147,7 +147,10 @@  struct aarch64_debug_reg_state
   unsigned int dr_ref_count_bp[AARCH64_HBP_MAX_NUM];
 
   /* hardware watchpoint */
+  // Address aligned down to AARCH64_HWP_ALIGNMENT.
   CORE_ADDR dr_addr_wp[AARCH64_HWP_MAX_NUM];
+  // Address as entered by user without any alignment.
+  CORE_ADDR dr_addr_orig_wp[AARCH64_HWP_MAX_NUM];
   unsigned int dr_ctrl_wp[AARCH64_HWP_MAX_NUM];
   unsigned int dr_ref_count_wp[AARCH64_HWP_MAX_NUM];
 };
@@ -166,6 +169,7 @@  struct arch_lwp_info
 extern int aarch64_num_bp_regs;
 extern int aarch64_num_wp_regs;
 
+unsigned int aarch64_watchpoint_offset (unsigned int ctrl);
 unsigned int aarch64_watchpoint_length (unsigned int ctrl);
 
 int aarch64_handle_breakpoint (enum target_hw_bp_type type, CORE_ADDR addr,
@@ -175,7 +179,7 @@  int aarch64_handle_watchpoint (enum target_hw_bp_type type, CORE_ADDR addr,
 			       int len, int is_insert,
 			       struct aarch64_debug_reg_state *state);
 
-void aarch64_linux_set_debug_regs (const struct aarch64_debug_reg_state *state,
+void aarch64_linux_set_debug_regs (struct aarch64_debug_reg_state *state,
 				   int tid, int watchpoint);
 
 void aarch64_show_debug_reg_state (struct aarch64_debug_reg_state *state,
diff --git a/gdb/testsuite/gdb.base/watchpoint-unaligned.c b/gdb/testsuite/gdb.base/watchpoint-unaligned.c
new file mode 100644
index 0000000..f653e77
--- /dev/null
+++ b/gdb/testsuite/gdb.base/watchpoint-unaligned.c
@@ -0,0 +1,68 @@ 
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2017 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdint.h>
+#include <assert.h>
+
+static int again;
+
+static volatile struct
+{
+  uint64_t alignment;
+  union
+    {
+      uint64_t size8[1];
+      uint32_t size4[2];
+      uint16_t size2[4];
+      uint8_t size1[8];
+    }
+  u;
+} data;
+
+static int size = 0;
+static int offset;
+
+int
+main (void)
+{
+  volatile uint64_t local;
+
+  assert (sizeof (data) == 8 + 8);
+  while (size)
+    {
+      switch (size)
+	{
+	case 8:
+	  local = data.u.size8[offset];
+	  break;
+	case 4:
+	  local = data.u.size4[offset];
+	  break;
+	case 2:
+	  local = data.u.size2[offset];
+	  break;
+	case 1:
+	  local = data.u.size1[offset];
+	  break;
+	default:
+	  assert (0);
+	}
+      size = 0;
+      size = size; /* again_start */
+    }
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/watchpoint-unaligned.exp b/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
new file mode 100644
index 0000000..8d463b1
--- /dev/null
+++ b/gdb/testsuite/gdb.base/watchpoint-unaligned.exp
@@ -0,0 +1,119 @@ 
+# Copyright 2017 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+#
+# This file is part of the gdb testsuite.
+
+standard_testfile
+if { [prepare_for_testing ${testfile}.exp ${testfile} ${srcfile}] } {
+    return -1
+}
+
+if ![runto_main] {
+    untested "could not run to main"
+    return -1
+}
+
+gdb_breakpoint "[gdb_get_line_number "again_start"]" "Breakpoint $decimal at $hex" "again_start"
+
+set sizes {1 2 4 8}
+array set alignedend {1 1  2 2  3 4  4 4  5 8  6 8  7 8  8 8}
+
+foreach wpsize $sizes {
+    for {set wpoffset 0} {$wpoffset < 8/$wpsize} {incr wpoffset} {
+	set wpstart [expr $wpoffset * $wpsize]
+	set wpend [expr ($wpoffset + 1) * $wpsize]
+	set wpendaligned $alignedend($wpend)
+	foreach rdsize $sizes {
+	    for {set rdoffset 0} {$rdoffset < 8/$rdsize} {incr rdoffset} {
+		set rdstart [expr $rdoffset * $rdsize]
+		set rdend [expr ($rdoffset + 1) * $rdsize]
+		set expect_hit [expr max($wpstart,$rdstart) < min($wpend,$rdend)]
+		set test "rwatch data.u.size$wpsize\[$wpoffset\]"
+		set wpnum ""
+		gdb_test_multiple $test $test {
+		    -re "Hardware read watchpoint (\[0-9\]+): .*\r\n$gdb_prompt $" {
+			set wpnum $expect_out(1,string)
+		    }
+		}
+		gdb_test_no_output "set variable size = $rdsize" ""
+		gdb_test_no_output "set variable offset = $rdoffset" ""
+		set test "continue"
+		set did_hit 0
+		gdb_test_multiple $test $test {
+		    -re "Hardware read watchpoint $wpnum:.*Value = .*\r\n$gdb_prompt $" {
+			set did_hit 1
+			send_gdb "continue\n"
+			exp_continue
+		    }
+		    -re " again_start .*\r\n$gdb_prompt $" {
+		    }
+		}
+		gdb_test_no_output "delete $wpnum" ""
+		set test "wp(size=$wpsize offset=$wpoffset) rd(size=$rdsize offset=$rdoffset) expect=$expect_hit"
+		if {$expect_hit == $did_hit} {
+		    pass $test
+		} else {
+		    # We do not know if we run on a fixed kernel or not.
+		    # Report XFAIL only in the FAIL case.
+		    if {$expect_hit == 0 && $rdstart < $wpendaligned} {
+			setup_xfail external/20207 "aarch64-*-*"
+		    }
+		    fail $test
+		}
+	    }
+	}
+    }
+}
+
+foreach wpcount {4 7} {
+    array set wpoffset_to_wpnum {}
+    for {set wpoffset 1} {$wpoffset <= $wpcount} {incr wpoffset} {
+	set test "rwatch data.u.size1\[$wpoffset\]"
+	set wpnum ""
+	gdb_test_multiple $test $test {
+	    -re "Hardware read watchpoint (\[0-9\]+): .*\r\n$gdb_prompt $" {
+		set wpoffset_to_wpnum($wpoffset) $expect_out(1,string)
+	    }
+	}
+    }
+    gdb_test_no_output "set variable size = 1" ""
+    gdb_test_no_output "set variable offset = 1" ""
+    set test "continue"
+    set did_hit 0
+    gdb_test_multiple $test $test {
+	-re "\r\nCould not insert hardware watchpoint .*\r\n$gdb_prompt $" {
+	}
+	-re "Hardware read watchpoint $wpoffset_to_wpnum(1):.*Value = .*\r\n$gdb_prompt $" {
+	    set did_hit 1
+	    send_gdb "continue\n"
+	    exp_continue
+	}
+	-re " again_start .*\r\n$gdb_prompt $" {
+	}
+    }
+    for {set wpoffset 1} {$wpoffset <= $wpcount} {incr wpoffset} {
+	gdb_test_no_output "delete $wpoffset_to_wpnum($wpoffset)" ""
+    }
+    set test "wpcount($wpcount)"
+    if {$wpcount > 4} {
+	setup_kfail tdep/22389 *-*-*
+    }
+    if {$did_hit} {
+	pass $test
+    } else {
+	fail $test
+    }
+}