[0/X] HWASAN v3

Message ID 157616229728.30610.11942820198797258041.scripted-patch-series@arm.com
State New

Commit Message

Matthew Malcomson Dec. 12, 2019, 3:18 p.m.
Hello,

I've gone through the suggestions Martin made and implemented the ones I think
I can implement for GCC 10.

The two functionality changes in this version are:
Added the --param options hwasan-instrument-reads, hwasan-instrument-writes,
hwasan-instrument-allocas, and hwasan-memintrin, i.e. those that asan has
and that make sense for hwasan.

Avoided HWASAN_STACK_BACKGROUND in hwasan_increment_tag when using a
deterministic tagging approach.
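
To illustrate the second change, here is a minimal sketch of a deterministic tag sequence that never hands out the stack's background tag. It is not the code from the patch: the helper name `next_stack_tag` is hypothetical, and the background tag is assumed to be 0 and the tag width 8 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: untagged ("background") stack memory carries tag 0.  */
#define STACK_BACKGROUND_TAG 0u

/* Hypothetical helper, not the GCC implementation: return the tag that
   follows TAG in a deterministic sequence.  The value wraps at 8 bits
   and skips the background tag, so a tagged stack object can never
   share a tag with untagged stack memory.  */
static uint8_t
next_stack_tag (uint8_t tag)
{
  uint8_t next = (uint8_t) (tag + 1u);
  if (next == STACK_BACKGROUND_TAG)
    next = (uint8_t) (next + 1u);
  return next;
}
```

With a random base tag the collision with the background tag is only a matter of probability, but with a deterministic sequence it would happen at the same object every time, which is presumably why it is worth avoiding explicitly.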


There are a lot of extra comments and tests.


Bootstrapped and regtested on x86_64 and AArch64.
Bootstrapped with `--with-build-config=bootstrap-hwasan` on AArch64 and hwasan
features tested there.
Built the linux kernel using this feature and ran the test_kasan.ko testing to
check that this works for the kernel.
(NOTE: I actually did all the above testing before a search and replace of
`memory_tagging_p` for `hwasan_sanitize_p` and before fixing a typo in the
`hwasan-instrument-allocas` parameter name.  I will run all the tests again
before committing, but I figure I'll send this out now since I fully expect the
tests to still pass.)


I noticed one extra testsuite failure from those mentioned in the previous
version emails: g++.dg/cpp2a/ucn2.C.
I believe this is HWASAN correctly catching a problem in the compiler.
I've logged the issue here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92919 .


I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
since I'm not sure the way I found to implement this would be acceptable.  The
inlined patch below works but it requires a special declaration instead of just
an ~#include~.




Entire patch series attached to cover letter.

Comments

Matthew Malcomson Dec. 17, 2019, 2:11 p.m. | #1
I've noticed a few minor problems with this patch series after I sent it 
out (mostly testcase stuff, one documentation tidy-up, but also that one 
patch didn't bootstrap due to something fixed in a later patch).

I also rely on a documentation change that isn't part of the series.

I figure I should make this easy on anyone that wants to try the patch 
series out, so I'm attaching a compressed tarfile containing the entire 
patch series plus the additional documentation patch so it can all be 
applied at once with `git apply *`.

It's attached.

Matthew.



Matthew Malcomson Jan. 6, 2020, 3:26 p.m. | #2
Ping


Martin Liška Jan. 7, 2020, 3:14 p.m. | #3
On 12/12/19 4:18 PM, Matthew Malcomson wrote:

Hello.

I've just sent a few comments related to v3 of the patch set.
Based on my (limited) knowledge of HWASAN, the patch seems reasonable to me.
I haven't looked much at the newly introduced RTL hooks,
but these seem to me to be isolated to the aarch64 port.

I can also verify that the patchset works on my aarch64 linux machine and
hwasan.exp and asan.exp tests succeed.

> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
> since I'm not sure the way I found to implement this would be acceptable.  The
> inlined patch below works but it requires a special declaration instead of just
> an ~#include~.


Knowing that, I would not bother with the printing of HWASAN_MARK.

Thanks for the series,
Martin
Matthew Malcomson Jan. 8, 2020, 11:26 a.m. | #4
Hi everyone,

I'm writing this email to summarise & publicise the state of this patch 
series, especially the difficulties around approval for GCC 10 mentioned 
on IRC.


The main obstacle seems to be that no maintainer feels they have enough 
knowledge about hwasan and justification that it's worthwhile to approve 
the patch series.

Similarly, Martin has given a review of the parts of the code he can 
(thanks!), but doesn't feel he can do a deep review of the code related 
to the RTL hooks and stack expansion -- hence that part is as yet not 
reviewed in-depth.



The questions around justification raised on IRC are mainly that it 
seems like a proof-of-concept for MTE rather than a stand-alone useable 
sanitizer.  Especially since in the GNU world hwasan instrumented code 
is not really ready for production since we can only use the 
less-"interceptor ABI" rather than the "platform ABI".  This restriction 
is because there is no version of glibc with the required modifications 
to provide the "platform ABI".

(N.b. since https://reviews.llvm.org/D69574 the code generation for
these ABIs is the same.)


From my perspective the reasons that make HWASAN useful in itself are:

1) Much less memory usage.

From a back-of-the-envelope calculation based on the hwasan paper's
table of memory overhead from over-alignment
(https://arxiv.org/pdf/1802.09517.pdf), I estimate that hwasan-instrumented
code has a memory overhead of about 1.1x (~4% from over-alignment and ~6.25%
from shadow memory), while asan seems to have an overhead somewhere in the
range 1.5x - 3x.
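
The arithmetic behind the 1.1x figure can be sketched as follows. The 6.25% comes from HWASAN keeping one shadow tag byte per 16-byte granule; the ~4% over-alignment figure is the estimate quoted above and is passed in as a parameter. The helper names are hypothetical, and simply adding the two fractions is a rough model rather than a measurement.

```c
#include <assert.h>

/* HWASAN keeps one shadow tag byte for every 16-byte granule of memory.  */
#define HWASAN_GRANULE_BYTES 16.0

/* Fraction of extra memory spent on shadow tags: 1/16 = 6.25%.  */
static double
hwasan_shadow_fraction (void)
{
  return 1.0 / HWASAN_GRANULE_BYTES;
}

/* Hypothetical helper: total memory factor for a given over-alignment
   fraction (the estimate above uses about 0.04).  */
static double
hwasan_memory_factor (double overalign_fraction)
{
  return 1.0 + overalign_fraction + hwasan_shadow_fraction ();
}
```

Calling `hwasan_memory_factor (0.04)` gives roughly 1.10, which matches the "about 1.1x" estimate.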

Maybe there's some data out there comparing total overheads that I 
haven't found? (I'd appreciate a reference if anyone has that info).



2) Available on more architectures than MTE.

HWASAN only requires TBI, which is a feature of all AArch64 machines, 
while MTE will be an optional extension and only available on certain 
architectures.
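
As a rough illustration of what TBI makes possible, the sketch below stores a tag in the top byte of a 64-bit address and recovers or strips it again. It works on plain integers so that it runs on any host; the helper names are hypothetical and are not part of libhwasan or GCC.

```c
#include <assert.h>
#include <stdint.h>

/* With "top byte ignore", bits 56..63 of an address do not take part
   in address translation, so software is free to keep a tag there.  */
#define TAG_SHIFT 56
#define TAG_MASK ((uint64_t) 0xff << TAG_SHIFT)

/* Hypothetical helpers operating on addresses held as integers.  */
static uint64_t
set_tag (uint64_t addr, uint8_t tag)
{
  return (addr & ~TAG_MASK) | ((uint64_t) tag << TAG_SHIFT);
}

static uint8_t
get_tag (uint64_t addr)
{
  return (uint8_t) (addr >> TAG_SHIFT);
}

static uint64_t
strip_tag (uint64_t addr)
{
  return addr & ~TAG_MASK;
}
```

On a machine without TBI the tagged value would have to be stripped before every dereference, which is the cost that the hardware feature removes.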


3) This enables using hwasan in the kernel.

While instrumented user-space applications will be using the 
"interceptor ABI" and hence are likely not production-quality, the 
biggest aim of implementing hwasan in GCC is to allow building the Linux 
kernel with tag-based sanitization using GCC.

Instrumented kernel code uses hooks in the kernel itself, so this ABI 
distinction is no longer relevant, and this sanitizer should produce a 
production-quality kernel binary.




I'm hoping I can find a maintainer willing to review and ACK this patch 
series -- especially with stage3 coming to a close soon.  If there's 
anything else I could do to help get someone willing up-to-speed then 
please just ask.


Cheers,
Matthew



Paul Richard Thomas via Gcc-patches Jan. 8, 2020, 7:30 p.m. | #5
[asan/hwasan co-author here, with clearly biased opinions]

On Android, HWASAN is already a fully usable testing tool.
We apply it to the kernel, user space system libraries, and select apps.
A phone with HWASAN-ified system is fully usable (I carry one as my
primary device since March 2019).
HWASAN has discovered over 120 bugs by now (heap-use-after-free,
heap/stack buffer overflows, stack-use-after-return, double free).
Many of the bugs were discovered during the everyday use (as opposed
to testing in the lab).
The overhead is low enough that on a top-tier CPU the user will rarely
notice any slowdown
(the increased battery drain *is* noticeable - compiler
instrumentation is not a substitute for hardware).
HWASAN has also helped discover 4 instances of future incompatibility
with MTE, all fixed.

The main benefit of HWASAN over ASAN is, as Matthew correctly
explains, the memory usage.
On embedded devices, this is often the difference between "can't
deploy" and "can deploy"
because, unlike in the server land, you can't install more RAM.

The other, more subtle benefit, is that HWASAN is more sensitive to
some types of bugs,
such as buffer-overflow-far-from-bounds or use-after-long-ago-free, etc.

MTE hardware is years away. Even once we have it in major new devices,
many smaller devices will still be running on Arm v8, for a decade or two.
As with ASAN/TSAN/UBSAN, having this sanitizer implemented in GCC will
vastly extend its user base and applicability and thus contribute to
the overall code quality and security.

Whether HWASAN should intercept libc functions or libc itself should
support HWASAN...
My strong opinion is that today the interception approach can only be
seen as a way to prototype.
ASAN, implemented in 2011, had to use interception because we needed
to get a new idea working fast.
However, over these 9 years, the interception caused an enormous
amount of complexity and user dissatisfaction.
The Android implementation of HWASAN (with hooks in the Bionic libc
and no interceptors) is
many times simpler, more robust, and more complete.
We need to do the same for other LIBCs, eventually, but we don't have
to do it immediately.

--kcc





Kyrill Tkachov Jan. 10, 2020, 4:16 p.m. | #6
On 1/8/20 11:26 AM, Matthew Malcomson wrote:
> I'm hoping I can find a maintainer willing to review and ACK this patch
> series -- especially with stage3 coming to a close soon.  If there's
> anything else I could do to help get someone willing up-to-speed then
> please just ask.

FWIW I've reviewed the aarch64 parts over the lifetime of the patch 
series and I am okay with them.

Given the reviews of the sanitiser, library and aarch64 backend 
components, and the data at

https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00387.html

how can we move forward with commit approval? Is this something a
global reviewer can help with, Jeff? :)

Thanks,

Kyrill



Matthew Malcomson Aug. 17, 2020, 2:12 p.m. | #7
Hello,

This is v4 of the HWASAN patches which add the LLVM hardware address
sanitizer (HWASAN) to GCC.
The document describing HWASAN can be found here:
http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html

This address sanitizer only works for AArch64 at the moment.  It
requires the "top byte ignore" feature where the top byte of a pointer
does not affect dereferences.  This is checked for by a backend hook so
that if other architectures have this feature HWASAN can be used for
them.

We require a linux kernel with the relaxed ABI to allow tagged pointers
in system calls.  This is in the linux mainline: I have been testing
this feature on 5.8.0, but it has been in since at least 5.5.0.

HWASAN works by storing a tag in the top bits of every pointer and a tag in
a shadow memory region corresponding to each area of memory pointed at.
On every memory access through a pointer the tag in the pointer is
checked against the tag in shadow memory corresponding to the memory the
pointer is accessing.  If the pointer tag and memory tag do not match
then a fault is signalled.
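
The check described above can be modelled in a few lines. This is only a simulation under stated assumptions: memory is a 256-byte arena addressed by offset, each 16-byte granule has one shadow tag byte, and a "pointer" is an offset paired with a tag. The names `tag_region` and `access_ok` are hypothetical and are not the libhwasan interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 16u      /* bytes of memory covered by one shadow tag */
#define ARENA_BYTES 256u /* size of the simulated memory */

/* Simulated shadow memory: one tag byte per granule.  */
static uint8_t shadow[ARENA_BYTES / GRANULE];

/* A simulated tagged pointer: an offset into the arena plus a tag.  */
struct tagged_ptr
{
  uint32_t offset;
  uint8_t tag;
};

/* Tag SIZE bytes starting at OFFSET and return a pointer carrying the
   same tag.  OFFSET and SIZE are assumed to be multiples of GRANULE.  */
static struct tagged_ptr
tag_region (uint32_t offset, uint32_t size, uint8_t tag)
{
  memset (&shadow[offset / GRANULE], tag, size / GRANULE);
  struct tagged_ptr p = { offset, tag };
  return p;
}

/* Return nonzero if accessing byte INDEX through P passes the check,
   i.e. the pointer tag equals the tag of the granule being accessed.  */
static int
access_ok (struct tagged_ptr p, uint32_t index)
{
  uint32_t addr = p.offset + index;
  if (addr >= ARENA_BYTES)
    return 0;
  return shadow[addr / GRANULE] == p.tag;
}
```

In this model an overflow into a neighbouring object fails the check because the neighbour has a different tag, and a use-after-free fails once the freed region has been retagged.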

The instrumentation required for this sanitizer has a large overlap with
the instrumentation required for implementing MTE (which has similar
functionality but checks are automatically done in the hardware and
instructions for tagging shadow memory and for managing tags are
provided by the architecture).
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-a-profile-architecture-2018-developments-armv85a

We hope to use the HWASAN framework to implement MTE for the stack, so I
have included a "dummy" patch showing how this might be done in the full
patch set attached to this cover letter.  (mte-approach.patch)



Mainly this is the same patch series as posted in December rebased onto
master, but I have also identified and fixed a few minor bugs.

  - A few bugs around the use of ptr_mode and Pmode
    (for AArch64 only observable with -mabi=ilp32).
  - A frame-extent calculation bug when every stack object in a frame is
    `large aligned` -- see the arguments that hwasan_emit_prologue
    passes to hwasan_emit_untag_frame.
  - A few changes of parameter names to match Clang.
    The names were originally chosen to match ASAN, but I found that
    most of these parameters can be mapped to a Clang configurable and
    that renaming them to match clang makes Makefiles (like
    scripts/Makefile.kasan in the linux kernel) easier to write.

Bootstrapped and regtested on x86_64 and AArch64.
Bootstrapped with `--with-build-config=bootstrap-hwasan` on AArch64 and
hwasan features tested there (all new regression failures accounted for
manually).
Built the linux kernel using this feature and ran the test_kasan.ko
testing to check that this works for the kernel.

(NOTE: Stack-tagging for the linux kernel has recently been added for
clang.  Testing GCC stack-tagging on the kernel showed two small bugs --
1) using __hwasan_tag_pointer for allocas which isn't available in the
kernel, 2) not avoiding zero tag due to the stack pointer having tag of
0xff. I'm working on fixing these but I'm sending up the patch series
as-is to get feedback earlier.  A small change to a kernel makefile and
some headers is needed to build it with this sanitizer -- a patch with
the changes and a text file explaining how to build and test the kernel
with HWASAN are attached:
  linux-for-gcc-hwasan.patch and testing-kasan.txt.)


Last time the patch set went up, Martin Liska had provided a patch to
build upon that added libhwasan into the libsanitizer directory, so the
patch set didn't contain anything to do that.

The "full" patchset attached to this email contains a patch that adds
this library so that anyone who wants to test this can do so, but there
is no corresponding email for the individual patch.
(introduce-libhwasan.patch)


NOTE:
  1) The target of having this sanitizer in GCC is for use on the Linux
  kernel.  The implementation should be good for the kernel, while the
  userspace story is not robust.

  The main use case of HWASAN is when it lies *underneath* the system
  libc.  In this approach the libc calls into libhwasan on important
  events (e.g. longjmp).
  This approach has worked very well in Android.
  At the moment there are no plans to do something similar on distros with a
  modified glibc.

  A userspace story can be made by intercepting important functions.
  LLVM maintains this approach for *testing* only, and hence it is not of
  production quality.
  Similarly we aim to use this "interception" model for testing, while
  keeping the focus of the GCC port on use in the kernel.

  2) This sanitizer has no handling of C++ exceptions.
  If an exception is thrown the "shadow stack" is left in an
  inconsistent state and will likely eventually cause a false positive
  later on in the program.
  This is due to the fact that the handling of exceptions in LLVM relies
  on having the frame record appear after any locals(*).  This
  restriction is not satisfied by GCC due to its frame layout
  optimisation.
  
  (*) https://github.com/llvm-mirror/compiler-rt/blob/master/lib/hwasan/hwasan_exceptions.cpp#L52



Entire patch set attached to this cover letter.
Matthew Malcomson Aug. 17, 2020, 2:13 p.m. | #8
Adding hwasan tests.

The only interesting thing here is that we have to make sure the tagging
mechanism is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

	* c-c++-common/hwasan/aligned-alloc.c: New test.
	* c-c++-common/hwasan/alloca-array-accessible.c: New test.
	* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
	* c-c++-common/hwasan/alloca-outside-caught.c: New test.
	* c-c++-common/hwasan/arguments.c: New test.
	* c-c++-common/hwasan/arguments-1.c: New test.
	* c-c++-common/hwasan/arguments-2.c: New test.
	* c-c++-common/hwasan/arguments-3.c: New test.
	* c-c++-common/hwasan/asan-pr63316.c: New test.
	* c-c++-common/hwasan/asan-pr70541.c: New test.
	* c-c++-common/hwasan/asan-pr78106.c: New test.
	* c-c++-common/hwasan/asan-pr79944.c: New test.
	* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
	* c-c++-common/hwasan/bitfield-1.c: New test.
	* c-c++-common/hwasan/bitfield-2.c: New test.
	* c-c++-common/hwasan/builtin-special-handling.c: New test.
	* c-c++-common/hwasan/check-interface.c: New test.
	* c-c++-common/hwasan/halt_on_error-1.c: New test.
	* c-c++-common/hwasan/heap-overflow.c: New test.
	* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
	* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
	* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
	* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
	* c-c++-common/hwasan/hwasan-thread-success.c: New test.
	* c-c++-common/hwasan/kernel-defaults.c: New test.
	* c-c++-common/hwasan/large-aligned-0.c: New test.
	* c-c++-common/hwasan/large-aligned-1.c: New test.
	* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
	* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
	* c-c++-common/hwasan/large-aligned-untagging-2.c: New test.
	* c-c++-common/hwasan/large-aligned-untagging-3.c: New test.
	* c-c++-common/hwasan/large-aligned-untagging-4.c: New test.
	* c-c++-common/hwasan/large-aligned-untagging-5.c: New test.
	* c-c++-common/hwasan/large-aligned-untagging-6.c: New test.
	* c-c++-common/hwasan/large-aligned-untagging-7.c: New test.
	* c-c++-common/hwasan/macro-definition.c: New test.
	* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
	* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
	* c-c++-common/hwasan/param-instrument-reads.c: New test.
	* c-c++-common/hwasan/param-instrument-writes.c: New test.
	* c-c++-common/hwasan/param-instrument-mem-intrinsics.c: New test.
	* c-c++-common/hwasan/random-frame-tag.c: New test.
	* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
	* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
	* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
	* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
	* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
	* c-c++-common/hwasan/stack-tagging-disable.c: New test.
	* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
	* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
	* c-c++-common/hwasan/use-after-free.c: New test.
	* c-c++-common/hwasan/vararray-outside-caught.c: New test.
	* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
	* c-c++-common/hwasan/very-large-objects.c: New test.
	* g++.dg/hwasan/hwasan.exp: New file.
	* g++.dg/hwasan/rvo-handled.C: New test.
	* gcc.dg/hwasan/hwasan.exp: New file.
	* gcc.dg/hwasan/nested-functions-0.c: New test.
	* gcc.dg/hwasan/nested-functions-1.c: New test.
	* gcc.dg/hwasan/nested-functions-2.c: New test.
	* lib/hwasan-dg.exp: New file.

Patch

diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index a1bc081..d81eb12 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -101,10 +101,16 @@  extern void init_internal_fns ();
 
 extern const char *const internal_fn_name_array[];
 
+
+extern bool hwasan_sanitize_p (void);
 static inline const char *
 internal_fn_name (enum internal_fn fn)
 {
-  return internal_fn_name_array[(int) fn];
+  const char *ret = internal_fn_name_array[(int) fn];
+  if (! strcmp (ret, "ASAN_MARK")
+      && hwasan_sanitize_p ())
+    return "HWASAN_MARK";
+  return ret;
 }
 
 extern internal_fn lookup_internal_fn (const char *);