follow SSA defs for asan base

Message ID ory2gmgeh7.fsf@lxoliva.fsfla.org
State New
Headers show
Series
  • follow SSA defs for asan base
Related show

Commit Message

Alexandre Oliva Jan. 21, 2021, 9:35 p.m.
Ada makes extensive use of nested functions, which turn all automatic
variables of the enclosing function that are used in nested ones into
members of an artificial FRAME record type.

The address of a local variable is usually passed to asan marking
functions without using a temporary.  Taking the address of a member
of FRAME within a nested function, however, is not regarded as a
gimple val: while introducing FRAME variables, current_function_decl
is always the outermost function, even while processing a nested
function, so decl_address_invariant_p returns false for such
ADDR_EXPRs.  So, as automatic variables are moved into FRAME, any asan
call that marks such a variable has its ADDR_EXPR replaced with a
SSA_NAME set to the ADDR_EXPR of the FRAME member.

asan_expand_mark_ifn was not prepared to deal with ADDR_EXPRs split
out into SSA_NAMEs.  This patch deals with such cases.

[It does NOT deal with PHI nodes and whatnot.  I'm not even sure it
should.  Maybe we want the ADDR_EXPR to be a gimple val instead, but
this more conservative fix felt more appropriate for this stage.]

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

	* asan.c (asan_expand_mark_ifn): Follow SSA_NAME defs for
	an ADDR_EXPR base.

for  gcc/testsuite/ChangeLog

	* gcc.dg/asan/nested-1.c: New.
---
 gcc/asan.c                           |   21 +++++++++++++++++++++
 gcc/testsuite/gcc.dg/asan/nested-1.c |   24 ++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/asan/nested-1.c



-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist         GNU Toolchain Engineer
        Vim, Vi, Voltei pro Emacs -- GNUlius Caesar

Comments

H.J. Lu via Gcc-patches Jan. 22, 2021, 9:02 a.m. | #1
On Thu, Jan 21, 2021 at 06:35:48PM -0300, Alexandre Oliva wrote:
> 

> Ada makes extensive use of nested functions, which turn all automatic

> variables of the enclosing function that are used in nested ones into

> members of an artificial FRAME record type.

> 

> The address of a local variable is usually passed to asan marking

> functions without using a temporary.  Taking the address of a member

> of FRAME within a nested function, however, is not regarded as a

> gimple val: while introducing FRAME variables, current_function_decl

> is always the outermost function, even while processing a nested

> function, so decl_address_invariant_p returns false for such

> ADDR_EXPRs.  So, as automatic variables are moved into FRAME, any asan

> call that marks such a variable has its ADDR_EXPR replaced with a

> SSA_NAME set to the ADDR_EXPR of the FRAME member.

> 

> asan_expand_mark_ifn was not prepared to deal with ADDR_EXPRs split

> out into SSA_NAMEs.  This patch deals with such cases.

> 

> [It does NOT deal with PHI nodes and whatnot.  I'm not even sure it

> should.  Maybe we want the ADDR_EXPR to be a gimple val instead, but

> this more conservative fix felt more appropriate for this stage.]

> 

> Regstrapped on x86_64-linux-gnu.  Ok to install?

> 

> 

> for  gcc/ChangeLog

> 

> 	* asan.c (asan_expand_mark_ifn): Follow SSA_NAME defs for

> 	an ADDR_EXPR base.

> 

> for  gcc/testsuite/ChangeLog

> 

> 	* gcc.dg/asan/nested-1.c: New.


Ok, thanks (and thanks for coming with a C testcase, for me as not an Ada
speaker that helps a lot).

	Jakub
H.J. Lu via Gcc-patches Jan. 22, 2021, 2:16 p.m. | #2
On Thu, Jan 21, 2021 at 10:36 PM Alexandre Oliva <oliva@adacore.com> wrote:
>

>

> Ada makes extensive use of nested functions, which turn all automatic

> variables of the enclosing function that are used in nested ones into

> members of an artificial FRAME record type.

>

> The address of a local variable is usually passed to asan marking

> functions without using a temporary.  Taking the address of a member

> of FRAME within a nested function, however, is not regarded as a

> gimple val: while introducing FRAME variables, current_function_decl

> is always the outermost function, even while processing a nested

> function, so decl_address_invariant_p returns false for such

> ADDR_EXPRs.  So, as automatic variables are moved into FRAME, any asan

> call that marks such a variable has its ADDR_EXPR replaced with a

> SSA_NAME set to the ADDR_EXPR of the FRAME member.

>

> asan_expand_mark_ifn was not prepared to deal with ADDR_EXPRs split

> out into SSA_NAMEs.  This patch deals with such cases.

>

> [It does NOT deal with PHI nodes and whatnot.  I'm not even sure it

> should.  Maybe we want the ADDR_EXPR to be a gimple val instead, but

> this more conservative fix felt more appropriate for this stage.]


Yeah, I guess such addresses could be decl_address_invariant_p by changing
the

          || DECL_CONTEXT (op) == current_function_decl
          || decl_function_context (op) == current_function_decl)

to sth like

       || auto_var_p (op)

but I guess it won't help much since the access in the nested function
will be through the static chain pointer and thus &chain->member
which definitely isn't a gimple val.  When it still looks like &<FRAME>.member
it could be but that arises more re-gimplification needs during the
nested function lowering then.

>

> Regstrapped on x86_64-linux-gnu.  Ok to install?


few comments below

>

> for  gcc/ChangeLog

>

>         * asan.c (asan_expand_mark_ifn): Follow SSA_NAME defs for

>         an ADDR_EXPR base.

>

> for  gcc/testsuite/ChangeLog

>

>         * gcc.dg/asan/nested-1.c: New.

> ---

>  gcc/asan.c                           |   21 +++++++++++++++++++++

>  gcc/testsuite/gcc.dg/asan/nested-1.c |   24 ++++++++++++++++++++++++

>  2 files changed, 45 insertions(+)

>  create mode 100644 gcc/testsuite/gcc.dg/asan/nested-1.c

>

> diff --git a/gcc/asan.c b/gcc/asan.c

> index 89ecd99b18294..2d2fb97098b2f 100644

> --- a/gcc/asan.c

> +++ b/gcc/asan.c

> @@ -3629,6 +3629,27 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter)

>    bool is_poison = ((asan_mark_flags)flag) == ASAN_MARK_POISON;

>

>    tree base = gimple_call_arg (g, 1);

> +  while (TREE_CODE (base) == SSA_NAME)

> +    {

> +      gimple *def = SSA_NAME_DEF_STMT (base);

> +      if (!def)

> +       break;

> +

> +      if (!is_gimple_assign (def))

> +       break;

> +

> +      if (!SINGLE_SSA_TREE_OPERAND (def, SSA_OP_DEF))

> +       break;


redundant

> +      if (gimple_num_ops (def) != 2)

> +       break;


likewise

> +      if (gimple_expr_code (def) == ADDR_EXPR

> +         || gimple_expr_code (def) == SSA_NAME)


Use gimple_assign_rhs_code (def), not gimple_expr_code.

> +       base = gimple_assign_rhs1 (def);

> +      else

> +       break;

> +    }


note with the above the assert below will become dangerous
if you think of PRE/CSE or other things we could do to the
split out address.

>    gcc_checking_assert (TREE_CODE (base) == ADDR_EXPR);

>    tree decl = TREE_OPERAND (base, 0);

>

> diff --git a/gcc/testsuite/gcc.dg/asan/nested-1.c b/gcc/testsuite/gcc.dg/asan/nested-1.c

> new file mode 100644

> index 0000000000000..87e842098077c

> --- /dev/null

> +++ b/gcc/testsuite/gcc.dg/asan/nested-1.c

> @@ -0,0 +1,24 @@

> +/* { dg-do compile } */

> +/* { dg-options "-fsanitize=address" } */

> +

> +int f(int i) {

> +  auto int h() {

> +    int r;

> +    int *p;

> +

> +    {

> +      int x[3];

> +

> +      auto int g() {

> +       return x[i];

> +      }

> +

> +      p = &r;

> +      *p = g();

> +    }

> +

> +    return *p;

> +  }

> +

> +  return h();

> +}

>

>

> --

> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/

>    Free Software Activist         GNU Toolchain Engineer

>         Vim, Vi, Voltei pro Emacs -- GNUlius Caesar
Alexandre Oliva Jan. 26, 2021, 9:52 a.m. | #3
On Jan 22, 2021, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

> Yeah, I guess such addresses could be decl_address_invariant_p by changing

> the


>           || DECL_CONTEXT (op) == current_function_decl

>           || decl_function_context (op) == current_function_decl)


> to sth like


>        || auto_var_p (op)



> but I guess it won't help much since the access in the nested function

> will be through the static chain pointer and thus &chain->member

> which definitely isn't a gimple val.


I think we only get poison/unpoison calls in the scope of the automatic
variables, so it won't be the case that these calls will get such
indirect frame references: they will only get references to the own
function's frame, and those have invariant addresses, and thus can be
regarded as such, and as gimple vals.

So I gave this alternate change a spin, and both regstrap and asan+ubsan
bootstrap completed successfully.

Given the considerations of risk about the assert you pointed out, I'm
now inclined to regard this change as safer and superior.  Do you all
concur?  Ok to install if so?

(I see the code in tree.c was untabified, and one of my changes
introduced a tab that misaligned stuff.  Tabify, untabify, or leave it
inconsistent as in the tested patch below?


regard the address of auto vars and consts as invariant

From: Alexandre Oliva <oliva@adacore.com>


Ada makes extensive use of nested functions, which turn all automatic
variables of the enclosing function that are used in nested ones into
members of an artificial FRAME record type.

The address of a local variable is usually passed to asan marking
functions without using a temporary.  asan_expand_mark_ifn will reject
an ADDR_EXPRs if it's split out from the call into an SSA_NAMEs.

Taking the address of a member of FRAME within a nested function was
not regarded as a gimple val: while introducing FRAME variables,
current_function_decl pointed to the outermost function, even while
processing a nested function, so decl_address_invariant_p, checking
that the context of the variable is current_function_decl, returned
false for such ADDR_EXPRs.

This patch changes decl_address_invariant_p to disregard
current_function_decl, and regard all automatic variables as having
invariant addresses, regardless of nesting.  This may initially
include references to variables in other nesting levels, but once they
become references to enclosing frames, the indirection makes them
non-gimple_vals.  As long as poisoning and unpoisoning calls doesn't
kick in for variables in other frames, this shouldn't be a problem.


for  gcc/ChangeLog

	* tree.c (decl_address_invariant_p): Accept auto variables and
	constants.

for  gcc/testsuite/ChangeLog

	* gcc.dg/asan/nested-1.c: New.
---
 gcc/testsuite/gcc.dg/asan/nested-1.c |   24 ++++++++++++++++++++++++
 gcc/tree.c                           |    5 ++---
 2 files changed, 26 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/asan/nested-1.c

diff --git a/gcc/testsuite/gcc.dg/asan/nested-1.c b/gcc/testsuite/gcc.dg/asan/nested-1.c
new file mode 100644
index 0000000000000..87e842098077c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/nested-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=address" } */
+
+int f(int i) {
+  auto int h() {
+    int r;
+    int *p;
+
+    {
+      int x[3];
+
+      auto int g() {
+	return x[i];
+      }
+
+      p = &r;
+      *p = g();
+    }
+
+    return *p;
+  }
+
+  return h();
+}
diff --git a/gcc/tree.c b/gcc/tree.c
index 287e5001dc3b3..3de3085f42c8a 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -3590,14 +3590,13 @@ decl_address_invariant_p (const_tree op)
     case VAR_DECL:
       if ((TREE_STATIC (op) || DECL_EXTERNAL (op))
           || DECL_THREAD_LOCAL_P (op)
-          || DECL_CONTEXT (op) == current_function_decl
-          || decl_function_context (op) == current_function_decl)
+	  || auto_var_p (op))
         return true;
       break;
 
     case CONST_DECL:
       if ((TREE_STATIC (op) || DECL_EXTERNAL (op))
-          || decl_function_context (op) == current_function_decl)
+          || auto_var_p (op))
         return true;
       break;
 


-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist         GNU Toolchain Engineer
        Vim, Vi, Voltei pro Emacs -- GNUlius Caesar
H.J. Lu via Gcc-patches Jan. 26, 2021, 10:28 a.m. | #4
On Tue, Jan 26, 2021 at 10:52 AM Alexandre Oliva <oliva@adacore.com> wrote:
>

> On Jan 22, 2021, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

>

> > Yeah, I guess such addresses could be decl_address_invariant_p by changing

> > the

>

> >           || DECL_CONTEXT (op) == current_function_decl

> >           || decl_function_context (op) == current_function_decl)

>

> > to sth like

>

> >        || auto_var_p (op)

>

>

> > but I guess it won't help much since the access in the nested function

> > will be through the static chain pointer and thus &chain->member

> > which definitely isn't a gimple val.

>

> I think we only get poison/unpoison calls in the scope of the automatic

> variables, so it won't be the case that these calls will get such

> indirect frame references: they will only get references to the own

> function's frame, and those have invariant addresses, and thus can be

> regarded as such, and as gimple vals.

>

> So I gave this alternate change a spin, and both regstrap and asan+ubsan

> bootstrap completed successfully.

>

> Given the considerations of risk about the assert you pointed out, I'm

> now inclined to regard this change as safer and superior.  Do you all

> concur?  Ok to install if so?

>

> (I see the code in tree.c was untabified, and one of my changes

> introduced a tab that misaligned stuff.  Tabify, untabify, or leave it

> inconsistent as in the tested patch below?

>

>

> regard the address of auto vars and consts as invariant

>

> From: Alexandre Oliva <oliva@adacore.com>

>

> Ada makes extensive use of nested functions, which turn all automatic

> variables of the enclosing function that are used in nested ones into

> members of an artificial FRAME record type.

>

> The address of a local variable is usually passed to asan marking

> functions without using a temporary.  asan_expand_mark_ifn will reject

> an ADDR_EXPRs if it's split out from the call into an SSA_NAMEs.

>

> Taking the address of a member of FRAME within a nested function was

> not regarded as a gimple val: while introducing FRAME variables,

> current_function_decl pointed to the outermost function, even while

> processing a nested function, so decl_address_invariant_p, checking

> that the context of the variable is current_function_decl, returned

> false for such ADDR_EXPRs.

>

> This patch changes decl_address_invariant_p to disregard

> current_function_decl, and regard all automatic variables as having

> invariant addresses, regardless of nesting.  This may initially

> include references to variables in other nesting levels, but once they

> become references to enclosing frames, the indirection makes them

> non-gimple_vals.  As long as poisoning and unpoisoning calls doesn't

> kick in for variables in other frames, this shouldn't be a problem.


So while I think it's safe let's look at if we can improve tree-nested.c,
like I see (probably not the correct place):

static tree
convert_nonlocal_reference_op (tree *tp, int *walk_subtrees, void *data)
{
...
            /* If we changed anything, we might no longer be directly
               referencing a decl.  */
            save_context = current_function_decl;
            current_function_decl = info->context;
            recompute_tree_invariant_for_addr_expr (t);
            current_function_decl = save_context;

            /* If the callback converted the address argument in a context
               where we only accept variables (and min_invariant, presumably),
               then compute the address into a temporary.  */
            if (save_val_only)
              *tp = gsi_gimplify_val ((struct nesting_info *) wi->info,
                                      t, &wi->gsi);

seeing how we adjust current_function_decl around the
recompute_tree_invariant_for_addr_expr call but not the
gsi_gimplify_val one (we already pass it a nesting_info,
not sure if wi->info is the same as the 'info' used above though),
so eventually we can fix it in one place?

I see the testcase ICEs only at -O0 and with -O we CCP the
address so we definitely treat it as invariant just not in the
skewed context tree-nested operates in.

Richard.

>

> for  gcc/ChangeLog

>

>         * tree.c (decl_address_invariant_p): Accept auto variables and

>         constants.

>

> for  gcc/testsuite/ChangeLog

>

>         * gcc.dg/asan/nested-1.c: New.

> ---

>  gcc/testsuite/gcc.dg/asan/nested-1.c |   24 ++++++++++++++++++++++++

>  gcc/tree.c                           |    5 ++---

>  2 files changed, 26 insertions(+), 3 deletions(-)

>  create mode 100644 gcc/testsuite/gcc.dg/asan/nested-1.c

>

> diff --git a/gcc/testsuite/gcc.dg/asan/nested-1.c b/gcc/testsuite/gcc.dg/asan/nested-1.c

> new file mode 100644

> index 0000000000000..87e842098077c

> --- /dev/null

> +++ b/gcc/testsuite/gcc.dg/asan/nested-1.c

> @@ -0,0 +1,24 @@

> +/* { dg-do compile } */

> +/* { dg-options "-fsanitize=address" } */

> +

> +int f(int i) {

> +  auto int h() {

> +    int r;

> +    int *p;

> +

> +    {

> +      int x[3];

> +

> +      auto int g() {

> +       return x[i];

> +      }

> +

> +      p = &r;

> +      *p = g();

> +    }

> +

> +    return *p;

> +  }

> +

> +  return h();

> +}

> diff --git a/gcc/tree.c b/gcc/tree.c

> index 287e5001dc3b3..3de3085f42c8a 100644

> --- a/gcc/tree.c

> +++ b/gcc/tree.c

> @@ -3590,14 +3590,13 @@ decl_address_invariant_p (const_tree op)

>      case VAR_DECL:

>        if ((TREE_STATIC (op) || DECL_EXTERNAL (op))

>            || DECL_THREAD_LOCAL_P (op)

> -          || DECL_CONTEXT (op) == current_function_decl

> -          || decl_function_context (op) == current_function_decl)

> +         || auto_var_p (op))

>          return true;

>        break;

>

>      case CONST_DECL:

>        if ((TREE_STATIC (op) || DECL_EXTERNAL (op))

> -          || decl_function_context (op) == current_function_decl)

> +          || auto_var_p (op))

>          return true;

>        break;

>

>

>

> --

> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/

>    Free Software Activist         GNU Toolchain Engineer

>         Vim, Vi, Voltei pro Emacs -- GNUlius Caesar
Alexandre Oliva Jan. 27, 2021, 12:29 p.m. | #5
On Jan 26, 2021, Richard Biener <richard.guenther@gmail.com> wrote:

> So while I think it's safe let's look at if we can improve tree-nested.c,

> like I see (probably not the correct place):


*nod*, it's just not the *only* place.

> seeing how we adjust current_function_decl around the

> recompute_tree_invariant_for_addr_expr call but not the

> gsi_gimplify_val one (we already pass it a nesting_info,

> not sure if wi->info is the same as the 'info' used above though),

> so eventually we can fix it in one place?


There are pieces of nested function lowering for which we set cfun and
current_function_decl while walking each function, and there are other
pieces that just don't bother, and we only set up current_function_decl
temporarily for ADDR_EXPR handling.

This patch adjusts both of the ADDR_EXPR handlers that override
current_function_decl, so that the temporary overriding remains in
effect during the re-gimplification.  That is enough to avoid the
problem.  But I'm not very happy with this temporary overriding, it
seems fishy.  I'd rather we set things up for the entire duration of the
walking of each function.

But that's only relevant because we rely on current_function_decl for
address handling.  It's not clear to me that we should, as the other
patch demonstrated.  With it, we could probably even do away with these
overriders.

But, for this stage, this is probably as conservative a change as we
could possibly hope for.  I've regstrapped it on x86_64-linux-gnu, and
also bootstrapped it with asan and ubsan.  Ok to install?


restore current_function_decl after re-gimplifying nested ADDR_EXPRs

From: Alexandre Oliva <oliva@adacore.com>


Ada makes extensive use of nested functions, which turn all automatic
variables of the enclosing function that are used in nested ones into
members of an artificial FRAME record type.

The address of a local variable is usually passed to asan marking
functions without using a temporary.  asan_expand_mark_ifn will reject
an ADDR_EXPRs if it's split out from the call into an SSA_NAMEs.

Taking the address of a member of FRAME within a nested function was
not regarded as a gimple val: while introducing FRAME variables,
current_function_decl pointed to the outermost function, even while
processing a nested function, so decl_address_invariant_p, checking
that the context of the variable is current_function_decl, returned
false for such ADDR_EXPRs.

decl_address_invariant_p, called when determining whether an
expression is a legitimate gimple value, compares the context of
automatic variables with current_function_decl.  Some of the
tree-nested function processing doesn't set current_function_decl, but
ADDR_EXPR-processing bits temporarily override it.  However, they
restore it before re-gimplifying, which causes even ADDR_EXPRs
referencing automatic variables in the FRAME struct of a nested
function to not be regarded as address-invariant.

This patch moves the restores of current_function_decl in the
ADDR_EXPR-handling bits after the re-gimplification, so that the
correct current_function_decl is used when testing for address
invariance.


for  gcc/ChangeLog

	* tree-nested.c (convert_nonlocal_reference_op): Move
	current_function_decl restore after re-gimplification.
	(convert_local_reference_op): Likewise.

for  gcc/testsuite/ChangeLog

	* gcc.dg/asan/nested-1.c: New.
---
 gcc/testsuite/gcc.dg/asan/nested-1.c |   24 ++++++++++++++++++++++++
 gcc/tree-nested.c                    |    4 ++--
 2 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/asan/nested-1.c

diff --git a/gcc/testsuite/gcc.dg/asan/nested-1.c b/gcc/testsuite/gcc.dg/asan/nested-1.c
new file mode 100644
index 0000000000000..87e842098077c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/nested-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=address" } */
+
+int f(int i) {
+  auto int h() {
+    int r;
+    int *p;
+
+    {
+      int x[3];
+
+      auto int g() {
+	return x[i];
+      }
+
+      p = &r;
+      *p = g();
+    }
+
+    return *p;
+  }
+
+  return h();
+}
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index 1b52669b622aa..addd6eef9aba6 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -1214,7 +1214,6 @@ convert_nonlocal_reference_op (tree *tp, int *walk_subtrees, void *data)
 	    save_context = current_function_decl;
 	    current_function_decl = info->context;
 	    recompute_tree_invariant_for_addr_expr (t);
-	    current_function_decl = save_context;
 
 	    /* If the callback converted the address argument in a context
 	       where we only accept variables (and min_invariant, presumably),
@@ -1222,6 +1221,7 @@ convert_nonlocal_reference_op (tree *tp, int *walk_subtrees, void *data)
 	    if (save_val_only)
 	      *tp = gsi_gimplify_val ((struct nesting_info *) wi->info,
 				      t, &wi->gsi);
+	    current_function_decl = save_context;
 	  }
       }
       break;
@@ -1969,13 +1969,13 @@ convert_local_reference_op (tree *tp, int *walk_subtrees, void *data)
 	  save_context = current_function_decl;
 	  current_function_decl = info->context;
 	  recompute_tree_invariant_for_addr_expr (t);
-	  current_function_decl = save_context;
 
 	  /* If we are in a context where we only accept values, then
 	     compute the address into a temporary.  */
 	  if (save_val_only)
 	    *tp = gsi_gimplify_val ((struct nesting_info *) wi->info,
 				    t, &wi->gsi);
+	  current_function_decl = save_context;
 	}
       break;
 


-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist         GNU Toolchain Engineer
        Vim, Vi, Voltei pro Emacs -- GNUlius Caesar
H.J. Lu via Gcc-patches Jan. 27, 2021, 2:58 p.m. | #6
On Wed, Jan 27, 2021 at 1:29 PM Alexandre Oliva <oliva@adacore.com> wrote:
>

> On Jan 26, 2021, Richard Biener <richard.guenther@gmail.com> wrote:

>

> > So while I think it's safe let's look at if we can improve tree-nested.c,

> > like I see (probably not the correct place):

>

> *nod*, it's just not the *only* place.

>

> > seeing how we adjust current_function_decl around the

> > recompute_tree_invariant_for_addr_expr call but not the

> > gsi_gimplify_val one (we already pass it a nesting_info,

> > not sure if wi->info is the same as the 'info' used above though),

> > so eventually we can fix it in one place?

>

> There are pieces of nested function lowering for which we set cfun and

> current_function_decl while walking each function, and there are other

> pieces that just don't bother, and we only set up current_function_decl

> temporarily for ADDR_EXPR handling.

>

> This patch adjusts both of the ADDR_EXPR handlers that override

> current_function_decl, so that the temporary overriding remains in

> effect during the re-gimplification.  That is enough to avoid the

> problem.  But I'm not very happy with this temporary overriding, it

> seems fishy.  I'd rather we set things up for the entire duration of the

> walking of each function.

>

> But that's only relevant because we rely on current_function_decl for

> address handling.  It's not clear to me that we should, as the other

> patch demonstrated.  With it, we could probably even do away with these

> overriders.


True, but I guess at least documentation of the predicates need to be
more precise.  I've also considered making the current_function_decl
accesses function parameters instead.  We do have
decl_address_ip_invariant_p on which decl_address_invariant_p
could build on by simply doing

 decl_address_ip_invariant_p (op) || auto_var_p (op)

(if there were not the strange STRING_CST handling in the _ip_ variant)

> But, for this stage, this is probably as conservative a change as we

> could possibly hope for.  I've regstrapped it on x86_64-linux-gnu, and

> also bootstrapped it with asan and ubsan.  Ok to install?


Yes, OK.

Thanks,
Richard.

>

> restore current_function_decl after re-gimplifying nested ADDR_EXPRs

>

> From: Alexandre Oliva <oliva@adacore.com>

>

> Ada makes extensive use of nested functions, which turn all automatic

> variables of the enclosing function that are used in nested ones into

> members of an artificial FRAME record type.

>

> The address of a local variable is usually passed to asan marking

> functions without using a temporary.  asan_expand_mark_ifn will reject

> an ADDR_EXPRs if it's split out from the call into an SSA_NAMEs.

>

> Taking the address of a member of FRAME within a nested function was

> not regarded as a gimple val: while introducing FRAME variables,

> current_function_decl pointed to the outermost function, even while

> processing a nested function, so decl_address_invariant_p, checking

> that the context of the variable is current_function_decl, returned

> false for such ADDR_EXPRs.

>

> decl_address_invariant_p, called when determining whether an

> expression is a legitimate gimple value, compares the context of

> automatic variables with current_function_decl.  Some of the

> tree-nested function processing doesn't set current_function_decl, but

> ADDR_EXPR-processing bits temporarily override it.  However, they

> restore it before re-gimplifying, which causes even ADDR_EXPRs

> referencing automatic variables in the FRAME struct of a nested

> function to not be regarded as address-invariant.

>

> This patch moves the restores of current_function_decl in the

> ADDR_EXPR-handling bits after the re-gimplification, so that the

> correct current_function_decl is used when testing for address

> invariance.

>

>

> for  gcc/ChangeLog

>

>         * tree-nested.c (convert_nonlocal_reference_op): Move

>         current_function_decl restore after re-gimplification.

>         (convert_local_reference_op): Likewise.

>

> for  gcc/testsuite/ChangeLog

>

>         * gcc.dg/asan/nested-1.c: New.

> ---

>  gcc/testsuite/gcc.dg/asan/nested-1.c |   24 ++++++++++++++++++++++++

>  gcc/tree-nested.c                    |    4 ++--

>  2 files changed, 26 insertions(+), 2 deletions(-)

>  create mode 100644 gcc/testsuite/gcc.dg/asan/nested-1.c

>

> diff --git a/gcc/testsuite/gcc.dg/asan/nested-1.c b/gcc/testsuite/gcc.dg/asan/nested-1.c

> new file mode 100644

> index 0000000000000..87e842098077c

> --- /dev/null

> +++ b/gcc/testsuite/gcc.dg/asan/nested-1.c

> @@ -0,0 +1,24 @@

> +/* { dg-do compile } */

> +/* { dg-options "-fsanitize=address" } */

> +

> +int f(int i) {

> +  auto int h() {

> +    int r;

> +    int *p;

> +

> +    {

> +      int x[3];

> +

> +      auto int g() {

> +       return x[i];

> +      }

> +

> +      p = &r;

> +      *p = g();

> +    }

> +

> +    return *p;

> +  }

> +

> +  return h();

> +}

> diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c

> index 1b52669b622aa..addd6eef9aba6 100644

> --- a/gcc/tree-nested.c

> +++ b/gcc/tree-nested.c

> @@ -1214,7 +1214,6 @@ convert_nonlocal_reference_op (tree *tp, int *walk_subtrees, void *data)

>             save_context = current_function_decl;

>             current_function_decl = info->context;

>             recompute_tree_invariant_for_addr_expr (t);

> -           current_function_decl = save_context;

>

>             /* If the callback converted the address argument in a context

>                where we only accept variables (and min_invariant, presumably),

> @@ -1222,6 +1221,7 @@ convert_nonlocal_reference_op (tree *tp, int *walk_subtrees, void *data)

>             if (save_val_only)

>               *tp = gsi_gimplify_val ((struct nesting_info *) wi->info,

>                                       t, &wi->gsi);

> +           current_function_decl = save_context;

>           }

>        }

>        break;

> @@ -1969,13 +1969,13 @@ convert_local_reference_op (tree *tp, int *walk_subtrees, void *data)

>           save_context = current_function_decl;

>           current_function_decl = info->context;

>           recompute_tree_invariant_for_addr_expr (t);

> -         current_function_decl = save_context;

>

>           /* If we are in a context where we only accept values, then

>              compute the address into a temporary.  */

>           if (save_val_only)

>             *tp = gsi_gimplify_val ((struct nesting_info *) wi->info,

>                                     t, &wi->gsi);

> +         current_function_decl = save_context;

>         }

>        break;

>

>

>

> --

> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/

>    Free Software Activist         GNU Toolchain Engineer

>         Vim, Vi, Voltei pro Emacs -- GNUlius Caesar

Patch

diff --git a/gcc/asan.c b/gcc/asan.c
index 89ecd99b18294..2d2fb97098b2f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -3629,6 +3629,27 @@  asan_expand_mark_ifn (gimple_stmt_iterator *iter)
   bool is_poison = ((asan_mark_flags)flag) == ASAN_MARK_POISON;
 
   tree base = gimple_call_arg (g, 1);
+  while (TREE_CODE (base) == SSA_NAME)
+    {
+      gimple *def = SSA_NAME_DEF_STMT (base);
+      if (!def)
+	break;
+
+      if (!is_gimple_assign (def))
+	break;
+
+      if (!SINGLE_SSA_TREE_OPERAND (def, SSA_OP_DEF))
+	break;
+
+      if (gimple_num_ops (def) != 2)
+	break;
+
+      if (gimple_expr_code (def) == ADDR_EXPR
+	  || gimple_expr_code (def) == SSA_NAME)
+	base = gimple_assign_rhs1 (def);
+      else
+	break;
+    }
   gcc_checking_assert (TREE_CODE (base) == ADDR_EXPR);
   tree decl = TREE_OPERAND (base, 0);
 
diff --git a/gcc/testsuite/gcc.dg/asan/nested-1.c b/gcc/testsuite/gcc.dg/asan/nested-1.c
new file mode 100644
index 0000000000000..87e842098077c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/nested-1.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=address" } */
+
+int f(int i) {
+  auto int h() {
+    int r;
+    int *p;
+
+    {
+      int x[3];
+
+      auto int g() {
+	return x[i];
+      }
+
+      p = &r;
+      *p = g();
+    }
+
+    return *p;
+  }
+
+  return h();
+}