[2/2] Fix alignment for local variable [PR90811]

Message ID 20200327180656.83131-2-kito.cheng@sifive.com
State New
Headers show
Series
  • [1/2] Move out increase_alignment into ipa-increase-alignment.cc
Related show

Commit Message

Kito Cheng March 27, 2020, 6:06 p.m.
- The alignment for local variable was adjust during estimate_stack_frame_size,
   however it seems wrong spot to adjust that, expand phase will adjust that
   but it little too late to some gimple optimization, which rely on certain
   target hooks need to check alignment, forwprop is an example for
   that, result of simplify_builtin_call rely on the alignment on some
   target like ARM or RISC-V.

 - So we must adjust at some point, and here is already a pass to adjust
   alignment, which is ipa-increase-alignment, used for tree vectorization.

 - This patch fix gfortran.dg/pr45636.f90 for arm and riscv.

 - Regression test on riscv32/riscv64 and x86_64-linux-gnu, no new fail
   introduced.

gcc/ChangeLog

	PR target/90811
	* ipa-increase-alignment.cc (increase_alignment_local_var): New.
	(increase_alignment_global_var): New.
	(pass_ipa_increase_alignment::gate): Remove.
---
 gcc/ipa-increase-alignment.cc | 45 ++++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 9 deletions(-)

-- 
2.25.2

Comments

Andrew Pinski March 27, 2020, 6:15 p.m. | #1
On Fri, Mar 27, 2020 at 11:07 AM Kito Cheng <kito.cheng@sifive.com> wrote:
>

>  - The alignment for local variable was adjust during estimate_stack_frame_size,

>    however it seems wrong spot to adjust that, expand phase will adjust that

>    but it little too late to some gimple optimization, which rely on certain

>    target hooks need to check alignment, forwprop is an example for

>    that, result of simplify_builtin_call rely on the alignment on some

>    target like ARM or RISC-V.

>

>  - So we must adjust at some point, and here is already a pass to adjust

>    alignment, which is ipa-increase-alignment, used for tree vectorization.

>

>  - This patch fix gfortran.dg/pr45636.f90 for arm and riscv.

>

>  - Regression test on riscv32/riscv64 and x86_64-linux-gnu, no new fail

>    introduced.

>

> gcc/ChangeLog

>

>         PR target/90811

>         * ipa-increase-alignment.cc (increase_alignment_local_var): New.

>         (increase_alignment_global_var): New.

>         (pass_ipa_increase_alignment::gate): Remove.

> ---

>  gcc/ipa-increase-alignment.cc | 45 ++++++++++++++++++++++++++++-------

>  1 file changed, 36 insertions(+), 9 deletions(-)

>

> diff --git a/gcc/ipa-increase-alignment.cc b/gcc/ipa-increase-alignment.cc

> index 6e34124bc03..8f28ad83349 100644

> --- a/gcc/ipa-increase-alignment.cc

> +++ b/gcc/ipa-increase-alignment.cc

> @@ -148,10 +148,35 @@ get_vec_alignment_for_type (tree type)

>    return (alignment > TYPE_ALIGN (type)) ? alignment : 0;

>  }

>

> -/* Entry point to increase_alignment pass.  */

> -static unsigned int

> -increase_alignment (void)

> +/* Adjust alignment for local variable.  */

> +static void

> +increase_alignment_local_var (void)

> +{

> +  size_t i;

> +  tree var;

> +  struct cgraph_node *node;

> +  unsigned int align;

> +

> +  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)

> +    {

> +      function *fun = node->get_fun ();

> +      FOR_EACH_LOCAL_DECL (fun, i, var)

> +       {

> +         align = LOCAL_DECL_ALIGNMENT (var);

> +

> +         SET_DECL_ALIGN (var, align);



I think this is wrong if the user has set the alignment already.
You need to check DECL_USER_ALIGN.

Thanks,
Andrew Pinski

> +       }

> +    }

> +}

> +

> +/* Adjust alignment for global variable, only used for tree vectorization

> +   currently.  */

> +static void

> +increase_alignment_global_var (void)

>  {

> +  if (!(flag_section_anchors && flag_tree_loop_vectorize))

> +    return;

> +

>    varpool_node *vnode;

>

>    vect_location = dump_user_location_t ();

> @@ -178,6 +203,14 @@ increase_alignment (void)

>      }

>

>    delete type_align_map;

> +}

> +

> +/* Entry point to increase_alignment pass.  */

> +static unsigned int

> +increase_alignment (void)

> +{

> +  increase_alignment_local_var ();

> +  increase_alignment_global_var ();

>    return 0;

>  }

>

> @@ -204,12 +237,6 @@ public:

>      : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt)

>    {}

>

> -  /* opt_pass methods: */

> -  virtual bool gate (function *)

> -    {

> -      return flag_section_anchors && flag_tree_loop_vectorize;

> -    }

> -

>    virtual unsigned int execute (function *) { return increase_alignment (); }

>

>  }; // class pass_ipa_increase_alignment

> --

> 2.25.2

>
Xionghu Luo via Gcc-patches March 27, 2020, 6:26 p.m. | #2
On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote:
> 	PR target/90811

> 	* ipa-increase-alignment.cc (increase_alignment_local_var): New.

> 	(increase_alignment_global_var): New.

> 	(pass_ipa_increase_alignment::gate): Remove.


I'm afraid this is too early, pass_ipa_increase_alignment is part of
all_small_ipa_passes list, those are run before the LTO streaming.
See cgraphunit.c (ipa_passes).
For OpenMP/OpenACC offloading, this has to run after the streaming,
which means either a pass in the all_regular_ipa_passes list, or much better
(I don't see any advantage of this being an IPA pass) some new pass that is
run very early in the all_passes list.

At which point I don't really see the benefit of moving this into a new file
either.

	Jakub
Xionghu Luo via Gcc-patches March 30, 2020, 9:09 a.m. | #3
On Fri, Mar 27, 2020 at 7:27 PM Jakub Jelinek <jakub@redhat.com> wrote:
>

> On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote:

> >       PR target/90811

> >       * ipa-increase-alignment.cc (increase_alignment_local_var): New.

> >       (increase_alignment_global_var): New.

> >       (pass_ipa_increase_alignment::gate): Remove.

>

> I'm afraid this is too early, pass_ipa_increase_alignment is part of

> all_small_ipa_passes list, those are run before the LTO streaming.

> See cgraphunit.c (ipa_passes).

> For OpenMP/OpenACC offloading, this has to run after the streaming,

> which means either a pass in the all_regular_ipa_passes list, or much better

> (I don't see any advantage of this being an IPA pass) some new pass that is

> run very early in the all_passes list.


It's an IPA pass because we IPA propagate alignment.

I fear that offload targets have to agree on local variable alignment if we want
to go this way (update DECL_ALIGN).  But I wonder if we can instead
fix the memcpy inlining issue by making the predicates involved honor
LOCAL_ALIGNMENT instead of relying on DECL_ALIGN?

>

> At which point I don't really see the benefit of moving this into a new file

> either.

>

>         Jakub

>
Xionghu Luo via Gcc-patches March 30, 2020, 9:14 a.m. | #4
On Mon, Mar 30, 2020 at 11:09:40AM +0200, Richard Biener wrote:
> On Fri, Mar 27, 2020 at 7:27 PM Jakub Jelinek <jakub@redhat.com> wrote:

> >

> > On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote:

> > >       PR target/90811

> > >       * ipa-increase-alignment.cc (increase_alignment_local_var): New.

> > >       (increase_alignment_global_var): New.

> > >       (pass_ipa_increase_alignment::gate): Remove.

> >

> > I'm afraid this is too early, pass_ipa_increase_alignment is part of

> > all_small_ipa_passes list, those are run before the LTO streaming.

> > See cgraphunit.c (ipa_passes).

> > For OpenMP/OpenACC offloading, this has to run after the streaming,

> > which means either a pass in the all_regular_ipa_passes list, or much better

> > (I don't see any advantage of this being an IPA pass) some new pass that is

> > run very early in the all_passes list.

> 

> It's an IPA pass because we IPA propagate alignment.


For global vars sure, but we are talking here about automatic vars, and the
offloading targets really have serious problems dealing with the 16/32 byte
alignment forced from x86 backend.

	Jakub
Kito Cheng March 30, 2020, 9:56 a.m. | #5
> But I wonder if we can instead fix the memcpy inlining issue by

> making the predicates involved honor LOCAL_ALIGNMENT

> instead of relying on DECL_ALIGN?


The memcpy inlining issue is not the only one affected by alignment
issue, I guess?
So I think it would be better to fix DECL_ALIGN?


On Mon, Mar 30, 2020 at 5:15 PM Jakub Jelinek <jakub@redhat.com> wrote:
>

> On Mon, Mar 30, 2020 at 11:09:40AM +0200, Richard Biener wrote:

> > On Fri, Mar 27, 2020 at 7:27 PM Jakub Jelinek <jakub@redhat.com> wrote:

> > >

> > > On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote:

> > > >       PR target/90811

> > > >       * ipa-increase-alignment.cc (increase_alignment_local_var): New.

> > > >       (increase_alignment_global_var): New.

> > > >       (pass_ipa_increase_alignment::gate): Remove.

> > >

> > > I'm afraid this is too early, pass_ipa_increase_alignment is part of

> > > all_small_ipa_passes list, those are run before the LTO streaming.

> > > See cgraphunit.c (ipa_passes).

> > > For OpenMP/OpenACC offloading, this has to run after the streaming,

> > > which means either a pass in the all_regular_ipa_passes list, or much better

> > > (I don't see any advantage of this being an IPA pass) some new pass that is

> > > run very early in the all_passes list.

> >

> > It's an IPA pass because we IPA propagate alignment.

>

> For global vars sure, but we are talking here about automatic vars, and the

> offloading targets really have serious problems dealing with the 16/32 byte

> alignment forced from x86 backend.

>

>         Jakub

>
Kito Cheng March 30, 2020, 10:24 a.m. | #6
Hi Jakub:

I saw the omp and oacc related passes are in the head of all_passes,
so I plan added after pass_omp_target_link, does it late enough?

diff --git a/gcc/passes.def b/gcc/passes.def
index 2bf2cb78fc5..92cbe587a8a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -183,6 +183,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_oacc_device_lower);
  NEXT_PASS (pass_omp_device_lower);
  NEXT_PASS (pass_omp_target_link);
+  NEXT_PASS (pass_adjust_alignment);
  NEXT_PASS (pass_all_optimizations);
  PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations)
      NEXT_PASS (pass_remove_cgraph_callee_edges);

Thanks :)

On Sat, Mar 28, 2020 at 2:27 AM Jakub Jelinek <jakub@redhat.com> wrote:
>

> On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote:

> >       PR target/90811

> >       * ipa-increase-alignment.cc (increase_alignment_local_var): New.

> >       (increase_alignment_global_var): New.

> >       (pass_ipa_increase_alignment::gate): Remove.

>

> I'm afraid this is too early, pass_ipa_increase_alignment is part of

> all_small_ipa_passes list, those are run before the LTO streaming.

> See cgraphunit.c (ipa_passes).

> For OpenMP/OpenACC offloading, this has to run after the streaming,

> which means either a pass in the all_regular_ipa_passes list, or much better

> (I don't see any advantage of this being an IPA pass) some new pass that is

> run very early in the all_passes list.

>

> At which point I don't really see the benefit of moving this into a new file

> either.

>

>         Jakub

>
Kito Cheng March 30, 2020, 10:25 a.m. | #7
Hi Andrew:

> > +  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)

> > +    {

> > +      function *fun = node->get_fun ();

> > +      FOR_EACH_LOCAL_DECL (fun, i, var)

> > +       {

> > +         align = LOCAL_DECL_ALIGNMENT (var);

> > +

> > +         SET_DECL_ALIGN (var, align);

>

>

> I think this is wrong if the user has set the alignment already.

> You need to check DECL_USER_ALIGN.


Good point, maybe we need to fix align_local_variable too :)

Thanks.
Xionghu Luo via Gcc-patches March 30, 2020, 10:29 a.m. | #8
On Mon, Mar 30, 2020 at 06:24:32PM +0800, Kito Cheng wrote:
> Hi Jakub:

> 

> I saw the omp and oacc related passes are in the head of all_passes,

> so I plan added after pass_omp_target_link, does it late enough?


Yes, that is ok.

> diff --git a/gcc/passes.def b/gcc/passes.def

> index 2bf2cb78fc5..92cbe587a8a 100644

> --- a/gcc/passes.def

> +++ b/gcc/passes.def

> @@ -183,6 +183,7 @@ along with GCC; see the file COPYING3.  If not see

>   NEXT_PASS (pass_oacc_device_lower);

>   NEXT_PASS (pass_omp_device_lower);

>   NEXT_PASS (pass_omp_target_link);

> +  NEXT_PASS (pass_adjust_alignment);

>   NEXT_PASS (pass_all_optimizations);

>   PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations)

>       NEXT_PASS (pass_remove_cgraph_callee_edges);


	Jakub

Patch

diff --git a/gcc/ipa-increase-alignment.cc b/gcc/ipa-increase-alignment.cc
index 6e34124bc03..8f28ad83349 100644
--- a/gcc/ipa-increase-alignment.cc
+++ b/gcc/ipa-increase-alignment.cc
@@ -148,10 +148,35 @@  get_vec_alignment_for_type (tree type)
   return (alignment > TYPE_ALIGN (type)) ? alignment : 0;
 }
 
-/* Entry point to increase_alignment pass.  */
-static unsigned int
-increase_alignment (void)
+/* Adjust alignment for local variable.  */
+static void
+increase_alignment_local_var (void)
+{
+  size_t i;
+  tree var;
+  struct cgraph_node *node;
+  unsigned int align;
+
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      function *fun = node->get_fun ();
+      FOR_EACH_LOCAL_DECL (fun, i, var)
+	{
+	  align = LOCAL_DECL_ALIGNMENT (var);
+
+	  SET_DECL_ALIGN (var, align);
+	}
+    }
+}
+
+/* Adjust alignment for global variable, only used for tree vectorization
+   currently.  */
+static void
+increase_alignment_global_var (void)
 {
+  if (!(flag_section_anchors && flag_tree_loop_vectorize))
+    return;
+
   varpool_node *vnode;
 
   vect_location = dump_user_location_t ();
@@ -178,6 +203,14 @@  increase_alignment (void)
     }
 
   delete type_align_map;
+}
+
+/* Entry point to increase_alignment pass.  */
+static unsigned int
+increase_alignment (void)
+{
+  increase_alignment_local_var ();
+  increase_alignment_global_var ();
   return 0;
 }
 
@@ -204,12 +237,6 @@  public:
     : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt)
   {}
 
-  /* opt_pass methods: */
-  virtual bool gate (function *)
-    {
-      return flag_section_anchors && flag_tree_loop_vectorize;
-    }
-
   virtual unsigned int execute (function *) { return increase_alignment (); }
 
 }; // class pass_ipa_increase_alignment