[10/n] Make less use of get_same_sized_vectype

Message ID mpt36fh9evd.fsf@arm.com
State New
Headers show
Series
  • [10/n] Make less use of get_same_sized_vectype
Related show

Commit Message

Richard Sandiford Oct. 25, 2019, 12:41 p.m.
Some callers of get_same_sized_vectype were dealing with operands that
are constant or defined externally, and so have no STMT_VINFO_VECTYPE
available.  Under the current model, using get_same_sized_vectype for
that case is equivalent to using get_vectype_for_scalar_type, since
get_vectype_for_scalar_type always returns vectors of the same size,
once a size is fixed.

Using get_vectype_for_scalar_type is arguably more obvious though:
if we're using the same scalar type as we would for internal
definitions, we should use the same vector type too.  (Constant and
external definitions sometimes let us change the original scalar type
to a "nicer" scalar type, but that isn't what's happening here.)

This is a prerequisite to supporting multiple vector sizes in the same
vec_info.


2019-10-24  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-stmts.c (vectorizable_call): If an operand is
	constant or external, use get_vectype_for_scalar_type
	rather than get_same_sized_vectype to get its vector type.
	(vectorizable_conversion, vectorizable_shift): Likewise.
	(vectorizable_operation): Likewise.

Comments

Richard Biener Nov. 5, 2019, 12:50 p.m. | #1
On Fri, Oct 25, 2019 at 2:41 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>

> Some callers of get_same_sized_vectype were dealing with operands that

> are constant or defined externally, and so have no STMT_VINFO_VECTYPE

> available.  Under the current model, using get_same_sized_vectype for

> that case is equivalent to using get_vectype_for_scalar_type, since

> get_vectype_for_scalar_type always returns vectors of the same size,

> once a size is fixed.

>

> Using get_vectype_for_scalar_type is arguably more obvious though:

> if we're using the same scalar type as we would for internal

> definitions, we should use the same vector type too.  (Constant and

> external definitions sometimes let us change the original scalar type

> to a "nicer" scalar type, but that isn't what's happening here.)

>

> This is a prerequisite to supporting multiple vector sizes in the same

> vec_info.


This might change the actual type we get back, IIRC we mass-changed
it in the opposite direction from your change in the past, because it's
more obvious to relate the type used to another vector type on the
stmt.  So isn't it better to use the new related_vector_type thing here?

Richard.

>

> 2019-10-24  Richard Sandiford  <richard.sandiford@arm.com>

>

> gcc/

>         * tree-vect-stmts.c (vectorizable_call): If an operand is

>         constant or external, use get_vectype_for_scalar_type

>         rather than get_same_sized_vectype to get its vector type.

>         (vectorizable_conversion, vectorizable_shift): Likewise.

>         (vectorizable_operation): Likewise.

>

> Index: gcc/tree-vect-stmts.c

> ===================================================================

> --- gcc/tree-vect-stmts.c       2019-10-25 13:27:19.313736209 +0100

> +++ gcc/tree-vect-stmts.c       2019-10-25 13:27:22.985710263 +0100

> @@ -3308,10 +3308,10 @@ vectorizable_call (stmt_vec_info stmt_in

>           return false;

>         }

>      }

> -  /* If all arguments are external or constant defs use a vector type with

> -     the same size as the output vector type.  */

> +  /* If all arguments are external or constant defs, infer the vector type

> +     from the scalar type.  */

>    if (!vectype_in)

> -    vectype_in = get_same_sized_vectype (rhs_type, vectype_out);

> +    vectype_in = get_vectype_for_scalar_type (vinfo, rhs_type);

>    if (vec_stmt)

>      gcc_assert (vectype_in);

>    if (!vectype_in)

> @@ -4800,10 +4800,10 @@ vectorizable_conversion (stmt_vec_info s

>         }

>      }

>

> -  /* If op0 is an external or constant defs use a vector type of

> -     the same size as the output vector type.  */

> +  /* If op0 is an external or constant def, infer the vector type

> +     from the scalar type.  */

>    if (!vectype_in)

> -    vectype_in = get_same_sized_vectype (rhs_type, vectype_out);

> +    vectype_in = get_vectype_for_scalar_type (vinfo, rhs_type);

>    if (vec_stmt)

>      gcc_assert (vectype_in);

>    if (!vectype_in)

> @@ -5564,10 +5564,10 @@ vectorizable_shift (stmt_vec_info stmt_i

>                           "use not simple.\n");

>        return false;

>      }

> -  /* If op0 is an external or constant def use a vector type with

> -     the same size as the output vector type.  */

> +  /* If op0 is an external or constant def, infer the vector type

> +     from the scalar type.  */

>    if (!vectype)

> -    vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out);

> +    vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0));

>    if (vec_stmt)

>      gcc_assert (vectype);

>    if (!vectype)

> @@ -5666,7 +5666,7 @@ vectorizable_shift (stmt_vec_info stmt_i

>                           "vector/vector shift/rotate found.\n");

>

>        if (!op1_vectype)

> -       op1_vectype = get_same_sized_vectype (TREE_TYPE (op1), vectype_out);

> +       op1_vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op1));

>        incompatible_op1_vectype_p

>         = (op1_vectype == NULL_TREE

>            || maybe_ne (TYPE_VECTOR_SUBPARTS (op1_vectype),

> @@ -5997,8 +5997,8 @@ vectorizable_operation (stmt_vec_info st

>                           "use not simple.\n");

>        return false;

>      }

> -  /* If op0 is an external or constant def use a vector type with

> -     the same size as the output vector type.  */

> +  /* If op0 is an external or constant def, infer the vector type

> +     from the scalar type.  */

>    if (!vectype)

>      {

>        /* For boolean type we cannot determine vectype by

> @@ -6018,7 +6018,7 @@ vectorizable_operation (stmt_vec_info st

>           vectype = vectype_out;

>         }

>        else

> -       vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out);

> +       vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0));

>      }

>    if (vec_stmt)

>      gcc_assert (vectype);
Richard Sandiford Nov. 5, 2019, 3:34 p.m. | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Oct 25, 2019 at 2:41 PM Richard Sandiford

> <richard.sandiford@arm.com> wrote:

>>

>> Some callers of get_same_sized_vectype were dealing with operands that

>> are constant or defined externally, and so have no STMT_VINFO_VECTYPE

>> available.  Under the current model, using get_same_sized_vectype for

>> that case is equivalent to using get_vectype_for_scalar_type, since

>> get_vectype_for_scalar_type always returns vectors of the same size,

>> once a size is fixed.

>>

>> Using get_vectype_for_scalar_type is arguably more obvious though:

>> if we're using the same scalar type as we would for internal

>> definitions, we should use the same vector type too.  (Constant and

>> external definitions sometimes let us change the original scalar type

>> to a "nicer" scalar type, but that isn't what's happening here.)

>>

>> This is a prerequisite to supporting multiple vector sizes in the same

>> vec_info.

>

> This might change the actual type we get back, IIRC we mass-changed

> it in the opposite direction from your change in the past, because it's

> more obvious to relate the type used to another vector type on the

> stmt.  So isn't it better to use the new related_vector_type thing here?


I guess this is a downside of the patch order.  Hopefully this looks
like a more sensible decision after 16/n, where we also pass the
group size to get_vectype_for_scalar_type.  If not: :-)

At the moment, there can only ever be one vector type for a given scalar
type within a vector loop.  We don't e.g. allow one loop vector stmt to
use V2SI and another to V4SI, because all vectorised SIs have to be
compatible in the same way that the original scalar SIs were.  So once
we have a loop_vec_info and once we have a scalar type, there's only one
valid choice of vector type.  I think trying to circumvent that by using
get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type
would run the risk of introducing accidental mismatches.  I.e. if we do
it right, calling get_related_vectype_for_scalar_type would give the same
result as calling get_vectype_for_scalar_type (and so we might as well
just call get_vectype_for_scalar_type).  If we do it wrong we can end up
with a different type from the one that other statements were expecting.

BB vectorisation currently works the same way.  But after 16/n, there
is instead one vector type for each (bb_vinfo, scalar_type, group_size)
triple.  So that patch makes it possible for different SLP instances to
have different vector types for the same scalar type, but the choice is
still fixed within an SLP instance, in a similar way to loop
vectorisation.

So I think using anything other than get_vectype_for_scalar_type
would give a false sense of freedom.  Calling it directly is mostly
useful for temporaries, e.g. in epilogue reduction handling.

Thanks,
Richard
Richard Biener Nov. 5, 2019, 4:09 p.m. | #3
On Tue, Nov 5, 2019 at 4:34 PM Richard Sandiford
<Richard.Sandiford@arm.com> wrote:
>

> Richard Biener <richard.guenther@gmail.com> writes:

> > On Fri, Oct 25, 2019 at 2:41 PM Richard Sandiford

> > <richard.sandiford@arm.com> wrote:

> >>

> >> Some callers of get_same_sized_vectype were dealing with operands that

> >> are constant or defined externally, and so have no STMT_VINFO_VECTYPE

> >> available.  Under the current model, using get_same_sized_vectype for

> >> that case is equivalent to using get_vectype_for_scalar_type, since

> >> get_vectype_for_scalar_type always returns vectors of the same size,

> >> once a size is fixed.

> >>

> >> Using get_vectype_for_scalar_type is arguably more obvious though:

> >> if we're using the same scalar type as we would for internal

> >> definitions, we should use the same vector type too.  (Constant and

> >> external definitions sometimes let us change the original scalar type

> >> to a "nicer" scalar type, but that isn't what's happening here.)

> >>

> >> This is a prerequisite to supporting multiple vector sizes in the same

> >> vec_info.

> >

> > This might change the actual type we get back, IIRC we mass-changed

> > it in the opposite direction from your change in the past, because it's

> > more obvious to relate the type used to another vector type on the

> > stmt.  So isn't it better to use the new related_vector_type thing here?

>

> I guess this is a downside of the patch order.  Hopefully this looks

> like a more sensible decision after 16/n, where we also pass the

> group size to get_vectype_for_scalar_type.  If not: :-)

>

> At the moment, there can only ever be one vector type for a given scalar

> type within a vector loop.  We don't e.g. allow one loop vector stmt to

> use V2SI and another to V4SI, because all vectorised SIs have to be

> compatible in the same way that the original scalar SIs were.  So once

> we have a loop_vec_info and once we have a scalar type, there's only one

> valid choice of vector type.  I think trying to circumvent that by using

> get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type

> would run the risk of introducing accidental mismatches.  I.e. if we do

> it right, calling get_related_vectype_for_scalar_type would give the same

> result as calling get_vectype_for_scalar_type (and so we might as well

> just call get_vectype_for_scalar_type).  If we do it wrong we can end up

> with a different type from the one that other statements were expecting.

>

> BB vectorisation currently works the same way.  But after 16/n, there

> is instead one vector type for each (bb_vinfo, scalar_type, group_size)

> triple.  So that patch makes it possible for different SLP instances to

> have different vector types for the same scalar type, but the choice is

> still fixed within an SLP instance, in a similar way to loop

> vectorisation.

>

> So I think using anything other than get_vectype_for_scalar_type

> would give a false sense of freedom.  Calling it directly is mostly

> useful for temporaries, e.g. in epilogue reduction handling.


OK, I guess we'll see how it plays out in the end of the series(es).

What I think should be there in the end is vectorizable_* asking
for the vector type that matches what they implement in terms
of the operation.  Say, if it is a PLUS and one operand has a
vector type set in the internal_def definition then the constant/external_def
should get the very same vector type.  That is, ideally those
routines would _not_ call "get me a vector type" but simply compute
the required type.  That is, vectorizable_operation should not need to call
these functions at unless we run into a completely invariant operation
(in which case we might as well fail vectorization).  I see it doesn't
care for the vector type of op1/2 which is probably a bug since
with multiple sizes we may get V2SI on one op and V4SI on another?

Richard.

> Thanks,

> Richard

Patch

Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2019-10-25 13:27:19.313736209 +0100
+++ gcc/tree-vect-stmts.c	2019-10-25 13:27:22.985710263 +0100
@@ -3308,10 +3308,10 @@  vectorizable_call (stmt_vec_info stmt_in
 	  return false;
 	}
     }
-  /* If all arguments are external or constant defs use a vector type with
-     the same size as the output vector type.  */
+  /* If all arguments are external or constant defs, infer the vector type
+     from the scalar type.  */
   if (!vectype_in)
-    vectype_in = get_same_sized_vectype (rhs_type, vectype_out);
+    vectype_in = get_vectype_for_scalar_type (vinfo, rhs_type);
   if (vec_stmt)
     gcc_assert (vectype_in);
   if (!vectype_in)
@@ -4800,10 +4800,10 @@  vectorizable_conversion (stmt_vec_info s
 	}
     }
 
-  /* If op0 is an external or constant defs use a vector type of
-     the same size as the output vector type.  */
+  /* If op0 is an external or constant def, infer the vector type
+     from the scalar type.  */
   if (!vectype_in)
-    vectype_in = get_same_sized_vectype (rhs_type, vectype_out);
+    vectype_in = get_vectype_for_scalar_type (vinfo, rhs_type);
   if (vec_stmt)
     gcc_assert (vectype_in);
   if (!vectype_in)
@@ -5564,10 +5564,10 @@  vectorizable_shift (stmt_vec_info stmt_i
                          "use not simple.\n");
       return false;
     }
-  /* If op0 is an external or constant def use a vector type with
-     the same size as the output vector type.  */
+  /* If op0 is an external or constant def, infer the vector type
+     from the scalar type.  */
   if (!vectype)
-    vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out);
+    vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0));
   if (vec_stmt)
     gcc_assert (vectype);
   if (!vectype)
@@ -5666,7 +5666,7 @@  vectorizable_shift (stmt_vec_info stmt_i
                          "vector/vector shift/rotate found.\n");
 
       if (!op1_vectype)
-	op1_vectype = get_same_sized_vectype (TREE_TYPE (op1), vectype_out);
+	op1_vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op1));
       incompatible_op1_vectype_p
 	= (op1_vectype == NULL_TREE
 	   || maybe_ne (TYPE_VECTOR_SUBPARTS (op1_vectype),
@@ -5997,8 +5997,8 @@  vectorizable_operation (stmt_vec_info st
                          "use not simple.\n");
       return false;
     }
-  /* If op0 is an external or constant def use a vector type with
-     the same size as the output vector type.  */
+  /* If op0 is an external or constant def, infer the vector type
+     from the scalar type.  */
   if (!vectype)
     {
       /* For boolean type we cannot determine vectype by
@@ -6018,7 +6018,7 @@  vectorizable_operation (stmt_vec_info st
 	  vectype = vectype_out;
 	}
       else
-	vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out);
+	vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0));
     }
   if (vec_stmt)
     gcc_assert (vectype);