Message ID  mpt36fh9evd.fsf@arm.com 

State  New 
Headers  show 
Series 

Related  show 
On Fri, Oct 25, 2019 at 2:41 PM Richard Sandiford <richard.sandiford@arm.com> wrote: > > Some callers of get_same_sized_vectype were dealing with operands that > are constant or defined externally, and so have no STMT_VINFO_VECTYPE > available. Under the current model, using get_same_sized_vectype for > that case is equivalent to using get_vectype_for_scalar_type, since > get_vectype_for_scalar_type always returns vectors of the same size, > once a size is fixed. > > Using get_vectype_for_scalar_type is arguably more obvious though: > if we're using the same scalar type as we would for internal > definitions, we should use the same vector type too. (Constant and > external definitions sometimes let us change the original scalar type > to a "nicer" scalar type, but that isn't what's happening here.) > > This is a prerequisite to supporting multiple vector sizes in the same > vec_info. This might change the actual type we get back, IIRC we masschanged it in the opposite direction from your change in the past, because it's more obvious to relate the type used to another vector type on the stmt. So isn't it better to use the new related_vector_type thing here? Richard. > > 20191024 Richard Sandiford <richard.sandiford@arm.com> > > gcc/ > * treevectstmts.c (vectorizable_call): If an operand is > constant or external, use get_vectype_for_scalar_type > rather than get_same_sized_vectype to get its vector type. > (vectorizable_conversion, vectorizable_shift): Likewise. > (vectorizable_operation): Likewise. > > Index: gcc/treevectstmts.c > =================================================================== >  gcc/treevectstmts.c 20191025 13:27:19.313736209 +0100 > +++ gcc/treevectstmts.c 20191025 13:27:22.985710263 +0100 > @@ 3308,10 +3308,10 @@ vectorizable_call (stmt_vec_info stmt_in > return false; > } > } >  /* If all arguments are external or constant defs use a vector type with >  the same size as the output vector type. */ > + /* If all arguments are external or constant defs, infer the vector type > + from the scalar type. */ > if (!vectype_in) >  vectype_in = get_same_sized_vectype (rhs_type, vectype_out); > + vectype_in = get_vectype_for_scalar_type (vinfo, rhs_type); > if (vec_stmt) > gcc_assert (vectype_in); > if (!vectype_in) > @@ 4800,10 +4800,10 @@ vectorizable_conversion (stmt_vec_info s > } > } > >  /* If op0 is an external or constant defs use a vector type of >  the same size as the output vector type. */ > + /* If op0 is an external or constant def, infer the vector type > + from the scalar type. */ > if (!vectype_in) >  vectype_in = get_same_sized_vectype (rhs_type, vectype_out); > + vectype_in = get_vectype_for_scalar_type (vinfo, rhs_type); > if (vec_stmt) > gcc_assert (vectype_in); > if (!vectype_in) > @@ 5564,10 +5564,10 @@ vectorizable_shift (stmt_vec_info stmt_i > "use not simple.\n"); > return false; > } >  /* If op0 is an external or constant def use a vector type with >  the same size as the output vector type. */ > + /* If op0 is an external or constant def, infer the vector type > + from the scalar type. */ > if (!vectype) >  vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out); > + vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0)); > if (vec_stmt) > gcc_assert (vectype); > if (!vectype) > @@ 5666,7 +5666,7 @@ vectorizable_shift (stmt_vec_info stmt_i > "vector/vector shift/rotate found.\n"); > > if (!op1_vectype) >  op1_vectype = get_same_sized_vectype (TREE_TYPE (op1), vectype_out); > + op1_vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op1)); > incompatible_op1_vectype_p > = (op1_vectype == NULL_TREE >  maybe_ne (TYPE_VECTOR_SUBPARTS (op1_vectype), > @@ 5997,8 +5997,8 @@ vectorizable_operation (stmt_vec_info st > "use not simple.\n"); > return false; > } >  /* If op0 is an external or constant def use a vector type with >  the same size as the output vector type. */ > + /* If op0 is an external or constant def, infer the vector type > + from the scalar type. */ > if (!vectype) > { > /* For boolean type we cannot determine vectype by > @@ 6018,7 +6018,7 @@ vectorizable_operation (stmt_vec_info st > vectype = vectype_out; > } > else >  vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out); > + vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0)); > } > if (vec_stmt) > gcc_assert (vectype);
Richard Biener <richard.guenther@gmail.com> writes: > On Fri, Oct 25, 2019 at 2:41 PM Richard Sandiford > <richard.sandiford@arm.com> wrote: >> >> Some callers of get_same_sized_vectype were dealing with operands that >> are constant or defined externally, and so have no STMT_VINFO_VECTYPE >> available. Under the current model, using get_same_sized_vectype for >> that case is equivalent to using get_vectype_for_scalar_type, since >> get_vectype_for_scalar_type always returns vectors of the same size, >> once a size is fixed. >> >> Using get_vectype_for_scalar_type is arguably more obvious though: >> if we're using the same scalar type as we would for internal >> definitions, we should use the same vector type too. (Constant and >> external definitions sometimes let us change the original scalar type >> to a "nicer" scalar type, but that isn't what's happening here.) >> >> This is a prerequisite to supporting multiple vector sizes in the same >> vec_info. > > This might change the actual type we get back, IIRC we masschanged > it in the opposite direction from your change in the past, because it's > more obvious to relate the type used to another vector type on the > stmt. So isn't it better to use the new related_vector_type thing here? I guess this is a downside of the patch order. Hopefully this looks like a more sensible decision after 16/n, where we also pass the group size to get_vectype_for_scalar_type. If not: :) At the moment, there can only ever be one vector type for a given scalar type within a vector loop. We don't e.g. allow one loop vector stmt to use V2SI and another to V4SI, because all vectorised SIs have to be compatible in the same way that the original scalar SIs were. So once we have a loop_vec_info and once we have a scalar type, there's only one valid choice of vector type. I think trying to circumvent that by using get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type would run the risk of introducing accidental mismatches. I.e. if we do it right, calling get_related_vectype_for_scalar_type would give the same result as calling get_vectype_for_scalar_type (and so we might as well just call get_vectype_for_scalar_type). If we do it wrong we can end up with a different type from the one that other statements were expecting. BB vectorisation currently works the same way. But after 16/n, there is instead one vector type for each (bb_vinfo, scalar_type, group_size) triple. So that patch makes it possible for different SLP instances to have different vector types for the same scalar type, but the choice is still fixed within an SLP instance, in a similar way to loop vectorisation. So I think using anything other than get_vectype_for_scalar_type would give a false sense of freedom. Calling it directly is mostly useful for temporaries, e.g. in epilogue reduction handling. Thanks, Richard
On Tue, Nov 5, 2019 at 4:34 PM Richard Sandiford <Richard.Sandiford@arm.com> wrote: > > Richard Biener <richard.guenther@gmail.com> writes: > > On Fri, Oct 25, 2019 at 2:41 PM Richard Sandiford > > <richard.sandiford@arm.com> wrote: > >> > >> Some callers of get_same_sized_vectype were dealing with operands that > >> are constant or defined externally, and so have no STMT_VINFO_VECTYPE > >> available. Under the current model, using get_same_sized_vectype for > >> that case is equivalent to using get_vectype_for_scalar_type, since > >> get_vectype_for_scalar_type always returns vectors of the same size, > >> once a size is fixed. > >> > >> Using get_vectype_for_scalar_type is arguably more obvious though: > >> if we're using the same scalar type as we would for internal > >> definitions, we should use the same vector type too. (Constant and > >> external definitions sometimes let us change the original scalar type > >> to a "nicer" scalar type, but that isn't what's happening here.) > >> > >> This is a prerequisite to supporting multiple vector sizes in the same > >> vec_info. > > > > This might change the actual type we get back, IIRC we masschanged > > it in the opposite direction from your change in the past, because it's > > more obvious to relate the type used to another vector type on the > > stmt. So isn't it better to use the new related_vector_type thing here? > > I guess this is a downside of the patch order. Hopefully this looks > like a more sensible decision after 16/n, where we also pass the > group size to get_vectype_for_scalar_type. If not: :) > > At the moment, there can only ever be one vector type for a given scalar > type within a vector loop. We don't e.g. allow one loop vector stmt to > use V2SI and another to V4SI, because all vectorised SIs have to be > compatible in the same way that the original scalar SIs were. So once > we have a loop_vec_info and once we have a scalar type, there's only one > valid choice of vector type. I think trying to circumvent that by using > get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type > would run the risk of introducing accidental mismatches. I.e. if we do > it right, calling get_related_vectype_for_scalar_type would give the same > result as calling get_vectype_for_scalar_type (and so we might as well > just call get_vectype_for_scalar_type). If we do it wrong we can end up > with a different type from the one that other statements were expecting. > > BB vectorisation currently works the same way. But after 16/n, there > is instead one vector type for each (bb_vinfo, scalar_type, group_size) > triple. So that patch makes it possible for different SLP instances to > have different vector types for the same scalar type, but the choice is > still fixed within an SLP instance, in a similar way to loop > vectorisation. > > So I think using anything other than get_vectype_for_scalar_type > would give a false sense of freedom. Calling it directly is mostly > useful for temporaries, e.g. in epilogue reduction handling. OK, I guess we'll see how it plays out in the end of the series(es). What I think should be there in the end is vectorizable_* asking for the vector type that matches what they implement in terms of the operation. Say, if it is a PLUS and one operand has a vector type set in the internal_def definition then the constant/external_def should get the very same vector type. That is, ideally those routines would _not_ call "get me a vector type" but simply compute the required type. That is, vectorizable_operation should not need to call these functions at unless we run into a completely invariant operation (in which case we might as well fail vectorization). I see it doesn't care for the vector type of op1/2 which is probably a bug since with multiple sizes we may get V2SI on one op and V4SI on another? Richard. > Thanks, > Richard
Index: gcc/treevectstmts.c ===================================================================  gcc/treevectstmts.c 20191025 13:27:19.313736209 +0100 +++ gcc/treevectstmts.c 20191025 13:27:22.985710263 +0100 @@ 3308,10 +3308,10 @@ vectorizable_call (stmt_vec_info stmt_in return false; } }  /* If all arguments are external or constant defs use a vector type with  the same size as the output vector type. */ + /* If all arguments are external or constant defs, infer the vector type + from the scalar type. */ if (!vectype_in)  vectype_in = get_same_sized_vectype (rhs_type, vectype_out); + vectype_in = get_vectype_for_scalar_type (vinfo, rhs_type); if (vec_stmt) gcc_assert (vectype_in); if (!vectype_in) @@ 4800,10 +4800,10 @@ vectorizable_conversion (stmt_vec_info s } }  /* If op0 is an external or constant defs use a vector type of  the same size as the output vector type. */ + /* If op0 is an external or constant def, infer the vector type + from the scalar type. */ if (!vectype_in)  vectype_in = get_same_sized_vectype (rhs_type, vectype_out); + vectype_in = get_vectype_for_scalar_type (vinfo, rhs_type); if (vec_stmt) gcc_assert (vectype_in); if (!vectype_in) @@ 5564,10 +5564,10 @@ vectorizable_shift (stmt_vec_info stmt_i "use not simple.\n"); return false; }  /* If op0 is an external or constant def use a vector type with  the same size as the output vector type. */ + /* If op0 is an external or constant def, infer the vector type + from the scalar type. */ if (!vectype)  vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out); + vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0)); if (vec_stmt) gcc_assert (vectype); if (!vectype) @@ 5666,7 +5666,7 @@ vectorizable_shift (stmt_vec_info stmt_i "vector/vector shift/rotate found.\n"); if (!op1_vectype)  op1_vectype = get_same_sized_vectype (TREE_TYPE (op1), vectype_out); + op1_vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op1)); incompatible_op1_vectype_p = (op1_vectype == NULL_TREE  maybe_ne (TYPE_VECTOR_SUBPARTS (op1_vectype), @@ 5997,8 +5997,8 @@ vectorizable_operation (stmt_vec_info st "use not simple.\n"); return false; }  /* If op0 is an external or constant def use a vector type with  the same size as the output vector type. */ + /* If op0 is an external or constant def, infer the vector type + from the scalar type. */ if (!vectype) { /* For boolean type we cannot determine vectype by @@ 6018,7 +6018,7 @@ vectorizable_operation (stmt_vec_info st vectype = vectype_out; } else  vectype = get_same_sized_vectype (TREE_TYPE (op0), vectype_out); + vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op0)); } if (vec_stmt) gcc_assert (vectype);