Make assemble_real generate canonical CONST_INTs

Message ID mptmuf33scs.fsf@arm.com
State New

Commit Message

Richard Sandiford Sept. 17, 2019, 2:33 p.m.
assemble_real used GEN_INT to create integers directly from the
longs returned by real_to_target.  assemble_integer then went on
to interpret the const_ints as though they had the mode corresponding
to the accompanying size parameter:

      imode = mode_for_size (size * BITS_PER_UNIT, mclass, 0).require ();

      for (i = 0; i < size; i += subsize)
	{
	  rtx partial = simplify_subreg (omode, x, imode, i);

But in the assemble_real case, X might not be canonical for IMODE.

If the interface to assemble_integer is supposed to allow outputting
(say) the low 4 bytes of a DImode integer, then the simplify_subreg
above is wrong.  But if the number of bytes passed to assemble_integer
is supposed to be the number of bytes that the integer actually contains,
assemble_real is wrong.

This patch takes the latter interpretation and makes assemble_real
generate const_ints that are canonical for the number of bytes passed.
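
As an illustration of what "canonical for the number of bytes passed" means here, the following is a minimal standalone sketch, not GCC code. It assumes that a const_int is canonical for a mode when its value is sign-extended from that mode's precision. The helper name sign_extend_sketch is hypothetical and only stands in for the role that sext_hwi plays in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the role of sext_hwi: treat the low BITS bits
   of VALUE as a two's-complement number and widen it to 64 bits.  */
static int64_t
sign_extend_sketch (int64_t value, unsigned int bits)
{
  assert (bits >= 1 && bits <= 64);
  if (bits == 64)
    return value;

  uint64_t mask = ((uint64_t) 1 << bits) - 1;   /* keep only the low BITS bits */
  uint64_t low = (uint64_t) value & mask;
  uint64_t sign = (uint64_t) 1 << (bits - 1);   /* weight of the sign bit */

  /* Flipping the sign bit and then subtracting its weight maps
     [0, 2^bits) onto [-2^(bits-1), 2^(bits-1)).  */
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

With this sketch, a 2-byte chunk holding the bit pattern 0xC000 becomes -16384 rather than 49152, while a pattern with the top bit clear, such as 0x4000, is unchanged.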

The flip_storage_order handling assumes that each long is a full
SImode, which e.g. excludes BITS_PER_UNIT != 8 and float formats
whose memory size is not a multiple of 32 bits (which includes
HFmode at least).  The patch therefore leaves that code alone.
If interpreting each integer as SImode is correct, the const_ints
that it generates are also correct.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  Also tested
by making sure that there were no new errors from a range of
cross-built targets.  OK to install?

Richard


2019-09-17  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* varasm.c (assemble_real): Generate canonical const_ints.

Comments

Richard Biener Sept. 18, 2019, 8:41 a.m. | #1
On Tue, Sep 17, 2019 at 4:33 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:

> Index: gcc/varasm.c
> ===================================================================
> --- gcc/varasm.c        2019-09-05 08:49:30.829739618 +0100
> +++ gcc/varasm.c        2019-09-17 15:30:10.400740515 +0100
> @@ -2873,25 +2873,27 @@ assemble_real (REAL_VALUE_TYPE d, scalar
>    real_to_target (data, &d, mode);
>
>    /* Put out the first word with the specified alignment.  */
> +  unsigned int chunk_nunits = MIN (nunits, units_per);
>    if (reverse)
>      elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
>    else
> -    elt = GEN_INT (data[0]);
> -  assemble_integer (elt, MIN (nunits, units_per), align, 1);
> -  nunits -= units_per;
> +    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));

Why the apparent difference between the storage-order flipping
variant using gen_int_mode vs. the GEN_INT with sext_hwi?
Can't we use gen_int_mode in the non-flipping path and be done with that?

> +  assemble_integer (elt, chunk_nunits, align, 1);
> +  nunits -= chunk_nunits;
>
>    /* Subsequent words need only 32-bit alignment.  */
>    align = min_align (align, 32);
>
>    for (int i = 1; i < nelts; i++)
>      {
> +      chunk_nunits = MIN (nunits, units_per);
>        if (reverse)
>         elt = flip_storage_order (SImode,
>                                   gen_int_mode (data[nelts - 1 - i], SImode));
>        else
> -       elt = GEN_INT (data[i]);
> -      assemble_integer (elt, MIN (nunits, units_per), align, 1);
> -      nunits -= units_per;
> +       elt = GEN_INT (sext_hwi (data[i], chunk_nunits * BITS_PER_UNIT));
> +      assemble_integer (elt, chunk_nunits, align, 1);
> +      nunits -= chunk_nunits;
>      }
>  }
>
Richard Sandiford Sept. 18, 2019, 9:41 a.m. | #2
Richard Biener <richard.guenther@gmail.com> writes:

> Why the apparent difference between the storage-order flipping
> variant using gen_int_mode vs. the GEN_INT with sext_hwi?
> Can't we use gen_int_mode in the non-flipping path and be done with that?

Yeah, I mentioned this in the covering note.  The flip_storage_order
stuff only seems to work for floats that are a multiple of 32 bits in
size, so it doesn't e.g. handle HFmode or 80-bit floats, whereas the
new "else" does.  Hard-coding SImode also hard-codes BITS_PER_UNIT==8,
unlike the "else".

So if anything, it's flip_storage_order that might need to change
to avoid hard-coding SImode.  That doesn't look like a trivial change
though.  E.g. the number of bytes passed to assemble_integer would need
to match the number of bytes in data[nelts - 1] rather than data[0].
The alignment code below would also need to be adjusted.  Fixing that
(if it is a bug) seems like a separate change and TBH I'd rather not
touch it here.

Thanks,
Richard

Richard Biener Sept. 18, 2019, 10:07 a.m. | #3
On Wed, Sep 18, 2019 at 11:41 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:

> Yeah, I mentioned this in the covering note.  The flip_storage_order
> stuff only seems to work for floats that are a multiple of 32 bits in
> size, so it doesn't e.g. handle HFmode or 80-bit floats, whereas the
> new "else" does.  Hard-coding SImode also hard-codes BITS_PER_UNIT==8,
> unlike the "else".
>
> So if anything, it's flip_storage_order that might need to change
> to avoid hard-coding SImode.  That doesn't look like a trivial change
> though.  E.g. the number of bytes passed to assemble_integer would need
> to match the number of bytes in data[nelts - 1] rather than data[0].
> The alignment code below would also need to be adjusted.  Fixing that
> (if it is a bug) seems like a separate change and TBH I'd rather not
> touch it here.

Hmm, ok.  Patch is OK then.

Thanks,
Richard.

Christophe Lyon Sept. 20, 2019, 1:41 p.m. | #4
On Wed, 18 Sep 2019 at 11:41, Richard Sandiford
<richard.sandiford@arm.com> wrote:


Hi Richard,

I suspect you've probably noticed already, but in case you haven't:
this patch causes a regression on arm:
FAIL: gcc.target/arm/fp16-compile-alt-3.c scan-assembler \t.short\t49152
FAIL: gcc.target/arm/fp16-compile-ieee-3.c scan-assembler \t.short\t49152

Christophe

Richard Sandiford Sept. 26, 2019, 10:48 a.m. | #5
Christophe Lyon <christophe.lyon@linaro.org> writes:

> Hi Richard,
>
> I suspect you've probably noticed already, but in case you haven't:
> this patch causes a regression on arm:
> FAIL: gcc.target/arm/fp16-compile-alt-3.c scan-assembler \t.short\t49152
> FAIL: gcc.target/arm/fp16-compile-ieee-3.c scan-assembler \t.short\t49152

Hadn't noticed that actually (but should have) -- thanks for the heads up.
I've applied the below as obvious after testing on armeb-eabi.

Richard


2019-09-26  Richard Sandiford  <richard.sandiford@arm.com>

gcc/testsuite/
	* gcc.target/arm/fp16-compile-alt-3.c: Expect (__fp16) -2.0
	to be written as a negative short rather than a positive one.
	* gcc.target/arm/fp16-compile-ieee-3.c: Likewise.

Index: gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c
===================================================================
--- gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c	2019-03-08 18:14:28.836998325 +0000
+++ gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c	2019-09-26 11:42:47.502378676 +0100
@@ -7,4 +7,4 @@
 __fp16 xx = -2.0;
 
 /* { dg-final { scan-assembler "\t.size\txx, 2" } } */
-/* { dg-final { scan-assembler "\t.short\t49152" } } */
+/* { dg-final { scan-assembler "\t.short\t-16384" } } */
Index: gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c
===================================================================
--- gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c	2019-03-08 18:14:28.732998720 +0000
+++ gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c	2019-09-26 11:42:47.506378645 +0100
@@ -6,4 +6,4 @@
 __fp16 xx = -2.0;
 
 /* { dg-final { scan-assembler "\t.size\txx, 2" } } */
-/* { dg-final { scan-assembler "\t.short\t49152" } } */
+/* { dg-final { scan-assembler "\t.short\t-16384" } } */
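
The two expected strings in the testsuite change above describe the same 16 bits. The following standalone sketch, which is not part of the patch, shows why, assuming the usual binary16 layout of 1 sign bit, 5 exponent bits with bias 15, and 10 fraction bits. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the fields of a normal binary16 value.  Assumes the layout
   sign(1) | exponent(5, bias 15) | fraction(10).  */
static uint16_t
pack_binary16_sketch (unsigned int sign, int unbiased_exp,
                      unsigned int fraction)
{
  assert (sign <= 1);
  assert (unbiased_exp >= -14 && unbiased_exp <= 15);
  assert (fraction < 1024);
  return (uint16_t) ((sign << 15)
                     | ((unsigned int) (unbiased_exp + 15) << 10)
                     | fraction);
}

/* The same 16 bits read as an unsigned short.  */
static unsigned int
as_unsigned_sketch (uint16_t bits)
{
  return bits;
}

/* The same 16 bits read as a two's-complement signed short.  */
static int
as_signed_sketch (uint16_t bits)
{
  return bits >= 0x8000 ? (int) bits - 0x10000 : (int) bits;
}
```

Here -2.0 is -1 x 2^1 x 1.0, so pack_binary16_sketch (1, 1, 0) gives 0xC000. Read as unsigned that is 49152, which the tests used to expect. Read as signed it is -16384, which is what the assembler output contains once the const_int is sign-extended for a 2-byte chunk.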

Patch

Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	2019-09-05 08:49:30.829739618 +0100
+++ gcc/varasm.c	2019-09-17 15:30:10.400740515 +0100
@@ -2873,25 +2873,27 @@  assemble_real (REAL_VALUE_TYPE d, scalar
   real_to_target (data, &d, mode);
 
   /* Put out the first word with the specified alignment.  */
+  unsigned int chunk_nunits = MIN (nunits, units_per);
   if (reverse)
     elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
   else
-    elt = GEN_INT (data[0]);
-  assemble_integer (elt, MIN (nunits, units_per), align, 1);
-  nunits -= units_per;
+    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));
+  assemble_integer (elt, chunk_nunits, align, 1);
+  nunits -= chunk_nunits;
 
   /* Subsequent words need only 32-bit alignment.  */
   align = min_align (align, 32);
 
   for (int i = 1; i < nelts; i++)
     {
+      chunk_nunits = MIN (nunits, units_per);
       if (reverse)
 	elt = flip_storage_order (SImode,
 				  gen_int_mode (data[nelts - 1 - i], SImode));
       else
-	elt = GEN_INT (data[i]);
-      assemble_integer (elt, MIN (nunits, units_per), align, 1);
-      nunits -= units_per;
+	elt = GEN_INT (sext_hwi (data[i], chunk_nunits * BITS_PER_UNIT));
+      assemble_integer (elt, chunk_nunits, align, 1);
+      nunits -= chunk_nunits;
     }
 }
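
The chunk accounting in the patched loop can be followed with plain integers. The sketch below is a standalone model rather than GCC code. It assumes, from context that the hunk does not show, that units_per is 32 / BITS_PER_UNIT and that nelts is the bit size rounded up to a multiple of 32. The helper name chunk_sizes_sketch is hypothetical.

```c
#include <assert.h>

/* Model of the per-element unit counts that the patched loop passes to
   assemble_integer.  Fills CHUNKS and returns the number of elements.
   ASSUMPTION: units_per = 32 / bits_per_unit and
   nelts = ceil (nunits * bits_per_unit / 32), as in the surrounding code.  */
static int
chunk_sizes_sketch (int nunits, int bits_per_unit, int chunks[],
                    int max_chunks)
{
  assert (nunits > 0 && bits_per_unit > 0 && 32 % bits_per_unit == 0);

  int units_per = 32 / bits_per_unit;
  int nelts = (nunits * bits_per_unit + 31) / 32;
  assert (nelts <= max_chunks);

  for (int i = 0; i < nelts; i++)
    {
      /* Same role as MIN (nunits, units_per) in the patch.  */
      int chunk_nunits = nunits < units_per ? nunits : units_per;
      chunks[i] = chunk_nunits;
      /* Subtract what was actually emitted, not units_per.  */
      nunits -= chunk_nunits;
    }
  return nelts;
}
```

With 8-bit units, a 2-unit value such as HFmode yields a single chunk of 2, and a hypothetical 10-unit value yields chunks of 4, 4 and 2. The size of each chunk is also the width to which the non-reversed path sign-extends the corresponding element.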