[0/2,MSP430] Optimize zero_extend insns and PSImode pointer manipulation

Message ID 20191008113450.5b6fa184@jozef-kubuntu
Headers show
Series
  • Optimize zero_extend insns and PSImode pointer manipulation
Related show

Message

Jozef Lawrynowicz Oct. 8, 2019, 10:34 a.m.
In the large memory model, MSP430 instructions have some useful properties when
performing byte, word or address-word writes to registers or memory:
- Byte-writes to registers clear bits 19:8
- Word-writes to registers clear bits 19:16
- PSImode writes to memory clear bits 16:4 of the second memory word

This patch makes use of these properties to optimize some zero_extend
instructions.

There are some "synonyms" for these zero_extend instructions that combine
searches for when optimizing code which manipulates PSImode pointers. The patch
adds a number of these unnamed RTL insns.

The first patch is an "obvious" patch with no functional changes, which just
reorders the zero_extend insns in the md file so we get them in one place.
The second patch has functional changes.

(Note that the patches will not apply cleanly unless the recently submitted
patch to implement post increment addressing has been applied:
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00492.html)

Successfully regtested on trunk in the small and large memory models.

Ok for trunk?

Jozef Lawrynowicz (2):
  MSP430: Reorder and group zero_extend insns in msp430.md
  MSP430: PSImode pointer manipulation and zero extend insn
    optimizations

 gcc/config/msp430/msp430.md | 236 +++++++++++++++++++++++++-----------
 1 file changed, 168 insertions(+), 68 deletions(-)

Comments

Jeff Law Oct. 14, 2019, 9:18 p.m. | #1
On 10/8/19 4:34 AM, Jozef Lawrynowicz wrote:
> In the large memory model, MSP430 instructions have some useful properties when

> performing byte, word or address-word writes to registers or memory:

> - Byte-writes to registers clear bits 19:8

> - Word-writes to registers clear bits 19:16

> - PSImode writes to memory clear bits 16:4 of the second memory word

> 

> This patch makes use of these properties to optimize some zero_extend

> instructions.

> 

> There are some "synonyms" for these zero_extend instructions that combine

> searches for when optimizing code which manipulates PSImode pointers. The patch

> adds a number of these unnamed RTL insns.

> 

> The first patch is an "obvious" patch with no functional changes, which just

> reorders the zero_extend insns in the md file so we get them in one place.

> The second patch has functional changes.

> 

> (Note that the patches will not apply cleanly unless the recently submitted

> patch to implement post increment addressing has been applied:

> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00492.html)

> 

> Successfully regtested on trunk in the small and large memory models.

> 

> Ok for trunk?

> 

> Jozef Lawrynowicz (2):

>   MSP430: Reorder and group zero_extend insns in msp430.md

>   MSP430: PSImode pointer manipulation and zero extend insn

>     optimizations

FWIW, we often end up describing some of this stuff via
WORD_REGISTER_OPERATIONS.  However that may not work for word sizes that
are not a power of two bits.

I only mention it because it may be worth a bit of experimentation on
your end.

jeff
Jozef Lawrynowicz Oct. 15, 2019, 12:49 p.m. | #2
On Mon, 14 Oct 2019 15:18:08 -0600
Jeff Law <law@redhat.com> wrote:

> On 10/8/19 4:34 AM, Jozef Lawrynowicz wrote:

> > In the large memory model, MSP430 instructions have some useful properties when

> > performing byte, word or address-word writes to registers or memory:

> > - Byte-writes to registers clear bits 19:8

> > - Word-writes to registers clear bits 19:16

> > - PSImode writes to memory clear bits 16:4 of the second memory word

> > 

> > This patch makes use of these properties to optimize some zero_extend

> > instructions.

> > 

> > There are some "synonyms" for these zero_extend instructions that combine

> > searches for when optimizing code which manipulates PSImode pointers. The patch

> > adds a number of these unnamed RTL insns.

> > 

> > The first patch is an "obvious" patch with no functional changes, which just

> > reorders the zero_extend insns in the md file so we get them in one place.

> > The second patch has functional changes.

> > 

> > (Note that the patches will not apply cleanly unless the recently submitted

> > patch to implement post increment addressing has been applied:

> > https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00492.html)

> > 

> > Successfully regtested on trunk in the small and large memory models.

> > 

> > Ok for trunk?

> > 

> > Jozef Lawrynowicz (2):

> >   MSP430: Reorder and group zero_extend insns in msp430.md

> >   MSP430: PSImode pointer manipulation and zero extend insn

> >     optimizations  

> FWIW, we often end up describing some of this stuff via

> WORD_REGISTER_OPERATIONS.  However that may not work for word sizes that

> are not a power of two bits.

> 

> I only mention it because it may be worth a bit of experimentation on

> your end.


We have WORD_REGISTER_OPERATIONS defined to 1, but UNITS_PER_WORD is always 2
bytes. But yes I should look at how the RTL passes use this macro.

I played around with some other macros e.g. TARGET_TRULY_NOOP_TRUNCATION,
REGMODE_NATURAL_SIZE but none had any desirable effect. It feels like there is
still something that needs to be fixed which will unlock better RTL generation
for PSImode operations, but I believe I've exhausted the target macros which
can influence this. So maybe something in the middle-end needs to be added or
extended.
I should also study some of the other ports with PSImode registers more closely.

P.S. I took at the old mn10200 port as you suggested before but I wasn't able
to improve code gen by changing any of the target macros that were defined
differently. The combiner patterns for pointer manipulation in the port were
similar but not close enough to the patterns I added in this patch.

Thanks,
Jozef
> 

> jeff
Jeff Law Oct. 15, 2019, 6:49 p.m. | #3
On 10/15/19 6:49 AM, Jozef Lawrynowicz wrote:
> On Mon, 14 Oct 2019 15:18:08 -0600

> Jeff Law <law@redhat.com> wrote:

> 

>> On 10/8/19 4:34 AM, Jozef Lawrynowicz wrote:

>>> In the large memory model, MSP430 instructions have some useful properties when

>>> performing byte, word or address-word writes to registers or memory:

>>> - Byte-writes to registers clear bits 19:8

>>> - Word-writes to registers clear bits 19:16

>>> - PSImode writes to memory clear bits 16:4 of the second memory word

>>>

>>> This patch makes use of these properties to optimize some zero_extend

>>> instructions.

>>>

>>> There are some "synonyms" for these zero_extend instructions that combine

>>> searches for when optimizing code which manipulates PSImode pointers. The patch

>>> adds a number of these unnamed RTL insns.

>>>

>>> The first patch is an "obvious" patch with no functional changes, which just

>>> reorders the zero_extend insns in the md file so we get them in one place.

>>> The second patch has functional changes.

>>>

>>> (Note that the patches will not apply cleanly unless the recently submitted

>>> patch to implement post increment addressing has been applied:

>>> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00492.html)

>>>

>>> Successfully regtested on trunk in the small and large memory models.

>>>

>>> Ok for trunk?

>>>

>>> Jozef Lawrynowicz (2):

>>>   MSP430: Reorder and group zero_extend insns in msp430.md

>>>   MSP430: PSImode pointer manipulation and zero extend insn

>>>     optimizations  

>> FWIW, we often end up describing some of this stuff via

>> WORD_REGISTER_OPERATIONS.  However that may not work for word sizes that

>> are not a power of two bits.

>>

>> I only mention it because it may be worth a bit of experimentation on

>> your end.

> 

> We have WORD_REGISTER_OPERATIONS defined to 1, but UNITS_PER_WORD is always 2

> bytes. But yes I should look at how the RTL passes use this macro.

> 

> I played around with some other macros e.g. TARGET_TRULY_NOOP_TRUNCATION,

> REGMODE_NATURAL_SIZE but none had any desirable effect. It feels like there is

> still something that needs to be fixed which will unlock better RTL generation

> for PSImode operations, but I believe I've exhausted the target macros which

> can influence this. So maybe something in the middle-end needs to be added or

> extended.

> I should also study some of the other ports with PSImode registers more closely.

There aren't many that use PSImode.  In general we don't handle partial
modes well in the optimizers -- largely because they're just not that
common and the exact size is unspecified.  PSImode for example can be
anywhere between 16 and 32 bits.  We can't track the sign bit state,
simplify extensions, etc etc.

One of the problems I distinctly remember is the promotion of integer
loop induction variables to ptrdiff_t then using them in pointer
arithmetic generated *horrible* code.  It's ultimately necessary from a
standard standpoint.


> 

> P.S. I took at the old mn10200 port as you suggested before but I wasn't able

> to improve code gen by changing any of the target macros that were defined

> differently. The combiner patterns for pointer manipulation in the port were

> similar but not close enough to the patterns I added in this patch.

ACK.

jeff
Segher Boessenkool Oct. 15, 2019, 11 p.m. | #4
On Tue, Oct 15, 2019 at 12:49:18PM -0600, Jeff Law wrote:
> There aren't many that use PSImode.  In general we don't handle partial

> modes well in the optimizers -- largely because they're just not that

> common and the exact size is unspecified.  PSImode for example can be

> anywhere between 16 and 32 bits.  We can't track the sign bit state,

> simplify extensions, etc etc.


For GCC, it has exactly 32 bits, just not all are necessarily significant:

  A @code{MODE_PARTIAL_INT} mode behaves as if it were as wide as the
  corresponding @code{MODE_INT} mode, except that it has an unknown
  number of undefined bits.

So sure, you could use only the low 16 bits of it, for example, if that
works out well for your port.

Such modes cannot (in the portable code) be used for any arithmetic, only
for data movement and the like.

> One of the problems I distinctly remember is the promotion of integer

> loop induction variables to ptrdiff_t then using them in pointer

> arithmetic generated *horrible* code.


Heh.


Segher