[0/5] C-SKY port

Message ID 49d0a2c8-51a0-4a74-d015-0bf1c1098e38@codesourcery.com
Headers show
Series
  • C-SKY port
Related show

Message

Sandra Loosemore July 24, 2018, 4:17 a.m.
This patch series is a new port for C-SKY ABI V2 processors.  It's
based on the original work by C-SKY with cleanup, modernization, and
bug-fixing by Mentor Graphics.  The supported targets are csky-elf and
csky-linux-* (both glibc and uClibc).  At this time we're only
supporting the C and C++ languages, but we've built all the other
front ends except Ada where we're still trying to sort out build
environment issues.

C-SKY proposes Han Mao <han_mao@c-sky.com> and Yunhai Shang
<yunhai_shang@c-sky.com> as maintainers for this port.  We expect that
C-SKY will also be providing a public link to the processor and ABI
documentation at some point.  Meanwhile, here's brief overview
of the architecture from the compiler's point of view:

This is a 32-bit target with mixed 16- and 32-bit instructions and
support for both endiannesses.  There are 5 different architecture
variants.  The ck801 variant is substantially different from the
others in terms of code generation -- it has only a few 32-bit
instructions and a restricted register set, so it's almost analagous
to a Thumb-only ARM core.  ck802 and ck803 have more 32-bit
instructions and 16 registers, while ck807 and ck810 are Linux-capable
with 32 registers.

The ABI is fairly standard; arguments are passed in r0-r3 with the
overflow on the stack, return values in r0-r1, etc.  There is no
dedicated frame pointer register.

The ELF target is configured to build 16(!) multilibs and the Linux
targets 8, to account for hard/soft float variants and endiannesses.
We've been testing all of them with QEMU (user-mode for the Linux
targets, since we don't have a stable kernel yet) and the results are
quite good now.

I've split the patch set up into 5 pieces, with part 2 having the bulk
of the new code:

[1] Configury
[2] Backend implementation
[3] Documentation
[4] Testsuite
[5] libgcc

We also have a couple small patches to target-independent code to fix
bugs, which we'll post separately.

-Sandra

Comments

Sandra Loosemore July 24, 2018, 4:21 a.m. | #1
2018-07-23  Jojo  <jijie_rong@c-sky.com>
             Huibin Wang  <huibin_wang@c-sky.com>
             Sandra Loosemore  <sandra@codesourcery.com>
             Chung-Lin Tang  <cltang@codesourcery.com>

         C-SKY port: Backend implementation

         gcc/
         * config/csky/*: New.
         * common/config/csky/*: New.
Sandra Loosemore July 24, 2018, 3:23 p.m. | #2
On 07/23/2018 10:17 PM, Sandra Loosemore wrote:

> C-SKY proposes Han Mao <han_mao@c-sky.com> and Yunhai Shang

> <yunhai_shang@c-sky.com> as maintainers for this port.  


I apologize, I made a mistake here.  The correct proposed maintainers 
are Xianmiao Qu <xianmiao_qu@c-sky.com> and Yunhai Shang
<yunhai_shang@c-sky.com>.

> We expect that

> C-SKY will also be providing a public link to the processor and ABI

> documentation at some point.


The ABI manual has been posted, but not the ISA documentation yet.  (I'd 
guess that when it does show up it will be in the same place, though.)

https://github.com/c-sky/csky-doc

-Sandra
Jeff Law July 24, 2018, 3:45 p.m. | #3
On 07/23/2018 10:21 PM, Sandra Loosemore wrote:
> 2018-07-23  Jojo  <jijie_rong@c-sky.com>

>             Huibin Wang  <huibin_wang@c-sky.com>

>             Sandra Loosemore  <sandra@codesourcery.com>

>             Chung-Lin Tang  <cltang@codesourcery.com>

> 

>         C-SKY port: Backend implementation

> 

>         gcc/

>         * config/csky/*: New.

>         * common/config/csky/*: New.


Let's avoid gratutious whitespace that attempts to line up conditionals.
  As an example, look at the predicate csky_load_multiple_operation.  I
think just doing a quick pass over the .c, .h and main .md files should
be sufficient here.

I'm not a big fan of more awk code, but I'm not going to object to it :-)

Why does the port have its own little pass for condition code
optimization (cse_cc)?  What is it doing that can't be done with our
generic optimizers?

Any thoughts on using the newer function descriptor bits rather than old
style stack trampolines?

I don't see anything terribly concerning in the core of the port.  The
amount of support code for minipool is huge and I wonder if some sharing
across the various ports would be possible, but I don't think that
should be a blocking issue for this port.


Can you update the backends.html web page here appropriately for the
c-sky target?

I'd like to take a closer look, but those are the high level comment's
I've got this morning :-)


Jeff
Sandra Loosemore July 24, 2018, 6:18 p.m. | #4
On 07/24/2018 09:45 AM, Jeff Law wrote:
> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:

>> 2018-07-23  Jojo  <jijie_rong@c-sky.com>

>>              Huibin Wang  <huibin_wang@c-sky.com>

>>              Sandra Loosemore  <sandra@codesourcery.com>

>>              Chung-Lin Tang  <cltang@codesourcery.com>

>>

>>          C-SKY port: Backend implementation

>>

>>          gcc/

>>          * config/csky/*: New.

>>          * common/config/csky/*: New.

> 

> Let's avoid gratutious whitespace that attempts to line up conditionals.

>    As an example, look at the predicate csky_load_multiple_operation.  I

> think just doing a quick pass over the .c, .h and main .md files should

> be sufficient here.


OK, will do.

> I'm not a big fan of more awk code, but I'm not going to object to it :-)

> 

> Why does the port have its own little pass for condition code

> optimization (cse_cc)?  What is it doing that can't be done with our

> generic optimizers?


This pass was included in the initial patch set we got from C-SKY, and 
as it didn't seem to break anything I left it in.  Perhaps C-SKY can 
provide a testcase that demonstrates why it's still useful in the 
current version of GCC; otherwise we can remove this from the initial 
port submission and restore it later if some performance analysis shows 
it is still worthwhile.

> Any thoughts on using the newer function descriptor bits rather than old

> style stack trampolines?


Has that been committed?  I vaguely remembered discussion of a new way 
to handle nested functions without using the trampoline interface, but I 
couldn't find any documentation in the internals manual.

> I don't see anything terribly concerning in the core of the port.  The

> amount of support code for minipool is huge and I wonder if some sharing

> across the various ports would be possible, but I don't think that

> should be a blocking issue for this port.


Yes, that code was clearly copied almost verbatim from the ARM backend. 
I left it alone as much as possible to simplify any future attempts at 
genericizing it.

> Can you update the backends.html web page here appropriately for the

> c-sky target?


Sure, I can take care of updating that when the port is committed.  I 
believe the right entry is

"csky                          b   ia"

> I'd like to take a closer look, but those are the high level comment's

> I've got this morning :-)


Thanks.  I'll wait a bit for more comments to come in before preparing a 
revised patch.

-Sandra
Jeff Law July 24, 2018, 9:24 p.m. | #5
On 07/24/2018 12:18 PM, Sandra Loosemore wrote:
> On 07/24/2018 09:45 AM, Jeff Law wrote:

>> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:

>>> 2018-07-23  Jojo  <jijie_rong@c-sky.com>

>>>              Huibin Wang  <huibin_wang@c-sky.com>

>>>              Sandra Loosemore  <sandra@codesourcery.com>

>>>              Chung-Lin Tang  <cltang@codesourcery.com>

>>>

>>>          C-SKY port: Backend implementation

>>>

>>>          gcc/

>>>          * config/csky/*: New.

>>>          * common/config/csky/*: New.

>>

>> Let's avoid gratutious whitespace that attempts to line up conditionals.

>>    As an example, look at the predicate csky_load_multiple_operation.  I

>> think just doing a quick pass over the .c, .h and main .md files should

>> be sufficient here.

> 

> OK, will do.

> 

>> I'm not a big fan of more awk code, but I'm not going to object to it :-)

>>

>> Why does the port have its own little pass for condition code

>> optimization (cse_cc)?  What is it doing that can't be done with our

>> generic optimizers?

> 

> This pass was included in the initial patch set we got from C-SKY, and

> as it didn't seem to break anything I left it in.  Perhaps C-SKY can

> provide a testcase that demonstrates why it's still useful in the

> current version of GCC; otherwise we can remove this from the initial

> port submission and restore it later if some performance analysis shows

> it is still worthwhile.

FWIW it looks like we model CC setting on just a few insns, (add,
subtract) so I'd be surprised if this little mini pass found much.  I'd
definitely like to hear from the csky authors here.

Alternately, you could do add some instrumentation to flag when it
triggers, take a test or two that does, reduce it and we can then look
at the key RTL sequences and see what the pass is really doing.

> 

>> Any thoughts on using the newer function descriptor bits rather than old

>> style stack trampolines?

> 

> Has that been committed?  I vaguely remembered discussion of a new way

> to handle nested functions without using the trampoline interface, but I

> couldn't find any documentation in the internals manual.

It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)
ports that define it.



> 

>> I don't see anything terribly concerning in the core of the port.  The

>> amount of support code for minipool is huge and I wonder if some sharing

>> across the various ports would be possible, but I don't think that

>> should be a blocking issue for this port.

> 

> Yes, that code was clearly copied almost verbatim from the ARM backend.

> I left it alone as much as possible to simplify any future attempts at

> genericizing it.

Understood -- I'd assumed it was largely copied from ARM, but hadn't
gone back to the ARM bits to verify.

> 

>> Can you update the backends.html web page here appropriately for the

>> c-sky target?

> 

> Sure, I can take care of updating that when the port is committed.  I

> believe the right entry is

> 

> "csky                          b   ia"

Yea, that seems right to me.

Jeff
Sandra Loosemore July 25, 2018, 12:17 a.m. | #6
On 07/24/2018 03:24 PM, Jeff Law wrote:
>>

>>> Any thoughts on using the newer function descriptor bits rather than old

>>> style stack trampolines?

>>

>> Has that been committed?  I vaguely remembered discussion of a new way

>> to handle nested functions without using the trampoline interface, but I

>> couldn't find any documentation in the internals manual.

> It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)

> ports that define it.


Hmmm, I completely failed to make that connection from the docs -- the 
whole description of that hook is pretty gibberishy and I thought it was 
something for targets where the ABI already specifies some "standard 
calling sequence" using descriptors (C-SKY doesn't), rather than a 
generic alternative to executable trampolines.  Putting on my doc 
maintainer hat briefly, I can see this needs a lot of work.  :-(

Anyway, is this required for new ports nowadays?  If so, I at least know 
what to search for now.  At this point I couldn't say whether this would 
do anything to fix the situation on ck801 targets where there simply 
aren't enough spare registers available to the trampoline to both hold 
the static link and do an indirect jump.

-Sandra
Jeff Law July 25, 2018, 4:50 a.m. | #7
On 07/24/2018 06:17 PM, Sandra Loosemore wrote:
> On 07/24/2018 03:24 PM, Jeff Law wrote:

>>>

>>>> Any thoughts on using the newer function descriptor bits rather than

>>>> old

>>>> style stack trampolines?

>>>

>>> Has that been committed?  I vaguely remembered discussion of a new way

>>> to handle nested functions without using the trampoline interface, but I

>>> couldn't find any documentation in the internals manual.

>> It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)

>> ports that define it.

> 

> Hmmm, I completely failed to make that connection from the docs -- the

> whole description of that hook is pretty gibberishy and I thought it was

> something for targets where the ABI already specifies some "standard

> calling sequence" using descriptors (C-SKY doesn't), rather than a

> generic alternative to executable trampolines.  Putting on my doc

> maintainer hat briefly, I can see this needs a lot of work.  :-(

Most likely :-)  So many things to do, so little time.


> 

> Anyway, is this required for new ports nowadays?  If so, I at least know

> what to search for now.  At this point I couldn't say whether this would

> do anything to fix the situation on ck801 targets where there simply

> aren't enough spare registers available to the trampoline to both hold

> the static link and do an indirect jump.

It's not required, but preferred, particularly if the part has an MMU
that can provide no-execute protections on pages in memory.  If the
target doesn't have an mmu, then it's of marginal value.

The key advantage it has over the old trampoline implementation is that
stacks can remain non-executable, even for Ada and nested functions.
That's a big win from a security standpoint.

Jeff
Paul Koning July 25, 2018, 1:16 p.m. | #8
> On Jul 25, 2018, at 12:50 AM, Jeff Law <law@redhat.com> wrote:

> 

>>>> ...

>>> It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)

>>> ports that define it.

>> 

>> Hmmm, I completely failed to make that connection from the docs -- the

>> whole description of that hook is pretty gibberishy and I thought it was

>> something for targets where the ABI already specifies some "standard

>> calling sequence" using descriptors (C-SKY doesn't), rather than a

>> generic alternative to executable trampolines.  Putting on my doc

>> maintainer hat briefly, I can see this needs a lot of work.  :-(

> Most likely :-)  So many things to do, so little time.

> 

> 

>> 

>> Anyway, is this required for new ports nowadays?  If so, I at least know

>> what to search for now.  At this point I couldn't say whether this would

>> do anything to fix the situation on ck801 targets where there simply

>> aren't enough spare registers available to the trampoline to both hold

>> the static link and do an indirect jump.

> It's not required, but preferred, particularly if the part has an MMU

> that can provide no-execute protections on pages in memory.  If the

> target doesn't have an mmu, then it's of marginal value.

> 

> The key advantage it has over the old trampoline implementation is that

> stacks can remain non-executable, even for Ada and nested functions.

> That's a big win from a security standpoint.


Non-executable stacks are a very good thing.

That said, I also looked at the target hook documentation and was left without any clue whatsoever.  It sure isn't clear what powers of two have to do with descriptors, or what descriptors have to do with support for nested functions.

Can you suggest places to look to get an understanding of this feature?  It sounds like the only other option is "Use the source, Luke".  Any specific targets that make a good reference implementation for this?

	paul
Sandra Loosemore July 25, 2018, 2:54 p.m. | #9
On 07/25/2018 07:16 AM, Paul Koning wrote:

> Non-executable stacks are a very good thing.

> 

> That said, I also looked at the target hook documentation and was

> left without any clue whatsoever.  It sure isn't clear what powers of

> two have to do with descriptors, or what descriptors have to do with

> support for nested functions.

> 

> Can you suggest places to look to get an understanding of this

> feature?  It sounds like the only other option is "Use the source,

> Luke".  Any specific targets that make a good reference

> implementation for this?


FYI, so far I have found PR ada/67205 and the original patch posting 
here, but it looks like "Use the source" is indeed where we are on this. 
  :-(

https://gcc.gnu.org/ml/gcc-patches/2016-06/msg02016.html

-Sandra
Xianmiao Qu July 26, 2018, 6:06 a.m. | #10
> 在 2018年7月25日,上午5:24,Jeff Law <law@redhat.com> 写道:

> 

> On 07/24/2018 12:18 PM, Sandra Loosemore wrote:

>> On 07/24/2018 09:45 AM, Jeff Law wrote:

>>> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:

>>> I'm not a big fan of more awk code, but I'm not going to object to it :-)

>>> 

>>> Why does the port have its own little pass for condition code

>>> optimization (cse_cc)?  What is it doing that can't be done with our

>>> generic optimizers?

>> 

>> This pass was included in the initial patch set we got from C-SKY, and

>> as it didn't seem to break anything I left it in.  Perhaps C-SKY can

>> provide a testcase that demonstrates why it's still useful in the

>> current version of GCC; otherwise we can remove this from the initial

>> port submission and restore it later if some performance analysis shows

>> it is still worthwhile.

> FWIW it looks like we model CC setting on just a few insns, (add,

> subtract) so I'd be surprised if this little mini pass found much.  I'd

> definitely like to hear from the csky authors here.

> 

> Alternately, you could do add some instrumentation to flag when it

> triggers, take a test or two that does, reduce it and we can then look

> at the key RTL sequences and see what the pass is really doing.

> 


I wrote a case to reproduce this problem on C-SKY. C code is as follows:
-----------------------------------------------------------------------
int e1, e2;

void func (int a, int b, int c, int d, int f, int g)
{
  e1 = a > b ? f : g;
  e2 = a > b ? c : d;

  return;
}
-----------------------------------------------------------------------

compile to assembler with option “-O3 -S” :
-----------------------------------------------------------------------
func:
  cmplt a1, a0
  ld.w  t1, (sp, 0)
  ld.w  t0, (sp, 4)
  movt  t0, t1
  cmplt a1, a0
  movt  a3, a2
  lrw a1, e2
  lrw a2, e1
  st.w  a3, (a1, 0)
  st.w  t0, (a2, 0)
  rts
-----------------------------------------------------------------------
There is an extra “cmplt a1, a0" in the above code without cse_cc. This situation mainly occurs when a relatively short branch jump is converted into a conditional execution instruction. And the CSE pass can not reduce the same conditional comparison instruction . Here is the rtx sequence after “cse2” pass.

-----------------------------------------------------------------------
(insn 28 13 29 2 (set (reg:CC 33 c)
        (gt:CC (reg/v:SI 77 [ a ])
            (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
     (nil))
(insn 29 28 30 2 (set (reg/v:SI 82 [ g ])
        (if_then_else:SI (eq (reg:CC 33 c)
                (const_int 0 [0]))
            (reg/v:SI 82 [ g ])
            (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}
     (expr_list:REG_DEAD (reg/v:SI 81 [ f ])
        (expr_list:REG_DEAD (reg:CC 33 c)
            (nil))))
(insn 30 29 31 2 (set (reg:CC 33 c)
        (gt:CC (reg/v:SI 77 [ a ])
            (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
     (expr_list:REG_DEAD (reg/v:SI 78 [ b ])
        (expr_list:REG_DEAD (reg/v:SI 77 [ a ])
            (nil))))
(insn 31 30 18 2 (set (reg/v:SI 80 [ d ])
        (if_then_else:SI (eq (reg:CC 33 c)
                (const_int 0 [0]))
            (reg/v:SI 80 [ d ])
            (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}
     (expr_list:REG_DEAD (reg/v:SI 79 [ c ])
        (expr_list:REG_DEAD (reg:CC 33 c)
            (nil))))
-----------------------------------------------------------------------

It doesn't seem to check the same conditional comparison instruction .So I wrote this to solve this problem, but I am not sure if this is the best way : )

PS, the same conditional comparison instruction cannot be reduced with the latest version gcc with C-SKY because I just insert the “cse_cc” after “cse1”, when I insert after “cse2”, this problem can be solved very well.


Thanks,
Xianmiao
Joseph Myers July 26, 2018, 11:04 p.m. | #11
On Tue, 24 Jul 2018, Sandra Loosemore wrote:

> On 07/23/2018 10:17 PM, Sandra Loosemore wrote:

> 

> > C-SKY proposes Han Mao <han_mao@c-sky.com> and Yunhai Shang

> > <yunhai_shang@c-sky.com> as maintainers for this port.  

> 

> I apologize, I made a mistake here.  The correct proposed maintainers are

> Xianmiao Qu <xianmiao_qu@c-sky.com> and Yunhai Shang

> <yunhai_shang@c-sky.com>.


I'd like to confirm: will the maintainers be running a bot reporting test 
results for trunk for at least one C-SKY configuration to gcc-testresults 
(daily or thereabouts), and monitoring results for regressions and 
addressing those regressions promptly?  We don't yet have such a 
requirement, but when an architecture-independent patch causes failures on 
a particular architectures, it's much better if the architecture 
maintainers raise the issue while the patch is still fresh in the author's 
mind (meaning within a few days at most of the commit) than if they only 
look at results and track down the patch causing the failure many months 
later (e.g. while reviewing test results in preparation for the following 
release).

> > We expect that

> > C-SKY will also be providing a public link to the processor and ABI

> > documentation at some point.

> 

> The ABI manual has been posted, but not the ISA documentation yet.  (I'd guess

> that when it does show up it will be in the same place, though.)

> 

> https://github.com/c-sky/csky-doc


Could you provide the proposed GCC website changes for the port 
(backends.html, readings.html, news item for index.html)?  readings.html, 
in particular, would link to the ABI and ISA documentation, while 
backends.html gives summary information about the properties of both the 
architecture and the GCC port.

-- 
Joseph S. Myers
joseph@codesourcery.com
Sandra Loosemore July 28, 2018, 1:49 a.m. | #12
On 07/26/2018 12:06 AM, 瞿仙淼 wrote:
> 

> I wrote a case to reproduce this problem on C-SKY. C code is as follows:

> -----------------------------------------------------------------------

> int e1, e2;

> 

> void func (int a, int b, int c, int d, int f, int g)

> {

>    e1 = a > b ? f : g;

>    e2 = a > b ? c : d;

> 

>    return;

> }

> -----------------------------------------------------------------------

> 

> compile to assembler with option “-O3 -S” :

> -----------------------------------------------------------------------

> func:

>    cmplt a1, a0

>    ld.w  t1, (sp, 0)

>    ld.w  t0, (sp, 4)

>    movt  t0, t1

>    cmplt a1, a0

>    movt  a3, a2

>    lrw a1, e2

>    lrw a2, e1

>    st.w  a3, (a1, 0)

>    st.w  t0, (a2, 0)

>    rts

> -----------------------------------------------------------------------

> There is an extra “cmplt a1, a0" in the above code without cse_cc. This situation mainly occurs when a relatively short branch jump is converted into a conditional execution instruction. And the CSE pass can not reduce the same conditional comparison instruction . Here is the rtx sequence after “cse2” pass.

> 

> -----------------------------------------------------------------------

> (insn 28 13 29 2 (set (reg:CC 33 c)

>          (gt:CC (reg/v:SI 77 [ a ])

>              (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}

>       (nil))

> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])

>          (if_then_else:SI (eq (reg:CC 33 c)

>                  (const_int 0 [0]))

>              (reg/v:SI 82 [ g ])

>              (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}

>       (expr_list:REG_DEAD (reg/v:SI 81 [ f ])

>          (expr_list:REG_DEAD (reg:CC 33 c)

>              (nil))))

> (insn 30 29 31 2 (set (reg:CC 33 c)

>          (gt:CC (reg/v:SI 77 [ a ])

>              (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}

>       (expr_list:REG_DEAD (reg/v:SI 78 [ b ])

>          (expr_list:REG_DEAD (reg/v:SI 77 [ a ])

>              (nil))))

> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])

>          (if_then_else:SI (eq (reg:CC 33 c)

>                  (const_int 0 [0]))

>              (reg/v:SI 80 [ d ])

>              (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}

>       (expr_list:REG_DEAD (reg/v:SI 79 [ c ])

>          (expr_list:REG_DEAD (reg:CC 33 c)

>              (nil))))

> -----------------------------------------------------------------------

> 

> It doesn't seem to check the same conditional comparison instruction .So I wrote this to solve this problem, but I am not sure if this is the best way : )

> 

> PS, the same conditional comparison instruction cannot be reduced with the latest version gcc with C-SKY because I just insert the “cse_cc” after “cse1”, when I insert after “cse2”, this problem can be solved very well.


Thanks, this is very helpful.  I've verified this and I'm moving the 
pass as you suggest, adding a more detailed comment to the source to 
explain what the pass is for, and adding your test case to the 
testsuite.  This will be included when I resubmit the patches to address 
other review comments too.

Jeff, does that adequately address your concerns about whether the pass 
is useful?

-Sandra
Sandra Loosemore July 30, 2018, 4:59 p.m. | #13
On 07/26/2018 05:04 PM, Joseph Myers wrote:

> Could you provide the proposed GCC website changes for the port

> (backends.html, readings.html, news item for index.html)?  readings.html,

> in particular, would link to the ABI and ISA documentation, while

> backends.html gives summary information about the properties of both the

> architecture and the GCC port.


The attached patch is a bit drafty, but I think it at least as 
placeholders for everything.

-Sandra
? htdocs/backends.html.~1.78.~
? htdocs/index.html.~1.1090.~
? htdocs/readings.html.~1.296.~
Index: htdocs/backends.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/backends.html,v
retrieving revision 1.78
diff -u -r1.78 backends.html
--- htdocs/backends.html	28 Jul 2018 22:20:20 -0000	1.78
+++ htdocs/backends.html	30 Jul 2018 16:56:17 -0000
@@ -76,6 +76,7 @@
 c6x        |   S     CB          gi
 cr16       |    L  F C           g    s
 cris       |       F  B     c    gi   s
+csky       |                  b   ia
 epiphany   |         C           gi   s
 fr30       | ??    FI B      pb mg    s
 frv        | ??       B       b   i   s
Index: htdocs/index.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.1090
diff -u -r1.1090 index.html
--- htdocs/index.html	30 Jul 2018 01:38:08 -0000	1.1090
+++ htdocs/index.html	30 Jul 2018 16:56:17 -0000
@@ -52,6 +52,11 @@
 <h2 id="news">News</h2>
 <dl>
 
+<dt><span>C-SKY support</span>
+     <span class="date">[2018-xx-xx]</span></dt>
+     <dd>GCC support for C-SKY V2 processors has been added.  This back
+       end was contributed by C-SKY Microsystems and Mentor Graphics.</dd>
+
 <dt><span><a href="https://gcc.gnu.org/wiki/cauldron2018">GNU Tools Cauldron 2018</a></span>
     <span class="date">[2018-07-29]</span></dt>
     <dd>Will be held in Manchester, September 7-9 2018</dd>
Index: htdocs/readings.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
retrieving revision 1.296
diff -u -r1.296 readings.html
--- htdocs/readings.html	28 Jul 2018 22:20:20 -0000	1.296
+++ htdocs/readings.html	30 Jul 2018 16:56:17 -0000
@@ -120,6 +120,11 @@
    <br /><a href="http://developer.axis.com/">Site with CPU documentation</a>
  </li>
  
+ <li>C-SKY
+   <br />Manufacturer: C-SKY Microsystems
+   <br /><a href="https://github.com/c-sky/csky-doc">C-SKY Documentation</a>
+ </li>
+ 
  <li>Epiphany
   <br />Manufacturer: Adapteva
   <br /><a href="http://www.adapteva.com/">Manufacturer's website</a> with
Index: htdocs/gcc-9/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.15
diff -u -r1.15 changes.html
--- htdocs/gcc-9/changes.html	25 Jul 2018 21:54:59 -0000	1.15
+++ htdocs/gcc-9/changes.html	30 Jul 2018 16:56:17 -0000
@@ -138,6 +138,13 @@
 
 <!-- <h3 id="avr">AVR</h3> -->
 
+<h3 id="csky">C-SKY</h3>
+<ul>
+  <li>
+    A new back end targeting C-SKY V2 processors has been contributed to GCC.
+  </li>
+</ul>
+
 <!-- <h3 id="hsa">Heterogeneous Systems Architecture</h3> -->
 
 <!-- <h3 id="x86">IA-32/x86-64</h3> -->
Xianmiao Qu Aug. 1, 2018, 2:28 p.m. | #14
>>> We expect that

>>> C-SKY will also be providing a public link to the processor and ABI

>>> documentation at some point.

>> 

>> The ABI manual has been posted, but not the ISA documentation yet.  (I'd guess

>> that when it does show up it will be in the same place, though.)

>> 

>> https://github.com/c-sky/csky-doc

> 

> Could you provide the proposed GCC website changes for the port 

> (backends.html, readings.html, news item for index.html)?  readings.html, 

> in particular, would link to the ABI and ISA documentation, while 

> backends.html gives summary information about the properties of both the 

> architecture and the GCC port.

> 

> -- 

> Joseph S. Myers

> joseph@codesourcery.com


Hi,
	The ISA documentation is now available from  https://github.com/c-sky/csky-doc

-Xianmiao
Jeff Law Aug. 2, 2018, 10:27 p.m. | #15
On 07/26/2018 12:06 AM, 瞿仙淼 wrote:
> 

>> 在 2018年7月25日,上午5:24,Jeff Law <law@redhat.com> 写道:

>>

>> On 07/24/2018 12:18 PM, Sandra Loosemore wrote:

>>> On 07/24/2018 09:45 AM, Jeff Law wrote:

>>>> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:

>>>> I'm not a big fan of more awk code, but I'm not going to object to it :-)

>>>>

>>>> Why does the port have its own little pass for condition code

>>>> optimization (cse_cc)?  What is it doing that can't be done with our

>>>> generic optimizers?

>>>

>>> This pass was included in the initial patch set we got from C-SKY, and

>>> as it didn't seem to break anything I left it in.  Perhaps C-SKY can

>>> provide a testcase that demonstrates why it's still useful in the

>>> current version of GCC; otherwise we can remove this from the initial

>>> port submission and restore it later if some performance analysis shows

>>> it is still worthwhile.

>> FWIW it looks like we model CC setting on just a few insns, (add,

>> subtract) so I'd be surprised if this little mini pass found much.  I'd

>> definitely like to hear from the csky authors here.

>>

>> Alternately, you could do add some instrumentation to flag when it

>> triggers, take a test or two that does, reduce it and we can then look

>> at the key RTL sequences and see what the pass is really doing.

>>

> 

> I wrote a case to reproduce this problem on C-SKY. C code is as follows:

> -----------------------------------------------------------------------

> int e1, e2;

> 

> void func (int a, int b, int c, int d, int f, int g)

> {

>   e1 = a > b ? f : g;

>   e2 = a > b ? c : d;

> 

>   return;

> }

> -----------------------------------------------------------------------

> 

> compile to assembler with option “-O3 -S” :

> -----------------------------------------------------------------------

> func:

>   cmplt a1, a0

>   ld.w  t1, (sp, 0)

>   ld.w  t0, (sp, 4)

>   movt  t0, t1

>   cmplt a1, a0

>   movt  a3, a2

>   lrw a1, e2

>   lrw a2, e1

>   st.w  a3, (a1, 0)

>   st.w  t0, (a2, 0)

>   rts

> -----------------------------------------------------------------------

> There is an extra “cmplt a1, a0" in the above code without cse_cc. This situation mainly occurs when a relatively short branch jump is converted into a conditional execution instruction. And the CSE pass can not reduce the same conditional comparison instruction . Here is the rtx sequence after “cse2” pass.

> 

> -----------------------------------------------------------------------

> (insn 28 13 29 2 (set (reg:CC 33 c)

>         (gt:CC (reg/v:SI 77 [ a ])

>             (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}

>      (nil))

> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])

>         (if_then_else:SI (eq (reg:CC 33 c)

>                 (const_int 0 [0]))

>             (reg/v:SI 82 [ g ])

>             (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}

>      (expr_list:REG_DEAD (reg/v:SI 81 [ f ])

>         (expr_list:REG_DEAD (reg:CC 33 c)

>             (nil))))

> (insn 30 29 31 2 (set (reg:CC 33 c)

>         (gt:CC (reg/v:SI 77 [ a ])

>             (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}

>      (expr_list:REG_DEAD (reg/v:SI 78 [ b ])

>         (expr_list:REG_DEAD (reg/v:SI 77 [ a ])

>             (nil))))

> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])

>         (if_then_else:SI (eq (reg:CC 33 c)

>                 (const_int 0 [0]))

>             (reg/v:SI 80 [ d ])

>             (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}

>      (expr_list:REG_DEAD (reg/v:SI 79 [ c ])

>         (expr_list:REG_DEAD (reg:CC 33 c)

>             (nil))))

> -----------------------------------------------------------------------

> 

> It doesn't seem to check the same conditional comparison instruction .So I wrote this to solve this problem, but I am not sure if this is the best way : )

> 

> PS, the same conditional comparison instruction cannot be reduced with the latest version gcc with C-SKY because I just insert the “cse_cc” after “cse1”, when I insert after “cse2”, this problem can be solved very well.

I think the cse_cc pass is really just working around one or more bugs
in CSE and/or a backend bug.  The RTL above clearly shows a common
subexpression that is not eliminated by CSE.

What CSE should be trying to do is changing the second and third
occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would
create nop-sets which would be subsequently deleted.  I suspect you do
not have an insn which matches that nop set of the CC register.  If you
fix that I suspect CSE will work better and eliminate the need for your
cse_cc pass.

jeff
Jeff Law Aug. 2, 2018, 10:33 p.m. | #16
On 07/27/2018 07:49 PM, Sandra Loosemore wrote:
> On 07/26/2018 12:06 AM, 瞿仙淼 wrote:

>>

>> I wrote a case to reproduce this problem on C-SKY. C code is as follows:

>> -----------------------------------------------------------------------

>> int e1, e2;

>>

>> void func (int a, int b, int c, int d, int f, int g)

>> {

>>    e1 = a > b ? f : g;

>>    e2 = a > b ? c : d;

>>

>>    return;

>> }

>> -----------------------------------------------------------------------

>>

>> compile to assembler with option “-O3 -S” :

>> -----------------------------------------------------------------------

>> func:

>>    cmplt a1, a0

>>    ld.w  t1, (sp, 0)

>>    ld.w  t0, (sp, 4)

>>    movt  t0, t1

>>    cmplt a1, a0

>>    movt  a3, a2

>>    lrw a1, e2

>>    lrw a2, e1

>>    st.w  a3, (a1, 0)

>>    st.w  t0, (a2, 0)

>>    rts

>> -----------------------------------------------------------------------

>> There is an extra “cmplt a1, a0" in the above code without cse_cc.

>> This situation mainly occurs when a relatively short branch jump is

>> converted into a conditional execution instruction. And the CSE pass

>> can not reduce the same conditional comparison instruction . Here is

>> the rtx sequence after “cse2” pass.

>>

>> -----------------------------------------------------------------------

>> (insn 28 13 29 2 (set (reg:CC 33 c)

>>          (gt:CC (reg/v:SI 77 [ a ])

>>              (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}

>>       (nil))

>> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])

>>          (if_then_else:SI (eq (reg:CC 33 c)

>>                  (const_int 0 [0]))

>>              (reg/v:SI 82 [ g ])

>>              (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}

>>       (expr_list:REG_DEAD (reg/v:SI 81 [ f ])

>>          (expr_list:REG_DEAD (reg:CC 33 c)

>>              (nil))))

>> (insn 30 29 31 2 (set (reg:CC 33 c)

>>          (gt:CC (reg/v:SI 77 [ a ])

>>              (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}

>>       (expr_list:REG_DEAD (reg/v:SI 78 [ b ])

>>          (expr_list:REG_DEAD (reg/v:SI 77 [ a ])

>>              (nil))))

>> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])

>>          (if_then_else:SI (eq (reg:CC 33 c)

>>                  (const_int 0 [0]))

>>              (reg/v:SI 80 [ d ])

>>              (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}

>>       (expr_list:REG_DEAD (reg/v:SI 79 [ c ])

>>          (expr_list:REG_DEAD (reg:CC 33 c)

>>              (nil))))

>> -----------------------------------------------------------------------

>>

>> It doesn't seem to check the same conditional comparison instruction

>> .So I wrote this to solve this problem, but I am not sure if this is

>> the best way : )

>>

>> PS, the same conditional comparison instruction cannot be reduced with

>> the latest version gcc with C-SKY because I just insert the “cse_cc”

>> after “cse1”, when I insert after “cse2”, this problem can be solved

>> very well.

> 

> Thanks, this is very helpful.  I've verified this and I'm moving the

> pass as you suggest, adding a more detailed comment to the source to

> explain what the pass is for, and adding your test case to the

> testsuite.  This will be included when I resubmit the patches to address

> other review comments too.

> 

> Jeff, does that adequately address your concerns about whether the pass

> is useful?

I think the pass is papering over problems elsewhere (see my most other
reply on this thread).  I do think it would be useful to take that code
and create a test based on it.  I suspect you'll want to verify multiple
GT expressions prior to CSE2 and that after CSE2 you have a single GT
expression.  Presumably it'd be in the csky specific test.

jeff
Yunhai Aug. 3, 2018, 7:58 a.m. | #17
> 在 2018年8月3日,06:27,Jeff Law <law@redhat.com> 写道:

> 

> On 07/26/2018 12:06 AM, 瞿仙淼 wrote:

>> 

>>> 在 2018年7月25日,上午5:24,Jeff Law <law@redhat.com> 写道:

>>> 

>>> On 07/24/2018 12:18 PM, Sandra Loosemore wrote:

>>>> On 07/24/2018 09:45 AM, Jeff Law wrote:

>>>>> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:

>>>>> I'm not a big fan of more awk code, but I'm not going to object to it :-)

>>>>> 

>>>>> Why does the port have its own little pass for condition code

>>>>> optimization (cse_cc)?  What is it doing that can't be done with our

>>>>> generic optimizers?

>>>> 

>>>> This pass was included in the initial patch set we got from C-SKY, and

>>>> as it didn't seem to break anything I left it in.  Perhaps C-SKY can

>>>> provide a testcase that demonstrates why it's still useful in the

>>>> current version of GCC; otherwise we can remove this from the initial

>>>> port submission and restore it later if some performance analysis shows

>>>> it is still worthwhile.

>>> FWIW it looks like we model CC setting on just a few insns, (add,

>>> subtract) so I'd be surprised if this little mini pass found much.  I'd

>>> definitely like to hear from the csky authors here.

>>> 

>>> Alternately, you could do add some instrumentation to flag when it

>>> triggers, take a test or two that does, reduce it and we can then look

>>> at the key RTL sequences and see what the pass is really doing.

>>> 

>> 

>> I wrote a case to reproduce this problem on C-SKY. C code is as follows:

>> -----------------------------------------------------------------------

>> int e1, e2;

>> 

>> void func (int a, int b, int c, int d, int f, int g)

>> {

>>  e1 = a > b ? f : g;

>>  e2 = a > b ? c : d;

>> 

>>  return;

>> }

>> -----------------------------------------------------------------------

>> 

>> compile to assembler with option “-O3 -S” :

>> -----------------------------------------------------------------------

>> func:

>>  cmplt a1, a0

>>  ld.w  t1, (sp, 0)

>>  ld.w  t0, (sp, 4)

>>  movt  t0, t1

>>  cmplt a1, a0

>>  movt  a3, a2

>>  lrw a1, e2

>>  lrw a2, e1

>>  st.w  a3, (a1, 0)

>>  st.w  t0, (a2, 0)

>>  rts

>> -----------------------------------------------------------------------

>> There is an extra “cmplt a1, a0" in the above code without cse_cc. This situation mainly occurs when a relatively short branch jump is converted into a conditional execution instruction. And the CSE pass can not reduce the same conditional comparison instruction . Here is the rtx sequence after “cse2” pass.

>> 

>> -----------------------------------------------------------------------

>> (insn 28 13 29 2 (set (reg:CC 33 c)

>>        (gt:CC (reg/v:SI 77 [ a ])

>>            (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}

>>     (nil))

>> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])

>>        (if_then_else:SI (eq (reg:CC 33 c)

>>                (const_int 0 [0]))

>>            (reg/v:SI 82 [ g ])

>>            (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}

>>     (expr_list:REG_DEAD (reg/v:SI 81 [ f ])

>>        (expr_list:REG_DEAD (reg:CC 33 c)

>>            (nil))))

>> (insn 30 29 31 2 (set (reg:CC 33 c)

>>        (gt:CC (reg/v:SI 77 [ a ])

>>            (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}

>>     (expr_list:REG_DEAD (reg/v:SI 78 [ b ])

>>        (expr_list:REG_DEAD (reg/v:SI 77 [ a ])

>>            (nil))))

>> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])

>>        (if_then_else:SI (eq (reg:CC 33 c)

>>                (const_int 0 [0]))

>>            (reg/v:SI 80 [ d ])

>>            (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}

>>     (expr_list:REG_DEAD (reg/v:SI 79 [ c ])

>>        (expr_list:REG_DEAD (reg:CC 33 c)

>>            (nil))))

>> -----------------------------------------------------------------------

>> 

>> It doesn't seem to check the same conditional comparison instruction .So I wrote this to solve this problem, but I am not sure if this is the best way : )

>> 

>> PS, the same conditional comparison instruction cannot be reduced with the latest version gcc with C-SKY because I just insert the “cse_cc” after “cse1”, when I insert after “cse2”, this problem can be solved very well.

> I think the cse_cc pass is really just working around one or more bugs

> in CSE and/or a backend bug.  The RTL above clearly shows a common

> subexpression that is not eliminated by CSE.

> 

> What CSE should be trying to do is changing the second and third

> occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would

> create nop-sets which would be subsequently deleted.  I suspect you do

> not have an insn which matches that nop set of the CC register.  If you

> fix that I suspect CSE will work better and eliminate the need for your

> cse_cc pass.


Thanks you for your suggestions, we will try this method.

> 

> jeff
Sandra Loosemore Aug. 3, 2018, 4:26 p.m. | #18
On 08/02/2018 04:27 PM, Jeff Law wrote:
> I think the cse_cc pass is really just working around one or more bugs

> in CSE and/or a backend bug.  The RTL above clearly shows a common

> subexpression that is not eliminated by CSE.

> 

> What CSE should be trying to do is changing the second and third

> occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would

> create nop-sets which would be subsequently deleted.  I suspect you do

> not have an insn which matches that nop set of the CC register.  If you

> fix that I suspect CSE will work better and eliminate the need for your

> cse_cc pass.


Thanks for the tip about that!  I hacked things up to do as you suggest 
and it appears to work.  I'll post a new patch set once I've had time 
for more testing, probably Monday-ish.

-Sandra