Document AMD GCN features.

Message ID 048fcb30-4c20-6290-64f4-93903df43601@codesourcery.com
State New

Commit Message

Andrew Stubbs Jan. 18, 2019, 11:42 a.m.
Hi,

This patch adds the documentation needed for the newly-added AMD GCN 
back end.

OK to commit?

Andrew

Comments

Jeff Law Jan. 21, 2019, 6:03 p.m. | #1
On 1/18/19 4:42 AM, Andrew Stubbs wrote:
> Hi,
>
> This patch adds the documentation needed for the newly-added AMD GCN
> back end.
>
> OK to commit?
>
> Andrew
>
> 190118-gcc-gcn-docs.patch
>
> Document AMD GCN.
>
> 2019-01-18  Andrew Stubbs  <ams@codesourcery.com>
>
> 	gcc/
> 	* doc/extend.texi (AMD GCN Function Attributes): New section.
> 	* doc/install.texi (amdgcn-unknown-amdhsa): New instructions.
> 	* doc/invoke.texi (AMD GCN Options): New section.
> 	* doc/md.texi (Constraints for Particular Machines): Add AMD GCN.

OK
jeff
Andrew Stubbs Jan. 22, 2019, 10:57 a.m. | #2
On 21/01/2019 18:03, Jeff Law wrote:
>> 2019-01-18  Andrew Stubbs  <ams@codesourcery.com>
>>
>> 	gcc/
>> 	* doc/extend.texi (AMD GCN Function Attributes): New section.
>> 	* doc/install.texi (amdgcn-unknown-amdhsa): New instructions.
>> 	* doc/invoke.texi (AMD GCN Options): New section.
>> 	* doc/md.texi (Constraints for Particular Machines): Add AMD GCN.
> OK

Committed, thanks.

Andrew

Patch

Document AMD GCN.

2019-01-18  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* doc/extend.texi (AMD GCN Function Attributes): New section.
	* doc/install.texi (amdgcn-unknown-amdhsa): New instructions.
	* doc/invoke.texi (AMD GCN Options): New section.
	* doc/md.texi (Constraints for Particular Machines): Add AMD GCN.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ebd5648..465de30 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2393,6 +2393,7 @@  GCC plugins may provide their own attributes.
 @menu
 * Common Function Attributes::
 * AArch64 Function Attributes::
+* AMD GCN Function Attributes::
 * ARC Function Attributes::
 * ARM Function Attributes::
 * AVR Function Attributes::
@@ -3954,6 +3955,96 @@  Note that CPU tuning options and attributes such as the @option{-mcpu=},
 @option{-mcpu=} option or the @code{cpu=} attribute conflicts with the
 architectural feature rules specified above.
 
+@node AMD GCN Function Attributes
+@subsection AMD GCN Function Attributes
+
+These function attributes are supported by the AMD GCN back end:
+
+@table @code
+@item amdgpu_hsa_kernel
+@cindex @code{amdgpu_hsa_kernel} function attribute, AMD GCN
+This attribute indicates that the corresponding function should be compiled as
+a kernel function, that is, an entry point that can be invoked from the host
+via the HSA runtime library.  By default, functions are callable only from
+other GCN functions.
+
+This attribute is implicitly applied to any function named @code{main}, using
+default parameters.
+
+Kernel functions may return an integer value, which will be written to a
+conventional place within the HSA "kernargs" region.
+
+The attribute parameters configure what values are passed into the kernel
+function by the GPU drivers, via the initial register state.  Some values are
+used by the compiler, and therefore forced on.  Enabling other options may
+break assumptions in the compiler and/or run-time libraries.
+
+@table @code
+@item private_segment_buffer
+Set @code{enable_sgpr_private_segment_buffer} flag.  Always on (required to
+locate the stack).
+
+@item dispatch_ptr
+Set @code{enable_sgpr_dispatch_ptr} flag.  Always on (required to locate the
+launch dimensions).
+
+@item queue_ptr
+Set @code{enable_sgpr_queue_ptr} flag.  Always on (required to convert address
+spaces).
+
+@item kernarg_segment_ptr
+Set @code{enable_sgpr_kernarg_segment_ptr} flag.  Always on (required to
+locate the kernel arguments, "kernargs").
+
+@item dispatch_id
+Set @code{enable_sgpr_dispatch_id} flag.
+
+@item flat_scratch_init
+Set @code{enable_sgpr_flat_scratch_init} flag.
+
+@item private_segment_size
+Set @code{enable_sgpr_private_segment_size} flag.
+
+@item grid_workgroup_count_X
+Set @code{enable_sgpr_grid_workgroup_count_x} flag.  Always on (required to
+use OpenACC/OpenMP).
+
+@item grid_workgroup_count_Y
+Set @code{enable_sgpr_grid_workgroup_count_y} flag.
+
+@item grid_workgroup_count_Z
+Set @code{enable_sgpr_grid_workgroup_count_z} flag.
+
+@item workgroup_id_X
+Set @code{enable_sgpr_workgroup_id_x} flag.
+
+@item workgroup_id_Y
+Set @code{enable_sgpr_workgroup_id_y} flag.
+
+@item workgroup_id_Z
+Set @code{enable_sgpr_workgroup_id_z} flag.
+
+@item workgroup_info
+Set @code{enable_sgpr_workgroup_info} flag.
+
+@item private_segment_wave_offset
+Set @code{enable_sgpr_private_segment_wave_byte_offset} flag.  Always on
+(required to locate the stack).
+
+@item work_item_id_X
+Set @code{enable_vgpr_workitem_id} parameter.  Always on (can't be disabled).
+
+@item work_item_id_Y
+Set @code{enable_vgpr_workitem_id} parameter.  Always on (required to enable
+vectorization).
+
+@item work_item_id_Z
+Set @code{enable_vgpr_workitem_id} parameter.  Always on (required to use
+OpenACC/OpenMP).
+
+@end table
+@end table
+
 @node ARC Function Attributes
 @subsection ARC Function Attributes
 
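For illustration only (not part of the patch): a minimal C sketch of how the
amdgpu_hsa_kernel attribute documented in the hunk above might be applied.
It assumes that the bare form of the attribute (no parameters) is accepted
and that the compiler predefines __AMDGCN__ when targeting GCN.  The
GCN_KERNEL macro and the function name answer_kernel are hypothetical.

```c
/* Hypothetical example; none of these names come from the patch.  */
#if defined(__AMDGCN__)
/* Assumption: the attribute may be used without parameters, which leaves
   the initial register state at its defaults.  */
# define GCN_KERNEL __attribute__((amdgpu_hsa_kernel))
#else
/* On any other target the marker expands to nothing, so the file still
   builds and the function is an ordinary one.  */
# define GCN_KERNEL
#endif

/* Intended as an entry point launched from the host via the HSA runtime.
   Per the documentation above, the integer return value is written to a
   conventional place in the kernargs region.  */
GCN_KERNEL int
answer_kernel (void)
{
  return 42;
}
```

Guarding the attribute behind a macro keeps the same source usable on the
host, where the function can be called directly for testing.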
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index d5e1edb..81a15a0 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3447,6 +3447,27 @@  This is a synonym for @samp{x86_64-*-solaris2.1[0-9]*}.
 @html
 <hr />
 @end html
+@anchor{amdgcn-unknown-amdhsa}
+@heading amdgcn-unknown-amdhsa
+AMD GCN GPU target.
+
+Instead of GNU Binutils, you will need to install LLVM 6, or later, and copy
+@file{bin/llvm-mc} to @file{amdgcn-unknown-amdhsa/bin/as},
+@file{bin/lld} to @file{amdgcn-unknown-amdhsa/bin/ld},
+@file{bin/llvm-nm} to @file{amdgcn-unknown-amdhsa/bin/nm}, and
+@file{bin/llvm-ar} to both @file{bin/amdgcn-unknown-amdhsa-ar} and
+@file{bin/amdgcn-unknown-amdhsa-ranlib}.
+
+Use Newlib (2019-01-16, or newer).
+
+To run the binaries, install the HSA Runtime from the
+@uref{https://rocm.github.io,,ROCm Platform}, and use
+@file{libexec/gcc/amdgcn-unknown-amdhsa/@var{version}/gcn-run} to launch them
+on the GPU.
+
+@html
+<hr />
+@end html
 @anchor{arc-x-elf32}
 @heading arc-*-elf32
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1151708..ff8cd10 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -643,6 +643,9 @@  Objective-C and Objective-C++ Dialects}.
 -mfp-mode=@var{mode}  -mvect-double  -max-vect-align=@var{num} @gol
 -msplit-vecmove-early  -m1reg-@var{reg}}
 
+@emph{AMD GCN Options}
+@gccoptlist{-march=@var{gpu} -mtune=@var{gpu} -mstack-size=@var{bytes}}
+
 @emph{ARC Options}
 @gccoptlist{-mbarrel-shifter  -mjli-always @gol
 -mcpu=@var{cpu}  -mA6  -mARC600  -mA7  -mARC700 @gol
@@ -15479,6 +15482,7 @@  platform.
 @menu
 * AArch64 Options::
 * Adapteva Epiphany Options::
+* AMD GCN Options::
 * ARC Options::
 * ARM Options::
 * AVR Options::
@@ -16083,6 +16087,41 @@  purpose.  The default is @option{-m1reg-none}.
 
 @end table
 
+@node AMD GCN Options
+@subsection AMD GCN Options
+@cindex AMD GCN Options
+
+These options are defined specifically for the AMD GCN port.
+
+@table @gcctabopt
+
+@item -march=@var{gpu}
+@opindex march
+@itemx -mtune=@var{gpu}
+@opindex mtune
+Set architecture type or tuning for @var{gpu}.  Supported values for @var{gpu}
+are:
+
+@table @samp
+@opindex fiji
+@item fiji
+Compile for GCN3 Fiji devices (gfx803).
+
+@item gfx900
+Compile for GCN5 Vega 10 devices (gfx900).
+
+@end table
+
+@item -mstack-size=@var{bytes}
+@opindex mstack-size
+Specify how many @var{bytes} of stack space will be requested for each GPU
+thread (wave-front).  Beware that there may be many threads and limited memory
+available.  The size of the stack allocation may also have an impact on
+run-time performance.  The default is 32KB when using OpenACC or OpenMP, and
+1MB otherwise.
+
+@end table
+
 @node ARC Options
 @subsection ARC Options
 @cindex ARC options
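To make the -mstack-size caveat in the hunk above concrete, here is a small,
hedged C sketch of the arithmetic: the request is made per wave-front, so the
total grows linearly with the number of wave-fronts in flight.  The helper
name, the macro names, the wave-front count used below, and the reading of
KB/MB as powers of two are all assumptions for illustration, not values taken
from the back end.

```c
#include <stdint.h>

/* Documented defaults, read here as binary units (an assumption).  */
#define GCN_STACK_DEFAULT_OFFLOAD (32u * 1024u)    /* OpenACC or OpenMP */
#define GCN_STACK_DEFAULT_OTHER   (1024u * 1024u)  /* otherwise */

/* Hypothetical helper: total stack memory requested when WAVES wave-fronts
   each ask for BYTES_PER_WAVE bytes, as -mstack-size=BYTES_PER_WAVE would.  */
uint64_t
gcn_total_stack_bytes (uint64_t waves, uint64_t bytes_per_wave)
{
  return waves * bytes_per_wave;
}
```

With 64 wave-fronts (an arbitrary figure), the 32KB default asks for 2MB in
total, while the 1MB default asks for 64MB.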
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 18b8af0..6ffb69b 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1800,6 +1800,100 @@  DF modes
 @end table
 
 
+@item AMD GCN ---@file{config/gcn/constraints.md}
+@table @code
+@item I
+Immediate integer in the range @minus{}16 to 64
+
+@item J
+Immediate 16-bit signed integer
+
+@item Kf
+Immediate constant @minus{}1
+
+@item L
+Immediate 15-bit unsigned integer
+
+@item A
+Immediate constant that can be inlined in an instruction encoding: integer
+@minus{}16..64, or float 0.0, +/@minus{}0.5, +/@minus{}1.0, +/@minus{}2.0,
++/@minus{}4.0, 1.0/(2.0*PI)
+
+@item B
+Immediate 32-bit signed integer that can be attached to an instruction encoding
+
+@item C
+Immediate 32-bit integer in range @minus{}16..4294967295 (i.e. 32-bit unsigned
+integer or @samp{A} constraint)
+
+@item DA
+Immediate 64-bit constant that can be split into two @samp{A} constants
+
+@item DB
+Immediate 64-bit constant that can be split into two @samp{B} constants
+
+@item U
+Any @code{unspec}
+
+@item Y
+Any @code{symbol_ref} or @code{label_ref}
+
+@item v
+VGPR register
+
+@item Sg
+SGPR register
+
+@item SD
+SGPR registers valid for instruction destinations, including VCC, M0 and EXEC
+
+@item SS
+SGPR registers valid for instruction sources, including VCC, M0, EXEC and SCC
+
+@item Sm
+SGPR registers valid as a source for scalar memory instructions (excludes M0
+and EXEC)
+
+@item Sv
+SGPR registers valid as a source or destination for vector instructions
+(excludes EXEC)
+
+@item ca
+All condition registers: SCC, VCCZ, EXECZ
+
+@item cs
+Scalar condition register: SCC
+
+@item cV
+Vector condition register: VCC, VCC_LO, VCC_HI
+
+@item e
+EXEC register (EXEC_LO and EXEC_HI)
+
+@item RB
+Memory operand with address space suitable for @code{buffer_*} instructions
+
+@item RF
+Memory operand with address space suitable for @code{flat_*} instructions
+
+@item RS
+Memory operand with address space suitable for @code{s_*} instructions
+
+@item RL
+Memory operand with address space suitable for @code{ds_*} LDS instructions
+
+@item RG
+Memory operand with address space suitable for @code{ds_*} GDS instructions
+
+@item RD
+Memory operand with address space suitable for any @code{ds_*} instructions
+
+@item RM
+Memory operand with address space suitable for @code{global_*} instructions
+
+@end table
+
+
 @item ARC ---@file{config/arc/constraints.md}
 @table @code
 @item q
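As a reading aid for the integer immediate constraints documented in the
md.texi hunk above, the ranges can be modelled as plain C predicates.  This
is a sketch under stated assumptions, not the predicates the back end
actually defines in config/gcn/constraints.md: the function names are
hypothetical, and the range endpoints are assumed to be inclusive.

```c
#include <stdbool.h>
#include <stdint.h>

/* I: immediate integer in the range -16 to 64 (endpoints assumed inclusive).  */
bool fits_I (int64_t v) { return v >= -16 && v <= 64; }

/* J: immediate 16-bit signed integer.  */
bool fits_J (int64_t v) { return v >= INT16_MIN && v <= INT16_MAX; }

/* Kf: the immediate constant -1.  */
bool fits_Kf (int64_t v) { return v == -1; }

/* L: immediate 15-bit unsigned integer, i.e. 0..32767.  */
bool fits_L (int64_t v) { return v >= 0 && v <= 0x7fff; }

/* C: immediate in the range -16..4294967295, that is, a 32-bit unsigned
   integer or a value that already satisfies the integer part of A.  */
bool fits_C (int64_t v) { return v >= -16 && v <= (int64_t) UINT32_MAX; }
```

Under this model, 64 satisfies I, J, L and C, while 65 fails only I.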