[nvptx,PR81352] Add exit insn after noreturn call for neutered threads in warp

Message ID 61047ed2-e5e6-1b04-08d9-62b547001df2@mentor.com
State New
Headers show
Series
  • [nvptx,PR81352] Add exit insn after noreturn call for neutered threads in warp
Related show

Commit Message

Tom de Vries Jan. 24, 2018, 11:24 a.m.
Hi,

atm the test-case contained in this patch hangs.

For the test-case we generate:
...
   @ %r79 bra $L18;
   {
     call _gfortran_abort;
     trap;
     exit;
   }
  $L18:
...

which results in SASS code (at GOMP_NVPTX_JIT=-O4):
...
         /*05d8*/               @P0 BRA `(.L_18);
         /*05e8*/                   JCAL `(_gfortran_abort);
         /*05f0*/                   BPT.TRAP 0x1;
         /*05f8*/                   EXIT;
.L_18:
...
There's no convergence point generated for the diverging branch, so we 
may end up executing random code after .L18 (a problem I long suspected 
could happen, but never observed until now).

The patch adds an exit on the other path, making sure that all threads 
in the warp reach exit, and indeed fixing the hang:
...
   @ %r79 bra $L18;
   {
     call _gfortran_abort;
     trap;
     exit;
   }
  $L18:
  exit;
...

Build and reg-tested on x86_64 with nvptx accelerator.

I'll commit this shortly for stage4. Strictly speaking, this is not an 8 
regression, but a wrong code bug. But I think that the code generation 
error seems fundamental enough, and the fix simple and localized enough, 
that it's stage4 permissible.

Thanks,
- Tom

Comments

Richard Biener Jan. 24, 2018, 11:53 a.m. | #1
On Wed, 24 Jan 2018, Tom de Vries wrote:

> Hi,

> 

> atm the test-case contained in this patch hangs.

> 

> For the test-case we generate:

> ...

>   @ %r79 bra $L18;

>   {

>     call _gfortran_abort;

>     trap;

>     exit;

>   }

>  $L18:

> ...

> 

> which results in SASS code (at GOMP_NVPTX_JIT=-O4):

> ...

>         /*05d8*/               @P0 BRA `(.L_18);

>         /*05e8*/                   JCAL `(_gfortran_abort);

>         /*05f0*/                   BPT.TRAP 0x1;

>         /*05f8*/                   EXIT;

> .L_18:

> ...

> There's no convergence point generated for the diverging branch, so we may end

> up executing random code after .L18 (a problem I long suspected could happen,

> but never observed until now).

> 

> The patch adds an exit on the other path, making sure that all threads in the

> warp reach exit, and indeed fixing the hang:

> ...

>   @ %r79 bra $L18;

>   {

>     call _gfortran_abort;

>     trap;

>     exit;

>   }

>  $L18:

>  exit;

> ...

> 

> Build and reg-tested on x86_64 with nvptx accelerator.

> 

> I'll commit this shortly for stage4. Strictly speaking, this is not an 8

> regression, but a wrong code bug. But I think that the code generation error

> seems fundamental enough, and the fix simple and localized enough, that it's

> stage4 permissible.


wrong-code bugs qualify for stage4 if a fix isn't too invasive.  Target
maintainers have an extra say to override stage4 rules anyway and for
non-primary/secondary targets nobody cares anyway.

Richard.

> Thanks,

> - Tom

> 


-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Tom de Vries Jan. 24, 2018, 1:42 p.m. | #2
On 01/24/2018 12:53 PM, Richard Biener wrote:
> On Wed, 24 Jan 2018, Tom de Vries wrote:

>> I'll commit this shortly for stage4. Strictly speaking, this is not an 8

>> regression, but a wrong code bug. But I think that the code generation error

>> seems fundamental enough, and the fix simple and localized enough, that it's

>> stage4 permissible.

> 

> wrong-code bugs qualify for stage4 if a fix isn't too invasive.  Target

> maintainers have an extra say to override stage4 rules anyway and for

> non-primary/secondary targets nobody cares anyway.


Maybe then we should be more clear then in formulation of stage 4 criteria?

Thanks,
- Tom

[ Change validated as XHTML 1.0 Transitional ]

Index: htdocs/develop.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/develop.html,v
retrieving revision 1.178
diff -u -r1.178 develop.html
--- htdocs/develop.html	15 Jan 2018 08:23:26 -0000	1.178
+++ htdocs/develop.html	24 Jan 2018 13:40:30 -0000
@@ -130,10 +130,10 @@
  <h4><a name="stage4">Stage 4</a></h4>

  <p>During this period, the only (non-documentation) changes that may
-be made are changes that fix regressions.  Other changes may not be
-done during this period.  Note that the same constraints apply
-to release branches.  This period lasts until stage 1 opens for
-the next release.</p>
+be made are changes that fix regressions, or that fix wrong-code bugs
+in a non-invasive way.  Other changes may not be done during this
+period.  Note that the same constraints apply to release branches.
+This period lasts until stage 1 opens for the next release.</p>

  <p><strong>Rationale</strong></p>
Gerald Pfeifer April 8, 2018, 2:31 p.m. | #3
On Wed, 24 Jan 2018, Tom de Vries wrote:
> On 01/24/2018 12:53 PM, Richard Biener wrote:

>> wrong-code bugs qualify for stage4 if a fix isn't too invasive.  Target

>> maintainers have an extra say to override stage4 rules anyway and for

>> non-primary/secondary targets nobody cares anyway.

> Maybe then we should be more clear then in formulation of stage 4 criteria?


Richi?  I did not see anyone approve or reject Tom's patch, and it
did not make it into the tree.

What is your take?  (It's fine as far as wwwdocs go, but this really
is a question on the RMs.)

Gerald

> [ Change validated as XHTML 1.0 Transitional ]

> 

> Index: htdocs/develop.html

> ===================================================================

> RCS file: /cvs/gcc/wwwdocs/htdocs/develop.html,v

> retrieving revision 1.178

> diff -u -r1.178 develop.html

> --- htdocs/develop.html	15 Jan 2018 08:23:26 -0000	1.178

> +++ htdocs/develop.html	24 Jan 2018 13:40:30 -0000

> @@ -130,10 +130,10 @@

>  <h4><a name="stage4">Stage 4</a></h4>

> 

>  <p>During this period, the only (non-documentation) changes that may

> -be made are changes that fix regressions.  Other changes may not be

> -done during this period.  Note that the same constraints apply

> -to release branches.  This period lasts until stage 1 opens for

> -the next release.</p>

> +be made are changes that fix regressions, or that fix wrong-code bugs

> +in a non-invasive way.  Other changes may not be done during this

> +period.  Note that the same constraints apply to release branches.

> +This period lasts until stage 1 opens for the next release.</p>

> 

>  <p><strong>Rationale</strong></p>

Patch

[nvptx, PR81352] Add exit insn after noreturn call for neutered threads in warp

2018-01-23  Tom de Vries  <tom@codesourcery.com>

	PR target/81352
	* config/nvptx/nvptx.c (nvptx_single): Add exit insn after noreturn call
	for neutered threads in warp.
	* config/nvptx/nvptx.md (define_insn "exit"): New insn.

	* testsuite/libgomp.oacc-fortran/pr81352.f90: New test.

---
 gcc/config/nvptx/nvptx.c                           |  7 ++++++-
 gcc/config/nvptx/nvptx.md                          |  5 +++++
 libgomp/testsuite/libgomp.oacc-fortran/pr81352.f90 | 20 ++++++++++++++++++++
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index f5bb438..3516740 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4062,7 +4062,12 @@  nvptx_single (unsigned mask, basic_block from, basic_block to)
 	if (tail_branch)
 	  before = emit_label_before (label, before);
 	else
-	  emit_label_after (label, tail);
+	  {
+	    rtx_insn *label_insn = emit_label_after (label, tail);
+	    if (mode == GOMP_DIM_VECTOR && CALL_P (tail)
+		&& find_reg_note (tail, REG_NORETURN, NULL))
+	      emit_insn_after (gen_exit (), label_insn);
+	  }
       }
 
   /* Now deal with propagating the branch condition.  */
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index f9c087b..135479b 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -994,6 +994,11 @@ 
   ""
   "")
 
+(define_insn "exit"
+  [(const_int 1)]
+  ""
+  "exit;")
+
 (define_insn "return"
   [(return)]
   ""
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr81352.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr81352.f90
new file mode 100644
index 0000000..f6969c8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr81352.f90
@@ -0,0 +1,20 @@ 
+! { dg-do run }
+
+program foo
+  integer :: a(3,3), l, ll
+  a = 0
+
+  !$acc parallel num_gangs (1) num_workers(1)
+
+  do l=1,3
+     !$acc loop vector
+     do ll=1,3
+        a(l,ll) = 2
+     enddo
+  enddo
+
+  if (any(a(1:3,1:3).ne.2)) call abort
+
+  !$acc end parallel
+
+end program foo