[openacc] Add support for OpenACC routine nohost clause

Message ID 85bef1e7-8ecd-c7d8-7a4c-fdad81e8350e@codesourcery.com
State New
Headers show
Series
  • [openacc] Add support for OpenACC routine nohost clause
Related show

Commit Message

Cesar Philippidis Oct. 2, 2018, 2:11 p.m.
Attached is a patch that introduces support for the acc routine nohost
clause. Basically, if an acc routine function is marked as nohost, then
the compiler does not generate code for the host. It's kind of strange
to test for. Basically, we had to use acc_on_device at -O2 so that the
host references to the dead function get optimized away.

I believe that the nohost clause was added for acc routines to allow
offloaded acc code to call vendor libraries, such as cuBLAS, which are
only available for specific accelerators. I haven't seen it used much in
practice though.

Is this OK for trunk? I bootstrapped and regtested it for x86_64 Linux
with nvptx offloading.

Thanks
Cesar

Patch

[OpenACC] Add support for OpenACC routine nohost clause

(was OpenACC bind, nohost changes)

2018-XX-YY  Thomas Schwinge  <thomas@codesourcery.com>
	    Cesar Philippidis  <cesar@codesourcery.com>

	gcc/
	* tree-core.h (omp_clause_code): Add OMP_CLAUSE_NOHOST.
	* tree.c (omp_clause_num_ops, omp_clause_code_name, walk_tree_1):
	Update for these.
	* tree-pretty-print.c (dump_omp_clause): Handle	OMP_CLAUSE_NOHOST.
	* gimplify.c (gimplify_scan_omp_clauses)
	(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_NOHOST.
	* tree-nested.c (convert_nonlocal_omp_clauses)
	(convert_local_omp_clauses): Likewise.
	* omp-low.c (scan_sharing_clauses): Likewise.
	* omp-offload.c (maybe_discard_oacc_function): New function.
	(execute_oacc_device_lower) [!ACCEL_COMPILER]: Handle OpenACC
	nohost clauses.

	gcc/c-family/
	* c-attribs.c (c_common_attribute_table): Set min_len to -1 for
	"omp declare target".
	* c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_NOHST.

	gcc/c/
	* c-parser.c (c_parser_omp_clause_name): Handle "nohost".
	(c_parser_oacc_all_clauses): Handle PRAGMA_OACC_CLAUSE_NOHOST.
	(c_parser_oacc_routine, c_finish_oacc_routine): Update.
	* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_NOHOST.

	gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Handle "nohost".
	(cp_parser_oacc_all_clauses): Handle PRAGMA_OACC_CLAUSE_NOHOST,
	(cp_parser_oacc_routine, cp_finalize_oacc_routine): Update.
	* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_NOHOST.
	* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_NOHOST.

	gcc/fortran/
	* gfortran.h (gfc_omp_clauses): Add nohost members.
	* openmp.c (omp_mask2): Add OMP_CLAUSE_NOHOST.
	(gfc_match_omp_clauses): Handle OMP_CLAUSE_NOHOST.
	(gfc_match_oacc_routine): Set oacc_function_nohost when appropriate.
	* gfortran.h (symbol_attribute): Add oacc_function_nohost member.
	* trans-openmp.c (gfc_add_omp_offload_attributes): Use it to decide
	whether to generate an OMP_CLAUSE_NOHOST clause.
	(gfc_trans_omp_clauses_1): Unreachable code to generate an
	OMP_CLAUSE_NOHOST clause.

	gcc/testsuite/
	* c-c++-common/goacc/classify-routine.c: Adjust test.
	* c-c++-common/goacc/routine-1.c: Likewise.
	* c-c++-common/goacc/routine-2.c: Likewise.
	* c-c++-common/goacc/routine-nohost-1.c: New test.
	* g++.dg/goacc/routine-2.C: Adjust test.
	* gfortran.dg/goacc/pr72741.f90: New test.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/routine-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/routine-bind-nohost-1.c:
	Update test.
	* testsuite/libgomp.oacc-fortran/routine-8.f90: Likewise.

(cherry picked from gomp-4_0-branch r223007, r226192, r226259, r228915,
r228916, and r231423)
(cherry picked from gomp-4_0-branch r231973 and r231979)
(cherry picked from gomp-4_0-branch r238847)
---
 gcc/c-family/c-attribs.c                      |  2 +-
 gcc/c-family/c-pragma.h                       |  1 +
 gcc/c/c-parser.c                              | 12 +++++-
 gcc/c/c-typeck.c                              |  1 +
 gcc/cp/parser.c                               | 13 +++++--
 gcc/cp/pt.c                                   |  1 +
 gcc/cp/semantics.c                            |  1 +
 gcc/fortran/gfortran.h                        |  3 +-
 gcc/fortran/openmp.c                          | 29 +++++++-------
 gcc/fortran/trans-openmp.c                    | 15 +++++++-
 gcc/gimplify.c                                |  2 +
 gcc/lto/lto.c                                 |  1 +
 gcc/omp-low.c                                 |  2 +
 gcc/omp-offload.c                             | 38 ++++++++++++++++---
 .../c-c++-common/goacc/classify-routine.c     |  4 +-
 gcc/testsuite/c-c++-common/goacc/routine-1.c  |  8 ++++
 gcc/testsuite/c-c++-common/goacc/routine-2.c  |  8 ++--
 .../c-c++-common/goacc/routine-nohost-1.c     | 28 ++++++++++++++
 gcc/testsuite/g++.dg/goacc/routine-2.C        |  9 +----
 gcc/testsuite/gfortran.dg/goacc/pr72741.f90   | 30 +++++++++++++++
 gcc/tree-core.h                               |  3 ++
 gcc/tree-nested.c                             |  4 ++
 gcc/tree-pretty-print.c                       |  3 ++
 gcc/tree.c                                    |  3 ++
 .../libgomp.oacc-c-c++-common/routine-3.c     | 33 ++++++++++++++++
 .../routine-nohost-1.c                        | 18 +++++++++
 .../libgomp.oacc-fortran/routine-6.f90        | 28 ++++++++++++++
 27 files changed, 257 insertions(+), 43 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/routine-nohost-1.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/pr72741.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/routine-3.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/routine-6.f90

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 5454e09adbc..a8dc91ae93d 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -435,7 +435,7 @@  const struct attribute_spec c_common_attribute_table[] =
 			      handle_omp_declare_simd_attribute, NULL },
   { "simd",		      0, 1, true,  false, false, false,
 			      handle_simd_attribute, NULL },
-  { "omp declare target",     0, 0, true, false, false, false,
+  { "omp declare target",     0, -1, true, false, false, false,
 			      handle_omp_declare_target_attribute, NULL },
   { "omp declare target link", 0, 0, true, false, false, false,
 			      handle_omp_declare_target_attribute, NULL },
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index b322547b11a..7ef281ce8af 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -142,6 +142,7 @@  enum pragma_omp_clause {
   PRAGMA_OACC_CLAUSE_GANG,
   PRAGMA_OACC_CLAUSE_HOST,
   PRAGMA_OACC_CLAUSE_INDEPENDENT,
+  PRAGMA_OACC_CLAUSE_NOHOST,
   PRAGMA_OACC_CLAUSE_NUM_GANGS,
   PRAGMA_OACC_CLAUSE_NUM_WORKERS,
   PRAGMA_OACC_CLAUSE_PRESENT,
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 3517cb783d9..187a2dec999 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11457,6 +11457,8 @@  c_parser_omp_clause_name (c_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_NOTINBRANCH;
 	  else if (!strcmp ("nowait", p))
 	    result = PRAGMA_OMP_CLAUSE_NOWAIT;
+	  else if (!strcmp ("nohost", p))
+	    result = PRAGMA_OACC_CLAUSE_NOHOST;
 	  else if (!strcmp ("num_gangs", p))
 	    result = PRAGMA_OACC_CLAUSE_NUM_GANGS;
 	  else if (!strcmp ("num_tasks", p))
@@ -14127,6 +14129,11 @@  c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "link";
 	  break;
+	case PRAGMA_OACC_CLAUSE_NOHOST:
+	  clauses = c_parser_oacc_simple_clause (parser, here,
+						 OMP_CLAUSE_NOHOST, clauses);
+	  c_name = "nohost";
+	  break;
 	case PRAGMA_OACC_CLAUSE_NUM_GANGS:
 	  clauses = c_parser_oacc_single_int_clause (parser,
 						     OMP_CLAUSE_NUM_GANGS,
@@ -14949,7 +14956,8 @@  c_parser_oacc_kernels_parallel (location_t loc, c_parser *parser,
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_GANG)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WORKER)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_VECTOR)		\
-	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_SEQ) )
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_SEQ)			\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_NOHOST) )
 
 /* Parse an OpenACC routine directive.  For named directives, we apply
    immediately to the named function.  For unnamed ones we then parse
@@ -15110,7 +15118,7 @@  c_finish_oacc_routine (struct oacc_routine_data *data, tree fndecl,
   /* Add an "omp declare target" attribute.  */
   DECL_ATTRIBUTES (fndecl)
     = tree_cons (get_identifier ("omp declare target"),
-		 NULL_TREE, DECL_ATTRIBUTES (fndecl));
+		 data->clauses, DECL_ATTRIBUTES (fndecl));
 
   /* Remember that we've used this "#pragma acc routine".  */
   data->fndecl_seen = true;
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index a5a7da0084c..48eaf26a672 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -14043,6 +14043,7 @@  c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	case OMP_CLAUSE_GANG:
 	case OMP_CLAUSE_WORKER:
 	case OMP_CLAUSE_VECTOR:
+	case OMP_CLAUSE_NOHOST:
 	case OMP_CLAUSE_TILE:
 	case OMP_CLAUSE_IF_PRESENT:
 	case OMP_CLAUSE_FINALIZE:
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index fa7ee7798ae..d56105ca177 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -31460,6 +31460,8 @@  cp_parser_omp_clause_name (cp_parser *parser)
 	case 'n':
 	  if (!strcmp ("nogroup", p))
 	    result = PRAGMA_OMP_CLAUSE_NOGROUP;
+	  else if (!strcmp ("nohost", p))
+	    result = PRAGMA_OACC_CLAUSE_NOHOST;
 	  else if (!strcmp ("notinbranch", p))
 	    result = PRAGMA_OMP_CLAUSE_NOTINBRANCH;
 	  else if (!strcmp ("nowait", p))
@@ -33856,6 +33858,11 @@  cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "link";
 	  break;
+	case PRAGMA_OACC_CLAUSE_NOHOST:
+	  clauses = cp_parser_oacc_simple_clause (parser, OMP_CLAUSE_NOHOST,
+						  clauses, here);
+	  c_name = "nohost";
+	  break;
 	case PRAGMA_OACC_CLAUSE_NUM_GANGS:
 	  code = OMP_CLAUSE_NUM_GANGS;
 	  c_name = "num_gangs";
@@ -38055,8 +38062,8 @@  cp_parser_omp_taskloop (cp_parser *parser, cp_token *pragma_tok,
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_GANG)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WORKER)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_VECTOR)		\
-	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_SEQ))
-
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_SEQ)			\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_NOHOST) )
 
 /* Parse the OpenACC routine pragma.  This has an optional '( name )'
    component, which must resolve to a declared namespace-scope
@@ -38282,7 +38289,7 @@  cp_finalize_oacc_routine (cp_parser *parser, tree fndecl, bool is_defn)
       /* Add an "omp declare target" attribute.  */
       DECL_ATTRIBUTES (fndecl)
 	= tree_cons (get_identifier ("omp declare target"),
-		     NULL_TREE, DECL_ATTRIBUTES (fndecl));
+		     parser->oacc_routine->clauses, DECL_ATTRIBUTES (fndecl));
 
       /* Don't unset parser->oacc_routine here: we may still need it to
 	 diagnose wrong usage.  But, remember that we've used this "#pragma acc
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index b8b6545434b..7932dd2714b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -16148,6 +16148,7 @@  tsubst_omp_clauses (tree clauses, enum c_omp_region_type ort,
 	case OMP_CLAUSE_IF_PRESENT:
 	case OMP_CLAUSE_FINALIZE:
 	  break;
+	case OMP_CLAUSE_NOHOST:
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index bf3c63a09a1..c4442200b74 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7111,6 +7111,7 @@  finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_IF_PRESENT:
 	case OMP_CLAUSE_FINALIZE:
+	case OMP_CLAUSE_NOHOST:
 	  break;
 
 	case OMP_CLAUSE_TILE:
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 657b5bb5a65..781dc2a7d17 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -913,6 +913,7 @@  typedef struct
 
   /* This is an OpenACC acclerator function at level N - 1  */
   ENUM_BITFIELD (oacc_function) oacc_function:3;
+  unsigned oacc_function_nohost:1;
 
   /* Attributes set by compiler extensions (!GCC$ ATTRIBUTES).  */
   unsigned ext_attr:EXT_ATTR_NUM;
@@ -1354,7 +1355,7 @@  typedef struct gfc_omp_clauses
   gfc_expr_list *tile_list;
   unsigned async:1, gang:1, worker:1, vector:1, seq:1, independent:1;
   unsigned wait:1, par_auto:1, gang_static:1;
-  unsigned if_present:1, finalize:1;
+  unsigned if_present:1, finalize:1, nohost;
   locus loc;
 
 }
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index a046863445d..60ecaf54523 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -811,6 +811,7 @@  enum omp_mask2
   OMP_CLAUSE_TILE,
   OMP_CLAUSE_IF_PRESENT,
   OMP_CLAUSE_FINALIZE,
+  OMP_CLAUSE_NOHOST,
   /* This must come last.  */
   OMP_MASK2_LAST
 };
@@ -1439,6 +1440,13 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      c->nogroup = needs_space = true;
 	      continue;
 	    }
+	  if ((mask & OMP_CLAUSE_NOHOST)
+	      && !c->nohost
+	      && gfc_match ("nohost") == MATCH_YES)
+	    {
+	      c->nohost = true;
+	      continue;
+	    }
 	  if ((mask & OMP_CLAUSE_NOTINBRANCH)
 	      && !c->notinbranch
 	      && !c->inbranch
@@ -1971,7 +1979,8 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
   omp_mask (OMP_CLAUSE_ASYNC)
 #define OACC_ROUTINE_CLAUSES \
   (omp_mask (OMP_CLAUSE_GANG) | OMP_CLAUSE_WORKER | OMP_CLAUSE_VECTOR	      \
-   | OMP_CLAUSE_SEQ)
+   | OMP_CLAUSE_SEQ							      \
+   | OMP_CLAUSE_NOHOST)
 
 
 static match
@@ -2348,22 +2357,12 @@  gfc_match_oacc_routine (void)
 	  != MATCH_YES))
     return MATCH_ERROR;
 
-  /* Scan for invalid routine geometry.  */
   dims = gfc_oacc_routine_dims (c);
   if (dims == OACC_FUNCTION_NONE)
     {
-      gfc_error ("Multiple loop axes specified in !$ACC ROUTINE at %L",
-		 &old_loc);
-
-      /* Don't abort early, because it's important to let the user
-	 know of any potential duplicate routine directives.  */
-      seen_error = true;
-    }
-  else if (dims == OACC_FUNCTION_AUTO)
-    {
-      gfc_warning (0, "Expected one of %<gang%>, %<worker%>, %<vector%> or "
-		   "%<seq%> clauses in !$ACC ROUTINE at %L", &old_loc);
-      dims = OACC_FUNCTION_SEQ;
+      gfc_error ("Multiple loop axes specified for routine %C");
+      gfc_current_locus = old_loc;
+      return MATCH_ERROR;
     }
 
   if (sym != NULL)
@@ -2406,6 +2405,8 @@  gfc_match_oacc_routine (void)
 	goto cleanup;
 
       gfc_current_ns->proc_name->attr.oacc_function = dims;
+      gfc_current_ns->proc_name->attr.oacc_function_nohost
+	= c ? c->nohost : false;
 
       if (seen_error)
 	goto cleanup;
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 753272d84c2..0ee78d210d3 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1290,8 +1290,12 @@  gfc_add_omp_offload_attributes (symbol_attribute sym_attr, tree list)
     list = tree_cons (get_identifier ("omp declare target link"),
 		      NULL_TREE, list);
   else if (sym_attr.omp_declare_target)
-    list = tree_cons (get_identifier ("omp declare target"),
-		      NULL_TREE, list);
+    {
+      tree c = NULL_TREE;
+      if (sym_attr.oacc_function_nohost)
+	c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_NOHOST);
+      list = tree_cons (get_identifier ("omp declare target"), c, list);
+    }
 
   if (sym_attr.oacc_function != OACC_FUNCTION_NONE)
     {
@@ -3053,6 +3057,13 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 	  OMP_CLAUSE_GANG_STATIC_EXPR (c) = arg;
 	}
     }
+  if (clauses->nohost)
+    {
+      c = build_omp_clause (where.lb->location, OMP_CLAUSE_NOHOST);
+      omp_clauses = gfc_trans_add_clause (c, omp_clauses);
+      //TODO
+      gcc_unreachable();
+    }
 
   return nreverse (omp_clauses);
 }
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 509fc2f3f5b..32c3fac378c 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8552,6 +8552,7 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	  ctx->default_kind = OMP_CLAUSE_DEFAULT_KIND (c);
 	  break;
 
+	case OMP_CLAUSE_NOHOST:
 	default:
 	  gcc_unreachable ();
 	}
@@ -9310,6 +9311,7 @@  gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 	case OMP_CLAUSE_FINALIZE:
 	  break;
 
+	case OMP_CLAUSE_NOHOST:
 	default:
 	  gcc_unreachable ();
 	}
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -1334,6 +1334,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	    install_var_local (decl, ctx);
 	  break;
 
+	case OMP_CLAUSE_NOHOST:
 	case OMP_CLAUSE__CACHE_:
 	default:
 	  gcc_unreachable ();
@@ -1500,6 +1501,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE_FINALIZE:
 	  break;
 
+	case OMP_CLAUSE_NOHOST:
 	case OMP_CLAUSE__CACHE_:
 	default:
 	  gcc_unreachable ();
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 0abf0283c9e..c2ad801dc2a 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -1451,6 +1451,25 @@  default_goacc_reduction (gcall *call)
   gsi_replace_with_seq (&gsi, seq, true);
 }
 
+/* Determine whether DECL should be discarded in this offload
+   compilation.  */
+
+static bool
+maybe_discard_oacc_function (tree decl)
+{
+  tree attr = lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl));
+
+  if (!attr)
+    return false;
+
+  enum omp_clause_code kind = OMP_CLAUSE_NOHOST;
+
+  if (omp_find_clause (TREE_VALUE (attr), kind))
+    return true;
+
+  return false;
+}
+
 /* Main entry point for oacc transformations which run on the device
    compiler after LTO, so we know what the target device is at this
    point (including the host fallback).  */
@@ -1458,12 +1477,19 @@  default_goacc_reduction (gcall *call)
 static unsigned int
 execute_oacc_device_lower ()
 {
-  tree attrs = oacc_get_fn_attrib (current_function_decl);
-
-  if (!attrs)
+  tree attr = oacc_get_fn_attrib (current_function_decl);
+  if (!attr)
     /* Not an offloaded function.  */
     return 0;
 
+  if (maybe_discard_oacc_function (current_function_decl))
+    {
+      if (dump_file)
+	fprintf (dump_file, "Discarding function\n");
+      TREE_ASM_WRITTEN (current_function_decl) = 1;
+      return TODO_discard_function;
+    }
+
   /* Parse the default dim argument exactly once.  */
   if ((const void *)flag_openacc_dims != &flag_openacc_dims)
     {
@@ -1484,12 +1510,12 @@  execute_oacc_device_lower ()
   if (is_oacc_kernels && !is_oacc_kernels_parallelized)
     {
       oacc_set_fn_attrib (current_function_decl, NULL, NULL);
-      attrs = oacc_get_fn_attrib (current_function_decl);
+      attr = oacc_get_fn_attrib (current_function_decl);
     }
 
   /* Discover, partition and process the loops.  */
   oacc_loop *loops = oacc_loop_discovery ();
-  int fn_level = oacc_fn_attrib_level (attrs);
+  int fn_level = oacc_fn_attrib_level (attr);
 
   if (dump_file)
     {
@@ -1516,7 +1542,7 @@  execute_oacc_device_lower ()
     }
 
   int dims[GOMP_DIM_MAX];
-  oacc_validate_dims (current_function_decl, attrs, dims, fn_level, used_mask);
+  oacc_validate_dims (current_function_decl, attr, dims, fn_level, used_mask);
 
   if (dump_file)
     {
diff --git a/gcc/testsuite/c-c++-common/goacc/classify-routine.c b/gcc/testsuite/c-c++-common/goacc/classify-routine.c
index fd89fc1ec66..f54c3942bbf 100644
--- a/gcc/testsuite/c-c++-common/goacc/classify-routine.c
+++ b/gcc/testsuite/c-c++-common/goacc/classify-routine.c
@@ -21,10 +21,10 @@  void ROUTINE ()
 }
 
 /* Check the offloaded function's attributes.
-   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp declare target, oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "ompexp" } } */
+   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp declare target \\(worker\\), oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "ompexp" } } */
 
 /* Check the offloaded function's classification and compute dimensions (will
    always be 1 x 1 x 1 for non-offloading compilation).
    { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow" } }
    { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
-   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target, oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "oaccdevlow" } } */
+   { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target \\(worker\\), oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "oaccdevlow" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/routine-1.c b/gcc/testsuite/c-c++-common/goacc/routine-1.c
index a75692246b6..db1322e11ca 100644
--- a/gcc/testsuite/c-c++-common/goacc/routine-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/routine-1.c
@@ -1,3 +1,4 @@ 
+/* Test valid use of clauses with routine.  */
 
 #pragma acc routine gang
 void gang (void)
@@ -19,6 +20,11 @@  void seq (void)
 {
 }
 
+#pragma acc routine nohost
+void nohost (void)
+{
+}
+
 int main ()
 {
 #pragma acc kernels num_gangs (32) num_workers (32) vector_length (32)
@@ -27,6 +33,7 @@  int main ()
     worker ();
     vector ();
     seq ();
+    nohost ();
   }
 
 #pragma acc parallel num_gangs (32) num_workers (32) vector_length (32)
@@ -35,6 +42,7 @@  int main ()
     worker ();
     vector ();
     seq ();
+    nohost ();
   }
 
   return 0;
diff --git a/gcc/testsuite/c-c++-common/goacc/routine-2.c b/gcc/testsuite/c-c++-common/goacc/routine-2.c
index fc5eb11bb54..d1ea61e3310 100644
--- a/gcc/testsuite/c-c++-common/goacc/routine-2.c
+++ b/gcc/testsuite/c-c++-common/goacc/routine-2.c
@@ -1,19 +1,19 @@ 
-#pragma acc routine gang worker /* { dg-error "multiple loop axes" } */
+#pragma acc routine gang worker /* { dg-error "conflicting level" } */
 void gang (void)
 {
 }
 
-#pragma acc routine worker vector /* { dg-error "multiple loop axes" } */
+#pragma acc routine worker vector /* { dg-error "conflicting level" } */
 void worker (void)
 {
 }
 
-#pragma acc routine vector seq /* { dg-error "multiple loop axes" } */
+#pragma acc routine vector seq /* { dg-error "conflicting level" } */
 void vector (void)
 {
 }
 
-#pragma acc routine seq gang /* { dg-error "multiple loop axes" } */
+#pragma acc routine seq gang /* { dg-error "conflicting level" } */
 void seq (void)
 {
 }
diff --git a/gcc/testsuite/c-c++-common/goacc/routine-nohost-1.c b/gcc/testsuite/c-c++-common/goacc/routine-nohost-1.c
new file mode 100644
index 00000000000..9baa56cb206
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/routine-nohost-1.c
@@ -0,0 +1,28 @@ 
+/* Test the nohost clause for OpenACC routine directive.  Exercising different
+   variants for declaring routines.  */
+
+/* { dg-additional-options "-fdump-tree-oaccdevlow" } */
+
+#pragma acc routine nohost
+int THREE(void)
+{
+  return 3;
+}
+
+#pragma acc routine nohost
+extern void NOTHING(void);
+
+void NOTHING(void)
+{
+}
+
+extern float ADD(float, float);
+
+#pragma acc routine (ADD) nohost
+
+float ADD(float x, float y)
+{
+  return x + y;
+}
+
+/* { dg-final { scan-tree-dump-times "Discarding function" 3 "oaccdevlow" } } */
diff --git a/gcc/testsuite/g++.dg/goacc/routine-2.C b/gcc/testsuite/g++.dg/goacc/routine-2.C
index ea7c9bf7393..c82493321cb 100644
--- a/gcc/testsuite/g++.dg/goacc/routine-2.C
+++ b/gcc/testsuite/g++.dg/goacc/routine-2.C
@@ -2,15 +2,8 @@ 
 
 template <typename T>
 extern T one_d();
-#pragma acc routine (one_d) /* { dg-error "names a set of overloads" } */
+#pragma acc routine (one_d) nohost /* { dg-error "names a set of overloads" } */
 
-template <typename T>
-T
-one()
-{
-  return 1;
-}
-#pragma acc routine (one) /* { dg-error "names a set of overloads" } */
 
 int incr (int);
 float incr (float);
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr72741.f90 b/gcc/testsuite/gfortran.dg/goacc/pr72741.f90
new file mode 100644
index 00000000000..b295a4fcc59
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/pr72741.f90
@@ -0,0 +1,30 @@ 
+SUBROUTINE v_1
+  !$ACC ROUTINE VECTOR WORKER ! { dg-error "Multiple loop axes" }
+END SUBROUTINE v_1
+
+SUBROUTINE sub_1
+  IMPLICIT NONE
+  EXTERNAL :: g_1
+  !$ACC ROUTINE (g_1) GANG WORKER ! { dg-error "Multiple loop axes" }
+  !$ACC ROUTINE (ABORT) SEQ VECTOR ! { dg-error "Multiple loop axes" "" { xfail *-*-* } }
+! { dg-bogus "invalid function name abort" "" { xfail *-*-* } .-1 }
+
+  CALL v_1
+  CALL g_1
+  CALL ABORT
+END SUBROUTINE sub_1
+
+MODULE m_w_1
+  IMPLICIT NONE
+  EXTERNAL :: w_1
+  !$ACC ROUTINE (w_1) WORKER SEQ ! { dg-error "Multiple loop axes" }
+  !$ACC ROUTINE (ABORT) VECTOR GANG ! { dg-error "Multiple loop axes" "" { xfail *-*-* } }
+! { dg-bogus "invalid function name abort" "" { xfail *-*-* } .-1 }
+
+CONTAINS
+  SUBROUTINE sub_2
+    CALL v_1
+    CALL w_1
+    CALL ABORT
+  END SUBROUTINE sub_2
+END MODULE m_w_1
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index dee27f89dec..22fc9334a73 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -449,6 +449,9 @@  enum omp_clause_code {
   /* OpenACC clause: vector_length (integer-expression).  */
   OMP_CLAUSE_VECTOR_LENGTH,
 
+  /* OpenACC clause: nohost.  */
+  OMP_CLAUSE_NOHOST,
+
   /* OpenACC clause: tile ( size-expr-list ).  */
   OMP_CLAUSE_TILE,
 
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index 4c8eda94f14..eddb91998c7 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -1345,6 +1345,8 @@  convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_FINALIZE:
 	  break;
 
+	  /* OpenACC nohost clause is not yet handled here.  */
+	case OMP_CLAUSE_NOHOST:
 	  /* The following clause belongs to the OpenACC cache directive, which
 	     is discarded during gimplification.  */
 	case OMP_CLAUSE__CACHE_:
@@ -2036,6 +2038,8 @@  convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_FINALIZE:
 	  break;
 
+	  /* OpenACC nohost clauses is not yet handled here.  */
+	case OMP_CLAUSE_NOHOST:
 	  /* The following clause belongs to the OpenACC cache directive, which
 	     is discarded during gimplification.  */
 	case OMP_CLAUSE__CACHE_:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 2c089b11751..e6f32f12fd9 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1043,6 +1043,9 @@  dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
 			 spc, flags, false);
       pp_right_paren (pp);
       break;
+    case OMP_CLAUSE_NOHOST:
+      pp_string (pp, "nohost");
+      break;
 
     case OMP_CLAUSE__GRIDDIM_:
       pp_string (pp, "_griddim_(");
diff --git a/gcc/tree.c b/gcc/tree.c
index c3ac8f36d55..f953e08a0a8 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -341,6 +341,7 @@  unsigned const char omp_clause_num_ops[] =
   1, /* OMP_CLAUSE_NUM_GANGS  */
   1, /* OMP_CLAUSE_NUM_WORKERS  */
   1, /* OMP_CLAUSE_VECTOR_LENGTH  */
+  0, /* OMP_CLAUSE_NOHOST  */
   3, /* OMP_CLAUSE_TILE  */
   2, /* OMP_CLAUSE__GRIDDIM_  */
   0, /* OMP_CLAUSE_IF_PRESENT */
@@ -414,6 +415,7 @@  const char * const omp_clause_code_name[] =
   "num_gangs",
   "num_workers",
   "vector_length",
+  "nohost",
   "tile",
   "_griddim_",
   "if_present",
@@ -11672,6 +11674,7 @@  walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
 	case OMP_CLAUSE_DEFAULTMAP:
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
+	case OMP_CLAUSE_NOHOST:
 	case OMP_CLAUSE_TILE:
 	case OMP_CLAUSE__SIMT_:
 	case OMP_CLAUSE_IF_PRESENT:
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-3.c
new file mode 100644
index 00000000000..2cdd6bf459c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-3.c
@@ -0,0 +1,33 @@ 
+/* At -O0, we do get the expected "undefined reference to `foo'" link-time
+   error message (but the check needs to be done differently; compare to
+   routine-nohost-1.c), but for -O2 we don't; presumably because the function
+   gets inlined.
+   { dg-xfail-if "TODO" { *-*-* } { "-O0" } { "" } } */
+
+#include <stdlib.h>
+
+#pragma acc routine nohost
+int
+foo (int n)
+{
+  if (n == 0 || n == 1)
+    return 1;
+
+  return n * n;
+}
+
+int
+main()
+{
+  int a, n = 10;
+
+#pragma acc parallel copy (a, n)
+  {
+    a = foo (n);
+  }
+
+  if (a != n * n)
+    abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c
new file mode 100644
index 00000000000..365af9319bd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c
@@ -0,0 +1,18 @@ 
+/* { dg-do link } */
+
+extern int three (void);
+
+#pragma acc routine (three) nohost
+__attribute__((noinline))
+int three(void)
+{
+  return 3;
+}
+
+int main(void)
+{
+  return (three() == 3) ? 0 : 1;
+}
+
+/* Expecting link to fail; "undefined reference to `three'" (or similar).
+   { dg-excess-errors "" } */
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-6.f90
new file mode 100644
index 00000000000..1bae09c2a9a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-6.f90
@@ -0,0 +1,28 @@ 
+! { dg-do run }
+! { dg-xfail-if "TODO" { *-*-* } }
+
+program main
+  integer :: a, n
+  
+  n = 10
+
+  !$acc parallel copy (a, n)
+     a = foo (n)
+  !$acc end parallel 
+
+  if (a .ne. n * n) call abort
+
+contains
+
+function foo (n) result (rc)
+  !$acc routine nohost
+
+  integer, intent (in) :: n
+  integer :: rc
+
+  rc = n * n
+
+end function
+
+end program main
+
-- 
2.17.1