[3/3,Aarch64] Implement Aarch64 SIMD ABI

Message ID 1537456618.24844.16.camel@cavium.com
State New
Series
  • [1/3,Aarch64] Implement Aarch64 SIMD ABI

Commit Message

Steve Ellcey Sept. 20, 2018, 3:17 p.m.
This is the third of three patches for Aarch64 SIMD ABI support.  This
patch is not fully tested yet but I want to post it to get comments.

This is the only patch of the three that touches non-aarch64 specific
code.  The changes here are made to allow GCC to have better information
about what registers are clobbered by functions.  With the new SIMD
ABI on Aarch64 the registers clobbered by a SIMD function are a subset
of the registers clobbered by a normal (non-SIMD) function.  Without that
knowledge, the caller saves and restores more registers than is necessary.

This patch addresses that by passing information about the call insn to 
various routines so that they can check on what type of function is being
called and modify the clobbered register set based on that information.

As an example, this code:

  __attribute__ ((__simd__ ("notinbranch"))) extern double sin (double __x);
  __attribute__ ((__simd__ ("notinbranch"))) extern double log (double __x);
  __attribute__ ((__simd__ ("notinbranch"))) extern double exp (double __x);

  double foo(double * __restrict__ x, double * __restrict__ y,
             double * __restrict__ z, int n)
  {
	int i;
	double a = 0.0;
	for (i = 0; i < n; i++)
		a = a + sin(x[i]) + log(y[i]) + exp (z[i]);
	return a;
  }

Without this patch, GCC generates stores inside the main vectorized loop
to preserve registers; with the patch, it does no stores there and instead
uses registers it knows the vector sin/log/exp functions do not clobber.
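
To make the effect concrete, here is a toy model of the caller-save
decision.  It is a sketch, not GCC code: the type vreg_set, the enum
call_kind and the three helper functions are hypothetical names, and the
v8-v23 range is an assumption taken from the AArch64 vector PCS, under
which a callee preserves all 128 bits of those registers.  The model
treats every vector register as clobbered for 128-bit values by a normal
call, since the base PCS preserves only the low 64 bits of v8-v15.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: one bit per AArch64 vector register v0..v31.  */
typedef uint32_t vreg_set;

enum call_kind { CALL_NORMAL, CALL_SIMD };

/* Return the set containing registers FIRST..LAST inclusive.  */
static vreg_set
vreg_range (unsigned first, unsigned last)
{
  assert (first <= last && last < 32);
  vreg_set set = 0;
  for (unsigned r = first; r <= last; r++)
    set |= (vreg_set) 1 << r;
  return set;
}

/* Registers whose full 128-bit value may not survive a call of KIND.  */
static vreg_set
clobbered_by_call (enum call_kind kind)
{
  /* Normal call: no vector register keeps all 128 bits.  */
  vreg_set set = vreg_range (0, 31);

  /* SIMD call (assumed vector PCS): v8-v23 are preserved in full.  */
  if (kind == CALL_SIMD)
    set &= ~vreg_range (8, 23);
  return set;
}

/* Registers the caller must save around the call: those that are both
   live across it and clobbered by it.  */
static vreg_set
regs_to_save (vreg_set live_across_call, enum call_kind kind)
{
  return live_across_call & clobbered_by_call (kind);
}
```

With v8-v11 live across a call, regs_to_save returns all four registers
for CALL_NORMAL and none for CALL_SIMD, which is the difference between
storing inside the loop and not.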

Comments?

Steve Ellcey
sellcey@cavium.com


2018-09-20  Steve Ellcey  <sellcey@cavium.com>

	* caller-save.c (setup_save_areas): Modify get_call_reg_set_usage
	arguments.
	(save_call_clobbered_regs): Ditto.
	* config/aarch64/aarch64.c (aarch64_simd_function_def): New function.
	(aarch64_simd_call_p): Ditto.
	(aarch64_hard_regno_call_part_clobbered): Check for simd calls.
	(aarch64_check_part_clobbered): New function.
	(aarch64_used_reg_set): New function.
	(TARGET_CHECK_PART_CLOBBERED): New macro.
	(TARGET_USED_REG_SET): New macro.
	* cselib.c (cselib_process_insn): Modify
	targetm.hard_regno_call_part_clobbered arguments.
	* df-scan.c (df_get_call_refs): Modify get_call_reg_set_usage
	arguments.
	* doc/tm.texi.in (TARGET_CHECK_PART_CLOBBERED): New hook.
	(TARGET_USED_REG_SET): New hook.
	* final.c (collect_fn_hard_reg_usage): Modify get_call_reg_set_usage
	arguments.
	(get_call_reg_set_usage): Update description and argument list,
	modify code to return proper register set.
	* hooks.c (hook_bool_uint_mode_false): Rename to
	hook_bool_insn_uint_mode_false.
	* hooks.h (hook_bool_uint_mode_false): Ditto.
	* ira-conflicts.c (ira_build_conflicts): Modify
	targetm.hard_regno_call_part_clobbered arguments.
	* ira-costs.c (ira_tune_allocno_costs): Ditto.
	* ira-lives.c (process_bb_node_lives): Modify get_call_reg_set_usage
	arguments.
	* lra-constraints.c (need_for_call_save_p): Add new argument.
	Modify return and update arguments to
	targetm.hard_regno_call_part_clobbered.
	(need_for_split_p): Add insn argument. Pass argument to
	need_for_call_save_p.
	(split_reg): Pass insn argument to need_for_call_save_p.
	(split_if_necessary): Pass insn argument to need_for_split_p.
	(inherit_in_ebb): Pass curr_insn to need_for_split_p.
	* lra-int.h (struct lra_reg): Add check_part_clobbered field.
	* lra-lives.c (check_pseudos_live_through_calls): Add insn argument.
	Add check of flag_ipa_ra.
	(process_bb_lives): Pass curr_insn to check_pseudos_live_through_calls.
	Modify get_call_reg_set_usage, targetm.check_part_clobbered, and
	check_pseudos_live_through_calls arguments.
	* lra.c (initialize_lra_reg_info_element): Initialize
	check_part_clobbered to false.
	* postreload.c (reload_combine): Modify get_call_reg_set_usage
	arguments.
	* regcprop.c (copyprop_hardreg_forward_1): Modify
	get_call_reg_set_usage and targetm.hard_regno_call_part_clobbered
	arguments.
	* reginfo.c (choose_hard_reg_mode): Modify
	targetm.hard_regno_call_part_clobbered arguments.
	* regrename.c (check_new_reg_p): Ditto.
	* regs.h (get_call_reg_set_usage): Update argument list.
	* reload.c (find_equiv_reg): Modify
	targetm.hard_regno_call_part_clobbered argument list.
	* reload1.c (emit_reload_insns): Ditto.
	* resource.c (mark_set_resources): Modify get_call_reg_set_usage
	argument list.
	* sched-deps.c (deps_analyze_insn): Modify
	targetm.hard_regno_call_part_clobbered argument list.
	* sel-sched.c (init_regs_for_mode): Ditto.
	(mark_unavailable_hard_regs): Ditto.
	* target.def (hard_regno_call_part_clobbered): Update description
	and argument list.
	(check_part_clobbered): New hook.
	(used_reg_set): New hook.
	* targhooks.c (default_dwarf_frame_reg_mode): Update 
	targetm.hard_regno_call_part_clobbered argument list.
	(default_used_reg_set): New function.
	* targhooks.h (default_used_reg_set): New function declaration.
	* var-tracking.c (dataflow_set_clear_at_call): Modify
	get_call_reg_set_usage argument list.

Comments

Richard Sandiford Oct. 8, 2018, 12:53 p.m. | #1
Steve Ellcey <sellcey@cavium.com> writes:
> This is the third of three patches for Aarch64 SIMD ABI support.  This
> patch is not fully tested yet but I want to post it to get comments.
>
> This is the only patch of the three that touches non-aarch64 specific
> code.  The changes here are made to allow GCC to have better information
> about what registers are clobbered by functions.  With the new SIMD
> ABI on Aarch64 the registers clobbered by a SIMD function are a subset
> of the registers clobbered by a normal (non-SIMD) function.  Without that
> knowledge, the caller saves and restores more registers than is necessary.
>
> This patch addresses that by passing information about the call insn to
> various routines so that they can check on what type of function is being
> called and modify the clobbered register set based on that information.
>
> As an example, this code:
>
>   __attribute__ ((__simd__ ("notinbranch"))) extern double sin (double __x);
>   __attribute__ ((__simd__ ("notinbranch"))) extern double log (double __x);
>   __attribute__ ((__simd__ ("notinbranch"))) extern double exp (double __x);
>
>   double foo(double * __restrict__ x, double * __restrict__ y,
>              double * __restrict__ z, int n)
>   {
> 	int i;
> 	double a = 0.0;
> 	for (i = 0; i < n; i++)
> 		a = a + sin(x[i]) + log(y[i]) + exp (z[i]);
> 	return a;
>   }
>
> Without this patch, GCC generates stores inside the main vectorized loop
> to preserve registers; with the patch, it does no stores there and instead
> uses registers it knows the vector sin/log/exp functions do not clobber.
>
> Comments?

I think it'd be better to keep the get_call_reg_set_usage interface
the same, since:

  get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);

is easier to read than:

  get_call_reg_set_usage (insn, &used_regs, true);

I don't think it would be possible as things stand for an ABI variant
to *add* call-clobbered registers here.  Instead it would need to
clobber the registers explicitly in CALL_INSN_FUNCTION_USAGE.
So I think in practice the new hook can only remove call-clobbered
registers.  It might then be better to have it filter out registers
from the set that it's given.  I.e. something like:

  targetm.remove_extra_call_preserved_regs (insn, &used_regs);

which get_call_reg_set_usage could call immediately before
returning.
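
As a rough, self-contained sketch of that shape (not GCC code): every
toy_ name below is hypothetical, a 64-bit mask stands in for
HARD_REG_SET, the register numbering behind TOY_V0_REGNUM is made up,
the v8-v23 range is an assumption about what a SIMD callee preserves,
and the IPA-RA path of get_call_reg_set_usage is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for HARD_REG_SET and rtx_insn; not GCC's real types.  */
typedef uint64_t toy_hard_reg_set;	/* one bit per hard register */
struct toy_call_insn { bool callee_is_simd; };

/* Hypothetical numbering: vector registers v0..v31 occupy bits 32..63.  */
#define TOY_V0_REGNUM 32

/* Registers a SIMD callee is assumed to preserve in full (v8-v23).  */
static toy_hard_reg_set
toy_simd_preserved_regs (void)
{
  toy_hard_reg_set set = 0;
  for (unsigned v = 8; v <= 23; v++)
    set |= (toy_hard_reg_set) 1 << (TOY_V0_REGNUM + v);
  return set;
}

/* Filter-style hook: it can only remove registers from *USED_REGS.  */
static void
toy_remove_extra_call_preserved_regs (const struct toy_call_insn *insn,
				      toy_hard_reg_set *used_regs)
{
  if (insn != NULL && insn->callee_is_simd)
    *used_regs &= ~toy_simd_preserved_regs ();
}

/* The caller still names the default set explicitly; the hook runs
   immediately before returning.  */
static void
toy_get_call_reg_set_usage (const struct toy_call_insn *insn,
			    toy_hard_reg_set *reg_set,
			    toy_hard_reg_set default_set)
{
  assert (reg_set != NULL);
  *reg_set = default_set;
  toy_remove_extra_call_preserved_regs (insn, reg_set);
}
```

The point of the design is that call sites keep passing a named set such
as call_used_reg_set, and a target can only ever shrink the result.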

The change to targetm.hard_regno_call_part_clobbered looks good.

I guess targetm.check_part_clobbered is probably an expedient
fix for now.  It might be better to call it something like
honor_part_clobbered_p, to make it clearer that the hook doesn't
actually do the checking itself.
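
To illustrate the distinction, a small sketch under stated assumptions
(the toy_ names and the struct are hypothetical, not GCC code): the hook
only answers whether the part-clobber rule must be honored for one call,
and a separate loop, standing in for the flag that lra-lives.c latches
per pseudo, does the actual bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a call insn.  */
struct toy_call { bool callee_is_simd; };

/* Policy only: must the part-clobber rule be honored for CALL?
   An unknown call (NULL) is treated conservatively.  */
static bool
toy_honor_part_clobbered_p (const struct toy_call *call)
{
  return call == NULL || !call->callee_is_simd;
}

/* Bookkeeping: a pseudo live across CALLS needs the part-clobber check
   if any one of those calls requires it.  With no calls the flag stays
   false, matching its initial value.  */
static bool
toy_check_part_clobbered (const struct toy_call *calls, size_t n_calls)
{
  assert (calls != NULL || n_calls == 0);
  for (size_t i = 0; i < n_calls; i++)
    if (toy_honor_part_clobbered_p (&calls[i]))
      return true;
  return false;
}
```

A pseudo that only crosses SIMD calls never sets the flag; a single
normal call among them is enough to set it.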

When you submit the final patch, could you split off each hook and
interface change?  That would make things easier to review.

Thanks,
Richard

Patch

diff --git a/gcc/caller-save.c b/gcc/caller-save.c
index a7edbad..922b02d 100644
--- a/gcc/caller-save.c
+++ b/gcc/caller-save.c
@@ -442,7 +442,7 @@  setup_save_areas (void)
       freq = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn));
       REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 			       &chain->live_throughout);
-      get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
+      get_call_reg_set_usage (insn, &used_regs, true);
 
       /* Record all registers set in this call insn.  These don't
 	 need to be saved.  N.B. the call insn might set a subreg
@@ -526,7 +526,7 @@  setup_save_areas (void)
 
 	  REG_SET_TO_HARD_REG_SET (hard_regs_to_save,
 				   &chain->live_throughout);
-	  get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
+	  get_call_reg_set_usage (insn, &used_regs, true);
 
 	  /* Record all registers set in this call insn.  These don't
 	     need to be saved.  N.B. the call insn might set a subreg
@@ -855,8 +855,7 @@  save_call_clobbered_regs (void)
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, call_fixed_reg_set);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, this_insn_sets);
 	      AND_COMPL_HARD_REG_SET (hard_regs_to_save, hard_regs_saved);
-	      get_call_reg_set_usage (insn, &call_def_reg_set,
-				      call_used_reg_set);
+	      get_call_reg_set_usage (insn, &call_def_reg_set, true);
 	      AND_HARD_REG_SET (hard_regs_to_save, call_def_reg_set);
 
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 8cc738c..b101c7b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1383,16 +1383,87 @@  aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode)
   return false;
 }
 
+/* Return true if this is a definition of a vectorized simdclone function.
+   We recognize this only by looking for a simd or aarch64_vector_pcs
+   attribute.  */
+
+static bool
+aarch64_simd_function_def (tree fndecl)
+{
+  if (lookup_attribute ("aarch64_vector_pcs", DECL_ATTRIBUTES (fndecl)) != NULL)
+    return true;
+  if (lookup_attribute ("simd", DECL_ATTRIBUTES (fndecl)) == NULL)
+    return false;
+  return (VECTOR_TYPE_P (TREE_TYPE (TREE_TYPE (fndecl))));
+}
+
+/* Return true if insn is a call to a simd function.  */
+
+static bool
+aarch64_simd_call_p (rtx_insn *insn)
+{
+  rtx symbol;
+  rtx call;
+  tree fndecl;
+
+  if (!insn)
+    return false;
+  call = get_call_rtx_from (insn);
+  if (!call)
+    return false;
+  symbol = XEXP (XEXP (call, 0), 0);
+  if (GET_CODE (symbol) != SYMBOL_REF)
+    return false;
+  fndecl = SYMBOL_REF_DECL (symbol);
+  if (!fndecl)
+    return false;
+
+  return aarch64_simd_decl_p (fndecl);
+}
+
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
    the lower 64 bits of a 128-bit register.  Tell the compiler the callee
    clobbers the top 64 bits when restoring the bottom 64 bits.  */
 
 static bool
-aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
+aarch64_hard_regno_call_part_clobbered (rtx_insn *insn, unsigned int regno, machine_mode mode)
 {
+  if (aarch64_simd_call_p (insn))
+    {
+      if (FP_SIMD_SAVED_REGNUM_P (regno))
+        return false;
+    }
   return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8);
 }
 
+static bool
+aarch64_check_part_clobbered (rtx_insn *insn)
+{
+  if (aarch64_simd_call_p (insn))
+    return false;
+  return true;
+}
+
+void
+aarch64_used_reg_set (rtx_insn *insn,
+		      HARD_REG_SET *return_set,
+		      bool default_to_used)
+{
+  int regno;
+
+  if (default_to_used)
+    COPY_HARD_REG_SET (*return_set, call_used_reg_set);
+  else
+    COPY_HARD_REG_SET (*return_set, regs_invalidated_by_call);
+
+  if (aarch64_simd_call_p (insn))
+    {
+      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
+	if (FP_SIMD_SAVED_REGNUM_P (regno))
+	  CLEAR_HARD_REG_BIT (*return_set, regno);
+    }
+}
+
 /* Implement REGMODE_NATURAL_SIZE.  */
 poly_uint64
 aarch64_regmode_natural_size (machine_mode mode)
@@ -17932,6 +18003,9 @@  aarch64_libgcc_floating_mode_supported_p
 #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
   aarch64_hard_regno_call_part_clobbered
 
+#undef TARGET_CHECK_PART_CLOBBERED
+#define TARGET_CHECK_PART_CLOBBERED aarch64_check_part_clobbered
+
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT aarch64_constant_alignment
 
@@ -17947,6 +18021,9 @@  aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_SPECULATION_SAFE_VALUE
 #define TARGET_SPECULATION_SAFE_VALUE aarch64_speculation_safe_value
 
+#undef TARGET_USED_REG_SET
+#define TARGET_USED_REG_SET aarch64_used_reg_set
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests
diff --git a/gcc/cselib.c b/gcc/cselib.c
index 6d3a407..7d6f28e 100644
--- a/gcc/cselib.c
+++ b/gcc/cselib.c
@@ -2769,7 +2769,7 @@  cselib_process_insn (rtx_insn *insn)
 	if (call_used_regs[i]
 	    || (REG_VALUES (i) && REG_VALUES (i)->elt
 		&& (targetm.hard_regno_call_part_clobbered
-		    (i, GET_MODE (REG_VALUES (i)->elt->val_rtx)))))
+		    (insn, i, GET_MODE (REG_VALUES (i)->elt->val_rtx)))))
 	  cselib_invalidate_regno (i, reg_raw_mode[i]);
 
       /* Since it is not clear how cselib is going to be used, be
diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 0b119f2..4d2751d 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3097,8 +3097,7 @@  df_get_call_refs (struct df_collection_rec *collection_rec,
   CLEAR_HARD_REG_SET (defs_generated);
   df_find_hard_reg_defs (PATTERN (insn_info->insn), &defs_generated);
   is_sibling_call = SIBLING_CALL_P (insn_info->insn);
-  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage,
-			  regs_invalidated_by_call);
+  get_call_reg_set_usage (insn_info->insn, &fn_reg_set_usage, false);
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     {
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index c509a9b..fb05da6 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1693,6 +1693,10 @@  of @code{CALL_USED_REGISTERS}.
 @cindex call-saved register
 @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 
+@hook TARGET_CHECK_PART_CLOBBERED
+
+@hook TARGET_USED_REG_SET
+
 @findex fixed_regs
 @findex call_used_regs
 @findex global_regs
diff --git a/gcc/final.c b/gcc/final.c
index 6943c07..343dd00 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -5002,8 +5002,7 @@  collect_fn_hard_reg_usage (void)
       if (CALL_P (insn)
 	  && !self_recursive_call_p (insn))
 	{
-	  if (!get_call_reg_set_usage (insn, &insn_used_regs,
-				       call_used_reg_set))
+	  if (!get_call_reg_set_usage (insn, &insn_used_regs, true))
 	    return;
 
 	  IOR_HARD_REG_SET (function_used_regs, insn_used_regs);
@@ -5074,24 +5073,29 @@  get_call_cgraph_rtl_info (rtx_insn *insn)
 }
 
 /* Find hard registers used by function call instruction INSN, and return them
-   in REG_SET.  Return DEFAULT_SET in REG_SET if not found.  */
+   in REG_SET.  If not found return call_used_set in REG_SET when
+   default_to_used is TRUE or regs_invalidated_by_call when it is false.  */
 
 bool
 get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set,
-			HARD_REG_SET default_set)
+			bool default_to_used)
 {
+  HARD_REG_SET default_set;
+
   if (flag_ipa_ra)
     {
       struct cgraph_rtl_info *node = get_call_cgraph_rtl_info (insn);
       if (node != NULL
 	  && node->function_used_regs_valid)
 	{
+	  targetm.used_reg_set (insn, &default_set, default_to_used);
 	  COPY_HARD_REG_SET (*reg_set, node->function_used_regs);
 	  AND_HARD_REG_SET (*reg_set, default_set);
 	  return true;
 	}
     }
 
+  targetm.used_reg_set (insn, &default_set, default_to_used);
   COPY_HARD_REG_SET (*reg_set, default_set);
   return false;
 }
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 780cc1e..c412d69 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -140,9 +140,10 @@  hook_bool_puint64_puint64_true (poly_uint64, poly_uint64)
   return true;
 }
 
-/* Generic hook that takes (unsigned int, machine_mode) and returns false.  */
+/* Generic hook that takes (rtx_insn *, unsigned int, machine_mode) and
+   returns false.  */
 bool
-hook_bool_uint_mode_false (unsigned int, machine_mode)
+hook_bool_insn_uint_mode_false (rtx_insn *, unsigned int, machine_mode)
 {
   return false;
 }
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 0ed5b95..3ca3db2 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -40,7 +40,8 @@  extern bool hook_bool_const_rtx_insn_const_rtx_insn_true (const rtx_insn *,
 extern bool hook_bool_mode_uhwi_false (machine_mode,
 				       unsigned HOST_WIDE_INT);
 extern bool hook_bool_puint64_puint64_true (poly_uint64, poly_uint64);
-extern bool hook_bool_uint_mode_false (unsigned int, machine_mode);
+extern bool hook_bool_insn_uint_mode_false (rtx_insn *, unsigned int,
+					    machine_mode);
 extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
 extern bool hook_bool_tree_false (tree);
 extern bool hook_bool_const_tree_false (const_tree);
diff --git a/gcc/ira-conflicts.c b/gcc/ira-conflicts.c
index eb85e77..8288485 100644
--- a/gcc/ira-conflicts.c
+++ b/gcc/ira-conflicts.c
@@ -808,7 +808,7 @@  ira_build_conflicts (void)
 		 regs must conflict with them.  */
 	      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 		if (!TEST_HARD_REG_BIT (call_used_reg_set, regno)
-		    && targetm.hard_regno_call_part_clobbered (regno,
+		    && targetm.hard_regno_call_part_clobbered (NULL, regno,
 							       obj_mode))
 		  {
 		    SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
index 6fa917a..6a2a0b4 100644
--- a/gcc/ira-costs.c
+++ b/gcc/ira-costs.c
@@ -2337,7 +2337,7 @@  ira_tune_allocno_costs (void)
 						   *crossed_calls_clobber_regs)
 		  && (ira_hard_reg_set_intersection_p (regno, mode,
 						       call_used_reg_set)
-		      || targetm.hard_regno_call_part_clobbered (regno,
+		      || targetm.hard_regno_call_part_clobbered (NULL, regno,
 								 mode)))
 		cost += (ALLOCNO_CALL_FREQ (a)
 			 * (ira_memory_move_cost[mode][rclass][0]
diff --git a/gcc/ira-lives.c b/gcc/ira-lives.c
index b38d4a5..39ea82a 100644
--- a/gcc/ira-lives.c
+++ b/gcc/ira-lives.c
@@ -1202,8 +1202,7 @@  process_bb_node_lives (ira_loop_tree_node_t loop_tree_node)
 		  int num = ALLOCNO_NUM (a);
 		  HARD_REG_SET this_call_used_reg_set;
 
-		  get_call_reg_set_usage (insn, &this_call_used_reg_set,
-					  call_used_reg_set);
+		  get_call_reg_set_usage (insn, &this_call_used_reg_set, true);
 
 		  /* Don't allocate allocnos that cross setjmps or any
 		     call, if this function receives a nonlocal
diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 8be4d46..21ef0d8 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -5344,18 +5344,32 @@  inherit_reload_reg (bool def_p, int original_regno,
 /* Return true if we need a caller save/restore for pseudo REGNO which
    was assigned to a hard register.  */
 static inline bool
-need_for_call_save_p (int regno)
+need_for_call_save_p (int regno, rtx_insn *insn ATTRIBUTE_UNUSED)
 {
+  machine_mode pmode = PSEUDO_REGNO_MODE (regno);
+  int new_regno = reg_renumber[regno];
+
   lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
-  return (usage_insns[regno].calls_num < calls_num
-	  && (overlaps_hard_reg_set_p
-	      ((flag_ipa_ra &&
-		! hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
-	       ? lra_reg_info[regno].actual_call_used_reg_set
-	       : call_used_reg_set,
-	       PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
-	      || (targetm.hard_regno_call_part_clobbered
-		  (reg_renumber[regno], PSEUDO_REGNO_MODE (regno)))));
+  if (usage_insns[regno].calls_num >= calls_num)
+    return false;
+
+  /* If we are doing interprocedural register allocation,
+     targetm.hard_regno_call_part_clobbered was used to set
+     actual_call_used_reg_set and should not be checked
+     here.  */
+
+  if (flag_ipa_ra
+      && !hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
+    return (overlaps_hard_reg_set_p
+             (lra_reg_info[regno].actual_call_used_reg_set,
+              pmode, new_regno)
+            || (lra_reg_info[regno].check_part_clobbered
+                && targetm.hard_regno_call_part_clobbered
+		     (NULL, new_regno, pmode)));
+  else
+    return (overlaps_hard_reg_set_p (call_used_reg_set, pmode, new_regno)
+            || targetm.hard_regno_call_part_clobbered
+		(NULL, new_regno, pmode));
 }
 
 /* Global registers occurring in the current EBB.  */
@@ -5374,7 +5388,8 @@  static bitmap_head ebb_global_regs;
    assignment pass because of too many generated moves which will be
    probably removed in the undo pass.  */
 static inline bool
-need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno)
+need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno,
+		  rtx_insn *insn)
 {
   int hard_regno = regno < FIRST_PSEUDO_REGISTER ? regno : reg_renumber[regno];
 
@@ -5416,7 +5431,8 @@  need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno)
 	       || (regno >= FIRST_PSEUDO_REGISTER
 		   && lra_reg_info[regno].nrefs > 3
 		   && bitmap_bit_p (&ebb_global_regs, regno))))
-	  || (regno >= FIRST_PSEUDO_REGISTER && need_for_call_save_p (regno)));
+	  || (regno >= FIRST_PSEUDO_REGISTER
+	      && need_for_call_save_p (regno, insn)));
 }
 
 /* Return class for the split pseudo created from original pseudo with
@@ -5536,7 +5552,7 @@  split_reg (bool before_p, int original_regno, rtx_insn *insn,
       nregs = hard_regno_nregs (hard_regno, mode);
       rclass = lra_get_allocno_class (original_regno);
       original_reg = regno_reg_rtx[original_regno];
-      call_save_p = need_for_call_save_p (original_regno);
+      call_save_p = need_for_call_save_p (original_regno, insn);
     }
   lra_assert (hard_regno >= 0);
   if (lra_dump_file != NULL)
@@ -5769,7 +5785,7 @@  split_if_necessary (int regno, machine_mode mode,
 	     && INSN_UID (next_usage_insns) < max_uid)
 	    || (GET_CODE (next_usage_insns) == INSN_LIST
 		&& (INSN_UID (XEXP (next_usage_insns, 0)) < max_uid)))
-	&& need_for_split_p (potential_reload_hard_regs, regno + i)
+	&& need_for_split_p (potential_reload_hard_regs, regno + i, insn)
 	&& split_reg (before_p, regno + i, insn, next_usage_insns, NULL))
     res = true;
   return res;
@@ -6539,7 +6555,8 @@  inherit_in_ebb (rtx_insn *head, rtx_insn *tail)
 		  && usage_insns[j].check == curr_usage_insns_check
 		  && (next_usage_insns = usage_insns[j].insns) != NULL_RTX)
 		{
-		  if (need_for_split_p (potential_reload_hard_regs, j))
+		  if (need_for_split_p (potential_reload_hard_regs, j,
+					curr_insn))
 		    {
 		      if (lra_dump_file != NULL && head_p)
 			{
diff --git a/gcc/lra-int.h b/gcc/lra-int.h
index 5267b53..e6aacd2 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -117,6 +117,8 @@  struct lra_reg
   /* This member is set up in lra-lives.c for subsequent
      assignments.  */
   lra_copy_t copies;
+  /* True if the pseudo is live across a call that may part-clobber regs.  */
+  bool check_part_clobbered;
 };
 
 /* References to the common info about each register.  */
diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index 565c68b..5d3eab9 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -568,7 +568,8 @@  lra_setup_reload_pseudo_preferenced_hard_reg (int regno,
    PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS.  */
 static inline void
 check_pseudos_live_through_calls (int regno,
-				  HARD_REG_SET last_call_used_reg_set)
+				  HARD_REG_SET last_call_used_reg_set,
+				  rtx_insn *insn ATTRIBUTE_UNUSED)
 {
   int hr;
 
@@ -578,11 +579,12 @@  check_pseudos_live_through_calls (int regno,
   IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs,
 		    last_call_used_reg_set);
 
-  for (hr = 0; hr < FIRST_PSEUDO_REGISTER; hr++)
-    if (targetm.hard_regno_call_part_clobbered (hr,
-						PSEUDO_REGNO_MODE (regno)))
-      add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
-			   PSEUDO_REGNO_MODE (regno), hr);
+  if (!flag_ipa_ra)
+    for (hr = 0; hr < FIRST_PSEUDO_REGISTER; hr++)
+      if (targetm.hard_regno_call_part_clobbered (NULL, hr,
+						  PSEUDO_REGNO_MODE (regno)))
+        add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
+			     PSEUDO_REGNO_MODE (regno), hr);
   lra_reg_info[regno].call_p = true;
   if (! sparseset_bit_p (pseudos_live_through_setjumps, regno))
     return;
@@ -820,7 +822,8 @@  process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p)
 		|= mark_regno_live (reg->regno, reg->biggest_mode,
 				    curr_point);
 	      check_pseudos_live_through_calls (reg->regno,
-						last_call_used_reg_set);
+						last_call_used_reg_set,
+						curr_insn);
 	    }
 
 	  if (reg->regno >= FIRST_PSEUDO_REGISTER)
@@ -872,8 +875,7 @@  process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p)
 	  else
 	    {
 	      HARD_REG_SET this_call_used_reg_set;
-	      get_call_reg_set_usage (curr_insn, &this_call_used_reg_set,
-				      call_used_reg_set);
+	      get_call_reg_set_usage (curr_insn, &this_call_used_reg_set, true);
 
 	      bool flush = (! hard_reg_set_empty_p (last_call_used_reg_set)
 			    && ! hard_reg_set_equal_p (last_call_used_reg_set,
@@ -883,9 +885,13 @@  process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p)
 		{
 		  IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set,
 				    this_call_used_reg_set);
+
+		  if (targetm.check_part_clobbered (curr_insn))
+		    lra_reg_info[j].check_part_clobbered = true;
+
 		  if (flush)
 		    check_pseudos_live_through_calls
-		      (j, last_call_used_reg_set);
+		      (j, last_call_used_reg_set, curr_insn);
 		}
 	      COPY_HARD_REG_SET(last_call_used_reg_set, this_call_used_reg_set);
 	    }
@@ -915,7 +921,8 @@  process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p)
 	      |= mark_regno_live (reg->regno, reg->biggest_mode,
 				  curr_point);
 	    check_pseudos_live_through_calls (reg->regno,
-					      last_call_used_reg_set);
+					      last_call_used_reg_set,
+					      curr_insn);
 	  }
 
       for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
@@ -1071,7 +1078,7 @@  process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p)
       if (sparseset_cardinality (pseudos_live_through_calls) == 0)
 	break;
       if (sparseset_bit_p (pseudos_live_through_calls, j))
-	check_pseudos_live_through_calls (j, last_call_used_reg_set);
+	check_pseudos_live_through_calls (j, last_call_used_reg_set, NULL);
     }
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; ++i)
diff --git a/gcc/lra.c b/gcc/lra.c
index aa768fb..17cbf07 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -1344,6 +1344,7 @@  initialize_lra_reg_info_element (int i)
   lra_reg_info[i].val = get_new_reg_value ();
   lra_reg_info[i].offset = 0;
   lra_reg_info[i].copies = NULL;
+  lra_reg_info[i].check_part_clobbered = false;
 }
 
 /* Initialize common reg info and copies.  */
diff --git a/gcc/postreload.c b/gcc/postreload.c
index 56cb14d..bca4e59 100644
--- a/gcc/postreload.c
+++ b/gcc/postreload.c
@@ -1332,7 +1332,7 @@  reload_combine (void)
 	  rtx link;
 	  HARD_REG_SET used_regs;
 
-	  get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
+	  get_call_reg_set_usage (insn, &used_regs, true);
 
 	  for (r = 0; r < FIRST_PSEUDO_REGISTER; r++)
 	    if (TEST_HARD_REG_BIT (used_regs, r))
diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 1f80576..cdd5b90 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -1048,13 +1048,11 @@  copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
 		}
 	    }
 
-	  get_call_reg_set_usage (insn,
-				  &regs_invalidated_by_this_call,
-				  regs_invalidated_by_call);
+	  get_call_reg_set_usage (insn, &regs_invalidated_by_this_call, false);
 	  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 	    if ((TEST_HARD_REG_BIT (regs_invalidated_by_this_call, regno)
 		 || (targetm.hard_regno_call_part_clobbered
-		     (regno, vd->e[regno].mode)))
+		     (insn, regno, vd->e[regno].mode)))
 		&& (regno < set_regno || regno >= set_regno + set_nregs))
 	      kill_value_regno (regno, 1, vd);
 
diff --git a/gcc/reginfo.c b/gcc/reginfo.c
index 33befa5..df789b5 100644
--- a/gcc/reginfo.c
+++ b/gcc/reginfo.c
@@ -639,7 +639,7 @@  choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED,
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
 	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (regno, mode))
+	    || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -647,7 +647,7 @@  choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED,
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
 	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (regno, mode))
+	    || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -655,7 +655,7 @@  choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED,
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
 	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (regno, mode))
+	    || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -663,7 +663,7 @@  choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED,
     if (hard_regno_nregs (regno, mode) == nregs
 	&& targetm.hard_regno_mode_ok (regno, mode)
 	&& (!call_saved
-	    || !targetm.hard_regno_call_part_clobbered (regno, mode))
+	    || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
 	&& maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (found_mode)))
       found_mode = mode;
 
@@ -677,7 +677,7 @@  choose_hard_reg_mode (unsigned int regno ATTRIBUTE_UNUSED,
       if (hard_regno_nregs (regno, mode) == nregs
 	  && targetm.hard_regno_mode_ok (regno, mode)
 	  && (!call_saved
-	      || !targetm.hard_regno_call_part_clobbered (regno, mode)))
+	      || !targetm.hard_regno_call_part_clobbered (NULL, regno, mode)))
 	return mode;
     }
 
diff --git a/gcc/regrename.c b/gcc/regrename.c
index 8424093..5bee9b7 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -339,9 +339,9 @@  check_new_reg_p (int reg ATTRIBUTE_UNUSED, int new_reg,
 	 && ! DEBUG_INSN_P (tmp->insn))
 	|| (this_head->need_caller_save_reg
 	    && ! (targetm.hard_regno_call_part_clobbered
-		  (reg, GET_MODE (*tmp->loc)))
+		  (NULL, reg, GET_MODE (*tmp->loc)))
 	    && (targetm.hard_regno_call_part_clobbered
-		(new_reg, GET_MODE (*tmp->loc)))))
+		(NULL, new_reg, GET_MODE (*tmp->loc)))))
       return false;
 
   return true;
diff --git a/gcc/regs.h b/gcc/regs.h
index f143cbd..35cf969 100644
--- a/gcc/regs.h
+++ b/gcc/regs.h
@@ -385,6 +385,6 @@  range_in_hard_reg_set_p (const HARD_REG_SET set, unsigned regno, int nregs)
 
 /* Get registers used by given function call instruction.  */
 extern bool get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set,
-				    HARD_REG_SET default_set);
+				    bool default_to_used);
 
 #endif /* GCC_REGS_H */
diff --git a/gcc/reload.c b/gcc/reload.c
index 88299a8..b26340c 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -6912,13 +6912,13 @@  find_equiv_reg (rtx goal, rtx_insn *insn, enum reg_class rclass, int other,
 	  if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER)
 	    for (i = 0; i < nregs; ++i)
 	      if (call_used_regs[regno + i]
-		  || targetm.hard_regno_call_part_clobbered (regno + i, mode))
+		  || targetm.hard_regno_call_part_clobbered (p, regno + i, mode))
 		return 0;
 
 	  if (valueno >= 0 && valueno < FIRST_PSEUDO_REGISTER)
 	    for (i = 0; i < valuenregs; ++i)
 	      if (call_used_regs[valueno + i]
-		  || targetm.hard_regno_call_part_clobbered (valueno + i,
+		  || targetm.hard_regno_call_part_clobbered (p, valueno + i,
 							     mode))
 		return 0;
 	}
diff --git a/gcc/reload1.c b/gcc/reload1.c
index 3c0c9ff..f65e930 100644
--- a/gcc/reload1.c
+++ b/gcc/reload1.c
@@ -8289,7 +8289,8 @@  emit_reload_insns (struct insn_chain *chain)
 			   : out_regno + k);
 		      reg_reloaded_insn[regno + k] = insn;
 		      SET_HARD_REG_BIT (reg_reloaded_valid, regno + k);
-		      if (targetm.hard_regno_call_part_clobbered (regno + k,
+		      if (targetm.hard_regno_call_part_clobbered (insn,
+								  regno + k,
 								  mode))
 			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
 					  regno + k);
@@ -8369,7 +8370,8 @@  emit_reload_insns (struct insn_chain *chain)
 			   : in_regno + k);
 		      reg_reloaded_insn[regno + k] = insn;
 		      SET_HARD_REG_BIT (reg_reloaded_valid, regno + k);
-		      if (targetm.hard_regno_call_part_clobbered (regno + k,
+		      if (targetm.hard_regno_call_part_clobbered (insn,
+								  regno + k,
 								  mode))
 			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
 					  regno + k);
@@ -8485,7 +8487,7 @@  emit_reload_insns (struct insn_chain *chain)
 		      CLEAR_HARD_REG_BIT (reg_reloaded_dead, src_regno + k);
 		      SET_HARD_REG_BIT (reg_reloaded_valid, src_regno + k);
 		      if (targetm.hard_regno_call_part_clobbered
-			  (src_regno + k, mode))
+			  (insn, src_regno + k, mode))
 			SET_HARD_REG_BIT (reg_reloaded_call_part_clobbered,
 					  src_regno + k);
 		      else
diff --git a/gcc/resource.c b/gcc/resource.c
index fdfab69..aea497f 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -669,7 +669,7 @@  mark_set_resources (rtx x, struct resources *res, int in_dest,
 
 	  res->cc = res->memory = 1;
 
-	  get_call_reg_set_usage (call_insn, &regs, regs_invalidated_by_call);
+	  get_call_reg_set_usage (call_insn, &regs, false);
 	  IOR_HARD_REG_SET (res->regs, regs);
 
 	  for (link = CALL_INSN_FUNCTION_USAGE (call_insn);
@@ -1040,7 +1040,7 @@  mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
 		  HARD_REG_SET regs_invalidated_by_this_call;
 		  get_call_reg_set_usage (real_insn,
 					  &regs_invalidated_by_this_call,
-					  regs_invalidated_by_call);
+					  false);
 		  /* CALL clobbers all call-used regs that aren't fixed except
 		     sp, ap, and fp.  Do this before setting the result of the
 		     call live.  */
diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index f89f282..fa4bdfe 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -3728,7 +3728,7 @@  deps_analyze_insn (struct deps_desc *deps, rtx_insn *insn)
              Since we only have a choice between 'might be clobbered'
              and 'definitely not clobbered', we must include all
              partly call-clobbered registers here.  */
-	    else if (targetm.hard_regno_call_part_clobbered (i,
+	    else if (targetm.hard_regno_call_part_clobbered (insn, i,
 							     reg_raw_mode[i])
                      || TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
               SET_REGNO_REG_SET (reg_pending_clobbers, i);
diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 824f1ec..7b442ea 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -1103,7 +1103,7 @@  init_regs_for_mode (machine_mode mode)
       if (i >= 0)
         continue;
 
-      if (targetm.hard_regno_call_part_clobbered (cur_reg, mode))
+      if (targetm.hard_regno_call_part_clobbered (NULL, cur_reg, mode))
         SET_HARD_REG_BIT (sel_hrd.regs_for_call_clobbered[mode],
                           cur_reg);
 
@@ -1252,7 +1252,7 @@  mark_unavailable_hard_regs (def_t def, struct reg_rename *reg_rename_p,
 
   /* Exclude registers that are partially call clobbered.  */
   if (def->crosses_call
-      && !targetm.hard_regno_call_part_clobbered (regno, mode))
+      && !targetm.hard_regno_call_part_clobbered (NULL, regno, mode))
     AND_COMPL_HARD_REG_SET (reg_rename_p->available_for_renaming,
                             sel_hrd.regs_for_call_clobbered[mode]);
 
diff --git a/gcc/target.def b/gcc/target.def
index 9e22423..e82fc30 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5735,12 +5735,32 @@  DEFHOOK
 partly call-clobbered, and if a value of mode @var{mode} would be partly\n\
 clobbered by a call.  For example, if the low 32 bits of @var{regno} are\n\
 preserved across a call but higher bits are clobbered, this hook should\n\
-return true for a 64-bit mode but false for a 32-bit mode.\n\
+return true for a 64-bit mode but false for a 32-bit mode.  If @var{insn}\n\
+is not NULL then it is the call instruction being made.  This allows the\n\
+hook to return different values based on a function attribute or other\n\
+function-specific information.\n\
 \n\
 The default implementation returns false, which is correct\n\
 for targets that don't have partly call-clobbered registers.",
- bool, (unsigned int regno, machine_mode mode),
- hook_bool_uint_mode_false)
+ bool, (rtx_insn *insn, unsigned int regno, machine_mode mode),
+ hook_bool_insn_uint_mode_false)
+
+DEFHOOK
+(
+ check_part_clobbered,
+ "This hook should return true if the call instruction @var{insn} should\n\
+obey the hard_regno_call_part_clobbered hook and false if it should not.",
+ bool, (rtx_insn *insn),
+ hook_bool_rtx_insn_true)
+
+DEFHOOK
+(used_reg_set,
+ "This hook should set @var{return_set} to call_used_reg_set if\n\
+@var{default_to_used} is true and to regs_invalidated_by_call if it is\n\
+false.  The hook may look at @var{insn} to see if the default register set\n\
+should be modified due to attributes on the function being called.",
+ void, (rtx_insn *insn, HARD_REG_SET *return_set, bool default_to_used),
+ default_used_reg_set)
 
 /* Return the smallest number of different values for which it is best to
    use a jump-table instead of a tree of conditional branches.  */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index afd56f3..8242b2a 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1928,7 +1928,7 @@  default_dwarf_frame_reg_mode (int regno)
 {
   machine_mode save_mode = reg_raw_mode[regno];
 
-  if (targetm.hard_regno_call_part_clobbered (regno, save_mode))
+  if (targetm.hard_regno_call_part_clobbered (NULL, regno, save_mode))
     save_mode = choose_hard_reg_mode (regno, 1, true);
   return save_mode;
 }
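@@ -2370,4 +2370,18 @@  default_speculation_safe_value (machine_mode mode ATTRIBUTE_UNUSED,
   return result;
 }
 
+/* Default implementation of the used_reg_set target hook.  Set *RETURN_SET
+   to call_used_reg_set if DEFAULT_TO_USED is true and to
+   regs_invalidated_by_call otherwise.  INSN is ignored.  */
+void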
+default_used_reg_set (rtx_insn *insn ATTRIBUTE_UNUSED,
+		      HARD_REG_SET *return_set,
+		      bool default_to_used)
+{
+  if (default_to_used)
+    COPY_HARD_REG_SET (*return_set, call_used_reg_set);
+  else
+    COPY_HARD_REG_SET (*return_set, regs_invalidated_by_call);
+}
+
 #include "gt-targhooks.h"
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index f92ca5c..3f17efd 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -285,4 +285,6 @@  extern bool default_have_speculation_safe_value (bool);
 extern bool speculation_safe_value_not_needed (bool);
 extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
 
+extern void default_used_reg_set (rtx_insn *, HARD_REG_SET *, bool);
+
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 5537fa6..81a052e 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -4901,8 +4901,7 @@  dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn)
   hard_reg_set_iterator hrsi;
   HARD_REG_SET invalidated_regs;
 
-  get_call_reg_set_usage (call_insn, &invalidated_regs,
-			  regs_invalidated_by_call);
+  get_call_reg_set_usage (call_insn, &invalidated_regs, false);
 
   EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi)
     var_regno_delete (set, r);
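
For anyone who wants a feel for the contract of the new used_reg_set hook
without building GCC, here is a minimal standalone sketch.  It is not part
of the patch.  The types are simplified stand-ins (a register set is a plain
bit mask, a call insn is a struct with one flag), the register masks belong
to an imaginary 8-register target, and example_target_used_reg_set is a
hypothetical override, not the aarch64 implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, NOT GCC's real types: a register set is a bit
   mask and a call insn only records whether the callee is a SIMD
   function.  */
typedef uint32_t reg_set_t;
struct call_insn { bool callee_is_simd; };

/* Made-up register sets for an imaginary 8-register target.  */
static const reg_set_t call_used_mask = 0xffu;           /* r0-r7 */
static const reg_set_t invalidated_by_call_mask = 0xfcu; /* r2-r7 */
static const reg_set_t simd_preserved_mask = 0xf0u;      /* r4-r7 */

/* Same shape as default_used_reg_set in the patch: choose one of the two
   global sets based on DEFAULT_TO_USED and ignore INSN.  */
static void
default_used_reg_set (const struct call_insn *insn, reg_set_t *return_set,
                      bool default_to_used)
{
  (void) insn;
  *return_set = default_to_used ? call_used_mask : invalidated_by_call_mask;
}

/* Hypothetical target override: start from the default set, then remove
   the registers that a SIMD callee is assumed to preserve.  A NULL INSN
   means no call is known, so the default set is returned unchanged.  */
static void
example_target_used_reg_set (const struct call_insn *insn,
                             reg_set_t *return_set, bool default_to_used)
{
  default_used_reg_set (insn, return_set, default_to_used);
  if (insn != NULL && insn->callee_is_simd)
    *return_set &= ~simd_preserved_mask;
}
```

With these made-up masks, a call to a SIMD function reports a smaller
clobber set than a normal call, which is what should let the caller skip
the saves and restores around the vector sin/log/exp calls in the example
from the commit message.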