[RFC] mask out mult expr ctz bits from nonzero bits

Message ID oreei68s2v.fsf@lxoliva.fsfla.org
State New

Commit Message

Alexandre Oliva Jan. 27, 2021, 12:47 p.m.
While looking into the possibility of introducing setmemM patterns on
RISC-V to undo the transformation from loops of word writes into
memset, I was disappointed to find out that get_nonzero_bits would
take into account the range of the length passed to memset, but not
the trivially-available observation that this length was a multiple of
the word size.  This knowledge, if passed on to setmemM, could enable
setmemM to output more efficient code.

In the end, I did not introduce a setmemM pattern, nor the machinery
to pass the ctz of the length on to it along with other useful
information, but I figured this small improvement to nonzero_bits
could still improve code generation elsewhere.
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564341.html
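
The idea can be illustrated outside of GCC with a minimal standalone
sketch.  It is not the patch itself: the helper names ctz64 and
refine_nonzero_bits are hypothetical, and a plain 64-bit mask stands in
for GCC's wide_int.  If a value is known to be x * C for a constant C,
its ctz(C) low bits are zero, so they can be cleared from the mask of
possibly-nonzero bits.

```c
#include <stdint.h>

/* Hypothetical helper: count trailing zero bits of a nonzero value.  */
static unsigned
ctz64 (uint64_t c)
{
  unsigned n = 0;
  while ((c & 1) == 0)
    {
      c >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical helper: NONZERO is a mask of the bits that may be set
   in the product x * C.  Clear the low ctz(C) bits, which are known
   to be zero.  A zero multiplier makes the whole product zero.  */
static uint64_t
refine_nonzero_bits (uint64_t nonzero, uint64_t c)
{
  if (c == 0)
    return 0;
  /* ctz64 returns at most 63 for a nonzero C, so the shift is
     well defined.  */
  return nonzero & (~UINT64_C (0) << ctz64 (c));
}
```

For example, a length known to fit in 8 bits (mask 0xff) and known to
be a multiple of 8 refines to mask 0xf8, while an odd multiplier leaves
the mask unchanged.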


Regstrapped on x86_64-linux-gnu.  No analysis of codegen impact yet.
Does this seem worth pursuing, presumably for stage1?


for  gcc/ChangeLog

	* tree-ssanames.c (get_nonzero_bits): Zero out low bits of
	integral types, when a MULT_EXPR INTEGER_CST operand ensures
	the result will be a multiple of a power of two.
---
 gcc/tree-ssanames.c |   23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)



-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist         GNU Toolchain Engineer
        Vim, Vi, Voltei pro Emacs -- GNUlius Caesar

Comments

Feng Xue OS via Gcc-patches June 10, 2021, 6:20 p.m. | #1
On 1/27/2021 6:47 AM, Alexandre Oliva wrote:
> While looking into the possibility of introducing setmemM patterns on
> RISC-V to undo the transformation from loops of word writes into
> memset, I was disappointed to find out that get_nonzero_bits would
> take into account the range of the length passed to memset, but not
> the trivially-available observation that this length was a multiple of
> the word size.  This knowledge, if passed on to setmemM, could enable
> setmemM to output more efficient code.
>
> In the end, I did not introduce a setmemM pattern, nor the machinery
> to pass the ctz of the length on to it along with other useful
> information, but I figured this small improvement to nonzero_bits
> could still improve code generation elsewhere.
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564341.html
>
>
> Regstrapped on x86_64-linux-gnu.  No analysis of codegen impact yet.
> Does this seem worth pursuing, presumably for stage1?
>
>
> for  gcc/ChangeLog
>
> 	* tree-ssanames.c (get_nonzero_bits): Zero out low bits of
> 	integral types, when a MULT_EXPR INTEGER_CST operand ensures
> 	the result will be a multiple of a power of two.

Your call on whether or not to pursue -- I'm not sure how often this 
helps us in practice.

If you want to pursue it, I'd suggest some tests to show when/how it's helpful.

jeff

Patch

diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 51a26d2fce1c2..c4b5bf2a4999a 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -546,10 +546,29 @@  get_nonzero_bits (const_tree name)
     }
 
   range_info_def *ri = SSA_NAME_RANGE_INFO (name);
+  wide_int ret;
   if (!ri)
-    return wi::shwi (-1, precision);
+    ret = wi::shwi (-1, precision);
+  else
+    ret = ri->get_nonzero_bits ();
+
+  /* If NAME is defined as a multiple of a constant C, we know the ctz(C) low
+     bits are zero.  ??? Should we handle LSHIFT_EXPR too?  Non-constants,
+     e.g. the minimum shift count, and ctz from both MULT_EXPR operands?  That
+     could make for deep recursion.  */
+  if (INTEGRAL_TYPE_P (TREE_TYPE (name))
+      && SSA_NAME_DEF_STMT (name)
+      && is_gimple_assign (SSA_NAME_DEF_STMT (name))
+      && gimple_assign_rhs_code (SSA_NAME_DEF_STMT (name)) == MULT_EXPR
+      && TREE_CODE (gimple_assign_rhs2 (SSA_NAME_DEF_STMT (name))) == INTEGER_CST)
+    {
+      unsigned HOST_WIDE_INT bits
+	= tree_ctz (gimple_assign_rhs2 (SSA_NAME_DEF_STMT (name)));
+      wide_int mask = wi::shwi (-1, precision) << bits;
+      ret &= mask;
+    }
 
-  return ri->get_nonzero_bits ();
+  return ret;
 }
 
 /* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false