Message ID  20200217074133.572691helijia@linux.ibm.com 

State  New 
Headers  show 
Series 

Related  show 
On Mon, 17 Feb 2020, Li Jia He wrote: > Hi, > > This patch wants to fix PR66552 on gimple and optimizes (x shift (n mod C)) > to (x shift (n bit_and (C  1))) when C is a constant and power of two as > discussed in PR66552. > > The regression testing for the patch was done on GCC mainline on > > powerpc64leunknownlinuxgnu (Power 9 LE) > > with no regressions. Is it OK for GCC11 ? I fail to see the connection with a shift operation, can you explain? Thanks, Richard. > Thanks, > Lijia He > > gcc/ChangeLog > 20200217 Li Jia He <helijia@linux.ibm.com> > > PR treeoptimization/66552 > * match.pd (x shift (n mod pow2(c))): Optimizes to > (x shift (n bit_and (pow2(c)  1)). > > testsuite/ChangeLog > 20190217 Li Jia He <helijia@linux.ibm.com> > > PR treeoptimization/66552 > * testsuite/gcc.dg/pr66552.c: New testcase. >  > gcc/match.pd  6 ++++++ > gcc/testsuite/gcc.dg/pr66552.c  14 ++++++++++++++ > 2 files changed, 20 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/pr66552.c > > diff git a/gcc/match.pd b/gcc/match.pd > index 73834c25593..1d74f7dba7f 100644 >  a/gcc/match.pd > +++ b/gcc/match.pd > @@ 546,6 +546,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (simplify > (mod (mod@2 @0 @1) @1) > @2) > + /* Optimize (x shift (n mod C)) to (x shift (n bit_and (C  1))) when C is a > + constant and power of two. */ > + (for shift (lshift rshift) > + (simplify > + (shift @0 (mod @1 integer_pow2p@2)) > + (shift @0 (bit_and @1 (minus @2 { build_int_cst (TREE_TYPE (@2), 1); }))))) > /* From extract_muldiv_1: (X * C1) % C2 is zero if C1 is a multiple of C2. */ > (simplify > (mod (mult @0 INTEGER_CST@1) INTEGER_CST@2) > diff git a/gcc/testsuite/gcc.dg/pr66552.c b/gcc/testsuite/gcc.dg/pr66552.c > new file mode 100644 > index 00000000000..7583c9ad25a >  /dev/null > +++ b/gcc/testsuite/gcc.dg/pr66552.c > @@ 0,0 +1,14 @@ > +/* { dgdo compile } */ > +/* { dgoptions "O2 fdumptreelower" } */ > + > +unsigned a(unsigned x, int n) > +{ > + return x >> (n % 32); > +} > + > +unsigned b(unsigned x, int n) > +{ > + return x << (n % 32); > +} > + > +/* { dgfinal { scantreedumpnot " % " "lower" } } */ >  Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
On Mon, Feb 17, 2020 at 09:01:13AM +0100, Richard Biener wrote: > On Mon, 17 Feb 2020, Li Jia He wrote: > > This patch wants to fix PR66552 on gimple and optimizes (x shift (n mod C)) > > to (x shift (n bit_and (C  1))) when C is a constant and power of two as > > discussed in PR66552. > > > > The regression testing for the patch was done on GCC mainline on > > > > powerpc64leunknownlinuxgnu (Power 9 LE) > > > > with no regressions. Is it OK for GCC11 ? > > I fail to see the connection with a shift operation, can you explain? As mentioned in the PR, the reason why we can only optimize unsigned mod C into bit_and C1 (where C is pow2p) is that signed modulo can be negative for negative values (is nonpositive). For shifts negative values would be UB and so we can assume it will not be negative (i.e. x will be either positive or a negative multiple of C) and can use bit_and then. Jakub
On Mon, 17 Feb 2020, Jakub Jelinek wrote: > On Mon, Feb 17, 2020 at 09:01:13AM +0100, Richard Biener wrote: > > On Mon, 17 Feb 2020, Li Jia He wrote: > > > This patch wants to fix PR66552 on gimple and optimizes (x shift (n mod C)) > > > to (x shift (n bit_and (C  1))) when C is a constant and power of two as > > > discussed in PR66552. > > > > > > The regression testing for the patch was done on GCC mainline on > > > > > > powerpc64leunknownlinuxgnu (Power 9 LE) > > > > > > with no regressions. Is it OK for GCC11 ? > > > > I fail to see the connection with a shift operation, can you explain? > > As mentioned in the PR, the reason why we can only optimize unsigned > mod C into bit_and C1 (where C is pow2p) is that signed modulo can be negative for > negative values (is nonpositive). For shifts negative values would be UB > and so we can assume it will not be negative (i.e. x will be either positive > or a negative multiple of C) and can use bit_and then. That's clearly information that should be in the comment before the pattern. We could also generally do the transform if we know the other operand of the modulo is nonnegative. Also the pattern doing the standalone transform does /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, i.e. "X % C" into "X & (C  1)", if X and C are positive. Also optimize A % (C << N) where C is a power of 2, to A & ((C << N)  1). */ (match (power_of_two_cand @1) INTEGER_CST@1) (match (power_of_two_cand @1) (lshift INTEGER_CST@1 @2)) (for mod (trunc_mod floor_mod) (simplify (mod @0 (convert?@3 (power_of_two_cand@1 @2))) (if ((TYPE_UNSIGNED (type)  tree_expr_nonnegative_p (@0)) && tree_nop_conversion_p (type, TREE_TYPE (@3)) && integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0) (bit_and @0 (convert (minus @1 { build_int_cst (TREE_TYPE (@1), 1); })))))) so it also checks whether @2 is not negative, the new pattern lacks this and could make use of power_of_two_cand as well (in fact I'd place it next to the pattern above. Also the above applies for trunc_mod and floor_mod but the proposed patch applies the transform for ceil_mod and round_mod as well. Richard.
Li Jia He <helijia@linux.ibm.com> writes: > Hi, > @@ 546,6 +546,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (simplify > (mod (mod@2 @0 @1) @1) > @2) > + /* Optimize (x shift (n mod C)) to (x shift (n bit_and (C  1))) when C is a > + constant and power of two. */ > + (for shift (lshift rshift) > + (simplify > + (shift @0 (mod @1 integer_pow2p@2)) > + (shift @0 (bit_and @1 (minus @2 { build_int_cst (TREE_TYPE (@2), 1); }))))) > /* From extract_muldiv_1: (X * C1) % C2 is zero if C1 is a multiple of C2. */ > (simplify > (mod (mult @0 INTEGER_CST@1) INTEGER_CST@2) > +unsigned a(unsigned x, int n) > +{ > + return x >> (n % 32); > +} > + With this patch, the behavior will change on negtive @2. While this would not be an issue because it is UB. In below foo, r and r1 are not same after patch.  int __attribute__ ((noinline)) foo (unsigned x, int n, int t) { int r = x >> (n % 4); int r1 = x >> t; __builtin_printf ("%d : %d\n", r, r1); return r == r1; } int main() { unsigned x = 0x12c; int n = 3; return foo (x, n, n %4); }  For: "x=0x12c, n=3; x >> (n % 4)" without patch, it is 0; with patch it is 0x96.
Hi, On 2020/2/17 4:30 PM, Richard Biener wrote: > On Mon, 17 Feb 2020, Jakub Jelinek wrote: > >> On Mon, Feb 17, 2020 at 09:01:13AM +0100, Richard Biener wrote: >>> On Mon, 17 Feb 2020, Li Jia He wrote: >>>> This patch wants to fix PR66552 on gimple and optimizes (x shift (n mod C)) >>>> to (x shift (n bit_and (C  1))) when C is a constant and power of two as >>>> discussed in PR66552. >>>> >>>> The regression testing for the patch was done on GCC mainline on >>>> >>>> powerpc64leunknownlinuxgnu (Power 9 LE) >>>> >>>> with no regressions. Is it OK for GCC11 ? >>> >>> I fail to see the connection with a shift operation, can you explain? >> >> As mentioned in the PR, the reason why we can only optimize unsigned >> mod C into bit_and C1 (where C is pow2p) is that signed modulo can be negative for >> negative values (is nonpositive). For shifts negative values would be UB >> and so we can assume it will not be negative (i.e. x will be either positive >> or a negative multiple of C) and can use bit_and then. > > That's clearly information that should be in the comment before the > pattern. We could also generally do the transform if we know > the other operand of the modulo is nonnegative. > > Also the pattern doing the standalone transform does > > /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, > i.e. "X % C" into "X & (C  1)", if X and C are positive. > Also optimize A % (C << N) where C is a power of 2, > to A & ((C << N)  1). */ > (match (power_of_two_cand @1) > INTEGER_CST@1) > (match (power_of_two_cand @1) > (lshift INTEGER_CST@1 @2)) > (for mod (trunc_mod floor_mod) > (simplify > (mod @0 (convert?@3 (power_of_two_cand@1 @2))) > (if ((TYPE_UNSIGNED (type) >  tree_expr_nonnegative_p (@0)) > && tree_nop_conversion_p (type, TREE_TYPE (@3)) > && integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0) > (bit_and @0 (convert (minus @1 { build_int_cst (TREE_TYPE (@1), 1); > })))))) > > so it also checks whether @2 is not negative, the new pattern lacks > this and could make use of power_of_two_cand as well (in fact I'd > place it next to the pattern above. > Thank you for your suggestions. I have modified the code according to your suggestions. But power_of_two_cand is not used, it will cause pr703542.c failed ((0x1234ULL << (i % 54)) will transfer to (0x1234ULL << (i & 54))). The regression testing for the patch was done on GCC mainline on powerpc64le with no regressions. Would you like to help to see if there are other problems? Thanks in advance. The code is in the attachment. > Also the above applies for trunc_mod and floor_mod but the proposed > patch applies the transform for ceil_mod and round_mod as well. > > Richard. >  BR, Lijia He diff git a/gcc/match.pd b/gcc/match.pd index 73834c25593..a8fc7918621 100644  a/gcc/match.pd +++ b/gcc/match.pd @@ 598,12 +598,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, i.e. "X % C" into "X & (C  1)", if X and C are positive. Also optimize A % (C << N) where C is a power of 2,  to A & ((C << N)  1). */ + to A & ((C << N)  1). And optimize "A shift (N % C)" where C + is a power of 2 and shift operation included "<<" and ">>", + to "A shift (N & (C  1))". */ (match (power_of_two_cand @1) INTEGER_CST@1) (match (power_of_two_cand @1) (lshift INTEGER_CST@1 @2)) (for mod (trunc_mod floor_mod) + (for shift (lshift rshift) + (simplify + (shift @0 (mod @1 integer_pow2p@2)) + (if (tree_expr_nonnegative_p (@2)) + (shift @0 (bit_and @1 (minus @2 { build_int_cst (TREE_TYPE (@2), 1); })))))) (simplify (mod @0 (convert?@3 (power_of_two_cand@1 @2))) (if ((TYPE_UNSIGNED (type) diff git a/gcc/testsuite/gcc.dg/pr66552.c b/gcc/testsuite/gcc.dg/pr66552.c new file mode 100644 index 00000000000..7583c9ad25a  /dev/null +++ b/gcc/testsuite/gcc.dg/pr66552.c @@ 0,0 +1,14 @@ +/* { dgdo compile } */ +/* { dgoptions "O2 fdumptreelower" } */ + +unsigned a(unsigned x, int n) +{ + return x >> (n % 32); +} + +unsigned b(unsigned x, int n) +{ + return x << (n % 32); +} + +/* { dgfinal { scantreedumpnot " % " "lower" } } */
On Tue, 18 Feb 2020, Li Jia He wrote: >> Also the pattern doing the standalone transform does >> >> /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, >> i.e. "X % C" into "X & (C  1)", if X and C are positive. >> Also optimize A % (C << N) where C is a power of 2, >> to A & ((C << N)  1). */ >> (match (power_of_two_cand @1) >> INTEGER_CST@1) >> (match (power_of_two_cand @1) >> (lshift INTEGER_CST@1 @2)) >> (for mod (trunc_mod floor_mod) >> (simplify >> (mod @0 (convert?@3 (power_of_two_cand@1 @2))) >> (if ((TYPE_UNSIGNED (type) >>  tree_expr_nonnegative_p (@0)) >> && tree_nop_conversion_p (type, TREE_TYPE (@3)) >> && integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0) >> (bit_and @0 (convert (minus @1 { build_int_cst (TREE_TYPE (@1), 1); >> })))))) >> >> so it also checks whether @2 is not negative, the new pattern lacks >> this and could make use of power_of_two_cand as well (in fact I'd >> place it next to the pattern above. >> > > Thank you for your suggestions. I have modified the code according to your > suggestions. But power_of_two_cand is not used, it will cause pr703542.c > failed ((0x1234ULL << (i % 54)) will transfer to (0x1234ULL << (i & 54))). How exactly did you use power_of_two_cand? As shown in the transformation above, it combines with integer_pow2p, it doesn't replace it.  Marc Glisse
Hi, On 2020/2/22 11:12 PM, Marc Glisse wrote: > On Tue, 18 Feb 2020, Li Jia He wrote: > >>> Also the pattern doing the standalone transform does >>> >>> /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, >>> i.e. "X % C" into "X & (C  1)", if X and C are positive. >>> Also optimize A % (C << N) where C is a power of 2, >>> to A & ((C << N)  1). */ >>> (match (power_of_two_cand @1) >>> INTEGER_CST@1) >>> (match (power_of_two_cand @1) >>> (lshift INTEGER_CST@1 @2)) >>> (for mod (trunc_mod floor_mod) >>> (simplify >>> (mod @0 (convert?@3 (power_of_two_cand@1 @2))) >>> (if ((TYPE_UNSIGNED (type) >>>  tree_expr_nonnegative_p (@0)) >>> && tree_nop_conversion_p (type, TREE_TYPE (@3)) >>> && integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0) >>> (bit_and @0 (convert (minus @1 { build_int_cst (TREE_TYPE (@1), 1); >>> })))))) >>> >>> so it also checks whether @2 is not negative, the new pattern lacks >>> this and could make use of power_of_two_cand as well (in fact I'd >>> place it next to the pattern above. >>> >> >> Thank you for your suggestions. I have modified the code according to >> your >> suggestions. But power_of_two_cand is not used, it will cause pr703542.c >> failed ((0x1234ULL << (i % 54)) will transfer to (0x1234ULL << (i & >> 54))). > > How exactly did you use power_of_two_cand? As shown in the > transformation above, it combines with integer_pow2p, it doesn't replace > it. > I used power_of_two_cand as follows： diff git a/gcc/match.pd b/gcc/match.pd index 73834c25593..a90e93d8af0 100644  a/gcc/match.pd +++ b/gcc/match.pd @@ 598,12 +598,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, i.e. "X % C" into "X & (C  1)", if X and C are positive. Also optimize A % (C << N) where C is a power of 2,  to A & ((C << N)  1). */ + to A & ((C << N)  1). And optimize "A shift (N % C)" where C + is a power of 2 and shift operation included "<<" and ">>", + to "A shift (N & (C  1))". */ (match (power_of_two_cand @1) INTEGER_CST@1) (match (power_of_two_cand @1) (lshift INTEGER_CST@1 @2)) (for mod (trunc_mod floor_mod) + (for shift (lshift rshift) + (simplify + (shift @0 (mod @1 (power_of_two_cand @2))) + (if (tree_expr_nonnegative_p (@2)) + (shift @0 (bit_and @1 (minus @2 { build_int_cst (TREE_TYPE (@2), 1); })))))) (simplify (mod @0 (convert?@3 (power_of_two_cand@1 @2))) (if ((TYPE_UNSIGNED (type) I found that there will be a generated tree_power_of_two_cand function in genericmatch.c and gimple_power_of_two_cand function in gimplematch.c. bool tree_power_of_two_cand (tree t, tree *res_ops) { const tree type = TREE_TYPE (t); if (TREE_SIDE_EFFECTS (t)) return false; switch (TREE_CODE (t)) { case INTEGER_CST: { { /* #line 604 "../../gccmirror/gcc/match.pd" */ tree captures[1] ATTRIBUTE_UNUSED = { t }; if (__builtin_expect (dump_file && (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Matching expression %s:%d, %s:%d\n", "match.pd", 604, __FILE__, __LINE__); res_ops[0] = captures[0]; return true; } break; } case LSHIFT_EXPR: { tree _p0 = TREE_OPERAND (t, 0); tree _p1 = TREE_OPERAND (t, 1); switch (TREE_CODE (_p0)) { case INTEGER_CST: { { /* #line 606 "../../gccmirror/gcc/match.pd" */ tree captures[2] ATTRIBUTE_UNUSED = { _p0, _p1 }; if (__builtin_expect (dump_file && (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Matching expression %s:%d, %s:%d\n", "match.pd", 606, __FILE__, __LINE__); res_ops[0] = captures[0]; return true; } break; } default:; } break; } default:; } return false; } In the tree_power_of_two_cand function, we can see that if t is an INTEGER_CST, it will be captured directly, so using power_of_two_cand may not be appropriate here.  BR, Lijia He
On Mon, 24 Feb 2020, Li Jia He wrote: > Hi, > > On 2020/2/22 11:12 PM, Marc Glisse wrote: >> On Tue, 18 Feb 2020, Li Jia He wrote: >> >>>> Also the pattern doing the standalone transform does >>>> >>>> /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, >>>> i.e. "X % C" into "X & (C  1)", if X and C are positive. >>>> Also optimize A % (C << N) where C is a power of 2, >>>> to A & ((C << N)  1). */ >>>> (match (power_of_two_cand @1) >>>> INTEGER_CST@1) >>>> (match (power_of_two_cand @1) >>>> (lshift INTEGER_CST@1 @2)) >>>> (for mod (trunc_mod floor_mod) >>>> (simplify >>>> (mod @0 (convert?@3 (power_of_two_cand@1 @2))) >>>> (if ((TYPE_UNSIGNED (type) >>>>  tree_expr_nonnegative_p (@0)) >>>> && tree_nop_conversion_p (type, TREE_TYPE (@3)) >>>> && integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0) >>>> (bit_and @0 (convert (minus @1 { build_int_cst (TREE_TYPE (@1), 1); >>>> })))))) >>>> >>>> so it also checks whether @2 is not negative, the new pattern lacks >>>> this and could make use of power_of_two_cand as well (in fact I'd >>>> place it next to the pattern above. >>>> >>> >>> Thank you for your suggestions. I have modified the code according to >>> your >>> suggestions. But power_of_two_cand is not used, it will cause pr703542.c >>> failed ((0x1234ULL << (i % 54)) will transfer to (0x1234ULL << (i & 54))). >> >> How exactly did you use power_of_two_cand? As shown in the transformation >> above, it combines with integer_pow2p, it doesn't replace it. >> > > I used power_of_two_cand as follows： > > diff git a/gcc/match.pd b/gcc/match.pd > index 73834c25593..a90e93d8af0 100644 >  a/gcc/match.pd > +++ b/gcc/match.pd > @@ 598,12 +598,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, > i.e. "X % C" into "X & (C  1)", if X and C are positive. > Also optimize A % (C << N) where C is a power of 2, >  to A & ((C << N)  1). */ > + to A & ((C << N)  1). And optimize "A shift (N % C)" where C > + is a power of 2 and shift operation included "<<" and ">>", > + to "A shift (N & (C  1))". */ > (match (power_of_two_cand @1) > INTEGER_CST@1) > (match (power_of_two_cand @1) > (lshift INTEGER_CST@1 @2)) > (for mod (trunc_mod floor_mod) > + (for shift (lshift rshift) > + (simplify > + (shift @0 (mod @1 (power_of_two_cand @2))) > + (if (tree_expr_nonnegative_p (@2)) > + (shift @0 (bit_and @1 (minus @2 { build_int_cst (TREE_TYPE (@2), 1); > })))))) > (simplify > (mod @0 (convert?@3 (power_of_two_cand@1 @2))) > (if ((TYPE_UNSIGNED (type) > > I found that there will be a generated tree_power_of_two_cand function in > genericmatch.c and gimple_power_of_two_cand function in gimplematch.c. > > bool > tree_power_of_two_cand (tree t, tree *res_ops) > { > const tree type = TREE_TYPE (t); > if (TREE_SIDE_EFFECTS (t)) return false; > switch (TREE_CODE (t)) > { > case INTEGER_CST: > { > { > /* #line 604 "../../gccmirror/gcc/match.pd" */ > tree captures[1] ATTRIBUTE_UNUSED = { t }; > if (__builtin_expect (dump_file && (dump_flags & TDF_FOLDING), 0)) > fprintf (dump_file, "Matching expression %s:%d, %s:%d\n", "match.pd", 604, > __FILE__, __LINE__); > res_ops[0] = captures[0]; > return true; > } > break; > } > case LSHIFT_EXPR: > { > tree _p0 = TREE_OPERAND (t, 0); > tree _p1 = TREE_OPERAND (t, 1); > switch (TREE_CODE (_p0)) > { > case INTEGER_CST: > { > { > /* #line 606 "../../gccmirror/gcc/match.pd" */ > tree captures[2] ATTRIBUTE_UNUSED = { _p0, _p1 }; > if (__builtin_expect (dump_file && (dump_flags & > TDF_FOLDING), 0)) fprintf (dump_file, "Matching expression %s:%d, %s:%d\n", > "match.pd", 606, __FILE__, __LINE__); > res_ops[0] = captures[0]; > return true; > } > break; > } > default:; > } > break; > } > default:; > } > return false; > } > > In the tree_power_of_two_cand function, we can see that if t is an > INTEGER_CST, it will be captured directly, so using power_of_two_cand > may not be appropriate here. Please look at the one transformation that already uses power_of_two_cand. It matches (power_of_two_cand@1 @2), then tests integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0, and uses @1 in the output. In some sense, tree_power_of_two_cand prepares the argument for a call to integer_pow2p, it doesn't replace it (although we could move the extra tests into power_of_two_cand if we expect all users will need exactly the same).  Marc Glisse
Hi, On 2020/2/24 4:16 PM, Marc Glisse wrote: > On Mon, 24 Feb 2020, Li Jia He wrote: > >> Hi, >> >> On 2020/2/22 11:12 PM, Marc Glisse wrote: >>> On Tue, 18 Feb 2020, Li Jia He wrote: >>> >>>>> Also the pattern doing the standalone transform does >>>>> >>>>> /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, >>>>> i.e. "X % C" into "X & (C  1)", if X and C are positive. >>>>> Also optimize A % (C << N) where C is a power of 2, >>>>> to A & ((C << N)  1). */ >>>>> (match (power_of_two_cand @1) >>>>> INTEGER_CST@1) >>>>> (match (power_of_two_cand @1) >>>>> (lshift INTEGER_CST@1 @2)) >>>>> (for mod (trunc_mod floor_mod) >>>>> (simplify >>>>> (mod @0 (convert?@3 (power_of_two_cand@1 @2))) >>>>> (if ((TYPE_UNSIGNED (type) >>>>>  tree_expr_nonnegative_p (@0)) >>>>> && tree_nop_conversion_p (type, TREE_TYPE (@3)) >>>>> && integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0) >>>>> (bit_and @0 (convert (minus @1 { build_int_cst (TREE_TYPE (@1), >>>>> 1); >>>>> })))))) >>>>> >>>>> so it also checks whether @2 is not negative, the new pattern lacks >>>>> this and could make use of power_of_two_cand as well (in fact I'd >>>>> place it next to the pattern above. >>>>> >>>> >>>> Thank you for your suggestions. I have modified the code according >>>> to your >>>> suggestions. But power_of_two_cand is not used, it will cause >>>> pr703542.c >>>> failed ((0x1234ULL << (i % 54)) will transfer to (0x1234ULL << (i & >>>> 54))). >>> >>> How exactly did you use power_of_two_cand? As shown in the >>> transformation above, it combines with integer_pow2p, it doesn't >>> replace it. >>> >> >> I used power_of_two_cand as follows： >> >> diff git a/gcc/match.pd b/gcc/match.pd >> index 73834c25593..a90e93d8af0 100644 >>  a/gcc/match.pd >> +++ b/gcc/match.pd >> @@ 598,12 +598,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) >> /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, >> i.e. "X % C" into "X & (C  1)", if X and C are positive. >> Also optimize A % (C << N) where C is a power of 2, >>  to A & ((C << N)  1). */ >> + to A & ((C << N)  1). And optimize "A shift (N % C)" where C >> + is a power of 2 and shift operation included "<<" and ">>", >> + to "A shift (N & (C  1))". */ >> (match (power_of_two_cand @1) >> INTEGER_CST@1) >> (match (power_of_two_cand @1) >> (lshift INTEGER_CST@1 @2)) >> (for mod (trunc_mod floor_mod) >> + (for shift (lshift rshift) >> + (simplify >> + (shift @0 (mod @1 (power_of_two_cand @2))) >> + (if (tree_expr_nonnegative_p (@2)) >> + (shift @0 (bit_and @1 (minus @2 { build_int_cst (TREE_TYPE (@2), >> 1); })))))) >> (simplify >> (mod @0 (convert?@3 (power_of_two_cand@1 @2))) >> (if ((TYPE_UNSIGNED (type) >> >> I found that there will be a generated tree_power_of_two_cand function in >> genericmatch.c and gimple_power_of_two_cand function in gimplematch.c. >> >> bool >> tree_power_of_two_cand (tree t, tree *res_ops) >> { >> const tree type = TREE_TYPE (t); >> if (TREE_SIDE_EFFECTS (t)) return false; >> switch (TREE_CODE (t)) >> { >> case INTEGER_CST: >> { >> { >> /* #line 604 "../../gccmirror/gcc/match.pd" */ >> tree captures[1] ATTRIBUTE_UNUSED = { t }; >> if (__builtin_expect (dump_file && (dump_flags & >> TDF_FOLDING), 0)) fprintf (dump_file, "Matching expression %s:%d, >> %s:%d\n", "match.pd", 604, __FILE__, __LINE__); >> res_ops[0] = captures[0]; >> return true; >> } >> break; >> } >> case LSHIFT_EXPR: >> { >> tree _p0 = TREE_OPERAND (t, 0); >> tree _p1 = TREE_OPERAND (t, 1); >> switch (TREE_CODE (_p0)) >> { >> case INTEGER_CST: >> { >> { >> /* #line 606 "../../gccmirror/gcc/match.pd" */ >> tree captures[2] ATTRIBUTE_UNUSED = { _p0, _p1 }; >> if (__builtin_expect (dump_file && (dump_flags & >> TDF_FOLDING), 0)) fprintf (dump_file, "Matching expression %s:%d, >> %s:%d\n", "match.pd", 606, __FILE__, __LINE__); >> res_ops[0] = captures[0]; >> return true; >> } >> break; >> } >> default:; >> } >> break; >> } >> default:; >> } >> return false; >> } >> >> In the tree_power_of_two_cand function, we can see that if t is an >> INTEGER_CST, it will be captured directly, so using power_of_two_cand >> may not be appropriate here. > > Please look at the one transformation that already uses > power_of_two_cand. It matches (power_of_two_cand@1 @2), then tests > integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0, and uses @1 in the > output. In some sense, tree_power_of_two_cand prepares the argument for > a call to integer_pow2p, it doesn't replace it (although we could move > the extra tests into power_of_two_cand if we expect all users will need > exactly the same). > Thank you for your guidance. I have made some changes based on my understanding. Would you like to see it again? Thank you in advance. The code is in the attachment.  BR, Lijia He diff git a/gcc/match.pd b/gcc/match.pd index 73834c25593..9e7619c5be8 100644  a/gcc/match.pd +++ b/gcc/match.pd @@ 598,12 +598,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR, i.e. "X % C" into "X & (C  1)", if X and C are positive. Also optimize A % (C << N) where C is a power of 2,  to A & ((C << N)  1). */ + to A & ((C << N)  1). And optimize "A shift (B % C)" where C + is a power of 2, shift operation included "<<" and ">>" and assume + (B % C) will not be negative as shifts negative values would be UB, + to "A shift (B & (C  1))". */ (match (power_of_two_cand @1) INTEGER_CST@1) (match (power_of_two_cand @1) (lshift INTEGER_CST@1 @2)) (for mod (trunc_mod floor_mod) + (for shift (lshift rshift) + (simplify + (shift @0 (mod @1 (power_of_two_cand@2 @3))) + (if (integer_pow2p (@3) && tree_int_cst_sgn (@3) > 0) + (shift @0 (bit_and @1 (minus @2 { build_int_cst (TREE_TYPE (@2), + 1); })))))) (simplify (mod @0 (convert?@3 (power_of_two_cand@1 @2))) (if ((TYPE_UNSIGNED (type) diff git a/gcc/testsuite/gcc.dg/pr66552.c b/gcc/testsuite/gcc.dg/pr66552.c new file mode 100644 index 00000000000..7583c9ad25a  /dev/null +++ b/gcc/testsuite/gcc.dg/pr66552.c @@ 0,0 +1,14 @@ +/* { dgdo compile } */ +/* { dgoptions "O2 fdumptreelower" } */ + +unsigned a(unsigned x, int n) +{ + return x >> (n % 32); +} + +unsigned b(unsigned x, int n) +{ + return x << (n % 32); +} + +/* { dgfinal { scantreedumpnot " % " "lower" } } */
diff git a/gcc/match.pd b/gcc/match.pd index 73834c25593..1d74f7dba7f 100644  a/gcc/match.pd +++ b/gcc/match.pd @@ 546,6 +546,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (simplify (mod (mod@2 @0 @1) @1) @2) + /* Optimize (x shift (n mod C)) to (x shift (n bit_and (C  1))) when C is a + constant and power of two. */ + (for shift (lshift rshift) + (simplify + (shift @0 (mod @1 integer_pow2p@2)) + (shift @0 (bit_and @1 (minus @2 { build_int_cst (TREE_TYPE (@2), 1); }))))) /* From extract_muldiv_1: (X * C1) % C2 is zero if C1 is a multiple of C2. */ (simplify (mod (mult @0 INTEGER_CST@1) INTEGER_CST@2) diff git a/gcc/testsuite/gcc.dg/pr66552.c b/gcc/testsuite/gcc.dg/pr66552.c new file mode 100644 index 00000000000..7583c9ad25a  /dev/null +++ b/gcc/testsuite/gcc.dg/pr66552.c @@ 0,0 +1,14 @@ +/* { dgdo compile } */ +/* { dgoptions "O2 fdumptreelower" } */ + +unsigned a(unsigned x, int n) +{ + return x >> (n % 32); +} + +unsigned b(unsigned x, int n) +{ + return x << (n % 32); +} + +/* { dgfinal { scantreedumpnot " % " "lower" } } */