Backports to 9.3

Message ID 20200213224457.GL17695@tucnak
State New
Headers show
Series
  • Backports to 9.3
Related show

Commit Message

Jakub Jelinek Feb. 13, 2020, 10:44 p.m.
Hi!

I've backported following 15 commits from trunk to 9.3 branch,
bootstrapped/regtested on x86_64-linux and i686-linux, committed.

r10-6186-g32667e04c7153d97d09d81c1af073d400f0c719a
r10-6273-gbff948aa337807260344c83ac9079d6386410094
r10-6314-gaa1b56967d85bfc80d71341395f862ec2b30ca36
r10-6315-g8d7c0bf876fa784101f9ad9e3bba82cc065357da
r10-6358-g56b92750f83724177d2c6eae30c208e935a56a37
r10-6444-gb843bcb89519293404bb00d2ed09aae529b54d7f
r10-6460-g5a8ad97b6e4823d4ded00a3ce8d80e4bf93368d4
r10-6470-gcf785618ecc90e3f063b99572de48cb91aa5ab5d
r10-6471-gcb3f06480a17f98579704b9927632627a3814c5c
r10-6522-g79ab8c4321b2dc940bb706a7432a530e26f0df1a
r10-6565-gf57aa9503ff170ff6c8549718bd736f6c8168bab
r10-6593-g62fc0a6ce28c502fc6a7b7c09157840bf98f945f
r10-6612-gdc6d0f89d4be3ed7fde73417606a78c73d954cdf
r10-6617-gae2b8ede40a81a83f50d1e705972bc46fafd4ce5
r10-6625-gbacdd5e978dad84e9c547b0d5c7fed14b8d75157

	Jakub
From 3b2fbe3e723b20ea9089e5f45c55b79feb37085b Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Thu, 23 Jan 2020 20:08:22 +0100
Subject: [PATCH 01/15] postreload: Fix up postreload combine [PR93402]

The following testcase is miscompiled, because the postreload pass changes:
-(insn 14 13 23 2 (parallel [
-            (set (reg:DI 1 dx [94])
-                (plus:DI (reg:DI 1 dx [95])
-                    (reg:DI 5 di [92])))
-            (clobber (reg:CC 17 flags))
-        ]) "pr93402.c":8:30 186 {*adddi_1}
-     (expr_list:REG_EQUAL (plus:DI (reg:DI 5 di [92])
-            (const_int 111111111111 [0x19debd01c7]))
-        (nil)))
-(insn 23 14 25 2 (set (reg:SI 0 ax)
+(insn 23 13 25 2 (set (reg:SI 0 ax)
         (const_int 0 [0])) "pr93402.c":10:1 67 {*movsi_internal}
      (nil))
 (insn 25 23 26 2 (use (reg:SI 0 ax)) "pr93402.c":10:1 -1
      (nil))
-(insn 26 25 35 2 (use (reg:DI 1 dx)) "pr93402.c":10:1 -1
+(insn 26 25 35 2 (use (plus:DI (reg:DI 1 dx [95])
+            (reg:DI 5 di [92]))) "pr93402.c":10:1 -1
      (nil))
A USE insn is not a normal insn and verify_changes called from
apply_change_group is happy about any changes into it.
The following patch avoids this optimization if we were to change
the USE operand (this routine only changes a reg into (plus reg reg2)).

2020-01-23  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/93402
	* postreload.c (reload_combine_recognize_pattern): Don't try to adjust
	USE insns.

	* gcc.c-torture/execute/pr93402.c: New test.
---
 gcc/ChangeLog                                 |  9 ++++++++
 gcc/postreload.c                              |  4 ++++
 gcc/testsuite/ChangeLog                       |  8 +++++++
 gcc/testsuite/gcc.c-torture/execute/pr93402.c | 21 +++++++++++++++++++
 4 files changed, 42 insertions(+)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93402.c

-- 
2.20.1
From 764e831291a2e510978ca7be0bffb55589a5a0b6 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Tue, 28 Jan 2020 08:46:23 +0100
Subject: [PATCH 02/15] i386: Fix ix86_fold_builtin shift folding [PR93418]

The following testcase is miscompiled, because the variable shift left
operand, { -1, -1, -1, -1 } is represented as a VECTOR_CST with
VECTOR_CST_NPATTERNS 1 and VECTOR_CST_NELTS_PER_PATTERN 1, so when
we call builder.new_unary_operation, builder.encoded_nelts () will be just 1
and thus we encode the resulting vector as if all the elements were the
same.
For non-masked is_vshift, we could perhaps call builder.new_binary_operation
(TREE_TYPE (args[0]), args[0], args[1], false), but then there are masked
shifts, for non-is_vshift we could perhaps call it too but with args[2]
instead of args[1], but there is no builder.new_ternary_operation.
All this stuff is primarily for aarch64 anyway, on x86 we don't have any
variable length vectors, and it is not a big deal to compute all elements
and just let builder.finalize () find the most efficient VECTOR_CST
representation of the vector.  So, instead of doing too much, this just
keeps using new_unary_operation only if only one VECTOR_CST is involved
(i.e. non-masked shift by constant) and for the rest just compute all elts.

2020-01-28  Jakub Jelinek  <jakub@redhat.com>

	PR target/93418
	* config/i386/i386.c (ix86_fold_builtin) <do_shift>: If mask is not
	-1 or is_vshift is true, use new_vector with number of elts npatterns
	rather than new_unary_operation.

	* gcc.target/i386/avx2-pr93418.c: New test.
---
 gcc/ChangeLog                                |  7 +++++++
 gcc/config/i386/i386.c                       |  9 +++++++--
 gcc/testsuite/ChangeLog                      |  5 +++++
 gcc/testsuite/gcc.target/i386/avx2-pr93418.c | 20 ++++++++++++++++++++
 4 files changed, 39 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-pr93418.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2029c67bf02..ca09488b59d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,6 +1,13 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-01-28  Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/93418
+	* config/i386/i386.c (ix86_fold_builtin) <do_shift>: If mask is not
+	-1 or is_vshift is true, use new_vector with number of elts npatterns
+	rather than new_unary_operation.
+
 	2020-01-23  Jakub Jelinek  <jakub@redhat.com>
 
 	PR rtl-optimization/93402
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 6ee6aea2389..779e8111379 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -33418,8 +33418,13 @@ ix86_fold_builtin (tree fndecl, int n_args,
 		    countt = build_int_cst (integer_type_node, count);
 		}
 	      tree_vector_builder builder;
-	      builder.new_unary_operation (TREE_TYPE (args[0]), args[0],
-					   false);
+	      if (mask != HOST_WIDE_INT_M1U || is_vshift)
+		builder.new_vector (TREE_TYPE (args[0]),
+				    TYPE_VECTOR_SUBPARTS (TREE_TYPE (args[0])),
+				    1);
+	      else
+		builder.new_unary_operation (TREE_TYPE (args[0]), args[0],
+					     false);
 	      unsigned int cnt = builder.encoded_nelts ();
 	      for (unsigned int i = 0; i < cnt; ++i)
 		{
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index bec5eba5033..532f8dbef6c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,6 +1,11 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-01-28  Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/93418
+	* gcc.target/i386/avx2-pr93418.c: New test.
+
 	2020-01-23  Jakub Jelinek  <jakub@redhat.com>
 
 	PR rtl-optimization/93402
diff --git a/gcc/testsuite/gcc.target/i386/avx2-pr93418.c b/gcc/testsuite/gcc.target/i386/avx2-pr93418.c
new file mode 100644
index 00000000000..67ed33ddf9d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx2-pr93418.c
@@ -0,0 +1,20 @@
+/* PR target/93418 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "link_error" "optimized" } } */
+
+#include <x86intrin.h>
+
+void link_error (void);
+
+void
+foo (void)
+{
+  __m128i a = _mm_set1_epi32 (0xffffffffU);
+  __m128i b = _mm_setr_epi32 (16, 31, -34, 3);
+  __m128i c = _mm_sllv_epi32 (a, b);
+  __v4su d = (__v4su) c;
+  if (d[0] != 0xffff0000U || d[1] != 0x80000000U
+      || d[2] != 0 || d[3] != 0xfffffff8U)
+    link_error ();
+}
-- 
2.20.1
From 244f4b8c2823531a1e479a3773272af539dda258 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Wed, 29 Jan 2020 09:39:16 +0100
Subject: [PATCH 03/15] openmp: Handle rest of EXEC_OACC_* in
 oacc_code_to_statement [PR93463]

As the testcase shows, some EXEC_OACC_* codes weren't handled in
oacc_code_to_statement.  Fixed thusly.

2020-01-29  Jakub Jelinek  <jakub@redhat.com>

	PR fortran/93463
	* openmp.c (oacc_code_to_statement): Handle
	EXEC_OACC_{ROUTINE,UPDATE,WAIT,CACHE,{ENTER,EXIT}_DATA,DECLARE}.

	* gfortran.dg/goacc/pr93463.f90: New test.
---
 gcc/fortran/ChangeLog                       |  9 +++++++++
 gcc/fortran/openmp.c                        | 14 ++++++++++++++
 gcc/testsuite/ChangeLog                     |  5 +++++
 gcc/testsuite/gfortran.dg/goacc/pr93463.f90 | 15 +++++++++++++++
 4 files changed, 43 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/pr93463.f90

diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index 0f5b7e60a50..f31e052b10a 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,3 +1,12 @@
+2020-02-13  Jakub Jelinek  <jakub@redhat.com>
+
+	Backported from mainline
+	2020-01-29  Jakub Jelinek  <jakub@redhat.com>
+
+	PR fortran/93463
+	* openmp.c (oacc_code_to_statement): Handle
+	EXEC_OACC_{ROUTINE,UPDATE,WAIT,CACHE,{ENTER,EXIT}_DATA,DECLARE}.
+
 2020-02-03  Tobias Burnus  <tobias@codesourcery.com>
 
 	Backported from mainline
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 716dd5ec3e2..83b1c4487de 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -5858,6 +5858,20 @@ oacc_code_to_statement (gfc_code *code)
       return ST_OACC_LOOP;
     case EXEC_OACC_ATOMIC:
       return ST_OACC_ATOMIC;
+    case EXEC_OACC_ROUTINE:
+      return ST_OACC_ROUTINE;
+    case EXEC_OACC_UPDATE:
+      return ST_OACC_UPDATE;
+    case EXEC_OACC_WAIT:
+      return ST_OACC_WAIT;
+    case EXEC_OACC_CACHE:
+      return ST_OACC_CACHE;
+    case EXEC_OACC_ENTER_DATA:
+      return ST_OACC_ENTER_DATA;
+    case EXEC_OACC_EXIT_DATA:
+      return ST_OACC_EXIT_DATA;
+    case EXEC_OACC_DECLARE:
+      return ST_OACC_DECLARE;
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 532f8dbef6c..b5165efbc35 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,6 +1,11 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-01-29  Jakub Jelinek  <jakub@redhat.com>
+
+	PR fortran/93463
+	* gfortran.dg/goacc/pr93463.f90: New test.
+
 	2020-01-28  Jakub Jelinek  <jakub@redhat.com>
 
 	PR target/93418
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr93463.f90 b/gcc/testsuite/gfortran.dg/goacc/pr93463.f90
new file mode 100644
index 00000000000..920892fdcda
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/pr93463.f90
@@ -0,0 +1,15 @@
+! PR fortran/93463
+! { dg-do compile { target fopenmp } }
+! { dg-additional-options "-fopenmp" }
+
+program pr93463
+   integer :: i, x, y, z
+   !$omp parallel do
+   do i = 1, 4
+      !$acc enter data create(x)	! { dg-error "ACC ENTER DATA directive cannot be specified within" }
+      !$acc exit data copyout(x)	! { dg-error "ACC EXIT DATA directive cannot be specified within" }
+      !$acc cache(y)			! { dg-error "ACC CACHE directive cannot be specified within" }
+      !$acc wait(1)			! { dg-error "ACC WAIT directive cannot be specified within" }
+      !$acc update self(z)		! { dg-error "ACC UPDATE directive cannot be specified within" }
+   end do
+end
-- 
2.20.1
From 4b124e3c9c35121969cc23d0aea4bcb2c406fd21 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Wed, 29 Jan 2020 09:41:42 +0100
Subject: [PATCH 04/15] openmp: c++: Consider typeinfo decls to be
 predetermined shared [PR91118]

If the typeinfo decls appear in OpenMP default(none) regions, as we no longer
predetermine const with no mutable members, they are diagnosed as errors,
but it isn't something the users can actually provide explicit sharing for in
the clauses.

2020-01-29  Jakub Jelinek  <jakub@redhat.com>

	PR c++/91118
	* cp-gimplify.c (cxx_omp_predetermined_sharing): Return
	OMP_CLAUSE_DEFAULT_SHARED for typeinfo decls.

	* g++.dg/gomp/pr91118-1.C: New test.
	* g++.dg/gomp/pr91118-2.C: New test.
---
 gcc/cp/ChangeLog                      |  9 +++++++++
 gcc/cp/cp-gimplify.c                  |  4 ++++
 gcc/testsuite/ChangeLog               |  4 ++++
 gcc/testsuite/g++.dg/gomp/pr91118-1.C | 12 ++++++++++++
 gcc/testsuite/g++.dg/gomp/pr91118-2.C | 14 ++++++++++++++
 5 files changed, 43 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/gomp/pr91118-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/pr91118-2.C

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 64ca338029b..31dee033f6e 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,12 @@
+2020-02-13  Jakub Jelinek  <jakub@redhat.com>
+
+	Backported from mainline
+	2020-01-29  Jakub Jelinek  <jakub@redhat.com>
+
+	PR c++/91118
+	* cp-gimplify.c (cxx_omp_predetermined_sharing): Return
+	OMP_CLAUSE_DEFAULT_SHARED for typeinfo decls.
+
 2020-01-28  Jason Merrill  <jason@redhat.com>
 
 	PR c++/90546
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index a7121b70a3b..90a315003d4 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -2107,6 +2107,10 @@ cxx_omp_predetermined_sharing (tree decl)
 	   && DECL_OMP_PRIVATIZED_MEMBER (decl)))
     return OMP_CLAUSE_DEFAULT_SHARED;
 
+  /* Similarly for typeinfo symbols.  */
+  if (VAR_P (decl) && DECL_ARTIFICIAL (decl) && DECL_TINFO_P (decl))
+    return OMP_CLAUSE_DEFAULT_SHARED;
+
   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
 }
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b5165efbc35..62df3f97f49 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -3,6 +3,10 @@
 	Backported from mainline
 	2020-01-29  Jakub Jelinek  <jakub@redhat.com>
 
+	PR c++/91118
+	* g++.dg/gomp/pr91118-1.C: New test.
+	* g++.dg/gomp/pr91118-2.C: New test.
+
 	PR fortran/93463
 	* gfortran.dg/goacc/pr93463.f90: New test.
 
diff --git a/gcc/testsuite/g++.dg/gomp/pr91118-1.C b/gcc/testsuite/g++.dg/gomp/pr91118-1.C
new file mode 100644
index 00000000000..f29d69db084
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/pr91118-1.C
@@ -0,0 +1,12 @@
+// PR c++/91118
+// { dg-do compile }
+// { dg-additional-options "-fsanitize=undefined" }
+
+#include <iostream>
+
+void
+foo ()
+{
+#pragma omp parallel default(none) shared(std::cerr)
+  std::cerr << "hello" << std::endl;
+}
diff --git a/gcc/testsuite/g++.dg/gomp/pr91118-2.C b/gcc/testsuite/g++.dg/gomp/pr91118-2.C
new file mode 100644
index 00000000000..80f1e3e45c4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/pr91118-2.C
@@ -0,0 +1,14 @@
+// PR c++/91118
+// { dg-do compile }
+
+#include <typeinfo>
+
+struct S { virtual ~S (); };
+void bar (const std::type_info &, const std::type_info &);
+
+void
+foo (S *p)
+{
+  #pragma omp parallel default (none) firstprivate (p)
+    bar (typeid (*p), typeid (S));
+}
-- 
2.20.1
From 329475795c6eeaa2b122672091c9119b9d6c5564 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Thu, 30 Jan 2020 21:28:17 +0100
Subject: [PATCH 05/15] combine: Punt on out of range rotate counts [PR93505]

What happens on this testcase is with the out of bounds rotate we get:
Trying 13 -> 16:
   13: r129:SI=r132:DI#0<-<0x20
      REG_DEAD r132:DI
   16: r123:DI=r129:SI<0
      REG_DEAD r129:SI
Successfully matched this instruction:
(set (reg/v:DI 123 [ <retval> ])
    (const_int 0 [0]))
during combine.  So, perhaps we could also change simplify-rtx.c to punt
if it is out of bounds rather than trying to optimize anything.
Or, but probably GCC11 material, if we decide that ROTATE/ROTATERT doesn't
have out of bounds counts or introduce targetm.rotate_truncation_mask,
we should truncate the argument instead of punting.
Punting is better for backports though.

2020-01-30  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/93505
	* combine.c (simplify_comparison) <case ROTATE>: Punt on out of range
	rotate counts.

	* gcc.c-torture/compile/pr93505.c: New test.
---
 gcc/ChangeLog                                 |  6 ++++++
 gcc/combine.c                                 |  3 ++-
 gcc/testsuite/ChangeLog                       |  5 +++++
 gcc/testsuite/gcc.c-torture/compile/pr93505.c | 15 +++++++++++++++
 4 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr93505.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ca09488b59d..0d8e888ec3a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,6 +1,12 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-01-30  Jakub Jelinek  <jakub@redhat.com>
+
+	PR middle-end/93505
+	* combine.c (simplify_comparison) <case ROTATE>: Punt on out of range
+	rotate counts.
+
 	2020-01-28  Jakub Jelinek  <jakub@redhat.com>
 
 	PR target/93418
diff --git a/gcc/combine.c b/gcc/combine.c
index 4de759a8e6b..601943d6bb0 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -12424,7 +12424,8 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
 	     bit.  This will be converted into a ZERO_EXTRACT.  */
 	  if (const_op == 0 && sign_bit_comparison_p
 	      && CONST_INT_P (XEXP (op0, 1))
-	      && mode_width <= HOST_BITS_PER_WIDE_INT)
+	      && mode_width <= HOST_BITS_PER_WIDE_INT
+	      && UINTVAL (XEXP (op0, 1)) < mode_width)
 	    {
 	      op0 = simplify_and_const_int (NULL_RTX, mode, XEXP (op0, 0),
 					    (HOST_WIDE_INT_1U
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 62df3f97f49..384f34a41ca 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,6 +1,11 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-01-30  Jakub Jelinek  <jakub@redhat.com>
+
+	PR middle-end/93505
+	* gcc.c-torture/compile/pr93505.c: New test.
+
 	2020-01-29  Jakub Jelinek  <jakub@redhat.com>
 
 	PR c++/91118
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr93505.c b/gcc/testsuite/gcc.c-torture/compile/pr93505.c
new file mode 100644
index 00000000000..0627962eae5
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr93505.c
@@ -0,0 +1,15 @@
+/* PR middle-end/93505 */
+
+unsigned a;
+
+unsigned
+foo (unsigned x)
+{
+  unsigned int y = 32 - __builtin_bswap64 (-a);
+  /* This would be UB (x << 32) at runtime.  Ensure we don't
+     invoke UB in the compiler because of that (visible with
+     bootstrap-ubsan).  */
+  x = x << y | x >> (-y & 31);
+  x >>= 31;
+  return x;
+}
-- 
2.20.1
From d42f9eaa3e189d4228a4b3a63d02b83fed6385e7 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Wed, 5 Feb 2020 11:32:37 +0100
Subject: [PATCH 06/15] openmp: Avoid ICEs with declare simd; declare simd
 inbranch [PR93555]

The testcases ICE because when processing the declare simd inbranch,
we don't create the i == 0 clone as it already exists, which means
clone_info->nargs is not adjusted, but we then rely on it being adjusted
when trying other clones.

2020-02-05  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/93555
	* omp-simd-clone.c (expand_simd_clones): If simd_clone_mangle or
	simd_clone_create failed when i == 0, adjust clone->nargs by
	clone->inbranch.

	* c-c++-common/gomp/pr93555-1.c: New test.
	* c-c++-common/gomp/pr93555-2.c: New test.
	* gfortran.dg/gomp/pr93555.f90: New test.
---
 gcc/ChangeLog                               |  7 +++++++
 gcc/omp-simd-clone.c                        | 12 ++++++++++--
 gcc/testsuite/ChangeLog                     |  7 +++++++
 gcc/testsuite/c-c++-common/gomp/pr93555-1.c | 18 ++++++++++++++++++
 gcc/testsuite/c-c++-common/gomp/pr93555-2.c | 16 ++++++++++++++++
 gcc/testsuite/gfortran.dg/gomp/pr93555.f90  | 11 +++++++++++
 6 files changed, 69 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/pr93555-1.c
 create mode 100644 gcc/testsuite/c-c++-common/gomp/pr93555-2.c
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/pr93555.f90

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 0d8e888ec3a..a4435086f02 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,6 +1,13 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-05  Jakub Jelinek  <jakub@redhat.com>
+
+	PR middle-end/93555
+	* omp-simd-clone.c (expand_simd_clones): If simd_clone_mangle or
+	simd_clone_create failed when i == 0, adjust clone->nargs by
+	clone->inbranch.
+
 	2020-01-30  Jakub Jelinek  <jakub@redhat.com>
 
 	PR middle-end/93505
diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index 845443efc59..e865828f569 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -1703,14 +1703,22 @@ expand_simd_clones (struct cgraph_node *node)
 	     already.  */
 	  tree id = simd_clone_mangle (node, clone);
 	  if (id == NULL_TREE)
-	    continue;
+	    {
+	      if (i == 0)
+		clone->nargs += clone->inbranch;
+	      continue;
+	    }
 
 	  /* Only when we are sure we want to create the clone actually
 	     clone the function (or definitions) or create another
 	     extern FUNCTION_DECL (for prototypes without definitions).  */
 	  struct cgraph_node *n = simd_clone_create (node);
 	  if (n == NULL)
-	    continue;
+	    {
+	      if (i == 0)
+		clone->nargs += clone->inbranch;
+	      continue;
+	    }
 
 	  n->simdclone = clone;
 	  clone->origin = node;
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 384f34a41ca..96390af9e12 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,6 +1,13 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-05  Jakub Jelinek  <jakub@redhat.com>
+
+	PR middle-end/93555
+	* c-c++-common/gomp/pr93555-1.c: New test.
+	* c-c++-common/gomp/pr93555-2.c: New test.
+	* gfortran.dg/gomp/pr93555.f90: New test.
+
 	2020-01-30  Jakub Jelinek  <jakub@redhat.com>
 
 	PR middle-end/93505
diff --git a/gcc/testsuite/c-c++-common/gomp/pr93555-1.c b/gcc/testsuite/c-c++-common/gomp/pr93555-1.c
new file mode 100644
index 00000000000..2eb76a2d9de
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/pr93555-1.c
@@ -0,0 +1,18 @@
+/* PR middle-end/93555 */
+/* { dg-do compile } */
+
+#pragma omp declare simd
+#pragma omp declare simd inbranch
+int
+foo (int x)
+{
+  return x;
+}
+
+#pragma omp declare simd inbranch
+#pragma omp declare simd
+int
+bar (int x)
+{
+  return x;
+}
diff --git a/gcc/testsuite/c-c++-common/gomp/pr93555-2.c b/gcc/testsuite/c-c++-common/gomp/pr93555-2.c
new file mode 100644
index 00000000000..091f5bd5ff1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/pr93555-2.c
@@ -0,0 +1,16 @@
+/* PR middle-end/93555 */
+/* { dg-do compile } */
+
+#pragma omp declare simd
+#pragma omp declare simd inbranch
+void
+foo (void)
+{
+}
+
+#pragma omp declare simd inbranch
+#pragma omp declare simd
+void
+bar (void)
+{
+}
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr93555.f90 b/gcc/testsuite/gfortran.dg/gomp/pr93555.f90
new file mode 100644
index 00000000000..4a97fee07a7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/pr93555.f90
@@ -0,0 +1,11 @@
+! PR middle-end/93555
+! { dg-do compile }
+
+subroutine foo
+  !$omp declare simd(foo)
+  !$omp declare simd(foo) inbranch
+end
+subroutine bar
+  !$omp declare simd(bar) inbranch
+  !$omp declare simd(bar)
+end
-- 
2.20.1
From 520b364da0b20dcb492229757190cc3f30322052 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Wed, 5 Feb 2020 23:35:08 +0100
Subject: [PATCH 07/15] c++: Mark __builtin_convertvector operand as read
 [PR93557]

In C++ we weren't calling mark_exp_read on the __builtin_convertvector first
argument.  I guess it could misbehave even with lambda implicit captures.

Fixed by calling decay_conversion on the argument, we use the argument as
rvalue so we want the standard lvalue to rvalue conversions, but as the
argument must be a vector type, e.g. integral promotions aren't really
needed.

2020-02-05  Jakub Jelinek  <jakub@redhat.com>

	PR c++/93557
	* semantics.c (cp_build_vec_convert): Call decay_conversion on arg
	prior to passing it to c_build_vec_convert.

	* c-c++-common/Wunused-var-17.c: New test.
---
 gcc/cp/ChangeLog                            |  6 ++++++
 gcc/cp/semantics.c                          |  3 ++-
 gcc/testsuite/ChangeLog                     |  3 +++
 gcc/testsuite/c-c++-common/Wunused-var-17.c | 19 +++++++++++++++++++
 4 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/Wunused-var-17.c

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 31dee033f6e..d9bb3b5d75b 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,6 +1,12 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-05  Jakub Jelinek  <jakub@redhat.com>
+
+	PR c++/93557
+	* semantics.c (cp_build_vec_convert): Call decay_conversion on arg
+	prior to passing it to c_build_vec_convert.
+
 	2020-01-29  Jakub Jelinek  <jakub@redhat.com>
 
 	PR c++/91118
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 1bd014696fc..0c727eaf2e7 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -10036,7 +10036,8 @@ cp_build_vec_convert (tree arg, location_t loc, tree type,
 
   tree ret = NULL_TREE;
   if (!type_dependent_expression_p (arg) && !dependent_type_p (type))
-    ret = c_build_vec_convert (cp_expr_loc_or_loc (arg, input_location), arg,
+    ret = c_build_vec_convert (cp_expr_loc_or_loc (arg, input_location),
+			       decay_conversion (arg, complain),
 			       loc, type, (complain & tf_error) != 0);
 
   if (!processing_template_decl)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 96390af9e12..2cfc06f5605 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -3,6 +3,9 @@
 	Backported from mainline
 	2020-02-05  Jakub Jelinek  <jakub@redhat.com>
 
+	PR c++/93557
+	* c-c++-common/Wunused-var-17.c: New test.
+
 	PR middle-end/93555
 	* c-c++-common/gomp/pr93555-1.c: New test.
 	* c-c++-common/gomp/pr93555-2.c: New test.
diff --git a/gcc/testsuite/c-c++-common/Wunused-var-17.c b/gcc/testsuite/c-c++-common/Wunused-var-17.c
new file mode 100644
index 00000000000..ab995f8b674
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wunused-var-17.c
@@ -0,0 +1,19 @@
+/* PR c++/93557 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wunused-but-set-variable" } */
+
+typedef int VI __attribute__((vector_size (sizeof (int) * 4)));
+typedef float VF __attribute__((vector_size (sizeof (float) * 4)));
+
+void
+foo (VI *p, VF *q)
+{
+  VI a = (VI) { 1, 2, 3, 4 };			/* { dg-bogus "set but not used" } */
+  q[0] = __builtin_convertvector (a, VF);
+  VI b = p[1];					/* { dg-bogus "set but not used" } */
+  q[1] = __builtin_convertvector (b, VF);
+  VF c = (VF) { 5.0f, 6.0f, 7.0f, 8.0f };	/* { dg-bogus "set but not used" } */
+  p[2] = __builtin_convertvector (c, VI);
+  VF d = q[3];					/* { dg-bogus "set but not used" } */
+  p[3] = __builtin_convertvector (d, VI);
+}
-- 
2.20.1
From d3266b1311723841ec553277f1fb6bfddef8809d Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Thu, 6 Feb 2020 09:15:13 +0100
Subject: [PATCH 08/15] openmp: Notice reduction decl in outer contexts after
 adding it to shared [PR93515]

If we call omp_add_variable, following omp_notice_variable will already find it
on that construct and not go through outer constructs, the following patch fixes that.
Note, this still doesn't follow OpenMP 5.0 semantics on target combined with other
constructs with reduction/lastprivate/linear clauses, will handle that for GCC11.

2020-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/93515
	* gimplify.c (gimplify_scan_omp_clauses) <do_notice>: If adding
	shared clause, call omp_notice_variable on outer context if any.
---
 gcc/ChangeLog  |  6 ++++++
 gcc/gimplify.c | 10 +++++++---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a4435086f02..bc199e5956f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,6 +1,12 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-06  Jakub Jelinek  <jakub@redhat.com>
+
+	PR libgomp/93515
+	* gimplify.c (gimplify_scan_omp_clauses) <do_notice>: If adding
+	shared clause, call omp_notice_variable on outer context if any.
+
 	2020-02-05  Jakub Jelinek  <jakub@redhat.com>
 
 	PR middle-end/93555
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index a0cb6c402bc..c57113cda1d 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -9081,9 +9081,13 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 				  == POINTER_TYPE))))
 		    omp_firstprivatize_variable (outer_ctx, decl);
 		  else
-		    omp_add_variable (outer_ctx, decl,
-				      GOVD_SEEN | GOVD_SHARED);
-		  omp_notice_variable (outer_ctx, decl, true);
+		    {
+		      omp_add_variable (outer_ctx, decl,
+					GOVD_SEEN | GOVD_SHARED);
+		      if (outer_ctx->outer_context)
+			omp_notice_variable (outer_ctx->outer_context, decl,
+					     true);
+		    }
 		}
 	    }
 	  if (outer_ctx)
-- 
2.20.1
From 05fa0de35ec63db2c3aacd30cc34a7389b3c4e5d Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Thu, 6 Feb 2020 09:19:08 +0100
Subject: [PATCH 09/15] openmp: Fix handling of non-addressable shared scalars
 in parallel nested inside of target [PR93515]

As the following testcase shows, we need to consider even target to be a construct
that forces not to use copy in/out for shared on parallel inside of the target.
E.g. for parallel nested inside another parallel or host teams, we already avoid
copy in/out and we need to treat target the same.

2020-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/93515
	* omp-low.c (use_pointer_for_field): For nested constructs, also
	look for map clauses on target construct.
	(scan_omp_1_stmt) <case GIMPLE_OMP_TARGET>: Bump temporarily
	taskreg_nesting_level.

	* testsuite/libgomp.c-c++-common/pr93515.c: New test.
---
 gcc/ChangeLog                                 |  6 ++++
 gcc/omp-low.c                                 | 33 +++++++++++++----
 libgomp/ChangeLog                             |  8 +++++
 .../testsuite/libgomp.c-c++-common/pr93515.c  | 36 +++++++++++++++++++
 4 files changed, 76 insertions(+), 7 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/pr93515.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index bc199e5956f..e9ce10c2870 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -3,6 +3,12 @@
 	Backported from mainline
 	2020-02-06  Jakub Jelinek  <jakub@redhat.com>
 
+	PR libgomp/93515
+	* omp-low.c (use_pointer_for_field): For nested constructs, also
+	look for map clauses on target construct.
+	(scan_omp_1_stmt) <case GIMPLE_OMP_TARGET>: Bump temporarily
+	taskreg_nesting_level.
+
 	PR libgomp/93515
 	* gimplify.c (gimplify_scan_omp_clauses) <do_notice>: If adding
 	shared clause, call omp_notice_variable on outer context if any.
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 81280296a24..813cefd69b9 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -444,18 +444,30 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx)
 	  omp_context *up;
 
 	  for (up = shared_ctx->outer; up; up = up->outer)
-	    if (is_taskreg_ctx (up) && maybe_lookup_decl (decl, up))
+	    if ((is_taskreg_ctx (up)
+		 || (gimple_code (up->stmt) == GIMPLE_OMP_TARGET
+		     && is_gimple_omp_offloaded (up->stmt)))
+		&& maybe_lookup_decl (decl, up))
 	      break;
 
 	  if (up)
 	    {
 	      tree c;
 
-	      for (c = gimple_omp_taskreg_clauses (up->stmt);
-		   c; c = OMP_CLAUSE_CHAIN (c))
-		if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_SHARED
-		    && OMP_CLAUSE_DECL (c) == decl)
-		  break;
+	      if (gimple_code (up->stmt) == GIMPLE_OMP_TARGET)
+		{
+		  for (c = gimple_omp_target_clauses (up->stmt);
+		       c; c = OMP_CLAUSE_CHAIN (c))
+		    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+			&& OMP_CLAUSE_DECL (c) == decl)
+		      break;
+		}
+	      else
+		for (c = gimple_omp_taskreg_clauses (up->stmt);
+		     c; c = OMP_CLAUSE_CHAIN (c))
+		  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_SHARED
+		      && OMP_CLAUSE_DECL (c) == decl)
+		    break;
 
 	      if (c)
 		goto maybe_mark_addressable_and_ret;
@@ -3348,7 +3360,14 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p,
       break;
 
     case GIMPLE_OMP_TARGET:
-      scan_omp_target (as_a <gomp_target *> (stmt), ctx);
+      if (is_gimple_omp_offloaded (stmt))
+	{
+	  taskreg_nesting_level++;
+	  scan_omp_target (as_a <gomp_target *> (stmt), ctx);
+	  taskreg_nesting_level--;
+	}
+      else
+	scan_omp_target (as_a <gomp_target *> (stmt), ctx);
       break;
 
     case GIMPLE_OMP_TEAMS:
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 86af79021ed..b90fddf3ee2 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,11 @@
+2020-02-13  Jakub Jelinek  <jakub@redhat.com>
+
+	Backported from mainline
+	2020-02-06  Jakub Jelinek  <jakub@redhat.com>
+
+	PR libgomp/93515
+	* testsuite/libgomp.c-c++-common/pr93515.c: New test.
+
 2020-01-22  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
diff --git a/libgomp/testsuite/libgomp.c-c++-common/pr93515.c b/libgomp/testsuite/libgomp.c-c++-common/pr93515.c
new file mode 100644
index 00000000000..8a69088ccec
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/pr93515.c
@@ -0,0 +1,36 @@
+/* PR libgomp/93515 */
+
+#include <omp.h>
+#include <stdlib.h>
+
+int
+main ()
+{
+  int i;
+  int a = 42;
+#pragma omp target teams distribute parallel for defaultmap(tofrom: scalar)
+  for (i = 0; i < 64; ++i)
+    if (omp_get_team_num () == 0)
+      if (omp_get_thread_num () == 0)
+	a = 142;
+  if (a != 142)
+    __builtin_abort ();
+  a = 42;
+#pragma omp target parallel for defaultmap(tofrom: scalar)
+  for (i = 0; i < 64; ++i)
+    if (omp_get_thread_num () == 0)
+      a = 143;
+  if (a != 143)
+    __builtin_abort ();
+  a = 42;
+#pragma omp target firstprivate(a)
+  {
+    #pragma omp parallel for
+    for (i = 0; i < 64; ++i)
+      if (omp_get_thread_num () == 0)
+	a = 144;
+    if (a != 144)
+      abort ();
+  }
+  return 0;
+}
-- 
2.20.1
From a91e5d88970c8d865a49f2a4ed4e17ee2c58b73f Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Sat, 8 Feb 2020 10:59:40 +0100
Subject: [PATCH 10/15] i386: Make xmm16-xmm31 call used even in ms ABI
 [PR65782]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On Tue, Feb 04, 2020 at 11:16:06AM +0100, Uros Bizjak wrote:
> I guess that Comment #9 patch form the PR should be trivially correct,

> but althouhg it looks obvious, I don't want to propose the patch since

> I have no means of testing it.


I don't have means of testing it either.
https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019
is quite explicit that [xyz]mm16-31 are call clobbered and only xmm6-15 (low
128-bits only) are call preserved.

We are talking e.g. about
/* { dg-options "-O2 -mabi=ms -mavx512vl" } */

typedef double V __attribute__((vector_size (16)));
void foo (void);
V bar (void);
void baz (V);
void
qux (void)
{
  V c;
  {
    register V a __asm ("xmm18");
    V b = bar ();
    asm ("" : "=x" (a) : "0" (b));
    c = a;
  }
  foo ();
  {
    register V d __asm ("xmm18");
    V e;
    d = c;
    asm ("" : "=x" (e) : "0" (d));
    baz (e);
  }
}
where according to the MSDN doc gcc incorrectly holds the c value
in xmm18 register across the foo call; if foo is compiled by some Microsoft
compiler (or LLVM), then it could clobber %xmm18.
If all xmm18 occurrences are changed to say xmm15, then it is valid to hold
the 128-bit value across the foo call (though, surprisingly, LLVM saves it
into stack anyway).

The other parts are I guess mainly about SEH.  Consider e.g.
void
foo (void)
{
  register double x __asm ("xmm14");
  register double y __asm ("xmm18");
  asm ("" : "=x" (x));
  asm ("" : "=v" (y));
  x += y;
  y += x;
  asm ("" : : "x" (x));
  asm ("" : : "v" (y));
}
looking at cross-compiler output, with -O2 -mavx512f this emits
	.file	"abcdeq.c"
	.text
	.align 16
	.globl	foo
	.def	foo;	.scl	2;	.type	32;	.endef
	.seh_proc	foo
foo:
	subq	$40, %rsp
	.seh_stackalloc	40
	vmovaps %xmm14,	(%rsp)
	.seh_savexmm	%xmm14, 0
	vmovaps %xmm18,	16(%rsp)
	.seh_savexmm	%xmm18, 16
	.seh_endprologue
	vaddsd	%xmm18, %xmm14, %xmm14
	vaddsd	%xmm18, %xmm14, %xmm18
	vmovaps	(%rsp), %xmm14
	vmovaps	16(%rsp), %xmm18
	addq	$40, %rsp
	ret
	.seh_endproc
	.ident	"GCC: (GNU) 10.0.1 20200207 (experimental)"
Does whatever assembler mingw64 uses even assemble this (I mean the
.seh_savexmm %xmm16, 16 could be problematic)?
I can find e.g.
https://stackoverflow.com/questions/43152633/invalid-register-for-seh-savexmm-in-cygwin/43210527
which then links to
https://gcc.gnu.org/PR65782

2020-02-08  Uroš Bizjak  <ubizjak@gmail.com>
	    Jakub Jelinek  <jakub@redhat.com>

	PR target/65782
	* config/i386/i386.h (CALL_USED_REGISTERS): Make
	xmm16-xmm31 call-used even in 64-bit ms-abi.

	* gcc.target/i386/pr65782.c: New test.

Co-authored-by: Uroš Bizjak <ubizjak@gmail.com>
---
 gcc/ChangeLog                           |  7 +++++++
 gcc/config/i386/i386.h                  |  4 ++--
 gcc/testsuite/ChangeLog                 |  6 ++++++
 gcc/testsuite/gcc.target/i386/pr65782.c | 16 ++++++++++++++++
 4 files changed, 31 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr65782.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e9ce10c2870..72c8ee6bd67 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,6 +1,13 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-08  Uroš Bizjak  <ubizjak@gmail.com>
+		    Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/65782
+	* config/i386/i386.h (CALL_USED_REGISTERS): Make
+	xmm16-xmm31 call-used even in 64-bit ms-abi.
+
 	2020-02-06  Jakub Jelinek  <jakub@redhat.com>
 
 	PR libgomp/93515
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 95e1733f12a..14e5a392f62 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1082,9 +1082,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 /*xmm8,xmm9,xmm10,xmm11,xmm12,xmm13,xmm14,xmm15*/		\
      6,   6,    6,    6,    6,    6,    6,    6,		\
 /*xmm16,xmm17,xmm18,xmm19,xmm20,xmm21,xmm22,xmm23*/		\
-     6,    6,     6,    6,    6,    6,    6,    6,		\
+     1,    1,     1,    1,    1,    1,    1,    1,		\
 /*xmm24,xmm25,xmm26,xmm27,xmm28,xmm29,xmm30,xmm31*/		\
-     6,    6,     6,    6,    6,    6,    6,    6,		\
+     1,    1,     1,    1,    1,    1,    1,    1,		\
  /* k0,  k1,  k2,  k3,  k4,  k5,  k6,  k7*/			\
      1,   1,   1,   1,   1,   1,   1,   1 }
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 2cfc06f5605..c7b8e6a585a 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,6 +1,12 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-08  Uroš Bizjak  <ubizjak@gmail.com>
+		    Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/65782
+	* gcc.target/i386/pr65782.c: New test.
+
 	2020-02-05  Jakub Jelinek  <jakub@redhat.com>
 
 	PR c++/93557
diff --git a/gcc/testsuite/gcc.target/i386/pr65782.c b/gcc/testsuite/gcc.target/i386/pr65782.c
new file mode 100644
index 00000000000..298dca1be97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65782.c
@@ -0,0 +1,16 @@
+/* PR target/65782 */
+/* { dg-do assemble { target { avx512vl && { ! ia32 } } } } */
+/* { dg-options "-O2 -mavx512vl" } */
+
+void
+foo (void)
+{
+  register double x __asm ("xmm14");
+  register double y __asm ("xmm18");
+  asm ("" : "=x" (x));
+  asm ("" : "=v" (y));
+  x += y;
+  y += x;
+  asm ("" : : "x" (x));
+  asm ("" : : "v" (y));
+}
-- 
2.20.1
From b7cbce7a174292adc7c9d6db81bba6922a591d69 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Mon, 10 Feb 2020 22:44:40 +0100
Subject: [PATCH 11/15] i386: Fix -mavx -mno-mavx2 ICE with VEC_COND_EXPR
 [PR93637]

As mentioned in the PR, for -mavx -mno-avx2 the backend does support
vcondv4div4df and vcondv8siv8sf optabs (while generally 32-byte vectors
aren't much supported in that case, it is performed using
vandps/vandnps/vorps).  The problem is that after the last generic vector
lowering (where the VEC_COND_EXPR still compares two V4DF vectors and
has two V4DI last operands and V4DI result and so is considered ok) fre4
folds the condition into constant, at which point the middle-end during
expansion will try vcond_mask_optab and fall back to trying to expand it
as the constant vector < 0 vcondv4div4di, but neither of them is supported
for -mavx -mno-avx2 and thus we ICE.

So, the options I see is either what the following patch does, also support
vcond_mask_v4div4di and vcond_mask_v4siv4si already for TARGET_AVX, or
require for vcondv4div4df and vcondv8siv8sf TARGET_AVX2 rather than current
TARGET_AVX.

2020-02-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/93637
	* config/i386/sse.md (VI_256_AVX2): New mode iterator.
	(vcond_mask_<mode><sseintvecmodelower>): Use it instead of VI_256.
	Change condition from TARGET_AVX2 to TARGET_AVX.

	* gcc.target/i386/avx-pr93637.c: New test.
---
 gcc/ChangeLog                               |  7 +++++++
 gcc/config/i386/sse.md                      | 16 +++++++++++-----
 gcc/testsuite/ChangeLog                     |  5 +++++
 gcc/testsuite/gcc.target/i386/avx-pr93637.c | 17 +++++++++++++++++
 4 files changed, 40 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx-pr93637.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 72c8ee6bd67..34091354380 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,6 +1,13 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-10  Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/93637
+	* config/i386/sse.md (VI_256_AVX2): New mode iterator.
+	(vcond_mask_<mode><sseintvecmodelower>): Use it instead of VI_256.
+	Change condition from TARGET_AVX2 to TARGET_AVX.
+
 	2020-02-08  Uroš Bizjak  <ubizjak@gmail.com>
 		    Jakub Jelinek  <jakub@redhat.com>
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 659cbff22f3..c0fe0eefd50 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -3180,13 +3180,19 @@
 	  (match_operand:<avx512fmaskmode> 3 "register_operand")))]
   "TARGET_AVX512BW")
 
+;; As vcondv4div4df and vcondv8siv8sf are enabled already with TARGET_AVX,
+;; and their condition can be folded late into a constant, we need to
+;; support vcond_mask_v4div4di and vcond_mask_v8siv8si for TARGET_AVX.
+(define_mode_iterator VI_256_AVX2 [(V32QI "TARGET_AVX2") (V16HI "TARGET_AVX2")
+				   V8SI V4DI])
+
 (define_expand "vcond_mask_<mode><sseintvecmodelower>"
-  [(set (match_operand:VI_256 0 "register_operand")
-	(vec_merge:VI_256
-	  (match_operand:VI_256 1 "nonimmediate_operand")
-	  (match_operand:VI_256 2 "nonimm_or_0_operand")
+  [(set (match_operand:VI_256_AVX2 0 "register_operand")
+	(vec_merge:VI_256_AVX2
+	  (match_operand:VI_256_AVX2 1 "nonimmediate_operand")
+	  (match_operand:VI_256_AVX2 2 "nonimm_or_0_operand")
 	  (match_operand:<sseintvecmode> 3 "register_operand")))]
-  "TARGET_AVX2"
+  "TARGET_AVX"
 {
   ix86_expand_sse_movcc (operands[0], operands[3],
 			 operands[1], operands[2]);
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index c7b8e6a585a..9ec4d50dac3 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,6 +1,11 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-10  Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/93637
+	* gcc.target/i386/avx-pr93637.c: New test.
+
 	2020-02-08  Uroš Bizjak  <ubizjak@gmail.com>
 		    Jakub Jelinek  <jakub@redhat.com>
 
diff --git a/gcc/testsuite/gcc.target/i386/avx-pr93637.c b/gcc/testsuite/gcc.target/i386/avx-pr93637.c
new file mode 100644
index 00000000000..9e7a0a7c9c1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx-pr93637.c
@@ -0,0 +1,17 @@
+/* PR target/93637 */
+/* { dg-do compile } */
+/* { dg-options "-mavx -mno-avx2 -O3 --param sccvn-max-alias-queries-per-access=3" } */
+
+double
+foo (void)
+{
+  int i;
+  double r = 7.0;
+  double a[] = { 0.0, 0.0, -0.0, 0.0, 0.0, -0.0, 1.0, 0.0, 0.0, -0.0, 1.0, 0.0, 1.0, 1.0 };
+
+  for (i = 0; i < sizeof (a) / sizeof (a[0]); ++i)
+    if (a[i] == 0.0)
+      r = a[i];
+
+  return r;
+}
-- 
2.20.1
From 20ac13c895c5abe7a350de0b664abf190aa28a16 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Wed, 12 Feb 2020 11:58:35 +0100
Subject: [PATCH 12/15] i386: Fix up vec_extract_lo* patterns [PR93670]

The VEXTRACT* insns have way too many different CPUID feature flags (ATT
syntax)
vextractf128 $imm, %ymm, %xmm/mem		AVX
vextracti128 $imm, %ymm, %xmm/mem		AVX2
vextract{f,i}32x4 $imm, %ymm, %xmm/mem {k}{z}	AVX512VL+AVX512F
vextract{f,i}32x4 $imm, %zmm, %xmm/mem {k}{z}	AVX512F
vextract{f,i}64x2 $imm, %ymm, %xmm/mem {k}{z}	AVX512VL+AVX512DQ
vextract{f,i}64x2 $imm, %zmm, %xmm/mem {k}{z}	AVX512DQ
vextract{f,i}32x8 $imm, %zmm, %ymm/mem {k}{z}	AVX512DQ
vextract{f,i}64x4 $imm, %zmm, %ymm/mem {k}{z}	AVX512F

As the testcase shows and the patch too, we didn't get it right in all
cases.

The first hunk is about avx512vl_vextractf128v8s[if] incorrectly
requiring TARGET_AVX512DQ.  The corresponding insn is the first
vextract{f,i}32x4 above, so it requires VL+F, and the builtins have it
correct (TARGET_AVX512VL implies TARGET_AVX512F):
BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8sf, "__builtin_ia32_extractf32x4_256_mask", IX86_BUILTIN_EXTRACTF32X4_256, UNKNOWN, (int) V4SF_FTYPE_V8SF_INT_V4SF_UQI)
BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8si, "__builtin_ia32_extracti32x4_256_mask", IX86_BUILTIN_EXTRACTI32X4_256, UNKNOWN, (int) V4SI_FTYPE_V8SI_INT_V4SI_UQI)
We only need TARGET_AVX512DQ for avx512vl_vextractf128v4d[if].

The second hunk is about vec_extract_lo_v16s[if]{,_mask}.  These are using
the vextract{f,i}32x8 insns (AVX512DQ above), but we weren't requiring that,
but instead incorrectly && 1 for non-masked and && (64 == 64 && TARGET_AVX512VL)
for masked insns.  This is extraction from ZMM, so it doesn't need VL for
anything.  The hunk actually only requires TARGET_AVX512DQ when the insn
is masked, if it is not masked, when TARGET_AVX512DQ isn't available we can
use vextract{f,i}64x4 instead which is available already in TARGET_AVX512F
and does the same thing, extracts the low 256 bits from 512 bits vector
(often we split it into just nothing, but there are some special cases like
when using xmm16+ when we can't without AVX512VL).

The last hunk is about vec_extract_lo_v8s[if]{,_mask}.  The non-_mask
suffixed ones are ok already and just split into nothing (lowpart subreg).
The masked ones were incorrectly requiring TARGET_AVX512VL and
TARGET_AVX512DQ, when we only need TARGET_AVX512VL.

2020-02-12  Jakub Jelinek  <jakub@redhat.com>

	PR target/93670
	* config/i386/sse.md (VI48F_256_DQ): New mode iterator.
	(avx512vl_vextractf128<mode>): Use it instead of VI48F_256.  Remove
	TARGET_AVX512DQ from condition.
	(vec_extract_lo_<mode><mask_name>): Use <mask_avx512dq_condition>
	instead of <mask_mode512bit_condition> in condition.  If
	TARGET_AVX512DQ is false, emit vextract*64x4 instead of
	vextract*32x8.
	(vec_extract_lo_<mode><mask_name>): Drop <mask_avx512dq_condition>
	from condition.

	* gcc.target/i386/avx512vl-pr93670.c: New test.
---
 gcc/ChangeLog                                 | 13 ++++
 gcc/config/i386/sse.md                        | 18 +++--
 gcc/testsuite/ChangeLog                       |  5 ++
 .../gcc.target/i386/avx512vl-pr93670.c        | 77 +++++++++++++++++++
 4 files changed, 108 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-pr93670.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 34091354380..ed83af1fcd3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,6 +1,19 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-12  Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/93670
+	* config/i386/sse.md (VI48F_256_DQ): New mode iterator.
+	(avx512vl_vextractf128<mode>): Use it instead of VI48F_256.  Remove
+	TARGET_AVX512DQ from condition.
+	(vec_extract_lo_<mode><mask_name>): Use <mask_avx512dq_condition>
+	instead of <mask_mode512bit_condition> in condition.  If
+	TARGET_AVX512DQ is false, emit vextract*64x4 instead of
+	vextract*32x8.
+	(vec_extract_lo_<mode><mask_name>): Drop <mask_avx512dq_condition>
+	from condition.
+
 	2020-02-10  Jakub Jelinek  <jakub@redhat.com>
 
 	PR target/93637
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c0fe0eefd50..043665f5e8b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8220,13 +8220,16 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
+(define_mode_iterator VI48F_256_DQ
+  [V8SI V8SF (V4DI "TARGET_AVX512DQ") (V4DF "TARGET_AVX512DQ")])
+
 (define_expand "avx512vl_vextractf128<mode>"
   [(match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
-   (match_operand:VI48F_256 1 "register_operand")
+   (match_operand:VI48F_256_DQ 1 "register_operand")
    (match_operand:SI 2 "const_0_to_1_operand")
    (match_operand:<ssehalfvecmode> 3 "nonimm_or_0_operand")
    (match_operand:QI 4 "register_operand")]
-  "TARGET_AVX512DQ && TARGET_AVX512VL"
+  "TARGET_AVX512VL"
 {
   rtx (*insn)(rtx, rtx, rtx, rtx);
   rtx dest = operands[0];
@@ -8294,14 +8297,19 @@
                      (const_int 4) (const_int 5)
                      (const_int 6) (const_int 7)])))]
   "TARGET_AVX512F
-   && <mask_mode512bit_condition>
+   && <mask_avx512dq_condition>
    && (<mask_applied> || !(MEM_P (operands[0]) && MEM_P (operands[1])))"
 {
   if (<mask_applied>
       || (!TARGET_AVX512VL
 	  && !REG_P (operands[0])
 	  && EXT_REX_SSE_REG_P (operands[1])))
-    return "vextract<shuffletype>32x8\t{$0x0, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x0}";
+    {
+      if (TARGET_AVX512DQ)
+	return "vextract<shuffletype>32x8\t{$0x0, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x0}";
+      else
+	return "vextract<shuffletype>64x4\t{$0x0, %1, %0|%0, %1, 0x0}";
+    }
   else
     return "#";
 }
@@ -8411,7 +8419,7 @@
 	  (parallel [(const_int 0) (const_int 1)
 		     (const_int 2) (const_int 3)])))]
   "TARGET_AVX
-   && <mask_avx512vl_condition> && <mask_avx512dq_condition>
+   && <mask_avx512vl_condition>
    && (<mask_applied> || !(MEM_P (operands[0]) && MEM_P (operands[1])))"
 {
   if (<mask_applied>)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9ec4d50dac3..5894d94ece7 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,6 +1,11 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
 	Backported from mainline
+	2020-02-12  Jakub Jelinek  <jakub@redhat.com>
+
+	PR target/93670
+	* gcc.target/i386/avx512vl-pr93670.c: New test.
+
 	2020-02-10  Jakub Jelinek  <jakub@redhat.com>
 
 	PR target/93637
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr93670.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr93670.c
new file mode 100644
index 00000000000..3f232a96901
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-pr93670.c
@@ -0,0 +1,77 @@
+/* PR target/93670 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vl -mno-avx512dq" } */
+
+#include <x86intrin.h>
+
+__m128i
+f1 (__m256i x)
+{
+  return _mm256_extracti32x4_epi32 (x, 0);
+}
+
+__m128i
+f2 (__m256i x, __m128i w, __mmask8 m)
+{
+  return _mm256_mask_extracti32x4_epi32 (w, m, x, 0);
+}
+
+__m128i
+f3 (__m256i x, __mmask8 m)
+{
+  return _mm256_maskz_extracti32x4_epi32 (m, x, 0);
+}
+
+__m128
+f4 (__m256 x)
+{
+  return _mm256_extractf32x4_ps (x, 0);
+}
+
+__m128
+f5 (__m256 x, __m128 w, __mmask8 m)
+{
+  return _mm256_mask_extractf32x4_ps (w, m, x, 0);
+}
+
+__m128
+f6 (__m256 x, __mmask8 m)
+{
+  return _mm256_maskz_extractf32x4_ps (m, x, 0);
+}
+
+__m128i
+f7 (__m256i x)
+{
+  return _mm256_extracti32x4_epi32 (x, 1);
+}
+
+__m128i
+f8 (__m256i x, __m128i w, __mmask8 m)
+{
+  return _mm256_mask_extracti32x4_epi32 (w, m, x, 1);
+}
+
+__m128i
+f9 (__m256i x, __mmask8 m)
+{
+  return _mm256_maskz_extracti32x4_epi32 (m, x, 1);
+}
+
+__m128
+f10 (__m256 x)
+{
+  return _mm256_extractf32x4_ps (x, 1);
+}
+
+__m128
+f11 (__m256 x, __m128 w, __mmask8 m)
+{
+  return _mm256_mask_extractf32x4_ps (w, m, x, 1);
+}
+
+__m128
+f12 (__m256 x, __mmask8 m)
+{
+  return _mm256_maskz_extractf32x4_ps (m, x, 1);
+}
-- 
2.20.1
From 488a947b2ddd57a6f44a6aecc32862f8cbf4ec77 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Thu, 13 Feb 2020 08:17:07 +0100
Subject: [PATCH 13/15] i386: Fix k*shift* intrinsics [PR93673]

As mentioned in the PR, the intrinsics allow counts from 0 to 255, but
we actually reject values from 128 to 255.  That is because QImode
CONST_INTs can be only -128 to 127.  Fixed by using const_0_to_255_operand
and dropping the modes for the operands with those predicates
(the IL actually contains the CONST_INT which has VOIDmode).

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

	PR target/93673
	* config/i386/sse.md (k<code><mode>): Drop mode from last operand and
	use const_0_to_255_operand predicate instead of immediate_operand.
	(avx512dq_fpclass<mode><mask_scalar_merge_name>,
	avx512dq_vmfpclass<mode><mask_scalar_merge_name>,
	vgf2p8affineinvqb_<mode><mask_name>,
	vgf2p8affineqb_<mode><mask_name>): Drop mode from
	const_0_to_255_operand predicated operands.

	* gcc.target/i386/avx512f-pr93673.c: New test.
	* gcc.target/i386/avx512dq-pr93673.c: New test.
	* gcc.target/i386/avx512bw-pr93673.c: New test.
---
 gcc/ChangeLog                                 |  9 ++++++
 gcc/config/i386/sse.md                        | 10 +++----
 gcc/testsuite/ChangeLog                       |  5 ++++
 .../gcc.target/i386/avx512bw-pr93673.c        | 30 +++++++++++++++++++
 .../gcc.target/i386/avx512dq-pr93673.c        | 20 +++++++++++++
 .../gcc.target/i386/avx512f-pr93673.c         | 20 +++++++++++++
 6 files changed, 89 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bw-pr93673.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512dq-pr93673.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-pr93673.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ed83af1fcd3..3c48d574029 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,14 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
+	PR target/93673
+	* config/i386/sse.md (k<code><mode>): Drop mode from last operand and
+	use const_0_to_255_operand predicate instead of immediate_operand.
+	(avx512dq_fpclass<mode><mask_scalar_merge_name>,
+	avx512dq_vmfpclass<mode><mask_scalar_merge_name>,
+	vgf2p8affineinvqb_<mode><mask_name>,
+	vgf2p8affineqb_<mode><mask_name>): Drop mode from
+	const_0_to_255_operand predicated operands.
+
 	Backported from mainline
 	2020-02-12  Jakub Jelinek  <jakub@redhat.com>
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 043665f5e8b..18cc39ae521 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1610,7 +1610,7 @@
   [(set (match_operand:SWI1248_AVX512BWDQ 0 "register_operand" "=k")
 	(any_lshift:SWI1248_AVX512BWDQ
 	  (match_operand:SWI1248_AVX512BWDQ 1 "register_operand" "k")
-	  (match_operand:QI 2 "immediate_operand" "n")))
+	  (match_operand 2 "const_0_to_255_operand" "n")))
    (unspec [(const_int 0)] UNSPEC_MASKOP)]
   "TARGET_AVX512F"
   "k<mshift><mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
@@ -21130,7 +21130,7 @@
   [(set (match_operand:<avx512fmaskmode> 0 "register_operand" "=k")
           (unspec:<avx512fmaskmode>
             [(match_operand:VF_AVX512VL 1 "register_operand" "v")
-             (match_operand:QI 2 "const_0_to_255_operand" "n")]
+             (match_operand 2 "const_0_to_255_operand" "n")]
              UNSPEC_FPCLASS))]
    "TARGET_AVX512DQ"
    "vfpclass<ssemodesuffix>\t{%2, %1, %0<mask_scalar_merge_operand3>|%0<mask_scalar_merge_operand3>, %1, %2}";
@@ -21144,7 +21144,7 @@
 	(and:<avx512fmaskmode>
 	  (unspec:<avx512fmaskmode>
 	    [(match_operand:VF_128 1 "register_operand" "v")
-             (match_operand:QI 2 "const_0_to_255_operand" "n")]
+             (match_operand 2 "const_0_to_255_operand" "n")]
 	    UNSPEC_FPCLASS)
 	  (const_int 1)))]
    "TARGET_AVX512DQ"
@@ -21749,7 +21749,7 @@
 	(unspec:VI1_AVX512F
 	  [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
 	   (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")
-	   (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
+	   (match_operand 3 "const_0_to_255_operand" "n,n,n")]
 	  UNSPEC_GF2P8AFFINEINV))]
   "TARGET_GFNI"
   "@
@@ -21767,7 +21767,7 @@
 	(unspec:VI1_AVX512F
 	  [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
 	   (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")
-	   (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
+	   (match_operand 3 "const_0_to_255_operand" "n,n,n")]
 	  UNSPEC_GF2P8AFFINE))]
   "TARGET_GFNI"
   "@
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 5894d94ece7..899255a84c6 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,10 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
+	PR target/93673
+	* gcc.target/i386/avx512f-pr93673.c: New test.
+	* gcc.target/i386/avx512dq-pr93673.c: New test.
+	* gcc.target/i386/avx512bw-pr93673.c: New test.
+
 	Backported from mainline
 	2020-02-12  Jakub Jelinek  <jakub@redhat.com>
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-pr93673.c b/gcc/testsuite/gcc.target/i386/avx512bw-pr93673.c
new file mode 100644
index 00000000000..dc87ed20d1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-pr93673.c
@@ -0,0 +1,30 @@
+/* PR target/93673 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512bw" } */
+
+#include <x86intrin.h>
+
+void
+foo (__mmask32 *c, __mmask64 *d)
+{
+  c[0] = _kshiftli_mask32 (c[0], 0);
+  c[1] = _kshiftri_mask32 (c[1], 0);
+  c[2] = _kshiftli_mask32 (c[2], 1);
+  c[3] = _kshiftri_mask32 (c[3], 1);
+  c[4] = _kshiftli_mask32 (c[4], 31);
+  c[5] = _kshiftri_mask32 (c[5], 31);
+  c[6] = _kshiftli_mask32 (c[6], 0x7f);
+  c[7] = _kshiftri_mask32 (c[7], 0x7f);
+  c[8] = _kshiftli_mask32 (c[8], 0xff);
+  c[9] = _kshiftri_mask32 (c[9], 0xff);
+  d[0] = _kshiftli_mask64 (d[0], 0);
+  d[1] = _kshiftri_mask64 (d[1], 0);
+  d[2] = _kshiftli_mask64 (d[2], 1);
+  d[3] = _kshiftri_mask64 (d[3], 1);
+  d[4] = _kshiftli_mask64 (d[4], 63);
+  d[5] = _kshiftri_mask64 (d[5], 63);
+  d[6] = _kshiftli_mask64 (d[6], 0x7f);
+  d[7] = _kshiftri_mask64 (d[7], 0x7f);
+  d[8] = _kshiftli_mask64 (d[8], 0xff);
+  d[9] = _kshiftri_mask64 (d[9], 0xff);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-pr93673.c b/gcc/testsuite/gcc.target/i386/avx512dq-pr93673.c
new file mode 100644
index 00000000000..3ae1674e4a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-pr93673.c
@@ -0,0 +1,20 @@
+/* PR target/93673 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512dq" } */
+
+#include <x86intrin.h>
+
+void
+foo (__mmask8 *a)
+{
+  a[0] = _kshiftli_mask8 (a[0], 0);
+  a[1] = _kshiftri_mask8 (a[1], 0);
+  a[2] = _kshiftli_mask8 (a[2], 1);
+  a[3] = _kshiftri_mask8 (a[3], 1);
+  a[4] = _kshiftli_mask8 (a[4], 7);
+  a[5] = _kshiftri_mask8 (a[5], 7);
+  a[6] = _kshiftli_mask8 (a[6], 0x7f);
+  a[7] = _kshiftri_mask8 (a[7], 0x7f);
+  a[8] = _kshiftli_mask8 (a[8], 0xff);
+  a[9] = _kshiftri_mask8 (a[9], 0xff);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-pr93673.c b/gcc/testsuite/gcc.target/i386/avx512f-pr93673.c
new file mode 100644
index 00000000000..963823c8a78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-pr93673.c
@@ -0,0 +1,20 @@
+/* PR target/93673 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f" } */
+
+#include <x86intrin.h>
+
+void
+foo (__mmask16 *b)
+{
+  b[0] = _kshiftli_mask16 (b[0], 0);
+  b[1] = _kshiftri_mask16 (b[1], 0);
+  b[2] = _kshiftli_mask16 (b[2], 1);
+  b[3] = _kshiftri_mask16 (b[3], 1);
+  b[4] = _kshiftli_mask16 (b[4], 15);
+  b[5] = _kshiftri_mask16 (b[5], 15);
+  b[6] = _kshiftli_mask16 (b[6], 0x7f);
+  b[7] = _kshiftri_mask16 (b[7], 0x7f);
+  b[8] = _kshiftli_mask16 (b[8], 0xff);
+  b[9] = _kshiftri_mask16 (b[9], 0xff);
+}
-- 
2.20.1
From 08cf145f991327d943d785066709f5f39d20bd85 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Thu, 13 Feb 2020 10:43:27 +0100
Subject: [PATCH 14/15] i386: Fix up _mm*_mask_popcnt_epi* [PR93696]

As mentioned in the PR and as
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mask_popcnt_epi
also documents, _mm*_popcnt_epi* intrinsics are consistent with all other
unary AVX512* intrinsics regarding arguments, i.e. the
_mm*_whatever has just single argument (called a in the docs, and __A in the
GCC headers),
_mm*_mask_whatever has 3 arguments (called src, k, a in the docs and
_W, __U, __A in GCC headers) and
_mm*_maskz_whatever 2 arguments (called k, a in the docs and __U, __A in GCC
headers).  Unfortunately, whomever implemented the _mm*_popcnt_epi*
intrinsics got it wrong for the _mm*_mask_popcnt_epi* ones, calling the
args __A, __U, __B and not passing them in the canonical order to the
builtins, making it API incompatible with ICC as well as clang (tested on
godbolts clang 7/8/9/trunk and ICC 19.0.{0,1}, older clang/ICC don't
understand those, so it isn't that it used to be broken even in other
compilers and got changed afterwards).

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

	PR target/93696
	* config/i386/avx512bitalgintrin.h (_mm512_mask_popcnt_epi8,
	_mm512_mask_popcnt_epi16, _mm256_mask_popcnt_epi8,
	_mm256_mask_popcnt_epi16, _mm_mask_popcnt_epi8,
	_mm_mask_popcnt_epi16): Rename __B argument to __A and __A to __W,
	pass __A to the builtin followed by __W instead of __A followed by
	__B.
	* config/i386/avx512vpopcntdqintrin.h (_mm512_mask_popcnt_epi32,
	_mm512_mask_popcnt_epi64): Likewise.
	* config/i386/avx512vpopcntdqvlintrin.h (_mm_mask_popcnt_epi32,
	_mm256_mask_popcnt_epi32, _mm_mask_popcnt_epi64,
	_mm256_mask_popcnt_epi64): Likewise.

	* gcc.target/i386/pr93696-1.c: New test.
	* gcc.target/i386/pr93696-2.c: New test.
	* gcc.target/i386/avx512bitalg-vpopcntw-1.c (TEST): Fix argument order
	of _mm*_mask_popcnt_*.
	* gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c (TEST): Likewise.
	* gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c (TEST): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntb-1.c (TEST): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntb.c (foo): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntbvl.c (foo): Likewise.
	* gcc.target/i386/avx512vpopcntdq-vpopcntd.c (foo): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntwvl.c (foo): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntw.c (foo): Likewise.
	* gcc.target/i386/avx512vpopcntdq-vpopcntq.c (foo): Likewise.
---
 gcc/ChangeLog                                 | 13 +++
 gcc/config/i386/avx512bitalgintrin.h          | 24 +++---
 gcc/config/i386/avx512vpopcntdqintrin.h       |  8 +-
 gcc/config/i386/avx512vpopcntdqvlintrin.h     | 17 ++--
 gcc/testsuite/ChangeLog                       | 15 ++++
 .../gcc.target/i386/avx512bitalg-vpopcntb-1.c |  2 +-
 .../gcc.target/i386/avx512bitalg-vpopcntb.c   |  2 +-
 .../gcc.target/i386/avx512bitalg-vpopcntbvl.c |  4 +-
 .../gcc.target/i386/avx512bitalg-vpopcntw-1.c |  2 +-
 .../gcc.target/i386/avx512bitalg-vpopcntw.c   |  2 +-
 .../gcc.target/i386/avx512bitalg-vpopcntwvl.c |  4 +-
 .../i386/avx512vpopcntdq-vpopcntd-1.c         |  2 +-
 .../i386/avx512vpopcntdq-vpopcntd.c           |  6 +-
 .../i386/avx512vpopcntdq-vpopcntq-1.c         |  2 +-
 .../i386/avx512vpopcntdq-vpopcntq.c           |  6 +-
 gcc/testsuite/gcc.target/i386/pr93696-1.c     | 79 +++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr93696-2.c     | 79 +++++++++++++++++++
 17 files changed, 226 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93696-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93696-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3c48d574029..b24a416a417 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,18 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
+	PR target/93696
+	* config/i386/avx512bitalgintrin.h (_mm512_mask_popcnt_epi8,
+	_mm512_mask_popcnt_epi16, _mm256_mask_popcnt_epi8,
+	_mm256_mask_popcnt_epi16, _mm_mask_popcnt_epi8,
+	_mm_mask_popcnt_epi16): Rename __B argument to __A and __A to __W,
+	pass __A to the builtin followed by __W instead of __A followed by
+	__B.
+	* config/i386/avx512vpopcntdqintrin.h (_mm512_mask_popcnt_epi32,
+	_mm512_mask_popcnt_epi64): Likewise.
+	* config/i386/avx512vpopcntdqvlintrin.h (_mm_mask_popcnt_epi32,
+	_mm256_mask_popcnt_epi32, _mm_mask_popcnt_epi64,
+	_mm256_mask_popcnt_epi64): Likewise.
+
 	PR target/93673
 	* config/i386/sse.md (k<code><mode>): Drop mode from last operand and
 	use const_0_to_255_operand predicate instead of immediate_operand.
diff --git a/gcc/config/i386/avx512bitalgintrin.h b/gcc/config/i386/avx512bitalgintrin.h
index 8b4fc8e3f67..8ad8586cc59 100644
--- a/gcc/config/i386/avx512bitalgintrin.h
+++ b/gcc/config/i386/avx512bitalgintrin.h
@@ -61,10 +61,10 @@ _mm512_popcnt_epi16 (__m512i __A)
 
 extern __inline __m512i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_mask_popcnt_epi8 (__m512i __A, __mmask64 __U, __m512i __B)
+_mm512_mask_popcnt_epi8 (__m512i __W, __mmask64 __U, __m512i __A)
 {
   return (__m512i) __builtin_ia32_vpopcountb_v64qi_mask ((__v64qi) __A,
-							 (__v64qi) __B,
+							 (__v64qi) __W,
 							 (__mmask64) __U);
 }
 
@@ -79,10 +79,10 @@ _mm512_maskz_popcnt_epi8 (__mmask64 __U, __m512i __A)
 }
 extern __inline __m512i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_mask_popcnt_epi16 (__m512i __A, __mmask32 __U, __m512i __B)
+_mm512_mask_popcnt_epi16 (__m512i __W, __mmask32 __U, __m512i __A)
 {
   return (__m512i) __builtin_ia32_vpopcountw_v32hi_mask ((__v32hi) __A,
-							(__v32hi) __B,
+							(__v32hi) __W,
 							(__mmask32) __U);
 }
 
@@ -127,10 +127,10 @@ _mm512_mask_bitshuffle_epi64_mask (__mmask64 __M, __m512i __A, __m512i __B)
 
 extern __inline __m256i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mask_popcnt_epi8 (__m256i __A, __mmask32 __U, __m256i __B)
+_mm256_mask_popcnt_epi8 (__m256i __W, __mmask32 __U, __m256i __A)
 {
   return (__m256i) __builtin_ia32_vpopcountb_v32qi_mask ((__v32qi) __A,
-							 (__v32qi) __B,
+							 (__v32qi) __W,
 							 (__mmask32) __U);
 }
 
@@ -222,10 +222,10 @@ _mm_popcnt_epi16 (__m128i __A)
 
 extern __inline __m256i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mask_popcnt_epi16 (__m256i __A, __mmask16 __U, __m256i __B)
+_mm256_mask_popcnt_epi16 (__m256i __W, __mmask16 __U, __m256i __A)
 {
   return (__m256i) __builtin_ia32_vpopcountw_v16hi_mask ((__v16hi) __A,
-							(__v16hi) __B,
+							(__v16hi) __W,
 							(__mmask16) __U);
 }
 
@@ -241,10 +241,10 @@ _mm256_maskz_popcnt_epi16 (__mmask16 __U, __m256i __A)
 
 extern __inline __m128i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_popcnt_epi8 (__m128i __A, __mmask16 __U, __m128i __B)
+_mm_mask_popcnt_epi8 (__m128i __W, __mmask16 __U, __m128i __A)
 {
   return (__m128i) __builtin_ia32_vpopcountb_v16qi_mask ((__v16qi) __A,
-							 (__v16qi) __B,
+							 (__v16qi) __W,
 							 (__mmask16) __U);
 }
 
@@ -259,10 +259,10 @@ _mm_maskz_popcnt_epi8 (__mmask16 __U, __m128i __A)
 }
 extern __inline __m128i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_popcnt_epi16 (__m128i __A, __mmask8 __U, __m128i __B)
+_mm_mask_popcnt_epi16 (__m128i __W, __mmask8 __U, __m128i __A)
 {
   return (__m128i) __builtin_ia32_vpopcountw_v8hi_mask ((__v8hi) __A,
-							(__v8hi) __B,
+							(__v8hi) __W,
 							(__mmask8) __U);
 }
 
diff --git a/gcc/config/i386/avx512vpopcntdqintrin.h b/gcc/config/i386/avx512vpopcntdqintrin.h
index 3569430baa7..119bdf41887 100644
--- a/gcc/config/i386/avx512vpopcntdqintrin.h
+++ b/gcc/config/i386/avx512vpopcntdqintrin.h
@@ -43,10 +43,10 @@ _mm512_popcnt_epi32 (__m512i __A)
 
 extern __inline __m512i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_mask_popcnt_epi32 (__m512i __A, __mmask16 __U, __m512i __B)
+_mm512_mask_popcnt_epi32 (__m512i __W, __mmask16 __U, __m512i __A)
 {
   return (__m512i) __builtin_ia32_vpopcountd_v16si_mask ((__v16si) __A,
-							 (__v16si) __B,
+							 (__v16si) __W,
 							 (__mmask16) __U);
 }
 
@@ -69,10 +69,10 @@ _mm512_popcnt_epi64 (__m512i __A)
 
 extern __inline __m512i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_mask_popcnt_epi64 (__m512i __A, __mmask8 __U, __m512i __B)
+_mm512_mask_popcnt_epi64 (__m512i __W, __mmask8 __U, __m512i __A)
 {
   return (__m512i) __builtin_ia32_vpopcountq_v8di_mask ((__v8di) __A,
-							(__v8di) __B,
+							(__v8di) __W,
 							(__mmask8) __U);
 }
 
diff --git a/gcc/config/i386/avx512vpopcntdqvlintrin.h b/gcc/config/i386/avx512vpopcntdqvlintrin.h
index b974d09338a..2765592f0ae 100644
--- a/gcc/config/i386/avx512vpopcntdqvlintrin.h
+++ b/gcc/config/i386/avx512vpopcntdqvlintrin.h
@@ -43,10 +43,10 @@ _mm_popcnt_epi32 (__m128i __A)
 
 extern __inline __m128i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_popcnt_epi32 (__m128i __A, __mmask16 __U, __m128i __B)
+_mm_mask_popcnt_epi32 (__m128i __W, __mmask16 __U, __m128i __A)
 {
   return (__m128i) __builtin_ia32_vpopcountd_v4si_mask ((__v4si) __A,
-							 (__v4si) __B,
+							 (__v4si) __W,
 							 (__mmask16) __U);
 }
 
@@ -69,10 +69,10 @@ _mm256_popcnt_epi32 (__m256i __A)
 
 extern __inline __m256i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mask_popcnt_epi32 (__m256i __A, __mmask16 __U, __m256i __B)
+_mm256_mask_popcnt_epi32 (__m256i __W, __mmask16 __U, __m256i __A)
 {
   return (__m256i) __builtin_ia32_vpopcountd_v8si_mask ((__v8si) __A,
-							 (__v8si) __B,
+							 (__v8si) __W,
 							 (__mmask16) __U);
 }
 
@@ -95,10 +95,10 @@ _mm_popcnt_epi64 (__m128i __A)
 
 extern __inline __m128i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_popcnt_epi64 (__m128i __A, __mmask8 __U, __m128i __B)
+_mm_mask_popcnt_epi64 (__m128i __W, __mmask8 __U, __m128i __A)
 {
   return (__m128i) __builtin_ia32_vpopcountq_v2di_mask ((__v2di) __A,
-							(__v2di) __B,
+							(__v2di) __W,
 							(__mmask8) __U);
 }
 
@@ -121,10 +121,10 @@ _mm256_popcnt_epi64 (__m256i __A)
 
 extern __inline __m256i
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm256_mask_popcnt_epi64 (__m256i __A, __mmask8 __U, __m256i __B)
+_mm256_mask_popcnt_epi64 (__m256i __W, __mmask8 __U, __m256i __A)
 {
   return (__m256i) __builtin_ia32_vpopcountq_v4di_mask ((__v4di) __A,
-							(__v4di) __B,
+							(__v4di) __W,
 							(__mmask8) __U);
 }
 
@@ -144,4 +144,3 @@ _mm256_maskz_popcnt_epi64 (__mmask8 __U, __m256i __A)
 #endif /* __DISABLE_AVX512VPOPCNTDQVL__ */
 
 #endif /* _AVX512VPOPCNTDQVLINTRIN_H_INCLUDED */
-
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 899255a84c6..114a1087936 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,20 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
+	PR target/93696
+	* gcc.target/i386/pr93696-1.c: New test.
+	* gcc.target/i386/pr93696-2.c: New test.
+	* gcc.target/i386/avx512bitalg-vpopcntw-1.c (TEST): Fix argument order
+	of _mm*_mask_popcnt_*.
+	* gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c (TEST): Likewise.
+	* gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c (TEST): Likewise.
+	* gcc.target/i386/avx512bitalg-vpopcntb-1.c (TEST): Likewise.
+	* gcc.target/i386/avx512bitalg-vpopcntb.c (foo): Likewise.
+	* gcc.target/i386/avx512bitalg-vpopcntbvl.c (foo): Likewise.
+	* gcc.target/i386/avx512vpopcntdq-vpopcntd.c (foo): Likewise.
+	* gcc.target/i386/avx512bitalg-vpopcntwvl.c (foo): Likewise.
+	* gcc.target/i386/avx512bitalg-vpopcntw.c (foo): Likewise.
+	* gcc.target/i386/avx512vpopcntdq-vpopcntq.c (foo): Likewise.
+
 	PR target/93673
 	* gcc.target/i386/avx512f-pr93673.c: New test.
 	* gcc.target/i386/avx512dq-pr93673.c: New test.
diff --git a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c
index 3dcd48f7e2a..697757b8b73 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c
@@ -41,7 +41,7 @@ TEST (void)
   }
 
   res1.x = INTRINSIC (_popcnt_epi8)       (src.x);
-  res2.x = INTRINSIC (_mask_popcnt_epi8)  (src.x, mask, src0.x);
+  res2.x = INTRINSIC (_mask_popcnt_epi8)  (src0.x, mask, src.x);
   res3.x = INTRINSIC (_maskz_popcnt_epi8) (mask, src.x);
 
   if (UNION_CHECK (AVX512F_LEN, i_b) (res1, res_ref))
diff --git a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb.c b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb.c
index b23da58dbaf..246f925eede 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb.c
@@ -13,7 +13,7 @@ int foo ()
   __mmask16 msk;
   __m512i c = _mm512_popcnt_epi8 (z);
   asm volatile ("" : "+v" (c));
-  c = _mm512_mask_popcnt_epi8 (z, msk, z1);
+  c = _mm512_mask_popcnt_epi8 (z1, msk, z);
   asm volatile ("" : "+v" (c));
   c = _mm512_maskz_popcnt_epi8 (msk, z);
   asm volatile ("" : "+v" (c));
diff --git a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntbvl.c b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntbvl.c
index e6d60f7596c..8c7f45fc5f7 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntbvl.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntbvl.c
@@ -18,13 +18,13 @@ int foo ()
   __mmask16 msk16;
   __m256i c256 = _mm256_popcnt_epi8 (y);
   asm volatile ("" : "+v" (c256));
-  c256 = _mm256_mask_popcnt_epi8 (y, msk32, y_1);
+  c256 = _mm256_mask_popcnt_epi8 (y_1, msk32, y);
   asm volatile ("" : "+v" (c256));
   c256 = _mm256_maskz_popcnt_epi8 (msk32, y);
   asm volatile ("" : "+v" (c256));
   __m128i c128 = _mm_popcnt_epi8 (x);
   asm volatile ("" : "+v" (c128));
-  c128 = _mm_mask_popcnt_epi8 (x, msk16, x_1);
+  c128 = _mm_mask_popcnt_epi8 (x_1, msk16, x);
   asm volatile ("" : "+v" (c128));
   c128 = _mm_maskz_popcnt_epi8 (msk16, x);
   asm volatile ("" : "+v" (c128));
diff --git a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c
index 4f866db2f7a..0a725fe012a 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c
@@ -41,7 +41,7 @@ TEST (void)
   }
 
   res1.x = INTRINSIC (_popcnt_epi16)       (src.x);
-  res2.x = INTRINSIC (_mask_popcnt_epi16)  (src.x, mask, src0.x);
+  res2.x = INTRINSIC (_mask_popcnt_epi16)  (src0.x, mask, src.x);
   res3.x = INTRINSIC (_maskz_popcnt_epi16) (mask, src.x);
 
   if (UNION_CHECK (AVX512F_LEN, i_w) (res1, res_ref))
diff --git a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw.c b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw.c
index 2c49583b597..90663f480fc 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw.c
@@ -13,7 +13,7 @@ int foo ()
   __mmask16 msk;
   __m512i c = _mm512_popcnt_epi16 (z);
   asm volatile ("" : "+v" (c));
-  c = _mm512_mask_popcnt_epi16 (z, msk, z1);
+  c = _mm512_mask_popcnt_epi16 (z1, msk, z);
   asm volatile ("" : "+v" (c));
   c = _mm512_maskz_popcnt_epi16 (msk, z);
   asm volatile ("" : "+v" (c));
diff --git a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntwvl.c b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntwvl.c
index b55adc6023a..3a646b57282 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntwvl.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntwvl.c
@@ -18,13 +18,13 @@ int foo ()
   __mmask8 msk8;
   __m256i c256 = _mm256_popcnt_epi16 (y);
   asm volatile ("" : "+v" (c256));
-  c256 = _mm256_mask_popcnt_epi16 (y, msk16, y_1);
+  c256 = _mm256_mask_popcnt_epi16 (y_1, msk16, y);
   asm volatile ("" : "+v" (c256));
   c256 = _mm256_maskz_popcnt_epi16 (msk16, y);
   asm volatile ("" : "+v" (c256));
   __m128i c128 = _mm_popcnt_epi16 (x);
   asm volatile ("" : "+v" (c128));
-  c128 = _mm_mask_popcnt_epi16 (x, msk8, x_1);
+  c128 = _mm_mask_popcnt_epi16 (x_1, msk8, x);
   asm volatile ("" : "+v" (c128));
   c128 = _mm_maskz_popcnt_epi16 (msk8, x);
   asm volatile ("" : "+v" (c128));
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c
index 245dcd4d534..e7d6bb4dd53 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c
@@ -40,7 +40,7 @@ TEST (void)
   }
 
   res1.x = INTRINSIC (_popcnt_epi32)       (src.x);
-  res2.x = INTRINSIC (_mask_popcnt_epi32)  (src.x, mask, src0.x);
+  res2.x = INTRINSIC (_mask_popcnt_epi32)  (src0.x, mask, src.x);
   res3.x = INTRINSIC (_maskz_popcnt_epi32) (mask, src.x);
 
   if (UNION_CHECK (AVX512F_LEN, i_d) (res1, res_ref))
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
index c70f226824e..b4d82f97032 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd.c
@@ -22,19 +22,19 @@ int foo ()
   __mmask8 msk8;
   __m128i a = _mm_popcnt_epi32 (x);
   asm volatile ("" : "+v" (a));
-  a = _mm_mask_popcnt_epi32 (x, msk8, x_1);
+  a = _mm_mask_popcnt_epi32 (x_1, msk8, x);
   asm volatile ("" : "+v" (a));
   a = _mm_maskz_popcnt_epi32 (msk8, x);
   asm volatile ("" : "+v" (a));
   __m256i b = _mm256_popcnt_epi32 (y);
   asm volatile ("" : "+v" (b));
-  b = _mm256_mask_popcnt_epi32 (y, msk8, y_1);
+  b = _mm256_mask_popcnt_epi32 (y_1, msk8, y);
   asm volatile ("" : "+v" (b));
   b = _mm256_maskz_popcnt_epi32 (msk8, y);
   asm volatile ("" : "+v" (b));
   __m512i c = _mm512_popcnt_epi32 (z);
   asm volatile ("" : "+v" (c));
-  c = _mm512_mask_popcnt_epi32 (z, msk, z_1);
+  c = _mm512_mask_popcnt_epi32 (z_1, msk, z);
   asm volatile ("" : "+v" (c));
   c = _mm512_maskz_popcnt_epi32 (msk, z);
   asm volatile ("" : "+v" (c));
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c
index 27555c496d6..2144cf32c0d 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c
@@ -40,7 +40,7 @@ TEST (void)
   }
 
   res1.x = INTRINSIC (_popcnt_epi64)       (src.x);
-  res2.x = INTRINSIC (_mask_popcnt_epi64)  (src.x, mask, src0.x);
+  res2.x = INTRINSIC (_mask_popcnt_epi64)  (src0.x, mask, src.x);
   res3.x = INTRINSIC (_maskz_popcnt_epi64) (mask, src.x);
 
   if (UNION_CHECK (AVX512F_LEN, i_q) (res1, res_ref))
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
index 9f400c005f3..e87d6c999b6 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq.c
@@ -21,19 +21,19 @@ int foo ()
   __mmask8 msk; 
   __m128i a = _mm_popcnt_epi64 (x);
   asm volatile ("" : "+v" (a));
-  a = _mm_mask_popcnt_epi64 (x, msk, x_1);
+  a = _mm_mask_popcnt_epi64 (x_1, msk, x);
   asm volatile ("" : "+v" (a));
   a = _mm_maskz_popcnt_epi64 (msk, x);
   asm volatile ("" : "+v" (a));
   __m256i b = _mm256_popcnt_epi64 (y);
   asm volatile ("" : "+v" (b));
-  b = _mm256_mask_popcnt_epi64 (y, msk, y_1);
+  b = _mm256_mask_popcnt_epi64 (y_1, msk, y);
   asm volatile ("" : "+v" (b));
   b = _mm256_maskz_popcnt_epi64 (msk, y);
   asm volatile ("" : "+v" (b));
   __m512i c = _mm512_popcnt_epi64 (z);
   asm volatile ("" : "+v" (c));
-  c = _mm512_mask_popcnt_epi64 (z, msk, z_1);
+  c = _mm512_mask_popcnt_epi64 (z_1, msk, z);
   asm volatile ("" : "+v" (c));
   c = _mm512_maskz_popcnt_epi64 (msk, z); 
   asm volatile ("" : "+v" (c));
diff --git a/gcc/testsuite/gcc.target/i386/pr93696-1.c b/gcc/testsuite/gcc.target/i386/pr93696-1.c
new file mode 100644
index 00000000000..128bb98c066
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr93696-1.c
@@ -0,0 +1,79 @@
+/* PR target/93696 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512bitalg -mavx512vpopcntdq -mavx512vl -mavx512bw -masm=att" } */
+/* { dg-final { scan-assembler-times "vpopcnt\[bwdq]\t%\[xyz]mm1, %\[xyz]mm0\{%k\[0-7]\}\[^\{]" 12 } } */
+/* { dg-final { scan-assembler-not "vmovdq\[au]\[0-9]" } } */
+
+#include <x86intrin.h>
+
+__m128i
+f1 (__m128i x, __mmask8 m, __m128i y)
+{
+  return _mm_mask_popcnt_epi64 (x, m, y);
+}
+
+__m128i
+f2 (__m128i x, __mmask8 m, __m128i y)
+{
+  return _mm_mask_popcnt_epi32 (x, m, y);
+}
+
+__m128i
+f3 (__m128i x, __mmask8 m, __m128i y)
+{
+  return _mm_mask_popcnt_epi16 (x, m, y);
+}
+
+__m128i
+f4 (__m128i x, __mmask16 m, __m128i y)
+{
+  return _mm_mask_popcnt_epi8 (x, m, y);
+}
+
+__m256i
+f5 (__m256i x, __mmask8 m, __m256i y)
+{
+  return _mm256_mask_popcnt_epi64 (x, m, y);
+}
+
+__m256i
+f6 (__m256i x, __mmask8 m, __m256i y)
+{
+  return _mm256_mask_popcnt_epi32 (x, m, y);
+}
+
+__m256i
+f7 (__m256i x, __mmask16 m, __m256i y)
+{
+  return _mm256_mask_popcnt_epi16 (x, m, y);
+}
+
+__m256i
+f8 (__m256i x, __mmask32 m, __m256i y)
+{
+  return _mm256_mask_popcnt_epi8 (x, m, y);
+}
+
+__m512i
+f9 (__m512i x, __mmask8 m, __m512i y)
+{
+  return _mm512_mask_popcnt_epi64 (x, m, y);
+}
+
+__m512i
+f10 (__m512i x, __mmask16 m, __m512i y)
+{
+  return _mm512_mask_popcnt_epi32 (x, m, y);
+}
+
+__m512i
+f11 (__m512i x, __mmask32 m, __m512i y)
+{
+  return _mm512_mask_popcnt_epi16 (x, m, y);
+}
+
+__m512i
+f12 (__m512i x, __mmask64 m, __m512i y)
+{
+  return _mm512_mask_popcnt_epi8 (x, m, y);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr93696-2.c b/gcc/testsuite/gcc.target/i386/pr93696-2.c
new file mode 100644
index 00000000000..25a298aea18
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr93696-2.c
@@ -0,0 +1,79 @@
+/* PR target/93696 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512bitalg -mavx512vpopcntdq -mavx512vl -mavx512bw -masm=att" } */
+/* { dg-final { scan-assembler-times "vpopcnt\[bwdq]\t%\[xyz]mm1, %\[xyz]mm0\{%k\[0-7]\}\{z\}" 12 } } */
+/* { dg-final { scan-assembler-not "vmovdq\[au]\[0-9]" } } */
+
+#include <x86intrin.h>
+
+__m128i
+f1 (__m128i x, __mmask8 m, __m128i y)
+{
+  return _mm_maskz_popcnt_epi64 (m, y);
+}
+
+__m128i
+f2 (__m128i x, __mmask8 m, __m128i y)
+{
+  return _mm_maskz_popcnt_epi32 (m, y);
+}
+
+__m128i
+f3 (__m128i x, __mmask8 m, __m128i y)
+{
+  return _mm_maskz_popcnt_epi16 (m, y);
+}
+
+__m128i
+f4 (__m128i x, __mmask16 m, __m128i y)
+{
+  return _mm_maskz_popcnt_epi8 (m, y);
+}
+
+__m256i
+f5 (__m256i x, __mmask8 m, __m256i y)
+{
+  return _mm256_maskz_popcnt_epi64 (m, y);
+}
+
+__m256i
+f6 (__m256i x, __mmask8 m, __m256i y)
+{
+  return _mm256_maskz_popcnt_epi32 (m, y);
+}
+
+__m256i
+f7 (__m256i x, __mmask16 m, __m256i y)
+{
+  return _mm256_maskz_popcnt_epi16 (m, y);
+}
+
+__m256i
+f8 (__m256i x, __mmask32 m, __m256i y)
+{
+  return _mm256_maskz_popcnt_epi8 (m, y);
+}
+
+__m512i
+f9 (__m512i x, __mmask8 m, __m512i y)
+{
+  return _mm512_maskz_popcnt_epi64 (m, y);
+}
+
+__m512i
+f10 (__m512i x, __mmask16 m, __m512i y)
+{
+  return _mm512_maskz_popcnt_epi32 (m, y);
+}
+
+__m512i
+f11 (__m512i x, __mmask32 m, __m512i y)
+{
+  return _mm512_maskz_popcnt_epi16 (m, y);
+}
+
+__m512i
+f12 (__m512i x, __mmask64 m, __m512i y)
+{
+  return _mm512_maskz_popcnt_epi8 (m, y);
+}
-- 
2.20.1
From 7276dd4c7480dd952f0d4a9322ca04ca29f5126f Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>

Date: Thu, 13 Feb 2020 21:00:09 +0100
Subject: [PATCH 15/15] c: Fix ICE with cast to VLA [93576]

The following testcase ICEs, because the PR84305 changes try to evaluate
the size earlier.  If size has side-effects, that is desirable, and the
side-effects will actually be wrapped in a SAVE_EXPR.  The problem on this
testcase is that there are no side-effects, and c_fully_fold doesn't fold
those COMPOUND_EXPRs to constant, and while before gimplification we unshare
trees found in the expressions, the unsharing doesn't involve TYPE_SIZE etc.
of used types.  Gimplification is destructive though, so when we gimplify
the two nested COMPOUND_EXPRs and then try to gimplify it the second time
for the TYPE_SIZEs, we ICE.
Now, we could use unshare_expr in what we push to *expr, SAVE_EXPRs and
their operands in there aren't unshared, but I really don't see a point of
evaluating expressions that don't have side-effects before, so instead
this just pushes there expressions that do have side-effects.

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

	PR c/93576
	* c-decl.c (grokdeclarator): If this_size_varies, only push size into
	*expr if it has side effects.

	* gcc.dg/pr93576.c: New test.
---
 gcc/c/ChangeLog                |  6 ++++++
 gcc/c/c-decl.c                 | 13 ++++++++-----
 gcc/testsuite/ChangeLog        |  3 +++
 gcc/testsuite/gcc.dg/pr93576.c | 10 ++++++++++
 4 files changed, 27 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr93576.c

diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
index b17ed98a9ae..5ba4516e114 100644
--- a/gcc/c/ChangeLog
+++ b/gcc/c/ChangeLog
@@ -1,3 +1,9 @@
+2020-02-13  Jakub Jelinek  <jakub@redhat.com>
+
+	PR c/93576
+	* c-decl.c (grokdeclarator): If this_size_varies, only push size into
+	*expr if it has side effects.
+
 2020-01-22  Joseph Myers  <joseph@codesourcery.com>
 
 	Backport from mainline:
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index f77fb1739ca..859a6241258 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -6368,11 +6368,14 @@ grokdeclarator (const struct c_declarator *declarator,
 		  }
 		if (this_size_varies)
 		  {
-		    if (*expr)
-		      *expr = build2 (COMPOUND_EXPR, TREE_TYPE (size),
-				      *expr, size);
-		    else
-		      *expr = size;
+		    if (TREE_SIDE_EFFECTS (size))
+		      {
+			if (*expr)
+			  *expr = build2 (COMPOUND_EXPR, TREE_TYPE (size),
+					  *expr, size);
+			else
+			  *expr = size;
+		      }
 		    *expr_const_operands &= size_maybe_const;
 		  }
 	      }
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 114a1087936..24e5e88d763 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,8 @@
 2020-02-13  Jakub Jelinek  <jakub@redhat.com>
 
+	PR c/93576
+	* gcc.dg/pr93576.c: New test.
+
 	PR target/93696
 	* gcc.target/i386/pr93696-1.c: New test.
 	* gcc.target/i386/pr93696-2.c: New test.
diff --git a/gcc/testsuite/gcc.dg/pr93576.c b/gcc/testsuite/gcc.dg/pr93576.c
new file mode 100644
index 00000000000..13c34f3771f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr93576.c
@@ -0,0 +1,10 @@
+/* PR c/93576 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+void
+foo (void)
+{
+  int b[] = { 0 };
+  (char (*)[(1, 7, 2)]) 0;
+}
-- 
2.20.1

Comments

H.J. Lu Feb. 14, 2020, 3:45 p.m. | #1
On Thu, Feb 13, 2020 at 2:46 PM Jakub Jelinek <jakub@redhat.com> wrote:
>

> Hi!

>

> I've backported following 15 commits from trunk to 9.3 branch,

> bootstrapped/regtested on x86_64-linux and i686-linux, committed.

>


Hi Jakub,

Are you preparing for GCC 9.3? I'd like to include this in GCC 9.3:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1d69147af203d4dcd2270429f90c93f1a37ddfff

It is very safe.  Uros asked me to wait for a week before backporting to
GCC 9 branch.  I am planning to do it next Thursday.

Thanks.

-- 
H.J.
Jakub Jelinek Feb. 14, 2020, 3:51 p.m. | #2
On Fri, Feb 14, 2020 at 07:45:43AM -0800, H.J. Lu wrote:
> Are you preparing for GCC 9.3? I'd like to include this in GCC 9.3:

> 

> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1d69147af203d4dcd2270429f90c93f1a37ddfff

> 

> It is very safe.  Uros asked me to wait for a week before backporting to

> GCC 9 branch.  I am planning to do it next Thursday.


Richi wants to do 8.4 first, am backporting a lot of patches to that now.
I'd say we should aim for 8.4 rc next week or say on Monday 24th
and release a week after that and 9.3 maybe one week later than that.

	Jakub
H.J. Lu Feb. 14, 2020, 4:07 p.m. | #3
On Fri, Feb 14, 2020 at 7:51 AM Jakub Jelinek <jakub@redhat.com> wrote:
>

> On Fri, Feb 14, 2020 at 07:45:43AM -0800, H.J. Lu wrote:

> > Are you preparing for GCC 9.3? I'd like to include this in GCC 9.3:

> >

> > https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1d69147af203d4dcd2270429f90c93f1a37ddfff

> >

> > It is very safe.  Uros asked me to wait for a week before backporting to

> > GCC 9 branch.  I am planning to do it next Thursday.

>

> Richi wants to do 8.4 first, am backporting a lot of patches to that now.

> I'd say we should aim for 8.4 rc next week or say on Monday 24th


I am planning to back it to both GCC 8 and 9 branches next Thursday.
I think I will be fine.

> and release a week after that and 9.3 maybe one week later than that.

>

>         Jakub

>


Thanks.


-- 
H.J.

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1916dab20d1..2029c67bf02 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@ 
+2020-02-13  Jakub Jelinek  <jakub@redhat.com>
+
+	Backported from mainline
+	2020-01-23  Jakub Jelinek  <jakub@redhat.com>
+
+	PR rtl-optimization/93402
+	* postreload.c (reload_combine_recognize_pattern): Don't try to adjust
+	USE insns.
+
 2020-02-11  Tamar Christina  <tamar.christina@arm.com>
 
 	Backport from mainline
diff --git a/gcc/postreload.c b/gcc/postreload.c
index 728aa9b0ed5..b76c7b0b758 100644
--- a/gcc/postreload.c
+++ b/gcc/postreload.c
@@ -1081,6 +1081,10 @@  reload_combine_recognize_pattern (rtx_insn *insn)
       struct reg_use *use = reg_state[regno].reg_use + i;
       if (GET_MODE (*use->usep) != mode)
 	return false;
+      /* Don't try to adjust (use (REGX)).  */
+      if (GET_CODE (PATTERN (use->insn)) == USE
+	  && &XEXP (PATTERN (use->insn), 0) == use->usep)
+	return false;
     }
 
   /* Look for (set (REGX) (CONST_INT))
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 1dcf894a92a..bec5eba5033 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@ 
+2020-02-13  Jakub Jelinek  <jakub@redhat.com>
+
+	Backported from mainline
+	2020-01-23  Jakub Jelinek  <jakub@redhat.com>
+
+	PR rtl-optimization/93402
+	* gcc.c-torture/execute/pr93402.c: New test.
+
 2020-02-11  Tamar Christina  <tamar.christina@arm.com>
 
 	Backport from mainline
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr93402.c b/gcc/testsuite/gcc.c-torture/execute/pr93402.c
new file mode 100644
index 00000000000..6487797d0aa
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr93402.c
@@ -0,0 +1,21 @@ 
+/* PR rtl-optimization/93402 */
+
+struct S { unsigned int a; unsigned long long b; };
+
+__attribute__((noipa)) struct S
+foo (unsigned long long x)
+{
+  struct S ret;
+  ret.a = 0;
+  ret.b = x * 11111111111ULL + 111111111111ULL;
+  return ret;
+}
+
+int
+main ()
+{
+  struct S a = foo (1);
+  if (a.a != 0 || a.b != 122222222222ULL)
+    __builtin_abort ();
+  return 0;
+}