improve memcmp and memchr constant folding (PR 78257)

Message ID 3f5e507e-8205-2d00-8982-039126e91616@gmail.com
State New

Commit Message

Martin Sebor via Gcc-patches July 31, 2020, 11:55 p.m.
The folders for these functions (and some others) call c_getstr,
which relies on string_constant to return the representation of
constant strings.  Because string_constant doesn't handle constants
of other types, including aggregates, memcmp and memchr calls
involving those are not folded when they could be.

The attached patch extends the algorithm used by string_constant
to also handle constant aggregates whose elements or members have
the types that native_encode_expr handles.  (The change also restores
the empty initializer optimization inadvertently disabled by
the fix for PR 96058.)

To avoid accidentally misusing either string_constant or c_getstr
with non-strings I have introduced a pair of new functions to get
the representation of those: byte_representation and getbyterep.
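
For illustration, the sketch below shows the kind of call this is about: memcmp and memchr applied to constant arrays of a non-character type.  The array and function names are invented for this example and are not taken from the patch or its tests, and whether a particular GCC release folds these calls to constants depends on the version and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant aggregates (not from the patch's testsuite).  */
static const int a[] = { 1, 2, 3 };
static const int b[] = { 1, 2, 4 };

/* Both operands are constant arrays, so the result is knowable at
   compile time: they first differ in the third element.  */
int
cmp_const_arrays (void)
{
  return memcmp (a, b, sizeof a);
}

/* Search the object representation of A for a byte with value 2.  */
const void *
find_byte_two (void)
{
  return memchr (a, 2, sizeof a);
}

/* Checks that hold regardless of byte order.  */
void
check_const_arrays (void)
{
  assert (cmp_const_arrays () < 0);
  assert (find_byte_two () != NULL);
}
```

A folder that can see the byte representation of both arrays could replace each call with its result, so no library call would remain.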

Tested on x86_64-linux.

Martin

Comments

Martin Sebor via Gcc-patches Aug. 10, 2020, 4:48 p.m. | #1
Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/551152.html

Jeff Law via Gcc-patches Aug. 13, 2020, 4:21 p.m. | #2
On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:
> PR tree-optimization/78257 - missing memcmp optimization with constant arrays
>
> gcc/ChangeLog:
>
> 	PR middle-end/78257
> 	* builtins.c (expand_builtin_memory_copy_args): Rename called function.
> 	(expand_builtin_stpcpy_1): Remove argument from call.
> 	(expand_builtin_memcmp): Rename called function.
> 	(inline_expand_builtin_bytecmp): Same.
> 	* expr.c (convert_to_bytes): New function.
> 	(constant_byte_string): New function (formerly string_constant).
> 	(string_constant): Call constant_byte_string.
> 	(byte_representation): New function.
> 	* expr.h (byte_representation): Declare.
> 	* fold-const-call.c (fold_const_call): Rename called function.
> 	* fold-const.c (c_getstr): Remove an argument.
> 	(getbyterep): Define a new function.
> 	* fold-const.h (c_getstr): Remove an argument.
> 	(getbyterep): Declare a new function.
> 	* gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.
> 	(gimple_fold_builtin_string_compare): Same.
> 	(gimple_fold_builtin_memchr): Same.
>
> gcc/testsuite/ChangeLog:
>
> 	PR middle-end/78257
> 	* gcc.dg/memchr.c: New test.
> 	* gcc.dg/memcmp-2.c: New test.
> 	* gcc.dg/memcmp-3.c: New test.
> 	* gcc.dg/memcmp-4.c: New test.
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index a150fa0d3b5..a124df54655 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset, const_tree exp)
>    /* This must now be the address of EXP.  */
>    return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset, 0) == exp;
>  }
> -
> -/* Return the tree node if an ARG corresponds to a string constant or zero
> -   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the (possibly
> -   non-constant) offset in bytes within the string that ARG is accessing.
> -   If MEM_SIZE is non-zero the storage size of the memory is returned.
> -   If DECL is non-zero the constant declaration is returned if available.  */
>  
> -tree
> -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
> +/* If EXPR is a constant initializer (either an expression or CONSTRUCTOR),
> +   attempt to obtain its native representation as an array of nonzero BYTES.
> +   Return true on success and false on failure (the latter without modifying
> +   BYTES).  */
> +
> +static bool
> +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
> +{
> +  if (TREE_CODE (expr) == CONSTRUCTOR)
> +    {
> +      /* Set to the size of the CONSTRUCTOR elements.  */
> +      unsigned HOST_WIDE_INT ctor_size = bytes->length ();
> +
> +      if (TREE_CODE (type) == ARRAY_TYPE)
> +	{
> +	  tree val, idx;
> +	  tree eltype = TREE_TYPE (type);
> +	  unsigned HOST_WIDE_INT elsize =
> +	    tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
> +	  unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;
> +	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)
> +	    {
> +	      /* Append zeros for elements with no initializers.  */
> +	      if (!tree_fits_uhwi_p (idx))
> +		return false;
> +	      unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);
> +	      if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))
> +		{
> +		  size = size * elsize + bytes->length ();
> +		  bytes->safe_grow_cleared (size);
> +		}
> +
> +	      if (!convert_to_bytes (eltype, val, bytes))
> +		return false;
> +
> +	      last_idx = cur_idx;
> +	    }
> +	}
> +      else if (TREE_CODE (type) == RECORD_TYPE)
> +	{
> +	  tree val, fld;
> +	  unsigned HOST_WIDE_INT i;
> +	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)
> +	    {
> +	      /* Append zeros for members with no initializers and
> +		 any padding.  */
> +	      unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);
> +	      if (bytes->length () < cur_off)
> +		bytes->safe_grow_cleared (cur_off);
> +
> +	      if (!convert_to_bytes (TREE_TYPE (val), val, bytes))
> +		return false;
> +	    }
> +	}
> +      else
> +	return false;
> +
> +      /* Compute the size of the COSNTRUCTOR elements.  */
> +      ctor_size = bytes->length () - ctor_size;
> +
> +      /* Append zeros to the byte vector to the full size of the type.
> +	 The type size can be less than the size of the CONSTRUCTOR
> +	 if the latter contains initializers for a flexible array
> +	 member.  */
> +      tree size = TYPE_SIZE_UNIT (type);
> +      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);
> +      if (ctor_size < type_size)
> +	if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)
> +	  bytes->safe_grow_cleared (bytes->length () + size_grow);
> +
> +      return true;
> +    }

So I think you need to be more careful with CONSTRUCTOR nodes here.  Not all
elements of an object need to appear in the CONSTRUCTOR.  Elements which do not
appear in the CONSTRUCTOR node are considered zero-initialized, unless
CONSTRUCTOR_NO_CLEARING is set.

I don't see anything in the code above which deals with those oddities of
CONSTRUCTOR nodes.  Did I miss it?

jeff
Martin Sebor via Gcc-patches Aug. 13, 2020, 5:44 p.m. | #3
On 8/13/20 10:21 AM, Jeff Law wrote:
> On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:

>> +static bool
>> +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
>> +{
>> +  if (TREE_CODE (expr) == CONSTRUCTOR)
>> +    {
>> +      /* Set to the size of the CONSTRUCTOR elements.  */
>> +      unsigned HOST_WIDE_INT ctor_size = bytes->length ();
>> +
>> +      if (TREE_CODE (type) == ARRAY_TYPE)
>> +	{
>> +	  tree val, idx;
>> +	  tree eltype = TREE_TYPE (type);
>> +	  unsigned HOST_WIDE_INT elsize =
>> +	    tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
>> +	  unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;
>> +	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)
>> +	    {
>> +	      /* Append zeros for elements with no initializers.  */
>> +	      if (!tree_fits_uhwi_p (idx))
>> +		return false;
>> +	      unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);
>> +	      if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))
>> +		{
>> +		  size = size * elsize + bytes->length ();
>> +		  bytes->safe_grow_cleared (size);
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +		}
>> +
>> +	      if (!convert_to_bytes (eltype, val, bytes))
>> +		return false;
>> +
>> +	      last_idx = cur_idx;
>> +	    }
>> +	}
>> +      else if (TREE_CODE (type) == RECORD_TYPE)
>> +	{
>> +	  tree val, fld;
>> +	  unsigned HOST_WIDE_INT i;
>> +	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)
>> +	    {
>> +	      /* Append zeros for members with no initializers and
>> +		 any padding.  */
>> +	      unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);
>> +	      if (bytes->length () < cur_off)
>> +		bytes->safe_grow_cleared (cur_off);
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +	      if (!convert_to_bytes (TREE_TYPE (val), val, bytes))
>> +		return false;
>> +	    }
>> +	}
>> +      else
>> +	return false;
>> +
>> +      /* Compute the size of the COSNTRUCTOR elements.  */
>> +      ctor_size = bytes->length () - ctor_size;
>> +
>> +      /* Append zeros to the byte vector to the full size of the type.
>> +	 The type size can be less than the size of the CONSTRUCTOR
>> +	 if the latter contains initializers for a flexible array
>> +	 member.  */
>> +      tree size = TYPE_SIZE_UNIT (type);
>> +      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);
>> +      if (ctor_size < type_size)
>> +	if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)
>> +	  bytes->safe_grow_cleared (bytes->length () + size_grow);
>> +
>> +      return true;
>> +    }
> So I think you need to be more careful with CONSTRUCTOR nodes here.  Not all
> elements of an object need to appear in the CONSTRUCTOR.  Elements which do not
> appear in the CONSTRUCTOR node are considered zero-initialized, unless
> CONSTRUCTOR_NO_CLEARING is set.
>
> I don't see anything in the code above which deals with those oddities of
> CONSTRUCTOR nodes.  Did I miss it?


Just capturing for reference what we just discussed off list:

The underlined code above zeroes out the bytes of elements with
no initializers as well as any padding between fields.  It doesn't
consider CONSTRUCTOR_NO_CLEARING.  I didn't know about that bit so
I looked it up.  According to the internals manual:

   Unrepresented fields will be cleared (zeroed), unless the
   CONSTRUCTOR_NO_CLEARING flag is set, in which case their value
   becomes undefined.

So assuming they're zero should be fine, as would doing nothing.
We agreed on the former so I will go ahead with the patch as is.
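
A minimal sketch of the zero-filling idea described above, written as standalone C rather than against GCC's tree API.  The `struct elt` type and the `flatten_char_array` function are hypothetical stand-ins invented for this example; this is not the code from the patch, only a simplified model of appending zeros for elements that have no initializer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one initialized element of a char array:
   its index and the byte stored there.  */
struct elt
{
  size_t idx;
  unsigned char val;
};

/* Fill BUF with the TOTAL bytes of a char array for which only the N
   elements in ELTS (in increasing index order) have initializers.
   Elements without an initializer become zero.  Return 1 on success,
   0 if an index is out of order or out of range (BUF may then be
   partially written).  */
int
flatten_char_array (const struct elt *elts, size_t n, size_t total,
                    unsigned char *buf)
{
  assert (buf != NULL);
  size_t len = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (elts[i].idx < len || elts[i].idx >= total)
        return 0;
      /* Zero the elements skipped since the previous initializer.  */
      memset (buf + len, 0, elts[i].idx - len);
      len = elts[i].idx;
      buf[len++] = elts[i].val;
    }
  /* Zero-fill the tail up to the full size of the array.  */
  memset (buf + len, 0, total - len);
  return 1;
}
```

For a five-element array with initializers only at indices 1 and 3, the model yields a zero byte at indices 0, 2 and 4, which is the "assume zero" behavior agreed on above.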

Thanks
Martin
Jakub Jelinek via Gcc-patches Aug. 13, 2020, 5:52 p.m. | #4
On Thu, Aug 13, 2020 at 11:44:46AM -0600, Martin Sebor via Gcc-patches wrote:
> The underlined code above zeroes out the bytes of elements with
> no initializers as well as any padding between fields.  It doesn't
> consider CONSTRUCTOR_NO_CLEARING.  I didn't know about that bit so
> I looked it up.  According to the internals manual:
>
>   Unrepresented fields will be cleared (zeroed), unless the
>   CONSTRUCTOR_NO_CLEARING flag is set, in which case their value
>   becomes undefined.

CONSTRUCTOR_NO_CLEARING shouldn't be relevant to the middle-end (after
gimplification).  Static variable initializers are zero-initialized with
or without that bit.  Other than that, we only allow empty CONSTRUCTORs,
which mean all zeros, and vector CONSTRUCTORs, where missing elts are
zero-initialized too but shouldn't really appear.
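
At the source level, the zero initialization of static initializers can be seen in a small C sketch.  The array names below are made up for this example; it shows only the language-level guarantee that elements without an initializer are zero, not anything about GCC's internal representation.

```c
#include <assert.h>
#include <string.h>

/* Only element 2 has an initializer; C guarantees that the remaining
   elements of the array are zero.  */
static const unsigned char sparse[4] = { [2] = 7 };

/* The same contents with every element spelled out.  */
static const unsigned char dense[4] = { 0, 0, 7, 0 };

/* Return nonzero if the two arrays have identical bytes.  */
int
sparse_equals_dense (void)
{
  assert (sizeof sparse == sizeof dense);
  return memcmp (sparse, dense, sizeof sparse) == 0;
}
```

Here unsigned char is used so that no padding is involved and the comparison depends only on the initializer rules.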

	Jakub
Martin Sebor via Gcc-patches Aug. 14, 2020, 11:14 p.m. | #5
On 8/13/20 11:44 AM, Martin Sebor wrote:
> So assuming they're zero should be fine, as would doing nothing.
> We agreed on the former so I will go ahead with the patch as is.

I had missed a few Ada failures.  Apparently, Ada can specify
arbitrary array bounds (not just upper but also lower) but
the code assumed the lower bound would always be zero.  I've
adjusted it to avoid making that assumption and committed
the updated revision in r11-2709.
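
To make the lower-bound issue concrete, here is a small sketch of computing an element's byte offset relative to the array's lower bound instead of assuming the bound is zero.  The function name and parameters are hypothetical and this is not the code committed in r11-2709; it only illustrates the arithmetic involved.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the element with index IDX in an array whose first
   valid index is LOW and whose elements are ELSIZE bytes each.
   IDX must not be less than LOW.  */
size_t
elt_byte_offset (long idx, long low, size_t elsize)
{
  assert (idx >= low);
  return (size_t) (idx - low) * elsize;
}
```

With a lower bound of 3 and 4-byte elements, index 5 lands at byte offset 8, whereas treating the bound as zero would give 20.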

Martin
Christophe Lyon via Gcc-patches Aug. 15, 2020, 2:19 p.m. | #6
Hi Martin,


On Sat, 15 Aug 2020 at 01:14, Martin Sebor via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
> I had missed a few Ada failures.  Apparently, Ada can specify
> arbitrary array bounds (not just upper but also lower) but
> the code assumed the lower bound would always be zero.  I've
> adjusted it to avoid making that assumption and committed
> the updated revision in r11-2709.

This commit is causing a regression on arm:
FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times gimple "memcmp" 0
FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times optimized
"call_in_true_branch_not_eliminated" 0

Christophe


> Martin
Ian Lance Taylor via Gcc-patches Aug. 17, 2020, 2:53 p.m. | #7
On Sat, 2020-08-15 at 16:19 +0200, Christophe Lyon wrote:
> Hi Martin,

> 

> 

> On Sat, 15 Aug 2020 at 01:14, Martin Sebor via Gcc-patches

> <gcc-patches@gcc.gnu.org> wrote:

> > On 8/13/20 11:44 AM, Martin Sebor wrote:

> > > On 8/13/20 10:21 AM, Jeff Law wrote:

> > > > On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:

> > > > > The folders for these functions (and some others) call c_getstr

> > > > > which relies on string_constant to return the representation of

> > > > > constant strings.  Because the function doesn't handle constants

> > > > > of other types, including aggregates, memcmp or memchr calls

> > > > > involving those are not folded when they could be.

> > > > > 

> > > > > The attached patch extends the algorithm used by string_constant

> > > > > to also handle constant aggregates involving elements or members

> > > > > of the same types as native_encode_expr.  (The change restores

> > > > > the empty initializer optimization inadvertently disabled in

> > > > > the fix for pr96058.)

> > > > > 

> > > > > To avoid accidentally misusing either string_constant or c_getstr

> > > > > with non-strings I have introduced a pair of new functions to get

> > > > > the representation of those: byte_representation and getbyterep.

> > > > > 

> > > > > Tested on x86_64-linux.

> > > > > 

> > > > > Martin

> > > > > PR tree-optimization/78257 - missing memcmp optimization with

> > > > > constant arrays

> > > > > 

> > > > > gcc/ChangeLog:

> > > > > 

> > > > >     PR middle-end/78257

> > > > >     * builtins.c (expand_builtin_memory_copy_args): Rename called

> > > > > function.

> > > > >     (expand_builtin_stpcpy_1): Remove argument from call.

> > > > >     (expand_builtin_memcmp): Rename called function.

> > > > >     (inline_expand_builtin_bytecmp): Same.

> > > > >     * expr.c (convert_to_bytes): New function.

> > > > >     (constant_byte_string): New function (formerly string_constant).

> > > > >     (string_constant): Call constant_byte_string.

> > > > >     (byte_representation): New function.

> > > > >     * expr.h (byte_representation): Declare.

> > > > >     * fold-const-call.c (fold_const_call): Rename called function.

> > > > >     * fold-const.c (c_getstr): Remove an argument.

> > > > >     (getbyterep): Define a new function.

> > > > >     * fold-const.h (c_getstr): Remove an argument.

> > > > >     (getbyterep): Declare a new function.

> > > > >     * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.

> > > > >     (gimple_fold_builtin_string_compare): Same.

> > > > >     (gimple_fold_builtin_memchr): Same.

> > > > > 

> > > > > gcc/testsuite/ChangeLog:

> > > > > 

> > > > >     PR middle-end/78257

> > > > >     * gcc.dg/memchr.c: New test.

> > > > >     * gcc.dg/memcmp-2.c: New test.

> > > > >     * gcc.dg/memcmp-3.c: New test.

> > > > >     * gcc.dg/memcmp-4.c: New test.

> > > > > 

> > > > > diff --git a/gcc/expr.c b/gcc/expr.c

> > > > > index a150fa0d3b5..a124df54655 100644

> > > > > --- a/gcc/expr.c

> > > > > +++ b/gcc/expr.c

> > > > > @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset,

> > > > > const_tree exp)

> > > > >     /* This must now be the address of EXP.  */

> > > > >     return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset,

> > > > > 0) == exp;

> > > > >   }

> > > > > -

> > > > > -/* Return the tree node if an ARG corresponds to a string constant

> > > > > or zero

> > > > > -   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the

> > > > > (possibly

> > > > > -   non-constant) offset in bytes within the string that ARG is

> > > > > accessing.

> > > > > -   If MEM_SIZE is non-zero the storage size of the memory is returned.

> > > > > -   If DECL is non-zero the constant declaration is returned if

> > > > > available.  */

> > > > > -tree

> > > > > -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree

> > > > > *decl)

> > > > > +/* If EXPR is a constant initializer (either an expression or

> > > > > CONSTRUCTOR),

> > > > > +   attempt to obtain its native representation as an array of

> > > > > nonzero BYTES.

> > > > > +   Return true on success and false on failure (the latter without

> > > > > modifying

> > > > > +   BYTES).  */

> > > > > +

> > > > > +static bool

> > > > > +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)

> > > > > +{

> > > > > +  if (TREE_CODE (expr) == CONSTRUCTOR)

> > > > > +    {

> > > > > +      /* Set to the size of the CONSTRUCTOR elements.  */

> > > > > +      unsigned HOST_WIDE_INT ctor_size = bytes->length ();

> > > > > +

> > > > > +      if (TREE_CODE (type) == ARRAY_TYPE)

> > > > > +    {

> > > > > +      tree val, idx;

> > > > > +      tree eltype = TREE_TYPE (type);

> > > > > +      unsigned HOST_WIDE_INT elsize =

> > > > > +        tree_to_uhwi (TYPE_SIZE_UNIT (eltype));

> > > > > +      unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;

> > > > > +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)

> > > > > +        {

> > > > > +          /* Append zeros for elements with no initializers.  */

> > > > > +          if (!tree_fits_uhwi_p (idx))

> > > > > +        return false;

> > > > > +          unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);

> > > > > +          if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))

> > > > > +        {

> > > > > +          size = size * elsize + bytes->length ();

> > > > > +          bytes->safe_grow_cleared (size);

> > >                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> > > 

> > > > > +        }

> > > > > +

> > > > > +          if (!convert_to_bytes (eltype, val, bytes))

> > > > > +        return false;

> > > > > +

> > > > > +          last_idx = cur_idx;

> > > > > +        }

> > > > > +    }

> > > > > +      else if (TREE_CODE (type) == RECORD_TYPE)

> > > > > +    {

> > > > > +      tree val, fld;

> > > > > +      unsigned HOST_WIDE_INT i;

> > > > > +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)

> > > > > +        {

> > > > > +          /* Append zeros for members with no initializers and

> > > > > +         any padding.  */

> > > > > +          unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);

> > > > > +          if (bytes->length () < cur_off)

> > > > > +        bytes->safe_grow_cleared (cur_off);

> > >                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> > > > > +

> > > > > +          if (!convert_to_bytes (TREE_TYPE (val), val, bytes))

> > > > > +        return false;

> > > > > +        }

> > > > > +    }

> > > > > +      else

> > > > > +    return false;

> > > > > +

> > > > > +      /* Compute the size of the CONSTRUCTOR elements.  */

> > > > > +      ctor_size = bytes->length () - ctor_size;

> > > > > +

> > > > > +      /* Append zeros to the byte vector to the full size of the type.

> > > > > +     The type size can be less than the size of the CONSTRUCTOR

> > > > > +     if the latter contains initializers for a flexible array

> > > > > +     member.  */

> > > > > +      tree size = TYPE_SIZE_UNIT (type);

> > > > > +      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);

> > > > > +      if (ctor_size < type_size)

> > > > > +    if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)

> > > > > +      bytes->safe_grow_cleared (bytes->length () + size_grow);

> > > > > +

> > > > > +      return true;

> > > > > +    }

> > > > So I think you need to be more careful with CONSTRUCTOR nodes here.

> > > > Not all

> > > > elements of an object need to appear in the CONSTRUCTOR.  Elements

> > > > which do not

> > > > appear in the CONSTRUCTOR node are considered zero-initialized, unless

> > > > CONSTRUCTOR_NO_CLEARING is set.

> > > > 

> > > > I don't see anything in the code above which deals with those oddities of

> > > > CONSTRUCTOR nodes.  Did I miss it?

> > > 

> > > Just capturing for reference what we just discussed off list:

> > > 

> > > The underlined code above zeroes out the bytes of elements with

> > > no initializers as well as any padding between fields.  It doesn't

> > > consider CONSTRUCTOR_NO_CLEARING.  I didn't know about that bit so

> > > I looked it up.  According to the internals manual:

> > > 

> > >    Unrepresented fields will be cleared (zeroed), unless the

> > >    CONSTRUCTOR_NO_CLEARING flag is set, in which case their value

> > >    becomes undefined.

> > > 

> > > So assuming they're zero should be fine, as would doing nothing.

> > > We agreed on the former so I will go ahead with the patch as is.

> > 

> > I had missed a few Ada failures.  Apparently, Ada can specify

> > arbitrary array bounds (not just upper but also lower) but

> > the code assumed the lower bound would always be zero.  I've

> > adjusted it to avoid making that assumption and committed

> > the updated revision in r11-2709.

> > 

> 

> This commit is causing a regression on arm:

> FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times gimple "memcmp" 0

> FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times optimized

> "call_in_true_branch_not_eliminated" 0

I'm seeing this on a variety of targets as well.

jeff
Ian Lance Taylor via Gcc-patches Aug. 18, 2020, 4:32 p.m. | #8
On 8/15/20 8:19 AM, Christophe Lyon wrote:
> Hi Martin,

> 

> 

> On Sat, 15 Aug 2020 at 01:14, Martin Sebor via Gcc-patches

> <gcc-patches@gcc.gnu.org> wrote:

>>

>> On 8/13/20 11:44 AM, Martin Sebor wrote:

>>> On 8/13/20 10:21 AM, Jeff Law wrote:

>>>> On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:

>>>>> The folders for these functions (and some others) call c_getstr

>>>>> which relies on string_constant to return the representation of

>>>>> constant strings.  Because the function doesn't handle constants

>>>>> of other types, including aggregates, memcmp or memchr calls

>>>>> involving those are not folded when they could be.

>>>>>

>>>>> The attached patch extends the algorithm used by string_constant

>>>>> to also handle constant aggregates involving elements or members

>>>>> of the same types as native_encode_expr.  (The change restores

>>>>> the empty initializer optimization inadvertently disabled in

>>>>> the fix for pr96058.)

>>>>>

>>>>> To avoid accidentally misusing either string_constant or c_getstr

>>>>> with non-strings I have introduced a pair of new functions to get

>>>>> the representation of those: byte_representation and getbyterep.

>>>>>

>>>>> Tested on x86_64-linux.

>>>>>

>>>>> Martin

>>>>

>>>>> PR tree-optimization/78257 - missing memcmp optimization with

>>>>> constant arrays

>>>>>

>>>>> gcc/ChangeLog:

>>>>>

>>>>>      PR middle-end/78257

>>>>>      * builtins.c (expand_builtin_memory_copy_args): Rename called

>>>>> function.

>>>>>      (expand_builtin_stpcpy_1): Remove argument from call.

>>>>>      (expand_builtin_memcmp): Rename called function.

>>>>>      (inline_expand_builtin_bytecmp): Same.

>>>>>      * expr.c (convert_to_bytes): New function.

>>>>>      (constant_byte_string): New function (formerly string_constant).

>>>>>      (string_constant): Call constant_byte_string.

>>>>>      (byte_representation): New function.

>>>>>      * expr.h (byte_representation): Declare.

>>>>>      * fold-const-call.c (fold_const_call): Rename called function.

>>>>>      * fold-const.c (c_getstr): Remove an argument.

>>>>>      (getbyterep): Define a new function.

>>>>>      * fold-const.h (c_getstr): Remove an argument.

>>>>>      (getbyterep): Declare a new function.

>>>>>      * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.

>>>>>      (gimple_fold_builtin_string_compare): Same.

>>>>>      (gimple_fold_builtin_memchr): Same.

>>>>>

>>>>> gcc/testsuite/ChangeLog:

>>>>>

>>>>>      PR middle-end/78257

>>>>>      * gcc.dg/memchr.c: New test.

>>>>>      * gcc.dg/memcmp-2.c: New test.

>>>>>      * gcc.dg/memcmp-3.c: New test.

>>>>>      * gcc.dg/memcmp-4.c: New test.

>>>>>

>>>>> diff --git a/gcc/expr.c b/gcc/expr.c

>>>>> index a150fa0d3b5..a124df54655 100644

>>>>> --- a/gcc/expr.c

>>>>> +++ b/gcc/expr.c

>>>>> @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset,

>>>>> const_tree exp)

>>>>>      /* This must now be the address of EXP.  */

>>>>>      return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset,

>>>>> 0) == exp;

>>>>>    }

>>>>> -

>>>>> -/* Return the tree node if an ARG corresponds to a string constant

>>>>> or zero

>>>>> -   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the

>>>>> (possibly

>>>>> -   non-constant) offset in bytes within the string that ARG is

>>>>> accessing.

>>>>> -   If MEM_SIZE is non-zero the storage size of the memory is returned.

>>>>> -   If DECL is non-zero the constant declaration is returned if

>>>>> available.  */

>>>>> -tree

>>>>> -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree

>>>>> *decl)

>>>>> +/* If EXPR is a constant initializer (either an expression or

>>>>> CONSTRUCTOR),

>>>>> +   attempt to obtain its native representation as an array of

>>>>> nonzero BYTES.

>>>>> +   Return true on success and false on failure (the latter without

>>>>> modifying

>>>>> +   BYTES).  */

>>>>> +

>>>>> +static bool

>>>>> +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)

>>>>> +{

>>>>> +  if (TREE_CODE (expr) == CONSTRUCTOR)

>>>>> +    {

>>>>> +      /* Set to the size of the CONSTRUCTOR elements.  */

>>>>> +      unsigned HOST_WIDE_INT ctor_size = bytes->length ();

>>>>> +

>>>>> +      if (TREE_CODE (type) == ARRAY_TYPE)

>>>>> +    {

>>>>> +      tree val, idx;

>>>>> +      tree eltype = TREE_TYPE (type);

>>>>> +      unsigned HOST_WIDE_INT elsize =

>>>>> +        tree_to_uhwi (TYPE_SIZE_UNIT (eltype));

>>>>> +      unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;

>>>>> +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)

>>>>> +        {

>>>>> +          /* Append zeros for elements with no initializers.  */

>>>>> +          if (!tree_fits_uhwi_p (idx))

>>>>> +        return false;

>>>>> +          unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);

>>>>> +          if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))

>>>>> +        {

>>>>> +          size = size * elsize + bytes->length ();

>>>>> +          bytes->safe_grow_cleared (size);

>>>                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

>>>

>>>>> +        }

>>>>> +

>>>>> +          if (!convert_to_bytes (eltype, val, bytes))

>>>>> +        return false;

>>>>> +

>>>>> +          last_idx = cur_idx;

>>>>> +        }

>>>>> +    }

>>>>> +      else if (TREE_CODE (type) == RECORD_TYPE)

>>>>> +    {

>>>>> +      tree val, fld;

>>>>> +      unsigned HOST_WIDE_INT i;

>>>>> +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)

>>>>> +        {

>>>>> +          /* Append zeros for members with no initializers and

>>>>> +         any padding.  */

>>>>> +          unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);

>>>>> +          if (bytes->length () < cur_off)

>>>>> +        bytes->safe_grow_cleared (cur_off);

>>>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

>>>>> +

>>>>> +          if (!convert_to_bytes (TREE_TYPE (val), val, bytes))

>>>>> +        return false;

>>>>> +        }

>>>>> +    }

>>>>> +      else

>>>>> +    return false;

>>>>> +

>>>>> +      /* Compute the size of the CONSTRUCTOR elements.  */

>>>>> +      ctor_size = bytes->length () - ctor_size;

>>>>> +

>>>>> +      /* Append zeros to the byte vector to the full size of the type.

>>>>> +     The type size can be less than the size of the CONSTRUCTOR

>>>>> +     if the latter contains initializers for a flexible array

>>>>> +     member.  */

>>>>> +      tree size = TYPE_SIZE_UNIT (type);

>>>>> +      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);

>>>>> +      if (ctor_size < type_size)

>>>>> +    if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)

>>>>> +      bytes->safe_grow_cleared (bytes->length () + size_grow);

>>>>> +

>>>>> +      return true;

>>>>> +    }

>>>> So I think you need to be more careful with CONSTRUCTOR nodes here.

>>>> Not all

>>>> elements of an object need to appear in the CONSTRUCTOR.  Elements

>>>> which do not

>>>> appear in the CONSTRUCTOR node are considered zero-initialized, unless

>>>> CONSTRUCTOR_NO_CLEARING is set.

>>>>

>>>> I don't see anything in the code above which deals with those oddities of

>>>> CONSTRUCTOR nodes.  Did I miss it?

>>>

>>> Just capturing for reference what we just discussed off list:

>>>

>>> The underlined code above zeroes out the bytes of elements with

>>> no initializers as well as any padding between fields.  It doesn't

>>> consider CONSTRUCTOR_NO_CLEARING.  I didn't know about that bit so

>>> I looked it up.  According to the internals manual:

>>>

>>>     Unrepresented fields will be cleared (zeroed), unless the

>>>     CONSTRUCTOR_NO_CLEARING flag is set, in which case their value

>>>     becomes undefined.

>>>

>>> So assuming they're zero should be fine, as would doing nothing.

>>> We agreed on the former so I will go ahead with the patch as is.

>>

>> I had missed a few Ada failures.  Apparently, Ada can specify

>> arbitrary array bounds (not just upper but also lower) but

>> the code assumed the lower bound would always be zero.  I've

>> adjusted it to avoid making that assumption and committed

>> the updated revision in r11-2709.

>>

> 

> This commit is causing a regression on arm:

> FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times gimple "memcmp" 0

> FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times optimized

> "call_in_true_branch_not_eliminated" 0


Thanks for letting me know!  (The failure was also subsequently
reported in PR 96665.)

The test fails because the new byte_representation() function isn't
prepared to deal with strings longer than the size of the small local
buffer it passes to native_encode_expr().  When the buffer isn't big
enough for the whole string, native_encode_expr() encodes only as
much as fits and returns the number of encoded bytes.
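
To illustrate the failure mode, here is a minimal sketch in plain C.  It
is not GCC code: encode_prefix and encode_all are hypothetical names, and
encode_prefix only mimics the "encode as much as fits and return the
count" behavior described above.  It shows why testing the result for
"> 0" alone accepts a truncated encoding, while comparing it against the
expected size does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an encoder that copies only as many bytes
   as fit in BUF and returns that count.  This is not GCC's
   native_encode_expr.  */
static int
encode_prefix (const unsigned char *src, size_t srclen,
               unsigned char *buf, size_t buflen)
{
  size_t n = srclen < buflen ? srclen : buflen;
  memcpy (buf, src, n);
  return (int) n;
}

/* Return 1 only if all SRCLEN bytes were encoded into BUF.  A check of
   the form "n > 0" would also succeed for a truncated encoding, so the
   count is compared against the full size instead.  */
static int
encode_all (const unsigned char *src, size_t srclen,
            unsigned char *buf, size_t buflen)
{
  int n = encode_prefix (src, srclen, buf, buflen);
  return n >= 0 && (size_t) n == srclen;
}
```

With a 257-byte source and a 64-byte buffer, encode_prefix returns 64,
which is positive even though most of the data is missing, and encode_all
reports failure.  With a buffer of at least 257 bytes encode_all succeeds.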

A simple fix is straightforward but I think a better solution would
be to call native_encode_initializer() that already has most of
the same smarts.  I didn't notice the function until this failure
was reported. Unfortunately, it has (at least) two limitations:

1) it doesn't distinguish a failure from "end of encoding"
2) it doesn't handle initializers for structs with flexible array
    members

(1) means that it can't be called in a loop on any arbitrary
     initializer until the whole thing is encoded.  There's no way
     to tell if the function failed to convert a member of a struct
     or that there simply is nothing else to convert.
(2) means that using it as is would cause a number of other regressions
     like the one in strlenopt-55.c.

A fix for (2) is localized to just native_encode_initializer()
(and attached) so that's what I plan to commit unless there are
concerns/suggestions for changes.

Although strictly not necessary, a fix for (1) should involve
changing all other native_encode_xxx() functions to return -1
on failure for consistency, and updating their callers, which
is on the order of 70 places.  I'll think about proposing that
separately.
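
As a rough sketch of why a distinct failure value matters for (1), the
plain C below uses a hypothetical chunked encoder, encode_chunk, that
returns -1 on failure and 0 only when nothing is left to encode.  The
names, the chunk size, and the failure conditions are all assumptions
made for illustration; none of this is the interface of any existing
native_encode_xxx() function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chunked encoder: copy up to BUFLEN bytes of SRC starting
   at offset OFF into BUF.  Return the number of bytes written, 0 when
   OFF is at the end of the data (nothing left to encode), or -1 on
   failure (here, a null SRC or an offset past the end).  */
static int
encode_chunk (const unsigned char *src, size_t srclen, size_t off,
              unsigned char *buf, size_t buflen)
{
  if (!src || off > srclen)
    return -1;
  size_t n = srclen - off;
  if (n > buflen)
    n = buflen;
  memcpy (buf, src + off, n);
  return (int) n;
}

/* Encode all of SRC into DST (which must have room for SRCLEN bytes)
   one small chunk at a time.  Return 1 on success and 0 on failure.  */
static int
encode_in_loop (const unsigned char *src, size_t srclen, unsigned char *dst)
{
  unsigned char chunk[16];
  size_t off = 0;
  for (;;)
    {
      int n = encode_chunk (src, srclen, off, chunk, sizeof chunk);
      if (n < 0)
        return 0;   /* Failure, distinguishable from "done".  */
      if (n == 0)
        return 1;   /* End of encoding.  */
      memcpy (dst + off, chunk, (size_t) n);
      off += (size_t) n;
    }
}
```

If encode_chunk returned 0 for both conditions, the loop could not tell
a complete encoding from one that stopped partway through.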

Martin
PR middle-end/96665 - memcmp of a constant string not folded

gcc/ChangeLog:

	PR middle-end/96665
	* expr.c (convert_to_bytes): Replace statically allocated buffer with
	a dynamically allocated one of sufficient size.

gcc/testsuite/ChangeLog:

	PR middle-end/78257
	* gcc.dg/memcmp-5.c: New test.

diff --git a/gcc/expr.c b/gcc/expr.c
index dd2200ddea8..437faeaba08 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11683,16 +11683,27 @@ convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
       return true;
     }
 
-  unsigned char charbuf[MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT];
-  int len = native_encode_expr (expr, charbuf, sizeof charbuf, 0);
-  if (len <= 0)
+  /* Except for RECORD_TYPE which may have an initialized flexible array
+     member, the size of a type is the same as the size of the initializer
+     (including any implicitly zeroed out members and padding).  Allocate
+     just enough for that many bytes.  */
+  tree expr_size = TYPE_SIZE_UNIT (TREE_TYPE (expr));
+  if (!expr_size || !tree_fits_uhwi_p (expr_size))
+    return false;
+  const unsigned HOST_WIDE_INT expr_bytes = tree_to_uhwi (expr_size);
+  const unsigned bytes_sofar = bytes->length ();
+  /* native_encode_expr can convert at most INT_MAX bytes.  vec is limited
+     to at most UINT_MAX.  */
+  if (bytes_sofar + expr_bytes > INT_MAX)
     return false;
 
-  unsigned n = bytes->length ();
-  bytes->safe_grow (n + len);
-  unsigned char *p = bytes->address ();
-  memcpy (p + n, charbuf, len);
-  return true;
+  /* Unlike for RECORD_TYPE, there is no need to clear the memory since
+     it's completely overwritten by native_encode_expr.  */
+  bytes->safe_grow (bytes_sofar + expr_bytes);
+  unsigned char *pnext = bytes->begin () + bytes_sofar;
+  int nbytes = native_encode_expr (expr, pnext, expr_bytes, 0);
+  /* NBYTES is zero on failure.  Otherwise it should equal EXPR_BYTES.  */
+  return (unsigned HOST_WIDE_INT) nbytes == expr_bytes;
 }
 
 /* Return a STRING_CST corresponding to ARG's constant initializer either
diff --git a/gcc/testsuite/gcc.dg/memcmp-5.c b/gcc/testsuite/gcc.dg/memcmp-5.c
new file mode 100644
index 00000000000..34bae92f6b0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/memcmp-5.c
@@ -0,0 +1,72 @@
+/* PR middle-end/78257 - missing memcmp optimization with constant arrays
+   { dg-do compile }
+   { dg-options "-O -Wall -fdump-tree-optimized" } */
+
+#define A "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
+          "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
+          "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
+          "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
+          "0"
+
+const char a257[sizeof A - 1] = A;
+const char a258[sizeof A] = A;
+
+_Static_assert (sizeof A == 258);
+_Static_assert (sizeof a257 == 257);
+
+/* Verify that initializers longer than 256 characters (an internal limit
+   on the size of a buffer used to store representations in) are handled.  */
+
+void eq_256plus (void)
+{
+  int n = 0;
+
+  n += __builtin_memcmp (a257,       A,       sizeof a257);
+  n += __builtin_memcmp (a257 +   1, A +   1, sizeof a257 - 1);
+  n += __builtin_memcmp (a257 +   2, A +   2, sizeof a257 - 2);
+  n += __builtin_memcmp (a257 + 127, A + 127, sizeof a257 - 127);
+  n += __builtin_memcmp (a257 + 128, A + 128, sizeof a257 - 128);
+  n += __builtin_memcmp (a257 + 255, A + 255, 2);
+  n += __builtin_memcmp (a257 + 256, A + 256, 1);
+
+  n += __builtin_memcmp (a258,       A,       sizeof a257);
+  n += __builtin_memcmp (a258 +   1, A +   1, sizeof a257 - 1);
+  n += __builtin_memcmp (a258 +   2, A +   2, sizeof a257 - 2);
+  n += __builtin_memcmp (a258 + 127, A + 127, sizeof a257 - 127);
+  n += __builtin_memcmp (a258 + 128, A + 128, sizeof a257 - 128);
+  n += __builtin_memcmp (a258 + 256, A + 256, 2);
+  n += __builtin_memcmp (a258 + 257, A + 257, 1);
+
+  if (n)
+    __builtin_abort ();
+}
+
+#define X "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
+          "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
+          "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
+          "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
+          "1"
+
+void lt_256plus (void)
+{
+  int n = 0;
+
+  n += 0 >  __builtin_memcmp (a257,       X,       sizeof a257);
+  n += 0 >  __builtin_memcmp (a257 +   1, X +   1, sizeof a257 - 1);
+  n += 0 >  __builtin_memcmp (a257 +   2, X +   2, sizeof a257 - 2);
+  n += 0 >  __builtin_memcmp (a257 + 127, X + 127, sizeof a257 - 127);
+  n += 0 >  __builtin_memcmp (a257 + 128, X + 128, sizeof a257 - 128);
+  n += 0 >  __builtin_memcmp (a257 + 255, X + 255, 2);
+  n += 0 >  __builtin_memcmp (a257 + 256, X + 256, 1);
+
+  n += 0 >  __builtin_memcmp (a258,       X,       sizeof a258);
+  n += 0 >  __builtin_memcmp (a258 +   1, X +   1, sizeof a258 - 1);
+  n += 0 >  __builtin_memcmp (a258 +   2, X +   2, sizeof a258 - 2);
+  n += 0 >  __builtin_memcmp (a258 + 127, X + 127, sizeof a257 - 127);
+  n += 0 >  __builtin_memcmp (a258 + 128, X + 128, sizeof a257 - 128);
+  n += 0 >  __builtin_memcmp (a258 + 256, X + 256, 2);
+  n += 0 == __builtin_memcmp (a258 + 257, X + 257, 1);
+
+  if (n != 14)
+    __builtin_abort ();
+}

Patch

PR tree-optimization/78257 - missing memcmp optimization with constant arrays

gcc/ChangeLog:

	PR middle-end/78257
	* builtins.c (expand_builtin_memory_copy_args): Rename called function.
	(expand_builtin_stpcpy_1): Remove argument from call.
	(expand_builtin_memcmp): Rename called function.
	(inline_expand_builtin_bytecmp): Same.
	* expr.c (convert_to_bytes): New function.
	(constant_byte_string): New function (formerly string_constant).
	(string_constant): Call constant_byte_string.
	(byte_representation): New function.
	* expr.h (byte_representation): Declare.
	* fold-const-call.c (fold_const_call): Rename called function.
	* fold-const.c (c_getstr): Remove an argument.
	(getbyterep): Define a new function.
	* fold-const.h (c_getstr): Remove an argument.
	(getbyterep): Declare a new function.
	* gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.
	(gimple_fold_builtin_string_compare): Same.
	(gimple_fold_builtin_memchr): Same.

gcc/testsuite/ChangeLog:

	PR middle-end/78257
	* gcc.dg/memchr.c: New test.
	* gcc.dg/memcmp-2.c: New test.
	* gcc.dg/memcmp-3.c: New test.
	* gcc.dg/memcmp-4.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 228db78f32b..b872119c1cb 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -4447,7 +4447,7 @@  expand_builtin_memory_copy_args (tree dest, tree src, tree len,
   /* Try to get the byte representation of the constant SRC points to,
      with its byte size in NBYTES.  */
   unsigned HOST_WIDE_INT nbytes;
-  const char *rep = c_getstr (src, &nbytes);
+  const char *rep = getbyterep (src, &nbytes);
 
   /* If the function's constant bound LEN_RTX is less than or equal
      to the byte size of the representation of the constant argument,
@@ -4455,7 +4455,7 @@  expand_builtin_memory_copy_args (tree dest, tree src, tree len,
      the bytes from memory and only store the computed constant.
      This works in the overlap (memmove) case as well because
      store_by_pieces just generates a series of stores of constants
-     from the representation returned by c_getstr().  */
+     from the representation returned by getbyterep().  */
   if (rep
       && CONST_INT_P (len_rtx)
       && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= nbytes
@@ -4704,7 +4704,7 @@  expand_builtin_stpcpy_1 (tree exp, rtx target, machine_mode mode)
 	 because the latter will potentially produce pessimized code
 	 when used to produce the return value.  */
       c_strlen_data lendata = { };
-      if (!c_getstr (src, NULL)
+      if (!c_getstr (src)
 	  || !(len = c_strlen (src, 0, &lendata, 1)))
 	return expand_movstr (dst, src, target,
 			      /*retmode=*/ RETURN_END_MINUS_ONE);
@@ -5357,11 +5357,11 @@  expand_builtin_memcmp (tree exp, rtx target, bool result_eq)
      when the function's result is used for equality to zero, ARG1)
      points to, with its byte size in NBYTES.  */
   unsigned HOST_WIDE_INT nbytes;
-  const char *rep = c_getstr (arg2, &nbytes);
+  const char *rep = getbyterep (arg2, &nbytes);
   if (result_eq && rep == NULL)
     {
       /* For equality to zero the arguments are interchangeable.  */
-      rep = c_getstr (arg1, &nbytes);
+      rep = getbyterep (arg1, &nbytes);
       if (rep != NULL)
 	std::swap (arg1_rtx, arg2_rtx);
     }
@@ -7811,8 +7811,8 @@  inline_expand_builtin_bytecmp (tree exp, rtx target)
   /* Get the object representation of the initializers of ARG1 and ARG2
      as strings, provided they refer to constant objects, with their byte
      sizes in LEN1 and LEN2, respectively.  */
-  const char *bytes1 = c_getstr (arg1, &len1);
-  const char *bytes2 = c_getstr (arg2, &len2);
+  const char *bytes1 = getbyterep (arg1, &len1);
+  const char *bytes2 = getbyterep (arg2, &len2);
 
   /* Fail if neither argument refers to an initialized constant.  */
   if (!bytes1 && !bytes2)
diff --git a/gcc/expr.c b/gcc/expr.c
index a150fa0d3b5..a124df54655 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11594,15 +11594,103 @@  is_aligning_offset (const_tree offset, const_tree exp)
   /* This must now be the address of EXP.  */
   return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset, 0) == exp;
 }
-
-/* Return the tree node if an ARG corresponds to a string constant or zero
-   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the (possibly
-   non-constant) offset in bytes within the string that ARG is accessing.
-   If MEM_SIZE is non-zero the storage size of the memory is returned.
-   If DECL is non-zero the constant declaration is returned if available.  */
 
-tree
-string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
+/* If EXPR is a constant initializer (either an expression or CONSTRUCTOR),
+   attempt to obtain its native representation as an array of nonzero BYTES.
+   Return true on success and false on failure (the latter without modifying
+   BYTES).  */
+
+static bool
+convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
+{
+  if (TREE_CODE (expr) == CONSTRUCTOR)
+    {
+      /* Set to the size of the CONSTRUCTOR elements.  */
+      unsigned HOST_WIDE_INT ctor_size = bytes->length ();
+
+      if (TREE_CODE (type) == ARRAY_TYPE)
+	{
+	  tree val, idx;
+	  tree eltype = TREE_TYPE (type);
+	  unsigned HOST_WIDE_INT elsize =
+	    tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
+	  unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;
+	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)
+	    {
+	      /* Append zeros for elements with no initializers.  */
+	      if (!tree_fits_uhwi_p (idx))
+		return false;
+	      unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);
+	      if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))
+		{
+		  size = size * elsize + bytes->length ();
+		  bytes->safe_grow_cleared (size);
+		}
+
+	      if (!convert_to_bytes (eltype, val, bytes))
+		return false;
+
+	      last_idx = cur_idx;
+	    }
+	}
+      else if (TREE_CODE (type) == RECORD_TYPE)
+	{
+	  tree val, fld;
+	  unsigned HOST_WIDE_INT i;
+	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)
+	    {
+	      /* Append zeros for members with no initializers and
+		 any padding.  */
+	      unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);
+	      if (bytes->length () < cur_off)
+		bytes->safe_grow_cleared (cur_off);
+
+	      if (!convert_to_bytes (TREE_TYPE (val), val, bytes))
+		return false;
+	    }
+	}
+      else
+	return false;
+
+      /* Compute the size of the CONSTRUCTOR elements.  */
+      ctor_size = bytes->length () - ctor_size;
+
+      /* Append zeros to the byte vector to the full size of the type.
+	 The type size can be less than the size of the CONSTRUCTOR
+	 if the latter contains initializers for a flexible array
+	 member.  */
+      tree size = TYPE_SIZE_UNIT (type);
+      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);
+      if (ctor_size < type_size)
+	if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)
+	  bytes->safe_grow_cleared (bytes->length () + size_grow);
+
+      return true;
+    }
+
+  unsigned char charbuf[MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT];
+  int len = native_encode_expr (expr, charbuf, sizeof charbuf, 0);
+  if (len <= 0)
+    return false;
+
+  unsigned n = bytes->length ();
+  bytes->safe_grow (n + len);
+  unsigned char *p = bytes->address ();
+  memcpy (p + n, charbuf, len);
+  return true;
+}
+
+/* Return a STRING_CST corresponding to ARG's constant initializer either
+   if it's a string constant, or, when VALREP is set, any other constant,
+   or null otherwise.
+   On success, set *PTR_OFFSET to the (possibly non-constant) byte offset
+   within the byte string that ARG references.  If nonnull, set *MEM_SIZE
+   to the size of the byte string.  If nonnull, set *DECL to the constant
+   declaration ARG refers to.  */
+
+static tree
+constant_byte_string (tree arg, tree *ptr_offset, tree *mem_size, tree *decl,
+		      bool valrep = false)
 {
   tree dummy = NULL_TREE;;
   if (!mem_size)
@@ -11749,18 +11837,43 @@  string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
       return array;
     }
 
-  if (!VAR_P (array) && TREE_CODE (array) != CONST_DECL)
-    return NULL_TREE;
-
   tree init = ctor_for_folding (array);
-
-  /* Handle variables initialized with string literals.  */
   if (!init || init == error_mark_node)
     return NULL_TREE;
+
+  if (valrep)
+    {
+      HOST_WIDE_INT cstoff;
+      if (!base_off.is_constant (&cstoff))
+	return NULL_TREE;
+
+      /* If value representation was requested convert the initializer
+	 for the whole array or object into a string of bytes forming
+	 its value representation and return it.  */
+      auto_vec<unsigned char> bytes;
+      if (!convert_to_bytes (TREE_TYPE (init), init, &bytes))
+	return NULL_TREE;
+
+      unsigned n = bytes.length ();
+      const char *p = reinterpret_cast<const char *>(bytes.address ());
+      init = build_string_literal (n, p, char_type_node);
+      init = TREE_OPERAND (init, 0);
+      init = TREE_OPERAND (init, 0);
+
+      *mem_size = size_int (TREE_STRING_LENGTH (init));
+      *ptr_offset = wide_int_to_tree (ssizetype, base_off);
+
+      if (decl)
+	*decl = array;
+
+      return init;
+    }
+
   if (TREE_CODE (init) == CONSTRUCTOR)
     {
       /* Convert the 64-bit constant offset to a wider type to avoid
-	 overflow.  */
+	 overflow and use it to obtain the initializer for the subobject
+	 it points into.  */
       offset_int wioff;
       if (!base_off.is_constant (&wioff))
 	return NULL_TREE;
@@ -11773,6 +11886,9 @@  string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
       unsigned HOST_WIDE_INT fieldoff = 0;
       init = fold_ctor_reference (TREE_TYPE (arg), init, base_off, 0, array,
 				  &fieldoff);
+      if (!init || init == error_mark_node)
+	return NULL_TREE;
+
       HOST_WIDE_INT cstoff;
       if (!base_off.is_constant (&cstoff))
 	return NULL_TREE;
@@ -11785,9 +11901,6 @@  string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
 	offset = off;
     }
 
-  if (!init)
-    return NULL_TREE;
-
   *ptr_offset = offset;
 
   tree inittype = TREE_TYPE (init);
@@ -11858,7 +11971,29 @@  string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
 
   return init;
 }
-
+
+/* Return STRING_CST if an ARG corresponds to a string constant or zero
+   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the (possibly
+   non-constant) offset in bytes within the string that ARG is accessing.
+   If MEM_SIZE is non-zero the storage size of the memory is returned.
+   If DECL is non-zero the constant declaration is returned if available.  */
+
+tree
+string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
+{
+  return constant_byte_string (arg, ptr_offset, mem_size, decl, false);
+}
+
+/* Similar to string_constant, return a STRING_CST corresponding
+   to the value representation of the first argument if it's
+   a constant.  */
+
+tree
+byte_representation (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
+{
+  return constant_byte_string (arg, ptr_offset, mem_size, decl, true);
+}
+
 /* Compute the modular multiplicative inverse of A modulo M
    using extended Euclid's algorithm.  Assumes A and M are coprime.  */
 static wide_int
diff --git a/gcc/expr.h b/gcc/expr.h
index 725991ff217..88d55bac30e 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -289,9 +289,13 @@  expand_normal (tree exp)
 }
 
 
-/* Return the tree node and offset if a given argument corresponds to
-   a string constant.  */
+/* Return STRING_CST and set offset, size and decl, if the first
+   argument corresponds to a string constant.  */
 extern tree string_constant (tree, tree *, tree *, tree *);
+/* Similar to string_constant, return a STRING_CST corresponding
+   to the value representation of the first argument if it's
+   a constant.  */
+extern tree byte_representation (tree, tree *, tree *, tree *);
 
 extern enum tree_code maybe_optimize_mod_cmp (enum tree_code, tree *, tree *);
 
diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c
index c9e368db9d0..11ed47db3d9 100644
--- a/gcc/fold-const-call.c
+++ b/gcc/fold-const-call.c
@@ -1800,8 +1800,8 @@  fold_const_call (combined_fn fn, tree type, tree arg0, tree arg1, tree arg2)
 	  && !TREE_SIDE_EFFECTS (arg0)
 	  && !TREE_SIDE_EFFECTS (arg1))
 	return build_int_cst (type, 0);
-      if ((p0 = c_getstr (arg0, &s0))
-	  && (p1 = c_getstr (arg1, &s1))
+      if ((p0 = getbyterep (arg0, &s0))
+	  && (p1 = getbyterep (arg1, &s1))
 	  && s2 <= s0
 	  && s2 <= s1)
 	return build_cmp_result (type, memcmp (p0, p1, s2));
@@ -1814,7 +1814,7 @@  fold_const_call (combined_fn fn, tree type, tree arg0, tree arg1, tree arg2)
 	  && !TREE_SIDE_EFFECTS (arg0)
 	  && !TREE_SIDE_EFFECTS (arg1))
 	return build_int_cst (type, 0);
-      if ((p0 = c_getstr (arg0, &s0))
+      if ((p0 = getbyterep (arg0, &s0))
 	  && s2 <= s0
 	  && target_char_cst_p (arg1, &c))
 	{
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 300d959278b..1c66edc9474 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -15487,19 +15487,19 @@  fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE_INT off)
 			  ptr, size_int (off));
 }
 
-/* Return a pointer P to a NUL-terminated string containing the sequence
+/* Return a pointer to a NUL-terminated string containing the sequence
    of bytes corresponding to the representation of the object referred to
    by SRC (or a subsequence of such bytes within it if SRC is a reference
    to an initialized constant array plus some constant offset).
-   If STRSIZE is non-null, store the number of bytes in the constant
-   sequence including the terminating NUL byte.  *STRSIZE is equal to
-   sizeof(A) - OFFSET where A is the array that stores the constant
-   sequence that SRC points to and OFFSET is the byte offset of SRC from
-   the beginning of A.  SRC need not point to a string or even an array
-   of characters but may point to an object of any type.  */
+   If STRSIZE is nonnull, set *STRSIZE to the number of bytes in the
+   constant sequence including the terminating NUL byte.  *STRSIZE is
+   equal to sizeof(A) - OFFSET where A is the array that stores the
+   constant sequence that SRC points to and OFFSET is the byte offset of
+   SRC from the beginning of A.  SRC need not point to a string or even
+   an array of characters but may point to an object of any type.  */
 
 const char *
-c_getstr (tree src, unsigned HOST_WIDE_INT *strsize /* = NULL */)
+getbyterep (tree src, unsigned HOST_WIDE_INT *strsize)
 {
   /* The offset into the array A storing the string, and A's byte size.  */
   tree offset_node;
@@ -15508,7 +15508,10 @@  c_getstr (tree src, unsigned HOST_WIDE_INT *strsize /* = NULL */)
   if (strsize)
     *strsize = 0;
 
-  src = string_constant (src, &offset_node, &mem_size, NULL);
+  if (strsize)
+    src = byte_representation (src, &offset_node, &mem_size, NULL);
+  else
+    src = string_constant (src, &offset_node, &mem_size, NULL);
   if (!src)
     return NULL;
 
@@ -15576,6 +15579,18 @@  c_getstr (tree src, unsigned HOST_WIDE_INT *strsize /* = NULL */)
   return offset < init_bytes ? string + offset : "";
 }
 
+/* Return a pointer to a NUL-terminated string corresponding to
+   the expression STR referencing a constant string, possibly
+   involving a constant offset.  Return null if STR either doesn't
+   reference a constant string or if it involves a nonconstant
+   offset.  */
+
+const char *
+c_getstr (tree str)
+{
+  return getbyterep (str, NULL);
+}
+
 /* Given a tree T, compute which bits in T may be nonzero.  */
 
 wide_int
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 0f788a458f2..0c0f5fd46cc 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -199,7 +199,8 @@  extern bool expr_not_equal_to (tree t, const wide_int &);
 extern tree const_unop (enum tree_code, tree, tree);
 extern tree const_binop (enum tree_code, tree, tree, tree);
 extern bool negate_mathfn_p (combined_fn);
-extern const char *c_getstr (tree, unsigned HOST_WIDE_INT * = NULL);
+extern const char *getbyterep (tree, unsigned HOST_WIDE_INT *);
+extern const char *c_getstr (tree);
 extern wide_int tree_nonzero_bits (const_tree);
 
 /* Return OFF converted to a pointer offset type suitable as offset for
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 81c77f7e8b4..66b7ae7e4b1 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -774,7 +774,7 @@  gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	     strlenopt tests that rely on it for passing are adjusted, this
 	     hack can be removed.  */
 	  && !c_strlen (src, 1)
-	  && !((tmp_str = c_getstr (src, &tmp_len)) != NULL
+	  && !((tmp_str = getbyterep (src, &tmp_len)) != NULL
 	       && memchr (tmp_str, 0, tmp_len) == NULL)
 	  && !(srctype
 	       && AGGREGATE_TYPE_P (srctype)
@@ -2464,8 +2464,8 @@  gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
      For nul-terminated strings then adjusted to their length so that
      LENx == NULPOSx holds.  */
   unsigned HOST_WIDE_INT len1 = HOST_WIDE_INT_MAX, len2 = len1;
-  const char *p1 = c_getstr (str1, &len1);
-  const char *p2 = c_getstr (str2, &len2);
+  const char *p1 = getbyterep (str1, &len1);
+  const char *p2 = getbyterep (str2, &len2);
 
   /* The position of the terminating nul character if one exists, otherwise
      a value greater than LENx.  */
@@ -2662,7 +2662,7 @@  gimple_fold_builtin_memchr (gimple_stmt_iterator *gsi)
 
   unsigned HOST_WIDE_INT length = tree_to_uhwi (len);
   unsigned HOST_WIDE_INT string_length;
-  const char *p1 = c_getstr (arg1, &string_length);
+  const char *p1 = getbyterep (arg1, &string_length);
 
   if (p1)
     {
diff --git a/gcc/testsuite/gcc.dg/memchr.c b/gcc/testsuite/gcc.dg/memchr.c
new file mode 100644
index 00000000000..fb21d58b476
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/memchr.c
@@ -0,0 +1,94 @@ 
+/* PR middle-end/78257 - missing memcmp optimization with constant arrays
+   { dg-do compile }
+   { dg-options "-O -Wall -fdump-tree-optimized" } */
+
+typedef __INT8_TYPE__  int8_t;
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+typedef __SIZE_TYPE__  size_t;
+
+extern void* memchr (const void*, int, size_t);
+
+/* Verify that initializers for flexible array members are handled
+   correctly.  */
+
+struct SX
+{
+  /* offset */
+  /*   0    */ int32_t n;
+  /*   4    */ int8_t: 1;
+  /*   6    */ int16_t a[];
+};
+
+_Static_assert (__builtin_offsetof (struct SX, a) == 6);
+
+const struct SX sx =
+  {
+   0x11121314, { 0x2122, 0x3132, 0x4142, 0x5152 }
+  };
+
+const char sx_rep[] =
+  {
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   0x11, 0x12, 0x13, 0x14, 0, 0, 0x21, 0x22, 0x31, 0x32, 0x41, 0x42, 0x51, 0x52
+#else
+   0x14, 0x13, 0x12, 0x11, 0, 0, 0x22, 0x21, 0x32, 0x31, 0x42, 0x41, 0x52, 0x51
+#endif
+  };
+
+
+void test_find (void)
+{
+  int n = 0, nb = (const char*)&sx.a[4] - (const char*)&sx;
+  const char *p = (const char*)&sx, *q = sx_rep;
+
+  if (nb != sizeof sx_rep)
+    __builtin_abort ();
+
+  n += p      == memchr (p, q[ 0], nb);
+  n += p +  1 == memchr (p, q[ 1], nb);
+  n += p +  2 == memchr (p, q[ 2], nb);
+  n += p +  3 == memchr (p, q[ 3], nb);
+  n += p +  4 == memchr (p, q[ 4], nb);
+  n += p +  4 == memchr (p, q[ 5], nb);
+  n += p +  6 == memchr (p, q[ 6], nb);
+  n += p +  7 == memchr (p, q[ 7], nb);
+  n += p +  8 == memchr (p, q[ 8], nb);
+  n += p +  9 == memchr (p, q[ 9], nb);
+  n += p + 10 == memchr (p, q[10], nb);
+  n += p + 11 == memchr (p, q[11], nb);
+  n += p + 12 == memchr (p, q[12], nb);
+  n += p + 13 == memchr (p, q[13], nb);
+
+  if (n != 14)
+    __builtin_abort ();
+}
+
+void test_not_find (void)
+{
+  int n = 0, nb = (const char*)&sx.a[4] - (const char*)&sx;
+  const char *p = (const char*)&sx, *q = sx_rep;
+
+  if (nb != sizeof sx_rep)
+    __builtin_abort ();
+
+  n += 0 == memchr (p,      0xff, nb);
+  n += 0 == memchr (p +  1, q[ 0], nb - 1);
+  n += 0 == memchr (p +  2, q[ 1], nb - 2);
+  n += 0 == memchr (p +  3, q[ 2], nb - 3);
+  n += 0 == memchr (p +  4, q[ 3], nb - 4);
+  n += 0 == memchr (p +  6, q[ 4], nb - 6);
+  n += 0 == memchr (p +  7, q[ 6], nb - 7);
+  n += 0 == memchr (p +  8, q[ 7], nb - 8);
+  n += 0 == memchr (p +  9, q[ 8], nb - 9);
+  n += 0 == memchr (p + 10, q[ 9], nb - 10);
+  n += 0 == memchr (p + 11, q[10], nb - 11);
+  n += 0 == memchr (p + 12, q[11], nb - 12);
+  n += 0 == memchr (p + 13, q[12], nb - 13);
+  n += 0 == memchr (p + 14, q[13], nb - 14);
+
+  if (n != 14)
+    __builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump-not "abort" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/memcmp-2.c b/gcc/testsuite/gcc.dg/memcmp-2.c
new file mode 100644
index 00000000000..ff99c12b0af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/memcmp-2.c
@@ -0,0 +1,183 @@ 
+/* PR middle-end/78257 - missing memcmp optimization with constant arrays
+   { dg-do compile }
+   { dg-options "-O -Wall -fdump-tree-optimized" } */
+
+#define assert(e) ((e) ? (void)0 : __builtin_abort ())
+
+typedef __INT32_TYPE__ int32_t;
+
+extern int memcmp (const void*, const void*, __SIZE_TYPE__);
+
+const int32_t i_0 = 0;
+const int32_t j_0 = 0;
+
+void eq_i0_j0 (void)
+{
+  const char *pi = (char*)&i_0, *pj = (char*)&j_0;
+  int n = 0;
+
+  n += 0 == memcmp (pi,     pj,     sizeof (int32_t));
+  n += 0 == memcmp (pi + 1, pj + 1, sizeof (int32_t) - 1);
+  n += 0 == memcmp (pi + 2, pj + 2, sizeof (int32_t) - 2);
+  n += 0 == memcmp (pi + 3, pj + 3, sizeof (int32_t) - 3);
+  n += 0 == memcmp (pi + 4, pj + 4, sizeof (int32_t) - 4);
+
+  assert (n == 5);
+}
+
+
+const int32_t i1234 = 1234;
+const int32_t j1234 = 1234;
+
+void eq_i1234_j1234 (void)
+{
+  const char *pi = (char*)&i1234, *pj = (char*)&j1234;
+  int n = 0;
+
+  n += 0 == memcmp (pi,     pj,     sizeof (int32_t));
+  n += 0 == memcmp (pi + 1, pj + 1, sizeof (int32_t) - 1);
+  n += 0 == memcmp (pi + 2, pj + 2, sizeof (int32_t) - 2);
+  n += 0 == memcmp (pi + 3, pj + 3, sizeof (int32_t) - 3);
+  n += 0 == memcmp (pi + 4, pj + 4, sizeof (int32_t) - 4);
+
+  assert (n == 5);
+}
+
+
+const int32_t a1[2] = { 1234 };
+const int32_t b1[2] = { 1234 };
+
+void eq_a1_b1 (void)
+{
+  const char *pi = (char*)&a1, *pj = (char*)&b1;
+  int n = 0, nb = sizeof a1;
+
+  n += 0 == memcmp (pi,     pj,     nb);
+  n += 0 == memcmp (pi + 1, pj + 1, nb - 1);
+  n += 0 == memcmp (pi + 2, pj + 2, nb - 2);
+  n += 0 == memcmp (pi + 3, pj + 3, nb - 3);
+  n += 0 == memcmp (pi + 4, pj + 4, nb - 4);
+  n += 0 == memcmp (pi + 5, pj + 5, nb - 5);
+  n += 0 == memcmp (pi + 6, pj + 6, nb - 6);
+  n += 0 == memcmp (pi + 7, pj + 7, nb - 7);
+  n += 0 == memcmp (pi + 8, pj + 8, nb - 8);
+
+  assert (n == 9);
+}
+
+const int32_t a2[2] = { 1234 };
+const int32_t b2[2] = { 1234, 0 };
+
+void eq_a2_b2 (void)
+{
+  const char *pi = (char*)&a2, *pj = (char*)&b2;
+  int n = 0, nb = sizeof a2;
+
+  n += 0 == memcmp (pi,     pj,     nb);
+  n += 0 == memcmp (pi + 1, pj + 1, nb - 1);
+  n += 0 == memcmp (pi + 2, pj + 2, nb - 2);
+  n += 0 == memcmp (pi + 3, pj + 3, nb - 3);
+  n += 0 == memcmp (pi + 4, pj + 4, nb - 4);
+  n += 0 == memcmp (pi + 5, pj + 5, nb - 5);
+  n += 0 == memcmp (pi + 6, pj + 6, nb - 6);
+  n += 0 == memcmp (pi + 7, pj + 7, nb - 7);
+  n += 0 == memcmp (pi + 8, pj + 8, nb - 8);
+
+  assert (n == 9);
+}
+
+
+const int32_t a5[5] = { [3] = 1234, [1] = 0 };
+const int32_t b5[5] = { 0, 0, 0, 1234 };
+
+void eq_a5_b5 (void)
+{
+  int n = 0, b = sizeof a5;
+  const char *pi = (char*)a5, *pj = (char*)b5;
+
+  n += 0 == memcmp (pi, pj, b);
+  n += 0 == memcmp (pi + 1, pj + 1, b - 1);
+  n += 0 == memcmp (pi + 2, pj + 2, b - 2);
+  n += 0 == memcmp (pi + 3, pj + 3, b - 3);
+
+  n += 0 == memcmp (pi + 4, pj + 4, b - 4);
+  n += 0 == memcmp (pi + 5, pj + 5, b - 5);
+  n += 0 == memcmp (pi + 6, pj + 6, b - 6);
+  n += 0 == memcmp (pi + 7, pj + 7, b - 7);
+
+  n += 0 == memcmp (pi + 8, pj + 8, b - 8);
+  n += 0 == memcmp (pi + 9, pj + 9, b - 9);
+  n += 0 == memcmp (pi + 10, pj + 10, b - 10);
+  n += 0 == memcmp (pi + 11, pj + 11, b - 11);
+
+  n += 0 == memcmp (pi + 12, pj + 12, b - 12);
+  n += 0 == memcmp (pi + 13, pj + 13, b - 13);
+  n += 0 == memcmp (pi + 14, pj + 14, b - 14);
+  n += 0 == memcmp (pi + 15, pj + 15, b - 15);
+
+  n += 0 == memcmp (pi + 16, pj + 16, b - 16);
+  n += 0 == memcmp (pi + 17, pj + 17, b - 17);
+  n += 0 == memcmp (pi + 18, pj + 18, b - 18);
+  n += 0 == memcmp (pi + 19, pj + 19, b - 19);
+
+  assert (n == 20);
+}
+
+
+const int32_t a19[19] = { [13] = 13, [8] = 8, [4] = 4, [1] = 1  };
+const int32_t b19[19] = { 0, 1, 0, 0, 4, 0, 0, 0, 8, 0, 0, 0, 0, 13 };
+
+void eq_a19_b19 (void)
+{
+  int n = 0, b = sizeof a19;
+  const char *pi = (char*)a19, *pj = (char*)b19;
+
+  n += 0 == memcmp (pi,     pj,     b);
+  n += 0 == memcmp (pi + 1, pj + 1, b - 1);
+  n += 0 == memcmp (pi + 2, pj + 2, b - 2);
+  n += 0 == memcmp (pi + 3, pj + 3, b - 3);
+
+  n += 0 == memcmp (pi + 14, pj + 14, b - 14);
+  n += 0 == memcmp (pi + 15, pj + 15, b - 15);
+  n += 0 == memcmp (pi + 16, pj + 16, b - 16);
+  n += 0 == memcmp (pi + 17, pj + 17, b - 17);
+
+  n += 0 == memcmp (pi + 28, pj + 28, b - 28);
+  n += 0 == memcmp (pi + 29, pj + 29, b - 29);
+  n += 0 == memcmp (pi + 30, pj + 30, b - 30);
+  n += 0 == memcmp (pi + 31, pj + 31, b - 31);
+
+  n += 0 == memcmp (pi + 42, pj + 42, b - 42);
+  n += 0 == memcmp (pi + 43, pj + 43, b - 43);
+  n += 0 == memcmp (pi + 44, pj + 44, b - 44);
+  n += 0 == memcmp (pi + 45, pj + 45, b - 45);
+
+  n += 0 == memcmp (pi + 56, pj + 56, b - 56);
+  n += 0 == memcmp (pi + 57, pj + 57, b - 57);
+  n += 0 == memcmp (pi + 58, pj + 58, b - 58);
+  n += 0 == memcmp (pi + 59, pj + 59, b - 59);
+
+  assert (n == 20);
+}
+
+
+const int32_t A20[20] = { [13] = 14, [8] = 8, [4] = 4, [1] = 1  };
+const int32_t b20[20] = { 0, 1, 0, 0, 4, 0, 0, 0, 8, 0, 0, 0, 0, 13 };
+
+void gt_A20_b20 (void)
+{
+  int n = memcmp (A20, b20, sizeof A20) > 0;
+  assert (n == 1);
+}
+
+const int32_t a21[21] = { [13] = 12, [8] = 8, [4] = 4, [1] = 1  };
+const int32_t B21[21] = { 0, 1, 0, 0, 4, 0, 0, 0, 8, 0, 0, 0, 0, 13 };
+
+void lt_a21_B21 (void)
+{
+  int n = memcmp (a21, B21, sizeof a21) < 0;
+  assert (n == 1);
+}
+
+
+/* { dg-final { scan-tree-dump-not "abort" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/memcmp-3.c b/gcc/testsuite/gcc.dg/memcmp-3.c
new file mode 100644
index 00000000000..b5b8ac1209f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/memcmp-3.c
@@ -0,0 +1,349 @@ 
+/* PR middle-end/78257 - missing memcmp optimization with constant arrays
+   { dg-do compile }
+   { dg-options "-O -Wall -fdump-tree-optimized" }
+   { dg-skip-if "missing data representation" { "pdp11-*-*" } } */
+
+#define offsetof(T, m) __builtin_offsetof (T, m)
+
+typedef __INT8_TYPE__  int8_t;
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+typedef __INT64_TYPE__ int64_t;
+typedef __SIZE_TYPE__  size_t;
+
+extern int memcmp (const void*, const void*, size_t);
+
+const int32_t ia4[4] = { 0x11121314, 0x21222324, 0x31323334, 0x41424344 };
+const int32_t ia4_des[4] =
+  { [2] = 0x31323334, [0] = 0x11121314, 0x21222324, [3] = 0x41424344 };
+const char ia4_rep[] =
+  {
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   "\x11\x12\x13\x14" "\x21\x22\x23\x24"
+   "\x31\x32\x33\x34" "\x41\x42\x43\x44"
+#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+   "\x14\x13\x12\x11" "\x24\x23\x22\x21"
+   "\x34\x33\x32\x31" "\x44\x43\x42\x41"
+#endif
+  };
+
+void eq_ia4 (void)
+{
+  int n = 0, b = sizeof ia4;
+  const char *p = (const char*)ia4, *q = ia4_rep;
+
+  n += memcmp (p,      q,      b);
+  n += memcmp (p + 1,  q + 1,  b - 1);
+  n += memcmp (p + 2,  q + 2,  b - 2);
+  n += memcmp (p + 3,  q + 3,  b - 3);
+  n += memcmp (p + 4,  q + 4,  b - 4);
+  n += memcmp (p + 5,  q + 5,  b - 5);
+  n += memcmp (p + 6,  q + 6,  b - 6);
+  n += memcmp (p + 7,  q + 7,  b - 7);
+  n += memcmp (p + 8,  q + 8,  b - 8);
+  n += memcmp (p + 9,  q + 9,  b - 9);
+  n += memcmp (p + 10, q + 10, b - 10);
+  n += memcmp (p + 11, q + 11, b - 11);
+  n += memcmp (p + 12, q + 12, b - 12);
+  n += memcmp (p + 13, q + 13, b - 13);
+  n += memcmp (p + 14, q + 14, b - 14);
+  n += memcmp (p + 15, q + 15, b - 15);
+  n += memcmp (p + 16, q + 16, b - 16);
+
+  p = (const char*)ia4_des;
+
+  n += memcmp (p,      q,      b);
+  n += memcmp (p + 1,  q + 1,  b - 1);
+  n += memcmp (p + 2,  q + 2,  b - 2);
+  n += memcmp (p + 3,  q + 3,  b - 3);
+  n += memcmp (p + 4,  q + 4,  b - 4);
+  n += memcmp (p + 5,  q + 5,  b - 5);
+  n += memcmp (p + 6,  q + 6,  b - 6);
+  n += memcmp (p + 7,  q + 7,  b - 7);
+  n += memcmp (p + 8,  q + 8,  b - 8);
+  n += memcmp (p + 9,  q + 9,  b - 9);
+  n += memcmp (p + 10, q + 10, b - 10);
+  n += memcmp (p + 11, q + 11, b - 11);
+  n += memcmp (p + 12, q + 12, b - 12);
+  n += memcmp (p + 13, q + 13, b - 13);
+  n += memcmp (p + 14, q + 14, b - 14);
+  n += memcmp (p + 15, q + 15, b - 15);
+  n += memcmp (p + 16, q + 16, b - 16);
+
+  if (n != 0)
+    __builtin_abort ();
+}
+
+const float fa4[4] = { 1.0, 2.0, 3.0, 4.0 };
+const float fa4_des[4] = { [0] = fa4[0], [1] = 2.0, [2] = fa4[2], [3] = 4.0 };
+
+void eq_fa4 (void)
+{
+  int n = 0, b = sizeof fa4;
+  const char *p = (const char*)fa4, *q = (const char*)fa4_des;
+
+  n += memcmp (p,      q,      b);
+  n += memcmp (p + 1,  q + 1,  b - 1);
+  n += memcmp (p + 2,  q + 2,  b - 2);
+  n += memcmp (p + 3,  q + 3,  b - 3);
+  n += memcmp (p + 4,  q + 4,  b - 4);
+  n += memcmp (p + 5,  q + 5,  b - 5);
+  n += memcmp (p + 6,  q + 6,  b - 6);
+  n += memcmp (p + 7,  q + 7,  b - 7);
+  n += memcmp (p + 8,  q + 8,  b - 8);
+  n += memcmp (p + 9,  q + 9,  b - 9);
+  n += memcmp (p + 10, q + 10, b - 10);
+  n += memcmp (p + 11, q + 11, b - 11);
+  n += memcmp (p + 12, q + 12, b - 12);
+  n += memcmp (p + 13, q + 13, b - 13);
+  n += memcmp (p + 14, q + 14, b - 14);
+  n += memcmp (p + 15, q + 15, b - 15);
+  n += memcmp (p + 16, q + 16, b - 16);
+
+  if (n != 0)
+    __builtin_abort ();
+}
+
+/* Verify "greater than" comparison with the difference in the last byte.  */
+const char ia4_xrep_16[sizeof ia4] =
+  {
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   0x11, 0x12, 0x13, 0x14, 0x21, 0x22, 0x23, 0x24,
+   0x31, 0x32, 0x33, 0x34, 0x41, 0x42, 0x43
+#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+   0x14, 0x13, 0x12, 0x11, 0x24, 0x23, 0x22, 0x21,
+   0x34, 0x33, 0x32, 0x31, 0x44, 0x43, 0x42
+#endif
+  };
+
+void gt_ia4 (void)
+{
+  int n = 0, b = sizeof ia4;
+  const char *p = (const char*)ia4, *q = ia4_xrep_16;
+
+  n += 0 < memcmp (p,      q,      b);
+  n += 0 < memcmp (p + 1,  q + 1,  b - 1);
+  n += 0 < memcmp (p + 2,  q + 2,  b - 2);
+  n += 0 < memcmp (p + 3,  q + 3,  b - 3);
+  n += 0 < memcmp (p + 4,  q + 4,  b - 4);
+  n += 0 < memcmp (p + 5,  q + 5,  b - 5);
+  n += 0 < memcmp (p + 6,  q + 6,  b - 6);
+  n += 0 < memcmp (p + 7,  q + 7,  b - 7);
+  n += 0 < memcmp (p + 8,  q + 8,  b - 8);
+  n += 0 < memcmp (p + 9,  q + 9,  b - 9);
+  n += 0 < memcmp (p + 10, q + 10, b - 10);
+  n += 0 < memcmp (p + 11, q + 11, b - 11);
+  n += 0 < memcmp (p + 12, q + 12, b - 12);
+  n += 0 < memcmp (p + 13, q + 13, b - 13);
+  n += 0 < memcmp (p + 14, q + 14, b - 14);
+  n += 0 < memcmp (p + 15, q + 15, b - 15);
+
+  if (n != 16)
+    __builtin_abort ();
+}
+
+struct S8_16_32
+{
+  int8_t  i8;
+  int16_t i16;
+  int32_t i32;
+};
+
+_Static_assert (sizeof (struct S8_16_32) == 8);
+
+const struct S8_16_32 s8_16_32 = { 1, 0x2122, 0x31323334 };
+const struct S8_16_32 s8_16_32_des =
+  { .i8 = 1, .i16 = 0x2122, .i32 = 0x31323334 };
+
+const char s8_16_32_rep[] =
+  {
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   1, 0, 0x21, 0x22, 0x31, 0x32, 0x33, 0x34
+#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+   1, 0, 0x22, 0x21, 0x34, 0x33, 0x32, 0x31
+#endif
+  };
+
+void eq_s8_16_32 (void)
+{
+  int n = 0, b = sizeof s8_16_32;
+  const char *p = (char*)&s8_16_32, *q = s8_16_32_rep;
+
+  n += memcmp (p,     q,     b);
+  n += memcmp (p + 1, q + 1, b - 1);
+  n += memcmp (p + 2, q + 2, b - 2);
+  n += memcmp (p + 3, q + 3, b - 3);
+  n += memcmp (p + 4, q + 4, b - 4);
+  n += memcmp (p + 5, q + 5, b - 5);
+  n += memcmp (p + 6, q + 6, b - 6);
+  n += memcmp (p + 7, q + 7, b - 7);
+
+  p = (char*)&s8_16_32_des;
+
+  n += memcmp (p,     q,     b);
+  n += memcmp (p + 1, q + 1, b - 1);
+  n += memcmp (p + 2, q + 2, b - 2);
+  n += memcmp (p + 3, q + 3, b - 3);
+  n += memcmp (p + 4, q + 4, b - 4);
+  n += memcmp (p + 5, q + 5, b - 5);
+  n += memcmp (p + 6, q + 6, b - 6);
+  n += memcmp (p + 7, q + 7, b - 7);
+
+  if (n != 0)
+    __builtin_abort ();
+}
+
+
+struct S8_16_32_64
+{
+  /*  0 */ int8_t   i8;
+  /*  1 */ int8_t:  1;
+  /*  2 */ int16_t  i16;
+  /*  4 */ int32_t: 1;
+  /*  8 */ int32_t  i32;
+  /* 12 */ int32_t: 1;
+  /* 16 */ int64_t  i64;
+  /* 24 */ int8_t:  0;
+};
+
+_Static_assert (offsetof (struct S8_16_32_64, i16) == 2);
+_Static_assert (offsetof (struct S8_16_32_64, i32) == 8);
+_Static_assert (offsetof (struct S8_16_32_64, i64) == 16);
+_Static_assert (sizeof (struct S8_16_32_64) == 24);
+
+const struct S8_16_32_64 s8_16_32_64 =
+  { 1, 0x2122, 0x31323334, 0x4142434445464748LLU };
+
+const char s8_16_32_64_rep[sizeof s8_16_32_64] =
+  {
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   "\x01" "\x00" "\x21\x22" "\x00\x00\x00\x00" "\x31\x32\x33\x34"
+   "\x00\x00\x00\x00" "\x41\x42\x43\x44\x45\x46\x47\x48"
+#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+   "\x01" "\x00" "\x22\x21" "\x00\x00\x00\x00" "\x34\x33\x32\x31"
+   "\x00\x00\x00\x00" "\x48\x47\x46\x45\x44\x43\x42\x41"
+#endif
+  };
+
+const struct S8_16_32_64 s8_16_32_64_des =
+  { .i64 = 0x4142434445464748LLU, .i16 = 0x2122, .i32 = 0x31323334, .i8 = 1 };
+
+
+void eq_8_16_32_64 (void)
+{
+  int n = 0, b = sizeof s8_16_32_64;
+  const char *p = (char*)&s8_16_32_64, *q = s8_16_32_64_rep;
+
+  n += memcmp (p, q, b);
+  n += memcmp (p + 1,  q + 1,  b - 1);
+  n += memcmp (p + 2,  q + 2,  b - 2);
+  n += memcmp (p + 3,  q + 3,  b - 3);
+  n += memcmp (p + 4,  q + 4,  b - 4);
+  n += memcmp (p + 5,  q + 5,  b - 5);
+  n += memcmp (p + 6,  q + 6,  b - 6);
+  n += memcmp (p + 7,  q + 7,  b - 7);
+  n += memcmp (p + 8,  q + 8,  b - 8);
+  n += memcmp (p + 9,  q + 9,  b - 9);
+  n += memcmp (p + 10, q + 10, b - 10);
+  n += memcmp (p + 11, q + 11, b - 11);
+  n += memcmp (p + 12, q + 12, b - 12);
+  n += memcmp (p + 13, q + 13, b - 13);
+  n += memcmp (p + 14, q + 14, b - 14);
+  n += memcmp (p + 15, q + 15, b - 15);
+  n += memcmp (p + 16, q + 16, b - 16);
+  n += memcmp (p + 17, q + 17, b - 17);
+  n += memcmp (p + 18, q + 18, b - 18);
+  n += memcmp (p + 19, q + 19, b - 19);
+  n += memcmp (p + 20, q + 20, b - 20);
+  n += memcmp (p + 21, q + 21, b - 21);
+  n += memcmp (p + 22, q + 22, b - 22);
+  n += memcmp (p + 23, q + 23, b - 23);
+
+  p = (char*)&s8_16_32_64_des;
+
+  n += memcmp (p, q, b);
+  n += memcmp (p + 1,  q + 1,  b - 1);
+  n += memcmp (p + 2,  q + 2,  b - 2);
+  n += memcmp (p + 3,  q + 3,  b - 3);
+  n += memcmp (p + 4,  q + 4,  b - 4);
+  n += memcmp (p + 5,  q + 5,  b - 5);
+  n += memcmp (p + 6,  q + 6,  b - 6);
+  n += memcmp (p + 7,  q + 7,  b - 7);
+  n += memcmp (p + 8,  q + 8,  b - 8);
+  n += memcmp (p + 9,  q + 9,  b - 9);
+  n += memcmp (p + 10, q + 10, b - 10);
+  n += memcmp (p + 11, q + 11, b - 11);
+  n += memcmp (p + 12, q + 12, b - 12);
+  n += memcmp (p + 13, q + 13, b - 13);
+  n += memcmp (p + 14, q + 14, b - 14);
+  n += memcmp (p + 15, q + 15, b - 15);
+  n += memcmp (p + 16, q + 16, b - 16);
+  n += memcmp (p + 17, q + 17, b - 17);
+  n += memcmp (p + 18, q + 18, b - 18);
+  n += memcmp (p + 19, q + 19, b - 19);
+  n += memcmp (p + 20, q + 20, b - 20);
+  n += memcmp (p + 21, q + 21, b - 21);
+  n += memcmp (p + 22, q + 22, b - 22);
+  n += memcmp (p + 23, q + 23, b - 23);
+
+  if (n != 0)
+    __builtin_abort ();
+}
+
+struct S64_x_3
+{
+  int64_t i64a[3];
+};
+
+_Static_assert (sizeof (struct S64_x_3) == 24);
+
+const struct S64_x_3 s64_x_3 =
+  { { 0x0000000021220001LLU, 0x0000000031323334LLU, 0x4142434445464748LLU } };
+
+const char s64_x_3_rep[sizeof s64_x_3] =
+  {
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   "\x00\x00\x00\x00\x21\x22\x00\x01"
+   "\x00\x00\x00\x00\x31\x32\x33\x34"
+   "\x41\x42\x43\x44\x45\x46\x47\x48"
+#elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+   "\x01\x00\x22\x21\x00\x00\x00\x00"
+   "\x34\x33\x32\x31\x00\x00\x00\x00"
+   "\x48\x47\x46\x45\x44\x43\x42\x41"
+#endif
+  };
+
+void eq_64_x_3 (void)
+{
+  int n = 0, b = sizeof s64_x_3;
+  const char *p = (char*)&s64_x_3, *q = s64_x_3_rep;
+  n += memcmp (p, q, b);
+  n += memcmp (p + 1,  q + 1,  b - 1);
+  n += memcmp (p + 2,  q + 2,  b - 2);
+  n += memcmp (p + 3,  q + 3,  b - 3);
+  n += memcmp (p + 4,  q + 4,  b - 4);
+  n += memcmp (p + 5,  q + 5,  b - 5);
+  n += memcmp (p + 6,  q + 6,  b - 6);
+  n += memcmp (p + 7,  q + 7,  b - 7);
+  n += memcmp (p + 8,  q + 8,  b - 8);
+  n += memcmp (p + 9,  q + 9,  b - 9);
+  n += memcmp (p + 10, q + 10, b - 10);
+  n += memcmp (p + 11, q + 11, b - 11);
+  n += memcmp (p + 12, q + 12, b - 12);
+  n += memcmp (p + 13, q + 13, b - 13);
+  n += memcmp (p + 14, q + 14, b - 14);
+  n += memcmp (p + 15, q + 15, b - 15);
+  n += memcmp (p + 16, q + 16, b - 16);
+  n += memcmp (p + 17, q + 17, b - 17);
+  n += memcmp (p + 18, q + 18, b - 18);
+  n += memcmp (p + 19, q + 19, b - 19);
+  n += memcmp (p + 20, q + 20, b - 20);
+  n += memcmp (p + 21, q + 21, b - 21);
+  n += memcmp (p + 22, q + 22, b - 22);
+  n += memcmp (p + 23, q + 23, b - 23);
+
+  if (n != 0)
+    __builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump-not "abort" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/memcmp-4.c b/gcc/testsuite/gcc.dg/memcmp-4.c
new file mode 100644
index 00000000000..bbac7197501
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/memcmp-4.c
@@ -0,0 +1,85 @@ 
+/* PR middle-end/78257 - missing memcmp optimization with constant arrays
+   { dg-do compile }
+   { dg-options "-O -Wall -fdump-tree-optimized" } */
+
+typedef __INT8_TYPE__  int8_t;
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+typedef __SIZE_TYPE__  size_t;
+
+extern int memcmp (const void*, const void*, size_t);
+
+/* Verify that initializers for flexible array members are handled
+   correctly.  */
+
+struct Si16_x
+{
+  int16_t n, a[];
+};
+
+const struct Si16_x si16_4 =
+  {
+   0x1112, { 0x2122, 0x3132, 0x4142 }
+  };
+
+const char si16_4_rep[] =
+  {
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+   0x11, 0x12, 0x21, 0x22, 0x31, 0x32, 0x41, 0x42
+#else
+   0x12, 0x11, 0x22, 0x21, 0x32, 0x31, 0x42, 0x41
+#endif
+  };
+
+void eq_si16_x (void)
+{
+  int n = 0, b = sizeof si16_4_rep;
+  const char *p = (const char*)&si16_4, *q = si16_4_rep;
+
+  n += memcmp (p,      q,      b);
+  n += memcmp (p + 1,  q + 1,  b - 1);
+  n += memcmp (p + 2,  q + 2,  b - 2);
+  n += memcmp (p + 3,  q + 3,  b - 3);
+  n += memcmp (p + 4,  q + 4,  b - 4);
+  n += memcmp (p + 5,  q + 5,  b - 5);
+  n += memcmp (p + 6,  q + 6,  b - 6);
+  n += memcmp (p + 7,  q + 7,  b - 7);
+  n += memcmp (p + 8,  q + 8,  b - 8);
+
+  p = (const char*)&si16_4.n;
+
+  n += memcmp (p,      q,          b);
+  n += memcmp (p + 1,  q + 1,  b - 1);
+  n += memcmp (p + 2,  q + 2,  b - 2);
+  n += memcmp (p + 3,  q + 3,  b - 3);
+  n += memcmp (p + 4,  q + 4,  b - 4);
+  n += memcmp (p + 5,  q + 5,  b - 5);
+  n += memcmp (p + 6,  q + 6,  b - 6);
+  n += memcmp (p + 7,  q + 7,  b - 7);
+  n += memcmp (p + 8,  q + 8,  b - 8);
+
+  p = (const char*)si16_4.a;
+  q = si16_4_rep + 2;
+
+  n += memcmp (p,      q,      b - 2);
+  n += memcmp (p + 1,  q + 1,  b - 3);
+  n += memcmp (p + 2,  q + 2,  b - 4);
+  n += memcmp (p + 3,  q + 3,  b - 5);
+  n += memcmp (p + 4,  q + 4,  b - 6);
+  n += memcmp (p + 5,  q + 5,  b - 7);
+  n += memcmp (p + 6,  q + 6,  b - 8);
+
+  p = (const char*)&si16_4.a[1];
+  q = si16_4_rep + 4;
+
+  n += memcmp (p,      q,      b - 4);
+  n += memcmp (p + 1,  q + 1,  b - 5);
+  n += memcmp (p + 2,  q + 2,  b - 6);
+  n += memcmp (p + 3,  q + 3,  b - 7);
+  n += memcmp (p + 4,  q + 4,  b - 8);
+
+  if (n != 0)
+    __builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump-not "abort" "optimized" } } */
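
For readers following the array case in convert_to_bytes above, here is a small standalone model of the zero-filling step. It is only a sketch written against plain C, not GCC internals: struct elt and flatten_int32_array are hypothetical names standing in for CONSTRUCTOR elements and the recursive walk, and it assumes the explicit initializers arrive in increasing index order and that elements are int32_t in the host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one CONSTRUCTOR element of an int32_t
   array: the element index and its initializer value.  */
struct elt
{
  size_t idx;
  int32_t val;
};

/* Store in BUF the byte representation of an NELTS-element int32_t
   array whose explicit initializers are INIT[0 .. NINIT - 1], given
   in strictly increasing index order.  Elements with no initializer
   are filled with zeros, as is the tail of the array.  Return the
   number of bytes stored, or 0 if BUF is too small or an index is
   out of range or out of order.  */
size_t
flatten_int32_array (const struct elt *init, size_t ninit, size_t nelts,
		     unsigned char *buf, size_t bufsize)
{
  size_t total = nelts * sizeof (int32_t);
  if (bufsize < total)
    return 0;

  /* Number of bytes of BUF filled in so far.  */
  size_t len = 0;
  for (size_t i = 0; i < ninit; i++)
    {
      if (init[i].idx >= nelts)
	return 0;

      size_t off = init[i].idx * sizeof (int32_t);
      if (off < len)
	return 0;

      /* Append zeros for the elements skipped since the last one.  */
      memset (buf + len, 0, off - len);

      /* Append the host representation of the value itself.  */
      memcpy (buf + off, &init[i].val, sizeof (int32_t));
      len = off + sizeof (int32_t);
    }

  /* Zero-fill up to the full size of the array type.  */
  memset (buf + len, 0, total - len);
  assert (len <= total);
  return total;
}
```

Feeding it the four designated initializers of a19 from memcmp-2.c ([1] = 1, [4] = 4, [8] = 8, [13] = 13) with NELTS of 19 should yield a buffer that compares equal under memcmp to the object b19, which is the equality the folded calls in eq_a19_b19 rely on.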