update Zero-length array documentation

Message ID 24e74cf0-b406-a995-b642-15020a4bb697@gmail.com
State Superseded

Commit Message

Martin Sebor June 5, 2018, 10:31 p.m.
Following the brief discussion "Re: aliasing between internal
zero-length-arrays and other members" [*], the attached patch
updates the documentation of Zero-length arrays to clarify
that element accesses are valid only when the array is
the last member of a struct.

I also took the liberty of mentioning a couple more details
that may be of interest to users: namely their alignment and
size, and the handling of one-element arrays.

As I mentioned in the strnlen patch I just submitted, I'd like
to add support for detecting the misuses of zero-length arrays
but I thought I'd post this one first to help flesh out
the implementation of the warning.

Martin

[*] https://gcc.gnu.org/ml/gcc/2018-06/msg00046.html
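
For readers without the manual at hand, the one use the patch describes as valid (a zero-length array as the last member of a header struct) can be sketched roughly as follows. This is an illustration, not text from the patch: the member types and the helper make_line are assumptions, and only the names struct line, length and contents come from the quoted hunk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* GNU C extension: a zero-length array as the LAST member of a struct
   that acts as a header for a variable-length object.  Member types
   are assumed for this sketch. */
struct line {
  int length;
  char contents[0];
};

/* Hypothetical helper: allocate a line holding a copy of TEXT
   (without the terminating NUL).  Returns NULL if malloc fails. */
struct line *make_line(const char *text)
{
  assert(text != NULL);
  size_t n = strlen(text);

  /* The array adds nothing to sizeof, so the payload is simply
     appended to the header size. */
  struct line *l = malloc(sizeof(struct line) + n);
  if (l == NULL)
    return NULL;

  l->length = (int)n;
  memcpy(l->contents, text, n);   /* storage that follows the header */
  return l;
}
```

A caller would use make_line("hello") and later free() the result. Because zero-length arrays are an extension, a pedantic build is expected to complain about the declaration.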

Comments

Richard Sandiford June 11, 2018, 6:08 p.m. | #1
Martin Sebor <msebor@gmail.com> writes:
> @@ -1553,12 +1553,28 @@ struct line *thisline = (struct line *)
>  thisline->length = this_length;
>  @end smallexample
>
> -In ISO C90, you would have to give @code{contents} a length of 1, which
> -means either you waste space or complicate the argument to @code{malloc}.
> +Although the size of a zero-length array is zero, an array member of
> +this kind may increase the size the enclosing type as a result of tail
> +padding.  The offset of a zero-length array member from the beginning
> +of the enclosing structure is the same as the offset of an array with
> +one or more elements of the same type.  The alignment of a zero-length
> +array is the same as the alignment of its elements.
>
> -In ISO C99, you would use a @dfn{flexible array member}, which is
> -slightly different in syntax and semantics:
> +Declaring zero-length arrays in other contexts, including as interior
> +members of structure objects or as non-member objects is discouraged.
> +Accessing elements of zero-length arrays declared in such contexts is
> +undefined and may be diagnosed.
>
> +In the absence of the zero-length array extension, in ISO C90 the
> +@code{contents} array in the example above would typically be declared
> +to have a single element.  Although this technique is discouraged, GCC
> +handles trailing one-element array members similarly to zero-length
> +arrays.

This last sentence seems a bit vague.  E.g. GCC should never diagnose
an access to element 0 of a 1-element trailing array, whereas (like you
say above) it might for zero-length trailing arrays.

Thanks,
Richard

Martin Sebor June 11, 2018, 7:47 p.m. | #2
On 06/11/2018 12:08 PM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> @@ -1553,12 +1553,28 @@ struct line *thisline = (struct line *)
>>  thisline->length = this_length;
>>  @end smallexample
>>
>> -In ISO C90, you would have to give @code{contents} a length of 1, which
>> -means either you waste space or complicate the argument to @code{malloc}.
>> +Although the size of a zero-length array is zero, an array member of
>> +this kind may increase the size the enclosing type as a result of tail
>> +padding.  The offset of a zero-length array member from the beginning
>> +of the enclosing structure is the same as the offset of an array with
>> +one or more elements of the same type.  The alignment of a zero-length
>> +array is the same as the alignment of its elements.
>>
>> -In ISO C99, you would use a @dfn{flexible array member}, which is
>> -slightly different in syntax and semantics:
>> +Declaring zero-length arrays in other contexts, including as interior
>> +members of structure objects or as non-member objects is discouraged.
>> +Accessing elements of zero-length arrays declared in such contexts is
>> +undefined and may be diagnosed.
>>
>> +In the absence of the zero-length array extension, in ISO C90 the
>> +@code{contents} array in the example above would typically be declared
>> +to have a single element.  Although this technique is discouraged, GCC
>> +handles trailing one-element array members similarly to zero-length
>> +arrays.
>
> This last sentence seems a bit vague.  E.g. GCC should never diagnose
> an access to element 0 of a 1-element trailing array, whereas (like you
> say above) it might for zero-length trailing arrays.

GCC doesn't diagnose past-the-end accesses to trailing member
arrays regardless of their size.  I don't think it should start
diagnosing them for zero-length arrays, and probably not even
for one-element arrays (at least not in C90 mode).  I think in
the future it would be worthwhile to consider diagnosing past-
the-end accesses to trailing member arrays of two or more
elements, but this patch isn't meant to suggest it's done yet.
At the same time, I do want to leave the door open to diagnosing
such accesses eventually, so I don't want to go into too much
detail describing exactly what is and what isn't diagnosed today.

That said, I'm all for improving the text, so if something should
be clearer or if you have improvements to suggest, please do let
me know.

Thanks
Martin
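
As background for the one-element case discussed above, here is a minimal sketch of the ISO C90 idiom, in which the trailing array is declared with a single element and indexed past its declared bound. The type line90 and the helper make_line90 are hypothetical names, and clamping the allocation to at least sizeof(struct line90) is a conservative choice made for this sketch, not something the thread prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* ISO C90 idiom: the trailing array is declared with one element. */
struct line90 {
  int length;
  char contents[1];
};

/* Hypothetical helper: allocate a line90 holding a copy of TEXT
   (without the terminating NUL).  Returns NULL if malloc fails. */
struct line90 *make_line90(const char *text)
{
  assert(text != NULL);
  size_t n = strlen(text);

  /* sizeof(struct line90) already counts one element plus any tail
     padding, so the size is computed from the member's offset to
     avoid paying for that element twice. */
  size_t bytes = offsetof(struct line90, contents) + n;
  if (bytes < sizeof(struct line90))
    bytes = sizeof(struct line90);   /* never allocate less than the type */

  struct line90 *l = malloc(bytes);
  if (l == NULL)
    return NULL;

  l->length = (int)n;
  /* For n > 1 this writes beyond the declared bound of contents,
     which is the kind of past-the-end access under discussion. */
  memcpy(l->contents, text, n);
  return l;
}
```

The offsetof computation is one example of the more complicated malloc argument that the original manual text alludes to.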

Richard Sandiford June 11, 2018, 8:03 p.m. | #3
Martin Sebor <msebor@gmail.com> writes:
> On 06/11/2018 12:08 PM, Richard Sandiford wrote:
>> Martin Sebor <msebor@gmail.com> writes:
>>> @@ -1553,12 +1553,28 @@ struct line *thisline = (struct line *)
>>>  thisline->length = this_length;
>>>  @end smallexample
>>>
>>> -In ISO C90, you would have to give @code{contents} a length of 1, which
>>> -means either you waste space or complicate the argument to @code{malloc}.
>>> +Although the size of a zero-length array is zero, an array member of
>>> +this kind may increase the size the enclosing type as a result of tail
>>> +padding.  The offset of a zero-length array member from the beginning
>>> +of the enclosing structure is the same as the offset of an array with
>>> +one or more elements of the same type.  The alignment of a zero-length
>>> +array is the same as the alignment of its elements.
>>>
>>> -In ISO C99, you would use a @dfn{flexible array member}, which is
>>> -slightly different in syntax and semantics:
>>> +Declaring zero-length arrays in other contexts, including as interior
>>> +members of structure objects or as non-member objects is discouraged.
>>> +Accessing elements of zero-length arrays declared in such contexts is
>>> +undefined and may be diagnosed.
>>>
>>> +In the absence of the zero-length array extension, in ISO C90 the
>>> +@code{contents} array in the example above would typically be declared
>>> +to have a single element.  Although this technique is discouraged, GCC
>>> +handles trailing one-element array members similarly to zero-length
>>> +arrays.
>>
>> This last sentence seems a bit vague.  E.g. GCC should never diagnose
>> an access to element 0 of a 1-element trailing array, whereas (like you
>> say above) it might for zero-length trailing arrays.
>
> GCC doesn't diagnose past-the-end accesses to trailing member
> arrays regardless of their size.  I don't think it should start
> diagnosing them for zero-length arrays, and probably not even
> for one-element arrays (at least not in C90 mode).  I think in
> the future it would be worthwhile to consider diagnosing past-
> the-end accesses to trailing member arrays of two or more
> elements, but this patch isn't meant to suggest it's done yet.
> At the same time, I do want to leave the door open to diagnosing
> such accesses eventually, so I don't want to go into too much
> detail describing exactly what is and what isn't diagnosed today.

Yeah, agree that we shouldn't go into detail about what is
and isn't diagnosed.  I just don't think we should claim that
one-element arrays are treated "similarly" to zero-length arrays
without saying what "similarly" means.  They're certainly different
at some level, otherwise the extension wouldn't have been added :-)

If we don't want to give much more detail about this then I think it
would be better to keep the original paragraph:

  In ISO C90, you would have to give @code{contents} a length of 1, which
  means either you waste space or complicate the argument to @code{malloc}.

instead of the last one.  (The other changes look good to me FWIW.)

Noticed later,

> +The preferred mechanism to declare variable-length types like
> +@code{struct line} above is the ISO C99 @dfn{flexible array member},
> +with slightly different in syntax and semantics:

Don't think the s/which is/with/ change is correct here.

Thanks,
Richard
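
For comparison, the ISO C99 flexible array member that the patch calls the preferred mechanism could be used along these lines. This is a sketch under assumptions: line_fam and make_line_fam are hypothetical names, and the member types are assumed rather than taken from the manual.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ISO C99 flexible array member: written as contents[] with no
   length, and only permitted as the last member of the struct. */
struct line_fam {
  int length;
  char contents[];
};

/* Hypothetical helper: allocate a line_fam holding a copy of TEXT
   (without the terminating NUL).  Returns NULL if malloc fails. */
struct line_fam *make_line_fam(const char *text)
{
  assert(text != NULL);
  size_t n = strlen(text);

  /* sizeof excludes the flexible member, so the payload size is
     added explicitly. */
  struct line_fam *l = malloc(sizeof(struct line_fam) + n);
  if (l == NULL)
    return NULL;

  l->length = (int)n;
  memcpy(l->contents, text, n);
  return l;
}
```

Unlike the zero-length form, this declaration is standard C99, so it does not depend on the GNU extension.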

Martin Sebor June 11, 2018, 9:05 p.m. | #4
On 06/11/2018 02:03 PM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> On 06/11/2018 12:08 PM, Richard Sandiford wrote:
>>> Martin Sebor <msebor@gmail.com> writes:
>>>> @@ -1553,12 +1553,28 @@ struct line *thisline = (struct line *)
>>>>  thisline->length = this_length;
>>>>  @end smallexample
>>>>
>>>> -In ISO C90, you would have to give @code{contents} a length of 1, which
>>>> -means either you waste space or complicate the argument to @code{malloc}.
>>>> +Although the size of a zero-length array is zero, an array member of
>>>> +this kind may increase the size the enclosing type as a result of tail
>>>> +padding.  The offset of a zero-length array member from the beginning
>>>> +of the enclosing structure is the same as the offset of an array with
>>>> +one or more elements of the same type.  The alignment of a zero-length
>>>> +array is the same as the alignment of its elements.
>>>>
>>>> -In ISO C99, you would use a @dfn{flexible array member}, which is
>>>> -slightly different in syntax and semantics:
>>>> +Declaring zero-length arrays in other contexts, including as interior
>>>> +members of structure objects or as non-member objects is discouraged.
>>>> +Accessing elements of zero-length arrays declared in such contexts is
>>>> +undefined and may be diagnosed.
>>>>
>>>> +In the absence of the zero-length array extension, in ISO C90 the
>>>> +@code{contents} array in the example above would typically be declared
>>>> +to have a single element.  Although this technique is discouraged, GCC
>>>> +handles trailing one-element array members similarly to zero-length
>>>> +arrays.
>>>
>>> This last sentence seems a bit vague.  E.g. GCC should never diagnose
>>> an access to element 0 of a 1-element trailing array, whereas (like you
>>> say above) it might for zero-length trailing arrays.
>>
>> GCC doesn't diagnose past-the-end accesses to trailing member
>> arrays regardless of their size.  I don't think it should start
>> diagnosing them for zero-length arrays, and probably not even
>> for one-element arrays (at least not in C90 mode).  I think in
>> the future it would be worthwhile to consider diagnosing past-
>> the-end accesses to trailing member arrays of two or more
>> elements, but this patch isn't meant to suggest it's done yet.
>> At the same time, I do want to leave the door open to diagnosing
>> such accesses eventually, so I don't want to go into too much
>> detail describing exactly what is and what isn't diagnosed today.
>
> Yeah, agree that we shouldn't go into detail about what is
> and isn't diagnosed.  I just don't think we should claim that
> one-element arrays are treated "similarly" to zero-length arrays
> without saying what "similarly" means.  They're certainly different
> at some level, otherwise the extension wouldn't have been added :-)
>
> If we don't want to give much more detail about this then I think it
> would be better to keep the original paragraph:
>
>   In ISO C90, you would have to give @code{contents} a length of 1, which
>   means either you waste space or complicate the argument to @code{malloc}.
>
> instead of the last one.  (The other changes look good to me FWIW.)

I changed it because the text seemed both vague (why does it waste
space and what complication to the malloc call would avoid wasting
it?) and superfluous (it seems obvious that a one-element array
takes up at least 1 unit worth of space, and that allocating less
would mean changing the malloc argument).  But perhaps it would
be helpful to be explicit about the allocation size while still
saying as little as possible about diagnostics.  How about this:

   In the absence of the zero-length array extension, in ISO C90
   the @code{contents} array in the example above would typically
   be declared to have a single element.  Unlike a zero-length
   array, which only contributes to the size of the enclosing
   structure for the purposes of alignment, a one-element array
   always occupies at least as much space as a single object of
   the type.  Although using one-element arrays this way is
   discouraged, GCC handles accesses to trailing one-element
   array members analogously to zero-length arrays.

>
> Noticed later,
>
>> +The preferred mechanism to declare variable-length types like
>> +@code{struct line} above is the ISO C99 @dfn{flexible array member},
>> +with slightly different in syntax and semantics:
>
> Don't think the s/which is/with/ change is correct here.

Thanks, fixed.

Martin
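
The size, offset and alignment statements in the proposed wording can be checked with a small sketch like the one below. The types zero_tail and one_tail and the accessor functions are hypothetical, chosen so that the trailing array's element type (double) has a stricter alignment than the leading member; the static assertions assume a C11 or later GNU C compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: the same header with a zero-length tail (GNU C
   extension) and with a one-element tail (the ISO C90 idiom). */
struct zero_tail {
  char tag;
  double vals[0];
};

struct one_tail {
  char tag;
  double vals[1];
};

/* The zero-length member itself has size zero... */
static_assert(sizeof(((struct zero_tail *)0)->vals) == 0,
              "a zero-length array has size zero");

/* ...and sits at the same offset as an array with one element. */
static_assert(offsetof(struct zero_tail, vals)
              == offsetof(struct one_tail, vals),
              "offset does not depend on the element count");

/* Accessors so the layout can be inspected at run time. */
size_t zero_member_offset(void) { return offsetof(struct zero_tail, vals); }
size_t zero_struct_size(void)   { return sizeof(struct zero_tail); }
size_t one_struct_size(void)    { return sizeof(struct one_tail); }
```

On a target where double is more strictly aligned than char, zero_struct_size() is larger than sizeof(char) purely because of padding, while one_struct_size() additionally includes room for one double.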

Patch

gcc/ChangeLog:

	* doc/extend.texi (Zero-length arrays): Update and clarify.

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 261207)
+++ gcc/doc/extend.texi	(working copy)
@@ -1538,9 +1538,9 @@  defined when these address spaces are supported.
 @cindex length-zero arrays
 @cindex flexible array members
 
-Zero-length arrays are allowed in GNU C@.  They are very useful as the
-last element of a structure that is really a header for a variable-length
-object:
+Declaring zero-length arrays is allowed in GNU C as an extension@.
+A zero-length array can be useful as the last element of a structure
+that is really a header for a variable-length object:
 
 @smallexample
 struct line @{
@@ -1553,12 +1553,28 @@  struct line *thisline = (struct line *)
 thisline->length = this_length;
 @end smallexample
 
-In ISO C90, you would have to give @code{contents} a length of 1, which
-means either you waste space or complicate the argument to @code{malloc}.
+Although the size of a zero-length array is zero, an array member of
+this kind may increase the size the enclosing type as a result of tail
+padding.  The offset of a zero-length array member from the beginning
+of the enclosing structure is the same as the offset of an array with
+one or more elements of the same type.  The alignment of a zero-length
+array is the same as the alignment of its elements.
 
-In ISO C99, you would use a @dfn{flexible array member}, which is
-slightly different in syntax and semantics:
+Declaring zero-length arrays in other contexts, including as interior
+members of structure objects or as non-member objects is discouraged.
+Accessing elements of zero-length arrays declared in such contexts is
+undefined and may be diagnosed.
 
+In the absence of the zero-length array extension, in ISO C90 the
+@code{contents} array in the example above would typically be declared
+to have a single element.  Although this technique is discouraged, GCC
+handles trailing one-element array members similarly to zero-length
+arrays.
+
+The preferred mechanism to declare variable-length types like
+@code{struct line} above is the ISO C99 @dfn{flexible array member},
+with slightly different in syntax and semantics:
+
 @itemize @bullet
 @item
 Flexible array members are written as @code{contents[]} without