[04/13] libctf, ld: prohibit doing things with forwards prohibited in C

Message ID 20201218195147.378720-5-nick.alcock@oracle.com
State New
Headers show
Series
  • CTF dumper improvements, a lookup testsuite, and bugfixes
Related show

Commit Message

Andreas Krebbel via Binutils Dec. 18, 2020, 7:51 p.m.
C allows you to do only a very few things with entities of incomplete
type (as opposed to pointers to them): make pointers to them and give
them cv-quals, roughly. In particular you can't sizeof them and you
can't get their alignment.

While we explicitly do not impose all the requirements the standard
imposes on CTF users, and it *is* still desirable to be able to get size
and alignment info for incomplete array types (because of their use as
the last member of structures), it is meaningless to ask for the size or
alignment of forwards.  Worse, libctf didn't prohibit this and returned
nonsense from internal implementation details when you asked (it
returned the kind of the pointed-to type as both the size and alignment,
because forwards reuse ctt_type as a type kind, and ctt_type and
ctt_size overlap).  So introduce a new error, ECTF_INCOMPLETE, which is
returned when you try to get the size or alignment of forwards: we also
return it when you try to construct CTF dicts that do invalid things
with forwards, namely adding a forward member to a struct or union or
making an array of elements of forward type.

The dumper will not emit size or alignment info for forwards any more.

(This should not be an API break since ctf_type_size and ctf_type_align
could both return errors before now: any code that isn't expecting error
returns is already potentially broken.)

include/ChangeLog
2020-12-08  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ECTF_INCOMPLETE): New.
	(ECTF_NERR): Adjust.

ld/ChangeLog
2020-12-08  Nick Alcock  <nick.alcock@oracle.com>

	* testsuite/ld-ctf/conflicting-cycle-1.parent.d: Adjust for dumper
	changes.
	* testsuite/ld-ctf/cross-tu-cyclic-conflicting.d: Likewise.
	* testsuite/ld-ctf/forward.c: New test...
	* testsuite/ld-ctf/forward.d: ... and results.

libctf/ChangeLog
2020-12-08  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-types.c (ctf_type_resolve): Improve comment.
	(ctf_type_size): Return ECTF_INCOMPLETE when applied to forwards.
	(ctf_type_align): Likewise.
	* ctf-create.c (ctf_add_array): Likewise.
	(ctf_add_member_offset): Likewise.
	* ctf-dump.c (ctf_dump_format_type): Do not try to print the size of
	forwards.
	(ctf_dump_member): Do not try to print their alignment.
---
 include/ctf-api.h                             |  5 ++--
 .../ld-ctf/conflicting-cycle-1.parent.d       |  4 ++--
 .../ld-ctf/cross-tu-cyclic-conflicting.d      |  4 ++--
 ld/testsuite/ld-ctf/forward.c                 |  2 ++
 ld/testsuite/ld-ctf/forward.d                 | 23 +++++++++++++++++++
 libctf/ctf-create.c                           | 20 ++++++++++++++++
 libctf/ctf-dump.c                             | 17 +++++++++++---
 libctf/ctf-types.c                            | 13 ++++++++++-
 8 files changed, 78 insertions(+), 10 deletions(-)
 create mode 100644 ld/testsuite/ld-ctf/forward.c
 create mode 100644 ld/testsuite/ld-ctf/forward.d

-- 
2.29.2.250.g8336e49d6f.dirty

Comments

Andreas Krebbel via Binutils Dec. 27, 2020, 5:43 p.m. | #1
On 18 Dec 2020, Nick Alcock via Binutils uttered the following:

> C allows you to do only a very few things with entities of incomplete

> type (as opposed to pointers to them): make pointers to them and give

> them cv-quals, roughly. In particular you can't sizeof them and you

> can't get their alignment.

>

> While we explicitly do not impose all the requirements the standard

> imposes on CTF users, and it *is* still desirable to be able to get size

> and alignment info for incomplete array types (because of their use as

> the last member of structures), it is meaningless to ask for the size or

> alignment of forwards.  Worse, libctf didn't prohibit this and returned

> nonsense from internal implementation details when you asked (it

> returned the kind of the pointed-to type as both the size and alignment,

> because forwards reuse ctt_type as a type kind, and ctt_type and

> ctt_size overlap).  So introduce a new error, ECTF_INCOMPLETE, which is

> returned when you try to get the size or alignment of forwards: we also

> return it when you try to construct CTF dicts that do invalid things

> with forwards, namely adding a forward member to a struct or union or

> making an array of elements of forward type.


... aaand this one is firing on real code. I'll figure out why tomorrow.
I suspect it's falsely firing in the one case C permits, incomplete
array types as the last member of a struct (though I tested this case
and it seemed to work).
Andreas Krebbel via Binutils Dec. 28, 2020, 2:04 a.m. | #2
On 27 Dec 2020, Nick Alcock via Binutils uttered the following:

> On 18 Dec 2020, Nick Alcock via Binutils uttered the following:

>

>> returned when you try to get the size or alignment of forwards: we also

>> return it when you try to construct CTF dicts that do invalid things

>> with forwards, namely adding a forward member to a struct or union or

>> making an array of elements of forward type.

>

> ... aaand this one is firing on real code. I'll figure out why tomorrow.


So I couldn't stop thinking about this.  The problem is intrinsic to the
design of the deduplicator and will take a format rev to fix properly,
but we can do a good-enough fix that will get things working again.

The problem is this. Imagine you have these TUs:

A.c:
  struct A;
  struct B { struct A *a; };
  struct A { struct B b; long foo; long bar; struct B b2; };

B.c:
  struct A;
  struct B { struct A *a; };
  struct A { struct B b; int foo; struct B b2; };

Now struct A is ambiguously defined, so the deduplicator will emit a
forward for A in the shared CTF dict .ctf and emit two child dicts
for A.c and B.c with definitions of the real structs A in them:

Contents of CTF section .ctf:
  Types:
     1: struct B (size 0x8)
        [0x0] (ID 0x1) (kind 6) struct B (aligned at 0x8)
            [0x0] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)
     2: struct A (size 0x6)
        [0x0] (ID 0x2) (kind 9) struct A (aligned at 0x6)
     3: struct A * (size 0x8) -> 2: struct A (size 0x6)
        [0x0] (ID 0x3) (kind 3) struct A * (aligned at 0x8)
     4: long int [0x0:0x40] (size 0x8)
        [0x0] (ID 0x4) (kind 1) long int:64 (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
     5: int [0x0:0x20] (size 0x4)
        [0x0] (ID 0x5) (kind 1) int:32 (aligned at 0x4, format 0x1, offset:bits 0x0:0x20)

CTF archive member: .../ambiguous-struct-A.c:
  Types:
     80000001: struct A (size 0x20)
        [0x0] (ID 0x80000001) (kind 6) struct A (aligned at 0x8)
            [0x0] (ID 0x1) (kind 6) struct B b (aligned at 0x8)
                [0x0] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)
            [0x40] (ID 0x4) (kind 1) long int foo:64 (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
            [0x80] (ID 0x4) (kind 1) long int bar:64 (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
            [0xc0] (ID 0x1) (kind 6) struct B b2 (aligned at 0x8)
                [0xc0] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)

CTF archive member: .../ambiguous-struct-B.c:
  Types:
     80000001: struct A (size 0x18)
        [0x0] (ID 0x80000001) (kind 6) struct A (aligned at 0x8)
            [0x0] (ID 0x1) (kind 6) struct B b (aligned at 0x8)
                [0x0] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)
            [0x40] (ID 0x5) (kind 1) int foo:32 (aligned at 0x4, format 0x1, offset:bits 0x0:0x20)
            [0x80] (ID 0x1) (kind 6) struct B b2 (aligned at 0x8)
                [0x80] (ID 0x3) (kind 3) struct A * a (aligned at 0x8)

Note that the two structs A are different sizes, with corresponding
members at different offsets. But... it's perfectly possible for there
to be an array of one of those structs A, and that array is
unambiguously defined, so it's going to be promoted to the .ctf section:

     6: struct A [50] (size 0x12c)
        [0x0] (ID 0x6) (kind 4) struct A [50] (aligned at 0x6)

... and after the deduplicator's got through with it, that's an array
of, uh... a forward. Which has no size or alignment. Oh dear. And it's
not like it even *can* carry a size: the sizes of both struct A's are
different!

Now we can't stop the deduplicator doing this promotion-to-forward stuff
-- it's the way it breaks cycles. In format v4 we should at least make
it clearer to users what is going on by having the deduplicator emit a
new type kind for these forwards, an 'ambiguous type', CTF_K_AMBIGUOUS,
which is identical to a forward, but the different type kind can make it
possible for users to tell that this is actually a stand-in for an
ambiguous structure or union type which can be found in two or more
per-CU dicts (and we can add a function to the libctf API to return all
the relevant types from the various dicts easily).

Stopping the crashes for now is a simple matter of allowing the addition
of incomplete array types again (even though you can't get their sizes
and/or alignment and you'll get an ECTF_INCOMPLETE if you try, ugh:
ugly, but doesn't cause actual misbehaviour), and adjusting
ctf_add_member_offset so that it does not complain if ctf_type_size or
ctf_type_alignment yields ECTF_INCOMPLETE, but *does* complain if the
next type you try to add to that struct does not have an explicit offset
specified (which would require libctf to get the size and/or alignment
of the incomplete member). The deduplicator always specifies explicit
offsets, so the link-time crashes should go away.

I'll try that in the next day or two, and add a test for this stuff.
(Holiday? I scorn holiday! also a release *is* impending and I'd like
this to be working well enough to get in.)

Patch

diff --git a/include/ctf-api.h b/include/ctf-api.h
index 9dd0592ab8a..16567ef3ab6 100644
--- a/include/ctf-api.h
+++ b/include/ctf-api.h
@@ -230,7 +230,8 @@  typedef struct ctf_snapshot_id
   _CTF_ITEM (ECTF_NEXT_WRONGFUN, "Wrong iteration function called.") \
   _CTF_ITEM (ECTF_NEXT_WRONGFP, "Iteration entity changed in mid-iterate.") \
   _CTF_ITEM (ECTF_FLAGS, "CTF header contains flags unknown to libctf.") \
-  _CTF_ITEM (ECTF_NEEDSBFD, "This feature needs a libctf with BFD support.")
+  _CTF_ITEM (ECTF_NEEDSBFD, "This feature needs a libctf with BFD support.") \
+  _CTF_ITEM (ECTF_INCOMPLETE, "Type is not a complete type.")
 
 #define	ECTF_BASE	1000	/* Base value for libctf errnos.  */
 
@@ -243,7 +244,7 @@  _CTF_ERRORS
 #undef _CTF_FIRST
   };
 
-#define ECTF_NERR (ECTF_NEEDSBFD - ECTF_BASE + 1) /* Count of CTF errors.  */
+#define ECTF_NERR (ECTF_INCOMPLETE - ECTF_BASE + 1) /* Count of CTF errors.  */
 
 /* The CTF data model is inferred to be the caller's data model or the data
    model of the given object, unless ctf_setmodel is explicitly called.  */
diff --git a/ld/testsuite/ld-ctf/conflicting-cycle-1.parent.d b/ld/testsuite/ld-ctf/conflicting-cycle-1.parent.d
index 4cbe9b61f3c..5da66fda14c 100644
--- a/ld/testsuite/ld-ctf/conflicting-cycle-1.parent.d
+++ b/ld/testsuite/ld-ctf/conflicting-cycle-1.parent.d
@@ -29,8 +29,8 @@  Contents of CTF section .ctf:
 #...
   Types:
 #...
-     0x[0-9a-f]*: struct B \(.*
-           *\[0x0\] \(ID 0x[0-9a-f]*\) \(kind 9\) struct B \(.*
+     0x[0-9a-f]*: struct B
+           *\[0x0\] \(ID 0x[0-9a-f]*\) \(kind 9\) struct B
 #...
 CTF archive member: .*:
 #...
diff --git a/ld/testsuite/ld-ctf/cross-tu-cyclic-conflicting.d b/ld/testsuite/ld-ctf/cross-tu-cyclic-conflicting.d
index 3c975ebaa51..eff295edd30 100644
--- a/ld/testsuite/ld-ctf/cross-tu-cyclic-conflicting.d
+++ b/ld/testsuite/ld-ctf/cross-tu-cyclic-conflicting.d
@@ -23,8 +23,8 @@  Contents of CTF section \.ctf:
      0x[0-9a-f]*: int \[0x0:0x[0-9a-f]*\] \(size 0x[0-9a-f]*\)
            *\[0x0\] \(ID 0x[0-9a-f]*\) \(kind 1\) int:[0-9]* \(aligned at 0x[0-9a-f]*, format 0x1, offset:bits 0x0:0x[0-9a-f]*\)
 #...
-     0x[0-9a-f]*: struct A .*
-           *\[0x0\] \(ID 0x[0-9a-f]*\) \(kind 9\) struct A .*
+     0x[0-9a-f]*: struct A
+           *\[0x0\] \(ID 0x[0-9a-f]*\) \(kind 9\) struct A
 #...
      0x[0-9a-f]*: struct C .*
            *\[0x0\] \(ID 0x[0-9a-f]*\) \(kind 6\) struct C .*
diff --git a/ld/testsuite/ld-ctf/forward.c b/ld/testsuite/ld-ctf/forward.c
new file mode 100644
index 00000000000..e41a7aececa
--- /dev/null
+++ b/ld/testsuite/ld-ctf/forward.c
@@ -0,0 +1,2 @@ 
+struct foo;
+struct foo *bar __attribute__((used));
diff --git a/ld/testsuite/ld-ctf/forward.d b/ld/testsuite/ld-ctf/forward.d
new file mode 100644
index 00000000000..9ff0dd2ba73
--- /dev/null
+++ b/ld/testsuite/ld-ctf/forward.d
@@ -0,0 +1,23 @@ 
+#as:
+#source: forward.c
+#objdump: --ctf=.ctf
+#ld: -shared
+#name: Forwards
+
+.*: +file format .*
+
+Contents of CTF section .ctf:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+    Type section:	.* \(0x18 bytes\)
+#...
+  Types:
+
+     0x[0-9a-f]: struct foo
+          *\[0x0\] \(ID 0x[0-9a-f]*\) \(kind 9\) struct foo
+     0x[0-9a-f]: struct foo \* \(size 0x[0-9a-f]*\) -> 0x[0-9a-f]: struct foo
+          *\[0x0\] \(ID 0x[0-9a-f]\) \(kind 3\) struct foo \* \(aligned at 0x[0-9a-f]*\)
+#...
diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index e03a04683dd..28c468372e8 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -1690,6 +1690,22 @@  ctf_add_array (ctf_dict_t *fp, uint32_t flag, const ctf_arinfo_t *arp)
   if (ctf_lookup_by_id (&tmp, arp->ctr_index) == NULL)
     return CTF_ERR;		/* errno is set for us.  */
 
+  if (ctf_type_kind (fp, arp->ctr_contents) == CTF_K_FORWARD)
+    {
+      ctf_err_warn (fp, 1, ECTF_INCOMPLETE,
+		    _("ctf_add_array: content type %lx is incomplete"),
+		    arp->ctr_contents);
+      return (ctf_set_errno (fp, ECTF_INCOMPLETE));
+    }
+
+  if (ctf_type_kind (fp, arp->ctr_index) == CTF_K_FORWARD)
+    {
+      ctf_err_warn (fp, 1, ECTF_INCOMPLETE,
+		    _("ctf_add_array: index type %lx is incomplete"),
+		    arp->ctr_contents);
+      return (ctf_set_errno (fp, ECTF_INCOMPLETE));
+    }
+
   if ((type = ctf_add_generic (fp, flag, NULL, CTF_K_ARRAY, &dtd)) == CTF_ERR)
     return CTF_ERR;		/* errno is set for us.  */
 
@@ -2123,6 +2139,10 @@  ctf_add_member_offset (ctf_dict_t *fp, ctf_id_t souid, const char *name,
 	      return -1;	/* errno is set for us.  */
 	    }
 
+	  /* We can rely on this type having a valid size due to the
+	     size/alignment checks above prohibiting forward member
+	     addition.  */
+
 	  if (ctf_type_encoding (fp, ltype, &linfo) == 0)
 	    off += linfo.cte_bits;
 	  else if ((lsize = ctf_type_size (fp, ltype)) > 0)
diff --git a/libctf/ctf-dump.c b/libctf/ctf-dump.c
index fd64dd3a9a0..a49f39e4569 100644
--- a/libctf/ctf-dump.c
+++ b/libctf/ctf-dump.c
@@ -151,7 +151,7 @@  ctf_dump_format_type (ctf_dict_t *fp, ctf_id_t id, int flag)
       free (bit);
       bit = NULL;
 
-      if (kind != CTF_K_FUNCTION)
+      if (kind != CTF_K_FUNCTION && kind != CTF_K_FORWARD)
 	if (asprintf (&bit, " (size 0x%lx)%s",
 		      (unsigned long) ctf_type_size (fp, id),
 		      nonroot_trailer) < 0)
@@ -476,6 +476,7 @@  ctf_dump_member (const char *name, ctf_id_t id, unsigned long offset,
   char *bit = NULL;
   ctf_encoding_t ep;
   int has_encoding = 0;
+  int opened_paren = 0;
 
   /* Align neatly.  */
 
@@ -520,8 +521,9 @@  ctf_dump_member (const char *name, ctf_id_t id, unsigned long offset,
 		    ep.cte_bits, (unsigned long) ctf_type_align (state->cdm_fp,
 								 id)) < 0)
 	goto oom;
+      opened_paren = 1;
     }
-  else
+  else if (ctf_type_kind (state->cdm_fp, id) != CTF_K_FORWARD)
     {
       if (asprintf (&bit, "[0x%lx] (ID 0x%lx) (kind %i) %s%s%s "
 		    "(aligned at 0x%lx", offset, id,
@@ -529,6 +531,14 @@  ctf_dump_member (const char *name, ctf_id_t id, unsigned long offset,
 		    (name[0] != 0 && typestr[0] != 0) ? " " : "", name,
 		    (unsigned long) ctf_type_align (state->cdm_fp, id)) < 0)
 	goto oom;
+      opened_paren = 1;
+    }
+  else /* Forwards have no alignment.  */
+    {
+      if (asprintf (&bit, "[0x%lx] (ID 0x%lx) (kind %i) %s%s%s\n", offset, id,
+		    ctf_type_kind (state->cdm_fp, id), typestr,
+		    (name[0] != 0 && typestr[0] != 0) ? " " : "", name) < 0)
+	goto oom;
     }
 
   *state->cdm_str = str_append (*state->cdm_str, bit);
@@ -547,7 +557,8 @@  ctf_dump_member (const char *name, ctf_id_t id, unsigned long offset,
       bit = NULL;
     }
 
-  *state->cdm_str = str_append (*state->cdm_str, ")\n");
+  if (opened_paren)
+    *state->cdm_str = str_append (*state->cdm_str, ")\n");
   return 0;
 
  oom:
diff --git a/libctf/ctf-types.c b/libctf/ctf-types.c
index dd8ee4fd0ee..9483930f40c 100644
--- a/libctf/ctf-types.c
+++ b/libctf/ctf-types.c
@@ -583,7 +583,10 @@  ctf_variable_next (ctf_dict_t *fp, ctf_next_t **it, const char **name)
    against infinite loops, we implement simplified cycle detection and check
    each link against itself, the previous node, and the topmost node.
 
-   Does not drill down through slices to their contained type.  */
+   Does not drill down through slices to their contained type.
+
+   Callers of this function must not presume that a type it returns must have a
+   valid ctt_size: forwards do not, and must be separately handled.  */
 
 ctf_id_t
 ctf_type_resolve (ctf_dict_t *fp, ctf_id_t type)
@@ -948,6 +951,10 @@  ctf_type_size (ctf_dict_t *fp, ctf_id_t type)
 
       return size * ar.ctr_nelems;
 
+    case CTF_K_FORWARD:
+      /* Forwards do not have a meaningful size.  */
+      return (ctf_set_errno (fp, ECTF_INCOMPLETE));
+
     default: /* including slices of enums, etc */
       return (ctf_get_ctt_size (fp, tp, NULL, NULL));
     }
@@ -1043,6 +1050,10 @@  ctf_type_align (ctf_dict_t *fp, ctf_id_t type)
     case CTF_K_ENUM:
       return fp->ctf_dmodel->ctd_int;
 
+    case CTF_K_FORWARD:
+      /* Forwards do not have a meaningful alignment.  */
+      return (ctf_set_errno (fp, ECTF_INCOMPLETE));
+
     default:  /* including slices of enums, etc */
       return (ctf_get_ctt_size (fp, tp, NULL, NULL));
     }