S12Z: gas: Fix bug when a symbol name was the single letter 'c'.

Message ID 20190115184754.23689-1-john@darrington.wattle.id.au
State New
Headers show
Series
  • S12Z: gas: Fix bug when a symbol name was the single letter 'c'.
Related show

Commit Message

John Darrington Jan. 15, 2019, 6:47 p.m.
The assembler incorrectly recognised "c" as a register name, and
refused to allow it where it expected a symbol/label.

gas/
	* config/tc-s12z.c (lex_reg_name): Compare the length of the strings
	before the contents.
	* testsuite/gas/s12z/labels.d: New file.
	* testsuite/gas/s12z/labels.s: New file.
	* testsuite/gas/s12z/s12z.exp: Add them.
---
 gas/config/tc-s12z.c            |  5 +++--
 gas/testsuite/gas/s12z/labels.d | 18 ++++++++++++++++++
 gas/testsuite/gas/s12z/labels.s |  3 +++
 gas/testsuite/gas/s12z/s12z.exp |  1 +
 4 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gas/testsuite/gas/s12z/labels.d
 create mode 100644 gas/testsuite/gas/s12z/labels.s

-- 
2.11.0

Comments

Nick Clifton Jan. 16, 2019, 11:52 a.m. | #1
Hi John,

> The assembler incorrectly recognised "c" as a register name, and

> refused to allow it where it expected a symbol/label.


This sounds like it might be a more serious problem...

> +	st d0, c 	; c is a valid label

How does the assembler know that "c" here is meant to be label c 
and not the register c ?

(In other architectures it is often common practice for the compiler
to prefix all user generated symbols with a character, usually an
underscore, so that it is possible to distinguish between register
names and user symbols).

Cheers
  Nick
John Darrington Jan. 16, 2019, 1:15 p.m. | #2
On Wed, Jan 16, 2019 at 11:52:30AM +0000, Nick Clifton wrote:
     Hi John,
     
     > The assembler incorrectly recognised "c" as a register name, and

     > refused to allow it where it expected a symbol/label.

     
     This sounds like it might be a more serious problem...
     
     > +	st d0, c 	; c is a valid label

     How does the assembler know that "c" here is meant to be label c 
     and not the register c ?

Because there is no register called "c".  Also a register would be
semantically incorrect here.
     
     (In other architectures it is often common practice for the compiler
     to prefix all user generated symbols with a character, usually an
     underscore, so that it is possible to distinguish between register
     names and user symbols).
     
The compiler is free to do that.

It is unfortunate that in this arch the conventional assembler format
has no syntactical distinction between a register name and a
symbol name.  Hence, all symbol names are compared against the list of
register names before use.  Sadly this means the grammar is not context
free.

J'
Nick Clifton Jan. 16, 2019, 3:45 p.m. | #3
Hi John,

>      > +	st d0, c 	; c is a valid label

>      How does the assembler know that "c" here is meant to be label c 

>      and not the register c ?

> 

> Because there is no register called "c".  Also a register would be

> semantically incorrect here.


OK, so how about:

  st d0, d1

That is legal, right ?  But if there is also a symbol called "d1"
then what should the assembler do ?  [Answer - choose one, but be
consistent about it...  Also, possibly, issue a warning message].

> It is unfortunate that in this arch the conventional assembler format

> has no syntactical distinction between a register name and a

> symbol name.  Hence, all symbol names are compared against the list of

> register names before use.  Sadly this means the grammar is not context

> free.


Fair enough - patch approved.

Cheers
  Nick
Joseph Myers Jan. 16, 2019, 5:22 p.m. | #4
On Wed, 16 Jan 2019, Nick Clifton wrote:

> (In other architectures it is often common practice for the compiler

> to prefix all user generated symbols with a character, usually an

> underscore, so that it is possible to distinguish between register

> names and user symbols).


Note that doing this conflicts with the ELF gABI ("External C symbols have 
the same names in C and object files' symbol tables.").  It's only common 
practice for non-ELF.  (It's not unknown for ELF, e.g. BlackFin, but it's 
unusual there, conflicts with the gABI and complicates writing symbol 
version maps that work across architectures; normal ELF practice would 
involve an unambiguous assembler syntax to allow all valid symbol names.  
Note that e.g. glibc does not support any configurations with such a 
leading underscore.)

-- 
Joseph S. Myers
joseph@codesourcery.com
John Darrington Jan. 18, 2019, 6:55 a.m. | #5
On Wed, Jan 16, 2019 at 05:22:04PM +0000, Joseph Myers wrote:
     
     Note that doing this conflicts with the ELF gABI ("External C symbols have 
     the same names in C and object files' symbol tables.").  It's only common 
     practice for non-ELF.  (It's not unknown for ELF, e.g. BlackFin, but it's 
     unusual there, conflicts with the gABI and complicates writing symbol 
     version maps that work across architectures; normal ELF practice would 
     involve an unambiguous assembler syntax to allow all valid symbol names.  

Then there is no way to satisfy both the gABI requirements and the
Freescale assembler syntax, whose manual says "Register names are
reserved identifiers".

I can think of a few ways to mitigate the potential problems this might
cause, but none are entirely elegant.

J'
     

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.

Patch

diff --git a/gas/config/tc-s12z.c b/gas/config/tc-s12z.c
index 7d1ddf6643..800db4fe75 100644
--- a/gas/config/tc-s12z.c
+++ b/gas/config/tc-s12z.c
@@ -308,7 +308,7 @@  lex_reg_name (uint16_t which, int *reg)
       p++;
     }
 
-  int len = p - input_line_pointer;
+  size_t len = p - input_line_pointer;
 
   if (len <= 0)
     return 0;
@@ -318,7 +318,8 @@  lex_reg_name (uint16_t which, int *reg)
     {
       gas_assert (registers[i].name);
 
-      if (0 == strncasecmp (registers[i].name, input_line_pointer, len))
+      if (len == strlen (registers[i].name)
+	  && 0 == strncasecmp (registers[i].name, input_line_pointer, len))
 	{
 	  if ((0x1U << i) & which)
 	    {
diff --git a/gas/testsuite/gas/s12z/labels.d b/gas/testsuite/gas/s12z/labels.d
new file mode 100644
index 0000000000..d00e756e31
--- /dev/null
+++ b/gas/testsuite/gas/s12z/labels.d
@@ -0,0 +1,18 @@ 
+#objdump: -d -r
+#name:    check that certain symbol labels are correctly accepted.
+#source:  labels.s
+
+
+.*:     file format elf32-s12z
+
+
+Disassembly of section .text:
+
+00000000 <.text>:
+   0:	c4 fa 00 00 	st d0, 0
+   4:	00 
+			2: R_S12Z_OPR	c
+   5:	c4 bd       	st d0, d1
+   7:	c5 fa 00 00 	st d1, 0
+   b:	00 
+			9: R_S12Z_OPR	xavier
diff --git a/gas/testsuite/gas/s12z/labels.s b/gas/testsuite/gas/s12z/labels.s
new file mode 100644
index 0000000000..9326262135
--- /dev/null
+++ b/gas/testsuite/gas/s12z/labels.s
@@ -0,0 +1,3 @@ 
+	st d0, c 	       ; c is a valid label
+	st d0, d1		; Move D0 into the memory at address D1
+	st d1, xavier           ; This is a valid label
diff --git a/gas/testsuite/gas/s12z/s12z.exp b/gas/testsuite/gas/s12z/s12z.exp
index 03baf90bca..76d0931593 100644
--- a/gas/testsuite/gas/s12z/s12z.exp
+++ b/gas/testsuite/gas/s12z/s12z.exp
@@ -119,6 +119,7 @@  run_dump_test bit-manip-invalid
 run_dump_test opr-symbol
 run_dump_test brclr-symbols
 run_dump_test dbCC
+run_dump_test labels
 
 # Expression related tests
 run_dump_test opr-expr