diagnostics: Add options to control the column units [PR49973] [PR86904]

Message ID 20200131193103.GA28678@ldh.local
State New

Commit Message

Lewis Hyatt Jan. 31, 2020, 7:31 p.m.
Hello-

Here is the second patch that I mentioned when I submitted the other related
patch (which is awaiting review):
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html. This second patch
is based on top of the first one and it closes out PR49973 and PR86904 by
adding the new option -fdiagnostics-column-unit=[display|byte]. This lets
the user specify whether columns are output as simple byte counts (the
current behavior), or as display columns, which account for multibyte
characters and tabs. The patch makes display columns the new default.
Additionally, a second new option, -fdiagnostics-column-origin, is added,
which makes the column 0-based (or N-based for any N) instead of 1-based.
The default remains 1-based, as it is now.
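
To make the difference between the two units concrete, here is a minimal
sketch. It is not the code from the patch: display_column is a hypothetical
helper, it assumes every UTF-8 code point is one column wide (GCC uses a real
width table), and it treats the tab stop as a plain parameter.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not GCC's implementation: convert a 1-based byte
// column into a 1-based display column.  Tabs advance to the next tab stop;
// every UTF-8 code point is assumed to occupy exactly one display column.
int display_column(const std::string &line, int byte_col, int tabstop = 8)
{
  assert(byte_col >= 1 && tabstop >= 1);
  int width = 0;  // display columns used by the bytes before byte_col
  for (int i = 0; i < byte_col - 1 && i < (int) line.size(); ++i)
    {
      const unsigned char c = line[i];
      if (c == '\t')
        width += tabstop - width % tabstop;  // jump to the next tab stop
      else if ((c & 0xC0) != 0x80)
        ++width;                             // skip UTF-8 continuation bytes
    }
  return width + 1;
}
```

For a line that starts with one tab, the first character after the tab is
byte column 2 but display column 9 with a tab stop of 8, which is what most
editors would show.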

A number of testcases were explicitly testing for the old behavior, so I
have updated them to test for the new behavior instead, since the column
number adjusted for tabs is more natural to test for, and matches what
editors typically show (give or take 1 for the origin convention).

One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It failed
after this patch, although it doesn't test for any column numbers. It turned
out that this test checks for identical error text on two different lines.
When the column units are changed to display columns, the column of the
second error happens to match the line of the first one. dejagnu then
misinterprets the second error as if it matched the location of the first
one (it doesn't distinguish between the line number and the column number
when checking the output). I added a comment to the test explaining the
situation; since adding the comment has the side effect of making the first
line number no longer match the second column number, it also makes the
test pass again.
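
The confusion can be modelled roughly as follows. This is a simplified
sketch, not DejaGnu's actual matching code: loose_match and strict_match are
hypothetical names, and the line and column numbers used with them are made
up for illustration.

```cpp
#include <string>

// Simplified model (an assumption, not DejaGnu's real logic): accept a
// diagnostic if ":N:" appears anywhere in it, so a column equal to N
// matches just as well as a line equal to N.
bool loose_match(const std::string &diag, int expected_line)
{
  const std::string needle = ":" + std::to_string(expected_line) + ":";
  return diag.find(needle) != std::string::npos;
}

// Stricter model: the number must directly follow the file name, so only
// the line field can match.
bool strict_match(const std::string &diag, const std::string &file,
                  int expected_line)
{
  const std::string prefix = file + ":" + std::to_string(expected_line) + ":";
  return diag.compare(0, prefix.size(), prefix) == 0;
}
```

With a diagnostic such as "arrayclear.go:20:12: error: ...", the loose check
accepts an expected line of 12 because of the column field, while the strict
check does not.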

It wasn't quite clear to me whether this change was appropriate for GCC 10
at this point. We discussed it a couple of months ago here:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either way, I hope
it isn't a problem that I submitted the patch for review now, whether it
ends up in 10 or 11. Please let me know what's normally expected.
Thanks!

Bootstrapped and regtested all languages on linux x86-64, results were the
same before and after:

type, before, after
FAIL 109 109
PASS 467378 467378
UNRESOLVED 1 1
UNSUPPORTED 11183 11183
UNTESTED 202 202
XFAIL 1756 1756
XPASS 36 36

Thanks for taking a look at it.

-Lewis
gcc/ChangeLog:

2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>

	* common.opt: Added -fdiagnostics-column-unit= and
	-fdiagnostics-column-origin= options.  Fix typo in the description
	text for -fdiagnostics-output-format.
	* diagnostic-format-json.cc (json_from_expanded_location): Added
	diagnostic_context argument.  Use it to convert column numbers as per
	the new options.
	(json_from_location_range): Likewise.
	(json_from_fixit_hint): Likewise.
	(json_end_diagnostic): Pass the new context arguments to helper
	functions above.
	(test_unknown_location): Likewise.
	(test_bad_endpoints): Likewise.
	* diagnostic.c (diagnostic_initialize): Initialize new column_unit and
	column_adj members.
	(diagnostic_converted_column): New function.
	(maybe_line_and_column): Elide negative columns, rather than a zero
	column.
	(diagnostic_get_location_text): Convert column number as per the new
	options.
	(diagnostic_report_current_module): Likewise.
	(print_parseable_fixits): Change pretty_printer argument to a
	diagnostic_context. Use the context to convert column numbers.
	(diagnostic_report_diagnostic): Adapt to new arguments for
	print_parseable_fixits().
	(test_print_parseable_fixits_none): Likewise.
	(test_print_parseable_fixits_insert): Likewise.
	(test_print_parseable_fixits_remove): Likewise.
	(test_print_parseable_fixits_replace): Likewise.
	(assert_location_text): Added origin and column_unit arguments for
	testing the new functionality.
	(test_diagnostic_get_location_text): Added selftests for
	-fdiagnostics-column-unit= and -fdiagnostics-column-origin=.
	* diagnostic.h (enum diagnostics_column_unit): New enum.
	(struct diagnostic_context): Added new column_unit and column_adj
	members.
	(diagnostic_converted_column): Declare.
	(json_from_expanded_location): Added new context argument.
	* opts.c (common_handle_option): Handle the new options.
	* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the
	new context argument to json_from_expanded_location().


gcc/testsuite/ChangeLog:

2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>

	* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output
	for new default display column units.
	* c-c++-common/Wmisleading-indentation.c: Likewise.
	* g++.dg/parse/error4.C: Likewise.
	* g++.old-deja/g++.brendan/crash11.C: Likewise.
	* g++.old-deja/g++.pt/overload2.C: Likewise.
	* g++.old-deja/g++.robertl/eb109.C: Likewise.
	* gcc.dg/format/branch-1.c: Likewise.
	* gcc.dg/format/pr79210.c: Likewise.
	* gcc.dg/redecl-4.c: Likewise.
	* go.dg/arrayclear.go: Add a comment explaining why adding a
	comment to the test is necessary to prevent it breaking due to a
	dejagnu bug revealed by the new default column units.

Comments

David Malcolm Jan. 31, 2020, 8:31 p.m. | #1
On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> Hello-
>
> Here is the second patch that I mentioned when I submitted the other
> related patch (which is awaiting review):
> https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html.


Sorry about that; I'm v. busy with analyzer bugs right now.

> This second patch is based on top of the first one and it closes out
> PR49973 and PR86904 by adding the new option
> -fdiagnostics-column-unit=[display|byte]. This allows to specify
> whether columns are output as simple byte counts (the current
> behavior), or as display columns including handling multibyte
> characters and tabs. The patch makes display columns the new default.
> Additionally, a second new option -fdiagnostics-column-origin is
> added, which allows to make the column 0-based (or N-based for any N)
> instead of 1-based. The default remains at 1-based as it is now.
>
> A number of testcases were explicitly testing for the old behavior,
> so I have updated them to test for the new behavior instead, since
> the column number adjusted for tabs is more natural to test for, and
> matches what editors typically show (give or take 1 for the origin
> convention).
>
> One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It
> failed after this patch, although it doesn't test for any column
> numbers. The answer turned out to be, this test checks for identical
> error text on two different lines. When the column units are changed
> to display columns, then the column of the second error happens to
> match the line of the first one. dejagnu then misinterprets the
> second error as if it matched the location of the first one (it
> doesn't distinguish whether it checks for the line number or the
> column number in the output). I added a comment to the test
> explaining the situation; since adding the comment has the side
> effect of making the first line number no longer match the second
> column number, it also makes the test pass again.
>
> It wasn't quite clear to me whether this change was appropriate for
> GCC 10 or not at this point. We discussed it a couple months ago
> here: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html.
> Either way, I hope it isn't a problem that I submitted the patch for
> review now, whether it will end up in 10 or 11. Please let me know
> what's normally expected?
> Thanks!


Thanks Lewis.

This patch looks very promising, but should wait until gcc 11; we're
trying to stabilize gcc 10 right now (I'm knee-deep in analyzer bug-
fixing, so I don't want to add any more diagnostics changes).


> gcc/ChangeLog:
>
> 2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>
>


Please reference the PRs here

[...]

> gcc/testsuite/ChangeLog:
>
> 2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>


Likewise here.

[...]

> diff --git a/gcc/common.opt b/gcc/common.opt
> index 630c380bd6a..657985450c2 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1309,6 +1309,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)
>  EnumValue
>  Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
>  
> +fdiagnostics-column-unit=
> +Common Joined RejectNegative Enum(diagnostics_column_unit)
> +-fdiagnostics-column-unit=[display|byte]	Select units for column numbers.

Should this line mention the default?

> +fdiagnostics-column-origin=
> +Common Joined RejectNegative UInteger
> +-fdiagnostics-column-origin=<number>	Set the number of the first column.  Default 1-based.


These new options should be documented in gcc/doc/invoke.texi.

[...]

> @@ -43,21 +44,23 @@ static json::array *cur_children_array;
>  /* Generate a JSON object for LOC.  */
>  
>  json::value *
> -json_from_expanded_location (location_t loc)
> +json_from_expanded_location (diagnostic_context *context, location_t loc)
>  {
>    expanded_location exploc = expand_location (loc);
>    json::object *result = new json::object ();
>    if (exploc.file)
>      result->set ("file", new json::string (exploc.file));
>    result->set ("line", new json::integer_number (exploc.line));
> -  result->set ("column", new json::integer_number (exploc.column));
> +  const int col = diagnostic_converted_column (context, exploc);
> +  result->set ("column", new json::integer_number (col));


I wonder if the JSON output format should show *both* values: perhaps
add fields "byte-column" and "display-column", and retain the field
"column", which would follow -fdiagnostics-column-unit?
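
A rough sketch of what such a location object could look like is below. The
three field names are the ones suggested above; everything else (the
location_json helper, the column_unit enum, the exact string layout) is a
hypothetical illustration and does not use GCC's json:: classes.

```cpp
#include <string>

// Hypothetical stand-in for the two unit choices.
enum class column_unit { display, byte };

// Build a location object that reports both column values explicitly and
// lets "column" follow whichever unit was selected.
std::string location_json(int line, int byte_col, int display_col,
                          column_unit unit)
{
  const int col = (unit == column_unit::display) ? display_col : byte_col;
  return "{\"line\": " + std::to_string(line)
         + ", \"column\": " + std::to_string(col)
         + ", \"byte-column\": " + std::to_string(byte_col)
         + ", \"display-column\": " + std::to_string(display_col) + "}";
}
```

A consumer that needs a specific unit can then read "byte-column" or
"display-column" directly, regardless of the command-line option.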

[...]

> @@ -219,6 +220,8 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
>    context->min_margin_width = 0;
>    context->show_ruler_p = false;
>    context->parseable_fixits_p = false;
> +  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
> +  context->column_adj = 0;


I'm not sure, but I think I prefer it if we store the column origin
instead, rather than an offset relative to an origin of 1.

[...]

> @@ -338,8 +341,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)
>    return diagnostic_kind_color[kind];
>  }
>  
> +/* Given an expanded_location, convert the column (which is in 1-based bytes)
> +   to the requested units and origin.  Return -1 if the column is
> +   invalid (<= 0).  */
> +int
> +diagnostic_converted_column (diagnostic_context *context, expanded_location s)
> +{
> +  if (s.column <= 0)
> +    return -1;
> +
> +  int col;


...so this would be one_based_col.

> +  switch (context->column_unit)
> +    {
> +    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:
> +      col = location_compute_display_column (s);
> +      break;
> +
> +    case DIAGNOSTICS_COLUMN_UNIT_BYTE:
> +      col = s.column;
> +      break;
> +
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  return col + context->column_adj;


...and this would be (I think):

     return context->column_origin + one_based_col - 1;

It would be doing the -1 each time, but maybe it's conceptually clearer?
I'm not sure.
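
For illustration, the origin-based form can be sketched on its own like this.
column_options and converted_column are hypothetical stand-ins for the
relevant part of diagnostic_context and for the function in the patch; only
the arithmetic is taken from the line above.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant member of diagnostic_context.
struct column_options
{
  int column_origin = 1;  // number given to the first column
};

// one_based_col is the column after unit conversion, counted from 1.
int converted_column(const column_options &opts, int one_based_col)
{
  assert(one_based_col >= 1);
  return opts.column_origin + one_based_col - 1;
}
```

With the default origin of 1 the column passes through unchanged; with an
origin of 0 the first column is reported as 0.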

[...]

> @@ -882,8 +930,10 @@ print_parseable_fixits (pretty_printer *pp, rich_location *richloc)
>        location_t next_loc = hint->get_next_loc ();
>        expanded_location next_exploc = expand_location (next_loc);
>        pp_printf (pp, ":{%i:%i-%i:%i}:",
> -		 start_exploc.line, start_exploc.column,
> -		 next_exploc.line, next_exploc.column);
> +		 start_exploc.line,
> +		 diagnostic_converted_column (context, start_exploc),
> +		 next_exploc.line,
> +		 diagnostic_converted_column (context, next_exploc));
>        print_escaped_string (pp, hint->get_string ());
>        pp_newline (pp);
>      }


If we're going to change the output of parseable fixits, that takes us away
from bug-for-bug-compatibility with clang in this area.

That should be documented, at least.

[...]

There's selftest coverage which is good; it would be good to *also*
have a few simple DejaGnu-based tests, showing the explicit use of both
units, and trying some offset values, with some lines with tabs, some
with spaces (if nothing else to verify that the option-parsing is wired
up correctly).

I'm nit-picking - apart from the lack of docs, this looks very
promising.  But as I said earlier, this should wait until gcc 11.

Thanks
Dave
Lewis Hyatt Jan. 31, 2020, 8:46 p.m. | #2
On Fri, Jan 31, 2020 at 3:32 PM David Malcolm <dmalcolm@redhat.com> wrote:
> On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> > Hello-
> >
> > Here is the second patch that I mentioned when I submitted the other
> > related patch (which is awaiting review):
> > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html.
>
> Sorry about that; I'm v. busy with analyzer bugs right now.


Thanks very much for the feedback. Totally understood that you are
busy with more pressing things for GCC 10. Looking forward to trying
out -fanalyzer myself too. I'll apply these suggestions and send
an updated version after the GCC 10 release.

-Lewis
Joseph Myers Jan. 31, 2020, 10:45 p.m. | #3
This seems to be missing invoke.texi documentation for the new options.

-- 
Joseph S. Myers
joseph@codesourcery.com
Lewis Hyatt Jan. 31, 2020, 11:02 p.m. | #4
Thanks for taking a look. Sorry about that; it's my first new option
:). I will add the documentation in the next iteration.

-Lewis

On Fri, Jan 31, 2020 at 5:45 PM Joseph Myers <joseph@codesourcery.com> wrote:
>
> This seems to be missing invoke.texi documentation for the new options.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
Richard Sandiford via Gcc-patches May 8, 2020, 7:35 p.m. | #5
On Fri, Jan 31, 2020 at 03:31:59PM -0500, David Malcolm wrote:
> On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> > Hello-
> >
> > Here is the second patch that I mentioned when I submitted the other
> > related patch (which is awaiting review):
> > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html.
>
> Sorry about that; I'm v. busy with analyzer bugs right now.
>
> > This second patch is based on top of the first one and it closes out
> > PR49973 and PR86904 by adding the new option
> > -fdiagnostics-column-unit=[display|byte]. This allows to specify
> > whether columns are output as simple byte counts (the current
> > behavior), or as display columns including handling multibyte
> > characters and tabs. The patch makes display columns the new default.
> > Additionally, a second new option -fdiagnostics-column-origin is
> > added, which allows to make the column 0-based (or N-based for any N)
> > instead of 1-based. The default remains at 1-based as it is now.
> >
> > A number of testcases were explicitly testing for the old behavior,
> > so I have updated them to test for the new behavior instead, since
> > the column number adjusted for tabs is more natural to test for, and
> > matches what editors typically show (give or take 1 for the origin
> > convention).
> >
> > One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It
> > failed after this patch, although it doesn't test for any column
> > numbers. The answer turned out to be, this test checks for identical
> > error text on two different lines. When the column units are changed
> > to display columns, then the column of the second error happens to
> > match the line of the first one. dejagnu then misinterprets the
> > second error as if it matched the location of the first one (it
> > doesn't distinguish whether it checks for the line number or the
> > column number in the output). I added a comment to the test
> > explaining the situation; since adding the comment has the side
> > effect of making the first line number no longer match the second
> > column number, it also makes the test pass again.
> >
> > It wasn't quite clear to me whether this change was appropriate for
> > GCC 10 or not at this point. We discussed it a couple months ago
> > here: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html.
> > Either way, I hope it isn't a problem that I submitted the patch for
> > review now, whether it will end up in 10 or 11. Please let me know
> > what's normally expected?
> > Thanks!
>
> Thanks Lewis.
>
> This patch looks very promising, but should wait until gcc 11; we're
> trying to stabilize gcc 10 right now (I'm knee-deep in analyzer
> bug-fixing, so I don't want to add any more diagnostics changes).
>


Hi Dave-

Well, GCC 10 has been released for a whole day, so I thought I would bug you
with this patch again now :). To summarize, I previously sent this in two
separate parts.

Part 1: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01626.html
Part 2: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg02108.html

Part 1 added the support for converting tabs to spaces when outputting
diagnostics. Part 2 added the new options -fdiagnostics-column-unit and
-fdiagnostics-column-origin to control whether the column number is printed
in display or byte units. Together they resolve both PR49973 and PR86904.

You provided me with feedback on part 2, which is quoted below with some
notes interspersed. The new version of the patch incorporates all of your
suggestions. Part 1 has not changed other than some trivial rebasing
conflicts. The two patches touch nearly disjoint sets of files and are
logically linked together, so I thought it would be simpler if I just sent
one combined patch now. If you prefer them to be separated as before, please
let me know and I can send them that way as well.

Bootstrap and reg tests were done on x86-64 Linux for all languages.  Tests
look good:

type, before, after
FAIL 96 96
PASS 474637 475097
UNSUPPORTED 11607 11607
UNTESTED 195 195
XFAIL 1816 1816
XPASS 36 36

>
> > gcc/ChangeLog:
> >
> > 2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>
> >
>
> Please reference the PRs here
>
> [...]
>
> > gcc/testsuite/ChangeLog:
> >
> > 2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>
>
> Likewise here.
>
> [...]
>


Done.

> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index 630c380bd6a..657985450c2 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -1309,6 +1309,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)
> >  EnumValue
> >  Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
> >  
> > +fdiagnostics-column-unit=
> > +Common Joined RejectNegative Enum(diagnostics_column_unit)
> > +-fdiagnostics-column-unit=[display|byte]	Select units for column numbers.
> Should this line mention the default?
>


Done.

> > +fdiagnostics-column-origin=
> > +Common Joined RejectNegative UInteger
> > +-fdiagnostics-column-origin=<number>	Set the number of the first column.  Default 1-based.
>
> These new options should be documented in gcc/doc/invoke.texi.
>
> [...]
>


Done.

> > @@ -43,21 +44,23 @@ static json::array *cur_children_array;
> >  /* Generate a JSON object for LOC.  */
> >  
> >  json::value *
> > -json_from_expanded_location (location_t loc)
> > +json_from_expanded_location (diagnostic_context *context, location_t loc)
> >  {
> >    expanded_location exploc = expand_location (loc);
> >    json::object *result = new json::object ();
> >    if (exploc.file)
> >      result->set ("file", new json::string (exploc.file));
> >    result->set ("line", new json::integer_number (exploc.line));
> > -  result->set ("column", new json::integer_number (exploc.column));
> > +  const int col = diagnostic_converted_column (context, exploc);
> > +  result->set ("column", new json::integer_number (col));
>
> I wonder if the JSON output format should show *both* values: perhaps
> add fields "byte-column" and "display-column", and retain the field
> "column", which would follow -fdiagnostics-column-unit?
>
> [...]
>


Done. Adjusted the docs for JSON output as well.

> > @@ -219,6 +220,8 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
> >    context->min_margin_width = 0;
> >    context->show_ruler_p = false;
> >    context->parseable_fixits_p = false;
> > +  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
> > +  context->column_adj = 0;
>
> I'm not sure, but I think I prefer it if we store the column origin
> instead, rather than an offset relative to an origin of 1.
>
> [...]
>
> > @@ -338,8 +341,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)
> >    return diagnostic_kind_color[kind];
> >  }
> >  
> > +/* Given an expanded_location, convert the column (which is in 1-based bytes)
> > +   to the requested units and origin.  Return -1 if the column is
> > +   invalid (<= 0).  */
> > +int
> > +diagnostic_converted_column (diagnostic_context *context, expanded_location s)
> > +{
> > +  if (s.column <= 0)
> > +    return -1;
> > +
> > +  int col;
>
> ...so this would be one_based_col.
>
> > +  switch (context->column_unit)
> > +    {
> > +    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:
> > +      col = location_compute_display_column (s);
> > +      break;
> > +
> > +    case DIAGNOSTICS_COLUMN_UNIT_BYTE:
> > +      col = s.column;
> > +      break;
> > +
> > +    default:
> > +      gcc_unreachable ();
> > +    }
> > +
> > +  return col + context->column_adj;
>
> ...and this would be (I think):
>
>      return context->column_origin + one_based_col - 1;
>
> It would be doing the -1 each time, but maybe it's conceptually clearer?
> I'm not sure.
>


Sure, done.

> [...]
>
> > @@ -882,8 +930,10 @@ print_parseable_fixits (pretty_printer *pp, rich_location *richloc)
> >        location_t next_loc = hint->get_next_loc ();
> >        expanded_location next_exploc = expand_location (next_loc);
> >        pp_printf (pp, ":{%i:%i-%i:%i}:",
> > -		 start_exploc.line, start_exploc.column,
> > -		 next_exploc.line, next_exploc.column);
> > +		 start_exploc.line,
> > +		 diagnostic_converted_column (context, start_exploc),
> > +		 next_exploc.line,
> > +		 diagnostic_converted_column (context, next_exploc));
> >        print_escaped_string (pp, hint->get_string ());
> >        pp_newline (pp);
> >      }
>
> If we're going to change the output of parseable fixits, that takes us away
> from bug-for-bug-compatibility with clang in this area.
>
> That should be documented, at least.
>


I didn't mean to do anything controversial here; I just assumed this should
change for consistency and didn't realize it needed to match an existing
standard. I have removed this part of the patch for now and can send it in a
separate one if there's a desire to change this.

> [...]
>
> There's selftest coverage which is good; it would be good to *also*
> have a few simple DejaGnu-based tests, showing the explicit use of both
> units, and trying some offset values, with some lines with tabs, some
> with spaces (if nothing else to verify that the option-parsing is wired
> up correctly).
>


Done.

> I'm nit-picking - apart from the lack of docs, this looks very
> promising.  But as I said earlier, this should wait until gcc 11.
>
> Thanks
> Dave
>


Thanks again for your time!

-Lewis
gcc/ChangeLog:

2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	PR other/86904
	* common.opt: Handle -ftabstop here instead of in c-family
	options.  Add -fdiagnostics-column-unit= and
	-fdiagnostics-column-origin= options.
	* opts.c (common_handle_option): Handle the new options.
	* diagnostic-format-json.cc (json_from_expanded_location): Add
	diagnostic_context argument.  Use it to convert column numbers as per
	the new options.
	(json_from_location_range): Likewise.
	(json_from_fixit_hint): Likewise.
	(json_end_diagnostic): Pass the new context argument to helper
	functions above.  Add "column-origin" field to the output.
	(test_unknown_location): Add the new context argument to calls to
	helper functions.
	(test_bad_endpoints): Likewise.
	* diagnostic-show-locus.c (struct line_bounds): Clarify that the
	units are now always display columns.  Rename members accordingly.
	Add constructor.
	(layout::print_source_line): Add support for tab expansion.
	(layout::print_annotation_line): Adapt to struct line_bounds changes.
	(layout::print_line): Likewise.
	(test_layout_x_offset_display_tab): New selftest.
	(test_one_liner_colorized_utf8): Likewise.
	(test_tab_expansion): Likewise.
	(test_diagnostic_show_locus_one_liner_utf8): Call the new tests.
	(diagnostic_show_locus_c_tests): Likewise.
	* diagnostic.c (diagnostic_initialize): Initialize new column_unit and
	column_origin members.
	(diagnostic_converted_column): New function.
	(maybe_line_and_column): Be willing to output a column of 0.
	(diagnostic_get_location_text): Convert column number as per the new
	options.
	(diagnostic_report_current_module): Likewise.
	(assert_location_text): Add origin and column_unit arguments for
	testing the new functionality.
	(test_diagnostic_get_location_text): Test the new functionality.
	* diagnostic.h (enum diagnostics_column_unit): New enum.
	(struct diagnostic_context): Add members for the new options.
	(diagnostic_converted_column): Declare.
	(json_from_expanded_location): Add new context argument.
	* doc/invoke.texi: Document the new options.
	* input.h (location_compute_display_column): Add tabstop argument.
	* input.c (location_compute_display_column): Likewise.
	(test_cpp_utf8): Add selftests for tab expansion.
	* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the
	new context argument to json_from_expanded_location().

gcc/c-family/ChangeLog:

2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

	PR other/86904
	* c-indentation.c (should_warn_for_misleading_indentation): Get
	global tabstop from the new source.
	* c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which
	is now a common option.
	* c.opt: Likewise.

gcc/testsuite/ChangeLog:

2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	PR other/86904
	* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output
	for new defaults.
	* c-c++-common/Wmisleading-indentation.c: Likewise.
	* c-c++-common/diagnostic-format-json-1.c: Likewise.
	* c-c++-common/diagnostic-format-json-2.c: Likewise.
	* c-c++-common/diagnostic-format-json-3.c: Likewise.
	* c-c++-common/diagnostic-format-json-4.c: Likewise.
	* c-c++-common/diagnostic-format-json-5.c: Likewise.
	* c-c++-common/missing-close-symbol.c: Likewise.
	* g++.dg/diagnostic/bad-binary-ops.C: Likewise.
	* g++.dg/parse/error4.C: Likewise.
	* g++.old-deja/g++.brendan/crash11.C: Likewise.
	* g++.old-deja/g++.pt/overload2.C: Likewise.
	* g++.old-deja/g++.robertl/eb109.C: Likewise.
	* gcc.dg/analyzer/malloc-paths-9.c: Likewise.
	* gcc.dg/bad-binary-ops.c: Likewise.
	* gcc.dg/format/branch-1.c: Likewise.
	* gcc.dg/format/pr79210.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.
	* gcc.dg/redecl-4.c: Likewise.
	* gfortran.dg/diagnostic-format-json-1.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
	* go.dg/arrayclear.go: Add a comment explaining why adding a
	comment was necessary to work around a dejagnu bug.
	* c-c++-common/diagnostic-units-1.c: New test.
	* c-c++-common/diagnostic-units-2.c: New test.
	* c-c++-common/diagnostic-units-3.c: New test.
	* c-c++-common/diagnostic-units-4.c: New test.
	* c-c++-common/diagnostic-units-5.c: New test.
	* c-c++-common/diagnostic-units-6.c: New test.
	* c-c++-common/diagnostic-units-7.c: New test.
	* c-c++-common/diagnostic-units-8.c: New test.

libcpp/ChangeLog:

2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	PR other/86904
	* include/cpplib.h (struct cpp_options): Removed support for -ftabstop,
	which is now handled by cpp_set_tabstop ().
	(class cpp_display_width_computation): New class.
	(cpp_byte_column_to_display_column): Add optional tabstop argument.
	(cpp_display_width): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	(cpp_set_tabstop): New function.
	(cpp_get_tabstop): Likewise.
	* charset.c (global_tabstop): New static variable.
	(cpp_set_tabstop): New function to access global_tabstop.
	(cpp_get_tabstop): Likewise.
	(cpp_display_width_computation::cpp_display_width_computation): New
	function.
	(compute_next_display_width): Removed and implemented this
	functionality in a new function...
	(cpp_display_width_computation::process_next_codepoint): ...here.
	(cpp_display_width_computation::advance_display_cols): New function.
	(cpp_byte_column_to_display_column): Added tabstop argument.
	Reimplemented in terms of class cpp_display_width_computation.
	(cpp_display_column_to_byte_column): Likewise.
	* init.c (cpp_create_reader): Remove handling of -ftabstop, which is now
	handled via cpp_set_tabstop().
commit 080d5f5ac4c18c5b8dd5d4fdd43034624e4f55a9
Author: Lewis Hyatt <lhyatt@gmail.com>
Date:   Fri Jan 17 17:53:58 2020 -0500

    diagnostics: Support conversion of tabs to spaces [PR49973] [PR86904]

diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
index 9fba3bcc67c..fa4739c47a9 100644
--- a/gcc/c-family/c-indentation.c
+++ b/gcc/c-family/c-indentation.c
@@ -299,7 +299,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo,
   expanded_location next_stmt_exploc = expand_location (next_stmt_loc);
   expanded_location guard_exploc = expand_location (guard_loc);
 
-  const unsigned int tab_width = cpp_opts->tabstop;
+  const unsigned int tab_width = cpp_get_tabstop ();
 
   /* They must be in the same file.  */
   if (next_stmt_exploc.file != body_exploc.file)
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 58ba0948e79..cddf1e28e1d 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -504,12 +504,6 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
 	cpp_opts->track_macro_expansion = 2;
       break;
 
-    case OPT_ftabstop_:
-      /* It is documented that we silently ignore silly values.  */
-      if (value >= 1 && value <= 100)
-	cpp_opts->tabstop = value;
-      break;
-
     case OPT_fexec_charset_:
       cpp_opts->narrow_charset = arg;
       break;
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index c49da99d395..dbdb78e0ad3 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1876,10 +1876,6 @@ Enum(strong_eval_order) String(some) Value(1)
 EnumValue
 Enum(strong_eval_order) String(all) Value(2)
 
-ftabstop=
-C ObjC C++ ObjC++ Joined RejectNegative UInteger
--ftabstop=<number>	Distance between tab stops for column reporting.
-
 ftemplate-backtrace-limit=
 C++ ObjC++ Joined RejectNegative UInteger Var(template_backtrace_limit) Init(10)
 Set the maximum number of template instantiation notes for a single warning or error.
diff --git a/gcc/common.opt b/gcc/common.opt
index 30d05734d16..e3c62a3e7ea 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1321,6 +1321,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)
 EnumValue
 Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
 
+fdiagnostics-column-unit=
+Common Joined RejectNegative Enum(diagnostics_column_unit)
+-fdiagnostics-column-unit=[display|byte]	Select whether column numbers are output as display columns (default) or raw bytes.
+
+fdiagnostics-column-origin=
+Common Joined RejectNegative UInteger
+-fdiagnostics-column-origin=<number>	Set the number of the first column.  The default is 1-based as per GNU style, but some utilities may expect 0-based, for example.
+
 fdiagnostics-format=
 Common Joined RejectNegative Enum(diagnostics_output_format)
 -fdiagnostics-format=[text|json]	Select output format.
@@ -1329,6 +1337,15 @@ Common Joined RejectNegative Enum(diagnostics_output_format)
 SourceInclude
 diagnostic.h
 
+Enum
+Name(diagnostics_column_unit) Type(int)
+
+EnumValue
+Enum(diagnostics_column_unit) String(display) Value(DIAGNOSTICS_COLUMN_UNIT_DISPLAY)
+
+EnumValue
+Enum(diagnostics_column_unit) String(byte) Value(DIAGNOSTICS_COLUMN_UNIT_BYTE)
+
 Enum
 Name(diagnostics_output_format) Type(int)
 
@@ -1358,6 +1375,10 @@ fdiagnostics-path-format=
 Common Joined RejectNegative Var(flag_diagnostics_path_format) Enum(diagnostic_path_format) Init(DPF_INLINE_EVENTS)
 Specify how to print any control-flow path associated with a diagnostic.
 
+ftabstop=
+Common Joined RejectNegative UInteger
+-ftabstop=<number>	Distance between tab stops for column reporting.
+
 Enum
 Name(diagnostic_path_format) Type(int)
 
diff --git a/gcc/diagnostic-format-json.cc b/gcc/diagnostic-format-json.cc
index 7bda5c4ba83..465c42fdfde 100644
--- a/gcc/diagnostic-format-json.cc
+++ b/gcc/diagnostic-format-json.cc
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "diagnostic.h"
+#include "selftest-diagnostic.h"
 #include "diagnostic-metadata.h"
 #include "json.h"
 #include "selftest.h"
@@ -43,21 +44,43 @@ static json::array *cur_children_array;
 /* Generate a JSON object for LOC.  */
 
 json::value *
-json_from_expanded_location (location_t loc)
+json_from_expanded_location (diagnostic_context *context, location_t loc)
 {
   expanded_location exploc = expand_location (loc);
   json::object *result = new json::object ();
   if (exploc.file)
     result->set ("file", new json::string (exploc.file));
   result->set ("line", new json::integer_number (exploc.line));
-  result->set ("column", new json::integer_number (exploc.column));
+
+  const enum diagnostics_column_unit orig_unit = context->column_unit;
+  struct
+  {
+    const char *name;
+    enum diagnostics_column_unit unit;
+  } column_fields[] = {
+    {"display-column", DIAGNOSTICS_COLUMN_UNIT_DISPLAY},
+    {"byte-column", DIAGNOSTICS_COLUMN_UNIT_BYTE}
+  };
+  int the_column = INT_MIN;
+  for (size_t i = 0; i != sizeof column_fields / sizeof (*column_fields); ++i)
+    {
+      context->column_unit = column_fields[i].unit;
+      const int col = diagnostic_converted_column (context, exploc);
+      result->set (column_fields[i].name, new json::integer_number (col));
+      if (column_fields[i].unit == orig_unit)
+	the_column = col;
+    }
+  gcc_assert (the_column != INT_MIN);
+  result->set ("column", new json::integer_number (the_column));
+  context->column_unit = orig_unit;
   return result;
 }
 
 /* Generate a JSON object for LOC_RANGE.  */
 
 static json::object *
-json_from_location_range (const location_range *loc_range, unsigned range_idx)
+json_from_location_range (diagnostic_context *context,
+			  const location_range *loc_range, unsigned range_idx)
 {
   location_t caret_loc = get_pure_location (loc_range->m_loc);
 
@@ -68,13 +91,13 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)
   location_t finish_loc = get_finish (loc_range->m_loc);
 
   json::object *result = new json::object ();
-  result->set ("caret", json_from_expanded_location (caret_loc));
+  result->set ("caret", json_from_expanded_location (context, caret_loc));
   if (start_loc != caret_loc
       && start_loc != UNKNOWN_LOCATION)
-    result->set ("start", json_from_expanded_location (start_loc));
+    result->set ("start", json_from_expanded_location (context, start_loc));
   if (finish_loc != caret_loc
       && finish_loc != UNKNOWN_LOCATION)
-    result->set ("finish", json_from_expanded_location (finish_loc));
+    result->set ("finish", json_from_expanded_location (context, finish_loc));
 
   if (loc_range->m_label)
     {
@@ -91,14 +114,14 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)
 /* Generate a JSON object for HINT.  */
 
 static json::object *
-json_from_fixit_hint (const fixit_hint *hint)
+json_from_fixit_hint (diagnostic_context *context, const fixit_hint *hint)
 {
   json::object *fixit_obj = new json::object ();
 
   location_t start_loc = hint->get_start_loc ();
-  fixit_obj->set ("start", json_from_expanded_location (start_loc));
+  fixit_obj->set ("start", json_from_expanded_location (context, start_loc));
   location_t next_loc = hint->get_next_loc ();
-  fixit_obj->set ("next", json_from_expanded_location (next_loc));
+  fixit_obj->set ("next", json_from_expanded_location (context, next_loc));
   fixit_obj->set ("string", new json::string (hint->get_string ()));
 
   return fixit_obj;
@@ -190,11 +213,13 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   else
     {
       /* Otherwise, make diag_obj be the top-level object within the group;
-	 add a "children" array.  */
+	 add a "children" array and record the column origin.  */
       toplevel_array->append (diag_obj);
       cur_group = diag_obj;
       cur_children_array = new json::array ();
       diag_obj->set ("children", cur_children_array);
+      diag_obj->set ("column-origin",
+		     new json::integer_number (context->column_origin));
     }
 
   const rich_location *richloc = diagnostic->richloc;
@@ -205,7 +230,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   for (unsigned int i = 0; i < richloc->get_num_locations (); i++)
     {
       const location_range *loc_range = richloc->get_range (i);
-      json::object *loc_obj = json_from_location_range (loc_range, i);
+      json::object *loc_obj = json_from_location_range (context, loc_range, i);
       if (loc_obj)
 	loc_array->append (loc_obj);
     }
@@ -217,7 +242,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
       for (unsigned int i = 0; i < richloc->get_num_fixit_hints (); i++)
 	{
 	  const fixit_hint *hint = richloc->get_fixit_hint (i);
-	  json::object *fixit_obj = json_from_fixit_hint (hint);
+	  json::object *fixit_obj = json_from_fixit_hint (context, hint);
 	  fixit_array->append (fixit_obj);
 	}
     }
@@ -320,7 +345,8 @@ namespace selftest {
 static void
 test_unknown_location ()
 {
-  delete json_from_expanded_location (UNKNOWN_LOCATION);
+  test_diagnostic_context dc;
+  delete json_from_expanded_location (&dc, UNKNOWN_LOCATION);
 }
 
 /* Verify that we gracefully handle attempts to serialize bad
@@ -338,7 +364,8 @@ test_bad_endpoints ()
   loc_range.m_range_display_kind = SHOW_RANGE_WITH_CARET;
   loc_range.m_label = NULL;
 
-  json::object *obj = json_from_location_range (&loc_range, 0);
+  test_diagnostic_context dc;
+  json::object *obj = json_from_location_range (&dc, &loc_range, 0);
   /* We should have a "caret" value, but no "start" or "finish" values.  */
   ASSERT_TRUE (obj != NULL);
   ASSERT_TRUE (obj->get ("caret") != NULL);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 4618b4edb7d..8a34e30c4c7 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -226,22 +226,18 @@ class layout_range
 
 /* A struct for use by layout::print_source_line for telling
    layout::print_annotation_line the extents of the source line that
-   it printed, so that underlines can be clipped appropriately.  */
+   it printed, so that underlines can be clipped appropriately.  Units
+   are 1-based display columns.  */
 
 struct line_bounds
 {
-  int m_first_non_ws;
-  int m_last_non_ws;
+  int m_first_non_ws_disp_col;
+  int m_last_non_ws_disp_col;
 
-  void convert_to_display_cols (char_span line)
+  line_bounds ()
   {
-    m_first_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),
-							line.length (),
-							m_first_non_ws);
-
-    m_last_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),
-						       line.length (),
-						       m_last_non_ws);
+    m_first_non_ws_disp_col = INT_MAX;
+    m_last_non_ws_disp_col = 0;
   }
 };
 
@@ -351,8 +347,8 @@ class layout
  private:
   bool will_show_line_p (linenum_type row) const;
   void print_leading_fixits (linenum_type row);
-  void print_source_line (linenum_type row, const char *line, int line_bytes,
-			  line_bounds *lbounds_out);
+  line_bounds print_source_line (linenum_type row, const char *line,
+				 int line_bytes);
   bool should_print_annotation_line_p (linenum_type row) const;
   void start_annotation_line (char margin_char = ' ') const;
   void print_annotation_line (linenum_type row, const line_bounds lbounds);
@@ -1445,16 +1441,13 @@ layout::calculate_x_offset_display ()
 }
 
 /* Print line ROW of source code, potentially colorized at any ranges, and
-   populate *LBOUNDS_OUT.
-   LINE is the source line (not necessarily 0-terminated) and LINE_BYTES
-   is its length in bytes.
-   This function deals only with byte offsets, not display columns, so
-   m_x_offset_display must be converted from display to byte units.  In
-   particular, LINE_BYTES and LBOUNDS_OUT are in bytes.  */
+   return the line bounds.  LINE is the source line (not necessarily
+   0-terminated) and LINE_BYTES is its length in bytes.  In order to handle both
+   colorization and tab expansion, this function tracks the line position in
+   both byte and display column units.  */
 
-void
-layout::print_source_line (linenum_type row, const char *line, int line_bytes,
-			   line_bounds *lbounds_out)
+line_bounds
+layout::print_source_line (linenum_type row, const char *line, int line_bytes)
 {
   m_colorizer.set_normal_text ();
 
@@ -1469,30 +1462,29 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
   else
     pp_space (m_pp);
 
-  /* We will stop printing the source line at any trailing whitespace, and start
-     printing it as per m_x_offset_display.  */
+  /* We will stop printing the source line at any trailing whitespace.  */
   line_bytes = get_line_bytes_without_trailing_whitespace (line,
 							   line_bytes);
-  int x_offset_bytes = 0;
-  if (m_x_offset_display)
-    {
-      x_offset_bytes = cpp_display_column_to_byte_column (line, line_bytes,
-							  m_x_offset_display);
-      /* In case the leading portion of the line that will be skipped over ends
-	 with a character with wcwidth > 1, then it is possible we skipped too
-	 much, so account for that by padding with spaces.  */
-      const int overage
-	= cpp_byte_column_to_display_column (line, line_bytes, x_offset_bytes)
-	- m_x_offset_display;
-      for (int column = 0; column < overage; ++column)
-	pp_space (m_pp);
-      line += x_offset_bytes;
-    }
 
-  /* Print the line.  */
-  int first_non_ws = INT_MAX;
-  int last_non_ws = 0;
-  for (int col_byte = 1 + x_offset_bytes; col_byte <= line_bytes; col_byte++)
+  /* This object helps to keep track of which display column we are at, which is
+     necessary for computing the line bounds in display units, for doing
+     tab expansion, and for implementing m_x_offset_display.  */
+  cpp_display_width_computation dw (line, line_bytes);
+
+  /* Skip the first m_x_offset_display display columns.  In case the leading
+     portion that will be skipped ends with a character with wcwidth > 1, then
+     it is possible we skipped too much, so account for that by padding with
+     spaces.  Note that this does the right thing too in case a tab was the last
+     character to be skipped over; the tab is effectively replaced by the
+     correct number of trailing spaces needed to offset by the desired number of
+     display columns.  */
+  for (int skipped_display_cols = dw.advance_display_cols (m_x_offset_display);
+       skipped_display_cols > m_x_offset_display; --skipped_display_cols)
+    pp_space (m_pp);
+
+  /* Print the line and compute the line_bounds.  */
+  line_bounds lbounds;
+  while (!dw.done ())
     {
       /* Assuming colorization is enabled for the caret and underline
 	 characters, we may also colorize the associated characters
@@ -1510,7 +1502,8 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
 	{
 	  bool in_range_p;
 	  point_state state;
-	  in_range_p = get_state_at_point (row, col_byte,
+	  const int start_byte_col = dw.bytes_processed () + 1;
+	  in_range_p = get_state_at_point (row, start_byte_col,
 					   0, INT_MAX,
 					   CU_BYTES,
 					   &state);
@@ -1519,22 +1512,44 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
 	  else
 	    m_colorizer.set_normal_text ();
 	}
-      char c = *line;
-      if (c == '\0' || c == '\t' || c == '\r')
-	c = ' ';
-      if (c != ' ')
+
+      /* Get the display width of the next character to be output, expanding
+	 tabs and replacing some control bytes with spaces as necessary.  */
+      const char *c = dw.next_byte ();
+      const int start_disp_col = dw.display_cols_processed () + 1;
+      const int this_display_width = dw.process_next_codepoint ();
+      if (*c == '\t')
+	{
+	  /* The returned display width is the number of spaces into which the
+	     tab should be expanded.  */
+	  for (int i = 0; i != this_display_width; ++i)
+	    pp_space (m_pp);
+	  continue;
+	}
+      if (*c == '\0' || *c == '\r')
+	{
+	  /* cpp_wcwidth() promises to return 1 for all control bytes, and we
+	     want to output these as a single space too, so this case is
+	     actually the same as the '\t' case.  */
+	  gcc_assert (this_display_width == 1);
+	  pp_space (m_pp);
+	  continue;
+	}
+
+      /* We have a (possibly multibyte) character to output; update the line
+	 bounds if it is not whitespace.  */
+      if (*c != ' ')
 	{
-	  last_non_ws = col_byte;
-	  if (first_non_ws == INT_MAX)
-	    first_non_ws = col_byte;
+	  lbounds.m_last_non_ws_disp_col = dw.display_cols_processed ();
+	  if (lbounds.m_first_non_ws_disp_col == INT_MAX)
+	    lbounds.m_first_non_ws_disp_col = start_disp_col;
 	}
-      pp_character (m_pp, c);
-      line++;
+
+      /* Output the character.  */
+      while (c != dw.next_byte ()) pp_character (m_pp, *c++);
     }
   print_newline ();
-
-  lbounds_out->m_first_non_ws = first_non_ws;
-  lbounds_out->m_last_non_ws = last_non_ws;
+  return lbounds;
 }
 
 /* Determine if we should print an annotation line for ROW.
@@ -1576,14 +1591,13 @@ layout::start_annotation_line (char margin_char) const
 }
 
 /* Print a line consisting of the caret/underlines for the given
-   source line.  This function works with display columns, rather than byte
-   counts; in particular, LBOUNDS should be in display column units.  */
+   source line.  */
 
 void
 layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
 {
   int x_bound = get_x_bound_for_row (row, m_exploc.m_display_col,
-				     lbounds.m_last_non_ws);
+				     lbounds.m_last_non_ws_disp_col);
 
   start_annotation_line ();
   pp_space (m_pp);
@@ -1593,8 +1607,8 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
       bool in_range_p;
       point_state state;
       in_range_p = get_state_at_point (row, column,
-				       lbounds.m_first_non_ws,
-				       lbounds.m_last_non_ws,
+				       lbounds.m_first_non_ws_disp_col,
+				       lbounds.m_last_non_ws_disp_col,
 				       CU_DISPLAY_COLS,
 				       &state);
       if (in_range_p)
@@ -2499,15 +2513,11 @@ layout::print_line (linenum_type row)
   if (!line)
     return;
 
-  line_bounds lbounds;
   print_leading_fixits (row);
-  print_source_line (row, line.get_buffer (), line.length (), &lbounds);
+  const line_bounds lbounds
+    = print_source_line (row, line.get_buffer (), line.length ());
   if (should_print_annotation_line_p (row))
-    {
-      if (lbounds.m_first_non_ws != INT_MAX)
-	lbounds.convert_to_display_cols (line);
-      print_annotation_line (row, lbounds);
-    }
+    print_annotation_line (row, lbounds);
   if (m_show_labels_p)
     print_any_labels (row);
   print_trailing_fixits (row);
@@ -2774,6 +2784,114 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
 
 }
 
+static void
+test_layout_x_offset_display_tab (const line_table_case &case_)
+{
+  const char *content
+    = "This line is very long, so that we can use it to test the logic for "
+      "clipping long lines.  Also this: `\t' is a tab that occupies 1 byte and "
+      "a variable number of display columns, starting at column #103.\n";
+
+  /* Number of bytes in the line, subtracting one to remove the newline.  */
+  const int line_bytes = strlen (content) - 1;
+
+  /* The column where the tab begins.  Byte and display columns coincide here
+     because no multibyte characters or tabs precede it on the line.  */
+  const int tab_col = 103;
+
+  /* Effective extra size of the tab beyond what a single space would have taken
+     up, indexed by tabstop.  */
+  static const int num_tabstops = 11;
+  int extra_width[num_tabstops];
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      const int this_tab_size = tabstop - (tab_col - 1) % tabstop;
+      extra_width[tabstop] = this_tab_size - 1;
+    }
+  /* Example of this calculation: if tabstop is 10, the tab starting at column
+     #103 has to expand into 8 spaces, covering columns 103-110, so that the
+     next character is at column #111.  So it takes up 7 more columns than
+     a space would have taken up.  */
+  ASSERT_EQ (7, extra_width[10]);
+
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+  line_table_test ltt (case_);
+
+  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+
+  location_t line_end = linemap_position_for_column (line_table, line_bytes);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  /* Check that cpp_display_width handles the tabs as expected.  */
+  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  ASSERT_EQ ('\t', *(lspan.get_buffer () + (tab_col - 1)));
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      ASSERT_EQ (line_bytes + extra_width[tabstop],
+		 cpp_display_width (lspan.get_buffer (), lspan.length (),
+				    tabstop));
+      ASSERT_EQ (line_bytes + extra_width[tabstop],
+		 location_compute_display_column (expand_location (line_end),
+						  tabstop));
+    }
+
+  /* Check that the tab is expanded to the expected number of spaces.  */
+  const int global_tabstop = cpp_get_tabstop ();
+  rich_location richloc (line_table,
+			 linemap_position_for_column (line_table,
+						      tab_col + 1));
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      cpp_set_tabstop (tabstop);
+      test_diagnostic_context dc;
+      layout test_layout (&dc, &richloc, DK_ERROR);
+      test_layout.print_line (1);
+      const char *out = pp_formatted_text (dc.printer);
+      ASSERT_EQ (NULL, strchr (out, '\t'));
+      const char *left_quote = strchr (out, '`');
+      const char *right_quote = strchr (out, '\'');
+      ASSERT_NE (NULL, left_quote);
+      ASSERT_NE (NULL, right_quote);
+      ASSERT_EQ (right_quote - left_quote, extra_width[tabstop] + 2);
+    }
+
+  /* Check that the line is offset properly and that the tab is broken up
+     into the expected number of spaces when it is the last character skipped
+     over.  */
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      cpp_set_tabstop (tabstop);
+      test_diagnostic_context dc;
+      static const int small_width = 24;
+      dc.caret_max_width = small_width - 4;
+      dc.min_margin_width = test_left_margin - test_linenum_sep + 1;
+      dc.show_line_numbers_p = true;
+      layout test_layout (&dc, &richloc, DK_ERROR);
+      test_layout.print_line (1);
+
+      /* We have arranged things so that two columns will be printed before
+	 the caret.  If the tab results in more than one space, this should
+	 produce two spaces in the output; otherwise, it will be a single space
+	 preceded by the opening quote before the tab character.  */
+      const char *output1
+	= "   1 |   ' is a tab that occupies 1 byte and a variable number of "
+	  "display columns, starting at column #103.\n"
+	  "     |   ^\n\n";
+      const char *output2
+	= "   1 | ` ' is a tab that occupies 1 byte and a variable number of "
+	  "display columns, starting at column #103.\n"
+	  "     |   ^\n\n";
+      const char *expected_output = (extra_width[tabstop] ? output1 : output2);
+      ASSERT_STREQ (expected_output, pp_formatted_text (dc.printer));
+    }
+
+  cpp_set_tabstop (global_tabstop);
+}
+
+
 /* Verify that diagnostic_show_locus works sanely on UNKNOWN_LOCATION.  */
 
 static void
@@ -3854,6 +3972,27 @@ test_one_liner_labels_utf8 ()
   }
 }
 
+/* Make sure that colorization codes don't interrupt a multibyte
+   sequence, which would corrupt it.  */
+static void
+test_one_liner_colorized_utf8 ()
+{
+  test_diagnostic_context dc;
+  dc.colorize_source_p = true;
+  diagnostic_color_init (&dc, DIAGNOSTICS_COLOR_YES);
+  const location_t pi = linemap_position_for_column (line_table, 12);
+  rich_location richloc (line_table, pi);
+  diagnostic_show_locus (&dc, &richloc, DK_ERROR);
+
+  /* In order to avoid having the test depend on exactly how the colorization
+     was effected, just confirm there are two pi characters in the output.  */
+  const char *result = pp_formatted_text (dc.printer);
+  const char *null_term = result + strlen (result);
+  const char *first_pi = strstr (result, "\xcf\x80");
+  ASSERT_TRUE (first_pi && first_pi <= null_term - 2);
+  ASSERT_STR_CONTAINS (first_pi + 2, "\xcf\x80");
+}
+
 /* Run the various one-liner tests.  */
 
 static void
@@ -3900,6 +4039,7 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   test_one_liner_many_fixits_1_utf8 ();
   test_one_liner_many_fixits_2_utf8 ();
   test_one_liner_labels_utf8 ();
+  test_one_liner_colorized_utf8 ();
 }
 
 /* Verify that gcc_rich_location::add_location_if_nearby works.  */
@@ -4955,6 +5095,68 @@ test_fixit_deletion_affecting_newline (const line_table_case &case_)
 		pp_formatted_text (dc.printer));
 }
 
+static void
+test_tab_expansion (const line_table_case &case_)
+{
+  /* Force the tabstop to 8; the global value is restored at the end.  */
+  const int global_tabstop = cpp_get_tabstop ();
+  cpp_set_tabstop (8);
+
+  /* Create a tempfile and write some text to it.  This example uses a tabstop
+     of 8, as the column numbers attempt to indicate:
+
+    .....................000.01111111111.22222333333  display
+    .....................123.90123456789.56789012345  columns  */
+  const char *content = "  \t   This: `\t' is a tab.\n";
+  /* ....................000 00000011111 11111222222  byte
+     ....................123 45678901234 56789012345  columns  */
+
+  const int first_non_ws_byte_col = 7;
+  const int right_quote_byte_col = 15;
+  const int last_byte_col = 25;
+  ASSERT_EQ (35, cpp_display_width (content, last_byte_col));
+
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+  line_table_test ltt (case_);
+  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  location_t line_end = linemap_position_for_column (line_table, last_byte_col);
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  /* Check that the leading whitespace with mixed tabs and spaces is expanded
+     into 11 spaces.  Recall that print_line() also puts one space before
+     everything.  */
+  {
+    test_diagnostic_context dc;
+    rich_location richloc (line_table,
+			   linemap_position_for_column (line_table,
+							first_non_ws_byte_col));
+    layout test_layout (&dc, &richloc, DK_ERROR);
+    test_layout.print_line (1);
+    ASSERT_STREQ ("            This: `      ' is a tab.\n"
+		  "            ^\n",
+		  pp_formatted_text (dc.printer));
+  }
+
+  /* Confirm the display width was tracked correctly across the internal tab
+     as well.  */
+  {
+    test_diagnostic_context dc;
+    rich_location richloc (line_table,
+			   linemap_position_for_column (line_table,
+							right_quote_byte_col));
+    layout test_layout (&dc, &richloc, DK_ERROR);
+    test_layout.print_line (1);
+    ASSERT_STREQ ("            This: `      ' is a tab.\n"
+		  "                         ^\n",
+		  pp_formatted_text (dc.printer));
+  }
+
+  cpp_set_tabstop (global_tabstop);
+}
+
 /* Verify that line numbers are correctly printed for the case of
    a multiline range in which the width of the line numbers changes
    (e.g. from "9" to "10").  */
@@ -5012,6 +5214,7 @@ diagnostic_show_locus_c_tests ()
   test_layout_range_for_multiple_lines ();
 
   for_each_line_table_case (test_layout_x_offset_display_utf8);
+  for_each_line_table_case (test_layout_x_offset_display_tab);
 
   test_get_line_bytes_without_trailing_whitespace ();
 
@@ -5029,6 +5232,7 @@ diagnostic_show_locus_c_tests ()
   for_each_line_table_case (test_fixit_insert_containing_newline_2);
   for_each_line_table_case (test_fixit_replace_containing_newline);
   for_each_line_table_case (test_fixit_deletion_affecting_newline);
+  for_each_line_table_case (test_tab_expansion);
 
   test_line_numbers_multiline_range ();
 }
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index ed52bc03d17..120c3258540 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "selftest.h"
 #include "selftest-diagnostic.h"
 #include "opts.h"
+#include "cpplib.h"
 
 #ifdef HAVE_TERMIOS_H
 # include <termios.h>
@@ -219,6 +220,8 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
   context->min_margin_width = 0;
   context->show_ruler_p = false;
   context->parseable_fixits_p = false;
+  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
+  context->column_origin = 1;
   context->edit_context_ptr = NULL;
   context->diagnostic_group_nesting_depth = 0;
   context->diagnostic_group_emission_count = 0;
@@ -353,8 +356,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)
   return diagnostic_kind_color[kind];
 }
 
+/* Given an expanded_location, convert the column (which is in 1-based bytes)
+   to the requested units and origin.  Return -1 if the column is
+   invalid (<= 0).  */
+int
+diagnostic_converted_column (diagnostic_context *context, expanded_location s)
+{
+  if (s.column <= 0)
+    return -1;
+
+  int one_based_col;
+  switch (context->column_unit)
+    {
+    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:
+      one_based_col = location_compute_display_column (s);
+      break;
+
+    case DIAGNOSTICS_COLUMN_UNIT_BYTE:
+      one_based_col = s.column;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return one_based_col + (context->column_origin - 1);
+}
+
 /* Return a formatted line and column ':%line:%column'.  Elided if
-   zero.  The result is a statically allocated buffer.  */
+   line == 0 or col < 0.  (A column of 0 may be valid due to the
+   -fdiagnostics-column-origin option.)
+   The result is a statically allocated buffer.  */
 
 static const char *
 maybe_line_and_column (int line, int col)
@@ -363,8 +395,9 @@ maybe_line_and_column (int line, int col)
 
   if (line)
     {
-      size_t l = snprintf (result, sizeof (result),
-			   col ? ":%d:%d" : ":%d", line, col);
+      size_t l
+	= snprintf (result, sizeof (result),
+		    col >= 0 ? ":%d:%d" : ":%d", line, col);
       gcc_checking_assert (l < sizeof (result));
     }
   else
@@ -383,8 +416,14 @@ diagnostic_get_location_text (diagnostic_context *context,
   const char *locus_cs = colorize_start (pp_show_color (pp), "locus");
   const char *locus_ce = colorize_stop (pp_show_color (pp));
   const char *file = s.file ? s.file : progname;
-  int line = strcmp (file, N_("<built-in>")) ? s.line : 0;
-  int col = context->show_column ? s.column : 0;
+  int line = 0;
+  int col = -1;
+  if (strcmp (file, N_("<built-in>")))
+    {
+      line = s.line;
+      if (context->show_column)
+	col = diagnostic_converted_column (context, s);
+    }
 
   const char *line_col = maybe_line_and_column (line, col);
   return build_message_string ("%s%s%s:%s", locus_cs, file,
@@ -650,14 +689,20 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
       if (! MAIN_FILE_P (map))
 	{
 	  bool first = true;
+	  expanded_location s = {};
 	  do
 	    {
 	      where = linemap_included_from (map);
 	      map = linemap_included_from_linemap (line_table, map);
-	      const char *line_col
-		= maybe_line_and_column (SOURCE_LINE (map, where),
-					 first && context->show_column
-					 ? SOURCE_COLUMN (map, where) : 0);
+	      s.file = LINEMAP_FILE (map);
+	      s.line = SOURCE_LINE (map, where);
+	      int col = -1;
+	      if (first && context->show_column)
+		{
+		  s.column = SOURCE_COLUMN (map, where);
+		  col = diagnostic_converted_column (context, s);
+		}
+	      const char *line_col = maybe_line_and_column (s.line, col);
 	      static const char *const msgs[] =
 		{
 		 N_("In file included from"),
@@ -666,7 +711,7 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
 	      unsigned index = !first;
 	      pp_verbatim (context->printer, "%s%s %r%s%s%R",
 			   first ? "" : ",\n", _(msgs[index]),
-			   "locus", LINEMAP_FILE (map), line_col);
+			   "locus", s.file, line_col);
 	      first = false;
 	    }
 	  while (! MAIN_FILE_P (map));
@@ -2042,10 +2087,15 @@ test_print_parseable_fixits_replace ()
 static void
 assert_location_text (const char *expected_loc_text,
 		      const char *filename, int line, int column,
-		      bool show_column)
+		      bool show_column,
+		      int origin = 1,
+		      enum diagnostics_column_unit column_unit
+			= DIAGNOSTICS_COLUMN_UNIT_BYTE)
 {
   test_diagnostic_context dc;
   dc.show_column = show_column;
+  dc.column_unit = column_unit;
+  dc.column_origin = origin;
 
   expanded_location xloc;
   xloc.file = filename;
@@ -2069,7 +2119,10 @@ test_diagnostic_get_location_text ()
   assert_location_text ("PROGNAME:", NULL, 0, 0, true);
   assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);
   assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);
-  assert_location_text ("foo.c:42:", "foo.c", 42, 0, true);
+  assert_location_text ("foo.c:42:9:", "foo.c", 42, 10, true, 0);
+  assert_location_text ("foo.c:42:1010:", "foo.c", 42, 10, true, 1001);
+  for (int origin = 0; origin != 2; ++origin)
+    assert_location_text ("foo.c:42:", "foo.c", 42, 0, true, origin);
   assert_location_text ("foo.c:", "foo.c", 0, 10, true);
   assert_location_text ("foo.c:42:", "foo.c", 42, 10, false);
   assert_location_text ("foo.c:", "foo.c", 0, 10, false);
@@ -2077,6 +2130,39 @@ test_diagnostic_get_location_text ()
   maybe_line_and_column (INT_MAX, INT_MAX);
   maybe_line_and_column (INT_MIN, INT_MIN);
 
+  {
+    /* In order to test display columns vs byte columns, we need to create a
+       file for location_get_source_line() to read.  */
+
+    const char *const content = "smile \xf0\x9f\x98\x82\n";
+    const int line_bytes = strlen (content) - 1;
+    const int display_width = cpp_display_width (content, line_bytes);
+    ASSERT_EQ (line_bytes - 2, display_width);
+    temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+    const char *const fname = tmp.get_filename ();
+    const int buf_len = strlen (fname) + 16;
+    char *const expected = XNEWVEC (char, buf_len);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    XDELETEVEC (expected);
+  }
+
+
   progname = old_progname;
 }
 
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 307dbcfb34a..ab152a129c9 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -24,6 +24,20 @@ along with GCC; see the file COPYING3.  If not see
 #include "pretty-print.h"
 #include "diagnostic-core.h"
 
+/* An enum for controlling what units to use for the column number
+   when diagnostics are output, used by the -fdiagnostics-column-unit option.
+   Tabs will be expanded or not according to the value of -ftabstop.  The origin
+   (default 1) is controlled by -fdiagnostics-column-origin.  */
+
+enum diagnostics_column_unit
+{
+  /* The new default: display columns.  */
+  DIAGNOSTICS_COLUMN_UNIT_DISPLAY,
+
+  /* The historical behavior: simple bytes.  */
+  DIAGNOSTICS_COLUMN_UNIT_BYTE
+};
+
 /* Enum for overriding the standard output format.  */
 
 enum diagnostics_output_format
@@ -280,6 +294,12 @@ struct diagnostic_context
      rest of the diagnostic.  */
   bool parseable_fixits_p;
 
+  /* What units to use when outputting the column number.  */
+  enum diagnostics_column_unit column_unit;
+
+  /* The origin for the column number (1-based or 0-based typically).  */
+  int column_origin;
+
   /* If non-NULL, an edit_context to which fix-it hints should be
      applied, for generating patches.  */
   edit_context *edit_context_ptr;
@@ -458,6 +478,8 @@ diagnostic_same_line (const diagnostic_context *context,
 }
 
 extern const char *diagnostic_get_color_for_kind (diagnostic_t kind);
+extern int diagnostic_converted_column (diagnostic_context *context,
+					expanded_location s);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
@@ -470,6 +492,7 @@ extern void diagnostic_output_format_init (diagnostic_context *,
 /* Compute the number of digits in the decimal representation of an integer.  */
 extern int num_digits (int);
 
-extern json::value *json_from_expanded_location (location_t loc);
+extern json::value *json_from_expanded_location (diagnostic_context *context,
+						 location_t loc);
 
 #endif /* ! GCC_DIAGNOSTIC_H */
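
Reviewer's sketch of how the new column_origin field appears to interact with a 1-based column, inferred only from the selftest expectations earlier in this patch (column 10 prints as 9 with origin 0 and as 1010 with origin 1001, while column 0 still prints no column). The helper name shift_column_origin is hypothetical and is not the patch's diagnostic_converted_column, which additionally handles the unit conversion.

```c
/* Hypothetical helper, for illustration only: report a 1-based column
   relative to ORIGIN.  A column of 0 (or less) means "no column" and is
   passed through as 0 regardless of the origin.  */
int
shift_column_origin (int one_based_column, int origin)
{
  if (one_based_column <= 0)
    return 0;
  return one_based_column + origin - 1;
}
```

Under these assumptions, shift_column_origin (10, 0) gives 9 and shift_column_origin (10, 1001) gives 1010, matching the strings checked by assert_location_text.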
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 35e8242af5f..aa76f6acbae 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -290,7 +290,9 @@ Objective-C and Objective-C++ Dialects}.
 -fdiagnostics-show-template-tree  -fno-elide-type @gol
 -fdiagnostics-path-format=@r{[}none@r{|}separate-events@r{|}inline-events@r{]} @gol
 -fdiagnostics-show-path-depths @gol
--fno-show-column}
+-fno-show-column @gol
+-fdiagnostics-column-unit=@r{[}display@r{|}byte@r{]} @gol
+-fdiagnostics-column-origin=@var{origin}}
 
 @item Warning Options
 @xref{Warning Options,,Options to Request or Suppress Warnings}.
@@ -4418,6 +4420,29 @@ Do not print column numbers in diagnostics.  This may be necessary if
 diagnostics are being scanned by a program that does not understand the
 column numbers, such as @command{dejagnu}.
 
+@item -fdiagnostics-column-unit=@var{UNIT}
+@opindex fdiagnostics-column-unit
+Select the units for the column number.  This affects traditional diagnostics
+(in the absence of @option{-fno-show-column}), as well as JSON format
+diagnostics if requested.
+
+The default @var{UNIT}, @samp{display}, considers the number of display columns
+occupied by each character.  This may be larger than the number of bytes
+occupied, in the case of tab characters, or it may be smaller, in the case of
+multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' occupies
+two bytes and one display column, while the character ``@U{1F642}'' occupies
+four bytes and two display columns.
+
+Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte
+count in all cases, as was traditionally output by GCC prior to version 11.1.0.
+
+@item -fdiagnostics-column-origin=@var{ORIGIN}
+@opindex fdiagnostics-column-origin
+Select the origin for column numbers, i.e.@: the column number assigned to the
+first column.  The default value of 1 corresponds to traditional GCC
+behavior and to the GNU style guide.  Some utilities may perform better with an
+origin of 0; any non-negative value may be specified.
+
 @item -fdiagnostics-format=@var{FORMAT}
 @opindex fdiagnostics-format
 Select a different format for printing diagnostics.
@@ -4453,11 +4478,15 @@ might be printed in JSON form (after formatting) like this:
         "locations": [
             @{
                 "caret": @{
+                    "display-column": 3,
+                    "byte-column": 3,
                     "column": 3,
                     "file": "misleading-indentation.c",
                     "line": 15
                 @},
                 "finish": @{
+                    "display-column": 4,
+                    "byte-column": 4,
                     "column": 4,
                     "file": "misleading-indentation.c",
                     "line": 15
@@ -4473,6 +4502,8 @@ might be printed in JSON form (after formatting) like this:
                 "locations": [
                     @{
                         "caret": @{
+                            "display-column": 5,
+                            "byte-column": 5,
                             "column": 5,
                             "file": "misleading-indentation.c",
                             "line": 17
@@ -4482,6 +4513,7 @@ might be printed in JSON form (after formatting) like this:
                 "message": "...this statement, but the latter is @dots{}"
             @}
-        ]
+        ],
+        "column-origin": 1
     @},
     @dots{}
 ]
@@ -4494,10 +4526,22 @@ A diagnostic has a @code{kind}.  If this is @code{warning}, then there is
 an @code{option} key describing the command-line option controlling the
 warning.
 
-A diagnostic can contain zero or more locations.  Each location has up
-to three positions within it: a @code{caret} position and optional
-@code{start} and @code{finish} positions.  A location can also have
-an optional @code{label} string.  For example, this error:
+A diagnostic can contain zero or more locations.  Each location has an
+optional @code{label} string and up to three positions within it: a
+@code{caret} position and optional @code{start} and @code{finish} positions.
+A position is described by a @code{file} name, a @code{line} number, and
+three numbers indicating a column position: @code{display-column} counts
+display columns, accounting for tabs and multibyte characters;
+@code{byte-column} counts raw bytes; and @code{column} is equal to one of
+the previous two, as dictated by the @option{-fdiagnostics-column-unit}
+option.  All three columns are relative to the origin specified by
+@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may
+be set, for instance, to 0 for compatibility with other utilities that
+number columns from 0.  The column origin is recorded in the JSON output in
+the @code{column-origin} tag.  In the remaining examples below, the extra
+column number outputs have been omitted for brevity.
+
+For example, this error:
 
 @smallexample
 bad-binary-ops.c:64:23: error: invalid operands to binary + (have 'S' @{aka
diff --git a/gcc/input.c b/gcc/input.c
index dd1d23df2f7..ab2fb7092d1 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -913,7 +913,7 @@ make_location (location_t caret, source_range src_range)
    source line in order to calculate the display width.  If that cannot be done
    for any reason, then returns the byte column as a fallback.  */
 int
-location_compute_display_column (expanded_location exploc)
+location_compute_display_column (expanded_location exploc, int tabstop)
 {
   if (!(exploc.file && *exploc.file && exploc.line && exploc.column))
     return exploc.column;
@@ -921,7 +921,7 @@ location_compute_display_column (expanded_location exploc)
   /* If line is NULL, this function returns exploc.column which is the
      desired fallback.  */
   return cpp_byte_column_to_display_column (line.get_buffer (), line.length (),
-					    exploc.column);
+					    exploc.column, tabstop);
 }
 
 /* Dump statistics to stderr about the memory usage of the line_table
@@ -3612,8 +3612,8 @@ void test_cpp_utf8 ()
   {
     int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8);
     ASSERT_EQ (8, w_bad);
-    int w_ctrl = cpp_display_width ("\r\t\n\v\0\1", 6);
-    ASSERT_EQ (6, w_ctrl);
+    int w_ctrl = cpp_display_width ("\r\n\v\0\1", 5);
+    ASSERT_EQ (5, w_ctrl);
   }
 
   /* Verify that wcwidth of valid UTF-8 is as expected.  */
@@ -3635,6 +3635,15 @@ void test_cpp_utf8 ()
     ASSERT_EQ (18, w_mixed);
   }
 
+  /* Verify that display width properly expands tabs.  */
+  {
+    const char *tstr = "\tabc\td";
+    ASSERT_EQ (6, cpp_display_width (tstr, 6, 1));
+    ASSERT_EQ (10, cpp_display_width (tstr, 6, 3));
+    ASSERT_EQ (17, cpp_display_width (tstr, 6, 8));
+    ASSERT_EQ (1, cpp_display_column_to_byte_column (tstr, 6, 7, 8));
+  }
+
   /* Verify that cpp_byte_column_to_display_column can go past the end,
      and similar edge cases.  */
   {
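
Reviewer's sketch of the tab expansion rule that the new selftest above relies on. It is a simplification, not the libcpp implementation: it assumes ASCII input where every non-tab byte is one column wide, and the name ascii_display_width is made up for this note. A tab advances the running width to the next multiple of the tab stop.

```c
#include <stddef.h>

/* Illustration only: display width of the first LEN bytes of S, assuming
   each non-tab byte occupies exactly one column.  A tab moves the width
   up to the next multiple of TABSTOP; a non-positive TABSTOP makes a tab
   count as a single column.  */
int
ascii_display_width (const char *s, size_t len, int tabstop)
{
  int width = 0;
  for (size_t i = 0; i < len; ++i)
    {
      if (s[i] == '\t' && tabstop > 0)
        width += tabstop - width % tabstop;
      else
        ++width;
    }
  return width;
}
```

For the string "\tabc\td" this yields 6, 10 and 17 for tab stops of 1, 3 and 8, the same values the selftest asserts for cpp_display_width. Applied to the first 11 bytes of the diagnostic-units test line that starts with a tab, it yields 18 at the default tab stop of 8, which is the display column those tests expect.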
diff --git a/gcc/input.h b/gcc/input.h
index df48ce63ef9..906d3ae244b 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -38,7 +38,12 @@ STATIC_ASSERT (BUILTINS_LOCATION < RESERVED_LOCATION_COUNT);
 
 extern bool is_location_from_builtin_token (location_t);
 extern expanded_location expand_location (location_t);
-extern int location_compute_display_column (expanded_location);
+
+/* As with cpp_byte_column_to_display_column(), TABSTOP <= 0 means to use the
+   global default cpp_get_tabstop(), which is typically set with the
+   -ftabstop option.  */
+extern int location_compute_display_column (expanded_location exploc,
+					    int tabstop = 0);
 
 /* A class capturing the bounds of a buffer, to allow for run-time
    bounds-checking in a checked build.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index ec3ca0720f9..f6bd2d2972b 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "opt-suggestions.h"
 #include "diagnostic-color.h"
 #include "selftest.h"
+#include "cpplib.h"
 
 static void set_Wstrict_aliasing (struct gcc_options *opts, int onoff);
 
@@ -2439,6 +2440,14 @@ common_handle_option (struct gcc_options *opts,
       dc->parseable_fixits_p = value;
       break;
 
+    case OPT_fdiagnostics_column_unit_:
+      dc->column_unit = (enum diagnostics_column_unit) value;
+      break;
+
+    case OPT_fdiagnostics_column_origin_:
+      dc->column_origin = value;
+      break;
+
     case OPT_fdiagnostics_show_cwe:
       dc->show_cwe = value;
       break;
@@ -2827,6 +2836,12 @@ common_handle_option (struct gcc_options *opts,
       check_alignment_argument (loc, arg, "functions");
       break;
 
+    case OPT_ftabstop_:
+      /* It is documented that we silently ignore silly values.  */
+      if (value >= 1 && value <= 100)
+	cpp_set_tabstop (value);
+      break;
+
     default:
       /* If the flag was handled in a standard way, assume the lack of
 	 processing here is intentional.  */
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
index 870ba720c5f..2314ad42402 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
@@ -36,20 +36,20 @@ int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
 
 /* { dg-begin-multiline-output "" }
-  if ((err = foo (b)) != 0)
-  ^~
+         if ((err = foo (b)) != 0)
+         ^~
    { dg-end-multiline-output "" } */
 /* { dg-begin-multiline-output "" }
-   goto fail;
-   ^~~~
+                 goto fail;
+                 ^~~~
    { dg-end-multiline-output "" } */
 
 fail:
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
index 5cdeba1cbba..202c6bc7fdf 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
@@ -65,9 +65,9 @@ int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
@@ -178,7 +178,7 @@ void fn_16_tabs (void)
     while (flagA)
       if (flagB) /* { dg-message "7: this 'if' clause does not guard..." } */
 	foo (0);
-	foo (1);/* { dg-message "2: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+	foo (1);/* { dg-message "9: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 }
 
 void fn_17_spaces (void)
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
index 9359db48c17..740becb5548 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
@@ -8,17 +8,22 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#error message\"" } */
 
 /* { dg-regexp "\"caret\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 6" } */
+/* { dg-regexp "\"display-column\": 6" } */
+/* { dg-regexp "\"byte-column\": 6" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
index 557ccf8378b..2f24a6c6596 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
@@ -8,6 +8,7 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"warning\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#warning message\"" } */
 /* { dg-regexp "\"option\": \"-Wcpp\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */
@@ -16,11 +17,15 @@
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 8" } */
+/* { dg-regexp "\"display-column\": 8" } */
+/* { dg-regexp "\"byte-column\": 8" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
index 378205c5bf5..afe96a9048f 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
@@ -8,6 +8,7 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#warning message\"" } */
 /* { dg-regexp "\"option\": \"-Werror=cpp\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */
@@ -16,11 +17,15 @@
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 8" } */
+/* { dg-regexp "\"display-column\": 8" } */
+/* { dg-regexp "\"byte-column\": 8" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
index 2738be6548f..ae51091e0ea 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
@@ -24,15 +24,20 @@ int test (void)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 5" } */
+/* { dg-regexp "\"display-column\": 5" } */
+/* { dg-regexp "\"byte-column\": 5" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 10" } */
+/* { dg-regexp "\"display-column\": 10" } */
+/* { dg-regexp "\"byte-column\": 10" } */
 
 /* The outer diagnostic.  */
 
 /* { dg-regexp "\"kind\": \"warning\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"this 'if' clause does not guard...\"" } */
 /* { dg-regexp "\"option\": \"-Wmisleading-indentation\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wmisleading-indentation\"" } */
@@ -41,11 +46,15 @@ int test (void)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 6" } */
 /* { dg-regexp "\"column\": 3" } */
+/* { dg-regexp "\"display-column\": 3" } */
+/* { dg-regexp "\"byte-column\": 3" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 6" } */
 /* { dg-regexp "\"column\": 4" } */
+/* { dg-regexp "\"display-column\": 4" } */
+/* { dg-regexp "\"byte-column\": 4" } */
 
 /* More from the nested diagnostic (we can't guarantee what order the
    "file" keys are consumed).  */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
index f36e896d228..e0e9ce4be98 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
@@ -13,6 +13,7 @@ int test (struct s *ptr)
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \".*\"" } */
 
 /* Verify fix-it hints.  */
@@ -23,11 +24,15 @@ int test (struct s *ptr)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 15" } */
+/* { dg-regexp "\"display-column\": 15" } */
+/* { dg-regexp "\"byte-column\": 15" } */
 
 /* { dg-regexp "\"next\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 21" } */
+/* { dg-regexp "\"display-column\": 21" } */
+/* { dg-regexp "\"byte-column\": 21" } */
 
 /* { dg-regexp "\"fixits\": \[\[\{\}, \]*\]" } */
 
@@ -35,11 +40,15 @@ int test (struct s *ptr)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 15" } */
+/* { dg-regexp "\"display-column\": 15" } */
+/* { dg-regexp "\"byte-column\": 15" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 20" } */
+/* { dg-regexp "\"display-column\": 20" } */
+/* { dg-regexp "\"byte-column\": 20" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-1.c b/gcc/testsuite/c-c++-common/diagnostic-units-1.c
new file mode 100644
index 00000000000..8d38b7de03e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-2.c b/gcc/testsuite/c-c++-common/diagnostic-units-2.c
new file mode 100644
index 00000000000..29a2edefd9f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-2.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -Wmultichar" } */
+
+/* column units: display (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "25: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-3.c b/gcc/testsuite/c-c++-common/diagnostic-units-3.c
new file mode 100644
index 00000000000..714ee8f2de4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=200 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via fallback from overly large argument) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-4.c b/gcc/testsuite/c-c++-common/diagnostic-units-4.c
new file mode 100644
index 00000000000..f9c9da914b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-4.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 0 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "10: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-5.c b/gcc/testsuite/c-c++-common/diagnostic-units-5.c
new file mode 100644
index 00000000000..99d5299a732
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */
+
+/* column units: display (via arg)
+   column origin: 0 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "24: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-6.c b/gcc/testsuite/c-c++-common/diagnostic-units-6.c
new file mode 100644
index 00000000000..c1e6e4ed477
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-6.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=100 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 100 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "110: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "117: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "118: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-7.c b/gcc/testsuite/c-c++-common/diagnostic-units-7.c
new file mode 100644
index 00000000000..dab221ae235
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-7.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 9 (via arg) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c1 = 'c1';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c2 = 'c2';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+         int c3 = 	'c3'; /* { dg-warning "20: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c3 =          'c3';
+                            ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-8.c b/gcc/testsuite/c-c++-common/diagnostic-units-8.c
new file mode 100644
index 00000000000..d713b32dabc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-8.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */
+
+/* column units: display (via default)
+   column origin: 1 (via default)
+   tabstop: 9 (via arg) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c1 = 'c1';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c2 = 'c2';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+         int c3 = 	'c3'; /* { dg-warning "28: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c3 =          'c3';
+                            ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/missing-close-symbol.c b/gcc/testsuite/c-c++-common/missing-close-symbol.c
index abeb83748c1..9f1de3d0c47 100644
--- a/gcc/testsuite/c-c++-common/missing-close-symbol.c
+++ b/gcc/testsuite/c-c++-common/missing-close-symbol.c
@@ -24,9 +24,9 @@ void test_static_assert_different_line (void)
   _Static_assert(sizeof(int) >= sizeof(char), /* { dg-message "to match this '\\('" } */
 		 "msg"; /* { dg-error "expected '\\)' before ';' token" } */
   /* { dg-begin-multiline-output "" }
-    "msg";
-         ^
-         )
+                  "msg";
+                       ^
+                       )
      { dg-end-multiline-output "" } */
   /* { dg-begin-multiline-output "" }
    _Static_assert(sizeof(int) >= sizeof(char),
diff --git a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
index fab5849dfc7..ebbf3001055 100644
--- a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
+++ b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
@@ -33,10 +33,10 @@ int test_2 (void)
            ~~~~~~~~~~~~~~~~
                          |
                          s
-    + some_other_function ());
-    ^ ~~~~~~~~~~~~~~~~~~~~~~
-                          |
-                          t
+           + some_other_function ());
+           ^ ~~~~~~~~~~~~~~~~~~~~~~
+                                 |
+                                 t
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/g++.dg/parse/error4.C b/gcc/testsuite/g++.dg/parse/error4.C
index 792bf4dc063..fe8de73790d 100644
--- a/gcc/testsuite/g++.dg/parse/error4.C
+++ b/gcc/testsuite/g++.dg/parse/error4.C
@@ -7,4 +7,4 @@ struct X {
 		 int);
 };
 
-// { dg-error "4:'itn' has not been declared" "" { target *-*-* } 6 }
+// { dg-error "18:'itn' has not been declared" "" { target *-*-* } 6 }
diff --git a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
index 96ebb71645c..d2b37a5122d 100644
--- a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
+++ b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
@@ -9,13 +9,13 @@ class A {
 	int	h;
 	A() { i=10; j=20; }
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "16:virtual functions cannot be friends" }
 };
 
 class B : public A {
     public:
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }// { dg-error "" }  member.*// ERROR -  member.*
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "16:virtual functions cannot be friends" }
 // { dg-error "private" "" { target *-*-* } .-1 }
 };
 
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
index b438543d445..bbc9e51aff6 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
@@ -12,5 +12,5 @@ int
 main()
 {
 	C<char*>	c;
-	char*		p = Z(c.O); //{ dg-error "13:'Z' was not declared" } ambiguous c.O
+	char*		p = Z(c.O); //{ dg-error "29:'Z' was not declared" } ambiguous c.O
 }
diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
index 6dc2c55be58..b98e8da6b1e 100644
--- a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
+++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
@@ -48,8 +48,8 @@ ostream& operator<<(ostream& os, Graph<VertexType,EdgeType>& G)
 
         // The compiler does not like this line!!!!!!
         typename Graph<VertexType, EdgeType>::Successor::iterator
-	  startN = G[i].second.begin(), // { dg-error "14:no match" } no index operator
-	  endN   = G[i].second.end();  // { dg-error "14:no match" } no index operator
+	  startN = G[i].second.begin(), // { dg-error "21:no match" } no index operator
+	  endN   = G[i].second.end();  // { dg-error "21:no match" } no index operator
 
         while(startN != endN)
         {
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
index c5ff96e5644..51190c92391 100644
--- a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
@@ -288,7 +288,7 @@ int test_3 (int x, int y)
     |      |     ~~~~~~~~~~
     |      |     |
     |      |     (4) ...to here
-    |   NN |      to dereference it above
+    |   NN |                    to dereference it above
     |   NN |   return *ptr;
     |      |          ~~~~
     |      |          |
diff --git a/gcc/testsuite/gcc.dg/bad-binary-ops.c b/gcc/testsuite/gcc.dg/bad-binary-ops.c
index 46c158e6a5f..45668be0a29 100644
--- a/gcc/testsuite/gcc.dg/bad-binary-ops.c
+++ b/gcc/testsuite/gcc.dg/bad-binary-ops.c
@@ -35,10 +35,10 @@ int test_2 (void)
            ~~~~~~~~~~~~~~~~
            |
            struct s
-    + some_other_function ());
-    ^ ~~~~~~~~~~~~~~~~~~~~~~
-      |
-      struct t
+           + some_other_function ());
+           ^ ~~~~~~~~~~~~~~~~~~~~~~
+             |
+             struct t
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/format/branch-1.c b/gcc/testsuite/gcc.dg/format/branch-1.c
index 1782064645e..4ea39b52b2e 100644
--- a/gcc/testsuite/gcc.dg/format/branch-1.c
+++ b/gcc/testsuite/gcc.dg/format/branch-1.c
@@ -10,7 +10,7 @@ foo (long l, int nfoo)
 {
   printf ((nfoo > 1) ? "%d foos" : "%d foo", nfoo);
   printf ((l > 1) ? "%d foos" /* { dg-warning "23:int" "wrong type in conditional expr" } */
-	          : "%d foo", l); /* { dg-warning "16:int" "wrong type in conditional expr" } */
+	          : "%d foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%ld foos" : "%d foo", l); /* { dg-warning "36:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%d foos" : "%ld foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   /* Should allow one case to have extra arguments.  */
diff --git a/gcc/testsuite/gcc.dg/format/pr79210.c b/gcc/testsuite/gcc.dg/format/pr79210.c
index 71f5dd6e082..6bdabdf21ec 100644
--- a/gcc/testsuite/gcc.dg/format/pr79210.c
+++ b/gcc/testsuite/gcc.dg/format/pr79210.c
@@ -20,4 +20,4 @@ LPFC_VPORT_ATTR_R(peer_port_login,
 		  "Allow peer ports on the same physical port to login to each "
 		  "other.");
 
-/* { dg-warning "6: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
+/* { dg-warning "20: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
index 03b78042107..d7691e4be51 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -540,15 +540,15 @@ void test_builtin_types_compatible_p (unsigned long i)
   __emit_expression_range (0,
 			   f (i) + __builtin_types_compatible_p (long, int)); /* { dg-warning "range" } */
 /* { dg-begin-multiline-output "" }
-       f (i) + __builtin_types_compatible_p (long, int));
-       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                            f (i) + __builtin_types_compatible_p (long, int));
+                            ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    { dg-end-multiline-output "" } */
 
   __emit_expression_range (0,
 			   __builtin_types_compatible_p (long, int) + f (i)); /* { dg-warning "range" } */
 /* { dg-begin-multiline-output "" }
-       __builtin_types_compatible_p (long, int) + f (i));
-       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
+                            __builtin_types_compatible_p (long, int) + f (i));
+                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
    { dg-end-multiline-output "" } */
 }
 
@@ -671,8 +671,8 @@ void test_multiple_ordinary_maps (void)
 /* { dg-begin-multiline-output "" }
    __emit_expression_range (0, foo (0,
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-        "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));
-        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));
+                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    { dg-end-multiline-output "" } */
 
   /* Another expression that transitions between ordinary maps; this
@@ -685,8 +685,8 @@ void test_multiple_ordinary_maps (void)
 /* { dg-begin-multiline-output "" }
    __emit_expression_range (0, foo (0, "012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
 7890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
 4567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789",
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-        0));
-        ~~                      
+                                    0));
+                                    ~~
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
index ac4fa1b52bd..4cba87be2ae 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
@@ -335,11 +335,11 @@ pr87652 (const char *stem, int counter)
 /* { dg-error "unable to read substring location: unable to read source line" "" { target c } 329 } */
 /* { dg-error "unable to read substring location: failed to get ordinary maps" "" { target c++ } 329 } */
 /* { dg-begin-multiline-output "" }
-     __emit_string_literal_range(__FILE__":%5d: " format, \
+     __emit_string_literal_range(__FILE__":%5d: " format,        \
                                  ^~~~~~~~
      { dg-end-multiline-output "" { target c } } */
 /* { dg-begin-multiline-output "" }
-     __emit_string_literal_range(__FILE__":%5d: " format, \
+     __emit_string_literal_range(__FILE__":%5d: " format,        \
                                  ^
      { dg-end-multiline-output "" { target c++ } } */
 
diff --git a/gcc/testsuite/gcc.dg/redecl-4.c b/gcc/testsuite/gcc.dg/redecl-4.c
index 8f124886da8..2c214bb02c7 100644
--- a/gcc/testsuite/gcc.dg/redecl-4.c
+++ b/gcc/testsuite/gcc.dg/redecl-4.c
@@ -15,7 +15,7 @@ f (void)
     /* Should get format warnings even though the built-in declaration
        isn't "visible".  */
     printf (
-	    "%s", 1); /* { dg-warning "8:format" } */
+	    "%s", 1); /* { dg-warning "15:format" } */
     /* The type of strcmp here should have no prototype.  */
     if (0)
       strcmp (1);
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
index 7fade1f65fc..606fe0f891a 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
@@ -8,17 +8,22 @@
 ! We can't rely on any ordering of the keys.
 
 ! { dg-regexp "\"kind\": \"error\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#error message\"" }
 
 ! { dg-regexp "\"caret\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 6" }
+! { dg-regexp "\"display-column\": 6" }
+! { dg-regexp "\"byte-column\": 6" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
index bebcf68d431..56615f0ca5a 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
@@ -8,6 +8,7 @@
 ! We can't rely on any ordering of the keys. 
 
 ! { dg-regexp "\"kind\": \"warning\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#warning message\"" }
 ! { dg-regexp "\"option\": \"-Wcpp\"" }
 ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }
@@ -16,11 +17,15 @@
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 8" }
+! { dg-regexp "\"display-column\": 8" }
+! { dg-regexp "\"byte-column\": 8" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
index 7ab78eb570b..50214759091 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
@@ -8,6 +8,7 @@
 ! We can't rely on any ordering of the keys.
 
 ! { dg-regexp "\"kind\": \"error\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#warning message\"" }
 ! { dg-regexp "\"option\": \"-Werror=cpp\"" }
 ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }
@@ -16,11 +17,15 @@
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 8" }
+! { dg-regexp "\"display-column\": 8" }
+! { dg-regexp "\"byte-column\": 8" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/go.dg/arrayclear.go b/gcc/testsuite/go.dg/arrayclear.go
index 6daebc0b8f5..aa5ba0761d7 100644
--- a/gcc/testsuite/go.dg/arrayclear.go
+++ b/gcc/testsuite/go.dg/arrayclear.go
@@ -1,5 +1,8 @@
 // { dg-do compile }
 // { dg-options "-fgo-debug-optimization" }
+// This comment is necessary to work around a dejagnu bug. Otherwise, the
+// column of the second error message would equal the row of the first one, and
+// since the errors are also identical, dejagnu is not able to distinguish them.
 
 package p
 
diff --git a/gcc/tree-diagnostic-path.cc b/gcc/tree-diagnostic-path.cc
index 381a49cb0b4..82b3c2d6b6a 100644
--- a/gcc/tree-diagnostic-path.cc
+++ b/gcc/tree-diagnostic-path.cc
@@ -493,7 +493,7 @@ default_tree_diagnostic_path_printer (diagnostic_context *context,
    doesn't have access to trees (for m_fndecl).  */
 
 json::value *
-default_tree_make_json_for_path (diagnostic_context *,
+default_tree_make_json_for_path (diagnostic_context *context,
 				 const diagnostic_path *path)
 {
   json::array *path_array = new json::array ();
@@ -504,7 +504,8 @@ default_tree_make_json_for_path (diagnostic_context *,
       json::object *event_obj = new json::object ();
       if (event.get_location ())
 	event_obj->set ("location",
-			json_from_expanded_location (event.get_location ()));
+			json_from_expanded_location (context,
+						     event.get_location ()));
       label_text event_text (event.get_desc (false));
       event_obj->set ("description", new json::string (event_text.m_buffer));
       event_text.maybe_free ();
diff --git a/libcpp/charset.c b/libcpp/charset.c
index d9281c5fb97..66a5f2b7f26 100644
--- a/libcpp/charset.c
+++ b/libcpp/charset.c
@@ -2276,49 +2276,105 @@ cpp_string_location_reader::get_next ()
   return result;
 }
 
-/* Helper for cpp_byte_column_to_display_column and its inverse.  Given a
-   pointer to a UTF-8-encoded character, compute its display width.  *INBUFP
-   points on entry to the start of the UTF-8 encoding of the character, and
-   is updated to point just after the last byte of the encoding.  *INBYTESLEFTP
-   contains on entry the remaining size of the buffer into which *INBUFP
-   points, and this is also updated accordingly.  If *INBUFP does not
+/* This is normally determined by the -ftabstop option.  We need to know it so
+   the display column computations below can expand tabs as well.  */
+
+static int global_tabstop = 8;
+
+int
+cpp_set_tabstop (int t)
+{
+  return global_tabstop = MAX (1, t);
+}
+
+int
+cpp_get_tabstop ()
+{
+  return global_tabstop;
+}
+
+cpp_display_width_computation::
+cpp_display_width_computation (const char *data, int data_length, int tabstop) :
+  m_begin (data),
+  m_next (m_begin),
+  m_bytes_left (data_length),
+  m_tabstop (tabstop > 0 ? tabstop : global_tabstop),
+  m_display_cols (0)
+{}
+
+
+/* The main implementation function for class cpp_display_width_computation.
+   m_next points on entry to the start of the UTF-8 encoding of the next
+   character, and is updated to point just after the last byte of the encoding.
+   m_bytes_left contains on entry the remaining size of the buffer into which
+   m_next points, and this is also updated accordingly.  If m_next does not
    point to a valid UTF-8-encoded sequence, then it will be treated as a single
-   byte with display width 1.  */
+   byte with display width 1.  m_display_cols is the current display column,
+   relative to which tab stops should be expanded.  Returns the display width of
+   the codepoint just processed.  */
 
-static inline int
-compute_next_display_width (const uchar **inbufp, size_t *inbytesleftp)
+int
+cpp_display_width_computation::process_next_codepoint ()
 {
   cppchar_t c;
-  if (one_utf8_to_cppchar (inbufp, inbytesleftp, &c) != 0)
+  int next_width;
+
+  if (*m_next == '\t')
+    {
+      ++m_next;
+      --m_bytes_left;
+      next_width = m_tabstop - (m_display_cols % m_tabstop);
+    }
+  else if (one_utf8_to_cppchar ((const uchar **) &m_next, &m_bytes_left, &c)
+	   != 0)
     {
       /* Input is not convertible to UTF-8.  This could be fine, e.g. in a
 	 string literal, so don't complain.  Just treat it as if it has a width
 	 of one.  */
-      ++*inbufp;
-      --*inbytesleftp;
-      return 1;
+      ++m_next;
+      --m_bytes_left;
+      next_width = 1;
     }
+  else
+    {
+      /*  one_utf8_to_cppchar() has updated m_next and m_bytes_left for us.  */
+      next_width = cpp_wcwidth (c);
+    }
+
+  m_display_cols += next_width;
+  return next_width;
+}
 
-  /*  one_utf8_to_cppchar() has updated inbufp and inbytesleftp for us.  */
-  return cpp_wcwidth (c);
+/*  Utility to advance the byte stream by the minimum amount needed to consume
+    N display columns.  Returns the number of display columns that were
+    actually skipped.  This could be less than N, if there was not enough data,
+    or more than N, if the last character to be skipped had a sufficiently large
+    display width.  */
+int
+cpp_display_width_computation::advance_display_cols (int n)
+{
+  const int start = m_display_cols;
+  const int target = start + n;
+  while (m_display_cols < target && !done ())
+    process_next_codepoint ();
+  return m_display_cols - start;
 }
 
 /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute
     how many display columns are occupied by the first COLUMN bytes.  COLUMN
     may exceed DATA_LENGTH, in which case the phantom bytes at the end are
-    treated as if they have display width 1.  */
+    treated as if they have display width 1.  Tabs are expanded to the next tab
+    stop, relative to the start of DATA.  */
 
 int
 cpp_byte_column_to_display_column (const char *data, int data_length,
-				   int column)
+				   int column, int tabstop)
 {
-  int display_col = 0;
-  const uchar *udata = (const uchar *) data;
   const int offset = MAX (0, column - data_length);
-  size_t inbytesleft = column - offset;
-  while (inbytesleft)
-    display_col += compute_next_display_width (&udata, &inbytesleft);
-  return display_col + offset;
+  cpp_display_width_computation dw (data, column - offset, tabstop);
+  while (!dw.done ())
+    dw.process_next_codepoint ();
+  return dw.display_cols_processed () + offset;
 }
 
 /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute
@@ -2328,14 +2384,11 @@ cpp_byte_column_to_display_column (const char *data, int data_length,
 
 int
 cpp_display_column_to_byte_column (const char *data, int data_length,
-				   int display_col)
+				   int display_col, int tabstop)
 {
-  int column = 0;
-  const uchar *udata = (const uchar *) data;
-  size_t inbytesleft = data_length;
-  while (column < display_col && inbytesleft)
-      column += compute_next_display_width (&udata, &inbytesleft);
-  return data_length - inbytesleft + MAX (0, display_col - column);
+  cpp_display_width_computation dw (data, data_length, tabstop);
+  const int avail_display = dw.advance_display_cols (display_col);
+  return dw.bytes_processed () + MAX (0, display_col - avail_display);
 }
 
 /* Our own version of wcwidth().  We don't use the actual wcwidth() in glibc,
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 03cc72a12e2..9bf866ad7b6 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -312,9 +312,6 @@ enum cpp_normalize_level {
    carries all the options visible to the command line.  */
 struct cpp_options
 {
-  /* Characters between tab stops.  */
-  unsigned int tabstop;
-
   /* The language we're preprocessing.  */
   enum c_lang lang;
 
@@ -1322,14 +1319,48 @@ extern const char * cpp_get_userdef_suffix
   (const cpp_token *);
 
 /* In charset.c */
+
+/* A class to manage the state while converting a UTF-8 sequence to cppchar_t
+   and computing the display width one character at a time.  */
+class cpp_display_width_computation {
+ public:
+  /* TABSTOP <= 0 means to use cpp_get_tabstop().  */
+  cpp_display_width_computation (const char *data, int data_length,
+				 int tabstop = 0);
+  const char *next_byte () const { return m_next; }
+  int bytes_processed () const { return m_next - m_begin; }
+  int bytes_left () const { return m_bytes_left; }
+  bool done () const { return !bytes_left (); }
+  int display_cols_processed () const { return m_display_cols; }
+
+  int process_next_codepoint ();
+  int advance_display_cols (int n);
+
+ private:
+  const char *const m_begin;
+  const char *m_next;
+  size_t m_bytes_left;
+  const int m_tabstop;
+  int m_display_cols;
+};
+
+/* Convenience functions that are simple use cases for class
+   cpp_display_width_computation.  Tab characters will be expanded to spaces
+   as determined by TABSTOP.  If TABSTOP <= 0, the tab width is set to the
+   global default cpp_get_tabstop (), which is typically set with the
+   -ftabstop option.  */
 int cpp_byte_column_to_display_column (const char *data, int data_length,
-				       int column);
-inline int cpp_display_width (const char *data, int data_length)
+				       int column, int tabstop = 0);
+inline int cpp_display_width (const char *data, int data_length,
+			      int tabstop = 0)
 {
-    return cpp_byte_column_to_display_column (data, data_length, data_length);
+  return cpp_byte_column_to_display_column (data, data_length, data_length,
+					    tabstop);
 }
 int cpp_display_column_to_byte_column (const char *data, int data_length,
-				       int display_col);
+				       int display_col, int tabstop = 0);
 int cpp_wcwidth (cppchar_t c);
+int cpp_set_tabstop (int t);
+int cpp_get_tabstop ();
 
 #endif /* ! LIBCPP_CPPLIB_H */
diff --git a/libcpp/init.c b/libcpp/init.c
index a3cd8e28f62..cb0d5006339 100644
--- a/libcpp/init.c
+++ b/libcpp/init.c
@@ -190,7 +190,6 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, discard_comments) = 1;
   CPP_OPTION (pfile, discard_comments_in_macro_exp) = 1;
   CPP_OPTION (pfile, max_include_depth) = 200;
-  CPP_OPTION (pfile, tabstop) = 8;
   CPP_OPTION (pfile, operator_names) = 1;
   CPP_OPTION (pfile, warn_trigraphs) = 2;
   CPP_OPTION (pfile, warn_endif_labels) = 1;
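
For readers who want to check the expected columns in the new diagnostic-units tests by hand, here is a minimal standalone sketch of the tab-expansion rule used in process_next_codepoint above (next_width = tabstop - (cols % tabstop)). It is not the libcpp code: display_width and display_column are hypothetical names, every non-tab byte is assumed to be one column wide (no UTF-8 or cpp_wcwidth handling), and the origin adjustment (1-based column - 1 + origin) is an assumption about how -fdiagnostics-column-origin is applied.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper, not part of libcpp: display width of the first
// NBYTES bytes of LINE, expanding each tab to the next multiple of TABSTOP.
// Assumption: every non-tab byte occupies exactly one display column.
int display_width (const std::string &line, int nbytes, int tabstop)
{
  tabstop = std::max (1, tabstop);
  const int limit = std::min (nbytes, static_cast<int> (line.size ()));
  int cols = 0;
  for (int i = 0; i < limit; ++i)
    {
      if (line[i] == '\t')
        cols += tabstop - (cols % tabstop);  // advance to the next tab stop
      else
        cols += 1;
    }
  // Bytes requested past the end of the line count as width 1 each,
  // mirroring the "phantom bytes" rule described in the patch comments.
  return cols + std::max (0, nbytes - limit);
}

// Hypothetical helper: convert a 1-based byte column to a display column
// reported relative to ORIGIN (assumed formula: 1-based value - 1 + origin).
int display_column (const std::string &line, int byte_col, int tabstop,
                    int origin = 1)
{
  return display_width (line, byte_col, tabstop) - 1 + origin;
}
```

With -ftabstop=9, the opening quote of 'c3' in diagnostic-units-8.c is byte 20 of its line (nine spaces, the nine bytes of "int c3 = ", one tab, then the quote), and display_column ("         int c3 = \t'c3';", 20, 9) gives 28, which is the column the test expects. The same line reports byte column 20 when byte units are selected.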
Richard Sandiford via Gcc-patches June 9, 2020, 4:45 p.m. | #6
May I please ping this patch?
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545426.html

Thanks!

-Lewis

On Fri, May 08, 2020 at 03:35:25PM -0400, Lewis Hyatt wrote:
> On Fri, Jan 31, 2020 at 03:31:59PM -0500, David Malcolm wrote:
> > On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> > > Hello-
> > > 
> > > Here is the second patch that I mentioned when I submitted the other
> > > related
> > > patch (which is awaiting review):
> > > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html. 
> > 
> > Sorry about that; I'm v. busy with analyzer bugs right now.
> > 
> > > This second patch
> > > is based on top of the first one and it closes out PR49973 and
> > > PR86904 by
> > > adding the new option -fdiagnostics-column-unit=[display|byte]. This
> > > allows
> > > to specify whether columns are output as simple byte counts (the
> > > current
> > > behavior), or as display columns including handling multibyte
> > > characters and
> > > tabs. The patch makes display columns the new default. Additionally,
> > > a
> > > second new option -fdiagnostics-column-origin is added, which allows
> > > to make
> > > the column 0-based (or N-based for any N) instead of 1-based. The
> > > default
> > > remains at 1-based as it is now.
> > > 
> > > A number of testcases were explicitly testing for the old behavior,
> > > so I
> > > have updated them to test for the new behavior instead, since the
> > > column
> > > number adjusted for tabs is more natural to test for, and matches
> > > what
> > > editors typically show (give or take 1 for the origin convention).
> > > 
> > > One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It
> > > failed
> > > after this patch, although it doesn't test for any column numbers.
> > > The
> > > answer turned out to be, this test checks for identical error text on
> > > two
> > > different lines. When the column units are changed to display
> > > columns, then
> > > the column of the second error happens to match the line of the first
> > > one. dejagnu then misinterprets the second error as if it matched the
> > > location of the first one (it doesn't distinguish whether it checks
> > > for the
> > > line number or the column number in the output). I added a comment to
> > > the
> > > test explaining the situation; since adding the comment has the side
> > > effect
> > > of making the first line number no longer match the second column
> > > number, it
> > > also makes the test pass again.
> > > 
> > > It wasn't quite clear to me whether this change was appropriate for
> > > GCC 10
> > > or not at this point. We discussed it a couple months ago here:
> > > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either way,
> > > I hope
> > > it isn't a problem that I submitted the patch for review now, whether
> > > it
> > > will end up in 10 or 11. Please let me know what's normally expected?
> > > Thanks!
> > 
> > Thanks Lewis.
> > 
> > This patch looks very promising, but should wait until gcc 11; we're
> > trying to stabilize gcc 10 right now (I'm knee-deep in analyzer bug-
> > fixing, so I don't want to add any more diagnostics changes).
> >
> 
> Hi Dave-

> 

> Well GCC 10 was released for a whole day so I thought I would bug you with this

> patch again now :). To summarize, I previously sent this in two separate parts.

> 

> Part 1: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01626.html

> Part 2: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg02108.html

> 

> Part 1 added the support for converting tabs to spaces when outputting

> diagnostics. Part 2 added the new options -fdiagnostics-column-unit and

> -fdiagnostics-column-origin to control whether the column number is printed

> in display or byte units. Together they resolve both PR49973 and PR86904.
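
As a rough illustration of the difference between the two units (a minimal sketch, not the code in the patch): the helper below, `display_column`, is a hypothetical name. It assumes valid UTF-8 input, one display column per code point, and a tab stop of 8 unless another value is passed. The real conversion in libcpp also accounts for wide characters, which this sketch ignores.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not GCC's implementation.
// Returns the 1-based display column of the character that starts at the
// 1-based byte column BYTE_COL of LINE.
// Assumptions: valid UTF-8, every code point is one column wide, and a tab
// advances to the next multiple of TABSTOP.
int display_column (const std::string &line, int byte_col, int tabstop = 8)
{
  int cols = 0;  // display columns consumed by the bytes before BYTE_COL
  const int limit = static_cast<int> (line.size ());
  for (int i = 0; i < byte_col - 1 && i < limit; ++i)
    {
      const unsigned char c = static_cast<unsigned char> (line[i]);
      if (c == '\t')
        cols += tabstop - cols % tabstop;  // jump to the next tab stop
      else if ((c & 0xC0) != 0x80)
        ++cols;  // count UTF-8 lead bytes only, skip continuation bytes
    }
  return cols + 1;
}
```

Under these assumptions, a token that follows a single leading tab sits at byte column 2 but display column 9, and a character that follows a two-byte UTF-8 sequence sits at byte column 3 but display column 2.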

> 

> You provided me with feedback on part 2, which is quoted below with some

> notes interspersed. The new version of the patch incorporates all of your

> suggestions. Part 1 has not changed other than some trivial rebasing

> conflicts. The two patches touch nearly disjoint sets of files and are

> logically linked together, so I thought it would be simpler if I just sent

> one combined patch now. If you prefer them to be separated as before, please

> let me know and I can send them that way as well.

> 

> Bootstrap and reg tests were done on x86-64 Linux for all languages.  Tests

> look good:

> 

> type          before   after
> FAIL              96      96
> PASS          474637  475097
> UNSUPPORTED    11607   11607
> UNTESTED         195     195
> XFAIL           1816    1816
> XPASS             36      36

> 

> > 

> > > gcc/ChangeLog:

> > > 

> > > 2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>

> > >

> > 

> > Please reference the PRs here

> > 

> > [...]

> > 

> > > gcc/testsuite/ChangeLog:

> > > 

> > > 2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>

> > 

> > Likewise here.

> > 

> > [...]

> >

> 

> Done.

> 

> > > diff --git a/gcc/common.opt b/gcc/common.opt

> > > index 630c380bd6a..657985450c2 100644

> > > --- a/gcc/common.opt

> > > +++ b/gcc/common.opt

> > > @@ -1309,6 +1309,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)

> > >  EnumValue

> > >  Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)

> > >  

> > > +fdiagnostics-column-unit=

> > > +Common Joined RejectNegative Enum(diagnostics_column_unit)

> > > +-fdiagnostics-column-unit=[display|byte]	Select units for column numbers.

> > Should this line mention the default?

> >

> 

> Done.

> 

> > > +fdiagnostics-column-origin=

> > > +Common Joined RejectNegative UInteger

> > > +-fdiagnostics-column-origin=<number>	Set the number of the first column.  Default 1-based.

> > 

> > These new options should be documented in gcc/doc/invoke.texi.

> > 

> > [...]

> >

> 

> Done.

> 

> > > @@ -43,21 +44,23 @@ static json::array *cur_children_array;

> > >  /* Generate a JSON object for LOC.  */

> > >  

> > >  json::value *

> > > -json_from_expanded_location (location_t loc)

> > > +json_from_expanded_location (diagnostic_context *context, location_t loc)

> > >  {

> > >    expanded_location exploc = expand_location (loc);

> > >    json::object *result = new json::object ();

> > >    if (exploc.file)

> > >      result->set ("file", new json::string (exploc.file));

> > >    result->set ("line", new json::integer_number (exploc.line));

> > > -  result->set ("column", new json::integer_number (exploc.column));

> > > +  const int col = diagnostic_converted_column (context, exploc);

> > > +  result->set ("column", new json::integer_number (col));

> > 

> > I wonder if the JSON output format should show *both* values: perhaps

> > add fields "byte-column" and "display-column", and retain the field

> > "column", which would follow -fdiagnostics-column-unit?

> > 

> > [...]

> >

> 

> Done. Adjusted the docs for JSON output as well.
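
The shape of the resulting location object can be sketched as follows. This is an illustration only: `location_json` and `Unit` are hypothetical names, the string is built by hand rather than with GCC's `json::object`, and the "file" field is left out for brevity. The field names "display-column", "byte-column" and "column" are the ones discussed above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the -fdiagnostics-column-unit= setting.
enum class Unit { Display, Byte };

// Hypothetical sketch: emit both column values, plus a "column" field that
// follows whichever unit was selected.
std::string location_json (int line, int display_col, int byte_col,
                           Unit selected)
{
  const int column = (selected == Unit::Display) ? display_col : byte_col;
  return "{\"line\": " + std::to_string (line)
         + ", \"display-column\": " + std::to_string (display_col)
         + ", \"byte-column\": " + std::to_string (byte_col)
         + ", \"column\": " + std::to_string (column) + "}";
}
```

A consumer that always wants bytes can read "byte-column" directly, while one that only knows about "column" keeps working and sees the unit chosen on the command line.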

> 

> > > @@ -219,6 +220,8 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)

> > >    context->min_margin_width = 0;

> > >    context->show_ruler_p = false;

> > >    context->parseable_fixits_p = false;

> > > +  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;

> > > +  context->column_adj = 0;

> > 

> > I'm not sure, but I think I prefer it if we store the column origin

> > instead, rather than an offset relative to an origin of 1.

> > 

> > [...]

> > 

> > > @@ -338,8 +341,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)

> > >    return diagnostic_kind_color[kind];

> > >  }

> > >  

> > > +/* Given an expanded_location, convert the column (which is in 1-based bytes)

> > > +   to the requested units and origin.  Return -1 if the column is

> > > +   invalid (<= 0).  */

> > > +int

> > > +diagnostic_converted_column (diagnostic_context *context, expanded_location s)

> > > +{

> > > +  if (s.column <= 0)

> > > +    return -1;

> > > +

> > > +  int col;

> > 

> > ...so this would be one_based_col.

> > 

> > > +  switch (context->column_unit)

> > > +    {

> > > +    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:

> > > +      col = location_compute_display_column (s);

> > > +      break;

> > > +

> > > +    case DIAGNOSTICS_COLUMN_UNIT_BYTE:

> > > +      col = s.column;

> > > +      break;

> > > +

> > > +    default:

> > > +      gcc_unreachable ();

> > > +    }

> > > +

> > > +  return col + context->column_adj;

> > 

> > ...and this would be (I think):

> > 

> >      return context->column_origin + one_based_col - 1;

> > 

> > It would be doing the -1 each time, but maybe it's conceptually clearer?

> > I'm not sure.

> >

> 

> Sure, done.
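
The suggested arithmetic can be sketched in isolation like this. It is a simplified stand-in, not the patch itself: `Context`, `ColumnUnit` and `converted_column` are hypothetical names, the display column is passed in rather than computed, and the -1 result for an invalid column mirrors the earlier quoted version of the function.

```cpp
#include <cassert>

// Hypothetical stand-ins for the diagnostic context members discussed above.
enum class ColumnUnit { Display, Byte };

struct Context
{
  ColumnUnit column_unit;
  int column_origin;  // number assigned to the first column, e.g. 0 or 1
};

// BYTE_COL and DISPLAY_COL are both 1-based.  Pick the requested unit, then
// shift from origin 1 to the stored origin.
int converted_column (const Context &ctx, int byte_col, int display_col)
{
  if (byte_col <= 0)
    return -1;  // invalid column, as in the quoted snippet
  const int one_based_col
    = (ctx.column_unit == ColumnUnit::Display) ? display_col : byte_col;
  return ctx.column_origin + one_based_col - 1;
}
```

Storing the origin itself means the default configuration holds the value 1, so the stored value matches what the user typed on the command line, at the cost of one extra subtraction per conversion.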

> 

> > [...]

> > 

> > > @@ -882,8 +930,10 @@ print_parseable_fixits (pretty_printer *pp, rich_location *richloc)

> > >        location_t next_loc = hint->get_next_loc ();

> > >        expanded_location next_exploc = expand_location (next_loc);

> > >        pp_printf (pp, ":{%i:%i-%i:%i}:",

> > > -		 start_exploc.line, start_exploc.column,

> > > -		 next_exploc.line, next_exploc.column);

> > > +		 start_exploc.line,

> > > +		 diagnostic_converted_column (context, start_exploc),

> > > +		 next_exploc.line,

> > > +		 diagnostic_converted_column (context, next_exploc));

> > >        print_escaped_string (pp, hint->get_string ());

> > >        pp_newline (pp);

> > >      }

> > 

> > If we're going to change the output of parseable fixits, that takes us away

> > from bug-for-bug-compatibility with clang in this area.

> > 

> > That should be documented, at least.

> >

> 

> I didn't mean to do anything controversial here. I assumed this should change
> for consistency and didn't realize it needed to match an existing standard. I
> removed this part of the patch for now, and can send it in a separate one if
> there's a desire to change this.

> 

> > [...]

> > 

> > There's selftest coverage which is good; it would be good to *also*

> > have a few simple DejaGnu-based tests, showing the explicit use of both

> > units, and trying some offset values, with some lines with tabs, some

> > with spaces (if nothing else to verify that the option-parsing is wired

> > up correctly).

> >

> 

> Done.

> 

> > I'm nit-picking - apart from the lack of docs, this looks very

> > promising.  But as I said earlier, this should wait until gcc 11.

> > 

> > Thanks

> > Dave

> > 

> 

> Thanks again for your time!

> 

> -Lewis


> gcc/ChangeLog:

> 

> 2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

> 

> 	PR preprocessor/49973

> 	PR other/86904

> 	* common.opt: Handle -ftabstop here instead of in c-family

> 	options.  Add -fdiagnostics-column-unit= and

> 	-fdiagnostics-column-origin= options.

> 	* opts.c (common_handle_option): Handle the new options.

> 	* diagnostic-format-json.cc (json_from_expanded_location): Add

> 	diagnostic_context argument.  Use it to convert column numbers as per

> 	the new options.

> 	(json_from_location_range): Likewise.

> 	(json_from_fixit_hint): Likewise.

> 	(json_end_diagnostic): Pass the new context argument to helper

> 	functions above.  Add "column-origin" field to the output.

> 	(test_unknown_location): Add the new context argument to calls to

> 	helper functions.

> 	(test_bad_endpoints): Likewise.

> 	* diagnostic-show-locus.c (struct line_bounds): Clarify that the

> 	units are now always display columns.  Rename members accordingly.

> 	Add constructor.

> 	(layout::print_source_line): Add support for tab expansion.

> 	(layout::print_annotation_line): Adapt to struct line_bounds changes.

> 	(layout::print_line): Likewise.

> 	(test_layout_x_offset_display_tab): New selftest.

> 	(test_one_liner_colorized_utf8): Likewise.

> 	(test_tab_expansion): Likewise.

> 	(test_diagnostic_show_locus_one_liner_utf8): Call the new tests.

> 	(diagnostic_show_locus_c_tests): Likewise.

> 	* diagnostic.c (diagnostic_initialize): Initialize new column_unit and

> 	column_origin members.

> 	(diagnostic_converted_column): New function.

> 	(maybe_line_and_column): Be willing to output a column of 0.

> 	(diagnostic_get_location_text): Convert column number as per the new

> 	options.

> 	(diagnostic_report_current_module): Likewise.

> 	(assert_location_text): Add origin and column_unit arguments for

> 	testing the new functionality.

> 	(test_diagnostic_get_location_text): Test the new functionality.

> 	* diagnostic.h (enum diagnostics_column_unit): New enum.

> 	(struct diagnostic_context): Add members for the new options.

> 	(diagnostic_converted_column): Declare.

> 	(json_from_expanded_location): Add new context argument.

> 	* doc/invoke.texi: Document the new options.

> 	* input.h (location_compute_display_column): Add tabstop argument.

> 	* input.c (location_compute_display_column): Likewise.

> 	(test_cpp_utf8): Add selftests for tab expansion.

> 	* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the

> 	new context argument to json_from_expanded_location().

> 

> gcc/c-family/ChangeLog:

> 

> 2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

> 

> 	PR other/86904

> 	* c-indentation.c (should_warn_for_misleading_indentation): Get

> 	global tabstop from the new source.

> 	* c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which

> 	is now a common option.

> 	* c.opt: Likewise.

> 

> gcc/testsuite/ChangeLog:

> 

> 2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

> 

> 	PR preprocessor/49973

> 	PR other/86904

> 	* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output

> 	for new defaults.

> 	* c-c++-common/Wmisleading-indentation.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-1.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-2.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-3.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-4.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-5.c: Likewise.

> 	* c-c++-common/missing-close-symbol.c: Likewise.

> 	* g++.dg/diagnostic/bad-binary-ops.C: Likewise.

> 	* g++.dg/parse/error4.C: Likewise.

> 	* g++.old-deja/g++.brendan/crash11.C: Likewise.

> 	* g++.old-deja/g++.pt/overload2.C: Likewise.

> 	* g++.old-deja/g++.robertl/eb109.C: Likewise.

> 	* gcc.dg/analyzer/malloc-paths-9.c: Likewise.

> 	* gcc.dg/bad-binary-ops.c: Likewise.

> 	* gcc.dg/format/branch-1.c: Likewise.

> 	* gcc.dg/format/pr79210.c: Likewise.

> 	* gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.

> 	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.

> 	* gcc.dg/redecl-4.c: Likewise.

> 	* gfortran.dg/diagnostic-format-json-1.F90: Likewise.

> 	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.

> 	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.

> 	* go.dg/arrayclear.go: Add a comment explaining why adding a

> 	comment was necessary to work around a dejagnu bug.

> 	* c-c++-common/diagnostic-units-1.c: New test.

> 	* c-c++-common/diagnostic-units-2.c: New test.

> 	* c-c++-common/diagnostic-units-3.c: New test.

> 	* c-c++-common/diagnostic-units-4.c: New test.

> 	* c-c++-common/diagnostic-units-5.c: New test.

> 	* c-c++-common/diagnostic-units-6.c: New test.

> 	* c-c++-common/diagnostic-units-7.c: New test.

> 	* c-c++-common/diagnostic-units-8.c: New test.

> 

> libcpp/ChangeLog:

> 

> 2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

> 

> 	PR preprocessor/49973

> 	PR other/86904

> 	* include/cpplib.h (struct cpp_options): Removed support for -ftabstop,

> 	which is now handled by cpp_set_tabstop ().

> 	(class cpp_display_width_computation): New class.

> 	(cpp_byte_column_to_display_column): Add optional tabstop argument.

> 	(cpp_display_width): Likewise.

> 	(cpp_display_column_to_byte_column): Likewise.

> 	(cpp_set_tabstop): New function.

> 	(cpp_get_tabstop): Likewise.

> 	* charset.c (global_tabstop): New static variable.

> 	(cpp_set_tabstop): New function to access global_tabstop.

> 	(cpp_get_tabstop): Likewise.

> 	(cpp_display_width_computation::cpp_display_width_computation): New

> 	function.

> 	(compute_next_display_width): Removed and implemented this

> 	functionality in a new function...

> 	(cpp_display_width_computation::process_next_codepoint): ...here.

> 	(cpp_display_width_computation::advance_display_cols): New function.

> 	(cpp_byte_column_to_display_column): Added tabstop argument.

> 	Reimplemented in terms of class cpp_display_width_computation.

> 	(cpp_display_column_to_byte_column): Likewise.

> 	* init.c (cpp_create_reader): Remove handling of -ftabstop, which is now

> 	handled via cpp_set_tabstop().


> commit 080d5f5ac4c18c5b8dd5d4fdd43034624e4f55a9

> Author: Lewis Hyatt <lhyatt@gmail.com>

> Date:   Fri Jan 17 17:53:58 2020 -0500

> 

>     diagnostics: Support conversion of tabs to spaces [PR49973] [PR86904]

> 

> diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c

> index 9fba3bcc67c..fa4739c47a9 100644

> --- a/gcc/c-family/c-indentation.c

> +++ b/gcc/c-family/c-indentation.c

> @@ -299,7 +299,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo,

>    expanded_location next_stmt_exploc = expand_location (next_stmt_loc);

>    expanded_location guard_exploc = expand_location (guard_loc);

>  

> -  const unsigned int tab_width = cpp_opts->tabstop;

> +  const unsigned int tab_width = cpp_get_tabstop ();

>  

>    /* They must be in the same file.  */

>    if (next_stmt_exploc.file != body_exploc.file)

> diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c

> index 58ba0948e79..cddf1e28e1d 100644

> --- a/gcc/c-family/c-opts.c

> +++ b/gcc/c-family/c-opts.c

> @@ -504,12 +504,6 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,

>  	cpp_opts->track_macro_expansion = 2;

>        break;

>  

> -    case OPT_ftabstop_:

> -      /* It is documented that we silently ignore silly values.  */

> -      if (value >= 1 && value <= 100)

> -	cpp_opts->tabstop = value;

> -      break;

> -

>      case OPT_fexec_charset_:

>        cpp_opts->narrow_charset = arg;

>        break;

> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt

> index c49da99d395..dbdb78e0ad3 100644

> --- a/gcc/c-family/c.opt

> +++ b/gcc/c-family/c.opt

> @@ -1876,10 +1876,6 @@ Enum(strong_eval_order) String(some) Value(1)

>  EnumValue

>  Enum(strong_eval_order) String(all) Value(2)

>  

> -ftabstop=

> -C ObjC C++ ObjC++ Joined RejectNegative UInteger

> --ftabstop=<number>	Distance between tab stops for column reporting.

> -

>  ftemplate-backtrace-limit=

>  C++ ObjC++ Joined RejectNegative UInteger Var(template_backtrace_limit) Init(10)

>  Set the maximum number of template instantiation notes for a single warning or error.

> diff --git a/gcc/common.opt b/gcc/common.opt

> index 30d05734d16..e3c62a3e7ea 100644

> --- a/gcc/common.opt

> +++ b/gcc/common.opt

> @@ -1321,6 +1321,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)

>  EnumValue

>  Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)

>  

> +fdiagnostics-column-unit=

> +Common Joined RejectNegative Enum(diagnostics_column_unit)

> +-fdiagnostics-column-unit=[display|byte]	Select whether column numbers are output as display columns (default) or raw bytes.

> +

> +fdiagnostics-column-origin=

> +Common Joined RejectNegative UInteger

> +-fdiagnostics-column-origin=<number>	Set the number of the first column.  The default is 1-based as per GNU style, but some utilities may expect 0-based, for example.

> +

>  fdiagnostics-format=

>  Common Joined RejectNegative Enum(diagnostics_output_format)

>  -fdiagnostics-format=[text|json]	Select output format.

> @@ -1329,6 +1337,15 @@ Common Joined RejectNegative Enum(diagnostics_output_format)

>  SourceInclude

>  diagnostic.h

>  

> +Enum

> +Name(diagnostics_column_unit) Type(int)

> +

> +EnumValue

> +Enum(diagnostics_column_unit) String(display) Value(DIAGNOSTICS_COLUMN_UNIT_DISPLAY)

> +

> +EnumValue

> +Enum(diagnostics_column_unit) String(byte) Value(DIAGNOSTICS_COLUMN_UNIT_BYTE)

> +

>  Enum

>  Name(diagnostics_output_format) Type(int)

>  

> @@ -1358,6 +1375,10 @@ fdiagnostics-path-format=

>  Common Joined RejectNegative Var(flag_diagnostics_path_format) Enum(diagnostic_path_format) Init(DPF_INLINE_EVENTS)

>  Specify how to print any control-flow path associated with a diagnostic.

>  

> +ftabstop=

> +Common Joined RejectNegative UInteger

> +-ftabstop=<number>      Distance between tab stops for column reporting.

> +

>  Enum

>  Name(diagnostic_path_format) Type(int)

>  

> diff --git a/gcc/diagnostic-format-json.cc b/gcc/diagnostic-format-json.cc

> index 7bda5c4ba83..465c42fdfde 100644

> --- a/gcc/diagnostic-format-json.cc

> +++ b/gcc/diagnostic-format-json.cc

> @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see

>  #include "system.h"

>  #include "coretypes.h"

>  #include "diagnostic.h"

> +#include "selftest-diagnostic.h"

>  #include "diagnostic-metadata.h"

>  #include "json.h"

>  #include "selftest.h"

> @@ -43,21 +44,43 @@ static json::array *cur_children_array;

>  /* Generate a JSON object for LOC.  */

>  

>  json::value *

> -json_from_expanded_location (location_t loc)

> +json_from_expanded_location (diagnostic_context *context, location_t loc)

>  {

>    expanded_location exploc = expand_location (loc);

>    json::object *result = new json::object ();

>    if (exploc.file)

>      result->set ("file", new json::string (exploc.file));

>    result->set ("line", new json::integer_number (exploc.line));

> -  result->set ("column", new json::integer_number (exploc.column));

> +

> +  const enum diagnostics_column_unit orig_unit = context->column_unit;

> +  struct

> +  {

> +    const char *name;

> +    enum diagnostics_column_unit unit;

> +  } column_fields[] = {

> +    {"display-column", DIAGNOSTICS_COLUMN_UNIT_DISPLAY},

> +    {"byte-column", DIAGNOSTICS_COLUMN_UNIT_BYTE}

> +  };

> +  int the_column = INT_MIN;

> +  for (int i = 0; i != sizeof column_fields / sizeof (*column_fields); ++i)

> +    {

> +      context->column_unit = column_fields[i].unit;

> +      const int col = diagnostic_converted_column (context, exploc);

> +      result->set (column_fields[i].name, new json::integer_number (col));

> +      if (column_fields[i].unit == orig_unit)

> +	the_column = col;

> +    }

> +  gcc_assert (the_column != INT_MIN);

> +  result->set ("column", new json::integer_number (the_column));

> +  context->column_unit = orig_unit;

>    return result;

>  }

>  

>  /* Generate a JSON object for LOC_RANGE.  */

>  

>  static json::object *

> -json_from_location_range (const location_range *loc_range, unsigned range_idx)

> +json_from_location_range (diagnostic_context *context,

> +			  const location_range *loc_range, unsigned range_idx)

>  {

>    location_t caret_loc = get_pure_location (loc_range->m_loc);

>  

> @@ -68,13 +91,13 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)

>    location_t finish_loc = get_finish (loc_range->m_loc);

>  

>    json::object *result = new json::object ();

> -  result->set ("caret", json_from_expanded_location (caret_loc));

> +  result->set ("caret", json_from_expanded_location (context, caret_loc));

>    if (start_loc != caret_loc

>        && start_loc != UNKNOWN_LOCATION)

> -    result->set ("start", json_from_expanded_location (start_loc));

> +    result->set ("start", json_from_expanded_location (context, start_loc));

>    if (finish_loc != caret_loc

>        && finish_loc != UNKNOWN_LOCATION)

> -    result->set ("finish", json_from_expanded_location (finish_loc));

> +    result->set ("finish", json_from_expanded_location (context, finish_loc));

>  

>    if (loc_range->m_label)

>      {

> @@ -91,14 +114,14 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)

>  /* Generate a JSON object for HINT.  */

>  

>  static json::object *

> -json_from_fixit_hint (const fixit_hint *hint)

> +json_from_fixit_hint (diagnostic_context *context, const fixit_hint *hint)

>  {

>    json::object *fixit_obj = new json::object ();

>  

>    location_t start_loc = hint->get_start_loc ();

> -  fixit_obj->set ("start", json_from_expanded_location (start_loc));

> +  fixit_obj->set ("start", json_from_expanded_location (context, start_loc));

>    location_t next_loc = hint->get_next_loc ();

> -  fixit_obj->set ("next", json_from_expanded_location (next_loc));

> +  fixit_obj->set ("next", json_from_expanded_location (context, next_loc));

>    fixit_obj->set ("string", new json::string (hint->get_string ()));

>  

>    return fixit_obj;

> @@ -190,11 +213,13 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,

>    else

>      {

>        /* Otherwise, make diag_obj be the top-level object within the group;

> -	 add a "children" array.  */

> +	 add a "children" array and record the column origin.  */

>        toplevel_array->append (diag_obj);

>        cur_group = diag_obj;

>        cur_children_array = new json::array ();

>        diag_obj->set ("children", cur_children_array);

> +      diag_obj->set ("column-origin",

> +		     new json::integer_number (context->column_origin));

>      }

>  

>    const rich_location *richloc = diagnostic->richloc;

> @@ -205,7 +230,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,

>    for (unsigned int i = 0; i < richloc->get_num_locations (); i++)

>      {

>        const location_range *loc_range = richloc->get_range (i);

> -      json::object *loc_obj = json_from_location_range (loc_range, i);

> +      json::object *loc_obj = json_from_location_range (context, loc_range, i);

>        if (loc_obj)

>  	loc_array->append (loc_obj);

>      }

> @@ -217,7 +242,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,

>        for (unsigned int i = 0; i < richloc->get_num_fixit_hints (); i++)

>  	{

>  	  const fixit_hint *hint = richloc->get_fixit_hint (i);

> -	  json::object *fixit_obj = json_from_fixit_hint (hint);

> +	  json::object *fixit_obj = json_from_fixit_hint (context, hint);

>  	  fixit_array->append (fixit_obj);

>  	}

>      }

> @@ -320,7 +345,8 @@ namespace selftest {

>  static void

>  test_unknown_location ()

>  {

> -  delete json_from_expanded_location (UNKNOWN_LOCATION);

> +  test_diagnostic_context dc;

> +  delete json_from_expanded_location (&dc, UNKNOWN_LOCATION);

>  }

>  

>  /* Verify that we gracefully handle attempts to serialize bad

> @@ -338,7 +364,8 @@ test_bad_endpoints ()

>    loc_range.m_range_display_kind = SHOW_RANGE_WITH_CARET;

>    loc_range.m_label = NULL;

>  

> -  json::object *obj = json_from_location_range (&loc_range, 0);

> +  test_diagnostic_context dc;

> +  json::object *obj = json_from_location_range (&dc, &loc_range, 0);

>    /* We should have a "caret" value, but no "start" or "finish" values.  */

>    ASSERT_TRUE (obj != NULL);

>    ASSERT_TRUE (obj->get ("caret") != NULL);

> diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c

> index 4618b4edb7d..8a34e30c4c7 100644

> --- a/gcc/diagnostic-show-locus.c

> +++ b/gcc/diagnostic-show-locus.c

> @@ -226,22 +226,18 @@ class layout_range

>  

>  /* A struct for use by layout::print_source_line for telling

>     layout::print_annotation_line the extents of the source line that

> -   it printed, so that underlines can be clipped appropriately.  */

> +   it printed, so that underlines can be clipped appropriately.  Units

> +   are 1-based display columns.  */

>  

>  struct line_bounds

>  {

> -  int m_first_non_ws;

> -  int m_last_non_ws;

> +  int m_first_non_ws_disp_col;

> +  int m_last_non_ws_disp_col;

>  

> -  void convert_to_display_cols (char_span line)

> +  line_bounds ()

>    {

> -    m_first_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),

> -							line.length (),

> -							m_first_non_ws);

> -

> -    m_last_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),

> -						       line.length (),

> -						       m_last_non_ws);

> +    m_first_non_ws_disp_col = INT_MAX;

> +    m_last_non_ws_disp_col = 0;

>    }

>  };

>  

> @@ -351,8 +347,8 @@ class layout

>   private:

>    bool will_show_line_p (linenum_type row) const;

>    void print_leading_fixits (linenum_type row);

> -  void print_source_line (linenum_type row, const char *line, int line_bytes,

> -			  line_bounds *lbounds_out);

> +  line_bounds print_source_line (linenum_type row, const char *line,

> +				 int line_bytes);

>    bool should_print_annotation_line_p (linenum_type row) const;

>    void start_annotation_line (char margin_char = ' ') const;

>    void print_annotation_line (linenum_type row, const line_bounds lbounds);

> @@ -1445,16 +1441,13 @@ layout::calculate_x_offset_display ()

>  }

>  

>  /* Print line ROW of source code, potentially colorized at any ranges, and

> -   populate *LBOUNDS_OUT.

> -   LINE is the source line (not necessarily 0-terminated) and LINE_BYTES

> -   is its length in bytes.

> -   This function deals only with byte offsets, not display columns, so

> -   m_x_offset_display must be converted from display to byte units.  In

> -   particular, LINE_BYTES and LBOUNDS_OUT are in bytes.  */

> +   return the line bounds.  LINE is the source line (not necessarily

> +   0-terminated) and LINE_BYTES is its length in bytes.  In order to handle both

> +   colorization and tab expansion, this function tracks the line position in

> +   both byte and display column units.  */

>  

> -void

> -layout::print_source_line (linenum_type row, const char *line, int line_bytes,

> -			   line_bounds *lbounds_out)

> +line_bounds

> +layout::print_source_line (linenum_type row, const char *line, int line_bytes)

>  {

>    m_colorizer.set_normal_text ();

>  

> @@ -1469,30 +1462,29 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,

>    else

>      pp_space (m_pp);

>  

> -  /* We will stop printing the source line at any trailing whitespace, and start

> -     printing it as per m_x_offset_display.  */

> +  /* We will stop printing the source line at any trailing whitespace.  */

>    line_bytes = get_line_bytes_without_trailing_whitespace (line,

>  							   line_bytes);

> -  int x_offset_bytes = 0;

> -  if (m_x_offset_display)

> -    {

> -      x_offset_bytes = cpp_display_column_to_byte_column (line, line_bytes,

> -							  m_x_offset_display);

> -      /* In case the leading portion of the line that will be skipped over ends

> -	 with a character with wcwidth > 1, then it is possible we skipped too

> -	 much, so account for that by padding with spaces.  */

> -      const int overage

> -	= cpp_byte_column_to_display_column (line, line_bytes, x_offset_bytes)

> -	- m_x_offset_display;

> -      for (int column = 0; column < overage; ++column)

> -	pp_space (m_pp);

> -      line += x_offset_bytes;

> -    }

>  

> -  /* Print the line.  */

> -  int first_non_ws = INT_MAX;

> -  int last_non_ws = 0;

> -  for (int col_byte = 1 + x_offset_bytes; col_byte <= line_bytes; col_byte++)

> +  /* This object helps to keep track of which display column we are at, which is

> +     necessary for computing the line bounds in display units, for doing

> +     tab expansion, and for implementing m_x_offset_display.  */

> +  cpp_display_width_computation dw (line, line_bytes);

> +

> +  /* Skip the first m_x_offset_display display columns.  In case the leading

> +     portion that will be skipped ends with a character with wcwidth > 1, then

> +     it is possible we skipped too much, so account for that by padding with

> +     spaces.  Note that this does the right thing too in case a tab was the last

> +     character to be skipped over; the tab is effectively replaced by the

> +     correct number of trailing spaces needed to offset by the desired number of

> +     display columns.  */

> +  for (int skipped_display_cols = dw.advance_display_cols (m_x_offset_display);

> +       skipped_display_cols > m_x_offset_display; --skipped_display_cols)

> +    pp_space (m_pp);

> +

> +  /* Print the line and compute the line_bounds.  */

> +  line_bounds lbounds;

> +  while (!dw.done ())

>      {

>        /* Assuming colorization is enabled for the caret and underline

>  	 characters, we may also colorize the associated characters

> @@ -1510,7 +1502,8 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,

>  	{

>  	  bool in_range_p;

>  	  point_state state;

> -	  in_range_p = get_state_at_point (row, col_byte,

> +	  const int start_byte_col = dw.bytes_processed () + 1;

> +	  in_range_p = get_state_at_point (row, start_byte_col,

>  					   0, INT_MAX,

>  					   CU_BYTES,

>  					   &state);

> @@ -1519,22 +1512,44 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,

>  	  else

>  	    m_colorizer.set_normal_text ();

>  	}

> -      char c = *line;

> -      if (c == '\0' || c == '\t' || c == '\r')

> -	c = ' ';

> -      if (c != ' ')

> +

> +      /* Get the display width of the next character to be output, expanding

> +	 tabs and replacing some control bytes with spaces as necessary.  */

> +      const char *c = dw.next_byte ();

> +      const int start_disp_col = dw.display_cols_processed () + 1;

> +      const int this_display_width = dw.process_next_codepoint ();

> +      if (*c == '\t')

> +	{

> +	  /* The returned display width is the number of spaces into which the

> +	     tab should be expanded.  */

> +	  for (int i = 0; i != this_display_width; ++i)

> +	    pp_space (m_pp);

> +	  continue;

> +	}

> +      if (*c == '\0' || *c == '\r')

> +	{

> +	  /* cpp_wcwidth() promises to return 1 for all control bytes, and we

> +	     want to output these as a single space too, so this case is

> +	     actually the same as the '\t' case.  */

> +	  gcc_assert (this_display_width == 1);

> +	  pp_space (m_pp);

> +	  continue;

> +	}

> +

> +      /* We have a (possibly multibyte) character to output; update the line

> +	 bounds if it is not whitespace.  */

> +      if (*c != ' ')

>  	{

> -	  last_non_ws = col_byte;

> -	  if (first_non_ws == INT_MAX)

> -	    first_non_ws = col_byte;

> +	  lbounds.m_last_non_ws_disp_col = dw.display_cols_processed ();

> +	  if (lbounds.m_first_non_ws_disp_col == INT_MAX)

> +	    lbounds.m_first_non_ws_disp_col = start_disp_col;

>  	}

> -      pp_character (m_pp, c);

> -      line++;

> +

> +      /* Output the character.  */

> +      while (c != dw.next_byte ()) pp_character (m_pp, *c++);

>      }

>    print_newline ();

> -

> -  lbounds_out->m_first_non_ws = first_non_ws;

> -  lbounds_out->m_last_non_ws = last_non_ws;

> +  return lbounds;

>  }

>  

>  /* Determine if we should print an annotation line for ROW.

> @@ -1576,14 +1591,13 @@ layout::start_annotation_line (char margin_char) const

>  }

>  

>  /* Print a line consisting of the caret/underlines for the given

> -   source line.  This function works with display columns, rather than byte

> -   counts; in particular, LBOUNDS should be in display column units.  */

> +   source line.  */

>  

>  void

>  layout::print_annotation_line (linenum_type row, const line_bounds lbounds)

>  {

>    int x_bound = get_x_bound_for_row (row, m_exploc.m_display_col,

> -				     lbounds.m_last_non_ws);

> +				     lbounds.m_last_non_ws_disp_col);

>  

>    start_annotation_line ();

>    pp_space (m_pp);

> @@ -1593,8 +1607,8 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)

>        bool in_range_p;

>        point_state state;

>        in_range_p = get_state_at_point (row, column,

> -				       lbounds.m_first_non_ws,

> -				       lbounds.m_last_non_ws,

> +				       lbounds.m_first_non_ws_disp_col,

> +				       lbounds.m_last_non_ws_disp_col,

>  				       CU_DISPLAY_COLS,

>  				       &state);

>        if (in_range_p)

> @@ -2499,15 +2513,11 @@ layout::print_line (linenum_type row)

>    if (!line)

>      return;

>  

> -  line_bounds lbounds;

>    print_leading_fixits (row);

> -  print_source_line (row, line.get_buffer (), line.length (), &lbounds);

> +  const line_bounds lbounds

> +    = print_source_line (row, line.get_buffer (), line.length ());

>    if (should_print_annotation_line_p (row))

> -    {

> -      if (lbounds.m_first_non_ws != INT_MAX)

> -	lbounds.convert_to_display_cols (line);

> -      print_annotation_line (row, lbounds);

> -    }

> +    print_annotation_line (row, lbounds);

>    if (m_show_labels_p)

>      print_any_labels (row);

>    print_trailing_fixits (row);

> @@ -2774,6 +2784,114 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)

>  

>  }

>  

> +static void

> +test_layout_x_offset_display_tab (const line_table_case &case_)

> +{

> +  const char *content

> +    = "This line is very long, so that we can use it to test the logic for "

> +      "clipping long lines.  Also this: `\t' is a tab that occupies 1 byte and "

> +      "a variable number of display columns, starting at column #103.\n";

> +

> +  /* Number of bytes in the line, subtracting one to remove the newline.  */

> +  const int line_bytes = strlen (content) - 1;

> +

> +  /* The column where the tab begins.  Byte or display is the same as there

> +     are no multibyte characters earlier on the line.  */

> +  const int tab_col = 103;

> +

> +  /* Effective extra size of the tab beyond what a single space would have taken

> +     up, indexed by tabstop.  */

> +  static const int num_tabstops = 11;

> +  int extra_width[num_tabstops];

> +  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)

> +    {

> +      const int this_tab_size = tabstop - (tab_col - 1) % tabstop;

> +      extra_width[tabstop] = this_tab_size - 1;

> +    }

> +  /* Example of this calculation: if tabstop is 10, the tab starting at column

> +     #103 has to expand into 8 spaces, covering columns 103-110, so that the

> +     next character is at column #111.  So it takes up 7 more columns than

> +     a space would have taken up.  */

> +  ASSERT_EQ (7, extra_width[10]);

> +

> +  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);

> +  line_table_test ltt (case_);

> +

> +  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);

> +

> +  location_t line_end = linemap_position_for_column (line_table, line_bytes);

> +

> +  /* Don't attempt to run the tests if column data might be unavailable.  */

> +  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)

> +    return;

> +

> +  /* Check that cpp_display_width handles the tabs as expected.  */

> +  char_span lspan = location_get_source_line (tmp.get_filename (), 1);

> +  ASSERT_EQ ('\t', *(lspan.get_buffer () + (tab_col - 1)));

> +  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)

> +    {

> +      ASSERT_EQ (line_bytes + extra_width[tabstop],

> +		 cpp_display_width (lspan.get_buffer (), lspan.length (),

> +				    tabstop));

> +      ASSERT_EQ (line_bytes + extra_width[tabstop],

> +		 location_compute_display_column (expand_location (line_end),

> +						  tabstop));

> +    }

> +

> +  /* Check that the tab is expanded to the expected number of spaces.  */

> +  const int global_tabstop = cpp_get_tabstop ();

> +  rich_location richloc (line_table,

> +			 linemap_position_for_column (line_table,

> +						      tab_col + 1));

> +  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)

> +    {

> +      cpp_set_tabstop (tabstop);

> +      test_diagnostic_context dc;

> +      layout test_layout (&dc, &richloc, DK_ERROR);

> +      test_layout.print_line (1);

> +      const char *out = pp_formatted_text (dc.printer);

> +      ASSERT_EQ (NULL, strchr (out, '\t'));

> +      const char *left_quote = strchr (out, '`');

> +      const char *right_quote = strchr (out, '\'');

> +      ASSERT_NE (NULL, left_quote);

> +      ASSERT_NE (NULL, right_quote);

> +      ASSERT_EQ (right_quote - left_quote, extra_width[tabstop] + 2);

> +    }

> +

> +  /* Check that the line is offset properly and that the tab is broken up

> +     into the expected number of spaces when it is the last character skipped

> +     over.  */

> +  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)

> +    {

> +      cpp_set_tabstop (tabstop);

> +      test_diagnostic_context dc;

> +      static const int small_width = 24;

> +      dc.caret_max_width = small_width - 4;

> +      dc.min_margin_width = test_left_margin - test_linenum_sep + 1;

> +      dc.show_line_numbers_p = true;

> +      layout test_layout (&dc, &richloc, DK_ERROR);

> +      test_layout.print_line (1);

> +

> +      /* We have arranged things so that two columns will be printed before

> +	 the caret.  If the tab results in more than one space, this should

> +	 produce two spaces in the output; otherwise, it will be a single space

> +	 preceded by the opening quote before the tab character.  */

> +      const char *output1

> +	= "   1 |   ' is a tab that occupies 1 byte and a variable number of "

> +	  "display columns, starting at column #103.\n"

> +	  "     |   ^\n\n";

> +      const char *output2

> +	= "   1 | ` ' is a tab that occupies 1 byte and a variable number of "

> +	  "display columns, starting at column #103.\n"

> +	  "     |   ^\n\n";

> +      const char *expected_output = (extra_width[tabstop] ? output1 : output2);

> +      ASSERT_STREQ (expected_output, pp_formatted_text (dc.printer));

> +    }

> +

> +  cpp_set_tabstop (global_tabstop);

> +}
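
The extra_width[] arithmetic in this test can be checked in isolation. Below is a minimal sketch of the same calculation; the helper names tab_display_size and tab_extra_width are illustrative only and do not appear in the patch.

```cpp
// Illustrative helper (not part of the patch): the number of display columns
// a tab occupies when it starts at 1-based display column COL.
int tab_display_size(int col, int tabstop)
{
  return tabstop - (col - 1) % tabstop;
}

// Extra columns the tab takes up beyond what a single space would have used.
int tab_extra_width(int col, int tabstop)
{
  return tab_display_size(col, tabstop) - 1;
}
```

With the tab at column 103 and a tabstop of 10, tab_display_size gives 8 and tab_extra_width gives 7, matching the ASSERT_EQ above. A tabstop of 1 gives an extra width of 0, which is the case where the second loop expects output2.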

> +

> +

>  /* Verify that diagnostic_show_locus works sanely on UNKNOWN_LOCATION.  */

>  

>  static void

> @@ -3854,6 +3972,27 @@ test_one_liner_labels_utf8 ()

>    }

>  }

>  

> +/* Make sure that colorization codes don't interrupt a multibyte

> +   sequence, which would corrupt it.  */

> +static void

> +test_one_liner_colorized_utf8 ()

> +{

> +  test_diagnostic_context dc;

> +  dc.colorize_source_p = true;

> +  diagnostic_color_init (&dc, DIAGNOSTICS_COLOR_YES);

> +  const location_t pi = linemap_position_for_column (line_table, 12);

> +  rich_location richloc (line_table, pi);

> +  diagnostic_show_locus (&dc, &richloc, DK_ERROR);

> +

> +  /* In order to avoid having the test depend on exactly how the colorization

> +     was effected, just confirm there are two pi characters in the output.  */

> +  const char *result = pp_formatted_text (dc.printer);

> +  const char *null_term = result + strlen (result);

> +  const char *first_pi = strstr (result, "\xcf\x80");

> +  ASSERT_TRUE (first_pi && first_pi <= null_term - 2);

> +  ASSERT_STR_CONTAINS (first_pi + 2, "\xcf\x80");

> +}

> +

>  /* Run the various one-liner tests.  */

>  

>  static void

> @@ -3900,6 +4039,7 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)

>    test_one_liner_many_fixits_1_utf8 ();

>    test_one_liner_many_fixits_2_utf8 ();

>    test_one_liner_labels_utf8 ();

> +  test_one_liner_colorized_utf8 ();

>  }

>  

>  /* Verify that gcc_rich_location::add_location_if_nearby works.  */

> @@ -4955,6 +5095,68 @@ test_fixit_deletion_affecting_newline (const line_table_case &case_)

>  		pp_formatted_text (dc.printer));

>  }

>  

> +static void

> +test_tab_expansion (const line_table_case &case_)

> +{

> +  /* Set up the tabstop to be sure it is 8.  */

> +  const int global_tabstop = cpp_get_tabstop ();

> +  cpp_set_tabstop (8);

> +

> +  /* Create a tempfile and write some text to it.  This example uses a tabstop

> +     of 8, as the column numbers attempt to indicate:

> +

> +    .....................000.01111111111.22222333333  display

> +    .....................123.90123456789.56789012345  columns  */

> +  const char *content = "  \t   This: `\t' is a tab.\n";

> +  /* ....................000 00000011111 11111222222  byte

> +     ....................123 45678901234 56789012345  columns  */

> +

> +  const int first_non_ws_byte_col = 7;

> +  const int right_quote_byte_col = 15;

> +  const int last_byte_col = 25;

> +  ASSERT_EQ (35, cpp_display_width (content, last_byte_col));

> +

> +  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);

> +  line_table_test ltt (case_);

> +  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);

> +

> +  /* Don't attempt to run the tests if column data might be unavailable.  */

> +  location_t line_end = linemap_position_for_column (line_table, last_byte_col);

> +  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)

> +    return;

> +

> +  /* Check that the leading whitespace with mixed tabs and spaces is expanded

> +     into 11 spaces.  Recall that print_line() also puts one space before

> +     everything.  */

> +  {

> +    test_diagnostic_context dc;

> +    rich_location richloc (line_table,

> +			   linemap_position_for_column (line_table,

> +							first_non_ws_byte_col));

> +    layout test_layout (&dc, &richloc, DK_ERROR);

> +    test_layout.print_line (1);

> +    ASSERT_STREQ ("            This: `      ' is a tab.\n"

> +		  "            ^\n",

> +		  pp_formatted_text (dc.printer));

> +  }

> +

> +  /* Confirm the display width was tracked correctly across the internal tab

> +     as well.  */

> +  {

> +    test_diagnostic_context dc;

> +    rich_location richloc (line_table,

> +			   linemap_position_for_column (line_table,

> +							right_quote_byte_col));

> +    layout test_layout (&dc, &richloc, DK_ERROR);

> +    test_layout.print_line (1);

> +    ASSERT_STREQ ("            This: `      ' is a tab.\n"

> +		  "                         ^\n",

> +		  pp_formatted_text (dc.printer));

> +  }

> +

> +  cpp_set_tabstop (global_tabstop);

> +}
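
The column rulers in this test can be double-checked with a small stand-alone model of tab expansion. The sketch below is a hypothetical stand-in for cpp_display_width() that handles only tabs and single-column ASCII bytes. It makes no attempt at multibyte or wide characters, and the function names are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical stand-in for cpp_display_width(): handles only tabs and
// single-column ASCII bytes, not multibyte or wide characters.
int ascii_display_width(const std::string &line, int tabstop)
{
  int width = 0;
  for (char ch : line)
    {
      if (ch == '\t')
        width += tabstop - width % tabstop;  // advance to the next tab stop
      else
        width += 1;
    }
  return width;
}

// 1-based display column at which the byte at 1-based BYTE_COL ends, i.e.
// the display width of the first BYTE_COL bytes of LINE.
int ascii_display_column(const std::string &line, int byte_col, int tabstop)
{
  return ascii_display_width(line.substr(0, byte_col), tabstop);
}
```

Applied to the test line with a tabstop of 8, this model gives a total width of 35, puts the first non-whitespace byte (byte column 7) at display column 12, and puts the right quote (byte column 15) at display column 25. Those agree with the rulers and with the caret positions in the expected output, once the one leading space from print_line() is counted.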

> +

>  /* Verify that line numbers are correctly printed for the case of

>     a multiline range in which the width of the line numbers changes

>     (e.g. from "9" to "10").  */

> @@ -5012,6 +5214,7 @@ diagnostic_show_locus_c_tests ()

>    test_layout_range_for_multiple_lines ();

>  

>    for_each_line_table_case (test_layout_x_offset_display_utf8);

> +  for_each_line_table_case (test_layout_x_offset_display_tab);

>  

>    test_get_line_bytes_without_trailing_whitespace ();

>  

> @@ -5029,6 +5232,7 @@ diagnostic_show_locus_c_tests ()

>    for_each_line_table_case (test_fixit_insert_containing_newline_2);

>    for_each_line_table_case (test_fixit_replace_containing_newline);

>    for_each_line_table_case (test_fixit_deletion_affecting_newline);

> +  for_each_line_table_case (test_tab_expansion);

>  

>    test_line_numbers_multiline_range ();

>  }

> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c

> index ed52bc03d17..120c3258540 100644

> --- a/gcc/diagnostic.c

> +++ b/gcc/diagnostic.c

> @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see

>  #include "selftest.h"

>  #include "selftest-diagnostic.h"

>  #include "opts.h"

> +#include "cpplib.h"

>  

>  #ifdef HAVE_TERMIOS_H

>  # include <termios.h>

> @@ -219,6 +220,8 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)

>    context->min_margin_width = 0;

>    context->show_ruler_p = false;

>    context->parseable_fixits_p = false;

> +  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;

> +  context->column_origin = 1;

>    context->edit_context_ptr = NULL;

>    context->diagnostic_group_nesting_depth = 0;

>    context->diagnostic_group_emission_count = 0;

> @@ -353,8 +356,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)

>    return diagnostic_kind_color[kind];

>  }

>  

> +/* Given an expanded_location, convert the column (which is in 1-based bytes)

> +   to the requested units and origin.  Return -1 if the column is

> +   invalid (<= 0).  */

> +int

> +diagnostic_converted_column (diagnostic_context *context, expanded_location s)

> +{

> +  if (s.column <= 0)

> +    return -1;

> +

> +  int one_based_col;

> +  switch (context->column_unit)

> +    {

> +    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:

> +      one_based_col = location_compute_display_column (s);

> +      break;

> +

> +    case DIAGNOSTICS_COLUMN_UNIT_BYTE:

> +      one_based_col = s.column;

> +      break;

> +

> +    default:

> +      gcc_unreachable ();

> +    }

> +

> +  return one_based_col + (context->column_origin - 1);

> +}
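
For readers following the units/origin logic, here is a minimal stand-alone sketch of how the conversion interacts with the -1 sentinel when the location text is formatted. It is a simplification under stated assumptions: the caller passes both 1-based columns directly rather than having the display column computed from the source file, and the names column_unit, converted_column and line_and_column are hypothetical.

```cpp
#include <string>

enum column_unit { UNIT_DISPLAY, UNIT_BYTE };

// Hypothetical simplification of diagnostic_converted_column(): the caller
// supplies both 1-based columns instead of having them computed from a file.
int converted_column(column_unit unit, int origin, int byte_col,
                     int display_col)
{
  if (byte_col <= 0)
    return -1;  // invalid column
  const int one_based = (unit == UNIT_DISPLAY) ? display_col : byte_col;
  return one_based + (origin - 1);
}

// Mirrors the eliding rule: nothing at all if LINE is 0, and no column if
// COL is negative.  A column of 0 is printed, since it is valid with origin 0.
std::string line_and_column(int line, int col)
{
  if (line == 0)
    return "";
  std::string result = ":" + std::to_string(line);
  if (col >= 0)
    result += ":" + std::to_string(col);
  return result;
}
```

For byte column 10, an origin of 0 yields 9 and an origin of 1001 yields 1010, the same values the updated selftests expect. An invalid column yields -1 whatever the origin, so the column is elided rather than printed as 0.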

> +

>  /* Return a formatted line and column ':%line:%column'.  Elided if

> -   zero.  The result is a statically allocated buffer.  */

> +   line == 0 or col < 0.  (A column of 0 may be valid due to the

> +   -fdiagnostics-column-origin option.)

> +   The result is a statically allocated buffer.  */

>  

>  static const char *

>  maybe_line_and_column (int line, int col)

> @@ -363,8 +395,9 @@ maybe_line_and_column (int line, int col)

>  

>    if (line)

>      {

> -      size_t l = snprintf (result, sizeof (result),

> -			   col ? ":%d:%d" : ":%d", line, col);

> +      size_t l

> +	= snprintf (result, sizeof (result),

> +		    col >= 0 ? ":%d:%d" : ":%d", line, col);

>        gcc_checking_assert (l < sizeof (result));

>      }

>    else

> @@ -383,8 +416,14 @@ diagnostic_get_location_text (diagnostic_context *context,

>    const char *locus_cs = colorize_start (pp_show_color (pp), "locus");

>    const char *locus_ce = colorize_stop (pp_show_color (pp));

>    const char *file = s.file ? s.file : progname;

> -  int line = strcmp (file, N_("<built-in>")) ? s.line : 0;

> -  int col = context->show_column ? s.column : 0;

> +  int line = 0;

> +  int col = -1;

> +  if (strcmp (file, N_("<built-in>")))

> +    {

> +      line = s.line;

> +      if (context->show_column)

> +	col = diagnostic_converted_column (context, s);

> +    }

>  

>    const char *line_col = maybe_line_and_column (line, col);

>    return build_message_string ("%s%s%s:%s", locus_cs, file,

> @@ -650,14 +689,20 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)

>        if (! MAIN_FILE_P (map))

>  	{

>  	  bool first = true;

> +	  expanded_location s = {};

>  	  do

>  	    {

>  	      where = linemap_included_from (map);

>  	      map = linemap_included_from_linemap (line_table, map);

> -	      const char *line_col

> -		= maybe_line_and_column (SOURCE_LINE (map, where),

> -					 first && context->show_column

> -					 ? SOURCE_COLUMN (map, where) : 0);

> +	      s.file = LINEMAP_FILE (map);

> +	      s.line = SOURCE_LINE (map, where);

> +	      int col = -1;

> +	      if (first && context->show_column)

> +		{

> +		  s.column = SOURCE_COLUMN (map, where);

> +		  col = diagnostic_converted_column (context, s);

> +		}

> +	      const char *line_col = maybe_line_and_column (s.line, col);

>  	      static const char *const msgs[] =

>  		{

>  		 N_("In file included from"),

> @@ -666,7 +711,7 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)

>  	      unsigned index = !first;

>  	      pp_verbatim (context->printer, "%s%s %r%s%s%R",

>  			   first ? "" : ",\n", _(msgs[index]),

> -			   "locus", LINEMAP_FILE (map), line_col);

> +			   "locus", s.file, line_col);

>  	      first = false;

>  	    }

>  	  while (! MAIN_FILE_P (map));

> @@ -2042,10 +2087,15 @@ test_print_parseable_fixits_replace ()

>  static void

>  assert_location_text (const char *expected_loc_text,

>  		      const char *filename, int line, int column,

> -		      bool show_column)

> +		      bool show_column,

> +		      int origin = 1,

> +		      enum diagnostics_column_unit column_unit

> +			= DIAGNOSTICS_COLUMN_UNIT_BYTE)

>  {

>    test_diagnostic_context dc;

>    dc.show_column = show_column;

> +  dc.column_unit = column_unit;

> +  dc.column_origin = origin;

>  

>    expanded_location xloc;

>    xloc.file = filename;

> @@ -2069,7 +2119,10 @@ test_diagnostic_get_location_text ()

>    assert_location_text ("PROGNAME:", NULL, 0, 0, true);

>    assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);

>    assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);

> -  assert_location_text ("foo.c:42:", "foo.c", 42, 0, true);

> +  assert_location_text ("foo.c:42:9:", "foo.c", 42, 10, true, 0);

> +  assert_location_text ("foo.c:42:1010:", "foo.c", 42, 10, true, 1001);

> +  for (int origin = 0; origin != 2; ++origin)

> +    assert_location_text ("foo.c:42:", "foo.c", 42, 0, true, origin);

>    assert_location_text ("foo.c:", "foo.c", 0, 10, true);

>    assert_location_text ("foo.c:42:", "foo.c", 42, 10, false);

>    assert_location_text ("foo.c:", "foo.c", 0, 10, false);

> @@ -2077,6 +2130,39 @@ test_diagnostic_get_location_text ()

>    maybe_line_and_column (INT_MAX, INT_MAX);

>    maybe_line_and_column (INT_MIN, INT_MIN);

>  

> +  {

> +    /* In order to test display columns vs byte columns, we need to create a

> +       file for location_get_source_line() to read.  */

> +

> +    const char *const content = "smile \xf0\x9f\x98\x82\n";

> +    const int line_bytes = strlen (content) - 1;

> +    const int display_width = cpp_display_width (content, line_bytes);

> +    ASSERT_EQ (line_bytes - 2, display_width);

> +    temp_source_file tmp (SELFTEST_LOCATION, ".c", content);

> +    const char *const fname = tmp.get_filename ();

> +    const int buf_len = strlen (fname) + 16;

> +    char *const expected = XNEWVEC (char, buf_len);

> +

> +    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes);

> +    assert_location_text (expected, fname, 1, line_bytes, true,

> +			  1, DIAGNOSTICS_COLUMN_UNIT_BYTE);

> +

> +    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes - 1);

> +    assert_location_text (expected, fname, 1, line_bytes, true,

> +			  0, DIAGNOSTICS_COLUMN_UNIT_BYTE);

> +

> +    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width);

> +    assert_location_text (expected, fname, 1, line_bytes, true,

> +			  1, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);

> +

> +    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width - 1);

> +    assert_location_text (expected, fname, 1, line_bytes, true,

> +			  0, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);

> +

> +    XDELETEVEC (expected);

> +  }

> +

> +

>    progname = old_progname;

>  }

>  

> diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h

> index 307dbcfb34a..ab152a129c9 100644

> --- a/gcc/diagnostic.h

> +++ b/gcc/diagnostic.h

> @@ -24,6 +24,20 @@ along with GCC; see the file COPYING3.  If not see

>  #include "pretty-print.h"

>  #include "diagnostic-core.h"

>  

> +/* An enum for controlling what units to use for the column number

> +   when diagnostics are output, used by the -fdiagnostics-column-unit option.

> +   Tabs will be expanded or not according to the value of -ftabstop.  The origin

> +   (default 1) is controlled by -fdiagnostics-column-origin.  */

> +

> +enum diagnostics_column_unit

> +{

> +  /* The new default: display columns.  */

> +  DIAGNOSTICS_COLUMN_UNIT_DISPLAY,

> +

> +  /* The historical behavior: simple bytes.  */

> +  DIAGNOSTICS_COLUMN_UNIT_BYTE

> +};

> +

>  /* Enum for overriding the standard output format.  */

>  

>  enum diagnostics_output_format

> @@ -280,6 +294,12 @@ struct diagnostic_context

>       rest of the diagnostic.  */

>    bool parseable_fixits_p;

>  

> +  /* What units to use when outputting the column number.  */

> +  enum diagnostics_column_unit column_unit;

> +

> +  /* The origin for the column number (1-based or 0-based typically).  */

> +  int column_origin;

> +

>    /* If non-NULL, an edit_context to which fix-it hints should be

>       applied, for generating patches.  */

>    edit_context *edit_context_ptr;

> @@ -458,6 +478,8 @@ diagnostic_same_line (const diagnostic_context *context,

>  }

>  

>  extern const char *diagnostic_get_color_for_kind (diagnostic_t kind);

> +extern int diagnostic_converted_column (diagnostic_context *context,

> +					expanded_location s);

>  

>  /* Pure text formatting support functions.  */

>  extern char *file_name_as_prefix (diagnostic_context *, const char *);

> @@ -470,6 +492,7 @@ extern void diagnostic_output_format_init (diagnostic_context *,

>  /* Compute the number of digits in the decimal representation of an integer.  */

>  extern int num_digits (int);

>  

> -extern json::value *json_from_expanded_location (location_t loc);

> +extern json::value *json_from_expanded_location (diagnostic_context *context,

> +						 location_t loc);

>  

>  #endif /* ! GCC_DIAGNOSTIC_H */

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi

> index 35e8242af5f..aa76f6acbae 100644

> --- a/gcc/doc/invoke.texi

> +++ b/gcc/doc/invoke.texi

> @@ -290,7 +290,9 @@ Objective-C and Objective-C++ Dialects}.

>  -fdiagnostics-show-template-tree  -fno-elide-type @gol

>  -fdiagnostics-path-format=@r{[}none@r{|}separate-events@r{|}inline-events@r{]} @gol

>  -fdiagnostics-show-path-depths @gol

> --fno-show-column}

> +-fno-show-column @gol

> +-fdiagnostics-column-unit=@r{[}display@r{|}byte@r{]} @gol

> +-fdiagnostics-column-origin=@var{origin}}

>  

>  @item Warning Options

>  @xref{Warning Options,,Options to Request or Suppress Warnings}.

> @@ -4418,6 +4420,29 @@ Do not print column numbers in diagnostics.  This may be necessary if

>  diagnostics are being scanned by a program that does not understand the

>  column numbers, such as @command{dejagnu}.

>  

> +@item -fdiagnostics-column-unit=@var{UNIT}

> +@opindex fdiagnostics-column-unit

> +Select the units for the column number.  This affects traditional diagnostics

> +(in the absence of @option{-fno-show-column}), as well as JSON format

> +diagnostics if requested.

> +

> +The default @var{UNIT}, @samp{display}, considers the number of display columns

> +occupied by each character.  This may be larger than the number of bytes

> +occupied, in the case of tab characters, or it may be smaller, in the case of

> +multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' occupies

> +two bytes and one display column, while the character ``@U{1F642}'' occupies

> +four bytes and two display columns.

> +

> +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte

> +count in all cases, as was traditionally output by GCC prior to version 11.1.0.

> +

> +@item -fdiagnostics-column-origin=@var{ORIGIN}

> +@opindex fdiagnostics-column-origin

> +Select the origin for column numbers, i.e. the column number assigned to the

> +first column.  The default value of 1 corresponds to traditional GCC

> +behavior and to the GNU style guide.  Some utilities may perform better with an

> +origin of 0; any non-negative value may be specified.

> +

>  @item -fdiagnostics-format=@var{FORMAT}

>  @opindex fdiagnostics-format

>  Select a different format for printing diagnostics.

> @@ -4453,11 +4478,15 @@ might be printed in JSON form (after formatting) like this:

>          "locations": [

>              @{

>                  "caret": @{

> +		    "display-column": 3,

> +		    "byte-column": 3,

>                      "column": 3,

>                      "file": "misleading-indentation.c",

>                      "line": 15

>                  @},

>                  "finish": @{

> +		    "display-column": 4,

> +		    "byte-column": 4,

>                      "column": 4,

>                      "file": "misleading-indentation.c",

>                      "line": 15

> @@ -4473,6 +4502,8 @@ might be printed in JSON form (after formatting) like this:

>                  "locations": [

>                      @{

>                          "caret": @{

> +			    "display-column": 5,

> +			    "byte-column": 5,

>                              "column": 5,

>                              "file": "misleading-indentation.c",

>                              "line": 17

> @@ -4482,6 +4513,7 @@ might be printed in JSON form (after formatting) like this:

>                  "message": "...this statement, but the latter is @dots{}"

>              @}

> -        ]

> +        ],

> +	"column-origin": 1

>      @},

>      @dots{}

>  ]

> @@ -4494,10 +4526,22 @@ A diagnostic has a @code{kind}.  If this is @code{warning}, then there is

>  an @code{option} key describing the command-line option controlling the

>  warning.

>  

> -A diagnostic can contain zero or more locations.  Each location has up

> -to three positions within it: a @code{caret} position and optional

> -@code{start} and @code{finish} positions.  A location can also have

> -an optional @code{label} string.  For example, this error:

> +A diagnostic can contain zero or more locations.  Each location has an

> +optional @code{label} string and up to three positions within it: a

> +@code{caret} position and optional @code{start} and @code{finish} positions.

> +A position is described by a @code{file} name, a @code{line} number, and

> +three numbers indicating a column position: @code{display-column} counts

> +display columns, accounting for tabs and multibyte characters;

> +@code{byte-column} counts raw bytes; and @code{column} is equal to one of

> +the previous two, as dictated by the @option{-fdiagnostics-column-unit}

> +option.  All three columns are relative to the origin specified by

> +@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may

> +be set, for instance, to 0 for compatibility with other utilities that

> +number columns from 0.  The column origin is recorded in the JSON output in

> +the @code{column-origin} tag.  In the remaining examples below, the extra

> +column number outputs have been omitted for brevity.

> +

> +For example, this error:

>  

>  @smallexample

>  bad-binary-ops.c:64:23: error: invalid operands to binary + (have 'S' @{aka

> diff --git a/gcc/input.c b/gcc/input.c

> index dd1d23df2f7..ab2fb7092d1 100644

> --- a/gcc/input.c

> +++ b/gcc/input.c

> @@ -913,7 +913,7 @@ make_location (location_t caret, source_range src_range)

>     source line in order to calculate the display width.  If that cannot be done

>     for any reason, then returns the byte column as a fallback.  */

>  int

> -location_compute_display_column (expanded_location exploc)

> +location_compute_display_column (expanded_location exploc, int tabstop)

>  {

>    if (!(exploc.file && *exploc.file && exploc.line && exploc.column))

>      return exploc.column;

> @@ -921,7 +921,7 @@ location_compute_display_column (expanded_location exploc)

>    /* If line is NULL, this function returns exploc.column which is the

>       desired fallback.  */

>    return cpp_byte_column_to_display_column (line.get_buffer (), line.length (),

> -					    exploc.column);

> +					    exploc.column, tabstop);

>  }

>  

>  /* Dump statistics to stderr about the memory usage of the line_table

> @@ -3612,8 +3612,8 @@ void test_cpp_utf8 ()

>    {

>      int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8);

>      ASSERT_EQ (8, w_bad);

> -    int w_ctrl = cpp_display_width ("\r\t\n\v\0\1", 6);

> -    ASSERT_EQ (6, w_ctrl);

> +    int w_ctrl = cpp_display_width ("\r\n\v\0\1", 5);

> +    ASSERT_EQ (5, w_ctrl);

>    }

>  

>    /* Verify that wcwidth of valid UTF-8 is as expected.  */

> @@ -3635,6 +3635,15 @@ void test_cpp_utf8 ()

>      ASSERT_EQ (18, w_mixed);

>    }

>  

> +  /* Verify that display width properly expands tabs.  */

> +  {

> +    const char *tstr = "\tabc\td";

> +    ASSERT_EQ (6, cpp_display_width (tstr, 6, 1));

> +    ASSERT_EQ (10, cpp_display_width (tstr, 6, 3));

> +    ASSERT_EQ (17, cpp_display_width (tstr, 6, 8));

> +    ASSERT_EQ (1, cpp_display_column_to_byte_column (tstr, 6, 7, 8));

> +  }

> +

>    /* Verify that cpp_byte_column_to_display_column can go past the end,

>       and similar edge cases.  */

>    {

> diff --git a/gcc/input.h b/gcc/input.h

> index df48ce63ef9..906d3ae244b 100644

> --- a/gcc/input.h

> +++ b/gcc/input.h

> @@ -38,7 +38,12 @@ STATIC_ASSERT (BUILTINS_LOCATION < RESERVED_LOCATION_COUNT);

>  

>  extern bool is_location_from_builtin_token (location_t);

>  extern expanded_location expand_location (location_t);

> -extern int location_compute_display_column (expanded_location);

> +

> +/* As with cpp_byte_column_to_display_column(), TABSTOP <= 0 means to use the

> +   global default cpp_get_tabstop(), which is typically set with the

> +   -ftabstop option.  */

> +extern int location_compute_display_column (expanded_location exploc,

> +					    int tabstop = 0);

>  

>  /* A class capturing the bounds of a buffer, to allow for run-time

>     bounds-checking in a checked build.  */

> diff --git a/gcc/opts.c b/gcc/opts.c

> index ec3ca0720f9..f6bd2d2972b 100644

> --- a/gcc/opts.c

> +++ b/gcc/opts.c

> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see

>  #include "opt-suggestions.h"

>  #include "diagnostic-color.h"

>  #include "selftest.h"

> +#include "cpplib.h"

>  

>  static void set_Wstrict_aliasing (struct gcc_options *opts, int onoff);

>  

> @@ -2439,6 +2440,14 @@ common_handle_option (struct gcc_options *opts,

>        dc->parseable_fixits_p = value;

>        break;

>  

> +    case OPT_fdiagnostics_column_unit_:

> +      dc->column_unit = (enum diagnostics_column_unit)value;

> +      break;

> +

> +    case OPT_fdiagnostics_column_origin_:

> +      dc->column_origin = value;

> +      break;

> +

>      case OPT_fdiagnostics_show_cwe:

>        dc->show_cwe = value;

>        break;

> @@ -2827,6 +2836,12 @@ common_handle_option (struct gcc_options *opts,

>        check_alignment_argument (loc, arg, "functions");

>        break;

>  

> +    case OPT_ftabstop_:

> +      /* It is documented that we silently ignore silly values.  */

> +      if (value >= 1 && value <= 100)

> +	cpp_set_tabstop (value);

> +      break;

> +

>      default:

>        /* If the flag was handled in a standard way, assume the lack of

>  	 processing here is intentional.  */

> diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c

> index 870ba720c5f..2314ad42402 100644

> --- a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c

> +++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c

> @@ -36,20 +36,20 @@ int fn_6 (int a, int b, int c)

>  	/* ... */

>  	if ((err = foo (a)) != 0)

>  		goto fail;

> -	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */

> +	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */

>  		goto fail;

> -		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

> +		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

>  	if ((err = foo (c)) != 0)

>  		goto fail;

>  	/* ... */

>  

>  /* { dg-begin-multiline-output "" }

> -  if ((err = foo (b)) != 0)

> -  ^~

> +         if ((err = foo (b)) != 0)

> +         ^~

>     { dg-end-multiline-output "" } */

>  /* { dg-begin-multiline-output "" }

> -   goto fail;

> -   ^~~~

> +                 goto fail;

> +                 ^~~~

>     { dg-end-multiline-output "" } */

>  

>  fail:

> diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c

> index 5cdeba1cbba..202c6bc7fdf 100644

> --- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c

> +++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c

> @@ -65,9 +65,9 @@ int fn_6 (int a, int b, int c)

>  	/* ... */

>  	if ((err = foo (a)) != 0)

>  		goto fail;

> -	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */

> +	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */

>  		goto fail;

> -		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

> +		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

>  	if ((err = foo (c)) != 0)

>  		goto fail;

>  	/* ... */

> @@ -178,7 +178,7 @@ void fn_16_tabs (void)

>      while (flagA)

>        if (flagB) /* { dg-message "7: this 'if' clause does not guard..." } */

>  	foo (0);

> -	foo (1);/* { dg-message "2: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

> +	foo (1);/* { dg-message "9: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

>  }

>  

>  void fn_17_spaces (void)

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c

> index 9359db48c17..740becb5548 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c

> @@ -8,17 +8,22 @@

>     We can't rely on any ordering of the keys.  */

>  

>  /* { dg-regexp "\"kind\": \"error\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \"#error message\"" } */

>  

>  /* { dg-regexp "\"caret\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 2" } */

> +/* { dg-regexp "\"display-column\": 2" } */

> +/* { dg-regexp "\"byte-column\": 2" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 6" } */

> +/* { dg-regexp "\"display-column\": 6" } */

> +/* { dg-regexp "\"byte-column\": 6" } */

>  

>  /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */

>  /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c

> index 557ccf8378b..2f24a6c6596 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c

> @@ -8,6 +8,7 @@

>     We can't rely on any ordering of the keys.  */

>  

>  /* { dg-regexp "\"kind\": \"warning\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \"#warning message\"" } */

>  /* { dg-regexp "\"option\": \"-Wcpp\"" } */

>  /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */

> @@ -16,11 +17,15 @@

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 2" } */

> +/* { dg-regexp "\"display-column\": 2" } */

> +/* { dg-regexp "\"byte-column\": 2" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 8" } */

> +/* { dg-regexp "\"display-column\": 8" } */

> +/* { dg-regexp "\"byte-column\": 8" } */

>  

>  /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */

>  /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c

> index 378205c5bf5..afe96a9048f 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c

> @@ -8,6 +8,7 @@

>     We can't rely on any ordering of the keys.  */

>  

>  /* { dg-regexp "\"kind\": \"error\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \"#warning message\"" } */

>  /* { dg-regexp "\"option\": \"-Werror=cpp\"" } */

>  /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */

> @@ -16,11 +17,15 @@

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 2" } */

> +/* { dg-regexp "\"display-column\": 2" } */

> +/* { dg-regexp "\"byte-column\": 2" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 8" } */

> +/* { dg-regexp "\"display-column\": 8" } */

> +/* { dg-regexp "\"byte-column\": 8" } */

>  

>  /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */

>  /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c

> index 2738be6548f..ae51091e0ea 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c

> @@ -24,15 +24,20 @@ int test (void)

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 5" } */

> +/* { dg-regexp "\"display-column\": 5" } */

> +/* { dg-regexp "\"byte-column\": 5" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 10" } */

> +/* { dg-regexp "\"display-column\": 10" } */

> +/* { dg-regexp "\"byte-column\": 10" } */

>  

>  /* The outer diagnostic.  */

>  

>  /* { dg-regexp "\"kind\": \"warning\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \"this 'if' clause does not guard...\"" } */

>  /* { dg-regexp "\"option\": \"-Wmisleading-indentation\"" } */

>  /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wmisleading-indentation\"" } */

> @@ -41,11 +46,15 @@ int test (void)

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */

>  /* { dg-regexp "\"line\": 6" } */

>  /* { dg-regexp "\"column\": 3" } */

> +/* { dg-regexp "\"display-column\": 3" } */

> +/* { dg-regexp "\"byte-column\": 3" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */

>  /* { dg-regexp "\"line\": 6" } */

>  /* { dg-regexp "\"column\": 4" } */

> +/* { dg-regexp "\"display-column\": 4" } */

> +/* { dg-regexp "\"byte-column\": 4" } */

>  

>  /* More from the nested diagnostic (we can't guarantee what order the

>     "file" keys are consumed).  */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c

> index f36e896d228..e0e9ce4be98 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c

> @@ -13,6 +13,7 @@ int test (struct s *ptr)

>     We can't rely on any ordering of the keys.  */

>  

>  /* { dg-regexp "\"kind\": \"error\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \".*\"" } */

>  

>  /* Verify fix-it hints.  */

> @@ -23,11 +24,15 @@ int test (struct s *ptr)

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 15" } */

> +/* { dg-regexp "\"display-column\": 15" } */

> +/* { dg-regexp "\"byte-column\": 15" } */

>  

>  /* { dg-regexp "\"next\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 21" } */

> +/* { dg-regexp "\"display-column\": 21" } */

> +/* { dg-regexp "\"byte-column\": 21" } */

>  

>  /* { dg-regexp "\"fixits\": \[\[\{\}, \]*\]" } */

>  

> @@ -35,11 +40,15 @@ int test (struct s *ptr)

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 15" } */

> +/* { dg-regexp "\"display-column\": 15" } */

> +/* { dg-regexp "\"byte-column\": 15" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 20" } */

> +/* { dg-regexp "\"display-column\": 20" } */

> +/* { dg-regexp "\"byte-column\": 20" } */

>  

>  /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */

>  /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-1.c b/gcc/testsuite/c-c++-common/diagnostic-units-1.c

> new file mode 100644

> index 00000000000..8d38b7de03e

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-1.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 1 (via default)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-2.c b/gcc/testsuite/c-c++-common/diagnostic-units-2.c

> new file mode 100644

> index 00000000000..29a2edefd9f

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-2.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -Wmultichar" } */

> +

> +/* column units: display (via arg)

> +   column origin: 1 (via default)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "25: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */
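
The expected columns in diagnostic-units-1.c and diagnostic-units-2.c can be cross-checked by hand. Below is a minimal sketch of the tab expansion involved. It is not the code from this patch: the name display_column_for_byte is hypothetical, and it assumes every non-tab byte occupies one display column (plain ASCII), so it ignores the multibyte handling the patch also covers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC's implementation: return the 1-based
   display column of the byte found at 1-based BYTE_COLUMN in LINE,
   expanding each tab to the next multiple of TABSTOP.  Assumes all
   non-tab bytes are one column wide (plain ASCII only).  */
int
display_column_for_byte (const char *line, int byte_column, int tabstop)
{
  assert (line != NULL && byte_column >= 1 && tabstop >= 1);

  int width = 0;  /* Display columns consumed before the target byte.  */
  for (int i = 0; i < byte_column - 1 && line[i] != '\0'; i++)
    {
      if (line[i] == '\t')
        width += tabstop - width % tabstop;  /* Jump to the next tab stop.  */
      else
        width++;
    }
  return width + 1;
}
```

For the first test line (a tab followed by "int c1 = 'c1';"), the character constant starts at byte column 11, and with a tab stop of 8 the sketch gives display column 18, matching the two dg-warning directives above.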

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-3.c b/gcc/testsuite/c-c++-common/diagnostic-units-3.c

> new file mode 100644

> index 00000000000..714ee8f2de4

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-3.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=200 -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 1 (via default)

> +   tabstop: 8 (via fallback from overly large argument) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */
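
This test passes -ftabstop=200, yet its expected caret output is laid out as if the tab stop were still 8. That is consistent with the range check quoted earlier in the patch (value >= 1 && value <= 100), where out-of-range values are silently ignored. A minimal sketch of that behavior follows. The function name effective_tabstop is hypothetical and is not part of the patch.

```c
#include <assert.h>

/* Hypothetical helper illustrating the "silently ignore silly values"
   rule: a requested tab stop outside [1, 100] leaves the current
   setting unchanged.  */
int
effective_tabstop (int current, int requested)
{
  assert (current >= 1 && current <= 100);
  return (requested >= 1 && requested <= 100) ? requested : current;
}
```

With the default of 8 as the current value, a request of 200 yields 8, while a request of 9 (as in diagnostic-units-7.c) yields 9.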

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-4.c b/gcc/testsuite/c-c++-common/diagnostic-units-4.c

> new file mode 100644

> index 00000000000..f9c9da914b2

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-4.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 0 (via arg)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "10: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-5.c b/gcc/testsuite/c-c++-common/diagnostic-units-5.c

> new file mode 100644

> index 00000000000..99d5299a732

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-5.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */

> +

> +/* column units: display (via arg)

> +   column origin: 0 (via arg)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "17: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "24: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-6.c b/gcc/testsuite/c-c++-common/diagnostic-units-6.c

> new file mode 100644

> index 00000000000..c1e6e4ed477

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-6.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=100 -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 100 (via arg)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "110: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "117: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "118: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */
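
The expected values in the column-origin tests follow from shifting the usual 1-based column by the chosen origin. Here is a small sketch of that arithmetic, assuming the column has already been computed in the selected unit. The name reported_column is hypothetical and only illustrates the numbers in these tests.

```c
#include <assert.h>

/* Hypothetical helper: convert a 1-based column into the value printed
   for a given -fdiagnostics-column-origin.  An origin of 1 leaves the
   column unchanged.  */
int
reported_column (int one_based_column, int origin)
{
  assert (one_based_column >= 1);
  return one_based_column - 1 + origin;
}
```

For the tab-indented line, byte column 11 becomes 10 with origin 0 (diagnostic-units-4.c) and 110 with origin 100 (diagnostic-units-6.c).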

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-7.c b/gcc/testsuite/c-c++-common/diagnostic-units-7.c

> new file mode 100644

> index 00000000000..dab221ae235

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-7.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 1 (via default)

> +   tabstop: 9 (via arg) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c1 = 'c1';

> +                   ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c2 = 'c2';

> +                   ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +         int c3 = 	'c3'; /* { dg-warning "20: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c3 =          'c3';

> +                            ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-8.c b/gcc/testsuite/c-c++-common/diagnostic-units-8.c

> new file mode 100644

> index 00000000000..d713b32dabc

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-8.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */

> +

> +/* column units: display (via default)

> +   column origin: 1 (via default)

> +   tabstop: 9 (via arg) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c1 = 'c1';

> +                   ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c2 = 'c2';

> +                   ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +         int c3 = 	'c3'; /* { dg-warning "28: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c3 =          'c3';

> +                            ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/missing-close-symbol.c b/gcc/testsuite/c-c++-common/missing-close-symbol.c

> index abeb83748c1..9f1de3d0c47 100644

> --- a/gcc/testsuite/c-c++-common/missing-close-symbol.c

> +++ b/gcc/testsuite/c-c++-common/missing-close-symbol.c

> @@ -24,9 +24,9 @@ void test_static_assert_different_line (void)

>    _Static_assert(sizeof(int) >= sizeof(char), /* { dg-message "to match this '\\('" } */

>  		 "msg"; /* { dg-error "expected '\\)' before ';' token" } */

>    /* { dg-begin-multiline-output "" }

> -    "msg";

> -         ^

> -         )

> +                  "msg";

> +                       ^

> +                       )

>       { dg-end-multiline-output "" } */

>    /* { dg-begin-multiline-output "" }

>     _Static_assert(sizeof(int) >= sizeof(char),

> diff --git a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C

> index fab5849dfc7..ebbf3001055 100644

> --- a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C

> +++ b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C

> @@ -33,10 +33,10 @@ int test_2 (void)

>             ~~~~~~~~~~~~~~~~

>                           |

>                           s

> -    + some_other_function ());

> -    ^ ~~~~~~~~~~~~~~~~~~~~~~

> -                          |

> -                          t

> +           + some_other_function ());

> +           ^ ~~~~~~~~~~~~~~~~~~~~~~

> +                                 |

> +                                 t

>     { dg-end-multiline-output "" } */

>  }

>  

> diff --git a/gcc/testsuite/g++.dg/parse/error4.C b/gcc/testsuite/g++.dg/parse/error4.C

> index 792bf4dc063..fe8de73790d 100644

> --- a/gcc/testsuite/g++.dg/parse/error4.C

> +++ b/gcc/testsuite/g++.dg/parse/error4.C

> @@ -7,4 +7,4 @@ struct X {

>  		 int);

>  };

>  

> -// { dg-error "4:'itn' has not been declared" "" { target *-*-* } 6 }

> +// { dg-error "18:'itn' has not been declared" "" { target *-*-* } 6 }

> diff --git a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C

> index 96ebb71645c..d2b37a5122d 100644

> --- a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C

> +++ b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C

> @@ -9,13 +9,13 @@ class A {

>  	int	h;

>  	A() { i=10; j=20; }

>  	virtual void f1() { printf("i=%d j=%d\n",i,j); }

> -	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "9:virtual functions cannot be friends" }

> +	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "16:virtual functions cannot be friends" }

>  };

>  

>  class B : public A {

>      public:

>  	virtual void f1() { printf("i=%d j=%d\n",i,j); }// { dg-error "" }  member.*// ERROR -  member.*

> -	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "9:virtual functions cannot be friends" }

> +	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "16:virtual functions cannot be friends" }

>  // { dg-error "private" "" { target *-*-* } .-1 }

>  };

>  

> diff --git a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C

> index b438543d445..bbc9e51aff6 100644

> --- a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C

> +++ b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C

> @@ -12,5 +12,5 @@ int

>  main()

>  {

>  	C<char*>	c;

> -	char*		p = Z(c.O); //{ dg-error "13:'Z' was not declared" } ambiguous c.O

> +	char*		p = Z(c.O); //{ dg-error "29:'Z' was not declared" } ambiguous c.O

>  }

> diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C

> index 6dc2c55be58..b98e8da6b1e 100644

> --- a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C

> +++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C

> @@ -48,8 +48,8 @@ ostream& operator<<(ostream& os, Graph<VertexType,EdgeType>& G)

>  

>          // The compiler does not like this line!!!!!!

>          typename Graph<VertexType, EdgeType>::Successor::iterator

> -	  startN = G[i].second.begin(), // { dg-error "14:no match" } no index operator

> -	  endN   = G[i].second.end();  // { dg-error "14:no match" } no index operator

> +	  startN = G[i].second.begin(), // { dg-error "21:no match" } no index operator

> +	  endN   = G[i].second.end();  // { dg-error "21:no match" } no index operator

>  

>          while(startN != endN)

>          {

> diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c

> index c5ff96e5644..51190c92391 100644

> --- a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c

> +++ b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c

> @@ -288,7 +288,7 @@ int test_3 (int x, int y)

>      |      |     ~~~~~~~~~~

>      |      |     |

>      |      |     (4) ...to here

> -    |   NN |      to dereference it above

> +    |   NN |                    to dereference it above

>      |   NN |   return *ptr;

>      |      |          ~~~~

>      |      |          |

> diff --git a/gcc/testsuite/gcc.dg/bad-binary-ops.c b/gcc/testsuite/gcc.dg/bad-binary-ops.c

> index 46c158e6a5f..45668be0a29 100644

> --- a/gcc/testsuite/gcc.dg/bad-binary-ops.c

> +++ b/gcc/testsuite/gcc.dg/bad-binary-ops.c

> @@ -35,10 +35,10 @@ int test_2 (void)

>             ~~~~~~~~~~~~~~~~

>             |

>             struct s

> -    + some_other_function ());

> -    ^ ~~~~~~~~~~~~~~~~~~~~~~

> -      |

> -      struct t

> +           + some_other_function ());

> +           ^ ~~~~~~~~~~~~~~~~~~~~~~

> +             |

> +             struct t

>     { dg-end-multiline-output "" } */

>  }

>  

> diff --git a/gcc/testsuite/gcc.dg/format/branch-1.c b/gcc/testsuite/gcc.dg/format/branch-1.c

> index 1782064645e..4ea39b52b2e 100644

> --- a/gcc/testsuite/gcc.dg/format/branch-1.c

> +++ b/gcc/testsuite/gcc.dg/format/branch-1.c

> @@ -10,7 +10,7 @@ foo (long l, int nfoo)

>  {

>    printf ((nfoo > 1) ? "%d foos" : "%d foo", nfoo);

>    printf ((l > 1) ? "%d foos" /* { dg-warning "23:int" "wrong type in conditional expr" } */

> -	          : "%d foo", l); /* { dg-warning "16:int" "wrong type in conditional expr" } */

> +	          : "%d foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */

>    printf ((l > 1) ? "%ld foos" : "%d foo", l); /* { dg-warning "36:int" "wrong type in conditional expr" } */

>    printf ((l > 1) ? "%d foos" : "%ld foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */

>    /* Should allow one case to have extra arguments.  */

> diff --git a/gcc/testsuite/gcc.dg/format/pr79210.c b/gcc/testsuite/gcc.dg/format/pr79210.c

> index 71f5dd6e082..6bdabdf21ec 100644

> --- a/gcc/testsuite/gcc.dg/format/pr79210.c

> +++ b/gcc/testsuite/gcc.dg/format/pr79210.c

> @@ -20,4 +20,4 @@ LPFC_VPORT_ATTR_R(peer_port_login,

>  		  "Allow peer ports on the same physical port to login to each "

>  		  "other.");

>  

> -/* { dg-warning "6: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */

> +/* { dg-warning "20: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */

> diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c

> index 03b78042107..d7691e4be51 100644

> --- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c

> +++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c

> @@ -540,15 +540,15 @@ void test_builtin_types_compatible_p (unsigned long i)

>    __emit_expression_range (0,

>  			   f (i) + __builtin_types_compatible_p (long, int)); /* { dg-warning "range" } */

>  /* { dg-begin-multiline-output "" }

> -       f (i) + __builtin_types_compatible_p (long, int));

> -       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> +                            f (i) + __builtin_types_compatible_p (long, int));

> +                            ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

>     { dg-end-multiline-output "" } */

>  

>    __emit_expression_range (0,

>  			   __builtin_types_compatible_p (long, int) + f (i)); /* { dg-warning "range" } */

>  /* { dg-begin-multiline-output "" }

> -       __builtin_types_compatible_p (long, int) + f (i));

> -       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~

> +                            __builtin_types_compatible_p (long, int) + f (i));

> +                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~

>     { dg-end-multiline-output "" } */

>  }

>  

> @@ -671,8 +671,8 @@ void test_multiple_ordinary_maps (void)

>  /* { dg-begin-multiline-output "" }

>     __emit_expression_range (0, foo (0,

>                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> -        "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));

> -        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> +                                    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));

> +                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

>     { dg-end-multiline-output "" } */

>  

>    /* Another expression that transitions between ordinary maps; this

> @@ -685,8 +685,8 @@ void test_multiple_ordinary_maps (void)

>  /* { dg-begin-multiline-output "" }

>     __emit_expression_range (0, foo (0, "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
7890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789",

>                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> -        0));

> -        ~~                      

> +                                    0));

> +                                    ~~

>     { dg-end-multiline-output "" } */

>  }

>  

> diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c

> index ac4fa1b52bd..4cba87be2ae 100644

> --- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c

> +++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c

> @@ -335,11 +335,11 @@ pr87652 (const char *stem, int counter)

>  /* { dg-error "unable to read substring location: unable to read source line" "" { target c } 329 } */

>  /* { dg-error "unable to read substring location: failed to get ordinary maps" "" { target c++ } 329 } */

>  /* { dg-begin-multiline-output "" }

> -     __emit_string_literal_range(__FILE__":%5d: " format, \

> +     __emit_string_literal_range(__FILE__":%5d: " format,        \

>                                   ^~~~~~~~

>       { dg-end-multiline-output "" { target c } } */

>  /* { dg-begin-multiline-output "" }

> -     __emit_string_literal_range(__FILE__":%5d: " format, \

> +     __emit_string_literal_range(__FILE__":%5d: " format,        \

>                                   ^

>       { dg-end-multiline-output "" { target c++ } } */

>  

> diff --git a/gcc/testsuite/gcc.dg/redecl-4.c b/gcc/testsuite/gcc.dg/redecl-4.c

> index 8f124886da8..2c214bb02c7 100644

> --- a/gcc/testsuite/gcc.dg/redecl-4.c

> +++ b/gcc/testsuite/gcc.dg/redecl-4.c

> @@ -15,7 +15,7 @@ f (void)

>      /* Should get format warnings even though the built-in declaration

>         isn't "visible".  */

>      printf (

> -	    "%s", 1); /* { dg-warning "8:format" } */

> +	    "%s", 1); /* { dg-warning "15:format" } */

>      /* The type of strcmp here should have no prototype.  */

>      if (0)

>        strcmp (1);

> diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90

> index 7fade1f65fc..606fe0f891a 100644

> --- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90

> +++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90

> @@ -8,17 +8,22 @@

>  ! We can't rely on any ordering of the keys.

>  

>  ! { dg-regexp "\"kind\": \"error\"" }

> +! { dg-regexp "\"column-origin\": 1" }

>  ! { dg-regexp "\"message\": \"#error message\"" }

>  

>  ! { dg-regexp "\"caret\": \{" }

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 2" }

> +! { dg-regexp "\"display-column\": 2" }

> +! { dg-regexp "\"byte-column\": 2" }

>  

>  ! { dg-regexp "\"finish\": \{" }

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 6" }

> +! { dg-regexp "\"display-column\": 6" }

> +! { dg-regexp "\"byte-column\": 6" }

>  

>  ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }

>  ! { dg-regexp "\"children\": \[\[\]\[\]\]" }

> diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90

> index bebcf68d431..56615f0ca5a 100644

> --- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90

> +++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90

> @@ -8,6 +8,7 @@

>  ! We can't rely on any ordering of the keys. 

>  

>  ! { dg-regexp "\"kind\": \"warning\"" }

> +! { dg-regexp "\"column-origin\": 1" }

>  ! { dg-regexp "\"message\": \"#warning message\"" }

>  ! { dg-regexp "\"option\": \"-Wcpp\"" }

>  ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }

> @@ -16,11 +17,15 @@

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 2" }

> +! { dg-regexp "\"display-column\": 2" }

> +! { dg-regexp "\"byte-column\": 2" }

>  

>  ! { dg-regexp "\"finish\": \{" }

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 8" }

> +! { dg-regexp "\"display-column\": 8" }

> +! { dg-regexp "\"byte-column\": 8" }

>  

>  ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }

>  ! { dg-regexp "\"children\": \[\[\]\[\]\]" }

> diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90

> index 7ab78eb570b..50214759091 100644

> --- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90

> +++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90

> @@ -8,6 +8,7 @@

>  ! We can't rely on any ordering of the keys.

>  

>  ! { dg-regexp "\"kind\": \"error\"" }

> +! { dg-regexp "\"column-origin\": 1" }

>  ! { dg-regexp "\"message\": \"#warning message\"" }

>  ! { dg-regexp "\"option\": \"-Werror=cpp\"" }

>  ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }

> @@ -16,11 +17,15 @@

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 2" }

> +! { dg-regexp "\"display-column\": 2" }

> +! { dg-regexp "\"byte-column\": 2" }

>  

>  ! { dg-regexp "\"finish\": \{" }

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 8" }

> +! { dg-regexp "\"display-column\": 8" }

> +! { dg-regexp "\"byte-column\": 8" }

>  

>  ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }

>  ! { dg-regexp "\"children\": \[\[\]\[\]\]" }

> diff --git a/gcc/testsuite/go.dg/arrayclear.go b/gcc/testsuite/go.dg/arrayclear.go

> index 6daebc0b8f5..aa5ba0761d7 100644

> --- a/gcc/testsuite/go.dg/arrayclear.go

> +++ b/gcc/testsuite/go.dg/arrayclear.go

> @@ -1,5 +1,8 @@

>  // { dg-do compile }

>  // { dg-options "-fgo-debug-optimization" }

> +// This comment is necessary to work around a dejagnu bug. Otherwise, the

> +// column of the second error message would equal the row of the first one, and

> +// since the errors are also identical, dejagnu is not able to distinguish them.

>  

>  package p

>  

> diff --git a/gcc/tree-diagnostic-path.cc b/gcc/tree-diagnostic-path.cc

> index 381a49cb0b4..82b3c2d6b6a 100644

> --- a/gcc/tree-diagnostic-path.cc

> +++ b/gcc/tree-diagnostic-path.cc

> @@ -493,7 +493,7 @@ default_tree_diagnostic_path_printer (diagnostic_context *context,

>     doesn't have access to trees (for m_fndecl).  */

>  

>  json::value *

> -default_tree_make_json_for_path (diagnostic_context *,

> +default_tree_make_json_for_path (diagnostic_context *context,

>  				 const diagnostic_path *path)

>  {

>    json::array *path_array = new json::array ();

> @@ -504,7 +504,8 @@ default_tree_make_json_for_path (diagnostic_context *,

>        json::object *event_obj = new json::object ();

>        if (event.get_location ())

>  	event_obj->set ("location",

> -			json_from_expanded_location (event.get_location ()));

> +			json_from_expanded_location (context,

> +						     event.get_location ()));

>        label_text event_text (event.get_desc (false));

>        event_obj->set ("description", new json::string (event_text.m_buffer));

>        event_text.maybe_free ();

> diff --git a/libcpp/charset.c b/libcpp/charset.c

> index d9281c5fb97..66a5f2b7f26 100644

> --- a/libcpp/charset.c

> +++ b/libcpp/charset.c

> @@ -2276,49 +2276,105 @@ cpp_string_location_reader::get_next ()

>    return result;

>  }

>  

> -/* Helper for cpp_byte_column_to_display_column and its inverse.  Given a

> -   pointer to a UTF-8-encoded character, compute its display width.  *INBUFP

> -   points on entry to the start of the UTF-8 encoding of the character, and

> -   is updated to point just after the last byte of the encoding.  *INBYTESLEFTP

> -   contains on entry the remaining size of the buffer into which *INBUFP

> -   points, and this is also updated accordingly.  If *INBUFP does not

> +/* This is normally determined by the -ftabstop option.  We need to know it so

> +   the display column computations below can expand tabs as well.  */

> +

> +static int global_tabstop = 8;

> +

> +int

> +cpp_set_tabstop (int t)

> +{

> +  return global_tabstop = MAX (1, t);

> +}

> +

> +int

> +cpp_get_tabstop ()

> +{

> +  return global_tabstop;

> +}

> +

> +cpp_display_width_computation::

> +cpp_display_width_computation (const char *data, int data_length, int tabstop) :

> +  m_begin (data),

> +  m_next (m_begin),

> +  m_bytes_left (data_length),

> +  m_tabstop (tabstop > 0 ? tabstop : global_tabstop),

> +  m_display_cols (0)

> +{}

> +

> +

> +/* The main implementation function for class cpp_display_width_computation.

> +   m_next points on entry to the start of the UTF-8 encoding of the next

> +   character, and is updated to point just after the last byte of the encoding.

> +   m_bytes_left contains on entry the remaining size of the buffer into which

> +   m_next points, and this is also updated accordingly.  If m_next does not

>     point to a valid UTF-8-encoded sequence, then it will be treated as a single

> -   byte with display width 1.  */

> +   byte with display width 1.  m_display_cols is the current display column,

> +   relative to which tab stops should be expanded.  Returns the display width of

> +   the codepoint just processed.  */

>  

> -static inline int

> -compute_next_display_width (const uchar **inbufp, size_t *inbytesleftp)

> +int

> +cpp_display_width_computation::process_next_codepoint ()

>  {

>    cppchar_t c;

> -  if (one_utf8_to_cppchar (inbufp, inbytesleftp, &c) != 0)

> +  int next_width;

> +

> +  if (*m_next == '\t')

> +    {

> +      ++m_next;

> +      --m_bytes_left;

> +      next_width = m_tabstop - (m_display_cols % m_tabstop);

> +    }

> +  else if (one_utf8_to_cppchar ((const uchar **) &m_next, &m_bytes_left, &c)

> +	   != 0)

>      {

>        /* Input is not convertible to UTF-8.  This could be fine, e.g. in a

>  	 string literal, so don't complain.  Just treat it as if it has a width

>  	 of one.  */

> -      ++*inbufp;

> -      --*inbytesleftp;

> -      return 1;

> +      ++m_next;

> +      --m_bytes_left;

> +      next_width = 1;

>      }

> +  else

> +    {

> +      /*  one_utf8_to_cppchar() has updated m_next and m_bytes_left for us.  */

> +      next_width = cpp_wcwidth (c);

> +    }

> +

> +  m_display_cols += next_width;

> +  return next_width;

> +}

>  

> -  /*  one_utf8_to_cppchar() has updated inbufp and inbytesleftp for us.  */

> -  return cpp_wcwidth (c);

> +/*  Utility to advance the byte stream by the minimum amount needed to consume

> +    N display columns.  Returns the number of display columns that were

> +    actually skipped.  This could be less than N, if there was not enough data,

> +    or more than N, if the last character to be skipped had a sufficiently large

> +    display width.  */

> +int

> +cpp_display_width_computation::advance_display_cols (int n)

> +{

> +  const int start = m_display_cols;

> +  const int target = start + n;

> +  while (m_display_cols < target && !done ())

> +    process_next_codepoint ();

> +  return m_display_cols - start;

>  }

>  

>  /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute

>      how many display columns are occupied by the first COLUMN bytes.  COLUMN

>      may exceed DATA_LENGTH, in which case the phantom bytes at the end are

> -    treated as if they have display width 1.  */

> +    treated as if they have display width 1.  Tabs are expanded to the next tab

> +    stop, relative to the start of DATA.  */

>  

>  int

>  cpp_byte_column_to_display_column (const char *data, int data_length,

> -				   int column)

> +				   int column, int tabstop)

>  {

> -  int display_col = 0;

> -  const uchar *udata = (const uchar *) data;

>    const int offset = MAX (0, column - data_length);

> -  size_t inbytesleft = column - offset;

> -  while (inbytesleft)

> -    display_col += compute_next_display_width (&udata, &inbytesleft);

> -  return display_col + offset;

> +  cpp_display_width_computation dw (data, column - offset, tabstop);

> +  while (!dw.done ())

> +    dw.process_next_codepoint ();

> +  return dw.display_cols_processed () + offset;

>  }

>  

>  /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute

> @@ -2328,14 +2384,11 @@ cpp_byte_column_to_display_column (const char *data, int data_length,

>  

>  int

>  cpp_display_column_to_byte_column (const char *data, int data_length,

> -				   int display_col)

> +				   int display_col, int tabstop)

>  {

> -  int column = 0;

> -  const uchar *udata = (const uchar *) data;

> -  size_t inbytesleft = data_length;

> -  while (column < display_col && inbytesleft)

> -      column += compute_next_display_width (&udata, &inbytesleft);

> -  return data_length - inbytesleft + MAX (0, display_col - column);

> +  cpp_display_width_computation dw (data, data_length, tabstop);

> +  const int avail_display = dw.advance_display_cols (display_col);

> +  return dw.bytes_processed () + MAX (0, display_col - avail_display);

>  }

>  

>  /* Our own version of wcwidth().  We don't use the actual wcwidth() in glibc,

> diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h

> index 03cc72a12e2..9bf866ad7b6 100644

> --- a/libcpp/include/cpplib.h

> +++ b/libcpp/include/cpplib.h

> @@ -312,9 +312,6 @@ enum cpp_normalize_level {

>     carries all the options visible to the command line.  */

>  struct cpp_options

>  {

> -  /* Characters between tab stops.  */

> -  unsigned int tabstop;

> -

>    /* The language we're preprocessing.  */

>    enum c_lang lang;

>  

> @@ -1322,14 +1319,48 @@ extern const char * cpp_get_userdef_suffix

>    (const cpp_token *);

>  

>  /* In charset.c */

> +

> +/* A class to manage the state while converting a UTF-8 sequence to cppchar_t

> +   and computing the display width one character at a time.  */

> +class cpp_display_width_computation {

> + public:

> +  /* TABSTOP <= 0 means to use cpp_get_tabstop().  */

> +  cpp_display_width_computation (const char *data, int data_length,

> +				 int tabstop = 0);

> +  const char *next_byte () const { return m_next; }

> +  int bytes_processed () const { return m_next - m_begin; }

> +  int bytes_left () const { return m_bytes_left; }

> +  bool done () const { return !bytes_left (); }

> +  int display_cols_processed () const { return m_display_cols; }

> +

> +  int process_next_codepoint ();

> +  int advance_display_cols (int n);

> +

> + private:

> +  const char *const m_begin;

> +  const char *m_next;

> +  size_t m_bytes_left;

> +  const int m_tabstop;

> +  int m_display_cols;

> +};

> +

> +/* Convenience functions that are simple use cases for class

> +   cpp_display_width_computation.  Tab characters will be expanded to spaces

> +   as determined by TABSTOP.  If TABSTOP <= 0, the tab width is set to the

> +   global default cpp_get_tabstop (), which is typically set with the

> +   -ftabstop option.  */

>  int cpp_byte_column_to_display_column (const char *data, int data_length,

> -				       int column);

> -inline int cpp_display_width (const char *data, int data_length)

> +				       int column, int tabstop = 0);

> +inline int cpp_display_width (const char *data, int data_length,

> +			      int tabstop = 0)

>  {

> -    return cpp_byte_column_to_display_column (data, data_length, data_length);

> +  return cpp_byte_column_to_display_column (data, data_length, data_length,

> +					    tabstop);

>  }

>  int cpp_display_column_to_byte_column (const char *data, int data_length,

> -				       int display_col);

> +				       int display_col, int tabstop = 0);

>  int cpp_wcwidth (cppchar_t c);

> +int cpp_set_tabstop (int t);

> +int cpp_get_tabstop ();

>  

>  #endif /* ! LIBCPP_CPPLIB_H */

> diff --git a/libcpp/init.c b/libcpp/init.c

> index a3cd8e28f62..cb0d5006339 100644

> --- a/libcpp/init.c

> +++ b/libcpp/init.c

> @@ -190,7 +190,6 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,

>    CPP_OPTION (pfile, discard_comments) = 1;

>    CPP_OPTION (pfile, discard_comments_in_macro_exp) = 1;

>    CPP_OPTION (pfile, max_include_depth) = 200;

> -  CPP_OPTION (pfile, tabstop) = 8;

>    CPP_OPTION (pfile, operator_names) = 1;

>    CPP_OPTION (pfile, warn_trigraphs) = 2;

>    CPP_OPTION (pfile, warn_endif_labels) = 1;
Richard Sandiford via Gcc-patches June 10, 2020, 4:11 p.m. | #7
On Fri, 2020-05-08 at 15:35 -0400, Lewis Hyatt wrote:
> On Fri, Jan 31, 2020 at 03:31:59PM -0500, David Malcolm wrote:

> > On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:

> > > Hello-

> > > 

> > > Here is the second patch that I mentioned when I submitted the

> > > other

> > > related

> > > patch (which is awaiting review):

> > > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html. 

> > 

> > Sorry about that; I'm v. busy with analyzer bugs right now.

> > 

> > > This second patch

> > > is based on top of the first one and it closes out PR49973 and

> > > PR86904 by

> > > adding the new option -fdiagnostics-column-unit=[display|byte].

> > > This

> > > allows

> > > to specify whether columns are output as simple byte counts (the

> > > current

> > > behavior), or as display columns including handling multibyte

> > > characters and

> > > tabs. The patch makes display columns the new default.

> > > Additionally,

> > > a

> > > second new option -fdiagnostics-column-origin is added, which

> > > allows

> > > to make

> > > the column 0-based (or N-based for any N) instead of 1-based. The

> > > default

> > > remains at 1-based as it is now.

> > > 

> > > A number of testcases were explicitly testing for the old

> > > behavior,

> > > so I

> > > have updated them to test for the new behavior instead, since the

> > > column

> > > number adjusted for tabs is more natural to test for, and matches

> > > what

> > > editors typically show (give or take 1 for the origin

> > > convention).

> > > 

> > > One other testcase (go.dg/arrayclear.go) was a bit of an oddity.

> > > It

> > > failed

> > > after this patch, although it doesn't test for any column

> > > numbers.

> > > The

> > > answer turned out to be, this test checks for identical error

> > > text on

> > > two

> > > different lines. When the column units are changed to display

> > > columns, then

> > > the column of the second error happens to match the line of the

> > > first

> > > one. dejagnu then misinterprets the second error as if it matched

> > > the

> > > location of the first one (it doesn't distinguish whether it

> > > checks

> > > for the

> > > line number or the column number in the output). I added a

> > > comment to

> > > the

> > > test explaining the situation; since adding the comment has the

> > > side

> > > effect

> > > of making the first line number no longer match the second column

> > > number, it

> > > also makes the test pass again.

> > > 

> > > It wasn't quite clear to me whether this change was appropriate

> > > for

> > > GCC 10

> > > or not at this point. We discussed it a couple months ago here:

> > > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either

> > > way,

> > > I hope

> > > it isn't a problem that I submitted the patch for review now,

> > > whether

> > > it

> > > will end up in 10 or 11. Please let me know what's normally

> > > expected?

> > > Thanks!

> > 

> > Thanks Lewis.

> > 

> > This patch looks very promising, but should wait until gcc 11;

> > we're

> > trying to stabilize gcc 10 right now (I'm knee-deep in analyzer

> > bug-

> > fixing, so I don't want to add any more diagnostics changes).

> > 

> 

> Hi Dave-

> 

> Well GCC 10 was released for a whole day so I thought I would bug you

> with this

> patch again now :). To summarize, I previously sent this in two

> separate parts.

> 

> Part 1: 

> https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01626.html

> Part 2: 

> https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg02108.html

> 

> Part 1 added the support for converting tabs to spaces when

> outputting

> diagnostics. Part 2 added the new options -fdiagnostics-column-unit

> and

> -fdiagnostics-column-origin to control whether the column number is

> printed

> in display or byte units. Together they resolve both PR49973 and

> PR86904.

> 

> You provided me with feedback on part 2, which is quoted below with

> some

> notes interspersed. The new version of the patch incorporates all of

> your

> suggestions. Part 1 has not changed other than some trivial rebasing

> conflicts. The two patches touch nearly disjoint sets of files and

> are

> logically linked together, so I thought it would be simpler if I just

> sent

> one combined patch now. If you prefer them to be separated as before,

> please

> let me know and I can send them that way as well.

> 

> Bootstrap and reg tests were done on x86-64 Linux for all

> languages.  Tests

> look good:

> 

> type, before, after

> FAIL 96 96

> PASS 474637 475097

> UNSUPPORTED 11607 11607

> UNTESTED 195 195

> XFAIL 1816 1816

> XPASS 36 362


Thanks for the patch; sorry about the delay in reviewing it.

Some high-level review points

- I like the patch overall

- This will deserve an item in the release notes

- I don't like adding "global_tabstop" (I don't like global
variables).  Is there nowhere else we can handle this? I believe
there's a cluster of functions in the callgraph that make use of
it; can we simply pass around the tabstop value instead?  "tabstop"
seems to have several meanings.  If I'm reading the patch correctly
  * "tabstop > 0" means to expand tabs so that column numbers are a
multiple of tabstop
  * "tabstop == 0" means "don't expand tabs"
  * "tabstop < 0" in some places means: use the global_tabstop value
Is it possible to eliminate global_tabstop value?  Or is there some
deep reason I'm missing?

I'll do a more thorough review once that's addressed/resolved (since
eliminating global_tabstop might touch a few places).

Thanks for adding docs; some nits on them:

> --- a/gcc/doc/invoke.texi

> +++ b/gcc/doc/invoke.texi


[...snip...]

> +@item -fdiagnostics-column-unit=@var{UNIT}

> +@opindex fdiagnostics-column-unit

> +Select the units for the column number.  This affects traditional diagnostics

> +(in the absence of @option{-fno-show-column}), as well as JSON format

> +diagnostics if requested.

> +

> +The default @var{UNIT}, @samp{display}, considers the number of display columns

> +occupied by each character.  This may be larger than the number of bytes

> +occupied, in the case of tab characters, or it may be smaller, in the case of

> +multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' occupies

> +two bytes and one display column, while the character ``@U{1F642}'' occupies

> +four bytes and two display columns.


This is imprecise.  A Unicode code point occupies some number of display columns,
and its *UTF-8 encoding* occupies some number of bytes.

[and my inner pedant is now thinking: what about combining diacritics? 
But I don't think we can ever issue a diagnostic on a diacritic; I
*think* we only ever care about the per-glyph level]
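
To make the distinction concrete, here is a minimal standalone sketch (not
GCC code) that separates the two quantities: the byte count comes from the
UTF-8 encoding, while the display width is a property of the decoded code
point.  The helpers utf8_sequence_length, utf8_decode and toy_display_width
are hypothetical names invented for this illustration; in particular
toy_display_width only knows about one range of wide emoji, whereas the
patch relies on cpp_wcwidth for the real answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of bytes in the UTF-8 sequence introduced by
// LEAD.  An invalid lead byte counts as a single byte, similar in spirit to
// the patch's fallback for input that cannot be decoded.
std::size_t utf8_sequence_length (unsigned char lead)
{
  if (lead < 0x80) return 1;
  if ((lead & 0xE0) == 0xC0) return 2;
  if ((lead & 0xF0) == 0xE0) return 3;
  if ((lead & 0xF8) == 0xF0) return 4;
  return 1;
}

// Hypothetical helper: decode one code point from S, which is assumed to
// point at a well-formed UTF-8 sequence.
std::uint32_t utf8_decode (const char *s)
{
  assert (s != nullptr);
  const unsigned char *u = reinterpret_cast<const unsigned char *> (s);
  const std::size_t len = utf8_sequence_length (u[0]);
  if (len == 1)
    return u[0];
  // Keep only the payload bits of the lead byte, then append 6 bits from
  // each continuation byte.
  std::uint32_t c = u[0] & (0xFFu >> (len + 1));
  for (std::size_t i = 1; i < len; ++i)
    c = (c << 6) | (u[i] & 0x3Fu);
  return c;
}

// Toy stand-in for a wcwidth-style lookup: treats U+1F300..U+1F64F as two
// columns wide and everything else as one.  Not a complete width table.
int toy_display_width (std::uint32_t code_point)
{
  return (code_point >= 0x1F300 && code_point <= 0x1F64F) ? 2 : 1;
}
```

Under these assumptions, U+03C0 is encoded in two bytes and is one column
wide, while U+1F642 is encoded in four bytes and is two columns wide, which
matches the numbers quoted from the docs above.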

> +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte

> +count in all cases, as was traditionally output by GCC prior to version 11.1.0.

> +

> +@item -fdiagnostics-column-origin=@var{ORIGIN}

> +@opindex fdiagnostics-column-origin

> +Select the origin for column numbers, i.e. the column number assigned to the

> +first column.  The default value of 1 corresponds to traditional GCC

> +behavior and to the GNU style guide.  Some utilities may perform better with an

> +origin of 0; any non-negative value may be specified.

> +

>  @item -fdiagnostics-format=@var{FORMAT}

>  @opindex fdiagnostics-format

>  Select a different format for printing diagnostics.


[...snip...]

> +A diagnostic can contain zero or more locations.  Each location has an

> +optional @code{label} string and up to three positions within it: a

> +@code{caret} position and optional @code{start} and @code{finish} positions.

> +A position is described by a @code{file} name, a @code{line} number, and

> +three numbers indicating a column position: @code{display-column} counts

> +display columns, accounting for tabs and multibyte characters;

> +@code{byte-column} counts raw bytes; and @code{column} is equal to one of

> +the previous two, as dictated by the @option{-fdiagnostics-column-unit}

> +option.


Might be clearer to use an unordered list here for the three kinds of column.

> All three columns are relative to the origin specified by

> +@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may

> +be set, for instance, to 0 for compatibility with other utilities that

> +number columns from 0.  The column origin is recorded in the JSON output in

> +the @code{column-origin} tag.  In the remaining examples below, the extra

> +column number outputs have been omitted for brevity.


[...snip...]
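
As an illustration of how the three column fields and the origin relate,
here is a small sketch.  It is not the patch's implementation:
position_columns, column_unit and make_columns are hypothetical names, and
the only assumption taken from the docs above is that every column value is
an offset from the configured origin and that "column" mirrors one of the
other two.

```cpp
#include <cassert>

// Hypothetical mirror of the three column fields in the JSON output.
struct position_columns
{
  int column;
  int display_column;
  int byte_column;
};

// Hypothetical stand-in for the value of -fdiagnostics-column-unit.
enum class column_unit { display, byte };

// ZERO_BASED_BYTE and ZERO_BASED_DISPLAY are offsets counted from 0;
// ORIGIN plays the role of -fdiagnostics-column-origin (1 by default).
position_columns
make_columns (int zero_based_byte, int zero_based_display, int origin,
              column_unit unit)
{
  assert (origin >= 0);
  position_columns p;
  p.byte_column = zero_based_byte + origin;
  p.display_column = zero_based_display + origin;
  // "column" repeats whichever unit was selected.
  p.column = (unit == column_unit::display) ? p.display_column
                                            : p.byte_column;
  return p;
}
```

For a token that follows one tab and four spaces, with tab stops every 8
columns, the zero-based byte offset 7 corresponds to the zero-based display
offset 14.  With the default origin this gives 8 and 15, which appears
consistent with the redecl-4.c expectation changing from "8:format" to
"15:format" in the patch.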


Thanks again for the patch; hope this is constructive
Dave
Richard Sandiford via Gcc-patches June 11, 2020, 3:26 p.m. | #8
On Wed, Jun 10, 2020 at 12:11:00PM -0400, David Malcolm wrote:
> Thanks for the patch; sorry about the delay in reviewing it.

> 

> Some high-level review points

> 

> - I like the patch overall

> 

> - This will deserve an item in the release notes

> 

> - I don't like adding "global_tabstop" (I don't like global

> variables).  Is there nowhere else we can handle this? I believe

> there's a cluster of functions in the callgraph that make use of

> it; can we simply pass around the tabstop value instead?  "tabstop"

> seems to have several meanings.  If I'm reading the patch correctly

>   * "tabstop > 0" means to expand tabs so that column numbers are a

> multiple of tabstop

>   * "tabstop == 0" means "don't expand tabs"

>   * "tabstop < 0" in some places means: use the global_tabstop value

> Is it possible to eliminate global_tabstop value?  Or is there some

> deep reason I'm missing?

> 

> I'll do a more thorough review once that's addressed/resolved (since

> eliminating global_tabstop might touch a few places).

>


Thanks for the feedback! The attached updated patch addresses these
concerns. Regarding tabstop, I have removed the new static variable
global_tabstop in charset.c. FWIW, the usage of "tabstop" arguments in the
various new APIs did previously work a bit more consistently than you
described. In all cases "tabstop <= 0" meant to use the default value,
otherwise it specified the tabstop to use (with tabstop=1 naturally
restoring the old behavior of changing tabs to a single space). In order
for libcpp to provide this feature (callers can pass tabstop <= 0 to get a
default, and the default can in turn be configured when processing the
-ftabstop option), it does need to remember the default, and this has to
be a file-level static variable because the routines need to work
independent of any cpp_reader instance. (Some frontends don't use
libcpp to read their input, for instance.) Anyway, I see the point that
this file-level static, being accessible with cpp_set_tabstop() and
cpp_get_tabstop(), is effectively just a global variable, so I have
removed this feature, which just means that all callers need to pass the
tabstop they want to use. Instead, I am now using the diagnostic_context
object to remember the value passed to -ftabstop. The only place this
involves global variables is now in c-family/c-indentation.c, where if I
understood correctly, the only diagnostic_context available is global_dc,
so I am getting the tabstop value from there. Please let me know if
there's a better way to handle that? Prior to my patch, the tabstop was
obtained from a different global variable (extern cpp_options *cpp_opts),
so at least conservation of total globals is maintained. :)
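
For readers following along, the tab arithmetic under discussion can be
sketched as below.  This is a simplified, ASCII-only illustration rather
than the libcpp code: display_width is a hypothetical name, every non-tab
byte is counted as one column, and the tabstop is a required argument with
no global fallback, as in the updated patch.  Only the expression
"tabstop - (cols % tabstop)" is taken from process_next_codepoint.

```cpp
#include <cassert>
#include <string>

// Simplified sketch: display width of TEXT when tabs advance to the next
// multiple of TABSTOP.  Multibyte characters are deliberately ignored here.
int display_width (const std::string &text, int tabstop)
{
  assert (tabstop > 0);
  int cols = 0;
  for (char ch : text)
    {
      if (ch == '\t')
        cols += tabstop - (cols % tabstop);  // jump to the next tab stop
      else
        cols += 1;
    }
  return cols;
}
```

With tabstop set to 1 a tab always advances by exactly one column, so the
result equals the byte count, which is the old behavior mentioned above.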

Compared to the previous version, this one is a bit longer, since 25 or
so call sites had to be modified to know the value of -ftabstop. Most of
the churn is in diagnostic-show-locus.c, because there are a fair number of
static helper functions and helper classes there, which just needed to
receive the diagnostic_context object from their callers. I could
have made this simpler by letting the tabstop argument default to
something like 8 in all functions that require it... this would remove the
need to pass it in all the selftests that are indifferent to it. I figured
it would be better to force this argument to be passed, though, or else in
the future it may be easy to forget to pass it where it is needed. 

> Thanks for adding docs; some nits on them:

> 

> > --- a/gcc/doc/invoke.texi

> > +++ b/gcc/doc/invoke.texi

> 

> [...snip...]

> 

> > +@item -fdiagnostics-column-unit=@var{UNIT}

> > +@opindex fdiagnostics-column-unit

> > +Select the units for the column number.  This affects traditional diagnostics

> > +(in the absence of @option{-fno-show-column}), as well as JSON format

> > +diagnostics if requested.

> > +

> > +The default @var{UNIT}, @samp{display}, considers the number of display columns

> > +occupied by each character.  This may be larger than the number of bytes

> > +occupied, in the case of tab characters, or it may be smaller, in the case of

> > +multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' occupies

> > +two bytes and one display column, while the character ``@U{1F642}'' occupies

> > +four bytes and two display columns.

> 

> This is imprecise.  A Unicode code point occupies some number of display columns,

> and its *UTF-8 encoding* occupies some number of bytes.

> 

> [and my inner pedant is now thinking: what about combining diacritics? 

> But I don't think we can ever issue a diagnostic on a diacritic; I

> *think* we only ever care about the per-glyph level]

> 

> > +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte
> > +count in all cases, as was traditionally output by GCC prior to version 11.1.0.
> > +
> > +@item -fdiagnostics-column-origin=@var{ORIGIN}
> > +@opindex fdiagnostics-column-origin
> > +Select the origin for column numbers, i.e. the column number assigned to the
> > +first column.  The default value of 1 corresponds to traditional GCC
> > +behavior and to the GNU style guide.  Some utilities may perform better with an
> > +origin of 0; any non-negative value may be specified.
> > +
> >  @item -fdiagnostics-format=@var{FORMAT}
> >  @opindex fdiagnostics-format
> >  Select a different format for printing diagnostics.
>
> [...snip...]
>
> > +A diagnostic can contain zero or more locations.  Each location has an
> > +optional @code{label} string and up to three positions within it: a
> > +@code{caret} position and optional @code{start} and @code{finish} positions.
> > +A position is described by a @code{file} name, a @code{line} number, and
> > +three numbers indicating a column position: @code{display-column} counts
> > +display columns, accounting for tabs and multibyte characters;
> > +@code{byte-column} counts raw bytes; and @code{column} is equal to one of
> > +the previous two, as dictated by the @option{-fdiagnostics-column-unit}
> > +option.
>
> Might be clearer to use an unordered list here for the three kinds of column.
>
> > All three columns are relative to the origin specified by
> > +@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may
> > +be set, for instance, to 0 for compatibility with other utilities that
> > +number columns from 0.  The column origin is recorded in the JSON output in
> > +the @code{column-origin} tag.  In the remaining examples below, the extra
> > +column number outputs have been omitted for brevity.
>
> [...snip...]
>

I improved the docs along these lines.
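
For reference, the distinction between the two units and the effect of the
origin can be sketched roughly as follows. This is a standalone
approximation, not the libcpp implementation: converted_column,
display_width and utf8_len are hypothetical names, and the width rule (two
columns for any 4-byte UTF-8 sequence, one column otherwise) is only a crude
stand-in for cpp_wcwidth().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum column_unit { UNIT_DISPLAY, UNIT_BYTE };

/* Length in bytes of the UTF-8 sequence introduced by LEAD.  Invalid lead
   bytes and stray continuation bytes are treated as one byte.  */
static int
utf8_len (unsigned char lead)
{
  if (lead < 0x80) return 1;
  if ((lead & 0xE0) == 0xC0) return 2;
  if ((lead & 0xF0) == 0xE0) return 3;
  if ((lead & 0xF8) == 0xF0) return 4;
  return 1;
}

/* Display width of the first NBYTES bytes of LINE.  ASSUMPTION: 4-byte
   sequences occupy two columns and everything else one; a tab advances to
   the next multiple of TABSTOP.  */
static int
display_width (const std::string &line, int nbytes, int tabstop)
{
  std::size_t end = nbytes > 0 ? static_cast<std::size_t> (nbytes) : 0;
  if (end > line.size ())
    end = line.size ();
  int cols = 0;
  std::size_t i = 0;
  while (i < end)
    {
      const unsigned char c = line[i];
      if (c == '\t')
	{
	  cols += tabstop - cols % tabstop;
	  i += 1;
	}
      else
	{
	  const int len = utf8_len (c);
	  cols += (len == 4) ? 2 : 1;
	  i += len;
	}
    }
  return cols;
}

/* Convert the 1-based byte column BYTE_COL of LINE to the requested UNIT,
   then shift it so that the first column is numbered ORIGIN.  */
int
converted_column (const std::string &line, int byte_col, column_unit unit,
		  int origin, int tabstop)
{
  int one_based = byte_col;
  if (unit == UNIT_DISPLAY)
    one_based = 1 + display_width (line, byte_col - 1, tabstop);
  return one_based + (origin - 1);
}
```

For a line consisting of a tab followed by "π = x;", the '=' sits at byte
column 5. With a tabstop of 8 this sketch reports display column 11 for an
origin of 1, and 10 for an origin of 0, while the byte column stays at 5.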

> Thanks again for the patch; hope this is constructive
> Dave
>

Thanks for your time! BTW, I bootstrapped and regtested this version as well
on x86-64 Linux. It looks good; the new tests pass and the other results are
unchanged:

              before   after
FAIL              97      97
PASS          476837  477297
UNRESOLVED         7       7
UNSUPPORTED    11726   11726
UNTESTED         195     195
XFAIL           1807    1807
XPASS             37      37

-Lewis
From 7729ce3334b6768a25967a6dd4a0a5a2ed0923cc Mon Sep 17 00:00:00 2001
From: Lewis Hyatt <lhyatt@gmail.com>
Date: Wed, 10 Jun 2020 22:04:07 -0400
Subject: [PATCH] diagnostics: Support conversion of tabs to spaces [PR49973] [PR86904]

Supports conversion of tabs to spaces when outputting diagnostics. Also
adds -fdiagnostics-column-unit and -fdiagnostics-column-origin options to
control how the column number is output, thereby resolving the two PRs.

gcc/c-family/ChangeLog:

	PR other/86904
	* c-indentation.c (should_warn_for_misleading_indentation): Get
	global tabstop from the new source.
	* c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which
	is now a common option.
	* c.opt: Likewise.

gcc/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* common.opt: Handle -ftabstop here instead of in c-family
	options.  Add -fdiagnostics-column-unit= and
	-fdiagnostics-column-origin= options.
	* opts.c (common_handle_option): Handle the new options.
	* diagnostic-format-json.cc (json_from_expanded_location): Add
	diagnostic_context argument.  Use it to convert column numbers as per
	the new options.
	(json_from_location_range): Likewise.
	(json_from_fixit_hint): Likewise.
	(json_end_diagnostic): Pass the new context argument to helper
	functions above.  Add "column-origin" field to the output.
	(test_unknown_location): Add the new context argument to calls to
	helper functions.
	(test_bad_endpoints): Likewise.
	* diagnostic-show-locus.c
	(exploc_with_display_col::exploc_with_display_col): Support
	tabstop parameter.
	(layout_point::layout_point): Make use of class
	exploc_with_display_col.
	(layout_range::layout_range): Likewise.
	(struct line_bounds): Clarify that the units are now always
	display columns.  Rename members accordingly.  Add constructor.
	(layout::print_source_line): Add support for tab expansion.
	(make_range): Adapt to class layout_range changes.
	(layout::maybe_add_location_range): Likewise.
	(layout::layout): Adapt to class exploc_with_display_col changes.
	(layout::calculate_x_offset_display): Support tabstop parameter.
	(layout::print_annotation_line): Adapt to struct line_bounds changes.
	(layout::print_line): Likewise.
	(line_label::line_label): Add diagnostic_context argument.
	(get_affected_range): Likewise.
	(get_printed_columns): Likewise.
	(layout::print_any_labels): Adapt to struct line_label changes.
	(class correction): Add m_tabstop member.
	(correction::correction): Add tabstop argument.
	(correction::compute_display_cols): Use m_tabstop.
	(class line_corrections): Add m_context member.
	(line_corrections::line_corrections): Add diagnostic_context argument.
	(line_corrections::add_hint): Use m_context to handle tabstops.
	(layout::print_trailing_fixits): Adapt to class line_corrections
	changes.
	(test_layout_x_offset_display_utf8): Support tabstop parameter.
	(test_layout_x_offset_display_tab): New selftest.
	(test_one_liner_colorized_utf8): Likewise.
	(test_tab_expansion): Likewise.
	(test_diagnostic_show_locus_one_liner_utf8): Call the new tests.
	(diagnostic_show_locus_c_tests): Likewise.
	(test_overlapped_fixit_printing): Adapt to helper class and
	function changes.
	(test_overlapped_fixit_printing_utf8): Likewise.
	(test_overlapped_fixit_printing_2): Likewise.
	* diagnostic.h (enum diagnostics_column_unit): New enum.
	(struct diagnostic_context): Add members for the new options.
	(diagnostic_converted_column): Declare.
	(json_from_expanded_location): Add new context argument.
	* diagnostic.c (diagnostic_initialize): Initialize new members.
	(diagnostic_converted_column): New function.
	(maybe_line_and_column): Be willing to output a column of 0.
	(diagnostic_get_location_text): Convert column number as per the new
	options.
	(diagnostic_report_current_module): Likewise.
	(assert_location_text): Add origin and column_unit arguments for
	testing the new functionality.
	(test_diagnostic_get_location_text): Test the new functionality.
	* doc/invoke.texi: Document the new options and behavior.
	* input.h (location_compute_display_column): Add tabstop argument.
	* input.c (location_compute_display_column): Likewise.
	(test_cpp_utf8): Add selftests for tab expansion.
	* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the
	new context argument to json_from_expanded_location().

libcpp/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* include/cpplib.h (struct cpp_options): Remove support for -ftabstop,
	which is now handled by diagnostic_context.
	(class cpp_display_width_computation): New class.
	(cpp_byte_column_to_display_column): Add optional tabstop argument.
	(cpp_display_width): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	* charset.c
	(cpp_display_width_computation::cpp_display_width_computation): New
	function.
	(cpp_display_width_computation::advance_display_cols): Likewise.
	(compute_next_display_width): Remove and implement this
	functionality in a new function...
	(cpp_display_width_computation::process_next_codepoint): ...here.
	(cpp_byte_column_to_display_column): Add tabstop argument.
	Reimplement in terms of class cpp_display_width_computation.
	(cpp_display_column_to_byte_column): Likewise.
	* init.c (cpp_create_reader): Remove handling of -ftabstop, which is now
	handled by diagnostic_context.

gcc/testsuite/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output
	for new defaults.
	* c-c++-common/Wmisleading-indentation.c: Likewise.
	* c-c++-common/diagnostic-format-json-1.c: Likewise.
	* c-c++-common/diagnostic-format-json-2.c: Likewise.
	* c-c++-common/diagnostic-format-json-3.c: Likewise.
	* c-c++-common/diagnostic-format-json-4.c: Likewise.
	* c-c++-common/diagnostic-format-json-5.c: Likewise.
	* c-c++-common/missing-close-symbol.c: Likewise.
	* g++.dg/diagnostic/bad-binary-ops.C: Likewise.
	* g++.dg/parse/error4.C: Likewise.
	* g++.old-deja/g++.brendan/crash11.C: Likewise.
	* g++.old-deja/g++.pt/overload2.C: Likewise.
	* g++.old-deja/g++.robertl/eb109.C: Likewise.
	* gcc.dg/analyzer/malloc-paths-9.c: Likewise.
	* gcc.dg/bad-binary-ops.c: Likewise.
	* gcc.dg/format/branch-1.c: Likewise.
	* gcc.dg/format/pr79210.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.
	* gcc.dg/redecl-4.c: Likewise.
	* gfortran.dg/diagnostic-format-json-1.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
	* go.dg/arrayclear.go: Add a comment explaining why adding a
	comment was necessary to work around a dejagnu bug.
	* c-c++-common/diagnostic-units-1.c: New test.
	* c-c++-common/diagnostic-units-2.c: New test.
	* c-c++-common/diagnostic-units-3.c: New test.
	* c-c++-common/diagnostic-units-4.c: New test.
	* c-c++-common/diagnostic-units-5.c: New test.
	* c-c++-common/diagnostic-units-6.c: New test.
	* c-c++-common/diagnostic-units-7.c: New test.
	* c-c++-common/diagnostic-units-8.c: New test.

diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
index 9fba3bcc67c..d814f6f29e6 100644
--- a/gcc/c-family/c-indentation.c
+++ b/gcc/c-family/c-indentation.c
@@ -24,8 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "c-common.h"
 #include "c-indentation.h"
 #include "selftest.h"
-
-extern cpp_options *cpp_opts;
+#include "diagnostic.h"
 
 /* Round up VIS_COLUMN to nearest tab stop. */
 
@@ -299,7 +298,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo,
   expanded_location next_stmt_exploc = expand_location (next_stmt_loc);
   expanded_location guard_exploc = expand_location (guard_loc);
 
-  const unsigned int tab_width = cpp_opts->tabstop;
+  const unsigned int tab_width = global_dc->tabstop;
 
   /* They must be in the same file.  */
   if (next_stmt_exploc.file != body_exploc.file)
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 8a5131b8ac6..f6588277565 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -504,12 +504,6 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
 	cpp_opts->track_macro_expansion = 2;
       break;
 
-    case OPT_ftabstop_:
-      /* It is documented that we silently ignore silly values.  */
-      if (value >= 1 && value <= 100)
-	cpp_opts->tabstop = value;
-      break;
-
     case OPT_fexec_charset_:
       cpp_opts->narrow_charset = arg;
       break;
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 89a58282b3f..913f91d818a 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1876,10 +1876,6 @@ Enum(strong_eval_order) String(some) Value(1)
 EnumValue
 Enum(strong_eval_order) String(all) Value(2)
 
-ftabstop=
-C ObjC C++ ObjC++ Joined RejectNegative UInteger
--ftabstop=<number>	Distance between tab stops for column reporting.
-
 ftemplate-backtrace-limit=
 C++ ObjC++ Joined RejectNegative UInteger Var(template_backtrace_limit) Init(10)
 Set the maximum number of template instantiation notes for a single warning or error.
diff --git a/gcc/common.opt b/gcc/common.opt
index df8af365d1b..a3893a4725e 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1328,6 +1328,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)
 EnumValue
 Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
 
+fdiagnostics-column-unit=
+Common Joined RejectNegative Enum(diagnostics_column_unit)
+-fdiagnostics-column-unit=[display|byte]	Select whether column numbers are output as display columns (default) or raw bytes.
+
+fdiagnostics-column-origin=
+Common Joined RejectNegative UInteger
+-fdiagnostics-column-origin=<number>	Set the number of the first column.  The default is 1-based as per GNU style, but some utilities may expect 0-based, for example.
+
 fdiagnostics-format=
 Common Joined RejectNegative Enum(diagnostics_output_format)
 -fdiagnostics-format=[text|json]	Select output format.
@@ -1336,6 +1344,15 @@ Common Joined RejectNegative Enum(diagnostics_output_format)
 SourceInclude
 diagnostic.h
 
+Enum
+Name(diagnostics_column_unit) Type(int)
+
+EnumValue
+Enum(diagnostics_column_unit) String(display) Value(DIAGNOSTICS_COLUMN_UNIT_DISPLAY)
+
+EnumValue
+Enum(diagnostics_column_unit) String(byte) Value(DIAGNOSTICS_COLUMN_UNIT_BYTE)
+
 Enum
 Name(diagnostics_output_format) Type(int)
 
@@ -1365,6 +1382,10 @@ fdiagnostics-path-format=
 Common Joined RejectNegative Var(flag_diagnostics_path_format) Enum(diagnostic_path_format) Init(DPF_INLINE_EVENTS)
 Specify how to print any control-flow path associated with a diagnostic.
 
+ftabstop=
+Common Joined RejectNegative UInteger
+-ftabstop=<number>      Distance between tab stops for column reporting.
+
 Enum
 Name(diagnostic_path_format) Type(int)
 
diff --git a/gcc/diagnostic-format-json.cc b/gcc/diagnostic-format-json.cc
index 7bda5c4ba83..465c42fdfde 100644
--- a/gcc/diagnostic-format-json.cc
+++ b/gcc/diagnostic-format-json.cc
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "diagnostic.h"
+#include "selftest-diagnostic.h"
 #include "diagnostic-metadata.h"
 #include "json.h"
 #include "selftest.h"
@@ -43,21 +44,43 @@ static json::array *cur_children_array;
 /* Generate a JSON object for LOC.  */
 
 json::value *
-json_from_expanded_location (location_t loc)
+json_from_expanded_location (diagnostic_context *context, location_t loc)
 {
   expanded_location exploc = expand_location (loc);
   json::object *result = new json::object ();
   if (exploc.file)
     result->set ("file", new json::string (exploc.file));
   result->set ("line", new json::integer_number (exploc.line));
-  result->set ("column", new json::integer_number (exploc.column));
+
+  const enum diagnostics_column_unit orig_unit = context->column_unit;
+  struct
+  {
+    const char *name;
+    enum diagnostics_column_unit unit;
+  } column_fields[] = {
+    {"display-column", DIAGNOSTICS_COLUMN_UNIT_DISPLAY},
+    {"byte-column", DIAGNOSTICS_COLUMN_UNIT_BYTE}
+  };
+  int the_column = INT_MIN;
+  for (int i = 0; i != sizeof column_fields / sizeof (*column_fields); ++i)
+    {
+      context->column_unit = column_fields[i].unit;
+      const int col = diagnostic_converted_column (context, exploc);
+      result->set (column_fields[i].name, new json::integer_number (col));
+      if (column_fields[i].unit == orig_unit)
+	the_column = col;
+    }
+  gcc_assert (the_column != INT_MIN);
+  result->set ("column", new json::integer_number (the_column));
+  context->column_unit = orig_unit;
   return result;
 }
 
 /* Generate a JSON object for LOC_RANGE.  */
 
 static json::object *
-json_from_location_range (const location_range *loc_range, unsigned range_idx)
+json_from_location_range (diagnostic_context *context,
+			  const location_range *loc_range, unsigned range_idx)
 {
   location_t caret_loc = get_pure_location (loc_range->m_loc);
 
@@ -68,13 +91,13 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)
   location_t finish_loc = get_finish (loc_range->m_loc);
 
   json::object *result = new json::object ();
-  result->set ("caret", json_from_expanded_location (caret_loc));
+  result->set ("caret", json_from_expanded_location (context, caret_loc));
   if (start_loc != caret_loc
       && start_loc != UNKNOWN_LOCATION)
-    result->set ("start", json_from_expanded_location (start_loc));
+    result->set ("start", json_from_expanded_location (context, start_loc));
   if (finish_loc != caret_loc
       && finish_loc != UNKNOWN_LOCATION)
-    result->set ("finish", json_from_expanded_location (finish_loc));
+    result->set ("finish", json_from_expanded_location (context, finish_loc));
 
   if (loc_range->m_label)
     {
@@ -91,14 +114,14 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)
 /* Generate a JSON object for HINT.  */
 
 static json::object *
-json_from_fixit_hint (const fixit_hint *hint)
+json_from_fixit_hint (diagnostic_context *context, const fixit_hint *hint)
 {
   json::object *fixit_obj = new json::object ();
 
   location_t start_loc = hint->get_start_loc ();
-  fixit_obj->set ("start", json_from_expanded_location (start_loc));
+  fixit_obj->set ("start", json_from_expanded_location (context, start_loc));
   location_t next_loc = hint->get_next_loc ();
-  fixit_obj->set ("next", json_from_expanded_location (next_loc));
+  fixit_obj->set ("next", json_from_expanded_location (context, next_loc));
   fixit_obj->set ("string", new json::string (hint->get_string ()));
 
   return fixit_obj;
@@ -190,11 +213,13 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   else
     {
       /* Otherwise, make diag_obj be the top-level object within the group;
-	 add a "children" array.  */
+	 add a "children" array and record the column origin.  */
       toplevel_array->append (diag_obj);
       cur_group = diag_obj;
       cur_children_array = new json::array ();
       diag_obj->set ("children", cur_children_array);
+      diag_obj->set ("column-origin",
+		     new json::integer_number (context->column_origin));
     }
 
   const rich_location *richloc = diagnostic->richloc;
@@ -205,7 +230,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   for (unsigned int i = 0; i < richloc->get_num_locations (); i++)
     {
       const location_range *loc_range = richloc->get_range (i);
-      json::object *loc_obj = json_from_location_range (loc_range, i);
+      json::object *loc_obj = json_from_location_range (context, loc_range, i);
       if (loc_obj)
 	loc_array->append (loc_obj);
     }
@@ -217,7 +242,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
       for (unsigned int i = 0; i < richloc->get_num_fixit_hints (); i++)
 	{
 	  const fixit_hint *hint = richloc->get_fixit_hint (i);
-	  json::object *fixit_obj = json_from_fixit_hint (hint);
+	  json::object *fixit_obj = json_from_fixit_hint (context, hint);
 	  fixit_array->append (fixit_obj);
 	}
     }
@@ -320,7 +345,8 @@ namespace selftest {
 static void
 test_unknown_location ()
 {
-  delete json_from_expanded_location (UNKNOWN_LOCATION);
+  test_diagnostic_context dc;
+  delete json_from_expanded_location (&dc, UNKNOWN_LOCATION);
 }
 
 /* Verify that we gracefully handle attempts to serialize bad
@@ -338,7 +364,8 @@ test_bad_endpoints ()
   loc_range.m_range_display_kind = SHOW_RANGE_WITH_CARET;
   loc_range.m_label = NULL;
 
-  json::object *obj = json_from_location_range (&loc_range, 0);
+  test_diagnostic_context dc;
+  json::object *obj = json_from_location_range (&dc, &loc_range, 0);
   /* We should have a "caret" value, but no "start" or "finish" values.  */
   ASSERT_TRUE (obj != NULL);
   ASSERT_TRUE (obj->get ("caret") != NULL);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 4618b4edb7d..da3c5b6a92d 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -175,9 +175,10 @@ enum column_unit {
 class exploc_with_display_col : public expanded_location
 {
  public:
-  exploc_with_display_col (const expanded_location &exploc)
+  exploc_with_display_col (const expanded_location &exploc, int tabstop)
     : expanded_location (exploc),
-      m_display_col (location_compute_display_column (exploc)) {}
+      m_display_col (location_compute_display_column (exploc, tabstop))
+  {}
 
   int m_display_col;
 };
@@ -189,11 +190,11 @@ class exploc_with_display_col : public expanded_location
 class layout_point
 {
  public:
-  layout_point (const expanded_location &exploc)
+  layout_point (const exploc_with_display_col &exploc)
     : m_line (exploc.line)
   {
     m_columns[CU_BYTES] = exploc.column;
-    m_columns[CU_DISPLAY_COLS] = location_compute_display_column (exploc);
+    m_columns[CU_DISPLAY_COLS] = exploc.m_display_col;
   }
 
   linenum_type m_line;
@@ -205,10 +206,10 @@ class layout_point
 class layout_range
 {
  public:
-  layout_range (const expanded_location *start_exploc,
-		const expanded_location *finish_exploc,
+  layout_range (const exploc_with_display_col &start_exploc,
+		const exploc_with_display_col &finish_exploc,
 		enum range_display_kind range_display_kind,
-		const expanded_location *caret_exploc,
+		const exploc_with_display_col &caret_exploc,
 		unsigned original_idx,
 		const range_label *label);
 
@@ -226,22 +227,18 @@ class layout_range
 
 /* A struct for use by layout::print_source_line for telling
    layout::print_annotation_line the extents of the source line that
-   it printed, so that underlines can be clipped appropriately.  */
+   it printed, so that underlines can be clipped appropriately.  Units
+   are 1-based display columns.  */
 
 struct line_bounds
 {
-  int m_first_non_ws;
-  int m_last_non_ws;
+  int m_first_non_ws_disp_col;
+  int m_last_non_ws_disp_col;
 
-  void convert_to_display_cols (char_span line)
+  line_bounds ()
   {
-    m_first_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),
-							line.length (),
-							m_first_non_ws);
-
-    m_last_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),
-						       line.length (),
-						       m_last_non_ws);
+    m_first_non_ws_disp_col = INT_MAX;
+    m_last_non_ws_disp_col = 0;
   }
 };
 
@@ -351,8 +348,8 @@ class layout
  private:
   bool will_show_line_p (linenum_type row) const;
   void print_leading_fixits (linenum_type row);
-  void print_source_line (linenum_type row, const char *line, int line_bytes,
-			  line_bounds *lbounds_out);
+  line_bounds print_source_line (linenum_type row, const char *line,
+				 int line_bytes);
   bool should_print_annotation_line_p (linenum_type row) const;
   void start_annotation_line (char margin_char = ' ') const;
   void print_annotation_line (linenum_type row, const line_bounds lbounds);
@@ -513,16 +510,16 @@ colorizer::get_color_by_name (const char *name)
    Initialize various layout_point fields from expanded_location
    equivalents; we've already filtered on file.  */
 
-layout_range::layout_range (const expanded_location *start_exploc,
-			    const expanded_location *finish_exploc,
+layout_range::layout_range (const exploc_with_display_col &start_exploc,
+			    const exploc_with_display_col &finish_exploc,
 			    enum range_display_kind range_display_kind,
-			    const expanded_location *caret_exploc,
+			    const exploc_with_display_col &caret_exploc,
 			    unsigned original_idx,
 			    const range_label *label)
-: m_start (*start_exploc),
-  m_finish (*finish_exploc),
+: m_start (start_exploc),
+  m_finish (finish_exploc),
   m_range_display_kind (range_display_kind),
-  m_caret (*caret_exploc),
+  m_caret (caret_exploc),
   m_original_idx (original_idx),
   m_label (label)
 {
@@ -646,6 +643,9 @@ layout_range::intersects_line_p (linenum_type row) const
 
 #if CHECKING_P
 
+/* Default for when we don't care what the tab expansion is set to.  */
+static const int def_tabstop = 8;
+
 /* Create some expanded locations for testing layout_range.  The filename
    member of the explocs is set to the empty string.  This member will only be
    inspected by the calls to location_compute_display_column() made from the
@@ -662,8 +662,11 @@ make_range (int start_line, int start_col, int end_line, int end_col)
     = {"", start_line, start_col, NULL, false};
   const expanded_location finish_exploc
     = {"", end_line, end_col, NULL, false};
-  return layout_range (&start_exploc, &finish_exploc, SHOW_RANGE_WITHOUT_CARET,
-		       &start_exploc, 0, NULL);
+  return layout_range (exploc_with_display_col (start_exploc, def_tabstop),
+		       exploc_with_display_col (finish_exploc, def_tabstop),
+		       SHOW_RANGE_WITHOUT_CARET,
+		       exploc_with_display_col (start_exploc, def_tabstop),
+		       0, NULL);
 }
 
 /* Selftests for layout_range::contains_point and
@@ -964,7 +967,7 @@ layout::layout (diagnostic_context * context,
 : m_context (context),
   m_pp (context->printer),
   m_primary_loc (richloc->get_range (0)->m_loc),
-  m_exploc (richloc->get_expanded_location (0)),
+  m_exploc (richloc->get_expanded_location (0), context->tabstop),
   m_colorizer (context, diagnostic_kind),
   m_colorize_source_p (context->colorize_source_p),
   m_show_labels_p (context->show_labels_p),
@@ -1060,7 +1063,10 @@ layout::maybe_add_location_range (const location_range *loc_range,
 
   /* Everything is now known to be in the correct source file,
      but it may require further sanitization.  */
-  layout_range ri (&start, &finish, loc_range->m_range_display_kind, &caret,
+  layout_range ri (exploc_with_display_col (start, m_context->tabstop),
+		   exploc_with_display_col (finish, m_context->tabstop),
+		   loc_range->m_range_display_kind,
+		   exploc_with_display_col (caret, m_context->tabstop),
 		   original_idx, loc_range->m_label);
 
   /* If we have a range that finishes before it starts (perhaps
@@ -1394,7 +1400,7 @@ layout::calculate_x_offset_display ()
     = get_line_bytes_without_trailing_whitespace (line.get_buffer (),
 						  line.length ());
   int eol_display_column
-    = cpp_display_width (line.get_buffer (), line_bytes);
+    = cpp_display_width (line.get_buffer (), line_bytes, m_context->tabstop);
   if (caret_display_column > eol_display_column
       || !caret_display_column)
     {
@@ -1445,16 +1451,13 @@ layout::calculate_x_offset_display ()
 }
 
 /* Print line ROW of source code, potentially colorized at any ranges, and
-   populate *LBOUNDS_OUT.
-   LINE is the source line (not necessarily 0-terminated) and LINE_BYTES
-   is its length in bytes.
-   This function deals only with byte offsets, not display columns, so
-   m_x_offset_display must be converted from display to byte units.  In
-   particular, LINE_BYTES and LBOUNDS_OUT are in bytes.  */
+   return the line bounds.  LINE is the source line (not necessarily
+   0-terminated) and LINE_BYTES is its length in bytes.  In order to handle both
+   colorization and tab expansion, this function tracks the line position in
+   both byte and display column units.  */
 
-void
-layout::print_source_line (linenum_type row, const char *line, int line_bytes,
-			   line_bounds *lbounds_out)
+line_bounds
+layout::print_source_line (linenum_type row, const char *line, int line_bytes)
 {
   m_colorizer.set_normal_text ();
 
@@ -1469,30 +1472,29 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
   else
     pp_space (m_pp);
 
-  /* We will stop printing the source line at any trailing whitespace, and start
-     printing it as per m_x_offset_display.  */
+  /* We will stop printing the source line at any trailing whitespace.  */
   line_bytes = get_line_bytes_without_trailing_whitespace (line,
 							   line_bytes);
-  int x_offset_bytes = 0;
-  if (m_x_offset_display)
-    {
-      x_offset_bytes = cpp_display_column_to_byte_column (line, line_bytes,
-							  m_x_offset_display);
-      /* In case the leading portion of the line that will be skipped over ends
-	 with a character with wcwidth > 1, then it is possible we skipped too
-	 much, so account for that by padding with spaces.  */
-      const int overage
-	= cpp_byte_column_to_display_column (line, line_bytes, x_offset_bytes)
-	- m_x_offset_display;
-      for (int column = 0; column < overage; ++column)
-	pp_space (m_pp);
-      line += x_offset_bytes;
-    }
 
-  /* Print the line.  */
-  int first_non_ws = INT_MAX;
-  int last_non_ws = 0;
-  for (int col_byte = 1 + x_offset_bytes; col_byte <= line_bytes; col_byte++)
+  /* This object helps to keep track of which display column we are at, which is
+     necessary for computing the line bounds in display units, for doing
+     tab expansion, and for implementing m_x_offset_display.  */
+  cpp_display_width_computation dw (line, line_bytes, m_context->tabstop);
+
+  /* Skip the first m_x_offset_display display columns.  In case the leading
+     portion that will be skipped ends with a character with wcwidth > 1, then
+     it is possible we skipped too much, so account for that by padding with
+     spaces.  Note that this does the right thing too in case a tab was the last
+     character to be skipped over; the tab is effectively replaced by the
+     correct number of trailing spaces needed to offset by the desired number of
+     display columns.  */
+  for (int skipped_display_cols = dw.advance_display_cols (m_x_offset_display);
+       skipped_display_cols > m_x_offset_display; --skipped_display_cols)
+    pp_space (m_pp);
+
+  /* Print the line and compute the line_bounds.  */
+  line_bounds lbounds;
+  while (!dw.done ())
     {
       /* Assuming colorization is enabled for the caret and underline
 	 characters, we may also colorize the associated characters
@@ -1510,7 +1512,8 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
 	{
 	  bool in_range_p;
 	  point_state state;
-	  in_range_p = get_state_at_point (row, col_byte,
+	  const int start_byte_col = dw.bytes_processed () + 1;
+	  in_range_p = get_state_at_point (row, start_byte_col,
 					   0, INT_MAX,
 					   CU_BYTES,
 					   &state);
@@ -1519,22 +1522,44 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
 	  else
 	    m_colorizer.set_normal_text ();
 	}
-      char c = *line;
-      if (c == '\0' || c == '\t' || c == '\r')
-	c = ' ';
-      if (c != ' ')
+
+      /* Get the display width of the next character to be output, expanding
+	 tabs and replacing some control bytes with spaces as necessary.  */
+      const char *c = dw.next_byte ();
+      const int start_disp_col = dw.display_cols_processed () + 1;
+      const int this_display_width = dw.process_next_codepoint ();
+      if (*c == '\t')
+	{
+	  /* The returned display width is the number of spaces into which the
+	     tab should be expanded.  */
+	  for (int i = 0; i != this_display_width; ++i)
+	    pp_space (m_pp);
+	  continue;
+	}
+      if (*c == '\0' || *c == '\r')
 	{
-	  last_non_ws = col_byte;
-	  if (first_non_ws == INT_MAX)
-	    first_non_ws = col_byte;
+	  /* cpp_wcwidth() promises to return 1 for all control bytes, and we
+	     want to output these as a single space too, so this case is
+	     actually the same as the '\t' case.  */
+	  gcc_assert (this_display_width == 1);
+	  pp_space (m_pp);
+	  continue;
 	}
-      pp_character (m_pp, c);
-      line++;
+
+      /* We have a (possibly multibyte) character to output; update the line
+	 bounds if it is not whitespace.  */
+      if (*c != ' ')
+	{
+	  lbounds.m_last_non_ws_disp_col = dw.display_cols_processed ();
+	  if (lbounds.m_first_non_ws_disp_col == INT_MAX)
+	    lbounds.m_first_non_ws_disp_col = start_disp_col;
+	}
+
+      /* Output the character.  */
+      while (c != dw.next_byte ()) pp_character (m_pp, *c++);
     }
   print_newline ();
-
-  lbounds_out->m_first_non_ws = first_non_ws;
-  lbounds_out->m_last_non_ws = last_non_ws;
+  return lbounds;
 }
 
 /* Determine if we should print an annotation line for ROW.
@@ -1576,14 +1601,13 @@ layout::start_annotation_line (char margin_char) const
 }
 
 /* Print a line consisting of the caret/underlines for the given
-   source line.  This function works with display columns, rather than byte
-   counts; in particular, LBOUNDS should be in display column units.  */
+   source line.  */
 
 void
 layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
 {
   int x_bound = get_x_bound_for_row (row, m_exploc.m_display_col,
-				     lbounds.m_last_non_ws);
+				     lbounds.m_last_non_ws_disp_col);
 
   start_annotation_line ();
   pp_space (m_pp);
@@ -1593,8 +1617,8 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
       bool in_range_p;
       point_state state;
       in_range_p = get_state_at_point (row, column,
-				       lbounds.m_first_non_ws,
-				       lbounds.m_last_non_ws,
+				       lbounds.m_first_non_ws_disp_col,
+				       lbounds.m_last_non_ws_disp_col,
 				       CU_DISPLAY_COLS,
 				       &state);
       if (in_range_p)
@@ -1631,12 +1655,14 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
 class line_label
 {
 public:
-  line_label (int state_idx, int column, label_text text)
+  line_label (diagnostic_context *context, int state_idx, int column,
+	      label_text text)
   : m_state_idx (state_idx), m_column (column),
     m_text (text), m_label_line (0), m_has_vbar (true)
   {
     const int bytes = strlen (text.m_buffer);
-    m_display_width = cpp_display_width (text.m_buffer, bytes);
+    m_display_width
+      = cpp_display_width (text.m_buffer, bytes, context->tabstop);
   }
 
   /* Sorting is primarily by column, then by state index.  */
@@ -1696,7 +1722,7 @@ layout::print_any_labels (linenum_type row)
 	if (text.m_buffer == NULL)
 	  continue;
 
-	labels.safe_push (line_label (i, disp_col, text));
+	labels.safe_push (line_label (m_context, i, disp_col, text));
       }
   }
 
@@ -1976,7 +2002,8 @@ public:
 
 /* Get the range of bytes or display columns that HINT would affect.  */
 static column_range
-get_affected_range (const fixit_hint *hint, enum column_unit col_unit)
+get_affected_range (diagnostic_context *context,
+		    const fixit_hint *hint, enum column_unit col_unit)
 {
   expanded_location exploc_start = expand_location (hint->get_start_loc ());
   expanded_location exploc_finish = expand_location (hint->get_next_loc ());
@@ -1986,11 +2013,13 @@ get_affected_range (const fixit_hint *hint, enum column_unit col_unit)
   int finish_column;
   if (col_unit == CU_DISPLAY_COLS)
     {
-      start_column = location_compute_display_column (exploc_start);
+      start_column
+	= location_compute_display_column (exploc_start, context->tabstop);
       if (hint->insertion_p ())
 	finish_column = start_column - 1;
       else
-	finish_column = location_compute_display_column (exploc_finish);
+	finish_column
+	  = location_compute_display_column (exploc_finish, context->tabstop);
     }
   else
     {
@@ -2003,12 +2032,12 @@ get_affected_range (const fixit_hint *hint, enum column_unit col_unit)
 /* Get the range of display columns that would be printed for HINT.  */
 
 static column_range
-get_printed_columns (const fixit_hint *hint)
+get_printed_columns (diagnostic_context *context, const fixit_hint *hint)
 {
   expanded_location exploc = expand_location (hint->get_start_loc ());
-  int start_column = location_compute_display_column (exploc);
-  int hint_width = cpp_display_width (hint->get_string (),
-				      hint->get_length ());
+  int start_column = location_compute_display_column (exploc, context->tabstop);
+  int hint_width = cpp_display_width (hint->get_string (), hint->get_length (),
+				      context->tabstop);
   int final_hint_column = start_column + hint_width - 1;
   if (hint->insertion_p ())
     {
@@ -2018,7 +2047,8 @@ get_printed_columns (const fixit_hint *hint)
     {
       exploc = expand_location (hint->get_next_loc ());
       --exploc.column;
-      int finish_column = location_compute_display_column (exploc);
+      int finish_column
+	= location_compute_display_column (exploc, context->tabstop);
       return column_range (start_column,
 			   MAX (finish_column, final_hint_column));
     }
@@ -2035,12 +2065,14 @@ public:
   correction (column_range affected_bytes,
 	      column_range affected_columns,
 	      column_range printed_columns,
-	      const char *new_text, size_t new_text_len)
+	      const char *new_text, size_t new_text_len,
+	      int tabstop)
   : m_affected_bytes (affected_bytes),
     m_affected_columns (affected_columns),
     m_printed_columns (printed_columns),
     m_text (xstrdup (new_text)),
     m_byte_length (new_text_len),
+    m_tabstop (tabstop),
     m_alloc_sz (new_text_len + 1)
   {
     compute_display_cols ();
@@ -2058,7 +2090,7 @@ public:
 
   void compute_display_cols ()
   {
-    m_display_cols = cpp_display_width (m_text, m_byte_length);
+    m_display_cols = cpp_display_width (m_text, m_byte_length, m_tabstop);
   }
 
   void overwrite (int dst_offset, const char_span &src_span)
@@ -2086,6 +2118,7 @@ public:
   char *m_text;
   size_t m_byte_length; /* Not including null-terminator.  */
   int m_display_cols;
+  int m_tabstop;
   size_t m_alloc_sz;
 };
 
@@ -2121,13 +2154,15 @@ correction::ensure_terminated ()
 class line_corrections
 {
 public:
-  line_corrections (const char *filename, linenum_type row)
-  : m_filename (filename), m_row (row)
+  line_corrections (diagnostic_context *context, const char *filename,
+		    linenum_type row)
+    : m_context (context), m_filename (filename), m_row (row)
   {}
   ~line_corrections ();
 
   void add_hint (const fixit_hint *hint);
 
+  diagnostic_context *m_context;
   const char *m_filename;
   linenum_type m_row;
   auto_vec <correction *> m_corrections;
@@ -2173,9 +2208,10 @@ source_line::source_line (const char *filename, int line)
 void
 line_corrections::add_hint (const fixit_hint *hint)
 {
-  column_range affected_bytes = get_affected_range (hint, CU_BYTES);
-  column_range affected_columns = get_affected_range (hint, CU_DISPLAY_COLS);
-  column_range printed_columns = get_printed_columns (hint);
+  column_range affected_bytes = get_affected_range (m_context, hint, CU_BYTES);
+  column_range affected_columns = get_affected_range (m_context, hint,
+						      CU_DISPLAY_COLS);
+  column_range printed_columns = get_printed_columns (m_context, hint);
 
   /* Potentially consolidate.  */
   if (!m_corrections.is_empty ())
@@ -2243,7 +2279,8 @@ line_corrections::add_hint (const fixit_hint *hint)
 					   affected_columns,
 					   printed_columns,
 					   hint->get_string (),
-					   hint->get_length ()));
+					   hint->get_length (),
+					   m_context->tabstop));
 }
 
 /* If there are any fixit hints on source line ROW, print them.
@@ -2257,7 +2294,7 @@ layout::print_trailing_fixits (linenum_type row)
 {
   /* Build a list of correction instances for the line,
      potentially consolidating hints (for the sake of readability).  */
-  line_corrections corrections (m_exploc.file, row);
+  line_corrections corrections (m_context, m_exploc.file, row);
   for (unsigned int i = 0; i < m_fixit_hints.length (); i++)
     {
       const fixit_hint *hint = m_fixit_hints[i];
@@ -2499,15 +2536,11 @@ layout::print_line (linenum_type row)
   if (!line)
     return;
 
-  line_bounds lbounds;
   print_leading_fixits (row);
-  print_source_line (row, line.get_buffer (), line.length (), &lbounds);
+  const line_bounds lbounds
+    = print_source_line (row, line.get_buffer (), line.length ());
   if (should_print_annotation_line_p (row))
-    {
-      if (lbounds.m_first_non_ws != INT_MAX)
-	lbounds.convert_to_display_cols (line);
-      print_annotation_line (row, lbounds);
-    }
+    print_annotation_line (row, lbounds);
   if (m_show_labels_p)
     print_any_labels (row);
   print_trailing_fixits (row);
@@ -2670,9 +2703,11 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
 
   char_span lspan = location_get_source_line (tmp.get_filename (), 1);
   ASSERT_EQ (line_display_cols,
-	     cpp_display_width (lspan.get_buffer (), lspan.length ()));
+	     cpp_display_width (lspan.get_buffer (), lspan.length (),
+				def_tabstop));
   ASSERT_EQ (line_display_cols,
-	     location_compute_display_column (expand_location (line_end)));
+	     location_compute_display_column (expand_location (line_end),
+					      def_tabstop));
   ASSERT_EQ (0, memcmp (lspan.get_buffer () + (emoji_col - 1),
 			"\xf0\x9f\x98\x82\xf0\x9f\x98\x82", 8));
 
@@ -2774,6 +2809,111 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
 
 }
 
+static void
+test_layout_x_offset_display_tab (const line_table_case &case_)
+{
+  const char *content
+    = "This line is very long, so that we can use it to test the logic for "
+      "clipping long lines.  Also this: `\t' is a tab that occupies 1 byte and "
+      "a variable number of display columns, starting at column #103.\n";
+
+  /* Number of bytes in the line, subtracting one to remove the newline.  */
+  const int line_bytes = strlen (content) - 1;
+
+  /* The column where the tab begins.  Byte and display columns are the same
+     here, as there are no multibyte characters earlier on the line.  */
+  const int tab_col = 103;
+
+  /* Effective extra size of the tab beyond what a single space would have taken
+     up, indexed by tabstop.  */
+  static const int num_tabstops = 11;
+  int extra_width[num_tabstops];
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      const int this_tab_size = tabstop - (tab_col - 1) % tabstop;
+      extra_width[tabstop] = this_tab_size - 1;
+    }
+  /* Example of this calculation: if tabstop is 10, the tab starting at column
+     #103 has to expand into 8 spaces, covering columns 103-110, so that the
+     next character is at column #111.  So it takes up 7 more columns than
+     a space would have taken up.  */
+  ASSERT_EQ (7, extra_width[10]);
+
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+  line_table_test ltt (case_);
+
+  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+
+  location_t line_end = linemap_position_for_column (line_table, line_bytes);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  /* Check that cpp_display_width handles the tabs as expected.  */
+  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  ASSERT_EQ ('\t', *(lspan.get_buffer () + (tab_col - 1)));
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      ASSERT_EQ (line_bytes + extra_width[tabstop],
+		 cpp_display_width (lspan.get_buffer (), lspan.length (),
+				    tabstop));
+      ASSERT_EQ (line_bytes + extra_width[tabstop],
+		 location_compute_display_column (expand_location (line_end),
+						  tabstop));
+    }
+
+  /* Check that the tab is expanded to the expected number of spaces.  */
+  rich_location richloc (line_table,
+			 linemap_position_for_column (line_table,
+						      tab_col + 1));
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      test_diagnostic_context dc;
+      dc.tabstop = tabstop;
+      layout test_layout (&dc, &richloc, DK_ERROR);
+      test_layout.print_line (1);
+      const char *out = pp_formatted_text (dc.printer);
+      ASSERT_EQ (NULL, strchr (out, '\t'));
+      const char *left_quote = strchr (out, '`');
+      const char *right_quote = strchr (out, '\'');
+      ASSERT_NE (NULL, left_quote);
+      ASSERT_NE (NULL, right_quote);
+      ASSERT_EQ (right_quote - left_quote, extra_width[tabstop] + 2);
+    }
+
+  /* Check that the line is offset properly and that the tab is broken up
+     into the expected number of spaces when it is the last character skipped
+     over.  */
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      test_diagnostic_context dc;
+      dc.tabstop = tabstop;
+      static const int small_width = 24;
+      dc.caret_max_width = small_width - 4;
+      dc.min_margin_width = test_left_margin - test_linenum_sep + 1;
+      dc.show_line_numbers_p = true;
+      layout test_layout (&dc, &richloc, DK_ERROR);
+      test_layout.print_line (1);
+
+      /* We have arranged things so that two columns will be printed before
+	 the caret.  If the tab results in more than one space, this should
+	 produce two spaces in the output; otherwise, it will be a single space
+	 preceded by the opening quote before the tab character.  */
+      const char *output1
+	= "   1 |   ' is a tab that occupies 1 byte and a variable number of "
+	  "display columns, starting at column #103.\n"
+	  "     |   ^\n\n";
+      const char *output2
+	= "   1 | ` ' is a tab that occupies 1 byte and a variable number of "
+	  "display columns, starting at column #103.\n"
+	  "     |   ^\n\n";
+      const char *expected_output = (extra_width[tabstop] ? output1 : output2);
+      ASSERT_STREQ (expected_output, pp_formatted_text (dc.printer));
+    }
+}
+
+
 /* Verify that diagnostic_show_locus works sanely on UNKNOWN_LOCATION.  */
 
 static void
@@ -3854,6 +3994,27 @@ test_one_liner_labels_utf8 ()
   }
 }
 
+/* Make sure that colorization codes don't interrupt a multibyte
+   sequence, which would corrupt it.  */
+static void
+test_one_liner_colorized_utf8 ()
+{
+  test_diagnostic_context dc;
+  dc.colorize_source_p = true;
+  diagnostic_color_init (&dc, DIAGNOSTICS_COLOR_YES);
+  const location_t pi = linemap_position_for_column (line_table, 12);
+  rich_location richloc (line_table, pi);
+  diagnostic_show_locus (&dc, &richloc, DK_ERROR);
+
+  /* In order to avoid having the test depend on exactly how the colorization
+     was effected, just confirm there are two pi characters in the output.  */
+  const char *result = pp_formatted_text (dc.printer);
+  const char *null_term = result + strlen (result);
+  const char *first_pi = strstr (result, "\xcf\x80");
+  ASSERT_TRUE (first_pi && first_pi <= null_term - 2);
+  ASSERT_STR_CONTAINS (first_pi + 2, "\xcf\x80");
+}
+
 /* Run the various one-liner tests.  */
 
 static void
@@ -3884,8 +4045,10 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   ASSERT_EQ (31, LOCATION_COLUMN (line_end));
 
   char_span lspan = location_get_source_line (tmp.get_filename (), 1);
-  ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length ()));
-  ASSERT_EQ (25, location_compute_display_column (expand_location (line_end)));
+  ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length (),
+				    def_tabstop));
+  ASSERT_EQ (25, location_compute_display_column (expand_location (line_end),
+						  def_tabstop));
 
   test_one_liner_simple_caret_utf8 ();
   test_one_liner_caret_and_range_utf8 ();
@@ -3900,6 +4063,7 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   test_one_liner_many_fixits_1_utf8 ();
   test_one_liner_many_fixits_2_utf8 ();
   test_one_liner_labels_utf8 ();
+  test_one_liner_colorized_utf8 ();
 }
 
 /* Verify that gcc_rich_location::add_location_if_nearby works.  */
@@ -4272,25 +4436,28 @@ test_overlapped_fixit_printing (const line_table_case &case_)
     /* Unit-test the line_corrections machinery.  */
     ASSERT_EQ (3, richloc.get_num_fixit_hints ());
     const fixit_hint *hint_0 = richloc.get_fixit_hint (0);
-    ASSERT_EQ (column_range (12, 12), get_affected_range (hint_0, CU_BYTES));
     ASSERT_EQ (column_range (12, 12),
-			   get_affected_range (hint_0, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (12, 22), get_printed_columns (hint_0));
+	       get_affected_range (&dc, hint_0, CU_BYTES));
+    ASSERT_EQ (column_range (12, 12),
+	       get_affected_range (&dc, hint_0, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (12, 22), get_printed_columns (&dc, hint_0));
     const fixit_hint *hint_1 = richloc.get_fixit_hint (1);
-    ASSERT_EQ (column_range (18, 18), get_affected_range (hint_1, CU_BYTES));
     ASSERT_EQ (column_range (18, 18),
-			   get_affected_range (hint_1, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (18, 20), get_printed_columns (hint_1));
+	       get_affected_range (&dc, hint_1, CU_BYTES));
+    ASSERT_EQ (column_range (18, 18),
+	       get_affected_range (&dc, hint_1, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (18, 20), get_printed_columns (&dc, hint_1));
     const fixit_hint *hint_2 = richloc.get_fixit_hint (2);
-    ASSERT_EQ (column_range (29, 28), get_affected_range (hint_2, CU_BYTES));
     ASSERT_EQ (column_range (29, 28),
-			   get_affected_range (hint_2, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (29, 29), get_printed_columns (hint_2));
+	       get_affected_range (&dc, hint_2, CU_BYTES));
+    ASSERT_EQ (column_range (29, 28),
+	       get_affected_range (&dc, hint_2, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (29, 29), get_printed_columns (&dc, hint_2));
 
     /* Add each hint in turn to a line_corrections instance,
        and verify that they are consolidated into one correction instance
        as expected.  */
-    line_corrections lc (tmp.get_filename (), 1);
+    line_corrections lc (&dc, tmp.get_filename (), 1);
 
     /* The first replace hint by itself.  */
     lc.add_hint (hint_0);
@@ -4484,25 +4651,28 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)
     /* Unit-test the line_corrections machinery.  */
     ASSERT_EQ (3, richloc.get_num_fixit_hints ());
     const fixit_hint *hint_0 = richloc.get_fixit_hint (0);
-    ASSERT_EQ (column_range (14, 14), get_affected_range (hint_0, CU_BYTES));
+    ASSERT_EQ (column_range (14, 14),
+	       get_affected_range (&dc, hint_0, CU_BYTES));
     ASSERT_EQ (column_range (12, 12),
-			   get_affected_range (hint_0, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (12, 22), get_printed_columns (hint_0));
+	       get_affected_range (&dc, hint_0, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (12, 22), get_printed_columns (&dc, hint_0));
     const fixit_hint *hint_1 = richloc.get_fixit_hint (1);
-    ASSERT_EQ (column_range (22, 22), get_affected_range (hint_1, CU_BYTES));
+    ASSERT_EQ (column_range (22, 22),
+	       get_affected_range (&dc, hint_1, CU_BYTES));
     ASSERT_EQ (column_range (18, 18),
-			   get_affected_range (hint_1, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (18, 20), get_printed_columns (hint_1));
+	       get_affected_range (&dc, hint_1, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (18, 20), get_printed_columns (&dc, hint_1));
     const fixit_hint *hint_2 = richloc.get_fixit_hint (2);
-    ASSERT_EQ (column_range (35, 34), get_affected_range (hint_2, CU_BYTES));
+    ASSERT_EQ (column_range (35, 34),
+	       get_affected_range (&dc, hint_2, CU_BYTES));
     ASSERT_EQ (column_range (30, 29),
-			   get_affected_range (hint_2, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (30, 30), get_printed_columns (hint_2));
+	       get_affected_range (&dc, hint_2, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (30, 30), get_printed_columns (&dc, hint_2));
 
     /* Add each hint in turn to a line_corrections instance,
        and verify that they are consolidated into one correction instance
        as expected.  */
-    line_corrections lc (tmp.get_filename (), 1);
+    line_corrections lc (&dc, tmp.get_filename (), 1);
 
     /* The first replace hint by itself.  */
     lc.add_hint (hint_0);
@@ -4689,6 +4859,8 @@ test_overlapped_fixit_printing_2 (const line_table_case &case_)
 
   /* Two insertions, in the wrong order.  */
   {
+    test_diagnostic_context dc;
+
     rich_location richloc (line_table, col_20);
     richloc.add_fixit_insert_before (col_23, "{");
     richloc.add_fixit_insert_before (col_21, "}");
@@ -4696,14 +4868,15 @@ test_overlapped_fixit_printing_2 (const line_table_case &case_)
     /* These fixits should be accepted; they can't be consolidated.  */
     ASSERT_EQ (2, richloc.get_num_fixit_hints ());
     const fixit_hint *hint_0 = richloc.get_fixit_hint (0);
-    ASSERT_EQ (column_range (23, 22), get_affected_range (hint_0, CU_BYTES));
-    ASSERT_EQ (column_range (23, 23), get_printed_columns (hint_0));
+    ASSERT_EQ (column_range (23, 22),
+	       get_affected_range (&dc, hint_0, CU_BYTES));
+    ASSERT_EQ (column_range (23, 23), get_printed_columns (&dc, hint_0));
     const fixit_hint *hint_1 = richloc.get_fixit_hint (1);
-    ASSERT_EQ (column_range (21, 20), get_affected_range (hint_1, CU_BYTES));
-    ASSERT_EQ (column_range (21, 21), get_printed_columns (hint_1));
+    ASSERT_EQ (column_range (21, 20),
+	       get_affected_range (&dc, hint_1, CU_BYTES));
+    ASSERT_EQ (column_range (21, 21), get_printed_columns (&dc, hint_1));
 
     /* Verify that they're printed correctly.  */
-    test_diagnostic_context dc;
     diagnostic_show_locus (&dc, &richloc, DK_ERROR);
     ASSERT_STREQ (" int a5[][0][0] = { 1, 2 };\n"
 		  "                    ^\n"
@@ -4955,6 +5128,65 @@ test_fixit_deletion_affecting_newline (const line_table_case &case_)
 		pp_formatted_text (dc.printer));
 }
 
+static void
+test_tab_expansion (const line_table_case &case_)
+{
+  /* Create a tempfile and write some text to it.  This example uses a tabstop
+     of 8, as the column numbers attempt to indicate:
+
+    .....................000.01111111111.22222333333  display
+    .....................123.90123456789.56789012345  columns  */
+  const char *content = "  \t   This: `\t' is a tab.\n";
+  /* ....................000 00000011111 11111222222  byte
+     ....................123 45678901234 56789012345  columns  */
+
+  const int tabstop = 8;
+  const int first_non_ws_byte_col = 7;
+  const int right_quote_byte_col = 15;
+  const int last_byte_col = 25;
+  ASSERT_EQ (35, cpp_display_width (content, last_byte_col, tabstop));
+
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+  line_table_test ltt (case_);
+  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  location_t line_end = linemap_position_for_column (line_table, last_byte_col);
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  /* Check that the leading whitespace with mixed tabs and spaces is expanded
+     into 11 spaces.  Recall that print_line() also puts one space before
+     everything.  */
+  {
+    test_diagnostic_context dc;
+    dc.tabstop = tabstop;
+    rich_location richloc (line_table,
+			   linemap_position_for_column (line_table,
+							first_non_ws_byte_col));
+    layout test_layout (&dc, &richloc, DK_ERROR);
+    test_layout.print_line (1);
+    ASSERT_STREQ ("            This: `      ' is a tab.\n"
+		  "            ^\n",
+		  pp_formatted_text (dc.printer));
+  }
+
+  /* Confirm the display width was tracked correctly across the internal tab
+     as well.  */
+  {
+    test_diagnostic_context dc;
+    dc.tabstop = tabstop;
+    rich_location richloc (line_table,
+			   linemap_position_for_column (line_table,
+							right_quote_byte_col));
+    layout test_layout (&dc, &richloc, DK_ERROR);
+    test_layout.print_line (1);
+    ASSERT_STREQ ("            This: `      ' is a tab.\n"
+		  "                         ^\n",
+		  pp_formatted_text (dc.printer));
+  }
+}
+
 /* Verify that line numbers are correctly printed for the case of
    a multiline range in which the width of the line numbers changes
    (e.g. from "9" to "10").  */
@@ -5012,6 +5244,7 @@ diagnostic_show_locus_c_tests ()
   test_layout_range_for_multiple_lines ();
 
   for_each_line_table_case (test_layout_x_offset_display_utf8);
+  for_each_line_table_case (test_layout_x_offset_display_tab);
 
   test_get_line_bytes_without_trailing_whitespace ();
 
@@ -5029,6 +5262,7 @@ diagnostic_show_locus_c_tests ()
   for_each_line_table_case (test_fixit_insert_containing_newline_2);
   for_each_line_table_case (test_fixit_replace_containing_newline);
   for_each_line_table_case (test_fixit_deletion_affecting_newline);
+  for_each_line_table_case (test_tab_expansion);
 
   test_line_numbers_multiline_range ();
 }
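
Aside, not part of the patch: the tab arithmetic that
test_layout_x_offset_display_tab and test_tab_expansion exercise can be
sketched standalone as below.  Both helper names are hypothetical and exist
only for illustration.  The sketch assumes an ASCII-only line, so it is
simpler than cpp_display_width, which also handles multibyte and wide
characters.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not in the patch): number of display columns a tab
// occupies when it starts at 1-based display column COL.
static int
tab_width_at (int col, int tabstop)
{
  assert (col >= 1 && tabstop >= 1);
  return tabstop - (col - 1) % tabstop;
}

// Hypothetical helper (not in the patch): display width of an ASCII-only
// line with tabs expanded.  Every non-tab byte counts as one column.
static int
ascii_display_width (const std::string &line, int tabstop)
{
  int cols = 0;
  for (char c : line)
    cols += (c == '\t') ? tab_width_at (cols + 1, tabstop) : 1;
  return cols;
}
```

With a tabstop of 10, a tab starting at column 103 expands to 8 columns,
which is the case the selftest comment works through.  With a tabstop of 8,
the line used by test_tab_expansion measures 35 display columns.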
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index ed52bc03d17..1b6c9845892 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "selftest.h"
 #include "selftest-diagnostic.h"
 #include "opts.h"
+#include "cpplib.h"
 
 #ifdef HAVE_TERMIOS_H
 # include <termios.h>
@@ -219,6 +220,9 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
   context->min_margin_width = 0;
   context->show_ruler_p = false;
   context->parseable_fixits_p = false;
+  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
+  context->column_origin = 1;
+  context->tabstop = 8;
   context->edit_context_ptr = NULL;
   context->diagnostic_group_nesting_depth = 0;
   context->diagnostic_group_emission_count = 0;
@@ -353,8 +357,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)
   return diagnostic_kind_color[kind];
 }
 
+/* Given an expanded_location, convert the column (which is in 1-based bytes)
+   to the requested units and origin.  Return -1 if the column is
+   invalid (<= 0).  */
+int
+diagnostic_converted_column (diagnostic_context *context, expanded_location s)
+{
+  if (s.column <= 0)
+    return -1;
+
+  int one_based_col;
+  switch (context->column_unit)
+    {
+    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:
+      one_based_col = location_compute_display_column (s, context->tabstop);
+      break;
+
+    case DIAGNOSTICS_COLUMN_UNIT_BYTE:
+      one_based_col = s.column;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return one_based_col + (context->column_origin - 1);
+}
+
 /* Return a formatted line and column ':%line:%column'.  Elided if
-   zero.  The result is a statically allocated buffer.  */
+   line == 0 or col < 0.  (A column of 0 may be valid due to the
+   -fdiagnostics-column-origin option.)
+   The result is a statically allocated buffer.  */
 
 static const char *
 maybe_line_and_column (int line, int col)
@@ -363,8 +396,9 @@ maybe_line_and_column (int line, int col)
 
   if (line)
     {
-      size_t l = snprintf (result, sizeof (result),
-			   col ? ":%d:%d" : ":%d", line, col);
+      size_t l
+	= snprintf (result, sizeof (result),
+		    col >= 0 ? ":%d:%d" : ":%d", line, col);
       gcc_checking_assert (l < sizeof (result));
     }
   else
@@ -383,8 +417,14 @@ diagnostic_get_location_text (diagnostic_context *context,
   const char *locus_cs = colorize_start (pp_show_color (pp), "locus");
   const char *locus_ce = colorize_stop (pp_show_color (pp));
   const char *file = s.file ? s.file : progname;
-  int line = strcmp (file, N_("<built-in>")) ? s.line : 0;
-  int col = context->show_column ? s.column : 0;
+  int line = 0;
+  int col = -1;
+  if (strcmp (file, N_("<built-in>")))
+    {
+      line = s.line;
+      if (context->show_column)
+	col = diagnostic_converted_column (context, s);
+    }
 
   const char *line_col = maybe_line_and_column (line, col);
   return build_message_string ("%s%s%s:%s", locus_cs, file,
@@ -650,14 +690,20 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
       if (! MAIN_FILE_P (map))
 	{
 	  bool first = true;
+	  expanded_location s = {};
 	  do
 	    {
 	      where = linemap_included_from (map);
 	      map = linemap_included_from_linemap (line_table, map);
-	      const char *line_col
-		= maybe_line_and_column (SOURCE_LINE (map, where),
-					 first && context->show_column
-					 ? SOURCE_COLUMN (map, where) : 0);
+	      s.file = LINEMAP_FILE (map);
+	      s.line = SOURCE_LINE (map, where);
+	      int col = -1;
+	      if (first && context->show_column)
+		{
+		  s.column = SOURCE_COLUMN (map, where);
+		  col = diagnostic_converted_column (context, s);
+		}
+	      const char *line_col = maybe_line_and_column (s.line, col);
 	      static const char *const msgs[] =
 		{
 		 N_("In file included from"),
@@ -666,7 +712,7 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
 	      unsigned index = !first;
 	      pp_verbatim (context->printer, "%s%s %r%s%s%R",
 			   first ? "" : ",\n", _(msgs[index]),
-			   "locus", LINEMAP_FILE (map), line_col);
+			   "locus", s.file, line_col);
 	      first = false;
 	    }
 	  while (! MAIN_FILE_P (map));
@@ -2042,10 +2088,15 @@ test_print_parseable_fixits_replace ()
 static void
 assert_location_text (const char *expected_loc_text,
 		      const char *filename, int line, int column,
-		      bool show_column)
+		      bool show_column,
+		      int origin = 1,
+		      enum diagnostics_column_unit column_unit
+			= DIAGNOSTICS_COLUMN_UNIT_BYTE)
 {
   test_diagnostic_context dc;
   dc.show_column = show_column;
+  dc.column_unit = column_unit;
+  dc.column_origin = origin;
 
   expanded_location xloc;
   xloc.file = filename;
@@ -2069,7 +2120,10 @@ test_diagnostic_get_location_text ()
   assert_location_text ("PROGNAME:", NULL, 0, 0, true);
   assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);
   assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);
-  assert_location_text ("foo.c:42:", "foo.c", 42, 0, true);
+  assert_location_text ("foo.c:42:9:", "foo.c", 42, 10, true, 0);
+  assert_location_text ("foo.c:42:1010:", "foo.c", 42, 10, true, 1001);
+  for (int origin = 0; origin != 2; ++origin)
+    assert_location_text ("foo.c:42:", "foo.c", 42, 0, true, origin);
   assert_location_text ("foo.c:", "foo.c", 0, 10, true);
   assert_location_text ("foo.c:42:", "foo.c", 42, 10, false);
   assert_location_text ("foo.c:", "foo.c", 0, 10, false);
@@ -2077,6 +2131,41 @@ test_diagnostic_get_location_text ()
   maybe_line_and_column (INT_MAX, INT_MAX);
   maybe_line_and_column (INT_MIN, INT_MIN);
 
+  {
+    /* In order to test display columns vs byte columns, we need to create a
+       file for location_get_source_line() to read.  */
+
+    const char *const content = "smile \xf0\x9f\x98\x82\n";
+    const int line_bytes = strlen (content) - 1;
+    const int def_tabstop = 8;
+    const int display_width = cpp_display_width (content, line_bytes,
+						 def_tabstop);
+    ASSERT_EQ (line_bytes - 2, display_width);
+    temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+    const char *const fname = tmp.get_filename ();
+    const int buf_len = strlen (fname) + 16;
+    char *const expected = XNEWVEC (char, buf_len);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    XDELETEVEC (expected);
+  }
+
+
   progname = old_progname;
 }
 
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 307dbcfb34a..75706c5f4d8 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -24,6 +24,20 @@ along with GCC; see the file COPYING3.  If not see
 #include "pretty-print.h"
 #include "diagnostic-core.h"
 
+/* An enum for controlling what units to use for the column number
+   when diagnostics are output, used by the -fdiagnostics-column-unit option.
+   Tabs will be expanded or not according to the value of -ftabstop.  The origin
+   (default 1) is controlled by -fdiagnostics-column-origin.  */
+
+enum diagnostics_column_unit
+{
+  /* The new default: display columns.  */
+  DIAGNOSTICS_COLUMN_UNIT_DISPLAY,
+
+  /* The historical behavior: simple bytes.  */
+  DIAGNOSTICS_COLUMN_UNIT_BYTE
+};
+
 /* Enum for overriding the standard output format.  */
 
 enum diagnostics_output_format
@@ -280,6 +294,15 @@ struct diagnostic_context
      rest of the diagnostic.  */
   bool parseable_fixits_p;
 
+  /* What units to use when outputting the column number.  */
+  enum diagnostics_column_unit column_unit;
+
+  /* The origin for the column number (1-based or 0-based typically).  */
+  int column_origin;
+
+  /* The size of the tabstop for tab expansion.  */
+  int tabstop;
+
   /* If non-NULL, an edit_context to which fix-it hints should be
      applied, for generating patches.  */
   edit_context *edit_context_ptr;
@@ -458,6 +481,8 @@ diagnostic_same_line (const diagnostic_context *context,
 }
 
 extern const char *diagnostic_get_color_for_kind (diagnostic_t kind);
+extern int diagnostic_converted_column (diagnostic_context *context,
+					expanded_location s);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
@@ -470,6 +495,7 @@ extern void diagnostic_output_format_init (diagnostic_context *,
 /* Compute the number of digits in the decimal representation of an integer.  */
 extern int num_digits (int);
 
-extern json::value *json_from_expanded_location (location_t loc);
+extern json::value *json_from_expanded_location (diagnostic_context *context,
+						 location_t loc);
 
 #endif /* ! GCC_DIAGNOSTIC_H */
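
Reviewer note on the diagnostic.h changes above: the selftests earlier in this patch expect line_bytes - 1 and display_width - 1 when the origin is 0. That suggests the reported column is the 1-based column in the selected unit, shifted by the origin. The following is a minimal sketch of that arithmetic only. It is not the body of diagnostic_converted_column, and the names column_unit, sketch_converted_column and its parameters are hypothetical.

```c
/* Hypothetical stand-in for enum diagnostics_column_unit.  */
enum column_unit { UNIT_DISPLAY, UNIT_BYTE };

/* Shift a 1-based column to the requested origin.  BYTE_COL and
   DISPLAY_COL are assumed to be the 1-based byte and display columns
   of the same source position.  */
int
sketch_converted_column (enum column_unit unit, int origin,
                         int byte_col, int display_col)
{
  /* Pick the unit first, then move the origin.  */
  int one_based = (unit == UNIT_DISPLAY) ? display_col : byte_col;
  return one_based + origin - 1;
}
```

With byte column 11 and display column 18 (the first case in the diagnostic-units tests), this gives 10 for byte units at origin 0, 17 for display units at origin 0, and 110 for byte units at origin 100, matching the dg-warning columns in diagnostic-units-4.c, -5.c and -6.c.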
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 06a04e3d7dd..f463275bc8b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -292,7 +292,9 @@ Objective-C and Objective-C++ Dialects}.
 -fdiagnostics-show-template-tree  -fno-elide-type @gol
 -fdiagnostics-path-format=@r{[}none@r{|}separate-events@r{|}inline-events@r{]} @gol
 -fdiagnostics-show-path-depths @gol
--fno-show-column}
+-fno-show-column @gol
+-fdiagnostics-column-unit=@r{[}display@r{|}byte@r{]} @gol
+-fdiagnostics-column-origin=@var{origin}}
 
 @item Warning Options
 @xref{Warning Options,,Options to Request or Suppress Warnings}.
@@ -4729,6 +4731,31 @@ Do not print column numbers in diagnostics.  This may be necessary if
 diagnostics are being scanned by a program that does not understand the
 column numbers, such as @command{dejagnu}.
 
+@item -fdiagnostics-column-unit=@var{UNIT}
+@opindex fdiagnostics-column-unit
+Select the units for the column number.  This affects traditional diagnostics
+(in the absence of @option{-fno-show-column}), as well as JSON format
+diagnostics if requested.
+
+The default @var{UNIT}, @samp{display}, considers the number of display
+columns occupied by each character.  This may be larger than the number
+of bytes required to encode the character, in the case of tab
+characters, or it may be smaller, in the case of multibyte characters.
+For example, the character ``@U{03C0}'' occupies one display column,
+and its UTF-8 encoding requires two bytes; the character ``@U{1F642}''
+occupies two display columns, and its UTF-8 encoding requires four
+bytes.
+
+Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte
+count in all cases, as was traditionally output by GCC prior to version 11.1.0.
+
+@item -fdiagnostics-column-origin=@var{ORIGIN}
+@opindex fdiagnostics-column-origin
+Select the origin for column numbers, i.e. the column number assigned to the
+first column.  The default value of 1 corresponds to traditional GCC
+behavior and to the GNU style guide.  Some utilities may perform better with an
+origin of 0; any non-negative value may be specified.
+
 @item -fdiagnostics-format=@var{FORMAT}
 @opindex fdiagnostics-format
 Select a different format for printing diagnostics.
@@ -4764,11 +4791,15 @@ might be printed in JSON form (after formatting) like this:
         "locations": [
             @{
                 "caret": @{
+		    "display-column": 3,
+		    "byte-column": 3,
                     "column": 3,
                     "file": "misleading-indentation.c",
                     "line": 15
                 @},
                 "finish": @{
+		    "display-column": 4,
+		    "byte-column": 4,
                     "column": 4,
                     "file": "misleading-indentation.c",
                     "line": 15
@@ -4784,6 +4815,8 @@ might be printed in JSON form (after formatting) like this:
                 "locations": [
                     @{
                         "caret": @{
+			    "display-column": 5,
+			    "byte-column": 5,
                             "column": 5,
                             "file": "misleading-indentation.c",
                             "line": 17
@@ -4793,6 +4826,7 @@ might be printed in JSON form (after formatting) like this:
                 "message": "...this statement, but the latter is @dots{}"
             @}
         ]
+	"column-origin": 1,
     @},
     @dots{}
 ]
@@ -4805,10 +4839,34 @@ A diagnostic has a @code{kind}.  If this is @code{warning}, then there is
 an @code{option} key describing the command-line option controlling the
 warning.
 
-A diagnostic can contain zero or more locations.  Each location has up
-to three positions within it: a @code{caret} position and optional
-@code{start} and @code{finish} positions.  A location can also have
-an optional @code{label} string.  For example, this error:
+A diagnostic can contain zero or more locations.  Each location has an
+optional @code{label} string and up to three positions within it: a
+@code{caret} position and optional @code{start} and @code{finish} positions.
+A position is described by a @code{file} name, a @code{line} number, and
+three numbers indicating a column position:
+@itemize @bullet
+
+@item
+@code{display-column} counts display columns, accounting for tabs and
+multibyte characters.
+
+@item
+@code{byte-column} counts raw bytes.
+
+@item
+@code{column} is equal to one of
+the previous two, as dictated by the @option{-fdiagnostics-column-unit}
+option.
+
+@end itemize
+All three columns are relative to the origin specified by
+@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may
+be set, for instance, to 0 for compatibility with other utilities that
+number columns from 0.  The column origin is recorded in the JSON output in
+the @code{column-origin} tag.  In the remaining examples below, the extra
+column number outputs have been omitted for brevity.
+
+For example, this error:
 
 @smallexample
 bad-binary-ops.c:64:23: error: invalid operands to binary + (have 'S' @{aka
diff --git a/gcc/input.c b/gcc/input.c
index dd1d23df2f7..d573b90341a 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -913,7 +913,7 @@ make_location (location_t caret, source_range src_range)
    source line in order to calculate the display width.  If that cannot be done
    for any reason, then returns the byte column as a fallback.  */
 int
-location_compute_display_column (expanded_location exploc)
+location_compute_display_column (expanded_location exploc, int tabstop)
 {
   if (!(exploc.file && *exploc.file && exploc.line && exploc.column))
     return exploc.column;
@@ -921,7 +921,7 @@ location_compute_display_column (expanded_location exploc)
   /* If line is NULL, this function returns exploc.column which is the
      desired fallback.  */
   return cpp_byte_column_to_display_column (line.get_buffer (), line.length (),
-					    exploc.column);
+					    exploc.column, tabstop);
 }
 
 /* Dump statistics to stderr about the memory usage of the line_table
@@ -3608,33 +3608,46 @@ test_line_offset_overflow ()
 
 void test_cpp_utf8 ()
 {
+  const int def_tabstop = 8;
   /* Verify that wcwidth of invalid UTF-8 or control bytes is 1.  */
   {
-    int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8);
+    int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8, def_tabstop);
     ASSERT_EQ (8, w_bad);
-    int w_ctrl = cpp_display_width ("\r\t\n\v\0\1", 6);
-    ASSERT_EQ (6, w_ctrl);
+    int w_ctrl = cpp_display_width ("\r\n\v\0\1", 5, def_tabstop);
+    ASSERT_EQ (5, w_ctrl);
   }
 
   /* Verify that wcwidth of valid UTF-8 is as expected.  */
   {
-    const int w_pi = cpp_display_width ("\xcf\x80", 2);
+    const int w_pi = cpp_display_width ("\xcf\x80", 2, def_tabstop);
     ASSERT_EQ (1, w_pi);
-    const int w_emoji = cpp_display_width ("\xf0\x9f\x98\x82", 4);
+    const int w_emoji = cpp_display_width ("\xf0\x9f\x98\x82", 4, def_tabstop);
     ASSERT_EQ (2, w_emoji);
-    const int w_umlaut_precomposed = cpp_display_width ("\xc3\xbf", 2);
+    const int w_umlaut_precomposed = cpp_display_width ("\xc3\xbf", 2,
+							def_tabstop);
     ASSERT_EQ (1, w_umlaut_precomposed);
-    const int w_umlaut_combining = cpp_display_width ("y\xcc\x88", 3);
+    const int w_umlaut_combining = cpp_display_width ("y\xcc\x88", 3,
+						      def_tabstop);
     ASSERT_EQ (1, w_umlaut_combining);
-    const int w_han = cpp_display_width ("\xe4\xb8\xba", 3);
+    const int w_han = cpp_display_width ("\xe4\xb8\xba", 3, def_tabstop);
     ASSERT_EQ (2, w_han);
-    const int w_ascii = cpp_display_width ("GCC", 3);
+    const int w_ascii = cpp_display_width ("GCC", 3, def_tabstop);
     ASSERT_EQ (3, w_ascii);
     const int w_mixed = cpp_display_width ("\xcf\x80 = 3.14 \xf0\x9f\x98\x82"
-					   "\x9f! \xe4\xb8\xba y\xcc\x88", 24);
+					   "\x9f! \xe4\xb8\xba y\xcc\x88",
+					   24, def_tabstop);
     ASSERT_EQ (18, w_mixed);
   }
 
+  /* Verify that display width properly expands tabs.  */
+  {
+    const char *tstr = "\tabc\td";
+    ASSERT_EQ (6, cpp_display_width (tstr, 6, 1));
+    ASSERT_EQ (10, cpp_display_width (tstr, 6, 3));
+    ASSERT_EQ (17, cpp_display_width (tstr, 6, 8));
+    ASSERT_EQ (1, cpp_display_column_to_byte_column (tstr, 6, 7, 8));
+  }
+
   /* Verify that cpp_byte_column_to_display_column can go past the end,
      and similar edge cases.  */
   {
@@ -3645,10 +3658,13 @@ void test_cpp_utf8 ()
       /* 111122223456
 	 Byte columns.  */
 
-    ASSERT_EQ (5, cpp_display_width (str, 6));
-    ASSERT_EQ (105, cpp_byte_column_to_display_column (str, 6, 106));
-    ASSERT_EQ (10000, cpp_byte_column_to_display_column (NULL, 0, 10000));
-    ASSERT_EQ (0, cpp_byte_column_to_display_column (NULL, 10000, 0));
+    ASSERT_EQ (5, cpp_display_width (str, 6, def_tabstop));
+    ASSERT_EQ (105,
+	       cpp_byte_column_to_display_column (str, 6, 106, def_tabstop));
+    ASSERT_EQ (10000,
+	       cpp_byte_column_to_display_column (NULL, 0, 10000, def_tabstop));
+    ASSERT_EQ (0,
+	       cpp_byte_column_to_display_column (NULL, 10000, 0, def_tabstop));
   }
 
   /* Verify that cpp_display_column_to_byte_column can go past the end,
@@ -3662,21 +3678,25 @@ void test_cpp_utf8 ()
       /* 000000000000000000000000000000000111111
 	 111122223333444456666777788889999012345
 	 Byte columns.  */
-    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 2));
-    ASSERT_EQ (15, cpp_display_column_to_byte_column (str, 15, 11));
-    ASSERT_EQ (115, cpp_display_column_to_byte_column (str, 15, 111));
-    ASSERT_EQ (10000, cpp_display_column_to_byte_column (NULL, 0, 10000));
-    ASSERT_EQ (0, cpp_display_column_to_byte_column (NULL, 10000, 0));
+    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 2, def_tabstop));
+    ASSERT_EQ (15,
+	       cpp_display_column_to_byte_column (str, 15, 11, def_tabstop));
+    ASSERT_EQ (115,
+	       cpp_display_column_to_byte_column (str, 15, 111, def_tabstop));
+    ASSERT_EQ (10000,
+	       cpp_display_column_to_byte_column (NULL, 0, 10000, def_tabstop));
+    ASSERT_EQ (0,
+	       cpp_display_column_to_byte_column (NULL, 10000, 0, def_tabstop));
 
     /* Verify that we do not interrupt a UTF-8 sequence.  */
-    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 1));
+    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 1, def_tabstop));
 
     for (int byte_col = 1; byte_col <= 15; ++byte_col)
       {
-	const int disp_col = cpp_byte_column_to_display_column (str, 15,
-								byte_col);
-	const int byte_col2 = cpp_display_column_to_byte_column (str, 15,
-								 disp_col);
+	const int disp_col
+	  = cpp_byte_column_to_display_column (str, 15, byte_col, def_tabstop);
+	const int byte_col2
+	  = cpp_display_column_to_byte_column (str, 15, disp_col, def_tabstop);
 
 	/* If we ask for the display column in the middle of a UTF-8
 	   sequence, it will return the length of the partial sequence,
diff --git a/gcc/input.h b/gcc/input.h
index df48ce63ef9..4790a571c6a 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -38,7 +38,9 @@ STATIC_ASSERT (BUILTINS_LOCATION < RESERVED_LOCATION_COUNT);
 
 extern bool is_location_from_builtin_token (location_t);
 extern expanded_location expand_location (location_t);
-extern int location_compute_display_column (expanded_location);
+
+extern int location_compute_display_column (expanded_location exploc,
+					    int tabstop);
 
 /* A class capturing the bounds of a buffer, to allow for run-time
    bounds-checking in a checked build.  */
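
Reviewer note on the new tabstop argument: the tab selftest added to test_cpp_utf8 in input.c expects "\tabc\td" to have display width 6, 10 and 17 for tabstops 1, 3 and 8. Those numbers are consistent with each tab advancing to the next multiple of the tabstop. The sketch below reproduces only that rule, under the assumption that every non-tab byte is one column wide (ASCII input, no multibyte handling) and that the tabstop is positive. The function name sketch_display_width is hypothetical and this is not cpp_display_width.

```c
#include <stddef.h>

/* Display width of the first LEN bytes of S, assuming every byte other
   than a tab occupies exactly one column (ASCII only).  A tab advances
   the width to the next multiple of TABSTOP, which must be positive.  */
int
sketch_display_width (const char *s, size_t len, int tabstop)
{
  int width = 0;
  for (size_t i = 0; i < len; i++)
    {
      if (s[i] == '\t')
        width += tabstop - width % tabstop;
      else
        width++;
    }
  return width;
}
```

Applied to the c3 line of diagnostic-units-2.c (eight spaces, "int c3 = ", then a tab, 18 bytes in total), the sketch gives a width of 24, so the character constant that follows starts at display column 25, which is the column that test expects.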
diff --git a/gcc/opts.c b/gcc/opts.c
index 340d99434b3..525f44d079f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "opt-suggestions.h"
 #include "diagnostic-color.h"
 #include "selftest.h"
+#include "cpplib.h"
 
 static void set_Wstrict_aliasing (struct gcc_options *opts, int onoff);
 
@@ -2404,6 +2405,14 @@ common_handle_option (struct gcc_options *opts,
       dc->parseable_fixits_p = value;
       break;
 
+    case OPT_fdiagnostics_column_unit_:
+      dc->column_unit = (enum diagnostics_column_unit)value;
+      break;
+
+    case OPT_fdiagnostics_column_origin_:
+      dc->column_origin = value;
+      break;
+
     case OPT_fdiagnostics_show_cwe:
       dc->show_cwe = value;
       break;
@@ -2792,6 +2801,12 @@ common_handle_option (struct gcc_options *opts,
       check_alignment_argument (loc, arg, "functions");
       break;
 
+    case OPT_ftabstop_:
+      /* It is documented that we silently ignore silly values.  */
+      if (value >= 1 && value <= 100)
+	dc->tabstop = value;
+      break;
+
     default:
       /* If the flag was handled in a standard way, assume the lack of
 	 processing here is intentional.  */
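
Reviewer note on the OPT_ftabstop_ case above: an out-of-range value leaves dc->tabstop unchanged, which is why diagnostic-units-3.c (with -ftabstop=200) still sees a tabstop of 8. A minimal sketch of that rule follows. The function sketch_apply_tabstop is hypothetical and only mirrors the range check in the hunk.

```c
/* Return the tabstop in effect after seeing -ftabstop=REQUESTED, given
   that CURRENT was in effect before.  Values outside 1..100 are
   silently ignored, mirroring the range check in the hunk above.  */
int
sketch_apply_tabstop (int current, int requested)
{
  if (requested >= 1 && requested <= 100)
    return requested;
  return current;
}
```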
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
index 870ba720c5f..2314ad42402 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
@@ -36,20 +36,20 @@ int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
 
 /* { dg-begin-multiline-output "" }
-  if ((err = foo (b)) != 0)
-  ^~
+         if ((err = foo (b)) != 0)
+         ^~
    { dg-end-multiline-output "" } */
 /* { dg-begin-multiline-output "" }
-   goto fail;
-   ^~~~
+                 goto fail;
+                 ^~~~
    { dg-end-multiline-output "" } */
 
 fail:
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
index 5cdeba1cbba..202c6bc7fdf 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
@@ -65,9 +65,9 @@ int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
@@ -178,7 +178,7 @@ void fn_16_tabs (void)
     while (flagA)
       if (flagB) /* { dg-message "7: this 'if' clause does not guard..." } */
 	foo (0);
-	foo (1);/* { dg-message "2: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+	foo (1);/* { dg-message "9: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 }
 
 void fn_17_spaces (void)
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
index 9359db48c17..740becb5548 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
@@ -8,17 +8,22 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#error message\"" } */
 
 /* { dg-regexp "\"caret\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 6" } */
+/* { dg-regexp "\"display-column\": 6" } */
+/* { dg-regexp "\"byte-column\": 6" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
index 557ccf8378b..2f24a6c6596 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
@@ -8,6 +8,7 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"warning\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#warning message\"" } */
 /* { dg-regexp "\"option\": \"-Wcpp\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */
@@ -16,11 +17,15 @@
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 8" } */
+/* { dg-regexp "\"display-column\": 8" } */
+/* { dg-regexp "\"byte-column\": 8" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
index 378205c5bf5..afe96a9048f 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
@@ -8,6 +8,7 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#warning message\"" } */
 /* { dg-regexp "\"option\": \"-Werror=cpp\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */
@@ -16,11 +17,15 @@
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 8" } */
+/* { dg-regexp "\"display-column\": 8" } */
+/* { dg-regexp "\"byte-column\": 8" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
index 2738be6548f..ae51091e0ea 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
@@ -24,15 +24,20 @@ int test (void)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 5" } */
+/* { dg-regexp "\"display-column\": 5" } */
+/* { dg-regexp "\"byte-column\": 5" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 10" } */
+/* { dg-regexp "\"display-column\": 10" } */
+/* { dg-regexp "\"byte-column\": 10" } */
 
 /* The outer diagnostic.  */
 
 /* { dg-regexp "\"kind\": \"warning\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"this 'if' clause does not guard...\"" } */
 /* { dg-regexp "\"option\": \"-Wmisleading-indentation\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wmisleading-indentation\"" } */
@@ -41,11 +46,15 @@ int test (void)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 6" } */
 /* { dg-regexp "\"column\": 3" } */
+/* { dg-regexp "\"display-column\": 3" } */
+/* { dg-regexp "\"byte-column\": 3" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 6" } */
 /* { dg-regexp "\"column\": 4" } */
+/* { dg-regexp "\"display-column\": 4" } */
+/* { dg-regexp "\"byte-column\": 4" } */
 
 /* More from the nested diagnostic (we can't guarantee what order the
    "file" keys are consumed).  */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
index f36e896d228..e0e9ce4be98 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
@@ -13,6 +13,7 @@ int test (struct s *ptr)
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \".*\"" } */
 
 /* Verify fix-it hints.  */
@@ -23,11 +24,15 @@ int test (struct s *ptr)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 15" } */
+/* { dg-regexp "\"display-column\": 15" } */
+/* { dg-regexp "\"byte-column\": 15" } */
 
 /* { dg-regexp "\"next\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 21" } */
+/* { dg-regexp "\"display-column\": 21" } */
+/* { dg-regexp "\"byte-column\": 21" } */
 
 /* { dg-regexp "\"fixits\": \[\[\{\}, \]*\]" } */
 
@@ -35,11 +40,15 @@ int test (struct s *ptr)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 15" } */
+/* { dg-regexp "\"display-column\": 15" } */
+/* { dg-regexp "\"byte-column\": 15" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 20" } */
+/* { dg-regexp "\"display-column\": 20" } */
+/* { dg-regexp "\"byte-column\": 20" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-1.c b/gcc/testsuite/c-c++-common/diagnostic-units-1.c
new file mode 100644
index 00000000000..8d38b7de03e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-2.c b/gcc/testsuite/c-c++-common/diagnostic-units-2.c
new file mode 100644
index 00000000000..29a2edefd9f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-2.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -Wmultichar" } */
+
+/* column units: display (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "25: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-3.c b/gcc/testsuite/c-c++-common/diagnostic-units-3.c
new file mode 100644
index 00000000000..714ee8f2de4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=200 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via fallback from overly large argument) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-4.c b/gcc/testsuite/c-c++-common/diagnostic-units-4.c
new file mode 100644
index 00000000000..f9c9da914b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-4.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 0 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "10: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-5.c b/gcc/testsuite/c-c++-common/diagnostic-units-5.c
new file mode 100644
index 00000000000..99d5299a732
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */
+
+/* column units: display (via arg)
+   column origin: 0 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "24: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-6.c b/gcc/testsuite/c-c++-common/diagnostic-units-6.c
new file mode 100644
index 00000000000..c1e6e4ed477
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-6.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=100 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 100 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "110: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "117: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "118: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-7.c b/gcc/testsuite/c-c++-common/diagnostic-units-7.c
new file mode 100644
index 00000000000..dab221ae235
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-7.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 9 (via arg) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c1 = 'c1';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c2 = 'c2';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+         int c3 = 	'c3'; /* { dg-warning "20: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c3 =          'c3';
+                            ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-8.c b/gcc/testsuite/c-c++-common/diagnostic-units-8.c
new file mode 100644
index 00000000000..d713b32dabc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-8.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */
+
+/* column units: display (via default)
+   column origin: 1 (via default)
+   tabstop: 9 (via arg) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c1 = 'c1';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c2 = 'c2';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+         int c3 = 	'c3'; /* { dg-warning "28: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c3 =          'c3';
+                            ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/missing-close-symbol.c b/gcc/testsuite/c-c++-common/missing-close-symbol.c
index abeb83748c1..9f1de3d0c47 100644
--- a/gcc/testsuite/c-c++-common/missing-close-symbol.c
+++ b/gcc/testsuite/c-c++-common/missing-close-symbol.c
@@ -24,9 +24,9 @@ void test_static_assert_different_line (void)
   _Static_assert(sizeof(int) >= sizeof(char), /* { dg-message "to match this '\\('" } */
 		 "msg"; /* { dg-error "expected '\\)' before ';' token" } */
   /* { dg-begin-multiline-output "" }
-    "msg";
-         ^
-         )
+                  "msg";
+                       ^
+                       )
      { dg-end-multiline-output "" } */
   /* { dg-begin-multiline-output "" }
    _Static_assert(sizeof(int) >= sizeof(char),
diff --git a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
index fab5849dfc7..ebbf3001055 100644
--- a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
+++ b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
@@ -33,10 +33,10 @@ int test_2 (void)
            ~~~~~~~~~~~~~~~~
                          |
                          s
-    + some_other_function ());
-    ^ ~~~~~~~~~~~~~~~~~~~~~~
-                          |
-                          t
+           + some_other_function ());
+           ^ ~~~~~~~~~~~~~~~~~~~~~~
+                                 |
+                                 t
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/g++.dg/parse/error4.C b/gcc/testsuite/g++.dg/parse/error4.C
index 792bf4dc063..fe8de73790d 100644
--- a/gcc/testsuite/g++.dg/parse/error4.C
+++ b/gcc/testsuite/g++.dg/parse/error4.C
@@ -7,4 +7,4 @@ struct X {
 		 int);
 };
 
-// { dg-error "4:'itn' has not been declared" "" { target *-*-* } 6 }
+// { dg-error "18:'itn' has not been declared" "" { target *-*-* } 6 }
diff --git a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
index 96ebb71645c..d2b37a5122d 100644
--- a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
+++ b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
@@ -9,13 +9,13 @@ class A {
 	int	h;
 	A() { i=10; j=20; }
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "16:virtual functions cannot be friends" }
 };
 
 class B : public A {
     public:
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }// { dg-error "" }  member.*// ERROR -  member.*
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "16:virtual functions cannot be friends" }
 // { dg-error "private" "" { target *-*-* } .-1 }
 };
 
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
index b438543d445..bbc9e51aff6 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
@@ -12,5 +12,5 @@ int
 main()
 {
 	C<char*>	c;
-	char*		p = Z(c.O); //{ dg-error "13:'Z' was not declared" } ambiguous c.O
+	char*		p = Z(c.O); //{ dg-error "29:'Z' was not declared" } ambiguous c.O
 }
diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
index 6dc2c55be58..b98e8da6b1e 100644
--- a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
+++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
@@ -48,8 +48,8 @@ ostream& operator<<(ostream& os, Graph<VertexType,EdgeType>& G)
 
         // The compiler does not like this line!!!!!!
         typename Graph<VertexType, EdgeType>::Successor::iterator
-	  startN = G[i].second.begin(), // { dg-error "14:no match" } no index operator
-	  endN   = G[i].second.end();  // { dg-error "14:no match" } no index operator
+	  startN = G[i].second.begin(), // { dg-error "21:no match" } no index operator
+	  endN   = G[i].second.end();  // { dg-error "21:no match" } no index operator
 
         while(startN != endN)
         {
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
index c5ff96e5644..51190c92391 100644
--- a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
@@ -288,7 +288,7 @@ int test_3 (int x, int y)
     |      |     ~~~~~~~~~~
     |      |     |
     |      |     (4) ...to here
-    |   NN |      to dereference it above
+    |   NN |                    to dereference it above
     |   NN |   return *ptr;
     |      |          ~~~~
     |      |          |
diff --git a/gcc/testsuite/gcc.dg/bad-binary-ops.c b/gcc/testsuite/gcc.dg/bad-binary-ops.c
index 46c158e6a5f..45668be0a29 100644
--- a/gcc/testsuite/gcc.dg/bad-binary-ops.c
+++ b/gcc/testsuite/gcc.dg/bad-binary-ops.c
@@ -35,10 +35,10 @@ int test_2 (void)
            ~~~~~~~~~~~~~~~~
            |
            struct s
-    + some_other_function ());
-    ^ ~~~~~~~~~~~~~~~~~~~~~~
-      |
-      struct t
+           + some_other_function ());
+           ^ ~~~~~~~~~~~~~~~~~~~~~~
+             |
+             struct t
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/format/branch-1.c b/gcc/testsuite/gcc.dg/format/branch-1.c
index 1782064645e..4ea39b52b2e 100644
--- a/gcc/testsuite/gcc.dg/format/branch-1.c
+++ b/gcc/testsuite/gcc.dg/format/branch-1.c
@@ -10,7 +10,7 @@ foo (long l, int nfoo)
 {
   printf ((nfoo > 1) ? "%d foos" : "%d foo", nfoo);
   printf ((l > 1) ? "%d foos" /* { dg-warning "23:int" "wrong type in conditional expr" } */
-	          : "%d foo", l); /* { dg-warning "16:int" "wrong type in conditional expr" } */
+	          : "%d foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%ld foos" : "%d foo", l); /* { dg-warning "36:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%d foos" : "%ld foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   /* Should allow one case to have extra arguments.  */
diff --git a/gcc/testsuite/gcc.dg/format/pr79210.c b/gcc/testsuite/gcc.dg/format/pr79210.c
index 71f5dd6e082..6bdabdf21ec 100644
--- a/gcc/testsuite/gcc.dg/format/pr79210.c
+++ b/gcc/testsuite/gcc.dg/format/pr79210.c
@@ -20,4 +20,4 @@ LPFC_VPORT_ATTR_R(peer_port_login,
 		  "Allow peer ports on the same physical port to login to each "
 		  "other.");
 
-/* { dg-warning "6: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
+/* { dg-warning "20: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
index 03b78042107..d7691e4be51 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -540,15 +540,15 @@ void test_builtin_types_compatible_p (unsigned long i)
   __emit_expression_range (0,
 			   f (i) + __builtin_types_compatible_p (long, int)); /* { dg-warning "range" } */
 /* { dg-begin-multiline-output "" }
-       f (i) + __builtin_types_compatible_p (long, int));
-       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                            f (i) + __builtin_types_compatible_p (long, int));
+                            ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    { dg-end-multiline-output "" } */
 
   __emit_expression_range (0,
 			   __builtin_types_compatible_p (long, int) + f (i)); /* { dg-warning "range" } */
 /* { dg-begin-multiline-output "" }
-       __builtin_types_compatible_p (long, int) + f (i));
-       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
+                            __builtin_types_compatible_p (long, int) + f (i));
+                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
    { dg-end-multiline-output "" } */
 }
 
@@ -671,8 +671,8 @@ void test_multiple_ordinary_maps (void)
 /* { dg-begin-multiline-output "" }
    __emit_expression_range (0, foo (0,
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-        "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));
-        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));
+                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    { dg-end-multiline-output "" } */
 
   /* Another expression that transitions between ordinary maps; this
@@ -685,8 +685,8 @@ void test_multiple_ordinary_maps (void)
 /* { dg-begin-multiline-output "" }
    __emit_expression_range (0, foo (0, "012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
 7890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
 4567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789",
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-        0));
-        ~~                      
+                                    0));
+                                    ~~
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
index ac4fa1b52bd..4cba87be2ae 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
@@ -335,11 +335,11 @@ pr87652 (const char *stem, int counter)
 /* { dg-error "unable to read substring location: unable to read source line" "" { target c } 329 } */
 /* { dg-error "unable to read substring location: failed to get ordinary maps" "" { target c++ } 329 } */
 /* { dg-begin-multiline-output "" }
-     __emit_string_literal_range(__FILE__":%5d: " format, \
+     __emit_string_literal_range(__FILE__":%5d: " format,        \
                                  ^~~~~~~~
      { dg-end-multiline-output "" { target c } } */
 /* { dg-begin-multiline-output "" }
-     __emit_string_literal_range(__FILE__":%5d: " format, \
+     __emit_string_literal_range(__FILE__":%5d: " format,        \
                                  ^
      { dg-end-multiline-output "" { target c++ } } */
 
diff --git a/gcc/testsuite/gcc.dg/redecl-4.c b/gcc/testsuite/gcc.dg/redecl-4.c
index 8f124886da8..2c214bb02c7 100644
--- a/gcc/testsuite/gcc.dg/redecl-4.c
+++ b/gcc/testsuite/gcc.dg/redecl-4.c
@@ -15,7 +15,7 @@ f (void)
     /* Should get format warnings even though the built-in declaration
        isn't "visible".  */
     printf (
-	    "%s", 1); /* { dg-warning "8:format" } */
+	    "%s", 1); /* { dg-warning "15:format" } */
     /* The type of strcmp here should have no prototype.  */
     if (0)
       strcmp (1);
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
index 7fade1f65fc..606fe0f891a 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
@@ -8,17 +8,22 @@
 ! We can't rely on any ordering of the keys.
 
 ! { dg-regexp "\"kind\": \"error\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#error message\"" }
 
 ! { dg-regexp "\"caret\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 6" }
+! { dg-regexp "\"display-column\": 6" }
+! { dg-regexp "\"byte-column\": 6" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
index bebcf68d431..56615f0ca5a 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
@@ -8,6 +8,7 @@
 ! We can't rely on any ordering of the keys. 
 
 ! { dg-regexp "\"kind\": \"warning\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#warning message\"" }
 ! { dg-regexp "\"option\": \"-Wcpp\"" }
 ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }
@@ -16,11 +17,15 @@
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 8" }
+! { dg-regexp "\"display-column\": 8" }
+! { dg-regexp "\"byte-column\": 8" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
index 7ab78eb570b..50214759091 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
@@ -8,6 +8,7 @@
 ! We can't rely on any ordering of the keys.
 
 ! { dg-regexp "\"kind\": \"error\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#warning message\"" }
 ! { dg-regexp "\"option\": \"-Werror=cpp\"" }
 ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }
@@ -16,11 +17,15 @@
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 8" }
+! { dg-regexp "\"display-column\": 8" }
+! { dg-regexp "\"byte-column\": 8" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/go.dg/arrayclear.go b/gcc/testsuite/go.dg/arrayclear.go
index 6daebc0b8f5..aa5ba0761d7 100644
--- a/gcc/testsuite/go.dg/arrayclear.go
+++ b/gcc/testsuite/go.dg/arrayclear.go
@@ -1,5 +1,8 @@
 // { dg-do compile }
 // { dg-options "-fgo-debug-optimization" }
+// This comment is necessary to work around a dejagnu bug. Otherwise, the
+// column of the second error message would equal the row of the first one, and
+// since the errors are also identical, dejagnu is not able to distinguish them.
 
 package p
 
diff --git a/gcc/tree-diagnostic-path.cc b/gcc/tree-diagnostic-path.cc
index 381a49cb0b4..82b3c2d6b6a 100644
--- a/gcc/tree-diagnostic-path.cc
+++ b/gcc/tree-diagnostic-path.cc
@@ -493,7 +493,7 @@ default_tree_diagnostic_path_printer (diagnostic_context *context,
    doesn't have access to trees (for m_fndecl).  */
 
 json::value *
-default_tree_make_json_for_path (diagnostic_context *,
+default_tree_make_json_for_path (diagnostic_context *context,
 				 const diagnostic_path *path)
 {
   json::array *path_array = new json::array ();
@@ -504,7 +504,8 @@ default_tree_make_json_for_path (diagnostic_context *,
       json::object *event_obj = new json::object ();
       if (event.get_location ())
 	event_obj->set ("location",
-			json_from_expanded_location (event.get_location ()));
+			json_from_expanded_location (context,
+						     event.get_location ()));
       label_text event_text (event.get_desc (false));
       event_obj->set ("description", new json::string (event_text.m_buffer));
       event_text.maybe_free ();
diff --git a/libcpp/charset.c b/libcpp/charset.c
index db47235b847..28b81c9c864 100644
--- a/libcpp/charset.c
+++ b/libcpp/charset.c
@@ -2276,49 +2276,90 @@ cpp_string_location_reader::get_next ()
   return result;
 }
 
-/* Helper for cpp_byte_column_to_display_column and its inverse.  Given a
-   pointer to a UTF-8-encoded character, compute its display width.  *INBUFP
-   points on entry to the start of the UTF-8 encoding of the character, and
-   is updated to point just after the last byte of the encoding.  *INBYTESLEFTP
-   contains on entry the remaining size of the buffer into which *INBUFP
-   points, and this is also updated accordingly.  If *INBUFP does not
+cpp_display_width_computation::
+cpp_display_width_computation (const char *data, int data_length, int tabstop) :
+  m_begin (data),
+  m_next (m_begin),
+  m_bytes_left (data_length),
+  m_tabstop (tabstop),
+  m_display_cols (0)
+{
+  gcc_assert (m_tabstop > 0);
+}
+
+
+/* The main implementation function for class cpp_display_width_computation.
+   m_next points on entry to the start of the UTF-8 encoding of the next
+   character, and is updated to point just after the last byte of the encoding.
+   m_bytes_left contains on entry the remaining size of the buffer into which
+   m_next points, and this is also updated accordingly.  If m_next does not
    point to a valid UTF-8-encoded sequence, then it will be treated as a single
-   byte with display width 1.  */
+   byte with display width 1.  m_display_cols is the current display column,
+   relative to which tab stops should be expanded.  Returns the display width of
+   the codepoint just processed.  */
 
-static inline int
-compute_next_display_width (const uchar **inbufp, size_t *inbytesleftp)
+int
+cpp_display_width_computation::process_next_codepoint ()
 {
   cppchar_t c;
-  if (one_utf8_to_cppchar (inbufp, inbytesleftp, &c) != 0)
+  int next_width;
+
+  if (*m_next == '\t')
+    {
+      ++m_next;
+      --m_bytes_left;
+      next_width = m_tabstop - (m_display_cols % m_tabstop);
+    }
+  else if (one_utf8_to_cppchar ((const uchar **) &m_next, &m_bytes_left, &c)
+	   != 0)
     {
       /* Input is not convertible to UTF-8.  This could be fine, e.g. in a
 	 string literal, so don't complain.  Just treat it as if it has a width
 	 of one.  */
-      ++*inbufp;
-      --*inbytesleftp;
-      return 1;
+      ++m_next;
+      --m_bytes_left;
+      next_width = 1;
+    }
+  else
+    {
+      /*  one_utf8_to_cppchar() has updated m_next and m_bytes_left for us.  */
+      next_width = cpp_wcwidth (c);
     }
 
-  /*  one_utf8_to_cppchar() has updated inbufp and inbytesleftp for us.  */
-  return cpp_wcwidth (c);
+  m_display_cols += next_width;
+  return next_width;
+}
+
+/*  Utility to advance the byte stream by the minimum amount needed to consume
+    N display columns.  Returns the number of display columns that were
+    actually skipped.  This could be less than N, if there was not enough data,
+    or more than N, if the last character to be skipped had a sufficiently large
+    display width.  */
+int
+cpp_display_width_computation::advance_display_cols (int n)
+{
+  const int start = m_display_cols;
+  const int target = start + n;
+  while (m_display_cols < target && !done ())
+    process_next_codepoint ();
+  return m_display_cols - start;
 }
 
 /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute
     how many display columns are occupied by the first COLUMN bytes.  COLUMN
     may exceed DATA_LENGTH, in which case the phantom bytes at the end are
-    treated as if they have display width 1.  */
+    treated as if they have display width 1.  Tabs are expanded to the next tab
+    stop, relative to the start of DATA.  */
 
 int
 cpp_byte_column_to_display_column (const char *data, int data_length,
-				   int column)
+				   int column, int tabstop)
 {
-  int display_col = 0;
-  const uchar *udata = (const uchar *) data;
   const int offset = MAX (0, column - data_length);
-  size_t inbytesleft = column - offset;
-  while (inbytesleft)
-    display_col += compute_next_display_width (&udata, &inbytesleft);
-  return display_col + offset;
+  cpp_display_width_computation dw (data, column - offset, tabstop);
+  while (!dw.done ())
+    dw.process_next_codepoint ();
+  return dw.display_cols_processed () + offset;
 }
 
 /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute
@@ -2328,14 +2369,11 @@ cpp_byte_column_to_display_column (const char *data, int data_length,
 
 int
 cpp_display_column_to_byte_column (const char *data, int data_length,
-				   int display_col)
+				   int display_col, int tabstop)
 {
-  int column = 0;
-  const uchar *udata = (const uchar *) data;
-  size_t inbytesleft = data_length;
-  while (column < display_col && inbytesleft)
-      column += compute_next_display_width (&udata, &inbytesleft);
-  return data_length - inbytesleft + MAX (0, display_col - column);
+  cpp_display_width_computation dw (data, data_length, tabstop);
+  const int avail_display = dw.advance_display_cols (display_col);
+  return dw.bytes_processed () + MAX (0, display_col - avail_display);
 }
 
 /* Our own version of wcwidth().  We don't use the actual wcwidth() in glibc,
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 544735a51af..c18f455f82a 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -312,9 +312,6 @@ enum cpp_normalize_level {
    carries all the options visible to the command line.  */
 struct cpp_options
 {
-  /* Characters between tab stops.  */
-  unsigned int tabstop;
-
   /* The language we're preprocessing.  */
   enum c_lang lang;
 
@@ -1334,14 +1331,43 @@ extern const char * cpp_get_userdef_suffix
   (const cpp_token *);
 
 /* In charset.c */
+
+/* A class to manage the state while converting a UTF-8 sequence to cppchar_t
+   and computing the display width one character at a time.  */
+class cpp_display_width_computation {
+ public:
+  cpp_display_width_computation (const char *data, int data_length,
+				 int tabstop);
+  const char *next_byte () const { return m_next; }
+  int bytes_processed () const { return m_next - m_begin; }
+  int bytes_left () const { return m_bytes_left; }
+  bool done () const { return !bytes_left (); }
+  int display_cols_processed () const { return m_display_cols; }
+
+  int process_next_codepoint ();
+  int advance_display_cols (int n);
+
+ private:
+  const char *const m_begin;
+  const char *m_next;
+  size_t m_bytes_left;
+  const int m_tabstop;
+  int m_display_cols;
+};
+
+/* Convenience functions that are simple use cases for class
+   cpp_display_width_computation.  Tab characters will be expanded to spaces
+   as determined by TABSTOP.  */
 int cpp_byte_column_to_display_column (const char *data, int data_length,
-				       int column);
-inline int cpp_display_width (const char *data, int data_length)
+				       int column, int tabstop);
+inline int cpp_display_width (const char *data, int data_length,
+			      int tabstop)
 {
-    return cpp_byte_column_to_display_column (data, data_length, data_length);
+  return cpp_byte_column_to_display_column (data, data_length, data_length,
+					    tabstop);
 }
 int cpp_display_column_to_byte_column (const char *data, int data_length,
-				       int display_col);
+				       int display_col, int tabstop);
 int cpp_wcwidth (cppchar_t c);
 
 #endif /* ! LIBCPP_CPPLIB_H */
diff --git a/libcpp/init.c b/libcpp/init.c
index 63124c8161e..6e94c486059 100644
--- a/libcpp/init.c
+++ b/libcpp/init.c
@@ -190,7 +190,6 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, discard_comments) = 1;
   CPP_OPTION (pfile, discard_comments_in_macro_exp) = 1;
   CPP_OPTION (pfile, max_include_depth) = 200;
-  CPP_OPTION (pfile, tabstop) = 8;
   CPP_OPTION (pfile, operator_names) = 1;
   CPP_OPTION (pfile, warn_trigraphs) = 2;
   CPP_OPTION (pfile, warn_endif_labels) = 1;
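
The tab-expansion rule used by process_next_codepoint, and the expected columns in diagnostic-units-6.c through diagnostic-units-8.c, can be checked with a small stand-alone sketch. This is not code from the patch: the function names below are hypothetical, the input is assumed to be ASCII (every non-tab byte has display width 1, so there is no UTF-8 decoding or cpp_wcwidth lookup), and reported_column assumes that the printed column is simply the 1-based column shifted by origin - 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical ASCII-only analogue of cpp_byte_column_to_display_column:
// display width of the first COLUMN bytes of DATA.  A tab advances to the
// next multiple of TABSTOP; bytes past the end of DATA count as width 1.
int byte_column_to_display_column (const std::string &data, int column,
                                   int tabstop)
{
  assert (tabstop > 0);
  const int length = static_cast<int> (data.size ());
  const int offset = std::max (0, column - length);
  int cols = 0;
  for (int i = 0; i < column - offset; ++i)
    cols += (data[i] == '\t') ? tabstop - (cols % tabstop) : 1;
  return cols + offset;
}

// Hypothetical ASCII-only analogue of cpp_display_column_to_byte_column:
// consume bytes until at least DISPLAY_COL display columns are covered.
// A tab may overshoot; columns past the end of DATA count as one byte each.
int display_column_to_byte_column (const std::string &data, int display_col,
                                   int tabstop)
{
  assert (tabstop > 0);
  int cols = 0;
  std::size_t i = 0;
  while (cols < display_col && i < data.size ())
    {
      cols += (data[i] == '\t') ? tabstop - (cols % tabstop) : 1;
      ++i;
    }
  return static_cast<int> (i) + std::max (0, display_col - cols);
}

// Assumed model of the column shown in a diagnostic: pick the unit, then
// shift the 1-based value so that the first column is ORIGIN.
int reported_column (const std::string &line, int byte_column,
                     bool display_units, int tabstop, int origin)
{
  const int one_based
    = display_units
      ? byte_column_to_display_column (line, byte_column, tabstop)
      : byte_column;
  return one_based + origin - 1;
}
```

For the line "\tint c1 = 'c1';" with the caret at byte column 11, reported_column gives 11 for byte units, 19 for display units with a tabstop of 9, and 110 for byte units with an origin of 100, which agrees with the dg-warning columns in the three test files.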
Richard Sandiford via Gcc-patches July 10, 2020, 1:40 p.m. | #9
Hello-

May I please ping you about this patch? Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547900.html

-Lewis

On Thu, Jun 11, 2020 at 11:26:28AM -0400, Lewis Hyatt wrote:
> On Wed, Jun 10, 2020 at 12:11:00PM -0400, David Malcolm wrote:
> > Thanks for the patch; sorry about the delay in reviewing it.
> >
> > Some high-level review points
> >
> > - I like the patch overall
> >
> > - This will deserve an item in the release notes
> >
> > - I don't like adding "global_tabstop" (I don't like global
> > variables).  Is there nowhere else we can handle this? I believe
> > there's a cluster of functions in the callgraph that make use of
> > it; can we simply pass around the tabstop value instead?  "tabstop"
> > seems to have several meanings.  If I'm reading the patch correctly
> >   * "tabstop > 0" means to expand tabs so that column numbers are a
> > multiple of tabstop
> >   * "tabstop == 0" means "don't expand tabs"
> >   * "tabstop < 0" in some places means: use the global_tabstop value
> > Is it possible to eliminate global_tabstop value?  Or is there some
> > deep reason I'm missing?
> >
> > I'll do a more thorough review once that's addressed/resolved (since
> > eliminating global_tabstop might touch a few places).
> >
>
> Thanks for the feedback! The attached updated patch addresses these

> concerns. Regarding tabstop, I have removed the new static variable

> global_tabstop in charset.c. FWIW, the usage of "tabstop" arguments in the

> various new APIs did previously work a bit more consistently than you

> described. In all cases "tabstop <= 0" meant to use the default value,

> otherwise it specified the tabstop to use (with tabstop=1 naturally

> restoring the old behavior of changing tabs to a single space). In order

> for libcpp to provide this feature (callers can pass tabstop <= 0 to get a

> default, and the default can in turn be configured when processing the

> -ftabstop option), it does need to remember the default, and this has to

> be a file-level static variable because the routines need to work

> independent of any cpp_reader instance. (Some frontends don't use

> libcpp to read their input, for instance.) Anyway, I see the point that

> this file-level static, being accessible with cpp_set_tabstop() and

> cpp_get_tabstop(), is effectively just a global variable, so I have

> removed this feature, which just means that all callers need to pass the

> tabstop they want to use. Instead, I am now using the diagnostic_context

> object to remember the value passed to -ftabstop. The only place this

> involves global variables is now in c-family/c-indentation.c, where if I

> understood correctly, the only diagnostic_context available is global_dc,

> so I am getting the tabstop value from there. Please let me know if

> there's a better way to handle that? Prior to my patch, the tabstop was

> obtained from a different global variable (extern cpp_options *cpp_opts),

> so at least conservation of total globals is maintained. :)

> 
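The tabstop semantics discussed above (a tab advances to the next multiple of the tabstop, and tabstop=1 makes a tab count as a single column) can be sketched as follows. This is only an illustration, not the libcpp implementation: `display_width_of_prefix` is a hypothetical helper, it assumes every non-tab byte is one display column wide (plain ASCII), and it takes the tabstop as a required argument in the spirit of the updated patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of GCC): display width, in columns, of the
// first NBYTES bytes of LINE.  Every non-tab byte is assumed to occupy one
// column; a tab advances to the next multiple of TABSTOP.  TABSTOP must be
// positive and is always supplied by the caller (no hidden global default).
int display_width_of_prefix(const std::string &line, std::size_t nbytes,
                            int tabstop)
{
    assert(tabstop > 0);
    int cols = 0; // display columns consumed so far
    for (std::size_t i = 0; i < nbytes && i < line.size(); ++i) {
        if (line[i] == '\t')
            cols += tabstop - cols % tabstop; // jump to the next tab stop
        else
            cols += 1;
    }
    return cols;
}
```

Under these assumptions, the `x` in `"\tx"` is at byte column 2 but display column 9 with a tabstop of 8, and at display column 2 with a tabstop of 1.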

> Compared to the previous version, this one is a bit longer, since 25 or

> so call sites had to be modified to know the value of -ftabstop. Most of

> the churn is in diagnostic-show-locus.c, because there are a fair number of

> static helper functions and helper classes there, which just needed to

> receive the diagnostic_context object from their callers. I could

> have made this simpler by letting the tabstop argument default to

> something like 8 in all functions that require it... this would remove the

> need to pass it in all the selftests that are indifferent to it. I figured

> it would be better to force this argument to be passed, though, or else in

> the future it may be easy to forget to pass it where it is needed. 

> 

> > Thanks for adding docs; some nits on them:

> > 

> > > --- a/gcc/doc/invoke.texi

> > > +++ b/gcc/doc/invoke.texi

> > 

> > [...snip...]

> > 

> > > +@item -fdiagnostics-column-unit=@var{UNIT}

> > > +@opindex fdiagnostics-column-unit

> > > +Select the units for the column number.  This affects traditional diagnostics

> > > +(in the absence of @option{-fno-show-column}), as well as JSON format

> > > +diagnostics if requested.

> > > +

> > > +The default @var{UNIT}, @samp{display}, considers the number of display columns

> > > +occupied by each character.  This may be larger than the number of bytes

> > > +occupied, in the case of tab characters, or it may be smaller, in the case of

> > > +multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' occupies

> > > +two bytes and one display column, while the character ``@U{1F642}'' occupies

> > > +four bytes and two display columns.

> > 

> > This is imprecise.  A Unicode code point occupies some number of display columns,

> > and its *UTF-8 encoding* occupies some number of bytes.

> > 

> > [and my inner pedant is now thinking: what about combining diacritics? 

> > But I don't think we can ever issue a diagnostic on a diacritic; I

> > *think* we only ever care about the per-glyph level]

> > 
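The distinction drawn in the review comment (bytes belong to the UTF-8 encoding, display columns belong to the code point) can be illustrated with a small sketch. `utf8_encoded_length` is a hypothetical name, not a GCC function; it only computes the byte count of the encoding. The display width is a separate property that would come from a character-width table and is deliberately not computed here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of GCC): number of bytes in the UTF-8
// encoding of the Unicode code point CP.  Returns 0 if CP is not a valid
// Unicode scalar value (a surrogate, or above U+10FFFF).
int utf8_encoded_length(std::uint32_t cp)
{
    if (cp < 0x80)
        return 1;                       // ASCII
    if (cp < 0x800)
        return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       // surrogates cannot be encoded
    if (cp < 0x10000)
        return 3;
    if (cp <= 0x10FFFF)
        return 4;
    return 0;                           // beyond the Unicode range
}
```

For the two characters named in the quoted documentation, U+03C0 encodes to two bytes and U+1F642 to four; the documentation gives their display widths as one and two columns respectively.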

> > > +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte

> > > +count in all cases, as was traditionally output by GCC prior to version 11.1.0.

> > > +

> > > +@item -fdiagnostics-column-origin=@var{ORIGIN}

> > > +@opindex fdiagnostics-column-origin

> > > +Select the origin for column numbers, i.e. the column number assigned to the

> > > +first column.  The default value of 1 corresponds to traditional GCC

> > > +behavior and to the GNU style guide.  Some utilities may perform better with an

> > > +origin of 0; any non-negative value may be specified.

> > > +

> > >  @item -fdiagnostics-format=@var{FORMAT}

> > >  @opindex fdiagnostics-format

> > >  Select a different format for printing diagnostics.

> > 

> > [...snip...]

> > 

> > > +A diagnostic can contain zero or more locations.  Each location has an

> > > +optional @code{label} string and up to three positions within it: a

> > > +@code{caret} position and optional @code{start} and @code{finish} positions.

> > > +A position is described by a @code{file} name, a @code{line} number, and

> > > +three numbers indicating a column position: @code{display-column} counts

> > > +display columns, accounting for tabs and multibyte characters;

> > > +@code{byte-column} counts raw bytes; and @code{column} is equal to one of

> > > +the previous two, as dictated by the @option{-fdiagnostics-column-unit}

> > > +option.

> > 

> > Might be clearer to use an unordered list here for the three kinds of column.

> > 

> > > All three columns are relative to the origin specified by

> > > +@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may

> > > +be set, for instance, to 0 for compatibility with other utilities that

> > > +number columns from 0.  The column origin is recorded in the JSON output in

> > > +the @code{column-origin} tag.  In the remaining examples below, the extra

> > > +column number outputs have been omitted for brevity.

> > 

> > [...snip...]

> > 

> 

> I improved the docs along these lines.
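
As a rough sketch of how the two options described in the quoted documentation interact: the unit selects which 1-based column is reported, and the origin then shifts it so that the first column is numbered ORIGIN. The names below (`ColumnUnit`, `Position`, `reported_column`) are hypothetical and chosen for illustration only; they are not the identifiers used in the patch.

```cpp
#include <cassert>

// Hypothetical mirror of the two values accepted by
// -fdiagnostics-column-unit.
enum class ColumnUnit { Display, Byte };

// Hypothetical position record holding both 1-based column counts.
struct Position {
    int byte_col;    // 1-based column counted in raw bytes
    int display_col; // 1-based column counted in display columns
};

// Select the column in the requested unit, then shift it from the 1-based
// convention to the requested origin (origin 1 leaves it unchanged).
int reported_column(const Position &p, ColumnUnit unit, int origin)
{
    const int one_based =
        (unit == ColumnUnit::Display) ? p.display_col : p.byte_col;
    return one_based + (origin - 1);
}
```

For a character that follows a single leading tab with a tabstop of 8 (byte column 2, display column 9), this yields 9 for display units with origin 1, and 1 for byte units with origin 0.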

> 

> > Thanks again for the patch; hope this is constructive

> > Dave

> >

> 

> Thanks for your time! BTW, I did bootstrap + regtest this version as well on

> x86-64 Linux, it looks good, new tests pass and others are the same:

> 

> FAIL 97 97

> PASS 476837 477297

> UNRESOLVED 7 7

> UNSUPPORTED 11726 11726

> UNTESTED 195 195

> XFAIL 1807 1807

> XPASS 37 37

> 

> -Lewis


> From 7729ce3334b6768a25967a6dd4a0a5a2ed0923cc Mon Sep 17 00:00:00 2001

> From: Lewis Hyatt <lhyatt@gmail.com>

> Date: Wed, 10 Jun 2020 22:04:07 -0400

> Subject: [PATCH] diagnostics: Support conversion of tabs to spaces [PR49973] [PR86904]

> 

> Supports conversion of tabs to spaces when outputting diagnostics. Also

> adds -fdiagnostics-column-unit and -fdiagnostics-column-origin options to

> control how the column number is output, thereby resolving the two PRs.

> 

> gcc/c-family/ChangeLog:

> 

> 	PR other/86904

> 	* c-indentation.c (should_warn_for_misleading_indentation): Get

> 	global tabstop from the new source.

> 	* c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which

> 	is now a common option.

> 	* c.opt: Likewise.

> 

> gcc/ChangeLog:

> 

> 	PR preprocessor/49973

> 	PR other/86904

> 	* common.opt: Handle -ftabstop here instead of in c-family

> 	options.  Add -fdiagnostics-column-unit= and

> 	-fdiagnostics-column-origin= options.

> 	* opts.c (common_handle_option): Handle the new options.

> 	* diagnostic-format-json.cc (json_from_expanded_location): Add

> 	diagnostic_context argument.  Use it to convert column numbers as per

> 	the new options.

> 	(json_from_location_range): Likewise.

> 	(json_from_fixit_hint): Likewise.

> 	(json_end_diagnostic): Pass the new context argument to helper

> 	functions above.  Add "column-origin" field to the output.

> 	(test_unknown_location): Add the new context argument to calls to

> 	helper functions.

> 	(test_bad_endpoints): Likewise.

> 	* diagnostic-show-locus.c

> 	(exploc_with_display_col::exploc_with_display_col): Support

> 	tabstop parameter.

> 	(layout_point::layout_point): Make use of class

> 	exploc_with_display_col.

> 	(layout_range::layout_range): Likewise.

> 	(struct line_bounds): Clarify that the units are now always

> 	display columns.  Rename members accordingly.  Add constructor.

> 	(layout::print_source_line): Add support for tab expansion.

> 	(make_range): Adapt to class layout_range changes.

> 	(layout::maybe_add_location_range): Likewise.

> 	(layout::layout): Adapt to class exploc_with_display_col changes.

> 	(layout::calculate_x_offset_display): Support tabstop parameter.

> 	(layout::print_annotation_line): Adapt to struct line_bounds changes.

> 	(layout::print_line): Likewise.

> 	(line_label::line_label): Add diagnostic_context argument.

> 	(get_affected_range): Likewise.

> 	(get_printed_columns): Likewise.

> 	(layout::print_any_labels): Adapt to struct line_label changes.

> 	(class correction): Add m_tabstop member.

> 	(correction::correction): Add tabstop argument.

> 	(correction::compute_display_cols): Use m_tabstop.

> 	(class line_corrections): Add m_context member.

> 	(line_corrections::line_corrections): Add diagnostic_context argument.

> 	(line_corrections::add_hint): Use m_context to handle tabstops.

> 	(layout::print_trailing_fixits): Adapt to class line_corrections

> 	changes.

> 	(test_layout_x_offset_display_utf8): Support tabstop parameter.

> 	(test_layout_x_offset_display_tab): New selftest.

> 	(test_one_liner_colorized_utf8): Likewise.

> 	(test_tab_expansion): Likewise.

> 	(test_diagnostic_show_locus_one_liner_utf8): Call the new tests.

> 	(diagnostic_show_locus_c_tests): Likewise.

> 	(test_overlapped_fixit_printing): Adapt to helper class and

> 	function changes.

> 	(test_overlapped_fixit_printing_utf8): Likewise.

> 	(test_overlapped_fixit_printing_2): Likewise.

> 	* diagnostic.h (enum diagnostics_column_unit): New enum.

> 	(struct diagnostic_context): Add members for the new options.

> 	(diagnostic_converted_column): Declare.

> 	(json_from_expanded_location): Add new context argument.

> 	* diagnostic.c (diagnostic_initialize): Initialize new members.

> 	(diagnostic_converted_column): New function.

> 	(maybe_line_and_column): Be willing to output a column of 0.

> 	(diagnostic_get_location_text): Convert column number as per the new

> 	options.

> 	(diagnostic_report_current_module): Likewise.

> 	(assert_location_text): Add origin and column_unit arguments for

> 	testing the new functionality.

> 	(test_diagnostic_get_location_text): Test the new functionality.

> 	* doc/invoke.texi: Document the new options and behavior.

> 	* input.h (location_compute_display_column): Add tabstop argument.

> 	* input.c (location_compute_display_column): Likewise.

> 	(test_cpp_utf8): Add selftests for tab expansion.

> 	* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the

> 	new context argument to json_from_expanded_location().

> 

> libcpp/ChangeLog:

> 

> 	PR preprocessor/49973

> 	PR other/86904

> 	* include/cpplib.h (struct cpp_options):  Removed support for -ftabstop,

> 	which is now handled by diagnostic_context.

> 	(class cpp_display_width_computation): New class.

> 	(cpp_byte_column_to_display_column): Add optional tabstop argument.

> 	(cpp_display_width): Likewise.

> 	(cpp_display_column_to_byte_column): Likewise.

> 	* charset.c

> 	(cpp_display_width_computation::cpp_display_width_computation): New

> 	function.

> 	(cpp_display_width_computation::advance_display_cols): Likewise.

> 	(compute_next_display_width): Removed and implemented this

> 	functionality in a new function...

> 	(cpp_display_width_computation::process_next_codepoint): ...here.

> 	(cpp_byte_column_to_display_column): Added tabstop argument.

> 	Reimplemented in terms of class cpp_display_width_computation.

> 	(cpp_display_column_to_byte_column): Likewise.

> 	* init.c (cpp_create_reader): Remove handling of -ftabstop, which is now

> 	handled by diagnostic_context.

> 

> gcc/testsuite/ChangeLog:

> 

> 	PR preprocessor/49973

> 	PR other/86904

> 	* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output

> 	for new defaults.

> 	* c-c++-common/Wmisleading-indentation.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-1.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-2.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-3.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-4.c: Likewise.

> 	* c-c++-common/diagnostic-format-json-5.c: Likewise.

> 	* c-c++-common/missing-close-symbol.c: Likewise.

> 	* g++.dg/diagnostic/bad-binary-ops.C: Likewise.

> 	* g++.dg/parse/error4.C: Likewise.

> 	* g++.old-deja/g++.brendan/crash11.C: Likewise.

> 	* g++.old-deja/g++.pt/overload2.C: Likewise.

> 	* g++.old-deja/g++.robertl/eb109.C: Likewise.

> 	* gcc.dg/analyzer/malloc-paths-9.c: Likewise.

> 	* gcc.dg/bad-binary-ops.c: Likewise.

> 	* gcc.dg/format/branch-1.c: Likewise.

> 	* gcc.dg/format/pr79210.c: Likewise.

> 	* gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.

> 	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.

> 	* gcc.dg/redecl-4.c: Likewise.

> 	* gfortran.dg/diagnostic-format-json-1.F90: Likewise.

> 	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.

> 	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.

> 	* go.dg/arrayclear.go: Add a comment explaining why adding a

> 	comment was necessary to work around a dejagnu bug.

> 	* c-c++-common/diagnostic-units-1.c: New test.

> 	* c-c++-common/diagnostic-units-2.c: New test.

> 	* c-c++-common/diagnostic-units-3.c: New test.

> 	* c-c++-common/diagnostic-units-4.c: New test.

> 	* c-c++-common/diagnostic-units-5.c: New test.

> 	* c-c++-common/diagnostic-units-6.c: New test.

> 	* c-c++-common/diagnostic-units-7.c: New test.

> 	* c-c++-common/diagnostic-units-8.c: New test.

> 

> diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c

> index 9fba3bcc67c..d814f6f29e6 100644

> --- a/gcc/c-family/c-indentation.c

> +++ b/gcc/c-family/c-indentation.c

> @@ -24,8 +24,7 @@ along with GCC; see the file COPYING3.  If not see

>  #include "c-common.h"

>  #include "c-indentation.h"

>  #include "selftest.h"

> -

> -extern cpp_options *cpp_opts;

> +#include "diagnostic.h"

>  

>  /* Round up VIS_COLUMN to nearest tab stop. */

>  

> @@ -299,7 +298,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo,

>    expanded_location next_stmt_exploc = expand_location (next_stmt_loc);

>    expanded_location guard_exploc = expand_location (guard_loc);

>  

> -  const unsigned int tab_width = cpp_opts->tabstop;

> +  const unsigned int tab_width = global_dc->tabstop;

>  

>    /* They must be in the same file.  */

>    if (next_stmt_exploc.file != body_exploc.file)

> diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c

> index 8a5131b8ac6..f6588277565 100644

> --- a/gcc/c-family/c-opts.c

> +++ b/gcc/c-family/c-opts.c

> @@ -504,12 +504,6 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,

>  	cpp_opts->track_macro_expansion = 2;

>        break;

>  

> -    case OPT_ftabstop_:

> -      /* It is documented that we silently ignore silly values.  */

> -      if (value >= 1 && value <= 100)

> -	cpp_opts->tabstop = value;

> -      break;

> -

>      case OPT_fexec_charset_:

>        cpp_opts->narrow_charset = arg;

>        break;

> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt

> index 89a58282b3f..913f91d818a 100644

> --- a/gcc/c-family/c.opt

> +++ b/gcc/c-family/c.opt

> @@ -1876,10 +1876,6 @@ Enum(strong_eval_order) String(some) Value(1)

>  EnumValue

>  Enum(strong_eval_order) String(all) Value(2)

>  

> -ftabstop=

> -C ObjC C++ ObjC++ Joined RejectNegative UInteger

> --ftabstop=<number>	Distance between tab stops for column reporting.

> -

>  ftemplate-backtrace-limit=

>  C++ ObjC++ Joined RejectNegative UInteger Var(template_backtrace_limit) Init(10)

>  Set the maximum number of template instantiation notes for a single warning or error.

> diff --git a/gcc/common.opt b/gcc/common.opt

> index df8af365d1b..a3893a4725e 100644

> --- a/gcc/common.opt

> +++ b/gcc/common.opt

> @@ -1328,6 +1328,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)

>  EnumValue

>  Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)

>  

> +fdiagnostics-column-unit=

> +Common Joined RejectNegative Enum(diagnostics_column_unit)

> +-fdiagnostics-column-unit=[display|byte]	Select whether column numbers are output as display columns (default) or raw bytes.

> +

> +fdiagnostics-column-origin=

> +Common Joined RejectNegative UInteger

> +-fdiagnostics-column-origin=<number>	Set the number of the first column.  The default is 1-based as per GNU style, but some utilities may expect 0-based, for example.

> +

>  fdiagnostics-format=

>  Common Joined RejectNegative Enum(diagnostics_output_format)

>  -fdiagnostics-format=[text|json]	Select output format.

> @@ -1336,6 +1344,15 @@ Common Joined RejectNegative Enum(diagnostics_output_format)

>  SourceInclude

>  diagnostic.h

>  

> +Enum

> +Name(diagnostics_column_unit) Type(int)

> +

> +EnumValue

> +Enum(diagnostics_column_unit) String(display) Value(DIAGNOSTICS_COLUMN_UNIT_DISPLAY)

> +

> +EnumValue

> +Enum(diagnostics_column_unit) String(byte) Value(DIAGNOSTICS_COLUMN_UNIT_BYTE)

> +

>  Enum

>  Name(diagnostics_output_format) Type(int)

>  

> @@ -1365,6 +1382,10 @@ fdiagnostics-path-format=

>  Common Joined RejectNegative Var(flag_diagnostics_path_format) Enum(diagnostic_path_format) Init(DPF_INLINE_EVENTS)

>  Specify how to print any control-flow path associated with a diagnostic.

>  

> +ftabstop=

> +Common Joined RejectNegative UInteger

> +-ftabstop=<number>      Distance between tab stops for column reporting.

> +

>  Enum

>  Name(diagnostic_path_format) Type(int)

>  

> diff --git a/gcc/diagnostic-format-json.cc b/gcc/diagnostic-format-json.cc

> index 7bda5c4ba83..465c42fdfde 100644

> --- a/gcc/diagnostic-format-json.cc

> +++ b/gcc/diagnostic-format-json.cc

> @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see

>  #include "system.h"

>  #include "coretypes.h"

>  #include "diagnostic.h"

> +#include "selftest-diagnostic.h"

>  #include "diagnostic-metadata.h"

>  #include "json.h"

>  #include "selftest.h"

> @@ -43,21 +44,43 @@ static json::array *cur_children_array;

>  /* Generate a JSON object for LOC.  */

>  

>  json::value *

> -json_from_expanded_location (location_t loc)

> +json_from_expanded_location (diagnostic_context *context, location_t loc)

>  {

>    expanded_location exploc = expand_location (loc);

>    json::object *result = new json::object ();

>    if (exploc.file)

>      result->set ("file", new json::string (exploc.file));

>    result->set ("line", new json::integer_number (exploc.line));

> -  result->set ("column", new json::integer_number (exploc.column));

> +

> +  const enum diagnostics_column_unit orig_unit = context->column_unit;

> +  struct

> +  {

> +    const char *name;

> +    enum diagnostics_column_unit unit;

> +  } column_fields[] = {

> +    {"display-column", DIAGNOSTICS_COLUMN_UNIT_DISPLAY},

> +    {"byte-column", DIAGNOSTICS_COLUMN_UNIT_BYTE}

> +  };

> +  int the_column = INT_MIN;

> +  for (int i = 0; i != sizeof column_fields / sizeof (*column_fields); ++i)

> +    {

> +      context->column_unit = column_fields[i].unit;

> +      const int col = diagnostic_converted_column (context, exploc);

> +      result->set (column_fields[i].name, new json::integer_number (col));

> +      if (column_fields[i].unit == orig_unit)

> +	the_column = col;

> +    }

> +  gcc_assert (the_column != INT_MIN);

> +  result->set ("column", new json::integer_number (the_column));

> +  context->column_unit = orig_unit;

>    return result;

>  }

>  

>  /* Generate a JSON object for LOC_RANGE.  */

>  

>  static json::object *

> -json_from_location_range (const location_range *loc_range, unsigned range_idx)

> +json_from_location_range (diagnostic_context *context,

> +			  const location_range *loc_range, unsigned range_idx)

>  {

>    location_t caret_loc = get_pure_location (loc_range->m_loc);

>  

> @@ -68,13 +91,13 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)

>    location_t finish_loc = get_finish (loc_range->m_loc);

>  

>    json::object *result = new json::object ();

> -  result->set ("caret", json_from_expanded_location (caret_loc));

> +  result->set ("caret", json_from_expanded_location (context, caret_loc));

>    if (start_loc != caret_loc

>        && start_loc != UNKNOWN_LOCATION)

> -    result->set ("start", json_from_expanded_location (start_loc));

> +    result->set ("start", json_from_expanded_location (context, start_loc));

>    if (finish_loc != caret_loc

>        && finish_loc != UNKNOWN_LOCATION)

> -    result->set ("finish", json_from_expanded_location (finish_loc));

> +    result->set ("finish", json_from_expanded_location (context, finish_loc));

>  

>    if (loc_range->m_label)

>      {

> @@ -91,14 +114,14 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)

>  /* Generate a JSON object for HINT.  */

>  

>  static json::object *

> -json_from_fixit_hint (const fixit_hint *hint)

> +json_from_fixit_hint (diagnostic_context *context, const fixit_hint *hint)

>  {

>    json::object *fixit_obj = new json::object ();

>  

>    location_t start_loc = hint->get_start_loc ();

> -  fixit_obj->set ("start", json_from_expanded_location (start_loc));

> +  fixit_obj->set ("start", json_from_expanded_location (context, start_loc));

>    location_t next_loc = hint->get_next_loc ();

> -  fixit_obj->set ("next", json_from_expanded_location (next_loc));

> +  fixit_obj->set ("next", json_from_expanded_location (context, next_loc));

>    fixit_obj->set ("string", new json::string (hint->get_string ()));

>  

>    return fixit_obj;

> @@ -190,11 +213,13 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,

>    else

>      {

>        /* Otherwise, make diag_obj be the top-level object within the group;

> -	 add a "children" array.  */

> +	 add a "children" array and record the column origin.  */

>        toplevel_array->append (diag_obj);

>        cur_group = diag_obj;

>        cur_children_array = new json::array ();

>        diag_obj->set ("children", cur_children_array);

> +      diag_obj->set ("column-origin",

> +		     new json::integer_number (context->column_origin));

>      }

>  

>    const rich_location *richloc = diagnostic->richloc;

> @@ -205,7 +230,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,

>    for (unsigned int i = 0; i < richloc->get_num_locations (); i++)

>      {

>        const location_range *loc_range = richloc->get_range (i);

> -      json::object *loc_obj = json_from_location_range (loc_range, i);

> +      json::object *loc_obj = json_from_location_range (context, loc_range, i);

>        if (loc_obj)

>  	loc_array->append (loc_obj);

>      }

> @@ -217,7 +242,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,

>        for (unsigned int i = 0; i < richloc->get_num_fixit_hints (); i++)

>  	{

>  	  const fixit_hint *hint = richloc->get_fixit_hint (i);

> -	  json::object *fixit_obj = json_from_fixit_hint (hint);

> +	  json::object *fixit_obj = json_from_fixit_hint (context, hint);

>  	  fixit_array->append (fixit_obj);

>  	}

>      }

> @@ -320,7 +345,8 @@ namespace selftest {

>  static void

>  test_unknown_location ()

>  {

> -  delete json_from_expanded_location (UNKNOWN_LOCATION);

> +  test_diagnostic_context dc;

> +  delete json_from_expanded_location (&dc, UNKNOWN_LOCATION);

>  }

>  

>  /* Verify that we gracefully handle attempts to serialize bad

> @@ -338,7 +364,8 @@ test_bad_endpoints ()

>    loc_range.m_range_display_kind = SHOW_RANGE_WITH_CARET;

>    loc_range.m_label = NULL;

>  

> -  json::object *obj = json_from_location_range (&loc_range, 0);

> +  test_diagnostic_context dc;

> +  json::object *obj = json_from_location_range (&dc, &loc_range, 0);

>    /* We should have a "caret" value, but no "start" or "finish" values.  */

>    ASSERT_TRUE (obj != NULL);

>    ASSERT_TRUE (obj->get ("caret") != NULL);

> diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c

> index 4618b4edb7d..da3c5b6a92d 100644

> --- a/gcc/diagnostic-show-locus.c

> +++ b/gcc/diagnostic-show-locus.c

> @@ -175,9 +175,10 @@ enum column_unit {

>  class exploc_with_display_col : public expanded_location

>  {

>   public:

> -  exploc_with_display_col (const expanded_location &exploc)

> +  exploc_with_display_col (const expanded_location &exploc, int tabstop)

>      : expanded_location (exploc),

> -      m_display_col (location_compute_display_column (exploc)) {}

> +      m_display_col (location_compute_display_column (exploc, tabstop))

> +  {}

>  

>    int m_display_col;

>  };

> @@ -189,11 +190,11 @@ class exploc_with_display_col : public expanded_location

>  class layout_point

>  {

>   public:

> -  layout_point (const expanded_location &exploc)

> +  layout_point (const exploc_with_display_col &exploc)

>      : m_line (exploc.line)

>    {

>      m_columns[CU_BYTES] = exploc.column;

> -    m_columns[CU_DISPLAY_COLS] = location_compute_display_column (exploc);

> +    m_columns[CU_DISPLAY_COLS] = exploc.m_display_col;

>    }

>  

>    linenum_type m_line;

> @@ -205,10 +206,10 @@ class layout_point

>  class layout_range

>  {

>   public:

> -  layout_range (const expanded_location *start_exploc,

> -		const expanded_location *finish_exploc,

> +  layout_range (const exploc_with_display_col &start_exploc,

> +		const exploc_with_display_col &finish_exploc,

>  		enum range_display_kind range_display_kind,

> -		const expanded_location *caret_exploc,

> +		const exploc_with_display_col &caret_exploc,

>  		unsigned original_idx,

>  		const range_label *label);

>  

> @@ -226,22 +227,18 @@ class layout_range

>  

>  /* A struct for use by layout::print_source_line for telling

>     layout::print_annotation_line the extents of the source line that

> -   it printed, so that underlines can be clipped appropriately.  */

> +   it printed, so that underlines can be clipped appropriately.  Units

> +   are 1-based display columns.  */

>  

>  struct line_bounds

>  {

> -  int m_first_non_ws;

> -  int m_last_non_ws;

> +  int m_first_non_ws_disp_col;

> +  int m_last_non_ws_disp_col;

>  

> -  void convert_to_display_cols (char_span line)

> +  line_bounds ()

>    {

> -    m_first_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),

> -							line.length (),

> -							m_first_non_ws);

> -

> -    m_last_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),

> -						       line.length (),

> -						       m_last_non_ws);

> +    m_first_non_ws_disp_col = INT_MAX;

> +    m_last_non_ws_disp_col = 0;

>    }

>  };

>  

> @@ -351,8 +348,8 @@ class layout

>   private:

>    bool will_show_line_p (linenum_type row) const;

>    void print_leading_fixits (linenum_type row);

> -  void print_source_line (linenum_type row, const char *line, int line_bytes,

> -			  line_bounds *lbounds_out);

> +  line_bounds print_source_line (linenum_type row, const char *line,

> +				 int line_bytes);

>    bool should_print_annotation_line_p (linenum_type row) const;

>    void start_annotation_line (char margin_char = ' ') const;

>    void print_annotation_line (linenum_type row, const line_bounds lbounds);

> @@ -513,16 +510,16 @@ colorizer::get_color_by_name (const char *name)

>     Initialize various layout_point fields from expanded_location

>     equivalents; we've already filtered on file.  */

>  

> -layout_range::layout_range (const expanded_location *start_exploc,

> -			    const expanded_location *finish_exploc,

> +layout_range::layout_range (const exploc_with_display_col &start_exploc,

> +			    const exploc_with_display_col &finish_exploc,

>  			    enum range_display_kind range_display_kind,

> -			    const expanded_location *caret_exploc,

> +			    const exploc_with_display_col &caret_exploc,

>  			    unsigned original_idx,

>  			    const range_label *label)

> -: m_start (*start_exploc),

> -  m_finish (*finish_exploc),

> +: m_start (start_exploc),

> +  m_finish (finish_exploc),

>    m_range_display_kind (range_display_kind),

> -  m_caret (*caret_exploc),

> +  m_caret (caret_exploc),

>    m_original_idx (original_idx),

>    m_label (label)

>  {

> @@ -646,6 +643,9 @@ layout_range::intersects_line_p (linenum_type row) const

>  

>  #if CHECKING_P

>  

> +/* Default for when we don't care what the tab expansion is set to.  */

> +static const int def_tabstop = 8;

> +

>  /* Create some expanded locations for testing layout_range.  The filename

>     member of the explocs is set to the empty string.  This member will only be

>     inspected by the calls to location_compute_display_column() made from the

> @@ -662,8 +662,11 @@ make_range (int start_line, int start_col, int end_line, int end_col)

>      = {"", start_line, start_col, NULL, false};

>    const expanded_location finish_exploc

>      = {"", end_line, end_col, NULL, false};

> -  return layout_range (&start_exploc, &finish_exploc, SHOW_RANGE_WITHOUT_CARET,

> -		       &start_exploc, 0, NULL);

> +  return layout_range (exploc_with_display_col (start_exploc, def_tabstop),

> +		       exploc_with_display_col (finish_exploc, def_tabstop),

> +		       SHOW_RANGE_WITHOUT_CARET,

> +		       exploc_with_display_col (start_exploc, def_tabstop),

> +		       0, NULL);

>  }

>  

>  /* Selftests for layout_range::contains_point and

> @@ -964,7 +967,7 @@ layout::layout (diagnostic_context * context,

>  : m_context (context),

>    m_pp (context->printer),

>    m_primary_loc (richloc->get_range (0)->m_loc),

> -  m_exploc (richloc->get_expanded_location (0)),

> +  m_exploc (richloc->get_expanded_location (0), context->tabstop),

>    m_colorizer (context, diagnostic_kind),

>    m_colorize_source_p (context->colorize_source_p),

>    m_show_labels_p (context->show_labels_p),

> @@ -1060,7 +1063,10 @@ layout::maybe_add_location_range (const location_range *loc_range,

>  

>    /* Everything is now known to be in the correct source file,

>       but it may require further sanitization.  */

> -  layout_range ri (&start, &finish, loc_range->m_range_display_kind, &caret,

> +  layout_range ri (exploc_with_display_col (start, m_context->tabstop),

> +		   exploc_with_display_col (finish, m_context->tabstop),

> +		   loc_range->m_range_display_kind,

> +		   exploc_with_display_col (caret, m_context->tabstop),

>  		   original_idx, loc_range->m_label);

>  

>    /* If we have a range that finishes before it starts (perhaps

> @@ -1394,7 +1400,7 @@ layout::calculate_x_offset_display ()

>      = get_line_bytes_without_trailing_whitespace (line.get_buffer (),

>  						  line.length ());

>    int eol_display_column

> -    = cpp_display_width (line.get_buffer (), line_bytes);

> +    = cpp_display_width (line.get_buffer (), line_bytes, m_context->tabstop);

>    if (caret_display_column > eol_display_column

>        || !caret_display_column)

>      {

> @@ -1445,16 +1451,13 @@ layout::calculate_x_offset_display ()

>  }

>  

>  /* Print line ROW of source code, potentially colorized at any ranges, and

> -   populate *LBOUNDS_OUT.

> -   LINE is the source line (not necessarily 0-terminated) and LINE_BYTES

> -   is its length in bytes.

> -   This function deals only with byte offsets, not display columns, so

> -   m_x_offset_display must be converted from display to byte units.  In

> -   particular, LINE_BYTES and LBOUNDS_OUT are in bytes.  */

> +   return the line bounds.  LINE is the source line (not necessarily

> +   0-terminated) and LINE_BYTES is its length in bytes.  In order to handle both

> +   colorization and tab expansion, this function tracks the line position in

> +   both byte and display column units.  */

>  

> -void

> -layout::print_source_line (linenum_type row, const char *line, int line_bytes,

> -			   line_bounds *lbounds_out)

> +line_bounds

> +layout::print_source_line (linenum_type row, const char *line, int line_bytes)

>  {

>    m_colorizer.set_normal_text ();

>  

> @@ -1469,30 +1472,29 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,

>    else

>      pp_space (m_pp);

>  

> -  /* We will stop printing the source line at any trailing whitespace, and start

> -     printing it as per m_x_offset_display.  */

> +  /* We will stop printing the source line at any trailing whitespace.  */

>    line_bytes = get_line_bytes_without_trailing_whitespace (line,

>  							   line_bytes);

> -  int x_offset_bytes = 0;

> -  if (m_x_offset_display)

> -    {

> -      x_offset_bytes = cpp_display_column_to_byte_column (line, line_bytes,

> -							  m_x_offset_display);

> -      /* In case the leading portion of the line that will be skipped over ends

> -	 with a character with wcwidth > 1, then it is possible we skipped too

> -	 much, so account for that by padding with spaces.  */

> -      const int overage

> -	= cpp_byte_column_to_display_column (line, line_bytes, x_offset_bytes)

> -	- m_x_offset_display;

> -      for (int column = 0; column < overage; ++column)

> -	pp_space (m_pp);

> -      line += x_offset_bytes;

> -    }

>  

> -  /* Print the line.  */

> -  int first_non_ws = INT_MAX;

> -  int last_non_ws = 0;

> -  for (int col_byte = 1 + x_offset_bytes; col_byte <= line_bytes; col_byte++)

> +  /* This object helps to keep track of which display column we are at, which is

> +     necessary for computing the line bounds in display units, for doing

> +     tab expansion, and for implementing m_x_offset_display.  */

> +  cpp_display_width_computation dw (line, line_bytes, m_context->tabstop);

> +

> +  /* Skip the first m_x_offset_display display columns.  In case the leading

> +     portion that will be skipped ends with a character with wcwidth > 1, then

> +     it is possible we skipped too much, so account for that by padding with

> +     spaces.  Note that this does the right thing too in case a tab was the last

> +     character to be skipped over; the tab is effectively replaced by the

> +     correct number of trailing spaces needed to offset by the desired number of

> +     display columns.  */

> +  for (int skipped_display_cols = dw.advance_display_cols (m_x_offset_display);

> +       skipped_display_cols > m_x_offset_display; --skipped_display_cols)

> +    pp_space (m_pp);

> +

> +  /* Print the line and compute the line_bounds.  */

> +  line_bounds lbounds;

> +  while (!dw.done ())

>      {

>        /* Assuming colorization is enabled for the caret and underline

>  	 characters, we may also colorize the associated characters

> @@ -1510,7 +1512,8 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,

>  	{

>  	  bool in_range_p;

>  	  point_state state;

> -	  in_range_p = get_state_at_point (row, col_byte,

> +	  const int start_byte_col = dw.bytes_processed () + 1;

> +	  in_range_p = get_state_at_point (row, start_byte_col,

>  					   0, INT_MAX,

>  					   CU_BYTES,

>  					   &state);

> @@ -1519,22 +1522,44 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,

>  	  else

>  	    m_colorizer.set_normal_text ();

>  	}

> -      char c = *line;

> -      if (c == '\0' || c == '\t' || c == '\r')

> -	c = ' ';

> -      if (c != ' ')

> +

> +      /* Get the display width of the next character to be output, expanding

> +	 tabs and replacing some control bytes with spaces as necessary.  */

> +      const char *c = dw.next_byte ();

> +      const int start_disp_col = dw.display_cols_processed () + 1;

> +      const int this_display_width = dw.process_next_codepoint ();

> +      if (*c == '\t')

> +	{

> +	  /* The returned display width is the number of spaces into which the

> +	     tab should be expanded.  */

> +	  for (int i = 0; i != this_display_width; ++i)

> +	    pp_space (m_pp);

> +	  continue;

> +	}

> +      if (*c == '\0' || *c == '\r')

>  	{

> -	  last_non_ws = col_byte;

> -	  if (first_non_ws == INT_MAX)

> -	    first_non_ws = col_byte;

> +	  /* cpp_wcwidth() promises to return 1 for all control bytes, and we

> +	     want to output these as a single space too, so this case is

> +	     actually the same as the '\t' case.  */

> +	  gcc_assert (this_display_width == 1);

> +	  pp_space (m_pp);

> +	  continue;

>  	}

> -      pp_character (m_pp, c);

> -      line++;

> +

> +      /* We have a (possibly multibyte) character to output; update the line

> +	 bounds if it is not whitespace.  */

> +      if (*c != ' ')

> +	{

> +	  lbounds.m_last_non_ws_disp_col = dw.display_cols_processed ();

> +	  if (lbounds.m_first_non_ws_disp_col == INT_MAX)

> +	    lbounds.m_first_non_ws_disp_col = start_disp_col;

> +	}

> +

> +      /* Output the character.  */

> +      while (c != dw.next_byte ()) pp_character (m_pp, *c++);

>      }

>    print_newline ();

> -

> -  lbounds_out->m_first_non_ws = first_non_ws;

> -  lbounds_out->m_last_non_ws = last_non_ws;

> +  return lbounds;

>  }
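For readers following the tab handling in print_source_line: the expansion rule can be illustrated with a minimal standalone sketch. It assumes ASCII-only input (the patch itself relies on cpp_display_width_computation, which also handles multibyte characters), and the names tab_width_at, ascii_display_width and expand_tabs are hypothetical helpers that do not appear in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the patch: number of display columns
// a tab occupies when it starts at 1-based display column COL.
static int
tab_width_at (int col, int tabstop)
{
  assert (col >= 1 && tabstop >= 1);
  return tabstop - (col - 1) % tabstop;
}

// Hypothetical helper: display width of an ASCII-only line, counting one
// column per byte except for tabs, which advance to the next tab stop.
static int
ascii_display_width (const std::string &line, int tabstop)
{
  int cols = 0;
  for (std::size_t i = 0; i < line.size (); ++i)
    cols += (line[i] == '\t') ? tab_width_at (cols + 1, tabstop) : 1;
  return cols;
}

// Hypothetical helper: replace each tab by the number of spaces needed to
// reach the next tab stop, so the output contains no tab characters.
static std::string
expand_tabs (const std::string &line, int tabstop)
{
  std::string out;
  for (std::size_t i = 0; i < line.size (); ++i)
    {
      if (line[i] == '\t')
	{
	  const int col = static_cast<int> (out.size ()) + 1;
	  const int width = tab_width_at (col, tabstop);
	  out.append (static_cast<std::size_t> (width), ' ');
	}
      else
	out.push_back (line[i]);
    }
  return out;
}
```

With a tabstop of 10, a tab starting at column 103 expands to 8 spaces under this rule, which agrees with the arithmetic in the new test_layout_x_offset_display_tab selftest.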

>  

>  /* Determine if we should print an annotation line for ROW.

> @@ -1576,14 +1601,13 @@ layout::start_annotation_line (char margin_char) const

>  }

>  

>  /* Print a line consisting of the caret/underlines for the given

> -   source line.  This function works with display columns, rather than byte

> -   counts; in particular, LBOUNDS should be in display column units.  */

> +   source line.  */

>  

>  void

>  layout::print_annotation_line (linenum_type row, const line_bounds lbounds)

>  {

>    int x_bound = get_x_bound_for_row (row, m_exploc.m_display_col,

> -				     lbounds.m_last_non_ws);

> +				     lbounds.m_last_non_ws_disp_col);

>  

>    start_annotation_line ();

>    pp_space (m_pp);

> @@ -1593,8 +1617,8 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)

>        bool in_range_p;

>        point_state state;

>        in_range_p = get_state_at_point (row, column,

> -				       lbounds.m_first_non_ws,

> -				       lbounds.m_last_non_ws,

> +				       lbounds.m_first_non_ws_disp_col,

> +				       lbounds.m_last_non_ws_disp_col,

>  				       CU_DISPLAY_COLS,

>  				       &state);

>        if (in_range_p)

> @@ -1631,12 +1655,14 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)

>  class line_label

>  {

>  public:

> -  line_label (int state_idx, int column, label_text text)

> +  line_label (diagnostic_context *context, int state_idx, int column,

> +	      label_text text)

>    : m_state_idx (state_idx), m_column (column),

>      m_text (text), m_label_line (0), m_has_vbar (true)

>    {

>      const int bytes = strlen (text.m_buffer);

> -    m_display_width = cpp_display_width (text.m_buffer, bytes);

> +    m_display_width

> +      = cpp_display_width (text.m_buffer, bytes, context->tabstop);

>    }

>  

>    /* Sorting is primarily by column, then by state index.  */

> @@ -1696,7 +1722,7 @@ layout::print_any_labels (linenum_type row)

>  	if (text.m_buffer == NULL)

>  	  continue;

>  

> -	labels.safe_push (line_label (i, disp_col, text));

> +	labels.safe_push (line_label (m_context, i, disp_col, text));

>        }

>    }

>  

> @@ -1976,7 +2002,8 @@ public:

>  

>  /* Get the range of bytes or display columns that HINT would affect.  */

>  static column_range

> -get_affected_range (const fixit_hint *hint, enum column_unit col_unit)

> +get_affected_range (diagnostic_context *context,

> +		    const fixit_hint *hint, enum column_unit col_unit)

>  {

>    expanded_location exploc_start = expand_location (hint->get_start_loc ());

>    expanded_location exploc_finish = expand_location (hint->get_next_loc ());

> @@ -1986,11 +2013,13 @@ get_affected_range (const fixit_hint *hint, enum column_unit col_unit)

>    int finish_column;

>    if (col_unit == CU_DISPLAY_COLS)

>      {

> -      start_column = location_compute_display_column (exploc_start);

> +      start_column

> +	= location_compute_display_column (exploc_start, context->tabstop);

>        if (hint->insertion_p ())

>  	finish_column = start_column - 1;

>        else

> -	finish_column = location_compute_display_column (exploc_finish);

> +	finish_column

> +	  = location_compute_display_column (exploc_finish, context->tabstop);

>      }

>    else

>      {

> @@ -2003,12 +2032,12 @@ get_affected_range (const fixit_hint *hint, enum column_unit col_unit)

>  /* Get the range of display columns that would be printed for HINT.  */

>  

>  static column_range

> -get_printed_columns (const fixit_hint *hint)

> +get_printed_columns (diagnostic_context *context, const fixit_hint *hint)

>  {

>    expanded_location exploc = expand_location (hint->get_start_loc ());

> -  int start_column = location_compute_display_column (exploc);

> -  int hint_width = cpp_display_width (hint->get_string (),

> -				      hint->get_length ());

> +  int start_column = location_compute_display_column (exploc, context->tabstop);

> +  int hint_width = cpp_display_width (hint->get_string (), hint->get_length (),

> +				      context->tabstop);

>    int final_hint_column = start_column + hint_width - 1;

>    if (hint->insertion_p ())

>      {

> @@ -2018,7 +2047,8 @@ get_printed_columns (const fixit_hint *hint)

>      {

>        exploc = expand_location (hint->get_next_loc ());

>        --exploc.column;

> -      int finish_column = location_compute_display_column (exploc);

> +      int finish_column

> +	= location_compute_display_column (exploc, context->tabstop);

>        return column_range (start_column,

>  			   MAX (finish_column, final_hint_column));

>      }

> @@ -2035,12 +2065,14 @@ public:

>    correction (column_range affected_bytes,

>  	      column_range affected_columns,

>  	      column_range printed_columns,

> -	      const char *new_text, size_t new_text_len)

> +	      const char *new_text, size_t new_text_len,

> +	      int tabstop)

>    : m_affected_bytes (affected_bytes),

>      m_affected_columns (affected_columns),

>      m_printed_columns (printed_columns),

>      m_text (xstrdup (new_text)),

>      m_byte_length (new_text_len),

> +    m_tabstop (tabstop),

>      m_alloc_sz (new_text_len + 1)

>    {

>      compute_display_cols ();

> @@ -2058,7 +2090,7 @@ public:

>  

>    void compute_display_cols ()

>    {

> -    m_display_cols = cpp_display_width (m_text, m_byte_length);

> +    m_display_cols = cpp_display_width (m_text, m_byte_length, m_tabstop);

>    }

>  

>    void overwrite (int dst_offset, const char_span &src_span)

> @@ -2086,6 +2118,7 @@ public:

>    char *m_text;

>    size_t m_byte_length; /* Not including null-terminator.  */

>    int m_display_cols;

> +  int m_tabstop;

>    size_t m_alloc_sz;

>  };

>  

> @@ -2121,13 +2154,15 @@ correction::ensure_terminated ()

>  class line_corrections

>  {

>  public:

> -  line_corrections (const char *filename, linenum_type row)

> -  : m_filename (filename), m_row (row)

> +  line_corrections (diagnostic_context *context, const char *filename,

> +		    linenum_type row)

> +    : m_context (context), m_filename (filename), m_row (row)

>    {}

>    ~line_corrections ();

>  

>    void add_hint (const fixit_hint *hint);

>  

> +  diagnostic_context *m_context;

>    const char *m_filename;

>    linenum_type m_row;

>    auto_vec <correction *> m_corrections;

> @@ -2173,9 +2208,10 @@ source_line::source_line (const char *filename, int line)

>  void

>  line_corrections::add_hint (const fixit_hint *hint)

>  {

> -  column_range affected_bytes = get_affected_range (hint, CU_BYTES);

> -  column_range affected_columns = get_affected_range (hint, CU_DISPLAY_COLS);

> -  column_range printed_columns = get_printed_columns (hint);

> +  column_range affected_bytes = get_affected_range (m_context, hint, CU_BYTES);

> +  column_range affected_columns = get_affected_range (m_context, hint,

> +						      CU_DISPLAY_COLS);

> +  column_range printed_columns = get_printed_columns (m_context, hint);

>  

>    /* Potentially consolidate.  */

>    if (!m_corrections.is_empty ())

> @@ -2243,7 +2279,8 @@ line_corrections::add_hint (const fixit_hint *hint)

>  					   affected_columns,

>  					   printed_columns,

>  					   hint->get_string (),

> -					   hint->get_length ()));

> +					   hint->get_length (),

> +					   m_context->tabstop));

>  }

>  

>  /* If there are any fixit hints on source line ROW, print them.

> @@ -2257,7 +2294,7 @@ layout::print_trailing_fixits (linenum_type row)

>  {

>    /* Build a list of correction instances for the line,

>       potentially consolidating hints (for the sake of readability).  */

> -  line_corrections corrections (m_exploc.file, row);

> +  line_corrections corrections (m_context, m_exploc.file, row);

>    for (unsigned int i = 0; i < m_fixit_hints.length (); i++)

>      {

>        const fixit_hint *hint = m_fixit_hints[i];

> @@ -2499,15 +2536,11 @@ layout::print_line (linenum_type row)

>    if (!line)

>      return;

>  

> -  line_bounds lbounds;

>    print_leading_fixits (row);

> -  print_source_line (row, line.get_buffer (), line.length (), &lbounds);

> +  const line_bounds lbounds

> +    = print_source_line (row, line.get_buffer (), line.length ());

>    if (should_print_annotation_line_p (row))

> -    {

> -      if (lbounds.m_first_non_ws != INT_MAX)

> -	lbounds.convert_to_display_cols (line);

> -      print_annotation_line (row, lbounds);

> -    }

> +    print_annotation_line (row, lbounds);

>    if (m_show_labels_p)

>      print_any_labels (row);

>    print_trailing_fixits (row);

> @@ -2670,9 +2703,11 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)

>  

>    char_span lspan = location_get_source_line (tmp.get_filename (), 1);

>    ASSERT_EQ (line_display_cols,

> -	     cpp_display_width (lspan.get_buffer (), lspan.length ()));

> +	     cpp_display_width (lspan.get_buffer (), lspan.length (),

> +				def_tabstop));

>    ASSERT_EQ (line_display_cols,

> -	     location_compute_display_column (expand_location (line_end)));

> +	     location_compute_display_column (expand_location (line_end),

> +					      def_tabstop));

>    ASSERT_EQ (0, memcmp (lspan.get_buffer () + (emoji_col - 1),

>  			"\xf0\x9f\x98\x82\xf0\x9f\x98\x82", 8));

>  

> @@ -2774,6 +2809,111 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)

>  

>  }

>  

> +static void

> +test_layout_x_offset_display_tab (const line_table_case &case_)

> +{

> +  const char *content

> +    = "This line is very long, so that we can use it to test the logic for "

> +      "clipping long lines.  Also this: `\t' is a tab that occupies 1 byte and "

> +      "a variable number of display columns, starting at column #103.\n";

> +

> +  /* Number of bytes in the line, subtracting one to remove the newline.  */

> +  const int line_bytes = strlen (content) - 1;

> +

> +  /* The column where the tab begins.  Byte or display is the same as there are

> +     no multibyte characters earlier on the line.  */

> +  const int tab_col = 103;

> +

> +  /* Effective extra size of the tab beyond what a single space would have taken

> +     up, indexed by tabstop.  */

> +  static const int num_tabstops = 11;

> +  int extra_width[num_tabstops];

> +  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)

> +    {

> +      const int this_tab_size = tabstop - (tab_col - 1) % tabstop;

> +      extra_width[tabstop] = this_tab_size - 1;

> +    }

> +  /* Example of this calculation: if tabstop is 10, the tab starting at column

> +     #103 has to expand into 8 spaces, covering columns 103-110, so that the

> +     next character is at column #111.  So it takes up 7 more columns than

> +     a space would have taken up.  */

> +  ASSERT_EQ (7, extra_width[10]);

> +

> +  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);

> +  line_table_test ltt (case_);

> +

> +  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);

> +

> +  location_t line_end = linemap_position_for_column (line_table, line_bytes);

> +

> +  /* Don't attempt to run the tests if column data might be unavailable.  */

> +  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)

> +    return;

> +

> +  /* Check that cpp_display_width handles the tabs as expected.  */

> +  char_span lspan = location_get_source_line (tmp.get_filename (), 1);

> +  ASSERT_EQ ('\t', *(lspan.get_buffer () + (tab_col - 1)));

> +  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)

> +    {

> +      ASSERT_EQ (line_bytes + extra_width[tabstop],

> +		 cpp_display_width (lspan.get_buffer (), lspan.length (),

> +				    tabstop));

> +      ASSERT_EQ (line_bytes + extra_width[tabstop],

> +		 location_compute_display_column (expand_location (line_end),

> +						  tabstop));

> +    }

> +

> +  /* Check that the tab is expanded to the expected number of spaces.  */

> +  rich_location richloc (line_table,

> +			 linemap_position_for_column (line_table,

> +						      tab_col + 1));

> +  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)

> +    {

> +      test_diagnostic_context dc;

> +      dc.tabstop = tabstop;

> +      layout test_layout (&dc, &richloc, DK_ERROR);

> +      test_layout.print_line (1);

> +      const char *out = pp_formatted_text (dc.printer);

> +      ASSERT_EQ (NULL, strchr (out, '\t'));

> +      const char *left_quote = strchr (out, '`');

> +      const char *right_quote = strchr (out, '\'');

> +      ASSERT_NE (NULL, left_quote);

> +      ASSERT_NE (NULL, right_quote);

> +      ASSERT_EQ (right_quote - left_quote, extra_width[tabstop] + 2);

> +    }

> +

> +  /* Check that the line is offset properly and that the tab is broken up

> +     into the expected number of spaces when it is the last character skipped

> +     over.  */

> +  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)

> +    {

> +      test_diagnostic_context dc;

> +      dc.tabstop = tabstop;

> +      static const int small_width = 24;

> +      dc.caret_max_width = small_width - 4;

> +      dc.min_margin_width = test_left_margin - test_linenum_sep + 1;

> +      dc.show_line_numbers_p = true;

> +      layout test_layout (&dc, &richloc, DK_ERROR);

> +      test_layout.print_line (1);

> +

> +      /* We have arranged things so that two columns will be printed before

> +	 the caret.  If the tab results in more than one space, this should

> +	 produce two spaces in the output; otherwise, it will be a single space

> +	 preceded by the opening quote before the tab character.  */

> +      const char *output1

> +	= "   1 |   ' is a tab that occupies 1 byte and a variable number of "

> +	  "display columns, starting at column #103.\n"

> +	  "     |   ^\n\n";

> +      const char *output2

> +	= "   1 | ` ' is a tab that occupies 1 byte and a variable number of "

> +	  "display columns, starting at column #103.\n"

> +	  "     |   ^\n\n";

> +      const char *expected_output = (extra_width[tabstop] ? output1 : output2);

> +      ASSERT_STREQ (expected_output, pp_formatted_text (dc.printer));

> +    }

> +}

> +

> +

>  /* Verify that diagnostic_show_locus works sanely on UNKNOWN_LOCATION.  */

>  

>  static void

> @@ -3854,6 +3994,27 @@ test_one_liner_labels_utf8 ()

>    }

>  }

>  

> +/* Make sure that colorization codes don't interrupt a multibyte

> +   sequence, which would corrupt it.  */

> +static void

> +test_one_liner_colorized_utf8 ()

> +{

> +  test_diagnostic_context dc;

> +  dc.colorize_source_p = true;

> +  diagnostic_color_init (&dc, DIAGNOSTICS_COLOR_YES);

> +  const location_t pi = linemap_position_for_column (line_table, 12);

> +  rich_location richloc (line_table, pi);

> +  diagnostic_show_locus (&dc, &richloc, DK_ERROR);

> +

> +  /* In order to avoid having the test depend on exactly how the colorization

> +     was effected, just confirm there are two pi characters in the output.  */

> +  const char *result = pp_formatted_text (dc.printer);

> +  const char *null_term = result + strlen (result);

> +  const char *first_pi = strstr (result, "\xcf\x80");

> +  ASSERT_TRUE (first_pi && first_pi <= null_term - 2);

> +  ASSERT_STR_CONTAINS (first_pi + 2, "\xcf\x80");

> +}
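The property this test guards (color codes must only be emitted between whole characters) can be illustrated with a small sketch that splits a UTF-8 string at code-point boundaries. This is only an illustration under simplifying assumptions: utf8_seq_len and split_code_points are hypothetical names, continuation bytes are not validated, and the patch itself gets the boundaries from dw.next_byte () and dw.process_next_codepoint () rather than from code like this.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: length in bytes of the UTF-8 sequence introduced by
// LEAD.  Invalid lead bytes are treated as single-byte sequences.
static std::size_t
utf8_seq_len (unsigned char lead)
{
  if (lead < 0x80)
    return 1;
  if ((lead & 0xE0) == 0xC0)
    return 2;
  if ((lead & 0xF0) == 0xE0)
    return 3;
  if ((lead & 0xF8) == 0xF0)
    return 4;
  return 1;
}

// Hypothetical helper: split S into pieces that each hold one code point.
// Anything that decorates the output (such as color escapes) can then be
// placed between pieces without cutting a multibyte sequence in half.
static std::vector<std::string>
split_code_points (const std::string &s)
{
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < s.size ())
    {
      std::size_t n = utf8_seq_len (static_cast<unsigned char> (s[i]));
      if (n > s.size () - i)
	n = s.size () - i;  // Truncated sequence at the end of the input.
      assert (n >= 1);      // Guarantees forward progress.
      out.push_back (s.substr (i, n));
      i += n;
    }
  return out;
}
```

Each returned piece is a complete sequence, so the two bytes of a pi character ("\xcf\x80") always stay together.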

> +

>  /* Run the various one-liner tests.  */

>  

>  static void

> @@ -3884,8 +4045,10 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)

>    ASSERT_EQ (31, LOCATION_COLUMN (line_end));

>  

>    char_span lspan = location_get_source_line (tmp.get_filename (), 1);

> -  ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length ()));

> -  ASSERT_EQ (25, location_compute_display_column (expand_location (line_end)));

> +  ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length (),

> +				    def_tabstop));

> +  ASSERT_EQ (25, location_compute_display_column (expand_location (line_end),

> +						  def_tabstop));

>  

>    test_one_liner_simple_caret_utf8 ();

>    test_one_liner_caret_and_range_utf8 ();

> @@ -3900,6 +4063,7 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)

>    test_one_liner_many_fixits_1_utf8 ();

>    test_one_liner_many_fixits_2_utf8 ();

>    test_one_liner_labels_utf8 ();

> +  test_one_liner_colorized_utf8 ();

>  }

>  

>  /* Verify that gcc_rich_location::add_location_if_nearby works.  */

> @@ -4272,25 +4436,28 @@ test_overlapped_fixit_printing (const line_table_case &case_)

>      /* Unit-test the line_corrections machinery.  */

>      ASSERT_EQ (3, richloc.get_num_fixit_hints ());

>      const fixit_hint *hint_0 = richloc.get_fixit_hint (0);

> -    ASSERT_EQ (column_range (12, 12), get_affected_range (hint_0, CU_BYTES));

>      ASSERT_EQ (column_range (12, 12),

> -			   get_affected_range (hint_0, CU_DISPLAY_COLS));

> -    ASSERT_EQ (column_range (12, 22), get_printed_columns (hint_0));

> +	       get_affected_range (&dc, hint_0, CU_BYTES));

> +    ASSERT_EQ (column_range (12, 12),

> +	       get_affected_range (&dc, hint_0, CU_DISPLAY_COLS));

> +    ASSERT_EQ (column_range (12, 22), get_printed_columns (&dc, hint_0));

>      const fixit_hint *hint_1 = richloc.get_fixit_hint (1);

> -    ASSERT_EQ (column_range (18, 18), get_affected_range (hint_1, CU_BYTES));

>      ASSERT_EQ (column_range (18, 18),

> -			   get_affected_range (hint_1, CU_DISPLAY_COLS));

> -    ASSERT_EQ (column_range (18, 20), get_printed_columns (hint_1));

> +	       get_affected_range (&dc, hint_1, CU_BYTES));

> +    ASSERT_EQ (column_range (18, 18),

> +	       get_affected_range (&dc, hint_1, CU_DISPLAY_COLS));

> +    ASSERT_EQ (column_range (18, 20), get_printed_columns (&dc, hint_1));

>      const fixit_hint *hint_2 = richloc.get_fixit_hint (2);

> -    ASSERT_EQ (column_range (29, 28), get_affected_range (hint_2, CU_BYTES));

>      ASSERT_EQ (column_range (29, 28),

> -			   get_affected_range (hint_2, CU_DISPLAY_COLS));

> -    ASSERT_EQ (column_range (29, 29), get_printed_columns (hint_2));

> +	       get_affected_range (&dc, hint_2, CU_BYTES));

> +    ASSERT_EQ (column_range (29, 28),

> +	       get_affected_range (&dc, hint_2, CU_DISPLAY_COLS));

> +    ASSERT_EQ (column_range (29, 29), get_printed_columns (&dc, hint_2));

>  

>      /* Add each hint in turn to a line_corrections instance,

>         and verify that they are consolidated into one correction instance

>         as expected.  */

> -    line_corrections lc (tmp.get_filename (), 1);

> +    line_corrections lc (&dc, tmp.get_filename (), 1);

>  

>      /* The first replace hint by itself.  */

>      lc.add_hint (hint_0);

> @@ -4484,25 +4651,28 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)

>      /* Unit-test the line_corrections machinery.  */

>      ASSERT_EQ (3, richloc.get_num_fixit_hints ());

>      const fixit_hint *hint_0 = richloc.get_fixit_hint (0);

> -    ASSERT_EQ (column_range (14, 14), get_affected_range (hint_0, CU_BYTES));

> +    ASSERT_EQ (column_range (14, 14),

> +	       get_affected_range (&dc, hint_0, CU_BYTES));

>      ASSERT_EQ (column_range (12, 12),

> -			   get_affected_range (hint_0, CU_DISPLAY_COLS));

> -    ASSERT_EQ (column_range (12, 22), get_printed_columns (hint_0));

> +	       get_affected_range (&dc, hint_0, CU_DISPLAY_COLS));

> +    ASSERT_EQ (column_range (12, 22), get_printed_columns (&dc, hint_0));

>      const fixit_hint *hint_1 = richloc.get_fixit_hint (1);

> -    ASSERT_EQ (column_range (22, 22), get_affected_range (hint_1, CU_BYTES));

> +    ASSERT_EQ (column_range (22, 22),

> +	       get_affected_range (&dc, hint_1, CU_BYTES));

>      ASSERT_EQ (column_range (18, 18),

> -			   get_affected_range (hint_1, CU_DISPLAY_COLS));

> -    ASSERT_EQ (column_range (18, 20), get_printed_columns (hint_1));

> +	       get_affected_range (&dc, hint_1, CU_DISPLAY_COLS));

> +    ASSERT_EQ (column_range (18, 20), get_printed_columns (&dc, hint_1));

>      const fixit_hint *hint_2 = richloc.get_fixit_hint (2);

> -    ASSERT_EQ (column_range (35, 34), get_affected_range (hint_2, CU_BYTES));

> +    ASSERT_EQ (column_range (35, 34),

> +	       get_affected_range (&dc, hint_2, CU_BYTES));

>      ASSERT_EQ (column_range (30, 29),

> -			   get_affected_range (hint_2, CU_DISPLAY_COLS));

> -    ASSERT_EQ (column_range (30, 30), get_printed_columns (hint_2));

> +	       get_affected_range (&dc, hint_2, CU_DISPLAY_COLS));

> +    ASSERT_EQ (column_range (30, 30), get_printed_columns (&dc, hint_2));

>  

>      /* Add each hint in turn to a line_corrections instance,

>         and verify that they are consolidated into one correction instance

>         as expected.  */

> -    line_corrections lc (tmp.get_filename (), 1);

> +    line_corrections lc (&dc, tmp.get_filename (), 1);

>  

>      /* The first replace hint by itself.  */

>      lc.add_hint (hint_0);

> @@ -4689,6 +4859,8 @@ test_overlapped_fixit_printing_2 (const line_table_case &case_)

>  

>    /* Two insertions, in the wrong order.  */

>    {

> +    test_diagnostic_context dc;

> +

>      rich_location richloc (line_table, col_20);

>      richloc.add_fixit_insert_before (col_23, "{");

>      richloc.add_fixit_insert_before (col_21, "}");

> @@ -4696,14 +4868,15 @@ test_overlapped_fixit_printing_2 (const line_table_case &case_)

>      /* These fixits should be accepted; they can't be consolidated.  */

>      ASSERT_EQ (2, richloc.get_num_fixit_hints ());

>      const fixit_hint *hint_0 = richloc.get_fixit_hint (0);

> -    ASSERT_EQ (column_range (23, 22), get_affected_range (hint_0, CU_BYTES));

> -    ASSERT_EQ (column_range (23, 23), get_printed_columns (hint_0));

> +    ASSERT_EQ (column_range (23, 22),

> +	       get_affected_range (&dc, hint_0, CU_BYTES));

> +    ASSERT_EQ (column_range (23, 23), get_printed_columns (&dc, hint_0));

>      const fixit_hint *hint_1 = richloc.get_fixit_hint (1);

> -    ASSERT_EQ (column_range (21, 20), get_affected_range (hint_1, CU_BYTES));

> -    ASSERT_EQ (column_range (21, 21), get_printed_columns (hint_1));

> +    ASSERT_EQ (column_range (21, 20),

> +	       get_affected_range (&dc, hint_1, CU_BYTES));

> +    ASSERT_EQ (column_range (21, 21), get_printed_columns (&dc, hint_1));

>  

>      /* Verify that they're printed correctly.  */

> -    test_diagnostic_context dc;

>      diagnostic_show_locus (&dc, &richloc, DK_ERROR);

>      ASSERT_STREQ (" int a5[][0][0] = { 1, 2 };\n"

>  		  "                    ^\n"

> @@ -4955,6 +5128,65 @@ test_fixit_deletion_affecting_newline (const line_table_case &case_)

>  		pp_formatted_text (dc.printer));

>  }

>  

> +static void

> +test_tab_expansion (const line_table_case &case_)

> +{

> +  /* Create a tempfile and write some text to it.  This example uses a tabstop

> +     of 8, as the column numbers attempt to indicate:

> +

> +    .....................000.01111111111.22222333333  display

> +    .....................123.90123456789.56789012345  columns  */

> +  const char *content = "  \t   This: `\t' is a tab.\n";

> +  /* ....................000 00000011111 11111222222  byte

> +     ....................123 45678901234 56789012345  columns  */

> +

> +  const int tabstop = 8;

> +  const int first_non_ws_byte_col = 7;

> +  const int right_quote_byte_col = 15;

> +  const int last_byte_col = 25;

> +  ASSERT_EQ (35, cpp_display_width (content, last_byte_col, tabstop));

> +

> +  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);

> +  line_table_test ltt (case_);

> +  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);

> +

> +  /* Don't attempt to run the tests if column data might be unavailable.  */

> +  location_t line_end = linemap_position_for_column (line_table, last_byte_col);

> +  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)

> +    return;

> +

> +  /* Check that the leading whitespace with mixed tabs and spaces is expanded

> +     into 11 spaces.  Recall that print_line() also puts one space before

> +     everything too.  */

> +  {

> +    test_diagnostic_context dc;

> +    dc.tabstop = tabstop;

> +    rich_location richloc (line_table,

> +			   linemap_position_for_column (line_table,

> +							first_non_ws_byte_col));

> +    layout test_layout (&dc, &richloc, DK_ERROR);

> +    test_layout.print_line (1);

> +    ASSERT_STREQ ("            This: `      ' is a tab.\n"

> +		  "            ^\n",

> +		  pp_formatted_text (dc.printer));

> +  }

> +

> +  /* Confirm the display width was tracked correctly across the internal tab

> +     as well.  */

> +  {

> +    test_diagnostic_context dc;

> +    dc.tabstop = tabstop;

> +    rich_location richloc (line_table,

> +			   linemap_position_for_column (line_table,

> +							right_quote_byte_col));

> +    layout test_layout (&dc, &richloc, DK_ERROR);

> +    test_layout.print_line (1);

> +    ASSERT_STREQ ("            This: `      ' is a tab.\n"

> +		  "                         ^\n",

> +		  pp_formatted_text (dc.printer));

> +  }

> +}

> +

>  /* Verify that line numbers are correctly printed for the case of

>     a multiline range in which the width of the line numbers changes

>     (e.g. from "9" to "10").  */

> @@ -5012,6 +5244,7 @@ diagnostic_show_locus_c_tests ()

>    test_layout_range_for_multiple_lines ();

>  

>    for_each_line_table_case (test_layout_x_offset_display_utf8);

> +  for_each_line_table_case (test_layout_x_offset_display_tab);

>  

>    test_get_line_bytes_without_trailing_whitespace ();

>  

> @@ -5029,6 +5262,7 @@ diagnostic_show_locus_c_tests ()

>    for_each_line_table_case (test_fixit_insert_containing_newline_2);

>    for_each_line_table_case (test_fixit_replace_containing_newline);

>    for_each_line_table_case (test_fixit_deletion_affecting_newline);

> +  for_each_line_table_case (test_tab_expansion);

>  

>    test_line_numbers_multiline_range ();

>  }

> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c

> index ed52bc03d17..1b6c9845892 100644

> --- a/gcc/diagnostic.c

> +++ b/gcc/diagnostic.c

> @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see

>  #include "selftest.h"

>  #include "selftest-diagnostic.h"

>  #include "opts.h"

> +#include "cpplib.h"

>  

>  #ifdef HAVE_TERMIOS_H

>  # include <termios.h>

> @@ -219,6 +220,9 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)

>    context->min_margin_width = 0;

>    context->show_ruler_p = false;

>    context->parseable_fixits_p = false;

> +  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;

> +  context->column_origin = 1;

> +  context->tabstop = 8;

>    context->edit_context_ptr = NULL;

>    context->diagnostic_group_nesting_depth = 0;

>    context->diagnostic_group_emission_count = 0;

> @@ -353,8 +357,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)

>    return diagnostic_kind_color[kind];

>  }

>  

> +/* Given an expanded_location, convert the column (which is in 1-based bytes)

> +   to the requested units and origin.  Return -1 if the column is

> +   invalid (<= 0).  */

> +int

> +diagnostic_converted_column (diagnostic_context *context, expanded_location s)

> +{

> +  if (s.column <= 0)

> +    return -1;

> +

> +  int one_based_col;

> +  switch (context->column_unit)

> +    {

> +    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:

> +      one_based_col = location_compute_display_column (s, context->tabstop);

> +      break;

> +

> +    case DIAGNOSTICS_COLUMN_UNIT_BYTE:

> +      one_based_col = s.column;

> +      break;

> +

> +    default:

> +      gcc_unreachable ();

> +    }

> +

> +  return one_based_col + (context->column_origin - 1);

> +}
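
The origin handling in diagnostic_converted_column reduces to a single shift applied after the unit conversion. Below is a minimal sketch of that step in isolation. The helper name shift_column_origin is hypothetical and is not part of the patch; the expected values are taken from the selftests added further down (column 10 with origins 0 and 1001).

```c
#include <assert.h>

/* Hypothetical stand-in for the origin step of
   diagnostic_converted_column: renumber a 1-based column so that the
   first column is ORIGIN.  Returns -1 for an invalid (<= 0) input
   column, mirroring the early return in the patch.  */
static int
shift_column_origin (int one_based_col, int origin)
{
  if (one_based_col <= 0)
    return -1;
  return one_based_col + (origin - 1);
}

/* Usage example with the values exercised by the selftests.  */
void
demo_shift_column_origin (void)
{
  assert (shift_column_origin (10, 1) == 10);      /* default origin */
  assert (shift_column_origin (10, 0) == 9);       /* 0-based */
  assert (shift_column_origin (10, 1001) == 1010); /* arbitrary origin */
  assert (shift_column_origin (0, 0) == -1);       /* invalid column */
}
```

Because the validity check runs before the shift, an origin of 0 can yield a legitimate result of 0, which is why the callers below switch their "no column" sentinel from 0 to -1.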

> +

>  /* Return a formatted line and column ':%line:%column'.  Elided if

> -   zero.  The result is a statically allocated buffer.  */

> +   line == 0 or col < 0.  (A column of 0 may be valid due to the

> +   -fdiagnostics-column-origin option.)

> +   The result is a statically allocated buffer.  */

>  

>  static const char *

>  maybe_line_and_column (int line, int col)

> @@ -363,8 +396,9 @@ maybe_line_and_column (int line, int col)

>  

>    if (line)

>      {

> -      size_t l = snprintf (result, sizeof (result),

> -			   col ? ":%d:%d" : ":%d", line, col);

> +      size_t l

> +	= snprintf (result, sizeof (result),

> +		    col >= 0 ? ":%d:%d" : ":%d", line, col);

>        gcc_checking_assert (l < sizeof (result));

>      }

>    else
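
The switch from "col ? ..." to "col >= 0 ? ..." is what lets a 0-based column be printed. A rough stand-alone sketch of the rule follows, assuming a caller-supplied buffer instead of the static one used in diagnostic.c. The name format_line_and_column is hypothetical, and the two snprintf calls are split here only to keep each format string matched to its arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-alone version of the formatting rule: the line is
   elided when it is 0, and the column is elided only when it is
   negative, so a column of 0 is still printed.  */
static int
format_line_and_column (char *buf, size_t size, int line, int col)
{
  if (!line)
    {
      if (size > 0)
	buf[0] = '\0';
      return 0;
    }
  if (col >= 0)
    return snprintf (buf, size, ":%d:%d", line, col);
  return snprintf (buf, size, ":%d", line);
}

/* Usage example.  */
void
demo_format_line_and_column (void)
{
  char buf[32];
  format_line_and_column (buf, sizeof buf, 42, 0);
  assert (strcmp (buf, ":42:0") == 0);	/* column 0 is valid */
  format_line_and_column (buf, sizeof buf, 42, -1);
  assert (strcmp (buf, ":42") == 0);	/* negative column elided */
}
```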

> @@ -383,8 +417,14 @@ diagnostic_get_location_text (diagnostic_context *context,

>    const char *locus_cs = colorize_start (pp_show_color (pp), "locus");

>    const char *locus_ce = colorize_stop (pp_show_color (pp));

>    const char *file = s.file ? s.file : progname;

> -  int line = strcmp (file, N_("<built-in>")) ? s.line : 0;

> -  int col = context->show_column ? s.column : 0;

> +  int line = 0;

> +  int col = -1;

> +  if (strcmp (file, N_("<built-in>")))

> +    {

> +      line = s.line;

> +      if (context->show_column)

> +	col = diagnostic_converted_column (context, s);

> +    }

>  

>    const char *line_col = maybe_line_and_column (line, col);

>    return build_message_string ("%s%s%s:%s", locus_cs, file,

> @@ -650,14 +690,20 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)

>        if (! MAIN_FILE_P (map))

>  	{

>  	  bool first = true;

> +	  expanded_location s = {};

>  	  do

>  	    {

>  	      where = linemap_included_from (map);

>  	      map = linemap_included_from_linemap (line_table, map);

> -	      const char *line_col

> -		= maybe_line_and_column (SOURCE_LINE (map, where),

> -					 first && context->show_column

> -					 ? SOURCE_COLUMN (map, where) : 0);

> +	      s.file = LINEMAP_FILE (map);

> +	      s.line = SOURCE_LINE (map, where);

> +	      int col = -1;

> +	      if (first && context->show_column)

> +		{

> +		  s.column = SOURCE_COLUMN (map, where);

> +		  col = diagnostic_converted_column (context, s);

> +		}

> +	      const char *line_col = maybe_line_and_column (s.line, col);

>  	      static const char *const msgs[] =

>  		{

>  		 N_("In file included from"),

> @@ -666,7 +712,7 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)

>  	      unsigned index = !first;

>  	      pp_verbatim (context->printer, "%s%s %r%s%s%R",

>  			   first ? "" : ",\n", _(msgs[index]),

> -			   "locus", LINEMAP_FILE (map), line_col);

> +			   "locus", s.file, line_col);

>  	      first = false;

>  	    }

>  	  while (! MAIN_FILE_P (map));

> @@ -2042,10 +2088,15 @@ test_print_parseable_fixits_replace ()

>  static void

>  assert_location_text (const char *expected_loc_text,

>  		      const char *filename, int line, int column,

> -		      bool show_column)

> +		      bool show_column,

> +		      int origin = 1,

> +		      enum diagnostics_column_unit column_unit

> +			= DIAGNOSTICS_COLUMN_UNIT_BYTE)

>  {

>    test_diagnostic_context dc;

>    dc.show_column = show_column;

> +  dc.column_unit = column_unit;

> +  dc.column_origin = origin;

>  

>    expanded_location xloc;

>    xloc.file = filename;

> @@ -2069,7 +2120,10 @@ test_diagnostic_get_location_text ()

>    assert_location_text ("PROGNAME:", NULL, 0, 0, true);

>    assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);

>    assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);

> -  assert_location_text ("foo.c:42:", "foo.c", 42, 0, true);

> +  assert_location_text ("foo.c:42:9:", "foo.c", 42, 10, true, 0);

> +  assert_location_text ("foo.c:42:1010:", "foo.c", 42, 10, true, 1001);

> +  for (int origin = 0; origin != 2; ++origin)

> +    assert_location_text ("foo.c:42:", "foo.c", 42, 0, true, origin);

>    assert_location_text ("foo.c:", "foo.c", 0, 10, true);

>    assert_location_text ("foo.c:42:", "foo.c", 42, 10, false);

>    assert_location_text ("foo.c:", "foo.c", 0, 10, false);

> @@ -2077,6 +2131,41 @@ test_diagnostic_get_location_text ()

>    maybe_line_and_column (INT_MAX, INT_MAX);

>    maybe_line_and_column (INT_MIN, INT_MIN);

>  

> +  {

> +    /* In order to test display columns vs byte columns, we need to create a

> +       file for location_get_source_line() to read.  */

> +

> +    const char *const content = "smile \xf0\x9f\x98\x82\n";

> +    const int line_bytes = strlen (content) - 1;

> +    const int def_tabstop = 8;

> +    const int display_width = cpp_display_width (content, line_bytes,

> +						 def_tabstop);

> +    ASSERT_EQ (line_bytes - 2, display_width);

> +    temp_source_file tmp (SELFTEST_LOCATION, ".c", content);

> +    const char *const fname = tmp.get_filename ();

> +    const int buf_len = strlen (fname) + 16;

> +    char *const expected = XNEWVEC (char, buf_len);

> +

> +    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes);

> +    assert_location_text (expected, fname, 1, line_bytes, true,

> +			  1, DIAGNOSTICS_COLUMN_UNIT_BYTE);

> +

> +    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes - 1);

> +    assert_location_text (expected, fname, 1, line_bytes, true,

> +			  0, DIAGNOSTICS_COLUMN_UNIT_BYTE);

> +

> +    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width);

> +    assert_location_text (expected, fname, 1, line_bytes, true,

> +			  1, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);

> +

> +    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width - 1);

> +    assert_location_text (expected, fname, 1, line_bytes, true,

> +			  0, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);

> +

> +    XDELETEVEC (expected);

> +  }

> +

> +

>    progname = old_progname;

>  }

>  

> diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h

> index 307dbcfb34a..75706c5f4d8 100644

> --- a/gcc/diagnostic.h

> +++ b/gcc/diagnostic.h

> @@ -24,6 +24,20 @@ along with GCC; see the file COPYING3.  If not see

>  #include "pretty-print.h"

>  #include "diagnostic-core.h"

>  

> +/* An enum for controlling what units to use for the column number

> +   when diagnostics are output, used by the -fdiagnostics-column-unit option.

> +   Tabs will be expanded or not according to the value of -ftabstop.  The origin

> +   (default 1) is controlled by -fdiagnostics-column-origin.  */

> +

> +enum diagnostics_column_unit

> +{

> +  /* The new default: display columns.  */

> +  DIAGNOSTICS_COLUMN_UNIT_DISPLAY,

> +

> +  /* The historical behavior: simple bytes.  */

> +  DIAGNOSTICS_COLUMN_UNIT_BYTE

> +};

> +

>  /* Enum for overriding the standard output format.  */

>  

>  enum diagnostics_output_format

> @@ -280,6 +294,15 @@ struct diagnostic_context

>       rest of the diagnostic.  */

>    bool parseable_fixits_p;

>  

> +  /* What units to use when outputting the column number.  */

> +  enum diagnostics_column_unit column_unit;

> +

> +  /* The origin for the column number (1-based or 0-based typically).  */

> +  int column_origin;

> +

> +  /* The size of the tabstop for tab expansion.  */

> +  int tabstop;

> +

>    /* If non-NULL, an edit_context to which fix-it hints should be

>       applied, for generating patches.  */

>    edit_context *edit_context_ptr;

> @@ -458,6 +481,8 @@ diagnostic_same_line (const diagnostic_context *context,

>  }

>  

>  extern const char *diagnostic_get_color_for_kind (diagnostic_t kind);

> +extern int diagnostic_converted_column (diagnostic_context *context,

> +					expanded_location s);

>  

>  /* Pure text formatting support functions.  */

>  extern char *file_name_as_prefix (diagnostic_context *, const char *);

> @@ -470,6 +495,7 @@ extern void diagnostic_output_format_init (diagnostic_context *,

>  /* Compute the number of digits in the decimal representation of an integer.  */

>  extern int num_digits (int);

>  

> -extern json::value *json_from_expanded_location (location_t loc);

> +extern json::value *json_from_expanded_location (diagnostic_context *context,

> +						 location_t loc);

>  

>  #endif /* ! GCC_DIAGNOSTIC_H */

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi

> index 06a04e3d7dd..f463275bc8b 100644

> --- a/gcc/doc/invoke.texi

> +++ b/gcc/doc/invoke.texi

> @@ -292,7 +292,9 @@ Objective-C and Objective-C++ Dialects}.

>  -fdiagnostics-show-template-tree  -fno-elide-type @gol

>  -fdiagnostics-path-format=@r{[}none@r{|}separate-events@r{|}inline-events@r{]} @gol

>  -fdiagnostics-show-path-depths @gol

> --fno-show-column}

> +-fno-show-column @gol

> +-fdiagnostics-column-unit=@r{[}display@r{|}byte@r{]} @gol

> +-fdiagnostics-column-origin=@var{origin}}

>  

>  @item Warning Options

>  @xref{Warning Options,,Options to Request or Suppress Warnings}.

> @@ -4729,6 +4731,31 @@ Do not print column numbers in diagnostics.  This may be necessary if

>  diagnostics are being scanned by a program that does not understand the

>  column numbers, such as @command{dejagnu}.

>  

> +@item -fdiagnostics-column-unit=@var{UNIT}

> +@opindex fdiagnostics-column-unit

> +Select the units for the column number.  This affects traditional diagnostics

> +(in the absence of @option{-fno-show-column}), as well as JSON format

> +diagnostics if requested.

> +

> +The default @var{UNIT}, @samp{display}, considers the number of display

> +columns occupied by each character.  This may be larger than the number

> +of bytes required to encode the character, in the case of tab

> +characters, or it may be smaller, in the case of multibyte characters.

> +For example, the character ``@U{03C0}'' occupies one display column,

> +and its UTF-8 encoding requires two bytes; the character ``@U{1F642}''

> +occupies two display columns, and its UTF-8 encoding requires four

> +bytes.

> +

> +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte

> +count in all cases, as was traditionally output by GCC prior to version 11.1.0.
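
For readers less familiar with UTF-8, the byte counts quoted in the paragraph above follow from the lead byte of each sequence. The sketch below is illustrative only: utf8_sequence_length is a hypothetical helper, not a libcpp function, and it covers only the byte count. The display width additionally depends on a per-character width table, which is not reproduced here.

```c
#include <assert.h>

/* Hypothetical helper: number of bytes in the UTF-8 sequence introduced
   by LEAD.  A byte that cannot start a multibyte sequence (ASCII, a
   continuation byte, or an invalid lead) is treated as a single byte.  */
static int
utf8_sequence_length (unsigned char lead)
{
  if ((lead & 0xE0) == 0xC0)
    return 2;	/* 110xxxxx */
  if ((lead & 0xF0) == 0xE0)
    return 3;	/* 1110xxxx */
  if ((lead & 0xF8) == 0xF0)
    return 4;	/* 11110xxx */
  return 1;
}

/* Usage example with the characters named in the documentation.  */
void
demo_utf8_sequence_length (void)
{
  assert (utf8_sequence_length (0xCF) == 2);	/* U+03C0, "\xcf\x80" */
  assert (utf8_sequence_length (0xF0) == 4);	/* U+1F642 */
  assert (utf8_sequence_length ('G') == 1);	/* plain ASCII */
}
```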

> +

> +@item -fdiagnostics-column-origin=@var{ORIGIN}

> +@opindex fdiagnostics-column-origin

> +Select the origin for column numbers, i.e. the column number assigned to the

> +first column.  The default value of 1 corresponds to traditional GCC behavior

> +and to the GNU Coding Standards.  Some utilities number columns from 0 and may

> +work better with an origin of 0; any non-negative value may be specified.

> +

>  @item -fdiagnostics-format=@var{FORMAT}

>  @opindex fdiagnostics-format

>  Select a different format for printing diagnostics.

> @@ -4764,11 +4791,15 @@ might be printed in JSON form (after formatting) like this:

>          "locations": [

>              @{

>                  "caret": @{

> +		    "display-column": 3,

> +		    "byte-column": 3,

>                      "column": 3,

>                      "file": "misleading-indentation.c",

>                      "line": 15

>                  @},

>                  "finish": @{

> +		    "display-column": 4,

> +		    "byte-column": 4,

>                      "column": 4,

>                      "file": "misleading-indentation.c",

>                      "line": 15

> @@ -4784,6 +4815,8 @@ might be printed in JSON form (after formatting) like this:

>                  "locations": [

>                      @{

>                          "caret": @{

> +			    "display-column": 5,

> +			    "byte-column": 5,

>                              "column": 5,

>                              "file": "misleading-indentation.c",

>                              "line": 17

> @@ -4793,6 +4826,7 @@ might be printed in JSON form (after formatting) like this:

>                  "message": "...this statement, but the latter is @dots{}"

>              @}

> -        ]

> +        ],

> +	"column-origin": 1

>      @},

>      @dots{}

>  ]

> @@ -4805,10 +4839,34 @@ A diagnostic has a @code{kind}.  If this is @code{warning}, then there is

>  an @code{option} key describing the command-line option controlling the

>  warning.

>  

> -A diagnostic can contain zero or more locations.  Each location has up

> -to three positions within it: a @code{caret} position and optional

> -@code{start} and @code{finish} positions.  A location can also have

> -an optional @code{label} string.  For example, this error:

> +A diagnostic can contain zero or more locations.  Each location has an

> +optional @code{label} string and up to three positions within it: a

> +@code{caret} position and optional @code{start} and @code{finish} positions.

> +A position is described by a @code{file} name, a @code{line} number, and

> +three numbers indicating a column position:

> +@itemize @bullet

> +

> +@item

> +@code{display-column} counts display columns, accounting for tabs and

> +multibyte characters.

> +

> +@item

> +@code{byte-column} counts raw bytes.

> +

> +@item

> +@code{column} is equal to one of

> +the previous two, as dictated by the @option{-fdiagnostics-column-unit}

> +option.

> +

> +@end itemize

> +All three columns are relative to the origin specified by

> +@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may

> +be set, for instance, to 0 for compatibility with other utilities that

> +number columns from 0.  The column origin is recorded in the JSON output in

> +the @code{column-origin} tag.  In the remaining examples below, the extra

> +column number outputs have been omitted for brevity.

> +

> +For example, this error:

>  

>  @smallexample

>  bad-binary-ops.c:64:23: error: invalid operands to binary + (have 'S' @{aka

> diff --git a/gcc/input.c b/gcc/input.c

> index dd1d23df2f7..d573b90341a 100644

> --- a/gcc/input.c

> +++ b/gcc/input.c

> @@ -913,7 +913,7 @@ make_location (location_t caret, source_range src_range)

>     source line in order to calculate the display width.  If that cannot be done

>     for any reason, then returns the byte column as a fallback.  */

>  int

> -location_compute_display_column (expanded_location exploc)

> +location_compute_display_column (expanded_location exploc, int tabstop)

>  {

>    if (!(exploc.file && *exploc.file && exploc.line && exploc.column))

>      return exploc.column;

> @@ -921,7 +921,7 @@ location_compute_display_column (expanded_location exploc)

>    /* If line is NULL, this function returns exploc.column which is the

>       desired fallback.  */

>    return cpp_byte_column_to_display_column (line.get_buffer (), line.length (),

> -					    exploc.column);

> +					    exploc.column, tabstop);

>  }

>  

>  /* Dump statistics to stderr about the memory usage of the line_table

> @@ -3608,33 +3608,46 @@ test_line_offset_overflow ()

>  

>  void test_cpp_utf8 ()

>  {

> +  const int def_tabstop = 8;

>    /* Verify that wcwidth of invalid UTF-8 or control bytes is 1.  */

>    {

> -    int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8);

> +    int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8, def_tabstop);

>      ASSERT_EQ (8, w_bad);

> -    int w_ctrl = cpp_display_width ("\r\t\n\v\0\1", 6);

> -    ASSERT_EQ (6, w_ctrl);

> +    int w_ctrl = cpp_display_width ("\r\n\v\0\1", 5, def_tabstop);

> +    ASSERT_EQ (5, w_ctrl);

>    }

>  

>    /* Verify that wcwidth of valid UTF-8 is as expected.  */

>    {

> -    const int w_pi = cpp_display_width ("\xcf\x80", 2);

> +    const int w_pi = cpp_display_width ("\xcf\x80", 2, def_tabstop);

>      ASSERT_EQ (1, w_pi);

> -    const int w_emoji = cpp_display_width ("\xf0\x9f\x98\x82", 4);

> +    const int w_emoji = cpp_display_width ("\xf0\x9f\x98\x82", 4, def_tabstop);

>      ASSERT_EQ (2, w_emoji);

> -    const int w_umlaut_precomposed = cpp_display_width ("\xc3\xbf", 2);

> +    const int w_umlaut_precomposed = cpp_display_width ("\xc3\xbf", 2,

> +							def_tabstop);

>      ASSERT_EQ (1, w_umlaut_precomposed);

> -    const int w_umlaut_combining = cpp_display_width ("y\xcc\x88", 3);

> +    const int w_umlaut_combining = cpp_display_width ("y\xcc\x88", 3,

> +						      def_tabstop);

>      ASSERT_EQ (1, w_umlaut_combining);

> -    const int w_han = cpp_display_width ("\xe4\xb8\xba", 3);

> +    const int w_han = cpp_display_width ("\xe4\xb8\xba", 3, def_tabstop);

>      ASSERT_EQ (2, w_han);

> -    const int w_ascii = cpp_display_width ("GCC", 3);

> +    const int w_ascii = cpp_display_width ("GCC", 3, def_tabstop);

>      ASSERT_EQ (3, w_ascii);

>      const int w_mixed = cpp_display_width ("\xcf\x80 = 3.14 \xf0\x9f\x98\x82"

> -					   "\x9f! \xe4\xb8\xba y\xcc\x88", 24);

> +					   "\x9f! \xe4\xb8\xba y\xcc\x88",

> +					   24, def_tabstop);

>      ASSERT_EQ (18, w_mixed);

>    }

>  

> +  /* Verify that display width properly expands tabs.  */

> +  {

> +    const char *tstr = "\tabc\td";

> +    ASSERT_EQ (6, cpp_display_width (tstr, 6, 1));

> +    ASSERT_EQ (10, cpp_display_width (tstr, 6, 3));

> +    ASSERT_EQ (17, cpp_display_width (tstr, 6, 8));

> +    ASSERT_EQ (1, cpp_display_column_to_byte_column (tstr, 6, 7, 8));

> +  }
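
The three expected widths in the new tab test can be reproduced with a simplified model in which a tab advances to the next multiple of the tabstop and every other byte occupies one column. The sketch below assumes ASCII-only input; ascii_display_width is a hypothetical helper and deliberately ignores the multibyte handling that cpp_display_width performs.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, ASCII-only sketch of tab-aware display width (not
   cpp_display_width itself): a tab advances to the next multiple of
   TABSTOP; every other byte counts as one column.  TABSTOP must be
   at least 1.  */
static int
ascii_display_width (const char *s, size_t len, int tabstop)
{
  int width = 0;
  for (size_t i = 0; i < len; i++)
    {
      if (s[i] == '\t')
	width += tabstop - (width % tabstop);
      else
	width++;
    }
  return width;
}

/* Usage example with the string from the selftest above.  */
void
demo_ascii_display_width (void)
{
  const char *tstr = "\tabc\td";
  assert (ascii_display_width (tstr, 6, 1) == 6);
  assert (ascii_display_width (tstr, 6, 3) == 10);
  assert (ascii_display_width (tstr, 6, 8) == 17);
}
```

With a tabstop of 8, the first tab covers display columns 1 through 8, which is consistent with the test's expectation that display column 7 maps back to byte column 1.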

> +

>    /* Verify that cpp_byte_column_to_display_column can go past the end,

>       and similar edge cases.  */

>    {

> @@ -3645,10 +3658,13 @@ void test_cpp_utf8 ()

>        /* 111122223456

>  	 Byte columns.  */

>  

> -    ASSERT_EQ (5, cpp_display_width (str, 6));

> -    ASSERT_EQ (105, cpp_byte_column_to_display_column (str, 6, 106));

> -    ASSERT_EQ (10000, cpp_byte_column_to_display_column (NULL, 0, 10000));

> -    ASSERT_EQ (0, cpp_byte_column_to_display_column (NULL, 10000, 0));

> +    ASSERT_EQ (5, cpp_display_width (str, 6, def_tabstop));

> +    ASSERT_EQ (105,

> +	       cpp_byte_column_to_display_column (str, 6, 106, def_tabstop));

> +    ASSERT_EQ (10000,

> +	       cpp_byte_column_to_display_column (NULL, 0, 10000, def_tabstop));

> +    ASSERT_EQ (0,

> +	       cpp_byte_column_to_display_column (NULL, 10000, 0, def_tabstop));

>    }

>  

>    /* Verify that cpp_display_column_to_byte_column can go past the end,

> @@ -3662,21 +3678,25 @@ void test_cpp_utf8 ()

>        /* 000000000000000000000000000000000111111

>  	 111122223333444456666777788889999012345

>  	 Byte columns.  */

> -    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 2));

> -    ASSERT_EQ (15, cpp_display_column_to_byte_column (str, 15, 11));

> -    ASSERT_EQ (115, cpp_display_column_to_byte_column (str, 15, 111));

> -    ASSERT_EQ (10000, cpp_display_column_to_byte_column (NULL, 0, 10000));

> -    ASSERT_EQ (0, cpp_display_column_to_byte_column (NULL, 10000, 0));

> +    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 2, def_tabstop));

> +    ASSERT_EQ (15,

> +	       cpp_display_column_to_byte_column (str, 15, 11, def_tabstop));

> +    ASSERT_EQ (115,

> +	       cpp_display_column_to_byte_column (str, 15, 111, def_tabstop));

> +    ASSERT_EQ (10000,

> +	       cpp_display_column_to_byte_column (NULL, 0, 10000, def_tabstop));

> +    ASSERT_EQ (0,

> +	       cpp_display_column_to_byte_column (NULL, 10000, 0, def_tabstop));

>  

>      /* Verify that we do not interrupt a UTF-8 sequence.  */

> -    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 1));

> +    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 1, def_tabstop));

>  

>      for (int byte_col = 1; byte_col <= 15; ++byte_col)

>        {

> -	const int disp_col = cpp_byte_column_to_display_column (str, 15,

> -								byte_col);

> -	const int byte_col2 = cpp_display_column_to_byte_column (str, 15,

> -								 disp_col);

> +	const int disp_col

> +	  = cpp_byte_column_to_display_column (str, 15, byte_col, def_tabstop);

> +	const int byte_col2

> +	  = cpp_display_column_to_byte_column (str, 15, disp_col, def_tabstop);

>  

>  	/* If we ask for the display column in the middle of a UTF-8

>  	   sequence, it will return the length of the partial sequence,

> diff --git a/gcc/input.h b/gcc/input.h

> index df48ce63ef9..4790a571c6a 100644

> --- a/gcc/input.h

> +++ b/gcc/input.h

> @@ -38,7 +38,9 @@ STATIC_ASSERT (BUILTINS_LOCATION < RESERVED_LOCATION_COUNT);

>  

>  extern bool is_location_from_builtin_token (location_t);

>  extern expanded_location expand_location (location_t);

> -extern int location_compute_display_column (expanded_location);

> +

> +extern int location_compute_display_column (expanded_location exploc,

> +					    int tabstop);

>  

>  /* A class capturing the bounds of a buffer, to allow for run-time

>     bounds-checking in a checked build.  */

> diff --git a/gcc/opts.c b/gcc/opts.c

> index 340d99434b3..525f44d079f 100644

> --- a/gcc/opts.c

> +++ b/gcc/opts.c

> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see

>  #include "opt-suggestions.h"

>  #include "diagnostic-color.h"

>  #include "selftest.h"

> +#include "cpplib.h"

>  

>  static void set_Wstrict_aliasing (struct gcc_options *opts, int onoff);

>  

> @@ -2404,6 +2405,14 @@ common_handle_option (struct gcc_options *opts,

>        dc->parseable_fixits_p = value;

>        break;

>  

> +    case OPT_fdiagnostics_column_unit_:

> +      dc->column_unit = (enum diagnostics_column_unit)value;

> +      break;

> +

> +    case OPT_fdiagnostics_column_origin_:

> +      dc->column_origin = value;

> +      break;

> +

>      case OPT_fdiagnostics_show_cwe:

>        dc->show_cwe = value;

>        break;

> @@ -2792,6 +2801,12 @@ common_handle_option (struct gcc_options *opts,

>        check_alignment_argument (loc, arg, "functions");

>        break;

>  

> +    case OPT_ftabstop_:

> +      /* It is documented that we silently ignore silly values.  */

> +      if (value >= 1 && value <= 100)

> +	dc->tabstop = value;

> +      break;

> +

>      default:

>        /* If the flag was handled in a standard way, assume the lack of

>  	 processing here is intentional.  */

> diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c

> index 870ba720c5f..2314ad42402 100644

> --- a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c

> +++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c

> @@ -36,20 +36,20 @@ int fn_6 (int a, int b, int c)

>  	/* ... */

>  	if ((err = foo (a)) != 0)

>  		goto fail;

> -	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */

> +	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */

>  		goto fail;

> -		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

> +		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

>  	if ((err = foo (c)) != 0)

>  		goto fail;

>  	/* ... */

>  

>  /* { dg-begin-multiline-output "" }

> -  if ((err = foo (b)) != 0)

> -  ^~

> +         if ((err = foo (b)) != 0)

> +         ^~

>     { dg-end-multiline-output "" } */

>  /* { dg-begin-multiline-output "" }

> -   goto fail;

> -   ^~~~

> +                 goto fail;

> +                 ^~~~

>     { dg-end-multiline-output "" } */

>  

>  fail:

> diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c

> index 5cdeba1cbba..202c6bc7fdf 100644

> --- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c

> +++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c

> @@ -65,9 +65,9 @@ int fn_6 (int a, int b, int c)

>  	/* ... */

>  	if ((err = foo (a)) != 0)

>  		goto fail;

> -	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */

> +	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */

>  		goto fail;

> -		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

> +		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

>  	if ((err = foo (c)) != 0)

>  		goto fail;

>  	/* ... */

> @@ -178,7 +178,7 @@ void fn_16_tabs (void)

>      while (flagA)

>        if (flagB) /* { dg-message "7: this 'if' clause does not guard..." } */

>  	foo (0);

> -	foo (1);/* { dg-message "2: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

> +	foo (1);/* { dg-message "9: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */

>  }

>  

>  void fn_17_spaces (void)

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c

> index 9359db48c17..740becb5548 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c

> @@ -8,17 +8,22 @@

>     We can't rely on any ordering of the keys.  */

>  

>  /* { dg-regexp "\"kind\": \"error\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \"#error message\"" } */

>  

>  /* { dg-regexp "\"caret\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 2" } */

> +/* { dg-regexp "\"display-column\": 2" } */

> +/* { dg-regexp "\"byte-column\": 2" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 6" } */

> +/* { dg-regexp "\"display-column\": 6" } */

> +/* { dg-regexp "\"byte-column\": 6" } */

>  

>  /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */

>  /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c

> index 557ccf8378b..2f24a6c6596 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c

> @@ -8,6 +8,7 @@

>     We can't rely on any ordering of the keys.  */

>  

>  /* { dg-regexp "\"kind\": \"warning\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \"#warning message\"" } */

>  /* { dg-regexp "\"option\": \"-Wcpp\"" } */

>  /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */

> @@ -16,11 +17,15 @@

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 2" } */

> +/* { dg-regexp "\"display-column\": 2" } */

> +/* { dg-regexp "\"byte-column\": 2" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 8" } */

> +/* { dg-regexp "\"display-column\": 8" } */

> +/* { dg-regexp "\"byte-column\": 8" } */

>  

>  /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */

>  /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c

> index 378205c5bf5..afe96a9048f 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c

> @@ -8,6 +8,7 @@

>     We can't rely on any ordering of the keys.  */

>  

>  /* { dg-regexp "\"kind\": \"error\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \"#warning message\"" } */

>  /* { dg-regexp "\"option\": \"-Werror=cpp\"" } */

>  /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */

> @@ -16,11 +17,15 @@

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 2" } */

> +/* { dg-regexp "\"display-column\": 2" } */

> +/* { dg-regexp "\"byte-column\": 2" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */

>  /* { dg-regexp "\"line\": 4" } */

>  /* { dg-regexp "\"column\": 8" } */

> +/* { dg-regexp "\"display-column\": 8" } */

> +/* { dg-regexp "\"byte-column\": 8" } */

>  

>  /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */

>  /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c

> index 2738be6548f..ae51091e0ea 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c

> @@ -24,15 +24,20 @@ int test (void)

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 5" } */

> +/* { dg-regexp "\"display-column\": 5" } */

> +/* { dg-regexp "\"byte-column\": 5" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 10" } */

> +/* { dg-regexp "\"display-column\": 10" } */

> +/* { dg-regexp "\"byte-column\": 10" } */

>  

>  /* The outer diagnostic.  */

>  

>  /* { dg-regexp "\"kind\": \"warning\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \"this 'if' clause does not guard...\"" } */

>  /* { dg-regexp "\"option\": \"-Wmisleading-indentation\"" } */

>  /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wmisleading-indentation\"" } */

> @@ -41,11 +46,15 @@ int test (void)

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */

>  /* { dg-regexp "\"line\": 6" } */

>  /* { dg-regexp "\"column\": 3" } */

> +/* { dg-regexp "\"display-column\": 3" } */

> +/* { dg-regexp "\"byte-column\": 3" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */

>  /* { dg-regexp "\"line\": 6" } */

>  /* { dg-regexp "\"column\": 4" } */

> +/* { dg-regexp "\"display-column\": 4" } */

> +/* { dg-regexp "\"byte-column\": 4" } */

>  

>  /* More from the nested diagnostic (we can't guarantee what order the

>     "file" keys are consumed).  */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c

> index f36e896d228..e0e9ce4be98 100644

> --- a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c

> +++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c

> @@ -13,6 +13,7 @@ int test (struct s *ptr)

>     We can't rely on any ordering of the keys.  */

>  

>  /* { dg-regexp "\"kind\": \"error\"" } */

> +/* { dg-regexp "\"column-origin\": 1" } */

>  /* { dg-regexp "\"message\": \".*\"" } */

>  

>  /* Verify fix-it hints.  */

> @@ -23,11 +24,15 @@ int test (struct s *ptr)

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 15" } */

> +/* { dg-regexp "\"display-column\": 15" } */

> +/* { dg-regexp "\"byte-column\": 15" } */

>  

>  /* { dg-regexp "\"next\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 21" } */

> +/* { dg-regexp "\"display-column\": 21" } */

> +/* { dg-regexp "\"byte-column\": 21" } */

>  

>  /* { dg-regexp "\"fixits\": \[\[\{\}, \]*\]" } */

>  

> @@ -35,11 +40,15 @@ int test (struct s *ptr)

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 15" } */

> +/* { dg-regexp "\"display-column\": 15" } */

> +/* { dg-regexp "\"byte-column\": 15" } */

>  

>  /* { dg-regexp "\"finish\": \{" } */

>  /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */

>  /* { dg-regexp "\"line\": 8" } */

>  /* { dg-regexp "\"column\": 20" } */

> +/* { dg-regexp "\"display-column\": 20" } */

> +/* { dg-regexp "\"byte-column\": 20" } */

>  

>  /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */

>  /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-1.c b/gcc/testsuite/c-c++-common/diagnostic-units-1.c

> new file mode 100644

> index 00000000000..8d38b7de03e

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-1.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 1 (via default)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-2.c b/gcc/testsuite/c-c++-common/diagnostic-units-2.c

> new file mode 100644

> index 00000000000..29a2edefd9f

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-2.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -Wmultichar" } */

> +

> +/* column units: display (via arg)

> +   column origin: 1 (via default)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "25: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */
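
For readers comparing diagnostic-units-1.c with diagnostic-units-2.c, the expected numbers can be reproduced by hand. Below is a minimal sketch, not the libcpp code touched by this patch: `byte_to_display_column` is a hypothetical helper, and it assumes plain ASCII input where every non-tab byte is one column wide, so multibyte and wide characters are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the libcpp implementation: map a 1-based byte
   column to a 1-based display column.  Assumes every non-tab byte is one
   column wide (plain ASCII).  */
int
byte_to_display_column (const char *line, int byte_col, int tabstop)
{
  assert (line != NULL && byte_col >= 1 && tabstop >= 1);

  int width = 0;  /* Display columns used by the bytes before BYTE_COL.  */
  for (int i = 0; i < byte_col - 1 && line[i] != '\0'; i++)
    {
      if (line[i] == '\t')
	width += tabstop - width % tabstop;  /* Advance to the next tab stop.  */
      else
	width += 1;
    }
  return width + 1;
}
```

With a tabstop of 8, the line that starts with a tab maps byte column 11 to display column 18, and the line with an internal tab maps byte column 19 to display column 25, which matches the two tests. With a tabstop of 9 the same sketch gives 19 and 28, as in diagnostic-units-8.c.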

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-3.c b/gcc/testsuite/c-c++-common/diagnostic-units-3.c

> new file mode 100644

> index 00000000000..714ee8f2de4

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-3.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=200 -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 1 (via default)

> +   tabstop: 8 (via fallback from overly large argument) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-4.c b/gcc/testsuite/c-c++-common/diagnostic-units-4.c

> new file mode 100644

> index 00000000000..f9c9da914b2

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-4.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 0 (via arg)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "10: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "18: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-5.c b/gcc/testsuite/c-c++-common/diagnostic-units-5.c

> new file mode 100644

> index 00000000000..99d5299a732

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-5.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */

> +

> +/* column units: display (via arg)

> +   column origin: 0 (via arg)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "17: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "24: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-6.c b/gcc/testsuite/c-c++-common/diagnostic-units-6.c

> new file mode 100644

> index 00000000000..c1e6e4ed477

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-6.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=100 -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 100 (via arg)

> +   tabstop: 8 (via default) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "110: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c1 = 'c1';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +        int c2 = 'c2'; /* { dg-warning "117: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c2 = 'c2';

> +                  ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +        int c3 = 	'c3'; /* { dg-warning "118: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +         int c3 =        'c3';

> +                         ^~~~

> +   { dg-end-multiline-output "" } */
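
The origin tests (diagnostic-units-4.c through diagnostic-units-6.c) look consistent with a plain offset applied after the unit conversion. A small sketch of that arithmetic, assuming the internal column is 1-based; `apply_column_origin` is a hypothetical name and is not taken from the patch.

```c
#include <assert.h>

/* Hypothetical helper: convert a 1-based column to the column that would be
   reported for a given origin.  An origin of 1 leaves the column unchanged.  */
int
apply_column_origin (int one_based_col, int origin)
{
  assert (one_based_col >= 1);
  return one_based_col - 1 + origin;
}
```

Under that assumption, byte columns 11, 18 and 19 become 10, 17 and 18 with an origin of 0, and 110, 117 and 118 with an origin of 100, which are the values the tests expect.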

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-7.c b/gcc/testsuite/c-c++-common/diagnostic-units-7.c

> new file mode 100644

> index 00000000000..dab221ae235

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-7.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */

> +

> +/* column units: bytes (via arg)

> +   column origin: 1 (via default)

> +   tabstop: 9 (via arg) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c1 = 'c1';

> +                   ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c2 = 'c2';

> +                   ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +         int c3 = 	'c3'; /* { dg-warning "20: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c3 =          'c3';

> +                            ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-8.c b/gcc/testsuite/c-c++-common/diagnostic-units-8.c

> new file mode 100644

> index 00000000000..d713b32dabc

> --- /dev/null

> +++ b/gcc/testsuite/c-c++-common/diagnostic-units-8.c

> @@ -0,0 +1,28 @@

> +/* { dg-do compile } */

> +/* { dg-additional-options "-fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */

> +

> +/* column units: display (via default)

> +   column origin: 1 (via default)

> +   tabstop: 9 (via arg) */

> +

> +/* This line starts with a tab.  */

> +	int c1 = 'c1'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c1 = 'c1';

> +                   ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces.  */

> +         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c2 = 'c2';

> +                   ^~~~

> +   { dg-end-multiline-output "" } */

> +

> +/* This line starts with <tabstop> spaces and has an internal tab after

> +   a space.  */

> +         int c3 = 	'c3'; /* { dg-warning "28: multi-character character constant" } */

> +/* { dg-begin-multiline-output "" }

> +          int c3 =          'c3';

> +                            ^~~~

> +   { dg-end-multiline-output "" } */

> diff --git a/gcc/testsuite/c-c++-common/missing-close-symbol.c b/gcc/testsuite/c-c++-common/missing-close-symbol.c

> index abeb83748c1..9f1de3d0c47 100644

> --- a/gcc/testsuite/c-c++-common/missing-close-symbol.c

> +++ b/gcc/testsuite/c-c++-common/missing-close-symbol.c

> @@ -24,9 +24,9 @@ void test_static_assert_different_line (void)

>    _Static_assert(sizeof(int) >= sizeof(char), /* { dg-message "to match this '\\('" } */

>  		 "msg"; /* { dg-error "expected '\\)' before ';' token" } */

>    /* { dg-begin-multiline-output "" }

> -    "msg";

> -         ^

> -         )

> +                  "msg";

> +                       ^

> +                       )

>       { dg-end-multiline-output "" } */

>    /* { dg-begin-multiline-output "" }

>     _Static_assert(sizeof(int) >= sizeof(char),
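
The updated expectation above suggests that the quoted source line is now printed with its leading tabs expanded to spaces. A rough sketch of such an expansion follows, assuming ASCII input; `expand_tabs` is a hypothetical helper written for illustration and is not how diagnostic-show-locus does it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy SRC into DST, replacing each tab with spaces up
   to the next multiple of TABSTOP.  Returns the expanded length, or
   (size_t) -1 if DST (of DST_SIZE bytes) is too small.  ASCII only.  */
size_t
expand_tabs (const char *src, char *dst, size_t dst_size, size_t tabstop)
{
  assert (src != NULL && dst != NULL && dst_size >= 1 && tabstop >= 1);

  size_t out = 0;
  for (; *src != '\0'; src++)
    {
      /* A tab fills up to the next tab stop; anything else takes one column.  */
      size_t n = (*src == '\t') ? tabstop - out % tabstop : 1;
      if (out + n >= dst_size)
	return (size_t) -1;  /* No room left for the text plus the NUL.  */
      if (*src == '\t')
	memset (dst + out, ' ', n);
      else
	dst[out] = *src;
      out += n;
    }
  dst[out] = '\0';
  return out;
}
```

The line in question appears to begin with two tabs and a space; at a tabstop of 8 the sketch turns that prefix into 17 spaces, so the quoted string starts further to the right than it did when each tab was printed as a single column.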

> diff --git a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C

> index fab5849dfc7..ebbf3001055 100644

> --- a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C

> +++ b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C

> @@ -33,10 +33,10 @@ int test_2 (void)

>             ~~~~~~~~~~~~~~~~

>                           |

>                           s

> -    + some_other_function ());

> -    ^ ~~~~~~~~~~~~~~~~~~~~~~

> -                          |

> -                          t

> +           + some_other_function ());

> +           ^ ~~~~~~~~~~~~~~~~~~~~~~

> +                                 |

> +                                 t

>     { dg-end-multiline-output "" } */

>  }

>  

> diff --git a/gcc/testsuite/g++.dg/parse/error4.C b/gcc/testsuite/g++.dg/parse/error4.C

> index 792bf4dc063..fe8de73790d 100644

> --- a/gcc/testsuite/g++.dg/parse/error4.C

> +++ b/gcc/testsuite/g++.dg/parse/error4.C

> @@ -7,4 +7,4 @@ struct X {

>  		 int);

>  };

>  

> -// { dg-error "4:'itn' has not been declared" "" { target *-*-* } 6 }

> +// { dg-error "18:'itn' has not been declared" "" { target *-*-* } 6 }

> diff --git a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C

> index 96ebb71645c..d2b37a5122d 100644

> --- a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C

> +++ b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C

> @@ -9,13 +9,13 @@ class A {

>  	int	h;

>  	A() { i=10; j=20; }

>  	virtual void f1() { printf("i=%d j=%d\n",i,j); }

> -	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "9:virtual functions cannot be friends" }

> +	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "16:virtual functions cannot be friends" }

>  };

>  

>  class B : public A {

>      public:

>  	virtual void f1() { printf("i=%d j=%d\n",i,j); }// { dg-error "" }  member.*// ERROR -  member.*

> -	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "9:virtual functions cannot be friends" }

> +	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "16:virtual functions cannot be friends" }

>  // { dg-error "private" "" { target *-*-* } .-1 }

>  };

>  

> diff --git a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C

> index b438543d445..bbc9e51aff6 100644

> --- a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C

> +++ b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C

> @@ -12,5 +12,5 @@ int

>  main()

>  {

>  	C<char*>	c;

> -	char*		p = Z(c.O); //{ dg-error "13:'Z' was not declared" } ambiguous c.O

> +	char*		p = Z(c.O); //{ dg-error "29:'Z' was not declared" } ambiguous c.O

>  }

> diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C

> index 6dc2c55be58..b98e8da6b1e 100644

> --- a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C

> +++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C

> @@ -48,8 +48,8 @@ ostream& operator<<(ostream& os, Graph<VertexType,EdgeType>& G)

>  

>          // The compiler does not like this line!!!!!!

>          typename Graph<VertexType, EdgeType>::Successor::iterator

> -	  startN = G[i].second.begin(), // { dg-error "14:no match" } no index operator

> -	  endN   = G[i].second.end();  // { dg-error "14:no match" } no index operator

> +	  startN = G[i].second.begin(), // { dg-error "21:no match" } no index operator

> +	  endN   = G[i].second.end();  // { dg-error "21:no match" } no index operator

>  

>          while(startN != endN)

>          {

> diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c

> index c5ff96e5644..51190c92391 100644

> --- a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c

> +++ b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c

> @@ -288,7 +288,7 @@ int test_3 (int x, int y)

>      |      |     ~~~~~~~~~~

>      |      |     |

>      |      |     (4) ...to here

> -    |   NN |      to dereference it above

> +    |   NN |                    to dereference it above

>      |   NN |   return *ptr;

>      |      |          ~~~~

>      |      |          |

> diff --git a/gcc/testsuite/gcc.dg/bad-binary-ops.c b/gcc/testsuite/gcc.dg/bad-binary-ops.c

> index 46c158e6a5f..45668be0a29 100644

> --- a/gcc/testsuite/gcc.dg/bad-binary-ops.c

> +++ b/gcc/testsuite/gcc.dg/bad-binary-ops.c

> @@ -35,10 +35,10 @@ int test_2 (void)

>             ~~~~~~~~~~~~~~~~

>             |

>             struct s

> -    + some_other_function ());

> -    ^ ~~~~~~~~~~~~~~~~~~~~~~

> -      |

> -      struct t

> +           + some_other_function ());

> +           ^ ~~~~~~~~~~~~~~~~~~~~~~

> +             |

> +             struct t

>     { dg-end-multiline-output "" } */

>  }

>  

> diff --git a/gcc/testsuite/gcc.dg/format/branch-1.c b/gcc/testsuite/gcc.dg/format/branch-1.c

> index 1782064645e..4ea39b52b2e 100644

> --- a/gcc/testsuite/gcc.dg/format/branch-1.c

> +++ b/gcc/testsuite/gcc.dg/format/branch-1.c

> @@ -10,7 +10,7 @@ foo (long l, int nfoo)

>  {

>    printf ((nfoo > 1) ? "%d foos" : "%d foo", nfoo);

>    printf ((l > 1) ? "%d foos" /* { dg-warning "23:int" "wrong type in conditional expr" } */

> -	          : "%d foo", l); /* { dg-warning "16:int" "wrong type in conditional expr" } */

> +	          : "%d foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */

>    printf ((l > 1) ? "%ld foos" : "%d foo", l); /* { dg-warning "36:int" "wrong type in conditional expr" } */

>    printf ((l > 1) ? "%d foos" : "%ld foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */

>    /* Should allow one case to have extra arguments.  */

> diff --git a/gcc/testsuite/gcc.dg/format/pr79210.c b/gcc/testsuite/gcc.dg/format/pr79210.c

> index 71f5dd6e082..6bdabdf21ec 100644

> --- a/gcc/testsuite/gcc.dg/format/pr79210.c

> +++ b/gcc/testsuite/gcc.dg/format/pr79210.c

> @@ -20,4 +20,4 @@ LPFC_VPORT_ATTR_R(peer_port_login,

>  		  "Allow peer ports on the same physical port to login to each "

>  		  "other.");

>  

> -/* { dg-warning "6: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */

> +/* { dg-warning "20: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */

> diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c

> index 03b78042107..d7691e4be51 100644

> --- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c

> +++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c

> @@ -540,15 +540,15 @@ void test_builtin_types_compatible_p (unsigned long i)

>    __emit_expression_range (0,

>  			   f (i) + __builtin_types_compatible_p (long, int)); /* { dg-warning "range" } */

>  /* { dg-begin-multiline-output "" }

> -       f (i) + __builtin_types_compatible_p (long, int));

> -       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> +                            f (i) + __builtin_types_compatible_p (long, int));

> +                            ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

>     { dg-end-multiline-output "" } */

>  

>    __emit_expression_range (0,

>  			   __builtin_types_compatible_p (long, int) + f (i)); /* { dg-warning "range" } */

>  /* { dg-begin-multiline-output "" }

> -       __builtin_types_compatible_p (long, int) + f (i));

> -       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~

> +                            __builtin_types_compatible_p (long, int) + f (i));

> +                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~

>     { dg-end-multiline-output "" } */

>  }

>  

> @@ -671,8 +671,8 @@ void test_multiple_ordinary_maps (void)

>  /* { dg-begin-multiline-output "" }

>     __emit_expression_range (0, foo (0,

>                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> -        "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));

> -        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> +                                    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));

> +                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

>     { dg-end-multiline-output "" } */

>  

>    /* Another expression that transitions between ordinary maps; this

> @@ -685,8 +685,8 @@ void test_multiple_ordinary_maps (void)

>  /* { dg-begin-multiline-output "" }

>     __emit_expression_range (0, foo (0, "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
7890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789",

>                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> -        0));

> -        ~~                      

> +                                    0));

> +                                    ~~

>     { dg-end-multiline-output "" } */

>  }

>  

> diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c

> index ac4fa1b52bd..4cba87be2ae 100644

> --- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c

> +++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c

> @@ -335,11 +335,11 @@ pr87652 (const char *stem, int counter)

>  /* { dg-error "unable to read substring location: unable to read source line" "" { target c } 329 } */

>  /* { dg-error "unable to read substring location: failed to get ordinary maps" "" { target c++ } 329 } */

>  /* { dg-begin-multiline-output "" }

> -     __emit_string_literal_range(__FILE__":%5d: " format, \

> +     __emit_string_literal_range(__FILE__":%5d: " format,        \

>                                   ^~~~~~~~

>       { dg-end-multiline-output "" { target c } } */

>  /* { dg-begin-multiline-output "" }

> -     __emit_string_literal_range(__FILE__":%5d: " format, \

> +     __emit_string_literal_range(__FILE__":%5d: " format,        \

>                                   ^

>       { dg-end-multiline-output "" { target c++ } } */

>  

> diff --git a/gcc/testsuite/gcc.dg/redecl-4.c b/gcc/testsuite/gcc.dg/redecl-4.c

> index 8f124886da8..2c214bb02c7 100644

> --- a/gcc/testsuite/gcc.dg/redecl-4.c

> +++ b/gcc/testsuite/gcc.dg/redecl-4.c

> @@ -15,7 +15,7 @@ f (void)

>      /* Should get format warnings even though the built-in declaration

>         isn't "visible".  */

>      printf (

> -	    "%s", 1); /* { dg-warning "8:format" } */

> +	    "%s", 1); /* { dg-warning "15:format" } */

>      /* The type of strcmp here should have no prototype.  */

>      if (0)

>        strcmp (1);

> diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90

> index 7fade1f65fc..606fe0f891a 100644

> --- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90

> +++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90

> @@ -8,17 +8,22 @@

>  ! We can't rely on any ordering of the keys.

>  

>  ! { dg-regexp "\"kind\": \"error\"" }

> +! { dg-regexp "\"column-origin\": 1" }

>  ! { dg-regexp "\"message\": \"#error message\"" }

>  

>  ! { dg-regexp "\"caret\": \{" }

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 2" }

> +! { dg-regexp "\"display-column\": 2" }

> +! { dg-regexp "\"byte-column\": 2" }

>  

>  ! { dg-regexp "\"finish\": \{" }

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 6" }

> +! { dg-regexp "\"display-column\": 6" }

> +! { dg-regexp "\"byte-column\": 6" }

>  

>  ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }

>  ! { dg-regexp "\"children\": \[\[\]\[\]\]" }

> diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90

> index bebcf68d431..56615f0ca5a 100644

> --- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90

> +++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90

> @@ -8,6 +8,7 @@

>  ! We can't rely on any ordering of the keys. 

>  

>  ! { dg-regexp "\"kind\": \"warning\"" }

> +! { dg-regexp "\"column-origin\": 1" }

>  ! { dg-regexp "\"message\": \"#warning message\"" }

>  ! { dg-regexp "\"option\": \"-Wcpp\"" }

>  ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }

> @@ -16,11 +17,15 @@

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 2" }

> +! { dg-regexp "\"display-column\": 2" }

> +! { dg-regexp "\"byte-column\": 2" }

>  

>  ! { dg-regexp "\"finish\": \{" }

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 8" }

> +! { dg-regexp "\"display-column\": 8" }

> +! { dg-regexp "\"byte-column\": 8" }

>  

>  ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }

>  ! { dg-regexp "\"children\": \[\[\]\[\]\]" }

> diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90

> index 7ab78eb570b..50214759091 100644

> --- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90

> +++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90

> @@ -8,6 +8,7 @@

>  ! We can't rely on any ordering of the keys.

>  

>  ! { dg-regexp "\"kind\": \"error\"" }

> +! { dg-regexp "\"column-origin\": 1" }

>  ! { dg-regexp "\"message\": \"#warning message\"" }

>  ! { dg-regexp "\"option\": \"-Werror=cpp\"" }

>  ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }

> @@ -16,11 +17,15 @@

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 2" }

> +! { dg-regexp "\"display-column\": 2" }

> +! { dg-regexp "\"byte-column\": 2" }

>  

>  ! { dg-regexp "\"finish\": \{" }

>  ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }

>  ! { dg-regexp "\"line\": 4" }

>  ! { dg-regexp "\"column\": 8" }

> +! { dg-regexp "\"display-column\": 8" }

> +! { dg-regexp "\"byte-column\": 8" }

>  

>  ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }

>  ! { dg-regexp "\"children\": \[\[\]\[\]\]" }

> diff --git a/gcc/testsuite/go.dg/arrayclear.go b/gcc/testsuite/go.dg/arrayclear.go

> index 6daebc0b8f5..aa5ba0761d7 100644

> --- a/gcc/testsuite/go.dg/arrayclear.go

> +++ b/gcc/testsuite/go.dg/arrayclear.go

> @@ -1,5 +1,8 @@

>  // { dg-do compile }

>  // { dg-options "-fgo-debug-optimization" }

> +// This comment is necessary to work around a dejagnu bug. Otherwise, the

> +// column of the second error message would equal the row of the first one, and

> +// since the errors are also identical, dejagnu is not able to distinguish them.

>  

>  package p

>  

> diff --git a/gcc/tree-diagnostic-path.cc b/gcc/tree-diagnostic-path.cc

> index 381a49cb0b4..82b3c2d6b6a 100644

> --- a/gcc/tree-diagnostic-path.cc

> +++ b/gcc/tree-diagnostic-path.cc

> @@ -493,7 +493,7 @@ default_tree_diagnostic_path_printer (diagnostic_context *context,

>     doesn't have access to trees (for m_fndecl).  */

>  

>  json::value *

> -default_tree_make_json_for_path (diagnostic_context *,

> +default_tree_make_json_for_path (diagnostic_context *context,

>  				 const diagnostic_path *path)

>  {

>    json::array *path_array = new json::array ();

> @@ -504,7 +504,8 @@ default_tree_make_json_for_path (diagnostic_context *,

>        json::object *event_obj = new json::object ();

>        if (event.get_location ())

>  	event_obj->set ("location",

> -			json_from_expanded_location (event.get_location ()));

> +			json_from_expanded_location (context,

> +						     event.get_location ()));

>        label_text event_text (event.get_desc (false));

>        event_obj->set ("description", new json::string (event_text.m_buffer));

>        event_text.maybe_free ();

> diff --git a/libcpp/charset.c b/libcpp/charset.c

> index db47235b847..28b81c9c864 100644

> --- a/libcpp/charset.c

> +++ b/libcpp/charset.c

> @@ -2276,49 +2276,90 @@ cpp_string_location_reader::get_next ()

>    return result;

>  }

>  

> -/* Helper for cpp_byte_column_to_display_column and its inverse.  Given a

> -   pointer to a UTF-8-encoded character, compute its display width.  *INBUFP

> -   points on entry to the start of the UTF-8 encoding of the character, and

> -   is updated to point just after the last byte of the encoding.  *INBYTESLEFTP

> -   contains on entry the remaining size of the buffer into which *INBUFP

> -   points, and this is also updated accordingly.  If *INBUFP does not

> +cpp_display_width_computation::

> +cpp_display_width_computation (const char *data, int data_length, int tabstop) :

> +  m_begin (data),

> +  m_next (m_begin),

> +  m_bytes_left (data_length),

> +  m_tabstop (tabstop),

> +  m_display_cols (0)

> +{

> +  gcc_assert (m_tabstop > 0);

> +}

> +

> +

> +/* The main implementation function for class cpp_display_width_computation.

> +   m_next points on entry to the start of the UTF-8 encoding of the next

> +   character, and is updated to point just after the last byte of the encoding.

> +   m_bytes_left contains on entry the remaining size of the buffer into which

> +   m_next points, and this is also updated accordingly.  If m_next does not

>     point to a valid UTF-8-encoded sequence, then it will be treated as a single

> -   byte with display width 1.  */

> +   byte with display width 1.  m_display_cols is the current display column,

> +   relative to which tab stops should be expanded.  Returns the display width of

> +   the codepoint just processed.  */

>  

> -static inline int

> -compute_next_display_width (const uchar **inbufp, size_t *inbytesleftp)

> +int

> +cpp_display_width_computation::process_next_codepoint ()

>  {

>    cppchar_t c;

> -  if (one_utf8_to_cppchar (inbufp, inbytesleftp, &c) != 0)

> +  int next_width;

> +

> +  if (*m_next == '\t')

> +    {

> +      ++m_next;

> +      --m_bytes_left;

> +      next_width = m_tabstop - (m_display_cols % m_tabstop);

> +    }

> +  else if (one_utf8_to_cppchar ((const uchar **) &m_next, &m_bytes_left, &c)

> +	   != 0)

>      {

>        /* Input is not convertible to UTF-8.  This could be fine, e.g. in a

>  	 string literal, so don't complain.  Just treat it as if it has a width

>  	 of one.  */

> -      ++*inbufp;

> -      --*inbytesleftp;

> -      return 1;

> +      ++m_next;

> +      --m_bytes_left;

> +      next_width = 1;

> +    }

> +  else

> +    {

> +      /*  one_utf8_to_cppchar() has updated m_next and m_bytes_left for us.  */

> +      next_width = cpp_wcwidth (c);

>      }

>  

> -  /*  one_utf8_to_cppchar() has updated inbufp and inbytesleftp for us.  */

> -  return cpp_wcwidth (c);

> +  m_display_cols += next_width;

> +  return next_width;

> +}

> +

> +/*  Utility to advance the byte stream by the minimum amount needed to consume

> +    N display columns.  Returns the number of display columns that were

> +    actually skipped.  This could be less than N, if there was not enough data,

> +    or more than N, if the last character to be skipped had a sufficiently large

> +    display width.  */

> +int

> +cpp_display_width_computation::advance_display_cols (int n)

> +{

> +  const int start = m_display_cols;

> +  const int target = start + n;

> +  while (m_display_cols < target && !done ())

> +    process_next_codepoint ();

> +  return m_display_cols - start;

>  }

>  

>  /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute

>      how many display columns are occupied by the first COLUMN bytes.  COLUMN

>      may exceed DATA_LENGTH, in which case the phantom bytes at the end are

> -    treated as if they have display width 1.  */

> +    treated as if they have display width 1.  Tabs are expanded to the next tab

> +    stop, relative to the start of DATA.  */

>  

>  int

>  cpp_byte_column_to_display_column (const char *data, int data_length,

> -				   int column)

> +				   int column, int tabstop)

>  {

> -  int display_col = 0;

> -  const uchar *udata = (const uchar *) data;

>    const int offset = MAX (0, column - data_length);

> -  size_t inbytesleft = column - offset;

> -  while (inbytesleft)

> -    display_col += compute_next_display_width (&udata, &inbytesleft);

> -  return display_col + offset;

> +  cpp_display_width_computation dw (data, column - offset, tabstop);

> +  while (!dw.done ())

> +    dw.process_next_codepoint ();

> +  return dw.display_cols_processed () + offset;

>  }

>  

>  /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute

> @@ -2328,14 +2369,11 @@ cpp_byte_column_to_display_column (const char *data, int data_length,

>  

>  int

>  cpp_display_column_to_byte_column (const char *data, int data_length,

> -				   int display_col)

> +				   int display_col, int tabstop)

>  {

> -  int column = 0;

> -  const uchar *udata = (const uchar *) data;

> -  size_t inbytesleft = data_length;

> -  while (column < display_col && inbytesleft)

> -      column += compute_next_display_width (&udata, &inbytesleft);

> -  return data_length - inbytesleft + MAX (0, display_col - column);

> +  cpp_display_width_computation dw (data, data_length, tabstop);

> +  const int avail_display = dw.advance_display_cols (display_col);

> +  return dw.bytes_processed () + MAX (0, display_col - avail_display);

>  }

>  

>  /* Our own version of wcwidth().  We don't use the actual wcwidth() in glibc,

> diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h

> index 544735a51af..c18f455f82a 100644

> --- a/libcpp/include/cpplib.h

> +++ b/libcpp/include/cpplib.h

> @@ -312,9 +312,6 @@ enum cpp_normalize_level {

>     carries all the options visible to the command line.  */

>  struct cpp_options

>  {

> -  /* Characters between tab stops.  */

> -  unsigned int tabstop;

> -

>    /* The language we're preprocessing.  */

>    enum c_lang lang;

>  

> @@ -1334,14 +1331,43 @@ extern const char * cpp_get_userdef_suffix

>    (const cpp_token *);

>  

>  /* In charset.c */

> +

> +/* A class to manage the state while converting a UTF-8 sequence to cppchar_t

> +   and computing the display width one character at a time.  */

> +class cpp_display_width_computation {

> + public:

> +  cpp_display_width_computation (const char *data, int data_length,

> +				 int tabstop);

> +  const char *next_byte () const { return m_next; }

> +  int bytes_processed () const { return m_next - m_begin; }

> +  int bytes_left () const { return m_bytes_left; }

> +  bool done () const { return !bytes_left (); }

> +  int display_cols_processed () const { return m_display_cols; }

> +

> +  int process_next_codepoint ();

> +  int advance_display_cols (int n);

> +

> + private:

> +  const char *const m_begin;

> +  const char *m_next;

> +  size_t m_bytes_left;

> +  const int m_tabstop;

> +  int m_display_cols;

> +};

> +

> +/* Convenience functions that are simple use cases for class

> +   cpp_display_width_computation.  Tab characters will be expanded to spaces

> +   as determined by TABSTOP.  */

>  int cpp_byte_column_to_display_column (const char *data, int data_length,

> -				       int column);

> -inline int cpp_display_width (const char *data, int data_length)

> +				       int column, int tabstop);

> +inline int cpp_display_width (const char *data, int data_length,

> +			      int tabstop)

>  {

> -    return cpp_byte_column_to_display_column (data, data_length, data_length);

> +  return cpp_byte_column_to_display_column (data, data_length, data_length,

> +					    tabstop);

>  }

>  int cpp_display_column_to_byte_column (const char *data, int data_length,

> -				       int display_col);

> +				       int display_col, int tabstop);

>  int cpp_wcwidth (cppchar_t c);

>  

>  #endif /* ! LIBCPP_CPPLIB_H */

> diff --git a/libcpp/init.c b/libcpp/init.c

> index 63124c8161e..6e94c486059 100644

> --- a/libcpp/init.c

> +++ b/libcpp/init.c

> @@ -190,7 +190,6 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,

>    CPP_OPTION (pfile, discard_comments) = 1;

>    CPP_OPTION (pfile, discard_comments_in_macro_exp) = 1;

>    CPP_OPTION (pfile, max_include_depth) = 200;

> -  CPP_OPTION (pfile, tabstop) = 8;

>    CPP_OPTION (pfile, operator_names) = 1;

>    CPP_OPTION (pfile, warn_trigraphs) = 2;

>    CPP_OPTION (pfile, warn_endif_labels) = 1;
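For readers following the thread, the tab handling in process_next_codepoint above can be sketched in isolation. This is only an illustration, not the libcpp code: sketch_display_width is a hypothetical name, it assumes well-formed UTF-8, and it assumes every non-tab code point is one column wide (the patch consults cpp_wcwidth instead). It needs C++14 or later because of the loop in a constexpr function.

```cpp
// Hypothetical sketch, not part of libcpp.  Returns the number of display
// columns occupied by the first NBYTES bytes of DATA.
// Precondition: tabstop > 0 (the patch asserts the same in its constructor).
constexpr int
sketch_display_width (const char *data, int nbytes, int tabstop)
{
  int cols = 0;
  int i = 0;
  while (i < nbytes)
    {
      if (data[i] == '\t')
        {
          /* A tab advances to the next multiple of TABSTOP.  */
          cols += tabstop - (cols % tabstop);
          ++i;
        }
      else
        {
          /* Consume the lead byte, then any UTF-8 continuation bytes
             (those of the form 10xxxxxx).  Assumption: each code point
             is exactly one column wide.  */
          ++i;
          while (i < nbytes
                 && (static_cast<unsigned char> (data[i]) & 0xC0) == 0x80)
            ++i;
          ++cols;
        }
    }
  return cols;
}

/* "ab" fills columns 1-2, so the tab pads out to column 8.  */
static_assert (sketch_display_width ("ab\tc", 3, 8) == 8, "tab pads to 8");
/* With tabstop 1, a tab is always a single column (the old behavior).  */
static_assert (sketch_display_width ("\t\t", 2, 1) == 2, "tabstop of 1");
```

The width of a tab depends on the columns already consumed, which is presumably why the patch keeps m_display_cols as running state in a class rather than computing each character's width independently.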
Richard Sandiford via Gcc-patches July 13, 2020, 7:04 p.m. | #10
> On Wed, Jun 10, 2020 at 12:11:00PM -0400, David Malcolm wrote:

> > Thanks for the patch; sorry about the delay in reviewing it.

> > 

> > Some high-level review points

> > 

> > - I like the patch overall

> > 

> > - This will deserve an item in the release notes

> > 

> > - I don't like adding "global_tabstop" (I don't like global

> > variables).  Is there nowhere else we can handle this? I believe

> > there's a cluster of functions in the callgraph that make use of

> > it; can we simply pass around the tabstop value instead?  "tabstop"

> > seems to have several meanings.  If I'm reading the patch correctly

> >   * "tabstop > 0" means to expand tabs so that column numbers are a

> > multiple of tabstop

> >   * "tabstop == 0" means "don't expand tabs"

> >   * "tabstop < 0" in some places means: use the global_tabstop value

> > Is it possible to eliminate global_tabstop value?  Or is there some

> > deep reason I'm missing?

> > 

> > I'll do a more thorough review once that's addressed/resolved (since

> > eliminating global_tabstop might touch a few places).

> >

> 

> Thanks for the feedback! The attached updated patch addresses these

> concerns. Regarding tabstop, I have removed the new static variable

> global_tabstop in charset.c. FWIW, the usage of "tabstop" arguments in the

> various new APIs did previously work a bit more consistently than you

> described. In all cases "tabstop <= 0" meant to use the default value,

> otherwise it specified the tabstop to use (with tabstop=1 naturally

> restoring the old behavior of changing tabs to a single space). In order

> for libcpp to provide this feature (callers can pass tabstop <= 0 to get a

> default, and the default can in turn be configured when processing the

> -ftabstop option), it does need to remember the default, and this has to

> be a file-level static variable because the routines need to work

> independent of any cpp_reader instance. (Some frontends don't use

> libcpp to read their input, for instance.) Anyway, I see the point that

> this file-level static, being accessible with cpp_set_tabstop() and

> cpp_get_tabstop(), is effectively just a global variable, so I have

> removed this feature, which just means that all callers need to pass the

> tabstop they want to use. I am now rather using the diagnostic_context

> object to remember the value passed to -ftabstop. The only place this

> involves global variables is now in c-family/c-indentation.c, where if I

> understood correctly, the only diagnostic_context available is global_dc,

> so I am getting the tabstop value from there. Please let me know if

> there's a better way to handle that? Prior to my patch, the tabstop was

> obtained from a different global variable (extern cpp_options *cpp_opts),

> so at least conservation of total globals is maintained. :)


Thanks.  That sounds like a good approach.


> Compared to the previous version, this one is a bit longer, since 25 or

> so call sites had to be modified to know the value of -ftabstop. Most of

> the churn is in diagnostic-show-locus.c, because there are a fair number of

> static helper functions and helper classes there, which just needed to

> receive the diagnostic_context object from their callers. I could

> have made this simpler by letting the tabstop argument default to

> something like 8 in all functions that require it... this would remove the

> need to pass it in all the selftests that are indifferent to it. I figured

> it would be better to force this argument to be passed, though, or else in

> the future it may be easy to forget to pass it where it is needed.


Thanks.

> > Thanks for adding docs; some nits on them:

> > 

> > > --- a/gcc/doc/invoke.texi

> > > +++ b/gcc/doc/invoke.texi

> > 

> > [...snip...]

> > 

> > > +@item -fdiagnostics-column-unit=@var{UNIT}

> > > +@opindex fdiagnostics-column-unit

> > > +Select the units for the column number.  This affects traditional diagnostics

> > > +(in the absence of @option{-fno-show-column}), as well as JSON format

> > > +diagnostics if requested.

> > > +

> > > +The default @var{UNIT}, @samp{display}, considers the number of display columns

> > > +occupied by each character.  This may be larger than the number of bytes

> > > +occupied, in the case of tab characters, or it may be smaller, in the case of

> > > +multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' occupies

> > > +two bytes and one display column, while the character ``@U{1F642}'' occupies

> > > +four bytes and two display columns.

> > 

> > This is imprecise.  A Unicode code point occupies some number of display columns,

> > and its *UTF-8 encoding* occupies some number of bytes.

> >

> > [and my inner pedant is now thinking: what about combining diacritics?

> > But I don't think we can ever issue a diagnostic on a diacritic; I

> > *think* we only ever care about the per-glyph level]

> > 

> > > +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte

> > > +count in all cases, as was traditionally output by GCC prior to version 11.1.0.

> > > +

> > > +@item -fdiagnostics-column-origin=@var{ORIGIN}

> > > +@opindex fdiagnostics-column-origin

> > > +Select the origin for column numbers, i.e. the column number assigned to the

> > > +first column.  The default value of 1 corresponds to traditional GCC

> > > +behavior and to the GNU style guide.  Some utilities may perform better with an

> > > +origin of 0; any non-negative value may be specified.

> > > +

> > >  @item -fdiagnostics-format=@var{FORMAT}

> > >  @opindex fdiagnostics-format

> > >  Select a different format for printing diagnostics.

> > 

> > [...snip...]

> > 

> > > +A diagnostic can contain zero or more locations.  Each location has an

> > > +optional @code{label} string and up to three positions within it: a

> > > +@code{caret} position and optional @code{start} and @code{finish} positions.

> > > +A position is described by a @code{file} name, a @code{line} number, and

> > > +three numbers indicating a column position: @code{display-column} counts

> > > +display columns, accounting for tabs and multibyte characters;

> > > +@code{byte-column} counts raw bytes; and @code{column} is equal to one of

> > > +the previous two, as dictated by the @option{-fdiagnostics-column-unit}

> > > +option.

> >

> > Might be clearer to use an unordered list here for the three kinds of column.

> >

> > > All three columns are relative to the origin specified by

> > > +@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may

> > > +be set, for instance, to 0 for compatibility with other utilities that

> > > +number columns from 0.  The column origin is recorded in the JSON output in

> > > +the @code{column-origin} tag.  In the remaining examples below, the extra

> > > +column number outputs have been omitted for brevity.

> > 

> > [...snip...]

> > 

> 

> I improved the docs along these lines.

> 

> > Thanks again for the patch; hope this is constructive

> > Dave

> >

> 

> Thanks for your time! BTW, I did bootstrap + regtest this version as well on

> x86-64 Linux, it looks good, new tests pass and others are the same:

> 

> FAIL 97 97

> PASS 476837 477297

> UNRESOLVED 7 7

> UNSUPPORTED 11726 11726

> UNTESTED 195 195

> XFAIL 1807 1807

> XPASS 37 37

> 

> -Lewis

>


Thanks for the updated patch.

This looks (almost) ready; I have a few nits inline below....

> From 7729ce3334b6768a25967a6dd4a0a5a2ed0923cc Mon Sep 17 00:00:00 2001

> From: Lewis Hyatt <lhyatt@gmail.com>

> Date: Wed, 10 Jun 2020 22:04:07 -0400

> Subject: [PATCH] diagnostics: Support conversion of tabs to spaces [PR49973] [PR86904]

> 

> Supports conversion of tabs to spaces when outputting diagnostics. Also

> adds -fdiagnostics-column-unit and -fdiagnostics-column-origin options to

> control how the column number is output, thereby resolving the two PRs.

> 

> gcc/c-family/ChangeLog:

> 

>         PR other/86904

>         * c-indentation.c (should_warn_for_misleading_indentation): Get

>         global tabstop from the new source.

>         * c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which

>         is now a common option.

>         * c.opt: Likewise.

> 

> gcc/ChangeLog:

> 

>         PR preprocessor/49973

>         PR other/86904

>         * common.opt: Handle -ftabstop here instead of in c-family

>         options.  Add -fdiagnostics-column-unit= and

>         -fdiagnostics-column-origin= options.

>         * opts.c (common_handle_option): Handle the new options.

>         * diagnostic-format-json.cc (json_from_expanded_location): Add

>         diagnostic_context argument.  Use it to convert column numbers as per

>         the new options.

>         (json_from_location_range): Likewise.

>         (json_from_fixit_hint): Likewise.

>         (json_end_diagnostic): Pass the new context argument to helper

>         functions above.  Add "column-origin" field to the output.

>         (test_unknown_location): Add the new context argument to calls to

>         helper functions.

>         (test_bad_endpoints): Likewise.

>         * diagnostic-show-locus.c

>         (exploc_with_display_col::exploc_with_display_col): Support

>         tabstop parameter.

>         (layout_point::layout_point): Make use of class

>         exploc_with_display_col.

>         (layout_range::layout_range): Likewise.

>         (struct line_bounds): Clarify that the units are now always

>         display columns.  Rename members accordingly.  Add constructor.

>         (layout::print_source_line): Add support for tab expansion.

>         (make_range): Adapt to class layout_range changes.

>         (layout::maybe_add_location_range): Likewise.

>         (layout::layout): Adapt to class exploc_with_display_col changes.

>         (layout::calculate_x_offset_display): Support tabstop parameter.

>         (layout::print_annotation_line): Adapt to struct line_bounds changes.

>         (layout::print_line): Likewise.

>         (line_label::line_label): Add diagnostic_context argument.

>         (get_affected_range): Likewise.

>         (get_printed_columns): Likewise.

>         (layout::print_any_labels): Adapt to struct line_label changes.

>         (class correction): Add m_tabstop member.

>         (correction::correction): Add tabstop argument.

>         (correction::compute_display_cols): Use m_tabstop.

>         (class line_corrections): Add m_context member.

>         (line_corrections::line_corrections): Add diagnostic_context argument.

>         (line_corrections::add_hint): Use m_context to handle tabstops.

>         (layout::print_trailing_fixits): Adapt to class line_corrections

>         changes.

>         (test_layout_x_offset_display_utf8): Support tabstop parameter.

>         (test_layout_x_offset_display_tab): New selftest.

>         (test_one_liner_colorized_utf8): Likewise.

>         (test_tab_expansion): Likewise.

>         (test_diagnostic_show_locus_one_liner_utf8): Call the new tests.

>         (diagnostic_show_locus_c_tests): Likewise.

>         (test_overlapped_fixit_printing): Adapt to helper class and

>         function changes.

>         (test_overlapped_fixit_printing_utf8): Likewise.

>         (test_overlapped_fixit_printing_2): Likewise.

>         * diagnostic.h (enum diagnostics_column_unit): New enum.

>         (struct diagnostic_context): Add members for the new options.

>         (diagnostic_converted_column): Declare.

>         (json_from_expanded_location): Add new context argument.

>         * diagnostic.c (diagnostic_initialize): Initialize new members.

>         (diagnostic_converted_column): New function.

>         (maybe_line_and_column): Be willing to output a column of 0.

>         (diagnostic_get_location_text): Convert column number as per the new

>         options.

>         (diagnostic_report_current_module): Likewise.

>         (assert_location_text): Add origin and column_unit arguments for

>         testing the new functionality.

>         (test_diagnostic_get_location_text): Test the new functionality.

>         * doc/invoke.texi: Document the new options and behavior.

>         * input.h (location_compute_display_column): Add tabstop argument.

>         * input.c (location_compute_display_column): Likewise.

>         (test_cpp_utf8): Add selftests for tab expansion.

>         * tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the

>         new context argument to json_from_expanded_location().

> 

> libcpp/ChangeLog:

> 

>         PR preprocessor/49973

>         PR other/86904

>         * include/cpplib.h (struct cpp_options):  Removed support for -ftabstop,

>         which is now handled by diagnostic_context.

>         (class cpp_display_width_computation): New class.

>         (cpp_byte_column_to_display_column): Add tabstop argument.

>         (cpp_display_width): Likewise.

>         (cpp_display_column_to_byte_column): Likewise.

>         * charset.c

>         (cpp_display_width_computation::cpp_display_width_computation): New

>         function.

>         (cpp_display_width_computation::advance_display_cols): Likewise.

>         (compute_next_display_width): Removed and implemented this

>         functionality in a new function...

>         (cpp_display_width_computation::process_next_codepoint): ...here.

>         (cpp_byte_column_to_display_column): Added tabstop argument.

>         Reimplemented in terms of class cpp_display_width_computation.

>         (cpp_display_column_to_byte_column): Likewise.

>         * init.c (cpp_create_reader): Remove handling of -ftabstop, which is now

>         handled by diagnostic_context.

> 

> gcc/testsuite/ChangeLog:

> 

>         PR preprocessor/49973

>         PR other/86904

>         * c-c++-common/Wmisleading-indentation-3.c: Adjust expected output

>         for new defaults.

>         * c-c++-common/Wmisleading-indentation.c: Likewise.

>         * c-c++-common/diagnostic-format-json-1.c: Likewise.

>         * c-c++-common/diagnostic-format-json-2.c: Likewise.

>         * c-c++-common/diagnostic-format-json-3.c: Likewise.

>         * c-c++-common/diagnostic-format-json-4.c: Likewise.

>         * c-c++-common/diagnostic-format-json-5.c: Likewise.

>         * c-c++-common/missing-close-symbol.c: Likewise.

>         * g++.dg/diagnostic/bad-binary-ops.C: Likewise.

>         * g++.dg/parse/error4.C: Likewise.

>         * g++.old-deja/g++.brendan/crash11.C: Likewise.

>         * g++.old-deja/g++.pt/overload2.C: Likewise.

>         * g++.old-deja/g++.robertl/eb109.C: Likewise.

>         * gcc.dg/analyzer/malloc-paths-9.c: Likewise.

>         * gcc.dg/bad-binary-ops.c: Likewise.

>         * gcc.dg/format/branch-1.c: Likewise.

>         * gcc.dg/format/pr79210.c: Likewise.

>         * gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.

>         * gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.

>         * gcc.dg/redecl-4.c: Likewise.

>         * gfortran.dg/diagnostic-format-json-1.F90: Likewise.

>         * gfortran.dg/diagnostic-format-json-2.F90: Likewise.

>         * gfortran.dg/diagnostic-format-json-3.F90: Likewise.

>         * go.dg/arrayclear.go: Add a comment explaining why adding a

>         comment was necessary to work around a dejagnu bug.

>         * c-c++-common/diagnostic-units-1.c: New test.

>         * c-c++-common/diagnostic-units-2.c: New test.

>         * c-c++-common/diagnostic-units-3.c: New test.

>         * c-c++-common/diagnostic-units-4.c: New test.

>         * c-c++-common/diagnostic-units-5.c: New test.

>         * c-c++-common/diagnostic-units-6.c: New test.

>         * c-c++-common/diagnostic-units-7.c: New test.

>         * c-c++-common/diagnostic-units-8.c: New test.


[...snip...]

> diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h

> index 307dbcfb34a..75706c5f4d8 100644

> --- a/gcc/diagnostic.h

> +++ b/gcc/diagnostic.h

> @@ -24,6 +24,20 @@ along with GCC; see the file COPYING3.  If not see

>  #include "pretty-print.h"

>  #include "diagnostic-core.h"

>  

> +/* An enum for controlling what units to use for the column number

> +   when diagnostics are output, used by the -fdiagnostics-column-unit option.

> +   Tabs will be expanded or not according to the value of -ftabstop.  The origin

> +   (default 1) is controlled by -fdiagnostics-column-origin.  */

> +


"New" and "historical" can get out of date, so how about:

> +enum diagnostics_column_unit

> +{

> +  /* The new default: display columns.  */


     /* The default from GCC 11 onwards: display column.  */
     
> +  DIAGNOSTICS_COLUMN_UNIT_DISPLAY,

> +

> +  /* The historical behavior: simple bytes.  */


     /* The behavior in GCC 10 and earlier: simple bytes.  */
     
> +  DIAGNOSTICS_COLUMN_UNIT_BYTE

> +};


?

[...snip...]

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi

> index 06a04e3d7dd..f463275bc8b 100644

> --- a/gcc/doc/invoke.texi

> +++ b/gcc/doc/invoke.texi


[...snip...]

> @@ -4729,6 +4731,31 @@ Do not print column numbers in diagnostics.  This may be necessary if

>  diagnostics are being scanned by a program that does not understand the

>  column numbers, such as @command{dejagnu}.

>  

> +@item -fdiagnostics-column-unit=@var{UNIT}

> +@opindex fdiagnostics-column-unit

> +Select the units for the column number.  This affects traditional diagnostics

> +(in the absence of @option{-fno-show-column}), as well as JSON format

> +diagnostics if requested.

> +

> +The default @var{UNIT}, @samp{display}, considers the number of display

> +columns occupied by each character.  This may be larger than the number

> +of bytes required to encode the character, in the case of tab

> +characters, or it may be smaller, in the case of multibyte characters.

> +For example, the character ``@U{03C0}'' occupies one display column,

> +and its UTF-8 encoding requires two bytes; the character ``@U{1F642}''

> +occupies two display columns, and its UTF-8 encoding requires four

> +bytes.


Thanks for reworking this.

I'm wary of those @U commands - does the generated HTML from "make
html" work? (similarly for the man page).  Would it be safer to express
them in the following form?

 For example, the character ``GREEK SMALL LETTER PI (U+03C0)'' occupies one
 display column, and its UTF-8 encoding requires two bytes; the character
 ``SLIGHTLY SMILING FACE (U+1F642)'' occupies two display columns, and
 its UTF-8 encoding requires four bytes.

or somesuch?
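
The byte counts quoted for these two characters follow directly from the UTF-8 encoding ranges and can be checked with a small sketch. utf8_encoded_length is a hypothetical helper written for this note, not a libcpp function, and it says nothing about display width, which comes from a separate width table (cpp_wcwidth in the patch).

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes needed to encode code point CP in
// UTF-8.  Assumes CP is a valid Unicode scalar value (at most 0x10FFFF).
constexpr int
utf8_encoded_length (std::uint32_t cp)
{
  return cp < 0x80 ? 1       /* ASCII.  */
         : cp < 0x800 ? 2    /* e.g. GREEK SMALL LETTER PI (U+03C0).  */
         : cp < 0x10000 ? 3  /* Rest of the Basic Multilingual Plane.  */
         : 4;                /* e.g. SLIGHTLY SMILING FACE (U+1F642).  */
}

static_assert (utf8_encoded_length (0x03C0) == 2, "pi is two bytes");
static_assert (utf8_encoded_length (0x1F642) == 4, "emoji is four bytes");
```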

> +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte

> +count in all cases, as was traditionally output by GCC prior to version 11.1.0.


[...snip...]

>  @item -fdiagnostics-format=@var{FORMAT}

>  @opindex fdiagnostics-format

>  Select a different format for printing diagnostics.

> @@ -4764,11 +4791,15 @@ might be printed in JSON form (after formatting) like this:

>          "locations": [

>              @{

>                  "caret": @{

> +                   "display-column": 3,

> +                   "byte-column": 3,

>                      "column": 3,

>                      "file": "misleading-indentation.c",

>                      "line": 15

>                  @},

>                  "finish": @{

> +                   "display-column": 4,

> +                   "byte-column": 4,

>                      "column": 4,

>                      "file": "misleading-indentation.c",

>                      "line": 15


Nit: the new fields don't line up with the old ones.

> @@ -4784,6 +4815,8 @@ might be printed in JSON form (after formatting) like this:

>                  "locations": [

>                      @{

>                          "caret": @{

> +                           "display-column": 5,

> +                           "byte-column": 5,

>                              "column": 5,

>                              "file": "misleading-indentation.c",

>                              "line": 17


Likewise.
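
As an aside on how the three JSON column fields relate, here is a minimal sketch of the rule described in the docs hunk: "display-column" and "byte-column" are always emitted, "column" copies whichever one -fdiagnostics-column-unit selects, and all three are shifted by the origin. Every name here (column_unit, json_columns, sketch_json_columns) is hypothetical. This is not the patch's diagnostic_converted_column, and it assumes the incoming columns are 1-based.

```cpp
// Hypothetical types and names, for illustration only.
enum class column_unit { display, byte };

struct json_columns
{
  int display_column;
  int byte_column;
  int column;
};

// DISPLAY_COL and BYTE_COL are assumed to be 1-based.  ORIGIN stands for the
// value given to -fdiagnostics-column-origin (1 by default).
constexpr json_columns
sketch_json_columns (int display_col, int byte_col, column_unit unit,
                     int origin)
{
  return json_columns {
    display_col - 1 + origin,
    byte_col - 1 + origin,
    (unit == column_unit::display ? display_col : byte_col) - 1 + origin
  };
}

/* A character right after one leading tab (tabstop 8) sits at byte column 2
   and display column 9.  */
static_assert (sketch_json_columns (9, 2, column_unit::display, 1).column == 9,
               "display units, origin 1");
static_assert (sketch_json_columns (9, 2, column_unit::byte, 0).column == 1,
               "byte units, origin 0");
```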

[...snip...]

> diff --git a/gcc/opts.c b/gcc/opts.c

> index 340d99434b3..525f44d079f 100644

> --- a/gcc/opts.c

> +++ b/gcc/opts.c

> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see

>  #include "opt-suggestions.h"

>  #include "diagnostic-color.h"

>  #include "selftest.h"

> +#include "cpplib.h"


Is this new #include still needed?

[...snip...]


OK for trunk with those nits fixed.

Dave
Richard Sandiford via Gcc-patches July 13, 2020, 9:07 p.m. | #11
On Mon, Jul 13, 2020 at 03:04:20PM -0400, David Malcolm wrote:
> > +@item -fdiagnostics-column-unit=@var{UNIT}
> > +@opindex fdiagnostics-column-unit
> > +Select the units for the column number.  This affects traditional diagnostics
> > +(in the absence of @option{-fno-show-column}), as well as JSON format
> > +diagnostics if requested.
> > +
> > +The default @var{UNIT}, @samp{display}, considers the number of display
> > +columns occupied by each character.  This may be larger than the number
> > +of bytes required to encode the character, in the case of tab
> > +characters, or it may be smaller, in the case of multibyte characters.
> > +For example, the character ``@U{03C0}'' occupies one display column,
> > +and its UTF-8 encoding requires two bytes; the character ``@U{1F642}''
> > +occupies two display columns, and its UTF-8 encoding requires four
> > +bytes.
>
> Thanks for reworking this.
>
> I'm wary of those @U commands - does the generated HTML from "make
> html" work? (similarly for the man page).  Would it be safer to express
> them in the following form?
>
>  For example, the character ``GREEK SMALL LETTER PI (U+03C0)'' occupies one
>  display column, and its UTF-8 encoding requires two bytes; the character
>  ``SLIGHTLY SMILING FACE (U+1F642)'' occupies two display columns, and
>  its UTF-8 encoding requires four bytes.
>
> or somesuch?


The HTML works, yes, but I hadn't thought to check the man page. It seems the
@U notation is carried through unmodified there, which isn't so clear. I
changed it to your suggestion; no need to overcomplicate it.
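
For reference, the byte counts quoted in the documentation text follow
directly from the UTF-8 encoding ranges. Below is a minimal sketch in C;
utf8_encoded_length is a hypothetical helper written for this note, not a
GCC or libcpp function. It covers only the encoded length. The display
width (one column for U+03C0, two for U+1F642) comes from character-width
tables that are not reproduced here.

```c
#include <assert.h>

/* Hypothetical helper for illustration only (not part of GCC):
   return the number of bytes needed to encode code point CP in UTF-8.  */
static int
utf8_encoded_length (unsigned long cp)
{
  /* Unicode code points end at U+10FFFF.  */
  assert (cp <= 0x10FFFFUL);

  if (cp < 0x80)
    return 1;	/* U+0000..U+007F: plain ASCII */
  if (cp < 0x800)
    return 2;	/* U+0080..U+07FF */
  if (cp < 0x10000)
    return 3;	/* U+0800..U+FFFF */
  return 4;	/* U+10000..U+10FFFF */
}
```

With this helper, utf8_encoded_length (0x3C0) is 2 and
utf8_encoded_length (0x1F642) is 4, the figures used in the doc text.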

>
> > +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte
> > +count in all cases, as was traditionally output by GCC prior to version 11.1.0.
>
> [...snip...]
>
> >  @item -fdiagnostics-format=@var{FORMAT}
> >  @opindex fdiagnostics-format
> >  Select a different format for printing diagnostics.
> > @@ -4764,11 +4791,15 @@ might be printed in JSON form (after formatting) like this:
> >          "locations": [
> >              @{
> >                  "caret": @{
> > +                   "display-column": 3,
> > +                   "byte-column": 3,
> >                      "column": 3,
> >                      "file": "misleading-indentation.c",
> >                      "line": 15
> >                  @},
> >                  "finish": @{
> > +                   "display-column": 4,
> > +                   "byte-column": 4,
> >                      "column": 4,
> >                      "file": "misleading-indentation.c",
> >                      "line": 15
>
> Nit: the new fields don't line up with the old ones.
>
> > @@ -4784,6 +4815,8 @@ might be printed in JSON form (after formatting) like this:
> >                  "locations": [
> >                      @{
> >                          "caret": @{
> > +                           "display-column": 5,
> > +                           "byte-column": 5,
> >                              "column": 5,
> >                              "file": "misleading-indentation.c",
> >                              "line": 17
>
> Likewise.


You are referring to the source code as opposed to the generated HTML,
correct? Both look fine to me; I think the above effect is due to the mixed
tabs+spaces convention in the git diff, ironically :). I'll make sure it
looks right in any case.
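
The same tab expansion is what moves the expected columns in the updated
testcases (for example, 2 becoming 9 behind one leading tab). Below is a
rough sketch in C of that arithmetic, assuming a tab stop of 8 and a line
containing only tabs and single-byte, single-width characters.
display_column_ascii and apply_column_origin are hypothetical names used for
illustration, not the patch's functions; the patch itself relies on
location_compute_display_column and context->column_adj.

```c
#include <assert.h>

/* Hypothetical helper, not GCC's implementation: convert a 1-based byte
   column in LINE to a 1-based display column, expanding each tab to the
   next multiple of TABSTOP.  Assumes every other character is one byte
   wide and one column wide.  */
static int
display_column_ascii (const char *line, int byte_column, int tabstop)
{
  assert (tabstop > 0);
  assert (byte_column >= 1);

  int width = 0;	/* display columns used before the target byte */
  for (int i = 0; i < byte_column - 1 && line[i] != '\0'; i++)
    {
      if (line[i] == '\t')
	width += tabstop - width % tabstop;
      else
	width++;
    }
  return width + 1;
}

/* Shift a 1-based column to the requested origin, in the same way the
   patch adds column_adj (origin - 1) to the converted column.  */
static int
apply_column_origin (int column, int origin)
{
  return column + (origin - 1);
}
```

For a line starting with one tab, display_column_ascii (line, 2, 8) gives 9,
and apply_column_origin (10, 0) gives 9, in line with the selftest that
expects "foo.c:42:9:" for origin 0.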

>
> [...snip...]
>
> > diff --git a/gcc/opts.c b/gcc/opts.c
> > index 340d99434b3..525f44d079f 100644
> > --- a/gcc/opts.c
> > +++ b/gcc/opts.c
> > @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "opt-suggestions.h"
> >  #include "diagnostic-color.h"
> >  #include "selftest.h"
> > +#include "cpplib.h"
>
> Is this new #include still needed?
>
> [...snip...]
>
> OK for trunk with those nits fixed.
>
> Dave


Thanks again for your time! I will address the above and then push in a day or two.

-Lewis
David Malcolm via Gcc-patches July 14, 2020, 1:49 p.m. | #12
On Mon, 2020-07-13 at 17:07 -0400, Lewis Hyatt wrote:
> On Mon, Jul 13, 2020 at 03:04:20PM -0400, David Malcolm wrote:


[...]

> > OK for trunk with those nits fixed.
> >
> > Dave
>
> Thanks again for your time! I will address the above and then push in
> a day or two.


Excellent - thanks for all your work on this.

Dave

Patch

commit 02d02a7bbbd4824c230079c38e134843ac442ef5
Author: Lewis Hyatt <lhyatt@gmail.com>
Date:   Fri Jan 24 17:17:40 2020 -0500

    diagnostics: Add options to control the column units [PR49973] [PR86904]

diff --git a/gcc/common.opt b/gcc/common.opt
index 630c380bd6a..657985450c2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1309,6 +1309,14 @@  Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)
 EnumValue
 Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
 
+fdiagnostics-column-unit=
+Common Joined RejectNegative Enum(diagnostics_column_unit)
+-fdiagnostics-column-unit=[display|byte]	Select units for column numbers.
+
+fdiagnostics-column-origin=
+Common Joined RejectNegative UInteger
+-fdiagnostics-column-origin=<number>	Set the number of the first column.  Default 1-based.
+
 fdiagnostics-format=
 Common Joined RejectNegative Enum(diagnostics_output_format)
 -fdiagnostics-format=[text|json] Select output format.
@@ -1317,6 +1325,15 @@  Common Joined RejectNegative Enum(diagnostics_output_format)
 SourceInclude
 diagnostic.h
 
+Enum
+Name(diagnostics_column_unit) Type(int)
+
+EnumValue
+Enum(diagnostics_column_unit) String(display) Value(DIAGNOSTICS_COLUMN_UNIT_DISPLAY)
+
+EnumValue
+Enum(diagnostics_column_unit) String(byte) Value(DIAGNOSTICS_COLUMN_UNIT_BYTE)
+
 Enum
 Name(diagnostics_output_format) Type(int)
 
diff --git a/gcc/diagnostic-format-json.cc b/gcc/diagnostic-format-json.cc
index 7bda5c4ba83..3b970b51d0c 100644
--- a/gcc/diagnostic-format-json.cc
+++ b/gcc/diagnostic-format-json.cc
@@ -23,6 +23,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "diagnostic.h"
+#include "selftest-diagnostic.h"
 #include "diagnostic-metadata.h"
 #include "json.h"
 #include "selftest.h"
@@ -43,21 +44,23 @@  static json::array *cur_children_array;
 /* Generate a JSON object for LOC.  */
 
 json::value *
-json_from_expanded_location (location_t loc)
+json_from_expanded_location (diagnostic_context *context, location_t loc)
 {
   expanded_location exploc = expand_location (loc);
   json::object *result = new json::object ();
   if (exploc.file)
     result->set ("file", new json::string (exploc.file));
   result->set ("line", new json::integer_number (exploc.line));
-  result->set ("column", new json::integer_number (exploc.column));
+  const int col = diagnostic_converted_column (context, exploc);
+  result->set ("column", new json::integer_number (col));
   return result;
 }
 
 /* Generate a JSON object for LOC_RANGE.  */
 
 static json::object *
-json_from_location_range (const location_range *loc_range, unsigned range_idx)
+json_from_location_range (diagnostic_context *context,
+			  const location_range *loc_range, unsigned range_idx)
 {
   location_t caret_loc = get_pure_location (loc_range->m_loc);
 
@@ -68,13 +71,13 @@  json_from_location_range (const location_range *loc_range, unsigned range_idx)
   location_t finish_loc = get_finish (loc_range->m_loc);
 
   json::object *result = new json::object ();
-  result->set ("caret", json_from_expanded_location (caret_loc));
+  result->set ("caret", json_from_expanded_location (context, caret_loc));
   if (start_loc != caret_loc
       && start_loc != UNKNOWN_LOCATION)
-    result->set ("start", json_from_expanded_location (start_loc));
+    result->set ("start", json_from_expanded_location (context, start_loc));
   if (finish_loc != caret_loc
       && finish_loc != UNKNOWN_LOCATION)
-    result->set ("finish", json_from_expanded_location (finish_loc));
+    result->set ("finish", json_from_expanded_location (context, finish_loc));
 
   if (loc_range->m_label)
     {
@@ -91,14 +94,14 @@  json_from_location_range (const location_range *loc_range, unsigned range_idx)
 /* Generate a JSON object for HINT.  */
 
 static json::object *
-json_from_fixit_hint (const fixit_hint *hint)
+json_from_fixit_hint (diagnostic_context *context, const fixit_hint *hint)
 {
   json::object *fixit_obj = new json::object ();
 
   location_t start_loc = hint->get_start_loc ();
-  fixit_obj->set ("start", json_from_expanded_location (start_loc));
+  fixit_obj->set ("start", json_from_expanded_location (context, start_loc));
   location_t next_loc = hint->get_next_loc ();
-  fixit_obj->set ("next", json_from_expanded_location (next_loc));
+  fixit_obj->set ("next", json_from_expanded_location (context, next_loc));
   fixit_obj->set ("string", new json::string (hint->get_string ()));
 
   return fixit_obj;
@@ -205,7 +208,7 @@  json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   for (unsigned int i = 0; i < richloc->get_num_locations (); i++)
     {
       const location_range *loc_range = richloc->get_range (i);
-      json::object *loc_obj = json_from_location_range (loc_range, i);
+      json::object *loc_obj = json_from_location_range (context, loc_range, i);
       if (loc_obj)
 	loc_array->append (loc_obj);
     }
@@ -217,7 +220,7 @@  json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
       for (unsigned int i = 0; i < richloc->get_num_fixit_hints (); i++)
 	{
 	  const fixit_hint *hint = richloc->get_fixit_hint (i);
-	  json::object *fixit_obj = json_from_fixit_hint (hint);
+	  json::object *fixit_obj = json_from_fixit_hint (context, hint);
 	  fixit_array->append (fixit_obj);
 	}
     }
@@ -320,7 +323,8 @@  namespace selftest {
 static void
 test_unknown_location ()
 {
-  delete json_from_expanded_location (UNKNOWN_LOCATION);
+  test_diagnostic_context dc;
+  delete json_from_expanded_location (&dc, UNKNOWN_LOCATION);
 }
 
 /* Verify that we gracefully handle attempts to serialize bad
@@ -338,7 +342,8 @@  test_bad_endpoints ()
   loc_range.m_range_display_kind = SHOW_RANGE_WITH_CARET;
   loc_range.m_label = NULL;
 
-  json::object *obj = json_from_location_range (&loc_range, 0);
+  test_diagnostic_context dc;
+  json::object *obj = json_from_location_range (&dc, &loc_range, 0);
   /* We should have a "caret" value, but no "start" or "finish" values.  */
   ASSERT_TRUE (obj != NULL);
   ASSERT_TRUE (obj->get ("caret") != NULL);
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 3386f070256..d9421c8bdf2 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -38,6 +38,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "selftest.h"
 #include "selftest-diagnostic.h"
 #include "opts.h"
+#include "cpplib.h"
 
 #ifdef HAVE_TERMIOS_H
 # include <termios.h>
@@ -219,6 +220,8 @@  diagnostic_initialize (diagnostic_context *context, int n_opts)
   context->min_margin_width = 0;
   context->show_ruler_p = false;
   context->parseable_fixits_p = false;
+  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
+  context->column_adj = 0;
   context->edit_context_ptr = NULL;
   context->diagnostic_group_nesting_depth = 0;
   context->diagnostic_group_emission_count = 0;
@@ -338,8 +341,37 @@  diagnostic_get_color_for_kind (diagnostic_t kind)
   return diagnostic_kind_color[kind];
 }
 
+/* Given an expanded_location, convert the column (which is in 1-based bytes)
+   to the requested units and origin.  Return -1 if the column is
+   invalid (<= 0).  */
+int
+diagnostic_converted_column (diagnostic_context *context, expanded_location s)
+{
+  if (s.column <= 0)
+    return -1;
+
+  int col;
+  switch (context->column_unit)
+    {
+    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:
+      col = location_compute_display_column (s);
+      break;
+
+    case DIAGNOSTICS_COLUMN_UNIT_BYTE:
+      col = s.column;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return col + context->column_adj;
+}
+
 /* Return a formatted line and column ':%line:%column'.  Elided if
-   zero.  The result is a statically allocated buffer.  */
+   line == 0 or col < 0.  (A column of 0 may be valid due to the
+   -fdiagnostics-column-origin option.)
+   The result is a statically allocated buffer.  */
 
 static const char *
 maybe_line_and_column (int line, int col)
@@ -348,8 +380,9 @@  maybe_line_and_column (int line, int col)
 
   if (line)
     {
-      size_t l = snprintf (result, sizeof (result),
-			   col ? ":%d:%d" : ":%d", line, col);
+      size_t l
+	= snprintf (result, sizeof (result),
+		    col >= 0 ? ":%d:%d" : ":%d", line, col);
       gcc_checking_assert (l < sizeof (result));
     }
   else
@@ -368,8 +401,14 @@  diagnostic_get_location_text (diagnostic_context *context,
   const char *locus_cs = colorize_start (pp_show_color (pp), "locus");
   const char *locus_ce = colorize_stop (pp_show_color (pp));
   const char *file = s.file ? s.file : progname;
-  int line = strcmp (file, N_("<built-in>")) ? s.line : 0;
-  int col = context->show_column ? s.column : 0;
+  int line = 0;
+  int col = -1;
+  if (strcmp (file, N_("<built-in>")))
+    {
+      line = s.line;
+      if (context->show_column)
+	col = diagnostic_converted_column (context, s);
+    }
 
   const char *line_col = maybe_line_and_column (line, col);
   return build_message_string ("%s%s%s:%s", locus_cs, file,
@@ -635,14 +674,20 @@  diagnostic_report_current_module (diagnostic_context *context, location_t where)
       if (! MAIN_FILE_P (map))
 	{
 	  bool first = true;
+	  expanded_location s = {};
 	  do
 	    {
 	      where = linemap_included_from (map);
 	      map = linemap_included_from_linemap (line_table, map);
-	      const char *line_col
-		= maybe_line_and_column (SOURCE_LINE (map, where),
-					 first && context->show_column
-					 ? SOURCE_COLUMN (map, where) : 0);
+	      s.file = LINEMAP_FILE (map);
+	      s.line = SOURCE_LINE (map, where);
+	      int col = -1;
+	      if (first && context->show_column)
+		{
+		  s.column = SOURCE_COLUMN (map, where);
+		  col = diagnostic_converted_column (context, s);
+		}
+	      const char *line_col = maybe_line_and_column (s.line, col);
 	      static const char *const msgs[] =
 		{
 		 N_("In file included from"),
@@ -651,7 +696,7 @@  diagnostic_report_current_module (diagnostic_context *context, location_t where)
 	      unsigned index = !first;
 	      pp_verbatim (context->printer, "%s%s %r%s%s%R",
 			   first ? "" : ",\n", _(msgs[index]),
-			   "locus", LINEMAP_FILE (map), line_col);
+			   "locus", s.file, line_col);
 	      first = false;
 	    }
 	  while (! MAIN_FILE_P (map));
@@ -863,11 +908,14 @@  print_escaped_string (pretty_printer *pp, const char *text)
    machine-parseable version of all fixits in RICHLOC to PP.  */
 
 static void
-print_parseable_fixits (pretty_printer *pp, rich_location *richloc)
+print_parseable_fixits (diagnostic_context *context, rich_location *richloc)
 {
-  gcc_assert (pp);
+  gcc_assert (context);
   gcc_assert (richloc);
 
+  pretty_printer *const pp = context->printer;
+  gcc_assert (pp);
+
   char *saved_prefix = pp_take_prefix (pp);
   pp_set_prefix (pp, NULL);
 
@@ -882,8 +930,10 @@  print_parseable_fixits (pretty_printer *pp, rich_location *richloc)
       location_t next_loc = hint->get_next_loc ();
       expanded_location next_exploc = expand_location (next_loc);
       pp_printf (pp, ":{%i:%i-%i:%i}:",
-		 start_exploc.line, start_exploc.column,
-		 next_exploc.line, next_exploc.column);
+		 start_exploc.line,
+		 diagnostic_converted_column (context, start_exploc),
+		 next_exploc.line,
+		 diagnostic_converted_column (context, next_exploc));
       print_escaped_string (pp, hint->get_string ());
       pp_newline (pp);
     }
@@ -1146,7 +1196,7 @@  diagnostic_report_diagnostic (diagnostic_context *context,
   (*diagnostic_finalizer (context)) (context, diagnostic, orig_diag_kind);
   if (context->parseable_fixits_p)
     {
-      print_parseable_fixits (context->printer, diagnostic->richloc);
+      print_parseable_fixits (context, diagnostic->richloc);
       pp_flush (context->printer);
     }
   diagnostic_action_after_output (context, diagnostic->kind);
@@ -1943,11 +1993,11 @@  test_print_escaped_string ()
 static void
 test_print_parseable_fixits_none ()
 {
-  pretty_printer pp;
+  test_diagnostic_context dc;
   rich_location richloc (line_table, UNKNOWN_LOCATION);
 
-  print_parseable_fixits (&pp, &richloc);
-  ASSERT_STREQ ("", pp_formatted_text (&pp));
+  print_parseable_fixits (&dc, &richloc);
+  ASSERT_STREQ ("", pp_formatted_text (dc.printer));
 }
 
 /* Verify that print_parseable_fixits does the right thing if there
@@ -1956,7 +2006,7 @@  test_print_parseable_fixits_none ()
 static void
 test_print_parseable_fixits_insert ()
 {
-  pretty_printer pp;
+  test_diagnostic_context dc;
   rich_location richloc (line_table, UNKNOWN_LOCATION);
 
   linemap_add (line_table, LC_ENTER, false, "test.c", 0);
@@ -1965,9 +2015,9 @@  test_print_parseable_fixits_insert ()
   location_t where = linemap_position_for_column (line_table, 10);
   richloc.add_fixit_insert_before (where, "added content");
 
-  print_parseable_fixits (&pp, &richloc);
+  print_parseable_fixits (&dc, &richloc);
   ASSERT_STREQ ("fix-it:\"test.c\":{5:10-5:10}:\"added content\"\n",
-		pp_formatted_text (&pp));
+		pp_formatted_text (dc.printer));
 }
 
 /* Verify that print_parseable_fixits does the right thing if there
@@ -1976,7 +2026,7 @@  test_print_parseable_fixits_insert ()
 static void
 test_print_parseable_fixits_remove ()
 {
-  pretty_printer pp;
+  test_diagnostic_context dc;
   rich_location richloc (line_table, UNKNOWN_LOCATION);
 
   linemap_add (line_table, LC_ENTER, false, "test.c", 0);
@@ -1987,9 +2037,9 @@  test_print_parseable_fixits_remove ()
   where.m_finish = linemap_position_for_column (line_table, 20);
   richloc.add_fixit_remove (where);
 
-  print_parseable_fixits (&pp, &richloc);
+  print_parseable_fixits (&dc, &richloc);
   ASSERT_STREQ ("fix-it:\"test.c\":{5:10-5:21}:\"\"\n",
-		pp_formatted_text (&pp));
+		pp_formatted_text (dc.printer));
 }
 
 /* Verify that print_parseable_fixits does the right thing if there
@@ -1998,7 +2048,7 @@  test_print_parseable_fixits_remove ()
 static void
 test_print_parseable_fixits_replace ()
 {
-  pretty_printer pp;
+  test_diagnostic_context dc;
   rich_location richloc (line_table, UNKNOWN_LOCATION);
 
   linemap_add (line_table, LC_ENTER, false, "test.c", 0);
@@ -2009,9 +2059,9 @@  test_print_parseable_fixits_replace ()
   where.m_finish = linemap_position_for_column (line_table, 20);
   richloc.add_fixit_replace (where, "replacement");
 
-  print_parseable_fixits (&pp, &richloc);
+  print_parseable_fixits (&dc, &richloc);
   ASSERT_STREQ ("fix-it:\"test.c\":{5:10-5:21}:\"replacement\"\n",
-		pp_formatted_text (&pp));
+		pp_formatted_text (dc.printer));
 }
 
 /* Verify that
@@ -2022,10 +2072,15 @@  test_print_parseable_fixits_replace ()
 static void
 assert_location_text (const char *expected_loc_text,
 		      const char *filename, int line, int column,
-		      bool show_column)
+		      bool show_column,
+		      int origin = 1,
+		      enum diagnostics_column_unit column_unit
+			= DIAGNOSTICS_COLUMN_UNIT_BYTE)
 {
   test_diagnostic_context dc;
   dc.show_column = show_column;
+  dc.column_unit = column_unit;
+  dc.column_adj = origin - 1;
 
   expanded_location xloc;
   xloc.file = filename;
@@ -2049,7 +2104,10 @@  test_diagnostic_get_location_text ()
   assert_location_text ("PROGNAME:", NULL, 0, 0, true);
   assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);
   assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);
-  assert_location_text ("foo.c:42:", "foo.c", 42, 0, true);
+  assert_location_text ("foo.c:42:9:", "foo.c", 42, 10, true, 0);
+  assert_location_text ("foo.c:42:1010:", "foo.c", 42, 10, true, 1001);
+  for (int origin = 0; origin != 2; ++origin)
+    assert_location_text ("foo.c:42:", "foo.c", 42, 0, true, origin);
   assert_location_text ("foo.c:", "foo.c", 0, 10, true);
   assert_location_text ("foo.c:42:", "foo.c", 42, 10, false);
   assert_location_text ("foo.c:", "foo.c", 0, 10, false);
@@ -2057,6 +2115,39 @@  test_diagnostic_get_location_text ()
   maybe_line_and_column (INT_MAX, INT_MAX);
   maybe_line_and_column (INT_MIN, INT_MIN);
 
+  {
+    /* In order to test display columns vs byte columns, we need to create a
+       file for location_get_source_line() to read.  */
+
+    const char *const content = "smile \xf0\x9f\x98\x82\n";
+    const int line_bytes = strlen (content) - 1;
+    const int display_width = cpp_display_width (content, line_bytes);
+    ASSERT_EQ (line_bytes - 2, display_width);
+    temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+    const char *const fname = tmp.get_filename ();
+    const int buf_len = strlen (fname) + 16;
+    char *const expected = XNEWVEC (char, buf_len);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    XDELETEVEC (expected);
+  }
+
+
   progname = old_progname;
 }
 
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 307dbcfb34a..5da6528222d 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -24,6 +24,20 @@  along with GCC; see the file COPYING3.  If not see
 #include "pretty-print.h"
 #include "diagnostic-core.h"
 
+/* An enum for controlling what units to use for the column number
+   when diagnostics are output, used by the -fdiagnostics-column-unit option.
+   Tabs will be expanded or not according to the value of -ftabstop.  The origin
+   (default 1) is controlled by -fdiagnostics-column-origin.  */
+
+enum diagnostics_column_unit
+{
+  /* The new default: display columns.  */
+  DIAGNOSTICS_COLUMN_UNIT_DISPLAY,
+
+  /* The historical behavior: simple bytes.  */
+  DIAGNOSTICS_COLUMN_UNIT_BYTE
+};
+
 /* Enum for overriding the standard output format.  */
 
 enum diagnostics_output_format
@@ -280,6 +294,13 @@  struct diagnostic_context
      rest of the diagnostic.  */
   bool parseable_fixits_p;
 
+  /* What units to use when outputting the column number.  */
+  enum diagnostics_column_unit column_unit;
+
+  /* Offset by which to adjust the 1-based column to respect
+     -fdiagnostics-column-origin.  */
+  int column_adj;
+
   /* If non-NULL, an edit_context to which fix-it hints should be
      applied, for generating patches.  */
   edit_context *edit_context_ptr;
@@ -458,6 +479,8 @@  diagnostic_same_line (const diagnostic_context *context,
 }
 
 extern const char *diagnostic_get_color_for_kind (diagnostic_t kind);
+extern int diagnostic_converted_column (diagnostic_context *context,
+					expanded_location s);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
@@ -470,6 +493,7 @@  extern void diagnostic_output_format_init (diagnostic_context *,
 /* Compute the number of digits in the decimal representation of an integer.  */
 extern int num_digits (int);
 
-extern json::value *json_from_expanded_location (location_t loc);
+extern json::value *json_from_expanded_location (diagnostic_context *context,
+						 location_t loc);
 
 #endif /* ! GCC_DIAGNOSTIC_H */
diff --git a/gcc/opts.c b/gcc/opts.c
index d55adf912b1..2373eefd78a 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2423,6 +2423,14 @@  common_handle_option (struct gcc_options *opts,
       dc->parseable_fixits_p = value;
       break;
 
+    case OPT_fdiagnostics_column_unit_:
+      dc->column_unit = (enum diagnostics_column_unit)value;
+      break;
+
+    case OPT_fdiagnostics_column_origin_:
+      dc->column_adj = value - 1;
+      break;
+
     case OPT_fdiagnostics_show_cwe:
       dc->show_cwe = value;
       break;
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
index 7df431fdaef..2314ad42402 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
@@ -36,9 +36,9 @@  int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
index 5cdeba1cbba..202c6bc7fdf 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
@@ -65,9 +65,9 @@  int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
@@ -178,7 +178,7 @@  void fn_16_tabs (void)
     while (flagA)
       if (flagB) /* { dg-message "7: this 'if' clause does not guard..." } */
 	foo (0);
-	foo (1);/* { dg-message "2: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+	foo (1);/* { dg-message "9: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 }
 
 void fn_17_spaces (void)
diff --git a/gcc/testsuite/g++.dg/parse/error4.C b/gcc/testsuite/g++.dg/parse/error4.C
index 792bf4dc063..fe8de73790d 100644
--- a/gcc/testsuite/g++.dg/parse/error4.C
+++ b/gcc/testsuite/g++.dg/parse/error4.C
@@ -7,4 +7,4 @@  struct X {
 		 int);
 };
 
-// { dg-error "4:'itn' has not been declared" "" { target *-*-* } 6 }
+// { dg-error "18:'itn' has not been declared" "" { target *-*-* } 6 }
diff --git a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
index 96ebb71645c..d2b37a5122d 100644
--- a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
+++ b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
@@ -9,13 +9,13 @@  class A {
 	int	h;
 	A() { i=10; j=20; }
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "16:virtual functions cannot be friends" }
 };
 
 class B : public A {
     public:
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }// { dg-error "" }  member.*// ERROR -  member.*
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "16:virtual functions cannot be friends" }
 // { dg-error "private" "" { target *-*-* } .-1 }
 };
 
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
index b438543d445..bbc9e51aff6 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
@@ -12,5 +12,5 @@  int
 main()
 {
 	C<char*>	c;
-	char*		p = Z(c.O); //{ dg-error "13:'Z' was not declared" } ambiguous c.O
+	char*		p = Z(c.O); //{ dg-error "29:'Z' was not declared" } ambiguous c.O
 }
diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
index 6dc2c55be58..b98e8da6b1e 100644
--- a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
+++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
@@ -48,8 +48,8 @@  ostream& operator<<(ostream& os, Graph<VertexType,EdgeType>& G)
 
         // The compiler does not like this line!!!!!!
         typename Graph<VertexType, EdgeType>::Successor::iterator
-	  startN = G[i].second.begin(), // { dg-error "14:no match" } no index operator
-	  endN   = G[i].second.end();  // { dg-error "14:no match" } no index operator
+	  startN = G[i].second.begin(), // { dg-error "21:no match" } no index operator
+	  endN   = G[i].second.end();  // { dg-error "21:no match" } no index operator
 
         while(startN != endN)
         {
diff --git a/gcc/testsuite/gcc.dg/format/branch-1.c b/gcc/testsuite/gcc.dg/format/branch-1.c
index 1782064645e..4ea39b52b2e 100644
--- a/gcc/testsuite/gcc.dg/format/branch-1.c
+++ b/gcc/testsuite/gcc.dg/format/branch-1.c
@@ -10,7 +10,7 @@  foo (long l, int nfoo)
 {
   printf ((nfoo > 1) ? "%d foos" : "%d foo", nfoo);
   printf ((l > 1) ? "%d foos" /* { dg-warning "23:int" "wrong type in conditional expr" } */
-	          : "%d foo", l); /* { dg-warning "16:int" "wrong type in conditional expr" } */
+	          : "%d foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%ld foos" : "%d foo", l); /* { dg-warning "36:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%d foos" : "%ld foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   /* Should allow one case to have extra arguments.  */
diff --git a/gcc/testsuite/gcc.dg/format/pr79210.c b/gcc/testsuite/gcc.dg/format/pr79210.c
index 71f5dd6e082..6bdabdf21ec 100644
--- a/gcc/testsuite/gcc.dg/format/pr79210.c
+++ b/gcc/testsuite/gcc.dg/format/pr79210.c
@@ -20,4 +20,4 @@  LPFC_VPORT_ATTR_R(peer_port_login,
 		  "Allow peer ports on the same physical port to login to each "
 		  "other.");
 
-/* { dg-warning "6: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
+/* { dg-warning "20: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
diff --git a/gcc/testsuite/gcc.dg/redecl-4.c b/gcc/testsuite/gcc.dg/redecl-4.c
index 8f124886da8..2c214bb02c7 100644
--- a/gcc/testsuite/gcc.dg/redecl-4.c
+++ b/gcc/testsuite/gcc.dg/redecl-4.c
@@ -15,7 +15,7 @@  f (void)
     /* Should get format warnings even though the built-in declaration
        isn't "visible".  */
     printf (
-	    "%s", 1); /* { dg-warning "8:format" } */
+	    "%s", 1); /* { dg-warning "15:format" } */
     /* The type of strcmp here should have no prototype.  */
     if (0)
       strcmp (1);
diff --git a/gcc/testsuite/go.dg/arrayclear.go b/gcc/testsuite/go.dg/arrayclear.go
index 6daebc0b8f5..aa5ba0761d7 100644
--- a/gcc/testsuite/go.dg/arrayclear.go
+++ b/gcc/testsuite/go.dg/arrayclear.go
@@ -1,5 +1,8 @@ 
 // { dg-do compile }
 // { dg-options "-fgo-debug-optimization" }
+// This comment is necessary to work around a dejagnu bug. Otherwise, the
+// column of the second error message would equal the row of the first one, and
+// since the errors are also identical, dejagnu is not able to distinguish them.
 
 package p
 
diff --git a/gcc/tree-diagnostic-path.cc b/gcc/tree-diagnostic-path.cc
index 381a49cb0b4..82b3c2d6b6a 100644
--- a/gcc/tree-diagnostic-path.cc
+++ b/gcc/tree-diagnostic-path.cc
@@ -493,7 +493,7 @@  default_tree_diagnostic_path_printer (diagnostic_context *context,
    doesn't have access to trees (for m_fndecl).  */
 
 json::value *
-default_tree_make_json_for_path (diagnostic_context *,
+default_tree_make_json_for_path (diagnostic_context *context,
 				 const diagnostic_path *path)
 {
   json::array *path_array = new json::array ();
@@ -504,7 +504,8 @@  default_tree_make_json_for_path (diagnostic_context *,
       json::object *event_obj = new json::object ();
       if (event.get_location ())
 	event_obj->set ("location",
-			json_from_expanded_location (event.get_location ()));
+			json_from_expanded_location (context,
+						     event.get_location ()));
       label_text event_text (event.get_desc (false));
       event_obj->set ("description", new json::string (event_text.m_buffer));
       event_text.maybe_free ();