Make std::match_results::_M_resize more useful

Message ID 20181204143056.GA22857@redhat.com
State New

Commit Message

Jonathan Wakely Dec. 4, 2018, 2:30 p.m.
As both callers of match_results::_M_resize(unsigned) immediately follow
it with a loop to update the value of each sub_match, that behaviour can
be moved into _M_resize itself. The first caller fills the container
with unmatched subs, which can be done with vector::assign, and the
second caller clears the container to establish a specific state, which
can be provided by a new member function specific to that purpose.
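
The reason assign can replace "resize, then loop" is sketched below on a plain std::vector of sub_match objects. This is only an illustration, not code from the patch: make_stale, count_matched, stale_after_resize and stale_after_assign are hypothetical helpers. resize keeps whatever elements are already present, so stale matched entries survive, while assign overwrites every element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

using Sub = std::ssub_match;  // sub_match<std::string::const_iterator>

// Hypothetical helper: a sub_match that looks like a leftover from an
// earlier successful match over `text`.
Sub make_stale(const std::string& text)
{
  Sub s;
  s.first = text.begin();
  s.second = text.end();
  s.matched = true;
  return s;
}

// Hypothetical helper: number of elements still flagged as matched.
std::size_t count_matched(const std::vector<Sub>& v)
{
  return static_cast<std::size_t>(
      std::count_if(v.begin(), v.end(),
                    [](const Sub& s) { return s.matched; }));
}

// resize() only value-initializes the elements it appends, so entries
// that were already in the vector keep their old state.
std::size_t stale_after_resize(std::vector<Sub> v, std::size_t n)
{
  v.resize(n);
  return count_matched(v);
}

// assign() replaces every element with a copy of the given value, so no
// follow-up loop is needed to reset the matched flags.
std::size_t stale_after_assign(std::vector<Sub> v, std::size_t n)
{
  v.assign(n, Sub{});
  return count_matched(v);
}
```

Starting from two stale entries and growing to five, the resize variant still reports two matched elements and the assign variant reports none.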

Tangentially, I also noticed that match_results::max_size() doesn't
account for the three special sub_match objects that are always present
in a fully established result state. This patch also fixes that.
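
The max_size() adjustment can be pictured with a toy wrapper, assuming a container that always stores three bookkeeping elements after the user-visible ones. results_sketch and reserved_slots are hypothetical names for this sketch and are not part of libstdc++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of bookkeeping elements kept after the user-visible ones
// (standing in for the prefix, suffix and unmatched sub).
constexpr std::size_t reserved_slots = 3;

// Hypothetical stand-in for a match_results-like container.
class results_sketch
{
  std::vector<int> slots_;

public:
  // Reset to `captures` user-visible elements plus the reserved ones.
  void reset(std::size_t captures)
  { slots_.assign(captures + reserved_slots, 0); }

  // User-visible element count; an empty store means "no result".
  std::size_t size() const noexcept
  { return slots_.empty() ? 0 : slots_.size() - reserved_slots; }

  // The reserved slots consume capacity, so they are subtracted here too.
  std::size_t max_size() const noexcept
  { return slots_.max_size() - reserved_slots; }

  // Capacity limit of the underlying storage, exposed for comparison.
  std::size_t storage_max() const noexcept
  { return slots_.max_size(); }
};
```

Without the subtraction, max_size() would advertise a size that could never be reached once the three extra elements are stored.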

	* include/bits/regex.h (match_results::max_size()): Adjust return
	value to account for prefix/suffix/unmatched subs.
	(match_results::_M_resize(unsigned int)): Use _Base_type::assign to
	reset the contained sub matches.
	(match_results::_M_establish_failed_match(_Bi_iter)): Add new member
	function to set result state following a failed match.
	* include/bits/regex.tcc (__regex_algo_impl): Remove loop to set
	sub_match states after _M_resize. Use _M_establish_failed_match.
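
The state that _M_establish_failed_match sets up is internal, but its visible effect can be checked through the public interface. The sketch below relies only on what the standard guarantees after a failed search (ready() is true, empty() is true, size() is 0); failed_search_clears_results is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: reuse one smatch for a successful search and then
// a failed one, and report whether the failed search left the results in
// the empty-but-ready state.
bool failed_search_clears_results(const std::string& hit,
                                  const std::string& miss,
                                  const std::regex& re)
{
  std::smatch m;

  // First search must succeed so that m holds real sub-matches.
  if (!std::regex_search(hit, m, re) || m.empty())
    return false;

  // Second search must fail; the earlier sub-matches must not leak through.
  if (std::regex_search(miss, m, re))
    return false;

  return m.ready() && m.empty() && m.size() == 0;
}
```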

Tested x86_64-linux.

Any objections?
commit 6a96a528102e1062dd2627b7eb8ddf26c43bc1e7
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Dec 4 13:21:26 2018 +0000


Comments

Jonathan Wakely May 14, 2019, 12:19 p.m. | #1
On 04/12/18 14:30 +0000, Jonathan Wakely wrote:
>Any objections?


Now committed to trunk.


Patch

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index af6fe3f0d79..d36ee33033d 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -1698,7 +1698,7 @@  _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
       size_type
       max_size() const noexcept
-      { return _Base_type::max_size(); }
+      { return _Base_type::max_size() - 3; }
 
       /**
        * @brief Indicates if the %match_results contains no results.
@@ -1942,9 +1942,20 @@  _GLIBCXX_BEGIN_NAMESPACE_CXX11
 				    const basic_regex<_Cp, _Rp>&,
 				    regex_constants::match_flag_type);
 
+      // Reset contents to __size unmatched sub_match objects
+      // (plus additional objects for prefix, suffix and unmatched sub).
       void
       _M_resize(unsigned int __size)
-      { _Base_type::resize(__size + 3); }
+      { _Base_type::assign(__size + 3, sub_match<_Bi_iter>{}); }
+
+      // Set state to a failed match for the given past-the-end iterator.
+      void
+      _M_establish_failed_match(_Bi_iter __end)
+      {
+	sub_match<_Bi_iter> __sm;
+	__sm.first = __sm.second = __end;
+	_Base_type::assign(3, __sm);
+      }
 
       const_reference
       _M_unmatched_sub() const
diff --git a/libstdc++-v3/include/bits/regex.tcc b/libstdc++-v3/include/bits/regex.tcc
index dcf660902bc..de0ce795b84 100644
--- a/libstdc++-v3/include/bits/regex.tcc
+++ b/libstdc++-v3/include/bits/regex.tcc
@@ -57,8 +57,6 @@  namespace __detail
       typename match_results<_BiIter, _Alloc>::_Base_type& __res = __m;
       __m._M_begin = __s;
       __m._M_resize(__re._M_automaton->_M_sub_count());
-      for (auto& __it : __res)
-	__it.matched = false;
 
       bool __ret;
       if ((__re.flags() & regex_constants::__polynomial)
@@ -109,12 +107,7 @@  namespace __detail
 	}
       else
 	{
-	  __m._M_resize(0);
-	  for (auto& __it : __res)
-	    {
-	      __it.matched = false;
-	      __it.first = __it.second = __e;
-	    }
+	  __m._M_establish_failed_match(__e);
 	}
       return __ret;
     }