manual: Document thread/task IDs for Linux

Message ID 87woolfq5f.fsf@oldenburg2.str.redhat.com
State New

Commit Message

Florian Weimer Dec. 7, 2018, 1 p.m.
2018-12-07  Florian Weimer  <fweimer@redhat.com>

	* manual/process.texi (Process Creation Concepts): Remove
	documentation of process (ID) lifetime.  List more process
	creation functions.  Reference Process Identification section.
	(Process Identification): Add information about process ID
	lifetime.  Describe Linux thread/task IDs.
	* manual/signal.texi (Signaling Another Process): Mention that the
	signal is always sent to the process.

Comments

Carlos O'Donell Dec. 14, 2018, 3:15 p.m. | #1
On 12/7/18 8:00 AM, Florian Weimer wrote:
> 2018-12-07  Florian Weimer  <fweimer@redhat.com>
> 
> 	* manual/process.texi (Process Creation Concepts): Remove
> 	documentation of process (ID) lifetime.  List more process
> 	creation functions.  Reference Process Identification section.
> 	(Process Identification): Add information about process ID
> 	lifetime.  Describe Linux thread/task IDs.
> 	* manual/signal.texi (Signaling Another Process): Mention that the
> 	signal is always sent to the process.
> 

OK for master if you resolve the question below about namespaces.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>


> diff --git a/manual/process.texi b/manual/process.texi
> index b82b91f9f1..25c8393903 100644
> --- a/manual/process.texi
> +++ b/manual/process.texi
> @@ -132,22 +132,19 @@ output channels of the command being executed.
>  This section gives an overview of processes and of the steps involved in
>  creating a process and making it run another program.
>  
> -@cindex process ID
> -@cindex process lifetime
> -Each process is named by a @dfn{process ID} number.  A unique process ID
> -is allocated to each process when it is created.  The @dfn{lifetime} of
> -a process ends when its termination is reported to its parent process;
> -at that time, all of the process resources, including its process ID,
> -are freed.
> -
>  @cindex creating a process
>  @cindex forking a process
>  @cindex child process
>  @cindex parent process
> -Processes are created with the @code{fork} system call (so the operation
> -of creating a new process is sometimes called @dfn{forking} a process).
> -The @dfn{child process} created by @code{fork} is a copy of the original
> -@dfn{parent process}, except that it has its own process ID.
> +@cindex subprocess
> +A new processes is created when one of the functions
> +@code{posix_spawn}, @code{fork}, or @code{vfork} is called.  (The
> +@code{system} and @code{popen} also create new processes internally.)
> +Due to the name of the @code{fork} function, the act of creating a new
> +process is sometimes called @dfn{forking} a process.  Each new process
> +(the @dfn{child process} or @dfn{subprocess}) is allocated a process
> +ID, distinct from the process ID of the parent process.  @xref{Process
> +Identification}.

OK. This merges the intent of the previous section into this one, giving a
single text that also lists the functions that can create a process.
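
For readers following along, the statement that each child is allocated a
process ID distinct from its parent's can be checked with a minimal C sketch
like the one below. It is not part of the patch. The helper name
child_has_distinct_pid is hypothetical, and only fork is exercised here, not
posix_spawn or vfork.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.  Forks one child and
   returns 1 if the child's process ID differs from the parent's,
   0 if it does not, and -1 on error.  */
int
child_has_distinct_pid (void)
{
  pid_t parent = getpid ();
  pid_t child = fork ();

  if (child < 0)
    return -1;                  /* fork failed.  */
  if (child == 0)
    _exit (0);                  /* Child: nothing to do, terminate.  */

  /* Parent: wait for the child so that it does not linger.  */
  if (waitpid (child, NULL, 0) != child)
    return -1;

  /* In the parent, fork returned the process ID of the child.  */
  return child != parent;
}
```

Calling child_has_distinct_pid () from a test program should return 1 on any
system where the new wording holds.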

>  
>  After forking a child process, both the parent and child processes
>  continue to execute normally.  If you want your program to wait for a
> @@ -174,11 +171,39 @@ too, instead of returning to the previous process image.
>  @node Process Identification
>  @section Process Identification
>  
> -The @code{pid_t} data type represents process IDs.  You can get the
> -process ID of a process by calling @code{getpid}.  The function
> -@code{getppid} returns the process ID of the parent of the current
> -process (this is also known as the @dfn{parent process ID}).  Your
> -program should include the header files @file{unistd.h} and
> +@cindex process ID
> +Each process is named by a @dfn{process ID} number, a value of type
> +@code{pid_t}.  A process ID is allocated to each process when it is
> +created.  Process IDs are reused over time.  The lifetime of a process
> +ends when the parent process of the corresponding process waits on the
> +process ID after the process has terminated.  @xref{Process

OK. I like the description here that the process lifetime ends when the
process is waited upon, which is correct and more accurate.
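
A small C sketch of that lifetime rule, offered as an illustration rather than
as text for the manual: the child terminates at once, yet its process ID stays
allocated until the parent waits for it. The helper name
pid_survives_until_wait is hypothetical. The sketch assumes the default
SIGCHLD disposition and assumes the freed ID is not reused by some other
process in the instant between the wait and the final check.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.  Returns 1 if the
   child's process ID stays valid after the child has terminated and
   becomes invalid once the parent has waited for it, 0 otherwise,
   and -1 on error.  */
int
pid_survives_until_wait (void)
{
  pid_t child = fork ();
  if (child < 0)
    return -1;
  if (child == 0)
    _exit (0);                  /* Child terminates immediately.  */

  /* Block until the child has terminated, but leave it in a waitable
     state (WNOWAIT), so its lifetime has not ended yet.  */
  siginfo_t info;
  if (waitid (P_PID, child, &info, WEXITED | WNOWAIT) != 0)
    return -1;

  /* Signal 0 sends nothing; it only checks that the ID names a
     process.  The terminated child still counts.  */
  int before = kill (child, 0);

  /* Wait for real.  This ends the lifetime and frees the ID.  */
  if (waitpid (child, NULL, 0) != child)
    return -1;

  errno = 0;
  int after = kill (child, 0);
  return before == 0 && after == -1 && errno == ESRCH;
}
```

The WNOWAIT step is only there to observe the window between termination and
waiting; ordinary programs would call waitpid directly.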

> +Completion}.  (The parent process can arrange for such waiting to
> +happen implicitly.)  A process ID uniquely identifies a process only
> +during the lifetime of the process.  As a rule of thumb, this means
> +that the process must still be running.
> +
> +Process IDs can also denote process groups and sessions.
> +@xref{Job Control}.
> +
> +@cindex thread ID
> +@cindex task ID
> +@cindex thread group
> +On Linux, threads created by @code{pthread_create} also receive a
> +number form the process ID namespace, a @dfn{thread ID}.  The thread

s/form/from/g

We haven't defined what a "process ID namespace" is in this context,
would it be simpler to write:

"On Linux, threads created by @code{pthread_create} also receive a
unique identifier in the form of a number, a @dfn{thread ID}."

or

"On Linux, threads created by @code{pthread_create} also receive a
unique identifier in the form of a number, a @dfn{thread ID}, which
is taken from the set of currently available process IDs."

I'm trying to avoid the complexity of describing a "process ID namespace",
but I'd like to hear your feedback on that. Alternatively we'd have to
start documenting namespaces and their interactions with glibc functions,
which is actually very useful, but outside the scope of your changes.

> +ID of the initial (main) thread is the same as the process ID of the
> +entire process.  Process IDs and thread IDs are sometimes also
> +referred to collectively as @dfn{task IDs}.  In contrast to processes,
> +threads are never waited for explicitly, so a thread ID becomes
> +eligible for reuse as soon as a thread exits or is canceled.  This is
> +true even for joinable threads, not just detached threads.  Threads
> +are also assigned to a @dfn{thread group}.  In @theglibc{}
> +implementation running on Linux, the process ID is the thread group ID
> +of all threads in the process.

OK.
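
As a sanity check of the relationships described in this paragraph, here is a
minimal Linux-only C sketch. It is an illustration, not proposed manual
content. The names current_tid, record_ids and thread_ids_match_description
are hypothetical. The thread ID is read with the raw SYS_gettid system call so
that the sketch does not depend on a gettid wrapper being available in the
installed glibc. The check has to be called from the initial (main) thread.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Values observed from inside the helper thread.  */
struct ids
{
  pid_t tid;                    /* Thread ID of the helper thread.  */
  pid_t pid;                    /* getpid () as seen by the helper.  */
};

/* Return the Linux thread ID of the calling thread.  */
static pid_t
current_tid (void)
{
  return (pid_t) syscall (SYS_gettid);
}

static void *
record_ids (void *arg)
{
  struct ids *out = arg;
  out->tid = current_tid ();
  out->pid = getpid ();
  return NULL;
}

/* Hypothetical helper; call it from the main thread.  Returns 1 if
   the observed IDs match the description, 0 if not, -1 on error.  */
int
thread_ids_match_description (void)
{
  struct ids helper;
  pthread_t thr;

  if (pthread_create (&thr, NULL, record_ids, &helper) != 0)
    return -1;
  if (pthread_join (thr, NULL) != 0)
    return -1;

  return current_tid () == getpid ()    /* Main thread: TID equals PID.  */
         && helper.pid == getpid ()     /* Same process (thread group).  */
         && helper.tid != getpid ();    /* Helper has its own thread ID.  */
}
```

The sketch does not try to demonstrate thread ID reuse after exit, since that
depends on how the kernel allocates IDs at run time.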

> +
> +You can get the process ID of a process by calling @code{getpid}.  The
> +function @code{getppid} returns the process ID of the parent of the
> +current process (this is also known as the @dfn{parent process ID}).
> +Your program should include the header files @file{unistd.h} and

OK.

>  @file{sys/types.h} to use these functions.
>  @pindex sys/types.h
>  @pindex unistd.h
> diff --git a/manual/signal.texi b/manual/signal.texi
> index 9577ff091d..8b3a52e22a 100644
> --- a/manual/signal.texi
> +++ b/manual/signal.texi
> @@ -2246,7 +2246,9 @@ signal:
>  
>  @table @code
>  @item @var{pid} > 0
> -The process whose identifier is @var{pid}.
> +The process whose identifier is @var{pid}.  (On Linux, the signal is
> +sent to the entire process even if @var{pid} is a thread ID distinct
> +from the process ID.)

OK.
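
One way to observe this behavior is the Linux-only C sketch below. It is an
illustration under stated assumptions, not part of the patch. A worker thread
blocks SIGUSR1 for itself, the main thread then calls kill with the worker's
thread ID, and the handler records which thread ran it. If the signal is
directed at the whole process, the main thread, which has SIGUSR1 unblocked,
ends up handling it. All helper names are hypothetical, and the check has to
be called from the main thread of a process with no other threads.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static sem_t ready, done;
static pid_t worker_tid;
static volatile sig_atomic_t handled_by;  /* Thread ID that ran the handler.  */

static void
on_usr1 (int sig)
{
  int saved_errno = errno;
  (void) sig;
  handled_by = (sig_atomic_t) syscall (SYS_gettid);
  errno = saved_errno;
}

static void *
worker (void *arg)
{
  sigset_t set;
  (void) arg;
  /* Block SIGUSR1 in this thread only.  */
  sigemptyset (&set);
  sigaddset (&set, SIGUSR1);
  pthread_sigmask (SIG_BLOCK, &set, NULL);
  worker_tid = (pid_t) syscall (SYS_gettid);
  sem_post (&ready);
  sem_wait (&done);             /* Stay alive until main has signaled.  */
  return NULL;
}

/* Hypothetical helper; call it from the main thread.  Returns 1 if a
   signal sent to the worker's thread ID was handled by the main
   thread, 0 if not, -1 on error.  */
int
kill_on_tid_reaches_process (void)
{
  struct sigaction sa;
  pthread_t thr;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_usr1;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGUSR1, &sa, NULL) != 0)
    return -1;
  if (sem_init (&ready, 0, 0) != 0 || sem_init (&done, 0, 0) != 0)
    return -1;
  if (pthread_create (&thr, NULL, worker, NULL) != 0)
    return -1;

  sem_wait (&ready);
  int rc = kill (worker_tid, SIGUSR1);  /* pid argument is a thread ID.  */
  sem_post (&done);
  pthread_join (thr, NULL);
  if (rc != 0)
    return -1;

  /* The main thread's ID equals the process ID.  */
  return worker_tid != getpid () && handled_by == getpid ();
}
```

To direct a signal at one specific thread instead, pthread_kill is the
interface to use.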

>  
>  @item @var{pid} == 0
>  All processes in the same process group as the sender.
> 

-- 
Cheers,
Carlos.
Florian Weimer Dec. 14, 2018, 3:33 p.m. | #2
* Carlos O'Donell:

>> +Completion}.  (The parent process can arrange for such waiting to
>> +happen implicitly.)  A process ID uniquely identifies a process only
>> +during the lifetime of the process.  As a rule of thumb, this means
>> +that the process must still be running.
>> +
>> +Process IDs can also denote process groups and sessions.
>> +@xref{Job Control}.
>> +
>> +@cindex thread ID
>> +@cindex task ID
>> +@cindex thread group
>> +On Linux, threads created by @code{pthread_create} also receive a
>> +number form the process ID namespace, a @dfn{thread ID}.  The thread
>
> s/form/from/g

Fixed, thanks.

> We haven't defined what a "process ID namespace" is in this context,
> would it be simpler to write:
>
> "On Linux, threads created by @code{pthread_create} also receive a
> unique identifier in the form of a number, a @dfn{thread ID}."
>
> or
>
> "On Linux, threads created by @code{pthread_create} also receive a
> unique identifier in the form of a number, a @dfn{thread ID}, which
> is taken from the set of currently available process IDs."

I see why this is confusing if you come from a Linux namespace
background.  I didn't want to actually refer to Linux namespaces here.
What I wanted to express is that the thread ID is taken from the same
set that process IDs are drawn from, subject to similar constraints.
So I wanted to use “namespace” as a shorthand for “the set of available
names” (although names are really numbers here).

(I think we should not call these IDs unique because they are not unique
over time, only at every instant.)

What about this?

On Linux, threads created by @code{pthread_create} also receive a a
@dfn{thread ID}.  The thread ID of the initial (main) thread is the
same as the process ID of the entire process.  Thread IDs for
subsequently created threads are distinct.  They are allocated from
the same numbering space as process IDs.  Process IDs and thread IDs
are sometimes also referred to collectively as @dfn{task IDs}.  In
contrast to processes, threads are never waited for explicitly, so a
thread ID becomes eligible for reuse as soon as a thread exits or is
canceled.  This is true even for joinable threads, not just detached
threads.  Threads are also assigned to a @dfn{thread group}.  In
@theglibc{} implementation running on Linux, the process ID is the
thread group ID of all threads in the process.

Thanks,
Florian
Carlos O'Donell Dec. 14, 2018, 7:31 p.m. | #3
On 12/14/18 10:33 AM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>>> +Completion}.  (The parent process can arrange for such waiting to
>>> +happen implicitly.)  A process ID uniquely identifies a process only
>>> +during the lifetime of the process.  As a rule of thumb, this means
>>> +that the process must still be running.
>>> +
>>> +Process IDs can also denote process groups and sessions.
>>> +@xref{Job Control}.
>>> +
>>> +@cindex thread ID
>>> +@cindex task ID
>>> +@cindex thread group
>>> +On Linux, threads created by @code{pthread_create} also receive a
>>> +number form the process ID namespace, a @dfn{thread ID}.  The thread
>>
>> s/form/from/g
> 
> Fixed, thanks.
> 
>> We haven't defined what a "process ID namespace" is in this context,
>> would it be simpler to write:
>>
>> "On Linux, threads created by @code{pthread_create} also receive a
>> unique identifier in the form of a number, a @dfn{thread ID}."
>>
>> or
>>
>> "On Linux, threads created by @code{pthread_create} also receive a
>> unique identifier in the form of a number, a @dfn{thread ID}, which
>> is taken from the set of currently available process IDs."
> 
> I see why this is confusing if you come from a Linux namespace
> background.  I didn't want to actually refer to Linux namespaces here.

OK. Yes, I thought you were talking about the Linux PID namespace.

> What I wanted to express is that the thread ID is taken from the same
> set that process IDs are drawn from, subject to similar constraints.
> So I wanted to use “namespace” as a shorthand for “the set of available
> names” (although names are really numbers here).

OK, that is what I tried to suggest above. So I think we're on the same
page.

> (I think we should not call these IDs unique because they are not unique
> over time, only at every instant.)
> 
> What about this?
> 
> On Linux, threads created by @code{pthread_create} also receive a a

s/a //g

> @dfn{thread ID}.  The thread ID of the initial (main) thread is the
> same as the process ID of the entire process.  Thread IDs for
> subsequently created threads are distinct.  They are allocated from
> the same numbering space as process IDs.  Process IDs and thread IDs
> are sometimes also referred to collectively as @dfn{task IDs}.  In
> contrast to processes, threads are never waited for explicitly, so a
> thread ID becomes eligible for reuse as soon as a thread exits or is
> canceled.  This is true even for joinable threads, not just detached
> threads.  Threads are also assigned to a @dfn{thread group}.  In

s/also//g

> @theglibc{} implementation running on Linux, the process ID is the
> thread group ID of all threads in the process.

Perfect.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>


-- 
Cheers,
Carlos.

Patch

diff --git a/manual/process.texi b/manual/process.texi
index b82b91f9f1..25c8393903 100644
--- a/manual/process.texi
+++ b/manual/process.texi
@@ -132,22 +132,19 @@  output channels of the command being executed.
 This section gives an overview of processes and of the steps involved in
 creating a process and making it run another program.
 
-@cindex process ID
-@cindex process lifetime
-Each process is named by a @dfn{process ID} number.  A unique process ID
-is allocated to each process when it is created.  The @dfn{lifetime} of
-a process ends when its termination is reported to its parent process;
-at that time, all of the process resources, including its process ID,
-are freed.
-
 @cindex creating a process
 @cindex forking a process
 @cindex child process
 @cindex parent process
-Processes are created with the @code{fork} system call (so the operation
-of creating a new process is sometimes called @dfn{forking} a process).
-The @dfn{child process} created by @code{fork} is a copy of the original
-@dfn{parent process}, except that it has its own process ID.
+@cindex subprocess
+A new processes is created when one of the functions
+@code{posix_spawn}, @code{fork}, or @code{vfork} is called.  (The
+@code{system} and @code{popen} also create new processes internally.)
+Due to the name of the @code{fork} function, the act of creating a new
+process is sometimes called @dfn{forking} a process.  Each new process
+(the @dfn{child process} or @dfn{subprocess}) is allocated a process
+ID, distinct from the process ID of the parent process.  @xref{Process
+Identification}.
 
 After forking a child process, both the parent and child processes
 continue to execute normally.  If you want your program to wait for a
@@ -174,11 +171,39 @@  too, instead of returning to the previous process image.
 @node Process Identification
 @section Process Identification
 
-The @code{pid_t} data type represents process IDs.  You can get the
-process ID of a process by calling @code{getpid}.  The function
-@code{getppid} returns the process ID of the parent of the current
-process (this is also known as the @dfn{parent process ID}).  Your
-program should include the header files @file{unistd.h} and
+@cindex process ID
+Each process is named by a @dfn{process ID} number, a value of type
+@code{pid_t}.  A process ID is allocated to each process when it is
+created.  Process IDs are reused over time.  The lifetime of a process
+ends when the parent process of the corresponding process waits on the
+process ID after the process has terminated.  @xref{Process
+Completion}.  (The parent process can arrange for such waiting to
+happen implicitly.)  A process ID uniquely identifies a process only
+during the lifetime of the process.  As a rule of thumb, this means
+that the process must still be running.
+
+Process IDs can also denote process groups and sessions.
+@xref{Job Control}.
+
+@cindex thread ID
+@cindex task ID
+@cindex thread group
+On Linux, threads created by @code{pthread_create} also receive a
+number form the process ID namespace, a @dfn{thread ID}.  The thread
+ID of the initial (main) thread is the same as the process ID of the
+entire process.  Process IDs and thread IDs are sometimes also
+referred to collectively as @dfn{task IDs}.  In contrast to processes,
+threads are never waited for explicitly, so a thread ID becomes
+eligible for reuse as soon as a thread exits or is canceled.  This is
+true even for joinable threads, not just detached threads.  Threads
+are also assigned to a @dfn{thread group}.  In @theglibc{}
+implementation running on Linux, the process ID is the thread group ID
+of all threads in the process.
+
+You can get the process ID of a process by calling @code{getpid}.  The
+function @code{getppid} returns the process ID of the parent of the
+current process (this is also known as the @dfn{parent process ID}).
+Your program should include the header files @file{unistd.h} and
 @file{sys/types.h} to use these functions.
 @pindex sys/types.h
 @pindex unistd.h
diff --git a/manual/signal.texi b/manual/signal.texi
index 9577ff091d..8b3a52e22a 100644
--- a/manual/signal.texi
+++ b/manual/signal.texi
@@ -2246,7 +2246,9 @@  signal:
 
 @table @code
 @item @var{pid} > 0
-The process whose identifier is @var{pid}.
+The process whose identifier is @var{pid}.  (On Linux, the signal is
+sent to the entire process even if @var{pid} is a thread ID distinct
+from the process ID.)
 
 @item @var{pid} == 0
 All processes in the same process group as the sender.