Date: Sun, 21 Sep 2014 18:39:14 -0400 (EDT)
From: Benjamin Kaduk <bjk@freebsd.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: doc@freebsd.org
Subject: Re: libthr and main thread stack size
Message-ID: <alpine.GSO.1.10.1409211629300.21571@multics.mit.edu>
In-Reply-To: <20140920170658.GE2210@kib.kiev.ua>
References: <53E36E84.4060806@ivan-labs.com>
 <FEB60EB5-546D-454D-AE62-B2483246E42C@scsiguy.com>
 <20140916081324.GQ2737@kib.kiev.ua>
 <5242716.s4iaScq0Bu@ralph.baldwin.cx>
 <20140920170658.GE2210@kib.kiev.ua>
[-- Attachment #1 --]
On Sat, 20 Sep 2014, Konstantin Belousov wrote:

> Doc people, please review the man page in the patch.

BTW, mandoc -Tlint is always a useful sanity check; it found:

./libthr.7:31:5: WARNING: document title should be all caps
./libthr.7:36:2: WARNING: section header suited to sections 2, 3, and 9 only: LIBRARY
./libthr.7:194:2: WARNING: skipping paragraph macro

I am not sure what the best solution is for the section header complaint.
The current libthr.3 is quite short; maybe the two could be merged?

I made some changes in the attached diff, though I am not fully happy with
the final result. In particular, I am not sure that my text in the
section THREAD STACKS is correct. I also changed from the .Dv macro for
library names like libc, to the .Li macro, which is sort of a catch-all
option; there may be something better.

-Ben

[-- Attachment #2 --]
--- libthr.7.orig	2014-09-21 16:19:24.719162740 -0400
+++ libthr.7	2014-09-21 18:10:05.289706149 -0400
@@ -28,7 +28,7 @@
 .\" $FreeBSD$
 .\"
 .Dd September 20, 2014
-.Dt libthr
+.Dt LIBTHR
 .Os
 .Sh NAME
 .Nm libthr
@@ -38,129 +38,124 @@
 .Sh DESCRIPTION
 The man page documents the quirks and tunables of the
 .Fx
-implementation for the
+implementation of the
 .Lb libpthread .
-When linking with the
+When linking with
 .Li -lpthread ,
 the run-time dependency
-.Dv libthr.so.3
-library is recorded in the produced object.
+.Li libthr.so.3
+is recorded in the produced object.
 .Pp
-The library is tigthly integrated with the Run-time Link-editor
+The library is tightly integrated with the run-time link editor
 .Xr ld-elf.so.1 1
 and
-.Lb libc ,
+.Lb libc ;
 all three components must be built from the same source tree.
 Mixing
-.Dv libc.so
+.Li libc.so
 and
 .Nm
 libraries from different versions of
 .Fx
 is not supported.
 The run-time linker
-.Li ld-elf.so.1
-has some code to ensure backward-compatibility with older
+.Xr ld-elf.so.1 1
+has some code to ensure backward-compatibility with older versions of
 .Nm .
 .Sh MUTEX ACQUISITION
-The locked mutex (see
+A locked mutex (see
 .Xr pthread_mutex_lock 3 )
 is represented by a volatile variable of type
 .Dv lwpid_t ,
 which records the global system identifier of the thread owning the lock.
-The
 .Nm
-performs a congested mutex acquisition in three stages, each of which
+performs a contested mutex acquisition in three stages, each of which
 is more resource-consuming than the previous.
 .Pp
-First, the
-.Li spin loop
+First, a spin loop
 is performed, where the library attempts to acquire the lock by
 .Xr atomic 9
 operations.
 The loop count is controlled by the
 .Ev LIBPTHREAD_SPINLOOPS
-environment variable.
+environment variable, with a default value of 2000.
 .Pp
-If the
-.Li spin loop
-was unable to acquire the mutex, the
-.Li yield loop
+If the spin loop
+was unable to acquire the mutex, a yield loop
 is executed, performing the same
 .Xr atomic 9
-acquisition attempts as
-.Li spin loop ,
-but each attempt is followed by yield of the CPU time of the thread by
+acquisition attempts as the spin loop,
+but each attempt is followed by a yield of the CPU time
+of the thread using the
 .Xr sched_yield 2
 syscall.
-By default, the
-.Li yield loop
+By default, the yield loop
 is not executed.
-This is controlled by
+This is controlled by the
 .Ev LIBPTHREAD_YIELDLOOPS
 environment variable.
 .Pp
-If both
-.Li spin
-and
-.Li yield loops
+If both the spin and yield loops
 failed to acquire the lock, the thread is taken off the CPU and
-put to sleep in kernel with the
+put to sleep in the kernel with the
 .Xr umtx 2
 syscall.
-Kernel wakes up a thread and hands the ownership of the lock to
-the woken thread.
-.Sh THREADS STACKS
-Each thread is provided with the private stack area used by C runtime.
-The size of the main (initial) thread stack is set by kernel, and is
+The kernel wakes up a thread and hands the ownership of the lock to
+the woken thread when the lock becomes available.
+.Sh THREAD STACKS
+Each thread is provided with a private stack area used by the C runtime.
+The size of the main (initial) thread stack is set by the kernel, and is
 controlled by the
 .Dv RLIMIT_STACK
 process resource limit (see
 .Xr getrlimit 2 ) .
 .Pp
-By default, the main thread size is equal to the value of resource
+By default, the main thread's stack size is equal to the value of
 .Dv RLIMIT_STACK
 for the process.
 If the
-.Dv LIBPTHREAD_SPLITSTACK_MAIN
-environment variable is present (its value does not matter),
-the main thread size if chomped to 4MB on 64bit architectures, and to
-2MB on 32bit architectures, on the threading library initialization.
-The rest of the address space area reserved by the kernel for initial
-process stack, is used for non-initial threads stack in this case.
+.Ev LIBPTHREAD_SPLITSTACK_MAIN
+environment variable is present in the process environment
+(its value does not matter),
+the main thread's stack is reduced to 4MB on 64bit architectures, and to
+2MB on 32bit architectures, when the threading library is initialized.
+The rest of the address space area which has been reserved by the
+kernel for the initial process stack is used for non-initial thread stacks
+in this case.
 The presence of the
-.Dv LIBPTHREAD_BIGSTACK_MAIN
-environment variable overrides the
-.Dv LIBPTHREAD_SPLITSTACK_MAIN ,
+.Ev LIBPTHREAD_BIGSTACK_MAIN
+environment variable overrides
+.Ev LIBPTHREAD_SPLITSTACK_MAIN ;
 it is kept for backward-compatibility.
 .Pp
-The size of the stacks for threads created by the process at run-time
+The size of stacks for threads created by the process at run-time
 with the
 .Xr pthread_create 3
-call, is controlled by thread attributes, see
+call is controlled by thread attributes: see
 .Xr pthread_attr 3 ,
 in particular, the
 .Xr pthread_attr_setstacksize 3 ,
 .Xr pthread_attr_setguardsize 3
 and
-.Xr pthread_attr_setstackaddr 3 .
+.Xr pthread_attr_setstackaddr 3
+functions.
 If no attributes for the thread stack size are specified, the default
 non-initial thread stack size is 2MB for 64bit architectures, and 1MB
 for 32bit architectures.
 .Sh RUN-TIME SETTINGS
 The following environment variables are recognized by
-.Dv libthr
+.Nm
 and adjust the operation of the library at run-time:
 .Bl -tag -width LIBPTHREAD_SPLITSTACK_MAIN
 .It Ev LIBPTHREAD_BIGSTACK_MAIN
-Disables the chomp of the initial thread stack, enabled by
+Disables the reduction of the initial thread stack enabled by
 .Ev LIBPTHREAD_SPLITSTACK_MAIN .
 .It Ev LIBPTHREAD_SPLITSTACK_MAIN
-Causes the chomp of the initial thread stack, as described in the
+Causes a reduction of the initial thread stack, as described in the
 section
-.Li THREAD_STACKS .
-This was the default behaviour of the
+.Sx THREAD_STACKS .
+This was the default behaviour of
 .Nm
 before
 .Fx 11.0 .
@@ -171,70 +166,69 @@
 of the mutex acquisition.
 The default count is 2000, set by the
 .Dv MUTEX_ADAPTIVE_SPINS
-define in the
+constant in the
 .Nm
 sources.
 .It Ev LIBPTHREAD_YIELDLOOPS
-The non-zero integer value of the variable allows the
-.Li yield loop
+A non-zero integer value enables the yield loop
 in the process of the mutex acquisition.
-The value is the counter of loop operations.
+The value is the count of loop operations.
 .It Ev LIBPTHREAD_QUEUE_FIFO
-The integer value of the variable specifies how often the blocked
-threads are put into the head of the sleep queue, instead of it tail.
-The bigger value reduces the frequency of the FIFO discipline.
+The integer value of the variable specifies how often blocked
+threads are inserted at the head of the sleep queue, instead of its tail.
+Bigger values reduce the frequency of the FIFO discipline.
 The value must be between 0 and 255.
 .El
 .Sh INTERACTION WITH RUN-TIME LINKER.
 The
 .Nm
 library must appear before
-.Dv libc
+.Li libc
 in the global order of depended objects.
 .Pp
-.Pp
-Loading the
+Loading
 .Nm
-library with the
+with the
 .Xr dlopen 3
-call in the process after the program binary is activated,
-is not supported, and causes miscellaneous and hard to diagnose misbehaviour.
+call in the process after the program binary is activated
+is not supported, and causes miscellaneous and hard-to-diagnose misbehaviour.
 This is due to
 .Nm
 interposing several important
-.Dv libc
+.Li libc
 symbols to provide thread-safe services.
 In particular,
 .Dv errno
-and locking stubs from
-.Dv libc
+and the locking stubs from
+.Li libc
 are affected.
 This requirement is currently not enforced.
 .Pp
-If the program loads the modules at run-time, and modules may require
-the threading services, the main program binary must be linked with
-.Dv libpthread ,
-even if it does not require any service from the library.
+If the program loads any modules at run-time, and those modules may require
+threading services, the main program binary must be linked with
+.Li libpthread ,
+even if it does not require any services from the library.
 .Pp
-The library cannot be unloaded, the
+.Nm
+cannot be unloaded; the
 .Xr dlclose 3
 function does not perform any action when called with a handle for
-.Dv libthr .
-One of the reason is that interposing of
-.Dv libc
+.Nm .
+One of the reasons is that the interposing of
+.Li libc
 functions cannot be undone.
 .Sh SIGNALS
 The implementation also interposes the user-installed
 .Xr signal 3
 handlers.
-The interposing is done to postpone signal delivery to threads which
-entered the (libthr-internal) critical sections, where the calling
+This interposing is done to postpone signal delivery to threads which
+entered (libthr-internal) critical sections, where the calling
 of the signal handler is unsafe.
-Example of such situation is owning the internal library lock.
-When the signal is delivered while signal handler cannot be safely
-called, call is postponed and performed after the critical section
-is left.
-This should be taken into account when interpreting the
+An example of such a situation is owning the internal library lock.
+When a signal is delivered while the signal handler cannot be safely
+called, the call is postponed and performed after the exit from
+the critical section.
+This should be taken into account when interpreting
 .Xr ktrace 1
 logs.
 .Sh SEE ALSO
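
As a reviewer's aid for the THREAD STACKS text in the patch, here is a minimal C sketch of the attribute path it describes: a non-initial thread takes its stack size from pthread_attr_setstacksize(3) rather than from the default. It is not part of the patch. The names run_with_stack_size() and worker() are hypothetical, made up for this illustration, and only standard POSIX calls are used.

```c
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body; stands in for real work that needs a larger stack. */
static void *
worker(void *arg)
{
	return (arg);
}

/*
 * Hypothetical helper (not part of libthr): create one thread with an
 * explicit stack size, wait for it to finish, and report the size that
 * was recorded in the attribute object.
 * Returns 0 on success or an errno-style error code.
 */
int
run_with_stack_size(size_t requested, size_t *recorded)
{
	pthread_attr_t attr;
	pthread_t td;
	int error;

	error = pthread_attr_init(&attr);
	if (error != 0)
		return (error);

	/* Override the default non-initial thread stack size. */
	error = pthread_attr_setstacksize(&attr, requested);
	if (error == 0)
		error = pthread_attr_getstacksize(&attr, recorded);
	if (error == 0)
		error = pthread_create(&td, &attr, worker, NULL);
	if (error == 0)
		error = pthread_join(td, NULL);

	(void)pthread_attr_destroy(&attr);
	return (error);
}
```

A caller would pass something like 4 * 1024 * 1024 as the requested size and check that the return value is 0. The program needs to be linked with -lpthread, which is what records the libthr.so.3 dependency mentioned in DESCRIPTION.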
