Date: Thu, 19 May 2016 17:40:00 +0000 (UTC)
From: Konstantin Belousov <kib@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r300221 - in head/lib: libc/sys libthr
Message-ID: <201605191740.u4JHe0kk044054@repo.freebsd.org>
Author: kib
Date: Thu May 19 17:40:00 2016
New Revision: 300221
URL: https://svnweb.freebsd.org/changeset/base/300221

Log:
  Document _umtx_op(2) interface for the implementation of robust mutexes.
  In libthr(3), list added knobs.

  Reviewed by: emaste
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D6427

Modified:
  head/lib/libc/sys/_umtx_op.2
  head/lib/libthr/libthr.3

Modified: head/lib/libc/sys/_umtx_op.2
==============================================================================
--- head/lib/libc/sys/_umtx_op.2	Thu May 19 17:21:24 2016	(r300220)
+++ head/lib/libc/sys/_umtx_op.2	Thu May 19 17:40:00 2016	(r300221)
@@ -28,7 +28,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd May 5, 2016
+.Dd May 17, 2016
 .Dt _UMTX_OP 2
 .Os
 .Sh NAME
@@ -85,6 +85,7 @@ struct umutex {
 	volatile lwpid_t	m_owner;
 	uint32_t	m_flags;
 	uint32_t	m_ceilings[2];
+	uintptr_t	m_rb_lnk;
 };
 .Ed
 .Pp
@@ -95,18 +96,24 @@ It contains either the thread identifier
 locked state, or zero when the lock is unowned.
 The highest bit set indicates that there is contention on the lock.
 The constants are defined for special values:
-.Bl -tag -width "Dv UMUTEX_CONTESTED"
+.Bl -tag -width "Dv UMUTEX_RB_OWNERDEAD"
 .It Dv UMUTEX_UNOWNED
 Zero, the value stored in the unowned lock.
 .It Dv UMUTEX_CONTESTED
 The contenion indicator.
+.It Dv UMUTEX_RB_OWNERDEAD
+A thread owning the robust mutex terminated.
+The mutex is in unlocked state.
+.It Dv UMUTEX_RB_NOTRECOV
+The robust mutex is in a non-recoverable state.
+It cannot be locked until reinitialized.
 .El
 .Pp
 The
 .Dv m_flags
 field may contain the following umutex-specific flags, in addition to the
 common flags:
-.Bl -tag -width "Dv UMUTEX_PRIO_INHERIT"
+.Bl -tag -width "Dv UMUTEX_NONCONSISTENT"
 .It Dv UMUTEX_PRIO_INHERIT
 Mutex implements
 .Em Priority Inheritance
@@ -115,6 +122,13 @@ protocol.
 Mutex implements
 .Em Priority Protection
 protocol.
+.It Dv UMUTEX_ROBUST
+Mutex is robust, as described in the
+.Sx ROBUST UMUTEXES
+section below.
+.It Dv UMUTEX_NONCONSISTENT
+Robust mutex is in a transient non-consistent state.
+Not used by kernel.
 .El
 .Pp
 In the manual page, mutexes not having
@@ -417,6 +431,75 @@ primitives, even when the physical addre
 When waking up a limited number of threads from a given sleep queue,
 the highest priority threads that have been blocked for the longest on
 the queue are selected.
+.Ss ROBUST UMUTEXES
+The
+.Em robust umutexes
+are provided as a substrate for a userspace library to implement
+POSIX robust mutexes.
+A robust umutex must have the
+.Dv UMUTEX_ROBUST
+flag set.
+.Pp
+On thread termination, the kernel walks two lists of mutexes.
+The two lists head addresses must be provided by a prior call to
+.Dv UMTX_OP_ROBUST_LISTS
+request.
+The lists are singly-linked.
+The link to next element is provided by the
+.Dv m_rb_lnk
+member of the
+.Vt struct umutex .
+.Pp
+Robust list processing is aborted if the kernel finds a mutex
+with any of the following conditions:
+.Bl -dash -offset indent -compact
+.It
+the
+.Dv UMUTEX_ROBUST
+flag is not set
+.It
+not owned by the current thread, except when the mutex is pointed to
+by the
+.Dv robust_inactive
+member of the
+.Vt struct umtx_robust_lists_params ,
+registered for the current thread
+.It
+the combination of mutex flags is invalid
+.It
+read of the umutex memory faults
+.It
+the list length limit described in
+.Xr libthr 3
+ is reached.
+.El
+.Pp
+Every mutex in both lists is unlocked as if the
+.Dv UMTX_OP_MUTEX_UNLOCK
+request is performed on it, but instead of the
+.Dv UMUTEX_UNOWNED
+value, the
+.Dv m_owner
+field is written with the
+.Dv UMUTEX_RB_OWNERDEAD
+value.
+When a mutex in the
+.Dv UMUTEX_RB_OWNERDEAD
+state is locked by kernel due to the
+.Dv UMTX_OP_MUTEX_TRYLOCK
+and
+.Dv UMTX_OP_MUTEX_LOCK
+requests, the lock is granted and
+.Er EOWNERDEAD
+error is returned.
+.Pp
+Also, the kernel handles the
+.Dv UMUTEX_RB_NOTRECOV
+value of
+.Dv the m_owner
+field specially, always returning the
+.Er ENOTRECOVERABLE
+error for lock attempts, without granting the lock.
 .Ss OPERATIONS
 The following operations, requested by the
 .Fa op
@@ -582,12 +665,12 @@ The arguments to the request are:
 Pointer to the umutex.
 .It Fa val
 New ceiling value.
-.It Fa uaddr1
+.It Fa uaddr
 Address of a variable of type
 .Vt uint32_t .
 If not NULL, after the successful update the previous ceiling value is
 written to the location pointed to by
-.Fa uaddr1 .
+.Fa uaddr .
 .El
 .Pp
 The request locks the umutex pointed to by the
@@ -614,7 +697,7 @@ Pointer to the
 .Vt struct ucond .
 .It Fa val
 Request flags, see below.
-.It Fa uaddr1
+.It Fa uaddr
 Pointer to the umutex.
 .It Fa uaddr2
 Optional pointer to a
@@ -624,7 +707,7 @@ for timeout specification.
 .Pp
 The request must be issued by the thread owning the mutex pointed to
 by the
-.Fa uaddr1
+.Fa uaddr
 argument.
 The
 .Dv c_hash_waiters
@@ -633,7 +716,7 @@ member of the
 pointed to by the
 .Fa obj
 argument, is set to an arbitrary non-zero value, after which the
-.Fa uaddr1
+.Fa uaddr
 mutex is unlocked (following the appropriate protocol), and
 the current thread is put to sleep on the sleep queue keyed by
 the
@@ -651,7 +734,7 @@ the same sleep queue, the
 .Dv c_hash_waiters
 member is cleared.
 After wakeup, the
-.Fa uaddr1
+.Fa uaddr
 umutex is not relocked.
 .Pp
 The following flags are defined:
@@ -1084,6 +1167,58 @@ The
 argument specifies the virtual address, which backing physical memory
 byte identity is used as a key for the anonymous shared object
 creation or lookup.
+.It Dv UMTX_OP_ROBUST_LISTS
+Register the list heads for the current thread's robust mutex lists.
+The arguments to the request are:
+.Bl -tag -width "It Fa obj"
+.It Fa val
+Size of the structure passed in the
+.Fa uaddr
+argument.
+.It Fa uaddr
+Pointer to the structure of type
+.Vt struct umtx_robust_lists_params .
+.El
+.Pp
+The structure is defined as
+.Bd -literal
+struct umtx_robust_lists_params {
+	uintptr_t	robust_list_offset;
+	uintptr_t	robust_priv_list_offset;
+	uintptr_t	robust_inact_offset;
+};
+.Ed
+.Pp
+The
+.Dv robust_list_offset
+member contains address of the first element in the list of locked
+robust shared mutexes.
+The
+.Dv robust_priv_list_offset
+member contains address of the first element in the list of locked
+robust private mutexes.
+The private and shared robust locked lists are split to allow fast
+termination of the shared list on fork, in the child.
+.Pp
+The
+.Dv robust_inact_offset
+contains a pointer to the mutex which might be locked in nearby future,
+or might have been just unlocked.
+It is typically set by the lock or unlock mutex implementation code
+around the whole operation, since lists can be only changed race-free
+when the thread owns the mutex.
+The kernel inspects the
+.Dv robust_inact_offset
+in addition to walking the shared and private lists.
+Also, the mutex pointed to by
+.Dv robust_inact_offset
+is handled more loosly at the thread termination time,
+than other mutexes on the list.
+That mutex is allowed to be not owned by the current thread,
+in which case list processing is continued.
+See
+.Sx ROBUST UMUTEXES
+subsection for details.
 .El
 .Sh RETURN VALUES
 If successful,
@@ -1106,7 +1241,7 @@ variable is set to indicate the error.
 The
 .Fn _umtx_op
 operations will return the following errors:
-.Bl -tag -width Er
+.Bl -tag -width "Bq Er ENOTRECOVERABLE"
 .It Bq Er EFAULT
 One of the arguments point to invalid memory.
 .It Bq Er EINVAL
@@ -1145,7 +1280,7 @@ The
 argument specifies invalid operation.
 .It Bq Er EINVAL
 The
-.Fa uaddr1
+.Fa uaddr
 argument for the
 .Dv UMTX_OP_SHM
 request specifies invalid operation.
@@ -1162,6 +1297,21 @@ array during lock or unlock operations,
 .Dv RTP_PRIO_MAX .
 .It Bq Er EPERM
 Unlock attempted on an object not owned by the current thread.
+.It Bq Er EOWNERDEAD
+The lock was requested on an umutex where the
+.Dv m_owner
+field was set to the
+.Dv UMUTEX_RB_OWNERDEAD
+value, indicating terminated robust mutex.
+The lock was granted to the caller, so this error in fact
+indicates success with additional conditions.
+.It Bq Er ENOTRECOVERABLE
+The lock was requested on an umutex which
+.Dv m_owner
+field is equal to the
+.Dv UMUTEX_RB_NOTRECOV
+value, indicating abandoned robust mutex after termination.
+The lock was not granted to the caller.
 .It Bq Er ENOTTY
 The shared memory object, associated with the address passed to the
 .Dv UMTX_SHM_ALIVE
@@ -1197,7 +1347,7 @@ for read.
 A try mutex lock operation was not able to obtain the lock.
 .It Bq Er ETIMEDOUT
 The request specified a timeout in the
-.Fa uaddr1
+.Fa uaddr
 and
 .Fa uaddr2
 arguments, and timed out before obtaining the lock or being woken up.
@@ -1211,6 +1361,27 @@ Mutex lock requests without timeout spec
 The error is typically not returned to userspace code, restart is
 handled by usual adjustment of the instruction counter.
 .El
+.Sh BUGS
+A window between a unlocking robust mutex and resetting the pointer in the
+.Dv robust_inact_offset
+member of the registered
+.Vt struct umtx_robust_lists_params
+allows another thread to destroy the mutex, thus making the kernel inspect
+freed or reused memory.
+The
+.Li libthr
+implementation is only vulnerable to this race when operating on
+a shared mutex.
+A possible fix for the current implementation is to strengthen the checks
+for shared mutexes before terminating them, in particular, verifying
+that the mutex memory is mapped from the POSIX shared object, allocated
+by the
+.Dv UMTX_OP_SHM
+request.
+This is not done because it is believed that the race is adequately
+covered by other consistency checks, while adding the check would
+prevent alternative implementations of
+.Li libpthread .
 .Sh SEE ALSO
 .Xr clock_gettime 2 ,
 .Xr mmap 2 ,

Modified: head/lib/libthr/libthr.3
==============================================================================
--- head/lib/libthr/libthr.3	Thu May 19 17:21:24 2016	(r300220)
+++ head/lib/libthr/libthr.3	Thu May 19 17:40:00 2016	(r300221)
@@ -29,7 +29,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd February 12, 2015
+.Dd May 17, 2016
 .Dt LIBTHR 3
 .Os
 .Sh NAME
@@ -167,7 +167,7 @@ for 32bit architectures.
 The following environment variables are recognized by
 .Nm
 and adjust the operation of the library at run-time:
-.Bl -tag -width LIBPTHREAD_SPLITSTACK_MAIN
+.Bl -tag -width "Ev LIBPTHREAD_SPLITSTACK_MAIN"
 .It Ev LIBPTHREAD_BIGSTACK_MAIN
 Disables the reduction of the initial thread stack enabled by
 .Ev LIBPTHREAD_SPLITSTACK_MAIN .
@@ -198,7 +198,37 @@ The integer value of the variable specif
 threads are inserted at the head of the sleep queue, instead of its tail.
 Bigger values reduce the frequency of the FIFO discipline.
 The value must be between 0 and 255.
+.Pp
+.El
+The following
+.Dv sysctl
+MIBs affect the operation of the library:
+.Bl -tag -width "Dv debug.umtx.robust_faults_verbose"
+.It Dv kern.ipc.umtx_vnode_persistent
+By default, a shared lock backed by a mapped file in memory is
+automatically destroyed on the last unmap of the corresponding file's page,
+which is allowed by POSIX.
+Setting the sysctl to 1 makes such a shared lock object persist until
+the vnode is recycled by the Virtual File System.
+Note that in case file is not opened and not mapped, the kernel might
+recycle it at any moment, making this sysctl less useful than it sounds.
+.It Dv kern.ipc.umtx_max_robust
+The maximal number of robust mutexes allowed for one thread.
+The kernel will not unlock more mutexes than specified, see
+.Xr _umtx_op
+for more details.
+The default value is large enough for most useful applications.
+.It Dv debug.umtx.robust_faults_verbose
+A non zero value makes kernel emit some diagnostic when the robust
+mutexes unlock was prematurely aborted after detecting some inconsistency,
+as a measure to prevent memory corruption.
 .El
+.Pp
+The
+.Dv RLIMIT_UMTXP
+limit (see
+.Xr getrlimit 2 )
+defines how many shared locks a given user may create simultaneously.
 .Sh INTERACTION WITH RUN-TIME LINKER
 On load,
 .Nm
@@ -236,6 +266,12 @@ logs.
 .Xr ld-elf.so.1 1 ,
 .Xr getrlimit 2 ,
 .Xr errno 2 ,
+.Xr thr_exit 2 ,
+.Xr thr_kill 2 ,
+.Xr thr_kill2 2 ,
+.Xr thr_new 2 ,
+.Xr thr_self 2 ,
+.Xr thr_set_name 2 ,
 .Xr _umtx_op 2 ,
 .Xr dlclose 3 ,
 .Xr dlopen 3 ,