Date:      Thu, 19 May 2016 17:40:00 +0000 (UTC)
From:      Konstantin Belousov <kib@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r300221 - in head/lib: libc/sys libthr
Message-ID:  <201605191740.u4JHe0kk044054@repo.freebsd.org>

Author: kib
Date: Thu May 19 17:40:00 2016
New Revision: 300221
URL: https://svnweb.freebsd.org/changeset/base/300221

Log:
  Document _umtx_op(2) interface for the implementation of robust mutexes.
  In libthr(3), list added knobs.
  
  Reviewed by:	emaste
  Sponsored by:	The FreeBSD Foundation
  Differential revision:	https://reviews.freebsd.org/D6427

Modified:
  head/lib/libc/sys/_umtx_op.2
  head/lib/libthr/libthr.3

Modified: head/lib/libc/sys/_umtx_op.2
==============================================================================
--- head/lib/libc/sys/_umtx_op.2	Thu May 19 17:21:24 2016	(r300220)
+++ head/lib/libc/sys/_umtx_op.2	Thu May 19 17:40:00 2016	(r300221)
@@ -28,7 +28,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd May 5, 2016
+.Dd May 17, 2016
 .Dt _UMTX_OP 2
 .Os
 .Sh NAME
@@ -85,6 +85,7 @@ struct umutex {
 	volatile lwpid_t m_owner;
 	uint32_t         m_flags;
 	uint32_t         m_ceilings[2];
+	uintptr_t        m_rb_lnk;
 };
 .Ed
 .Pp
@@ -95,18 +96,24 @@ It contains either the thread identifier
 locked state, or zero when the lock is unowned.
 The highest bit set indicates that there is contention on the lock.
 The constants are defined for special values:
-.Bl -tag -width "Dv UMUTEX_CONTESTED"
+.Bl -tag -width "Dv UMUTEX_RB_OWNERDEAD"
 .It Dv UMUTEX_UNOWNED
 Zero, the value stored in the unowned lock.
 .It Dv UMUTEX_CONTESTED
 The contenion indicator.
+.It Dv UMUTEX_RB_OWNERDEAD
+A thread owning the robust mutex terminated.
+The mutex is in unlocked state.
+.It Dv UMUTEX_RB_NOTRECOV
+The robust mutex is in a non-recoverable state.
+It cannot be locked until reinitialized.
 .El
 .Pp
 The
 .Dv m_flags
 field may contain the following umutex-specific flags, in addition to
 the common flags:
-.Bl -tag -width "Dv UMUTEX_PRIO_INHERIT"
+.Bl -tag -width "Dv UMUTEX_NONCONSISTENT"
 .It Dv UMUTEX_PRIO_INHERIT
 Mutex implements
 .Em Priority Inheritance
@@ -115,6 +122,13 @@ protocol.
 Mutex implements
 .Em Priority Protection
 protocol.
+.It Dv UMUTEX_ROBUST
+Mutex is robust, as described in the
+.Sx ROBUST UMUTEXES
+section below.
+.It Dv UMUTEX_NONCONSISTENT
+Robust mutex is in a transient non-consistent state.
+Not used by the kernel.
 .El
 .Pp
 In the manual page, mutexes not having
@@ -417,6 +431,75 @@ primitives, even when the physical addre
 When waking up a limited number of threads from a given sleep queue,
 the highest priority threads that have been blocked for the longest on
 the queue are selected.
+.Ss ROBUST UMUTEXES
+The
+.Em robust umutexes
+are provided as a substrate for a userspace library to implement
+POSIX robust mutexes.
+A robust umutex must have the
+.Dv UMUTEX_ROBUST
+flag set.
+.Pp
+On thread termination, the kernel walks two lists of mutexes.
+The head addresses of both lists must be provided by a prior
+.Dv UMTX_OP_ROBUST_LISTS
+request.
+The lists are singly-linked.
+The link to the next element is provided by the
+.Dv m_rb_lnk
+member of the
+.Vt struct umutex .
+.Pp
+Robust list processing is aborted if the kernel finds a mutex
+with any of the following conditions:
+.Bl -dash -offset indent -compact
+.It
+the
+.Dv UMUTEX_ROBUST
+flag is not set
+.It
+not owned by the current thread, except when the mutex is pointed to
+by the
+.Dv robust_inact_offset
+member of the
+.Vt struct umtx_robust_lists_params ,
+registered for the current thread
+.It
+the combination of mutex flags is invalid
+.It
+read of the umutex memory faults
+.It
+the list length limit described in
+.Xr libthr 3
+is reached.
+.El
+.Pp
+Every mutex in both lists is unlocked as if the
+.Dv UMTX_OP_MUTEX_UNLOCK
+request is performed on it, but instead of the
+.Dv UMUTEX_UNOWNED
+value, the
+.Dv m_owner
+field is written with the
+.Dv UMUTEX_RB_OWNERDEAD
+value.
+When a mutex in the
+.Dv UMUTEX_RB_OWNERDEAD
+state is locked by the kernel due to the
+.Dv UMTX_OP_MUTEX_TRYLOCK
+and
+.Dv UMTX_OP_MUTEX_LOCK
+requests, the lock is granted and
+.Er EOWNERDEAD
+error is returned.
+.Pp
+Also, the kernel handles the
+.Dv UMUTEX_RB_NOTRECOV
+value of the
+.Dv m_owner
+field specially, always returning the
+.Er ENOTRECOVERABLE
+error for lock attempts, without granting the lock.
 .Ss OPERATIONS
 The following operations, requested by the
 .Fa op
@@ -582,12 +665,12 @@ The arguments to the request are:
 Pointer to the umutex.
 .It Fa val
 New ceiling value.
-.It Fa uaddr1
+.It Fa uaddr
 Address of a variable of type
 .Vt uint32_t .
 If not NULL, after the successful update the previous ceiling value is
 written to the location pointed to by
-.Fa uaddr1 .
+.Fa uaddr .
 .El
 .Pp
 The request locks the umutex pointed to by the
@@ -614,7 +697,7 @@ Pointer to the
 .Vt struct ucond .
 .It Fa val
 Request flags, see below.
-.It Fa uaddr1
+.It Fa uaddr
 Pointer to the umutex.
 .It Fa uaddr2
 Optional pointer to a
@@ -624,7 +707,7 @@ for timeout specification.
 .Pp
 The request must be issued by the thread owning the mutex pointed to
 by the
-.Fa uaddr1
+.Fa uaddr
 argument.
 The
 .Dv c_hash_waiters
@@ -633,7 +716,7 @@ member of the
 pointed to by the
 .Fa obj
 argument, is set to an arbitrary non-zero value, after which the
-.Fa uaddr1
+.Fa uaddr
 mutex is unlocked (following the appropriate protocol), and
 the current thread is put to sleep on the sleep queue keyed by
 the
@@ -651,7 +734,7 @@ the same sleep queue, the
 .Dv c_hash_waiters
 member is cleared.
 After wakeup, the
-.Fa uaddr1
+.Fa uaddr
 umutex is not relocked.
 .Pp
 The following flags are defined:
@@ -1084,6 +1167,58 @@ The
 argument specifies the virtual address, which backing physical memory
 byte identity is used as a key for the anonymous shared object
 creation or lookup.
+.It Dv UMTX_OP_ROBUST_LISTS
+Register the list heads for the current thread's robust mutex lists.
+The arguments to the request are:
+.Bl -tag -width "It Fa obj"
+.It Fa val
+Size of the structure passed in the
+.Fa uaddr
+argument.
+.It Fa uaddr
+Pointer to the structure of type
+.Vt struct umtx_robust_lists_params .
+.El
+.Pp
+The structure is defined as
+.Bd -literal
+struct umtx_robust_lists_params {
+	uintptr_t	robust_list_offset;
+	uintptr_t	robust_priv_list_offset;
+	uintptr_t	robust_inact_offset;
+};
+.Ed
+.Pp
+The
+.Dv robust_list_offset
+member contains the address of the first element in the list of locked
+robust shared mutexes.
+The
+.Dv robust_priv_list_offset
+member contains the address of the first element in the list of locked
+robust private mutexes.
+The private and shared robust locked lists are split to allow fast
+termination of the shared list on fork, in the child.
+.Pp
+The
+.Dv robust_inact_offset
+contains a pointer to the mutex which might be locked in the near future,
+or might have just been unlocked.
+It is typically set by the lock or unlock mutex implementation code
+around the whole operation, since the lists can only be changed race-free
+when the thread owns the mutex.
+The kernel inspects the
+.Dv robust_inact_offset
+in addition to walking the shared and private lists.
+Also, the mutex pointed to by
+.Dv robust_inact_offset
+is handled more loosely at thread termination time
+than other mutexes on the list.
+That mutex is allowed to be not owned by the current thread,
+in which case list processing is continued.
+See
+.Sx ROBUST UMUTEXES
+subsection for details.
 .El
 .Sh RETURN VALUES
 If successful,
@@ -1106,7 +1241,7 @@ variable is set to indicate the error.
 The
 .Fn _umtx_op
 operations will return the following errors:
-.Bl -tag -width Er
+.Bl -tag -width "Bq Er ENOTRECOVERABLE"
 .It Bq Er EFAULT
 One of the arguments point to invalid memory.
 .It Bq Er EINVAL
@@ -1145,7 +1280,7 @@ The
 argument specifies invalid operation.
 .It Bq Er EINVAL
 The
-.Fa uaddr1
+.Fa uaddr
 argument for the
 .Dv UMTX_OP_SHM
 request specifies invalid operation.
@@ -1162,6 +1297,21 @@ array during lock or unlock operations, 
 .Dv RTP_PRIO_MAX .
 .It Bq Er EPERM
 Unlock attempted on an object not owned by the current thread.
+.It Bq Er EOWNERDEAD
+The lock was requested on an umutex where the
+.Dv m_owner
+field was set to the
+.Dv UMUTEX_RB_OWNERDEAD
+value, indicating that the owner of the robust mutex terminated.
+The lock was granted to the caller, so this error in fact
+indicates success with additional conditions.
+.It Bq Er ENOTRECOVERABLE
+The lock was requested on an umutex whose
+.Dv m_owner
+field is equal to the
+.Dv UMUTEX_RB_NOTRECOV
+value, indicating abandoned robust mutex after termination.
+The lock was not granted to the caller.
 .It Bq Er ENOTTY
 The shared memory object, associated with the address passed to the
 .Dv UMTX_SHM_ALIVE
@@ -1197,7 +1347,7 @@ for read.
 A try mutex lock operation was not able to obtain the lock.
 .It Bq Er ETIMEDOUT
 The request specified a timeout in the
-.Fa uaddr1
+.Fa uaddr
 and
 .Fa uaddr2
 arguments, and timed out before obtaining the lock or being woken up.
@@ -1211,6 +1361,27 @@ Mutex lock requests without timeout spec
 The error is typically not returned to userspace code, restart
 is handled by usual adjustment of the instruction counter.
 .El
+.Sh BUGS
+A window between unlocking a robust mutex and resetting the pointer in the
+.Dv robust_inact_offset
+member of the registered
+.Vt struct umtx_robust_lists_params
+allows another thread to destroy the mutex, thus making the kernel inspect
+freed or reused memory.
+The
+.Li libthr
+implementation is only vulnerable to this race when operating on
+a shared mutex.
+A possible fix for the current implementation is to strengthen the checks
+for shared mutexes before terminating them, in particular, verifying
+that the mutex memory is mapped from the POSIX shared object, allocated
+by the
+.Dv UMTX_OP_SHM
+request.
+This is not done because it is believed that the race is adequately
+covered by other consistency checks, while adding the check would
+prevent alternative implementations of
+.Li libpthread .
 .Sh SEE ALSO
 .Xr clock_gettime 2 ,
 .Xr mmap 2 ,

Modified: head/lib/libthr/libthr.3
==============================================================================
--- head/lib/libthr/libthr.3	Thu May 19 17:21:24 2016	(r300220)
+++ head/lib/libthr/libthr.3	Thu May 19 17:40:00 2016	(r300221)
@@ -29,7 +29,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd February 12, 2015
+.Dd May 17, 2016
 .Dt LIBTHR 3
 .Os
 .Sh NAME
@@ -167,7 +167,7 @@ for 32bit architectures.
 The following environment variables are recognized by
 .Nm
 and adjust the operation of the library at run-time:
-.Bl -tag -width LIBPTHREAD_SPLITSTACK_MAIN
+.Bl -tag -width "Ev LIBPTHREAD_SPLITSTACK_MAIN"
 .It Ev LIBPTHREAD_BIGSTACK_MAIN
 Disables the reduction of the initial thread stack enabled by
 .Ev LIBPTHREAD_SPLITSTACK_MAIN .
@@ -198,7 +198,37 @@ The integer value of the variable specif
 threads are inserted at the head of the sleep queue, instead of its tail.
 Bigger values reduce the frequency of the FIFO discipline.
 The value must be between 0 and 255.
+.Pp
+.El
+The following
+.Dv sysctl
+MIBs affect the operation of the library:
+.Bl -tag -width "Dv debug.umtx.robust_faults_verbose"
+.It Dv kern.ipc.umtx_vnode_persistent
+By default, a shared lock backed by a mapped file in memory is
+automatically destroyed on the last unmap of the corresponding file's page,
+which is allowed by POSIX.
+Setting the sysctl to 1 makes such a shared lock object persist until
+the vnode is recycled by the Virtual File System.
+Note that if the file is neither open nor mapped, the kernel might
+recycle it at any moment, making this sysctl less useful than it sounds.
+.It Dv kern.ipc.umtx_max_robust
+The maximal number of robust mutexes allowed for one thread.
+The kernel will not unlock more mutexes than specified, see
+.Xr _umtx_op 2
+for more details.
+The default value is large enough for most useful applications.
+.It Dv debug.umtx.robust_faults_verbose
+A non-zero value makes the kernel emit a diagnostic when unlocking of robust
+mutexes was prematurely aborted after detecting an inconsistency,
+as a measure to prevent memory corruption.
 .El
+.Pp
+The
+.Dv RLIMIT_UMTXP
+limit (see
+.Xr getrlimit 2 )
+defines how many shared locks a given user may create simultaneously.
 .Sh INTERACTION WITH RUN-TIME LINKER
 On load,
 .Nm
@@ -236,6 +266,12 @@ logs.
 .Xr ld-elf.so.1 1 ,
 .Xr getrlimit 2 ,
 .Xr errno 2 ,
+.Xr thr_exit 2 ,
+.Xr thr_kill 2 ,
+.Xr thr_kill2 2 ,
+.Xr thr_new 2 ,
+.Xr thr_self 2 ,
+.Xr thr_set_name 2 ,
 .Xr _umtx_op 2 ,
 .Xr dlclose 3 ,
 .Xr dlopen 3 ,


