Date: Thu, 5 May 2016 16:10:29 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: threads@freebsd.org
Cc: arch@freebsd.org
Subject: Robust mutexes implementation
Message-ID: <20160505131029.GE2422@kib.kiev.ua>
I implemented robust mutexes for our libthr. A robust mutex is guaranteed to be cleared by the system if the owning thread or process terminates while the mutex is held. The next locker is then notified that the mutex state is inconsistent and can perform (or abandon) corrective actions.

The patch mostly consists of small changes here and there, adding the necessary checks for the inconsistent and abandoned conditions to existing paths. Additionally, the thread exit handler was extended to iterate over the userspace-maintained list of owned robust mutexes, unlocking each of them and marking it as abandoned by a terminated owner.

The list of owned robust mutexes cannot be kept atomically in sync with the mutex lock state (this is possible in the kernel, but is too expensive). Instead, for the duration of a lock or unlock operation, the current mutex is remembered in a special slot that the kernel also checks at thread termination.

The kernel must know the per-thread location of the heads of the robust mutex lists and of the current active mutex slot. Initially I tried to extend TCBs with this data, so that only a single syscall at threading library initialization would be needed: the kernel knows the location of the TCB for any thread, and the syscall would pass the offsets. Unfortunately, on some architectures the size of the TCB is part of the fixed ABI and cannot be changed. Instead, when a thread touches a robust mutex for the first time, a new umtx op syscall is issued which informs the kernel about the location of the list heads.

The umtx sleep queues for PP and PI mutexes are split between non-robust and robust. I do not understand the reasoning behind this POSIX requirement.

The patch passes all glibc tests for robust mutexes that I found in the nptl/ directory. See https://github.com/kostikbel/glibc-robust-tests .

The patch is available at https://kib.kiev.ua/kib/pshared/robust.1.patch (beware of the self-signed root certificate in the chain).

Work was sponsored by The FreeBSD Foundation.
Unrelated things in the patch:

1. Style. Since I had to re-read the whole of sys/kern/kern_umtx.c, lib/libthr/thread/thr_umtx.h and lib/libthr/thread/thr_umtx.c, I started fixing the numerous style violations in these files, which actually made my eyes bleed.

2. The fix for proper use of the tdfind() call in umtxq_sleep_pi() for shared PI mutexes.

3. Removal of the struct pthread_mutex m_owner field. I cannot see why it is useful. The only optimization it provides is the possibility to avoid clearing the UMUTEX_CONTESTED bit when reading m_lock.m_owner. The disadvantage of having this duplicated field is that the kernel does not know about pthread_mutex, so it cannot fix the duplicated value. Overall it is less work to clear UMUTEX_CONTESTED when checking the owner than to try to handle the inconsistencies. I added the PMUTEX_OWNER_ID() macro to simplify the code.

4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls the lifetime of the shared mutex associated with a vnode's page. Apparently, there is real code around which expects the following to work:
   - mmap a file, create a shared mutex in the mapping;
   - the process exits;
   - another process starts, mmaps the same file and expects that the previously initialized mutex is still usable.
   The knob changes the lifetime of such a shared off-page from 'destroy on last unmap' to either 'until the vnode is reclaimed' or 'until pthread_mutex_destroy is called', whichever comes first.
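To illustrate item 3, here is a minimal sketch of what checking the owner through the lock word alone can look like. The two structures are simplified stand-ins invented for this example (the real struct umutex and struct pthread_mutex have more fields), the helper mutex_owned_by() is hypothetical, and the macro body is an assumption about the general shape of PMUTEX_OWNER_ID() rather than a quote from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Flag kept in the top bit of the lock word while there are waiters. */
#define	UMUTEX_CONTESTED	0x80000000U

/* Assumption of the sketch: thread IDs never use the flag bit. */
static_assert((UMUTEX_CONTESTED & 0x7fffffffU) == 0,
    "contested flag must not overlap the thread ID bits");

/* Simplified stand-ins for the real structures. */
struct umutex_sketch {
	volatile uint32_t m_owner;	/* owner thread ID | flag bits */
};

struct pthread_mutex_sketch {
	struct umutex_sketch m_lock;	/* the only copy of the owner */
};

/* Owner ID with the contested bit masked off (assumed macro shape). */
#define	PMUTEX_OWNER_ID(m)	((m)->m_lock.m_owner & ~UMUTEX_CONTESTED)

/* Hypothetical helper: non-zero if the mutex is owned by thread tid. */
int
mutex_owned_by(const struct pthread_mutex_sketch *m, uint32_t tid)
{

	return (PMUTEX_OWNER_ID(m) == tid);
}
```

Because the owner is read from the same word that the kernel updates, there is no second copy in userspace that could become stale when the kernel clears the mutex on behalf of a terminated owner. The cost is one extra mask operation per check.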