From owner-freebsd-threads@freebsd.org Mon May 2 04:11:10 2016
Return-Path:
Delivered-To: freebsd-threads@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
    by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3DD9CB296DE
    for ; Mon, 2 May 2016 04:11:10 +0000 (UTC)
    (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (Client did not present a certificate)
    by mx1.freebsd.org (Postfix) with ESMTPS id 2EC27180A
    for ; Mon, 2 May 2016 04:11:10 +0000 (UTC)
    (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
    by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id u424BAIv057347
    for ; Mon, 2 May 2016 04:11:10 GMT
    (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209194] Seg fault with pthread_cond_signal() and pthread_cond_broadcast()
Date: Mon, 02 May 2016 04:11:10 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: threads
X-Bugzilla-Version: 10.2-RELEASE
X-Bugzilla-Keywords:
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: michael.cress@cress.us
X-Bugzilla-Status: New
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-threads@FreeBSD.org
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform op_sys bug_status bug_severity priority component assigned_to reporter
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 02 May 2016 04:11:10 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209194

            Bug ID: 209194
           Summary: Seg fault with pthread_cond_signal() and pthread_cond_broadcast()
           Product: Base System
           Version: 10.2-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: threads
          Assignee: freebsd-threads@FreeBSD.org
          Reporter: michael.cress@cress.us

I have code that is segfaulting with the following (partial) stack trace:

#0  0x281fa7cf in pthread_getspecific () from /lib/libthr.so.3
#1  0x28201597 in pthread_cond_signal () from /lib/libthr.so.3

Only one thread ever waits, using pthread_cond_timedwait(), on an initialized pthread_cond_t variable. Signalling occurs from another thread using pthread_cond_signal()/_broadcast(). Does anyone have any ideas on why this seg fault is occurring, or why the internal pthread_getspecific() call is failing?

--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-threads@freebsd.org Mon May 2 10:13:28 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209194] Seg fault with pthread_cond_signal() and pthread_cond_broadcast()
Date: Mon, 02 May 2016 10:13:28 +0000
X-Bugzilla-Who: kib@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209194

Konstantin Belousov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kib@FreeBSD.org

--- Comment #1 from Konstantin Belousov ---
(In reply to Michael Cress from comment #0)

The best idea is that your code has a bug.

Also, backtraces from libraries built without debugging information are useless (or too hard to interpret usefully), since the debugger matches the nearest defined symbols to the addresses in the backtrace, which typically misrepresents the problem.

But I am sure that libthr is a victim of an unrelated bug in other code.

From owner-freebsd-threads@freebsd.org Mon May 2 16:01:20 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209194] Seg fault with pthread_cond_signal() and pthread_cond_broadcast()
Date: Mon, 02 May 2016 16:01:20 +0000
X-Bugzilla-Who: michael.cress@cress.us

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209194

--- Comment #2 from Michael Cress ---
Is it possible to get a version of this library with the debugging information compiled in? I agree that the most likely explanation is a bug in my code as the unit tests for that section of code execute correctly. However, examining the data structures using GDB between succeeding and failing calls, I have verified that the cond var and the lock memory addresses are not being modified. Visibility of and access to the lock and cond vars do not escape the .c file, so I can't see where overwriting or corruption of the lock and cond vars could happen.

From owner-freebsd-threads@freebsd.org Mon May 2 16:19:41 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209194] Seg fault with pthread_cond_signal() and pthread_cond_broadcast()
Date: Mon, 02 May 2016 16:19:41 +0000
X-Bugzilla-Who: kib@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209194

--- Comment #3 from Konstantin Belousov ---
(In reply to Michael Cress from comment #2)

Check out the full src/ tree for your system.
Then do
(cd lib/libc && make all install DEBUG_FLAGS=-g)
(cd lib/libthr && make all install DEBUG_FLAGS=-g)
(cd libexec/rtld-elf && make all install DEBUG_FLAGS=-g)

From owner-freebsd-threads@freebsd.org Tue May 3 13:14:09 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend
Date: Tue, 03 May 2016 13:14:09 +0000
X-Bugzilla-Who: le277@cam.ac.uk

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

            Bug ID: 209233
           Summary: [patch] pthread_suspend_all_np races with check_suspend
           Product: Base System
           Version: 11.0-CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Keywords: patch
          Severity: Affects Some People
          Priority: ---
         Component: threads
          Assignee: freebsd-threads@FreeBSD.org
          Reporter: le277@cam.ac.uk

Created attachment 169926
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=169926&action=edit
Patch to fix race

Currently the suspend-all implementation in libthr only sets the NEED_SUSPEND flag if a thread does not have the SUSPENDED flag set. However, that thread may be in the process of resuming, and so its NEED_SUSPEND flag should always be set if we wish a thread to stay suspended, even if it already has the SUSPENDED flag set.
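
The window described in the report can be pictured with a small single-threaded model. This is a sketch under assumptions rather than libthr's code: the `MODEL_*` bit values and all function names below are hypothetical, and `stays_suspended` encodes the assumption that a parked thread keeps waiting only while a suspend request is outstanding.

```c
#include <assert.h>

/* Hypothetical flag bits for this model only; not libthr's definitions. */
#define MODEL_NEED_SUSPEND 0x1u  /* a suspender wants this thread stopped */
#define MODEL_SUSPENDED    0x2u  /* the thread has parked itself */
#define MODEL_ALL          (MODEL_NEED_SUSPEND | MODEL_SUSPENDED)

/* Behaviour described in the report: skip threads that look suspended. */
unsigned
request_suspend_conditional(unsigned flags)
{
    assert((flags & ~MODEL_ALL) == 0);
    if ((flags & MODEL_SUSPENDED) == 0)
        flags |= MODEL_NEED_SUSPEND;
    return (flags);
}

/* Behaviour proposed in the report: always record the request. */
unsigned
request_suspend_always(unsigned flags)
{
    assert((flags & ~MODEL_ALL) == 0);
    return (flags | MODEL_NEED_SUSPEND);
}

/*
 * A resume withdraws the request, but the target has not been scheduled
 * yet, so its SUSPENDED bit is still set.
 */
unsigned
resume_before_target_runs(unsigned flags)
{
    assert((flags & ~MODEL_ALL) == 0);
    return (flags & ~MODEL_NEED_SUSPEND);
}

/* Assumed rule: a parked thread stays parked only while a request exists. */
int
stays_suspended(unsigned flags)
{
    return ((flags & MODEL_NEED_SUSPEND) != 0);
}
```

Starting from a parked thread (both bits set), a resume followed immediately by a second suspend-all leaves only `MODEL_SUSPENDED` set under the conditional policy, so in this model the thread would carry on running. The unconditional policy sets the request bit again.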

From owner-freebsd-threads@freebsd.org Wed May 4 10:52:04 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend
Date: Wed, 04 May 2016 10:52:04 +0000
X-Bugzilla-Who: kib@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

Konstantin Belousov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kib@FreeBSD.org

--- Comment #1 from Konstantin Belousov ---
(In reply to Lawrence Esswood from comment #0)

So the issue is that we might leave pthread_resume{_all}_np with THR_FLAGS_SUSPENDED still set. IMO this is the real bug, and it cannot be worked around by trying to set NEED_SUSPEND in the next suspend cycle.

Instead, the resume functions should wait until the resumed thread is scheduled and has had a chance to run long enough to clear the flag. One more wait/wake point around thread->cycle would be needed to ensure a proper hand-off of control between the resume->resumed->resume threads.

From owner-freebsd-threads@freebsd.org Wed May 4 12:12:04 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend
Date: Wed, 04 May 2016 12:12:04 +0000
X-Bugzilla-Who: le277@cam.ac.uk

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

--- Comment #2 from Lawrence Esswood ---
(In reply to Konstantin Belousov from comment #1)

This would end up doing the same thing in the suspend logic as always setting the NEED_SUSPEND flag, as the !(thread->flags & THR_FLAGS_SUSPENDED) check would always be true if THR_FLAGS_NEED_SUSPEND was clear and if we insisted that !THR_FLAGS_NEED_SUSPEND => !THR_FLAGS_SUSPENDED after a resume. The only (observable) difference would be extra blocking in resume, which we probably want to avoid.
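
The equivalence argued in comment #2 can be checked by enumerating the four flag combinations. The sketch below is a model only: the `MODEL_*` values and the function names are hypothetical and are not taken from libthr.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model only. */
#define MODEL_NEED_SUSPEND 0x1u
#define MODEL_SUSPENDED    0x2u

/* The invariant from comment #2: NEED_SUSPEND clear implies SUSPENDED clear. */
bool
invariant_holds(unsigned flags)
{
    return ((flags & MODEL_NEED_SUSPEND) != 0 ||
        (flags & MODEL_SUSPENDED) == 0);
}

/* Suspend policy that skips threads already marked SUSPENDED. */
unsigned
set_if_not_suspended(unsigned flags)
{
    if ((flags & MODEL_SUSPENDED) == 0)
        flags |= MODEL_NEED_SUSPEND;
    return (flags);
}

/* Suspend policy that always sets NEED_SUSPEND. */
unsigned
set_always(unsigned flags)
{
    return (flags | MODEL_NEED_SUSPEND);
}

/* Count flag states in which the two policies give different results. */
int
count_disagreements(bool require_invariant)
{
    int n = 0;

    for (unsigned flags = 0; flags < 4; flags++) {
        assert(flags <= (MODEL_NEED_SUSPEND | MODEL_SUSPENDED));
        if (require_invariant && !invariant_holds(flags))
            continue;
        if (set_if_not_suspended(flags) != set_always(flags))
            n++;
    }
    return (n);
}
```

The two policies differ only in the state where SUSPENDED is set and NEED_SUSPEND is clear, which is exactly the state the invariant excludes.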

From owner-freebsd-threads@freebsd.org Wed May 4 15:19:07 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend
Date: Wed, 04 May 2016 15:19:06 +0000
X-Bugzilla-Who: kib@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

--- Comment #3 from Konstantin Belousov ---
(In reply to Lawrence Esswood from comment #2)

Yes, my intent is to guarantee that THR_FLAGS_SUSPENDED is never left dangling, since we are not supposed to distinguish between the previous and the current suspension in suspend_common().

Then, why not simply clear THR_FLAGS_SUSPENDED in resume_common()? We apparently do not care how far the thread has moved ahead while returning from check_suspend(), and the function unblocks SIGCANCEL only after it is ready for the next suspend.

Could you provide a test case for the problem?

From owner-freebsd-threads@freebsd.org Wed May 4 15:19:44 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend
Date: Wed, 04 May 2016 15:19:44 +0000
X-Bugzilla-Who: kib@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

--- Comment #4 from Konstantin Belousov ---
Created attachment 169969
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=169969&action=edit
Clear THR_FLAGS_SUSPENDED in resume_common()

From owner-freebsd-threads@freebsd.org Wed May 4 16:56:31 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend
Date: Wed, 04 May 2016 16:56:31 +0000
X-Bugzilla-Who: le277@cam.ac.uk

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

--- Comment #5 from Lawrence Esswood ---
(In reply to Konstantin Belousov from comment #3)

I had a very wonky test that failed very unreliably; now that I know why it happens, I can probably create a much better one to attach.

Your suggestion would work, I think. You would also have to change the break condition on the check_suspend loop, otherwise the thread will break out too early if it is woken for any reason.

One question, however: what happens if the thread in check_suspend has its NEED_SUSPEND flag set and is signalled somewhere between the loop exit and the _thr_signal_unblock? I think it won't do another check_suspend / check_deferred until it next hits a _thr_ast, which might be never.
We could maybe extend the thread lock to include the _thr_signal_unblock, or have the end of check_suspend make a recursive call.

I will try to knock up a test case.

From owner-freebsd-threads@freebsd.org Wed May 4 17:16:04 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend
Date: Wed, 04 May 2016 17:16:04 +0000
X-Bugzilla-Who: kib@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

--- Comment #6 from Konstantin Belousov ---
(In reply to Lawrence Esswood from comment #5)

Yes, it is a bug that I left the old test in the loop condition. I have now left only the check for THR_FLAGS_NEED_SUSPEND there, since it is cleared only on resume.

WRT SIGCANCEL generated after the loop exit and before _thr_signal_unblock: the blocked signal list includes SIGCANCEL, thus the SIGCANCEL signal should only be delivered after the unblock. Then typically we would re-enter _thr_ast() with a different set of flags.

From owner-freebsd-threads@freebsd.org Wed May 4 17:16:40 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend
Date: Wed, 04 May 2016 17:16:40 +0000
X-Bugzilla-Who: kib@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

Konstantin Belousov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #169969|0                           |1
        is obsolete|                            |

--- Comment #7 from Konstantin Belousov ---
Created attachment 169971
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=169971&action=edit
Clear THR_FLAGS_SUSPENDED in resume_common() v2

Fix loop condition.
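
The shape of the change discussed in comments #3 to #7 (the resumer clears the SUSPENDED flag itself, and the wait loop tests only the request flag) can be sketched with an ordinary mutex and condition variable. This is a model under assumptions, not the attached patch: libthr uses its own internal locking and wakeup primitives, and `struct model_thread`, the `model_*` functions, and the `MODEL_*` bits are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical flag bits for this model only. */
#define MODEL_NEED_SUSPEND 0x1u
#define MODEL_SUSPENDED    0x2u

struct model_thread {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    unsigned        flags;
};

/* Set up the model state; returns 0 on success. */
int
model_init(struct model_thread *t, unsigned flags)
{
    int rc;

    assert(t != NULL);
    t->flags = flags;
    rc = pthread_mutex_init(&t->lock, NULL);
    if (rc != 0)
        return (rc);
    rc = pthread_cond_init(&t->wake, NULL);
    if (rc != 0)
        pthread_mutex_destroy(&t->lock);
    return (rc);
}

/* Suspender side: record the request unconditionally. */
void
model_suspend(struct model_thread *t)
{
    pthread_mutex_lock(&t->lock);
    t->flags |= MODEL_NEED_SUSPEND;
    pthread_mutex_unlock(&t->lock);
}

/* Target side: park for as long as a suspend request is outstanding. */
void
model_check_suspend(struct model_thread *t)
{
    pthread_mutex_lock(&t->lock);
    /* Test only the request bit, so an unrelated wakeup re-enters the wait. */
    while ((t->flags & MODEL_NEED_SUSPEND) != 0) {
        t->flags |= MODEL_SUSPENDED;
        pthread_cond_wait(&t->wake, &t->lock);
    }
    pthread_mutex_unlock(&t->lock);
}

/* Resumer side: clear both bits before waking, so neither is left dangling. */
void
model_resume(struct model_thread *t)
{
    pthread_mutex_lock(&t->lock);
    t->flags &= ~(MODEL_NEED_SUSPEND | MODEL_SUSPENDED);
    pthread_cond_broadcast(&t->wake);
    pthread_mutex_unlock(&t->lock);
}
```

Because the resumer clears both bits under the lock, a later suspend request in this model never observes a stale SUSPENDED bit, whether or not the target thread has been scheduled in between.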
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-threads@freebsd.org Wed May 4 17:57:51 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 19B8EB2D0A6 for ; Wed, 4 May 2016 17:57:51 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 09C251F54 for ; Wed, 4 May 2016 17:57:51 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id u44Hvo02034519 for ; Wed, 4 May 2016 17:57:50 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-threads@FreeBSD.org Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend Date: Wed, 04 May 2016 17:57:50 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: threads X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: le277@cam.ac.uk X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-threads@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: attachments.created Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 May 2016 17:57:51 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

--- Comment #8 from Lawrence Esswood ---
Created attachment 169973
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=169973&action=edit
test case

(In reply to Konstantin Belousov from comment #3)
This fails pretty reliably / is the best I can do.

--
You are receiving this mail because: You are the assignee for the bug.

From owner-freebsd-threads@freebsd.org Wed May 4 18:26:48 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 96B12B2D7EE for ; Wed, 4 May 2016 18:26:48 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 874121EBE for ; Wed, 4 May 2016 18:26:48 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id u44IQmiY085580 for ; Wed, 4 May 2016 18:26:48 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-threads@FreeBSD.org Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend Date: Wed, 04 May 2016 18:26:48 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: threads X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: kib@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-threads@FreeBSD.org X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 May 2016 18:26:48 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

--- Comment #9 from Konstantin Belousov ---
(In reply to Lawrence Esswood from comment #8)
Test is good. It also fails reliably for me, and works on the patched libthr.

If you have no more comments, I will commit the fix. Some time later I might add the test to our test suite. Would be nice if you add copyright and license into the test source preamble, share/examples/etc/bsd-style-copyright provides text of the preferred license.

Thanks.

--
You are receiving this mail because: You are the assignee for the bug.

From owner-freebsd-threads@freebsd.org Thu May 5 10:20:55 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C92D7B2D5A1 for ; Thu, 5 May 2016 10:20:55 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B9F2310B5 for ; Thu, 5 May 2016 10:20:55 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id u45AKtQF015974 for ; Thu, 5 May 2016 10:20:55 GMT (envelope-from bugzilla-noreply@freebsd.org) From:
bugzilla-noreply@freebsd.org To: freebsd-threads@FreeBSD.org Subject: [Bug 209233] [patch] pthread_suspend_all_np races with check_suspend Date: Thu, 05 May 2016 10:20:55 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: threads X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-threads@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 May 2016 10:20:55 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209233

--- Comment #10 from commit-hook@freebsd.org ---
A commit references this bug:

Author: kib
Date: Thu May 5 10:20:23 UTC 2016
New revision: 299114
URL: https://svnweb.freebsd.org/changeset/base/299114

Log:
  Do not leak THR_FLAGS_SUSPENDED from the previous suspend/resume cycle. The flag currently is cleared by the resumed thread. If next suspend request comes before the thread was able to clean the flag, in which case suspender skip the thread. Instead, clear the THR_FLAGS_SUSPEND flag in resume_common(), we do not care how much code was executed in the resumed thread when the pthread_resume_*np(s) functions returned.

PR: 209233 Reported by: Lawrence Esswood MFC after: 1 week Changes: head/lib/libthr/thread/thr_resume_np.c head/lib/libthr/thread/thr_sig.c --=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-threads@freebsd.org Thu May 5 13:10:39 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B90D6B2CADD for ; Thu, 5 May 2016 13:10:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id A535A1934 for ; Thu, 5 May 2016 13:10:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 9D0A2B2CADB; Thu, 5 May 2016 13:10:39 +0000 (UTC) Delivered-To: threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9C7C2B2CAD9; Thu, 5 May 2016 13:10:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 45F9618FD; Thu, 5 May 2016 13:10:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u45DATQK087691 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 5 May 2016 16:10:30 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u45DATQK087691 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u45DATPC087690; Thu, 5 May 2016 16:10:29 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender 
to kostikbel@gmail.com using -f Date: Thu, 5 May 2016 16:10:29 +0300 From: Konstantin Belousov To: threads@freebsd.org Cc: arch@freebsd.org Subject: Robust mutexes implementation Message-ID: <20160505131029.GE2422@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.6.0 (2016-04-01) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 May 2016 13:10:39 -0000

I implemented robust mutexes for our libthr. A robust mutex is guaranteed to be cleared by the system upon either thread or process owner termination while the mutex is held. The next mutex locker is then notified about the inconsistent mutex state and can execute (or abandon) corrective actions.

The patch mostly consists of small changes here and there, adding necessary checks for the inconsistent and abandoned conditions into existing paths. Additionally, the thread exit handler was extended to iterate over the userspace-maintained list of owned robust mutexes, unlocking each of them and marking it as terminated.

The list of owned robust mutexes cannot be maintained atomically in sync with the mutex lock state (it is possible in the kernel, but is too expensive). Instead, for the duration of a lock or unlock operation, the current mutex is remembered in a special slot that is also checked by the kernel at thread termination.

The kernel must be aware of the per-thread location of the heads of the robust mutex lists and of the current active mutex slot.
Initially I tried to extend TCBs with this data, so only a single syscall at the threading library initialization would be needed: for any thread the location of the TCB is known by the kernel, and the syscall would pass offsets. Unfortunately, on some architectures the size of the TCB is part of the fixed ABI and cannot be changed. Instead, when a thread touches a robust mutex for the first time, a new umtx op syscall is issued which informs the kernel about the location of the list heads.

The umtx sleep queues for PP and PI mutexes are split between non-robust and robust. I do not understand the reasoning behind this POSIX requirement.

Patch passes all glibc tests for robust mutexes I found in the nptl/ directory. See https://github.com/kostikbel/glibc-robust-tests .

Patch is available at https://kib.kiev.ua/kib/pshared/robust.1.patch (beware of self-signed root certificate in the chain). Work was sponsored by The FreeBSD Foundation.

Unrelated things in the patch:

1. Style. Since I had to re-read whole sys/kern/kern_umtx.c, lib/libthr/thread/thr_umtx.h and lib/libthr/thread/thr_umtx.c, I started fixing the numerous style violations in these files, which actually made my eyes bleed.

2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared pi mutexes.

3. Removal of the struct pthread_mutex m_owner field. I cannot see why it is useful. The only optimization it provides is the possibility to avoid clearing the UMUTEX_CONTESTED bit when reading m_lock.m_owner. The disadvantage of having this duplicated field is that the kernel does not know about pthread_mutex, so it cannot fix the dup value. Overall it is less work to clear UMUTEX_CONTESTED when checking the owner than to try and handle inconsistencies. I added the PMUTEX_OWNER_ID() macro to simplify code.

4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls the lifetime of the shared mutex associated with a vnode's page.
   Apparently, there is real code around which expects the following to work:
   - mmap a file, create a shared mutex in the mapping;
   - the process exits;
   - another process starts, mmaps the same file and expects that the previously initialized mutex is still usable.
   The knob changes the lifetime of such a shared off-page from 'destroy on last unmap' to either 'until the vnode is reclaimed' or 'until pthread_mutex_destroy is called', whichever comes first.

From owner-freebsd-threads@freebsd.org Thu May 5 17:31:40 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1221BB2E651 for ; Thu, 5 May 2016 17:31:40 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id 0163418AA for ; Thu, 5 May 2016 17:31:40 +0000 (UTC) (envelope-from martin@lispworks.com) Received: by mailman.ysv.freebsd.org (Postfix) id F0A3AB2E64F; Thu, 5 May 2016 17:31:39 +0000 (UTC) Delivered-To: threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F00B5B2E64C; Thu, 5 May 2016 17:31:39 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [46.17.166.21]) by mx1.freebsd.org (Postfix) with ESMTP id 9D40218A8; Thu, 5 May 2016 17:31:39 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (higson.cam.lispworks.com [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.9/8.14.9) with ESMTP id u45HKZYM011908; Thu, 5 May 2016 18:20:35 +0100 (BST) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (localhost.localdomain [127.0.0.1]) by higson.cam.lispworks.com (8.14.4) id u45HKZlO021098; Thu, 5 May 2016 18:20:35 +0100 Received: (from martin@localhost)
by higson.cam.lispworks.com (8.14.4/8.14.4/Submit) id u45HKZ76021094; Thu, 5 May 2016 18:20:35 +0100 Date: Thu, 5 May 2016 18:20:35 +0100 Message-Id: <201605051720.u45HKZ76021094@higson.cam.lispworks.com> From: Martin Simmons To: Konstantin Belousov CC: threads@freebsd.org, arch@freebsd.org In-reply-to: <20160505131029.GE2422@kib.kiev.ua> (message from Konstantin Belousov on Thu, 5 May 2016 16:10:29 +0300) Subject: Re: Robust mutexes implementation References: <20160505131029.GE2422@kib.kiev.ua> X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 May 2016 17:31:40 -0000 There is a potential bug in enqueue_mutex when it tests m1 == NULL and m1 != NULL. These tests only work because m_lock is the first slot in struct pthread_mutex and hence 0 in curthread->robust_list is converted to NULL (rather than a negative value). Also, is it safe to assume memory ordering between the assignments of m->m_lock.m_rb_lnk and curthread->robust_list? Maybe it is OK because the kernel will never read curthread->robust_list until after the CPU executing enqueue_mutex has passed a memory barrier? 
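The first point above (a list head of 0 only maps to NULL when the embedded member sits at offset 0) can be illustrated with a small sketch. The structure layouts and helper names below are hypothetical and are not the libthr definitions; the arithmetic is done on uintptr_t values so that the example itself stays well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the embedded lock word with its list link. */
struct lock_word {
	uintptr_t m_rb_lnk;
};

/* Embedded member at offset 0: address 0 happens to map back to 0. */
struct mutex_lock_first {
	struct lock_word m_lock;
	int m_flags;
};

/* Embedded member at a non-zero offset: address 0 no longer maps to 0. */
struct mutex_lock_later {
	int m_flags;
	struct lock_word m_lock;
};

/* containerof-style arithmetic: member address minus member offset. */
static uintptr_t
container_addr(uintptr_t member_addr, size_t member_off)
{
	return (member_addr - member_off);
}

/* Safer conversion: test the raw list value before subtracting. */
static struct mutex_lock_later *
safe_container(uintptr_t rl)
{
	if (rl == 0)
		return (NULL);
	return ((struct mutex_lock_later *)
	    (rl - offsetof(struct mutex_lock_later, m_lock)));
}

/* Round trip: the address of the member leads back to its container. */
static int
roundtrip_ok(void)
{
	struct mutex_lock_later m = { 0, { 0 } };

	return (safe_container((uintptr_t)&m.m_lock) == &m);
}
```

With the member first, container_addr(0, 0) is 0 and a later comparison against NULL works by accident; with the member at a non-zero offset the result wraps to a non-zero value, so the empty-list test has to be made on the raw value, as safe_container() does.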
__Martin From owner-freebsd-threads@freebsd.org Thu May 5 18:58:17 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2AC83B2EA59 for ; Thu, 5 May 2016 18:58:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 1566E1FAF for ; Thu, 5 May 2016 18:58:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 0D553B2EA57; Thu, 5 May 2016 18:58:17 +0000 (UTC) Delivered-To: threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0CB3CB2EA55; Thu, 5 May 2016 18:58:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 82B551FAD; Thu, 5 May 2016 18:58:16 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u45IwAHm069983 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 5 May 2016 21:58:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u45IwAHm069983 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u45IwAcl069982; Thu, 5 May 2016 21:58:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 5 May 2016 21:58:10 +0300 From: Konstantin Belousov To: Martin Simmons Cc: threads@freebsd.org, arch@freebsd.org Subject: Re: Robust mutexes implementation 
Message-ID: <20160505185810.GF2422@kib.kiev.ua> References: <20160505131029.GE2422@kib.kiev.ua> <201605051720.u45HKZ76021094@higson.cam.lispworks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201605051720.u45HKZ76021094@higson.cam.lispworks.com> User-Agent: Mutt/1.6.0 (2016-04-01) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 05 May 2016 18:58:17 -0000

On Thu, May 05, 2016 at 06:20:35PM +0100, Martin Simmons wrote:
> There is a potential bug in enqueue_mutex when it tests m1 == NULL and m1 !=
> NULL. These tests only work because m_lock is the first slot in struct
> pthread_mutex and hence 0 in curthread->robust_list is converted to NULL
> (rather than a negative value).

Yes. In fact I wrote the __containerof stuff only later; the initial version of the patch relied on the fact that m_lock is at the beginning of pthread_mutex.

I rewrote the code in enqueue, and there is one more similar check in dequeue_mutex().

Updated patch is at https://kib.kiev.ua/kib/pshared/robust.2.patch . It also fixes the lack of userland split for non-robust pp/robust pp queues.

>
> Also, is it safe to assume memory ordering between the assignments of
> m->m_lock.m_rb_lnk and curthread->robust_list? Maybe it is OK because the
> kernel will never read curthread->robust_list until after the CPU executing
> enqueue_mutex has passed a memory barrier?

Inter-CPU ordering (I suppose you mean this, because you mention barriers) only matters when we consider multi-threaded interaction.
In the case of dequeue_mutex(), paired with the corresponding enqueue_mutex(), the calls occur in the same thread, and the thread is always self-consistent.

WRT possible reordering, we in fact only care that enqueue writes do not become visible before the lock is obtained, and that dequeue finishes for external observers before the lock is released. This is ensured by the umutex lock semantic, which has the necessary acquire barrier on lock and release barrier on unlock.

From owner-freebsd-threads@freebsd.org Fri May 6 13:00:54 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 02886B2F78E for ; Fri, 6 May 2016 13:00:54 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id E4AB11F1C for ; Fri, 6 May 2016 13:00:53 +0000 (UTC) (envelope-from martin@lispworks.com) Received: by mailman.ysv.freebsd.org (Postfix) id E0611B2F78B; Fri, 6 May 2016 13:00:53 +0000 (UTC) Delivered-To: threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DF885B2F789; Fri, 6 May 2016 13:00:53 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [46.17.166.21]) by mx1.freebsd.org (Postfix) with ESMTP id 8AE7A1F18; Fri, 6 May 2016 13:00:52 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (higson.cam.lispworks.com [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.9/8.14.9) with ESMTP id u46D0mQI006640; Fri, 6 May 2016 14:00:48 +0100 (BST) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (localhost.localdomain [127.0.0.1]) by higson.cam.lispworks.com (8.14.4) id u46D0mux008057; Fri, 6 May 2016 14:00:48
+0100 Received: (from martin@localhost) by higson.cam.lispworks.com (8.14.4/8.14.4/Submit) id u46D0mhN008053; Fri, 6 May 2016 14:00:48 +0100 Date: Fri, 6 May 2016 14:00:48 +0100 Message-Id: <201605061300.u46D0mhN008053@higson.cam.lispworks.com> From: Martin Simmons To: Konstantin Belousov CC: threads@freebsd.org, arch@freebsd.org In-reply-to: <20160505185810.GF2422@kib.kiev.ua> (message from Konstantin Belousov on Thu, 5 May 2016 21:58:10 +0300) Subject: Re: Robust mutexes implementation References: <20160505131029.GE2422@kib.kiev.ua> <201605051720.u45HKZ76021094@higson.cam.lispworks.com> <20160505185810.GF2422@kib.kiev.ua> X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 May 2016 13:00:54 -0000 >>>>> On Thu, 5 May 2016 21:58:10 +0300, Konstantin Belousov said: > > Yes. In fact I wrote the __containerof stuff only later, initial > version of the patch did relied on the fact that m_lock is at the > beginning of pthread_mutex. > > I rewrote the code in enqueue, and there is one more similar check in > dequeue_mutex(). > > Updated patch is at https://kib.kiev.ua/kib/pshared/robust.2.patch . OK. > On Thu, May 05, 2016 at 06:20:35PM +0100, Martin Simmons wrote: > > Also, is it safe to assume memory ordering between the assignments of > > m->m_lock.m_rb_lnk and curthread->robust_list? Maybe it is OK because the > > kernel will never read curthread->robust_list until after the CPU executing > > enqueue_mutex has passed a memory barrier? > > Inter-CPU ordering (I suppose you mean this, because you mention > barriers) only matter when we consider multi-threaded interaction. > In case of dequeue_mutex, paired to corresponding enqueue_mutex(), > the calls occur in the same thread, and the thread is always > self-consistent. Agreed. 
> WRT possible reordering, we in fact only care that enqueue writes do
> not become visible before lock is obtained, and dequeue must finish
> for external observers before lock is released. This is ensured by
> the umutex lock semantic, which has necessary acquire barrier on
> lock and release barrier on unlock.

I meant the case where CPU 1 claims the lock, executing this from enqueue_mutex:

    m->m_lock.m_rb_lnk = 0;          /* A */
    *rl = (uintptr_t)&m->m_lock;     /* B */

and then the thread dies and CPU 2 cleans up the dead thread executing this from umtx_handle_rb:

    error = copyin((void *)rbp, &m, sizeof(m));  /* C */
    *rb_list = m.m_rb_lnk;                       /* D */

where rbp is the value stored into *rl at B by CPU 1.

What ensures that CPU 1's store at A is seen by CPU 2 when it reads the value at D? I'm hoping this is covered by something that I don't understand, but I was worried by the comment "consistent even in the case of thread termination at arbitrary moment" above enqueue_mutex.

__Martin

From owner-freebsd-threads@freebsd.org Fri May 6 13:20:54 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A52F7B2FCAD for ; Fri, 6 May 2016 13:20:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 8EFB21A89 for ; Fri, 6 May 2016 13:20:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 8E5E5B2FCAB; Fri, 6 May 2016 13:20:54 +0000 (UTC) Delivered-To: threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8DDECB2FCAA; Fri, 6 May 2016 13:20:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
(using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3825B1A88; Fri, 6 May 2016 13:20:54 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u46DKi7P025450 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Fri, 6 May 2016 16:20:44 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u46DKi7P025450 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u46DKicH025449; Fri, 6 May 2016 16:20:44 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 6 May 2016 16:20:44 +0300 From: Konstantin Belousov To: Martin Simmons Cc: threads@freebsd.org, arch@freebsd.org Subject: Re: Robust mutexes implementation Message-ID: <20160506132044.GA89104@kib.kiev.ua> References: <20160505131029.GE2422@kib.kiev.ua> <201605051720.u45HKZ76021094@higson.cam.lispworks.com> <20160505185810.GF2422@kib.kiev.ua> <201605061300.u46D0mhN008053@higson.cam.lispworks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201605061300.u46D0mhN008053@higson.cam.lispworks.com> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 May 2016 13:20:54 -0000 On Fri, May 06, 2016 at 02:00:48PM +0100, Martin Simmons wrote: > >>>>> On Thu, 5 May 2016 21:58:10 +0300, 
Konstantin Belousov said:
> > WRT possible reordering, we in fact only care that enqueue writes do
> > not become visible before lock is obtained, and dequeue must finish
> > for external observers before lock is released. This is ensured by
> > the umutex lock semantic, which has necessary acquire barrier on
> > lock and release barrier on unlock.
>
> I meant the case where CPU 1 claims the lock, executing this from
> enqueue_mutex:
>
>   m->m_lock.m_rb_lnk = 0; /* A */
>   *rl = (uintptr_t)&m->m_lock; /* B */
>
> and then the thread dies and CPU 2 cleans up the dead thread executing this
> from umtx_handle_rb:
>
>   error = copyin((void *)rbp, &m, sizeof(m)); /* C */
>   *rb_list = m.m_rb_lnk; /* D */
>
> where rbp is the value stored into *rl at B by CPU 1.
>
> What ensures that CPU 1's store at A is seen by CPU 2 when it reads the value
> at D? I'm hoping this is covered by something that I don't understand, but I
> was worried by the comment "consistent even in the case of thread termination
> at arbitrary moment" above enqueue_mutex.
>

The cleanup is performed by the same thread which did the lock, in kernel mode. A thread can only die by explicit kernel action, and this action is executed by the dying thread.

In fact, there is one (or two) cases when the cleanup is performed by a thread other than the lock owner. Namely, when we execve(2) (or _exit(2) the whole process, but the mechanics are the same), we first put the process into the so-called single-threading state, where all threads other than the one executing the syscall are put to sleep at a safe moment. Then, the old address space is destroyed and the new image activated. Only after that are the other threads killed. It is done this way to allow the error return, where we need to keep threads around on a failed execve. And, right before the old address space is destroyed, robust mutexes for all threads are terminated, since we need access to the usermode VA to operate on locks.
But there, the single-threading mechanics contain the necessary barriers to ensure visibility in the right order. In essence, there are so many locks/unlocks with barrier semantics on single-threading that this is not a problem.

From owner-freebsd-threads@freebsd.org Fri May 6 14:43:51 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9F8FFB31092 for ; Fri, 6 May 2016 14:43:51 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id 8D5751ADF for ; Fri, 6 May 2016 14:43:51 +0000 (UTC) (envelope-from martin@lispworks.com) Received: by mailman.ysv.freebsd.org (Postfix) id 889B4B31090; Fri, 6 May 2016 14:43:51 +0000 (UTC) Delivered-To: threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 87FF0B3108E; Fri, 6 May 2016 14:43:51 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [46.17.166.21]) by mx1.freebsd.org (Postfix) with ESMTP id 335211ADD; Fri, 6 May 2016 14:43:50 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (higson.cam.lispworks.com [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.9/8.14.9) with ESMTP id u46EhkkQ010464; Fri, 6 May 2016 15:43:46 +0100 (BST) (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (localhost.localdomain [127.0.0.1]) by higson.cam.lispworks.com (8.14.4) id u46EhkFT009474; Fri, 6 May 2016 15:43:46 +0100 Received: (from martin@localhost) by higson.cam.lispworks.com (8.14.4/8.14.4/Submit) id u46EhkfY009470; Fri, 6 May 2016 15:43:46 +0100 Date: Fri, 6 May 2016 15:43:46 +0100 Message-Id: <201605061443.u46EhkfY009470@higson.cam.lispworks.com> From: Martin Simmons
To: Konstantin Belousov CC: threads@freebsd.org, arch@freebsd.org In-reply-to: <20160506132044.GA89104@kib.kiev.ua> (message from Konstantin Belousov on Fri, 6 May 2016 16:20:44 +0300) Subject: Re: Robust mutexes implementation References: <20160505131029.GE2422@kib.kiev.ua> <201605051720.u45HKZ76021094@higson.cam.lispworks.com> <20160505185810.GF2422@kib.kiev.ua> <201605061300.u46D0mhN008053@higson.cam.lispworks.com> <20160506132044.GA89104@kib.kiev.ua> X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 May 2016 14:43:51 -0000 >>>>> On Fri, 6 May 2016 16:20:44 +0300, Konstantin Belousov said: > > The cleanup is performed by the same thread which did the lock, in the > kernel mode. Thread can only die by explicit kernel action, and this > action is executed by the dying thread. Thanks, that's the part I didn't know about. __Martin From owner-freebsd-threads@freebsd.org Fri May 6 23:30:14 2016 Return-Path: Delivered-To: freebsd-threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E68C9B31AF5 for ; Fri, 6 May 2016 23:30:14 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id D5729108D for ; Fri, 6 May 2016 23:30:14 +0000 (UTC) (envelope-from jilles@stack.nl) Received: by mailman.ysv.freebsd.org (Postfix) id CE0D1B31AF3; Fri, 6 May 2016 23:30:14 +0000 (UTC) Delivered-To: threads@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CAE71B31AF1; Fri, 6 May 2016 23:30:14 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay04.stack.nl 
[IPv6:2001:610:1108:5010::107]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailhost.stack.nl", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6D574108B; Fri, 6 May 2016 23:30:14 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from toad2.stack.nl (toad2.stack.nl [IPv6:2001:610:1108:5010::161]) by mx1.stack.nl (Postfix) with ESMTP id 1734BB80ED; Sat, 7 May 2016 01:30:11 +0200 (CEST) Received: by toad2.stack.nl (Postfix, from userid 1677) id D8576892AE; Sat, 7 May 2016 01:30:11 +0200 (CEST) Date: Sat, 7 May 2016 01:30:11 +0200 From: Jilles Tjoelker To: Konstantin Belousov Cc: threads@freebsd.org, arch@freebsd.org Subject: Re: Robust mutexes implementation Message-ID: <20160506233011.GA99994@stack.nl> References: <20160505131029.GE2422@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160505131029.GE2422@kib.kiev.ua> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 May 2016 23:30:15 -0000 On Thu, May 05, 2016 at 04:10:29PM +0300, Konstantin Belousov wrote: > I implemented robust mutexes for our libthr. A robust mutex is > guaranteed to be cleared by the system upon either thread or process > owner termination while the mutex is held. The next mutex locker is > then notified about inconsistent mutex state and can execute (or > abandon) corrective actions. > The patch mostly consists of small changes here and there, adding > neccessary checks for the inconsistent and abandoned conditions into > existing paths. Additionally, the thread exit handler was extended > to iterate over the userspace-maintained list of owned robust mutexes, > unlocking and marking as terminated each of them. 
>
> The list of owned robust mutexes cannot be maintained atomically
> synchronous with the mutex lock state (it is possible in kernel, but
> is too expensive). Instead, for the duration of lock or unlock
> operation, the current mutex is remembered in a special slot that is
> also checked by the kernel at thread termination.
>
> Kernel must be aware about the per-thread location of the heads of
> robust mutex lists and the current active mutex slot. Initially I tried
> to extend TCBs with this data, so only a single syscall at the
> threading library initialization would be needed: for any thread the
> location of TCB is known by kernel, and the syscall would pass
> offsets. Unfortunately, on some architectures the size of TCB is part
> of the fixed ABI and cannot be changed. Instead, when a thread touches
> a robust mutex for the first time, a new umtx op syscall is issued which
> informs about location of lists heads.
>
> The umtx sleep queues for PP and PI mutexes are split between
> non-robust and robust. I do not understand the reasoning behind this
> POSIX requirement.
>
> Patch passes all glibc tests for robust mutexes I found in the nptl/
> directory. See https://github.com/kostikbel/glibc-robust-tests .
>
> Patch is available at https://kib.kiev.ua/kib/pshared/robust.1.patch
> (beware of self-signed root certificate in the chain). Work was
> sponsored by The FreeBSD Foundation.
>
> Unrelated things in the patch:
> 1. Style. Since I had to re-read whole sys/kern/kern_umtx.c,
>    lib/libthr/thread/thr_umtx.h and lib/libthr/thread/thr_umtx.c, I
>    started fixing the numerous style violations in these files, which
>    actually made my eyes bleed.
> 2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
>    pi mutexes.
> 3. Removal of the struct pthread_mutex m_owner field. I cannot see
>    why it is useful. The only optimization it provides is the
>    possibility to avoid clearing UMUTEX_CONTESTED bit when reading
>    m_lock.m_owner. The disadvantage of having this duplicated field
>    is that kernel does not know about pthread_mutex, so cannot fix the
>    dup value. Overall it is less work to clear UMUTEX_CONTESTED when
>    checking owner, than to try and handle inconsistencies.
>    I added the PMUTEX_OWNER_ID() macro to simplify code.
> 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
>    the lifetime of the shared mutex associated with a vnode's page.
>    Apparently, there is real code around which expects the following
>    to work:
>    - mmap a file, create a shared mutex in the mapping;
>    - the process exits;
>    - another process starts, mmaps the same file and expects that the
>      previously initialized mutex is still usable.
>
>    The knob changes the lifetime of such shared off-page from the
>    'destroy on last unmap' to either 'until vnode is reclaimed' or
>    until 'pthread_mutex_destroy' called, whatever comes first.

The 'until vnode is reclaimed' bit sounds like a recipe for
hard-to-reproduce bugs.

I do think it is related to the robust mutex patch, though.

Without robust mutexes and assuming threads do not unmap the memory
while having a mutex locked or while waiting on a condition variable, it
is sufficient to create the off-page mutex/condvar automatically in its
initial state when pthread_mutex_lock() or pthread_cond_*wait() are
called and no off-page object exists.

With robust mutexes, we need to store somewhere whether the next thread
should receive [EOWNERDEAD] or not, and this should persist even if no
processes have the memory mapped. This might be done by replacing
THR_PSHARED_PTR with a different value in the pthread_mutex_t. I'm not
sure I like that memory write being done from the kernel though.
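
For readers unfamiliar with the usage pattern in item 4, here is a minimal
sketch of how an application might place a process-shared mutex in a
file-backed mapping. The function name map_shared_mutex() and the file
layout (mutex at offset 0) are assumptions made for illustration only;
they do not come from the patch, and only standard POSIX calls are used.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the file at `path` and return a pointer to a
 * pthread_mutex_t stored at offset 0 of the mapping. When `init` is
 * nonzero, the mutex is initialized as process-shared. Returns NULL on
 * any failure.
 */
pthread_mutex_t *
map_shared_mutex(const char *path, int init)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t *m;
	int error, fd;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		return (NULL);
	/* Make sure the file is large enough to hold the mutex. */
	if (ftruncate(fd, sizeof(*m)) == -1) {
		close(fd);
		return (NULL);
	}
	m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close */
	if (m == MAP_FAILED)
		return (NULL);
	if (init) {
		error = pthread_mutexattr_init(&attr);
		if (error == 0) {
			error = pthread_mutexattr_setpshared(&attr,
			    PTHREAD_PROCESS_SHARED);
			if (error == 0)
				error = pthread_mutex_init(m, &attr);
			pthread_mutexattr_destroy(&attr);
		}
		if (error != 0) {
			munmap(m, sizeof(*m));
			return (NULL);
		}
	}
	return (m);
}
```

A first process would call map_shared_mutex(path, 1) and a later one
map_shared_mutex(path, 0). Whether the second process finds a usable
mutex after the first has exited is exactly the lifetime question that
the thread is discussing.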

The below is a review of https://kib.kiev.ua/kib/pshared/robust.2.patch

> diff --git a/lib/libc/gen/Symbol.map b/lib/libc/gen/Symbol.map

It is not necessary to export _pthread_mutex_consistent,
_pthread_mutexattr_getrobust and _pthread_mutexattr_setrobust (under
FBSDprivate_1.0 symbol version). They are not used by name outside the
DSO, only via the jump table.

The same thing is true of many other FBSDprivate_1.0 symbols but there
is a difference between adding new unnecessary exports and removing
existing unnecessary exports.

> +	if ((error2 == 0 || error2 == EOWNERDEAD) && cancel)
>  		_thr_testcancel(curthread);

I don't think [EOWNERDEAD] should be swept under the carpet like this.
The cancellation cleanup handler will use the protected state and unlock
the mutex without making the state consistent and the state will be
unrecoverable.

> +void
> +_mutex_leave_robust(struct pthread *curthread, struct pthread_mutex *m)
> +{
> +
> +	if (!is_robust_mutex(m))
> +		return;

This accesses the mutex after writing a value to the lock
word which allows other threads to lock it. A use after free may result,
since it is valid for another thread to lock, unlock and destroy the
mutex (assuming the mutex is not used otherwise later).

Memory ordering may permit the load of m->m_lock.m_flags to be moved
before the actual unlock, so this issue may not actually appear.

Given that the overhead of a system call on every robust mutex unlock is
not desired, the kernel's unlock of a terminated thread's mutexes will
unavoidably have this use after free. However, for non-robust mutexes
the previous guarantees should be kept.

> +	int defered, error;

Typo, should be "deferred".

> +.It Bq Er EOWNERDEAD
> +The argument
> +.Fa mutex
> +points to the robust mutex and the previous owning thread terminated
> +while holding the mutex lock.
> +The lock was granted to the caller and it is up to the new owner
> +to make the state consistent.

"points to a robust mutex".

Perhaps add a See .Xr pthread_mutex_consistent 3 here.

> diff --git a/share/man/man3/pthread_mutex_consistent.3 b/share/man/man3/pthread_mutex_consistent.3

This man page should mention that pthread_mutex_consistent() can be
called when a mutex lock or condition variable wait failed with
[EOWNERDEAD].

> @@ -37,7 +37,11 @@ struct umutex {
> +	__uintptr_t m_rb_lnk; /* Robust linkage */

Although Linux also stores the robust list nodes in the mutexes like
this, I think it increases the chance of strange memory corruption.
Making the robust list an array in the thread's memory would be more
reliable. If the maximum number of owned robust mutexes can be small,
this can have a fixed size; otherwise, it needs to grow as needed (which
does add an allocation that may fail to the pthread_mutex_lock path,
bad).

> + * The umutex.m_lock values and bits. The m_owner is the word which
> + * serves as the lock. It's high bit is the contention indicator,
> + * rest of bits records the owner TID. TIDs values start with PID_MAX
> + * + 2 and ends by INT32_MAX, the low range [1..PID_MAX] is guaranteed
> + * to be useable as the special markers.

Typo, "It's" should be "Its" and "ends" should be "end".

Bruce Evans would probably complain about comma splice (two times).

> +#define UMTX_OP_ROBUST 26

The name is rather vague. Perhaps this can be something like
UMTX_OP_INIT_ROBUST.
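
The owner-word layout described in the quoted umutex comment (high bit as
contention indicator, remaining bits as owner TID) can be sketched as
follows. This is only an illustration: the SKETCH_CONTESTED constant, its
value for a 32-bit owner word, and both function names are assumptions
and are not taken from the patch, whose PMUTEX_OWNER_ID() macro may be
written differently.

```c
#include <stdint.h>

/* Assumed: the top bit of a 32-bit owner word marks contention. */
#define	SKETCH_CONTESTED	0x80000000U

/* Return the owner TID with the contention bit masked off. */
uint32_t
sketch_owner_id(uint32_t owner_word)
{

	return (owner_word & ~SKETCH_CONTESTED);
}

/* Return nonzero when the contention bit is set in the owner word. */
int
sketch_is_contested(uint32_t owner_word)
{

	return ((owner_word & SKETCH_CONTESTED) != 0);
}
```

Masking on every read is the trade-off mentioned in item 3 of the
original announcement: a little extra work when checking the owner, in
exchange for not keeping a second copy of the owner that the kernel
cannot repair.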

--
Jilles Tjoelker

From owner-freebsd-threads@freebsd.org Sat May 7 17:00:03 2016
Date: Sat, 7 May 2016 19:59:56 +0300
From: Konstantin Belousov
To: Jilles Tjoelker
Cc: threads@freebsd.org, arch@freebsd.org
Subject: Re: Robust mutexes implementation
Message-ID: <20160507165956.GC89104@kib.kiev.ua>
References: <20160505131029.GE2422@kib.kiev.ua> <20160506233011.GA99994@stack.nl>
In-Reply-To: <20160506233011.GA99994@stack.nl>

On Sat, May 07, 2016 at 01:30:11AM +0200, Jilles Tjoelker wrote:
> > 4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
> >    the lifetime of the shared mutex associated with a vnode's page.
> >    Apparently, there is real code around which expects the following
> >    to work:
> >    - mmap a file, create a shared mutex in the mapping;
> >    - the process exits;
> >    - another process starts, mmaps the same file and expects that the
> >      previously initialized mutex is still usable.
>
> >    The knob changes the lifetime of such shared off-page from the
> >    'destroy on last unmap' to either 'until vnode is reclaimed' or
> >    until 'pthread_mutex_destroy' called, whatever comes first.
>
> The 'until vnode is reclaimed' bit sounds like a recipe for hard to
> reproduce bugs.
>
> I do think it is related to the robust mutex patch, though.
>
> Without robust mutexes and assuming threads do not unmap the memory
> while having a mutex locked or while waiting on a condition variable, it
> is sufficient to create the off-page mutex/condvar automatically in its
> initial state when pthread_mutex_lock() or pthread_cond_*wait() are
> called and no off-page object exists.
>
> With robust mutexes, we need to store somewhere whether the next thread
> should receive [EOWNERDEAD] or not, and this should persist even if no
> processes have the memory mapped. This might be done by replacing
> THR_PSHARED_PTR with a different value in the pthread_mutex_t. I'm not
> sure I like that memory write being done from the kernel though.

In principle, I agree with this. Note that if we go with something
like THR_OWNERDEAD_PTR, the kernel write to set the value would not be
much different from the kernel write to unlock a robust mutex with
inlined lock structures.

Still, I would prefer not to implement this now. For the local
purposes, the knob was enough, and the default value will be 'disabled'.

>
> The below is a review of https://kib.kiev.ua/kib/pshared/robust.2.patch
>
> > diff --git a/lib/libc/gen/Symbol.map b/lib/libc/gen/Symbol.map
>
> It is not necessary to export _pthread_mutex_consistent,
> _pthread_mutexattr_getrobust and _pthread_mutexattr_setrobust (under
> FBSDprivate_1.0 symbol version). They are not used by name outside the
> DSO, only via the jump table.

Removed from the export.

>
> The same thing is true of many other FBSDprivate_1.0 symbols but there
> is a difference between adding new unnecessary exports and removing
> existing unnecessary exports.
>
> > +	if ((error2 == 0 || error2 == EOWNERDEAD) && cancel)
> >  		_thr_testcancel(curthread);
>
> I don't think [EOWNERDEAD] should be swept under the carpet like this.
> The cancellation cleanup handler will use the protected state and unlock
> the mutex without making the state consistent and the state will be
> unrecoverable.

So your argument there is to return EOWNERDEAD and not cancel,
am I right? I reused part of your text as the comment.
>
> > +void
> > +_mutex_leave_robust(struct pthread *curthread, struct pthread_mutex *m)
> > +{
> > +
> > +	if (!is_robust_mutex(m))
> > +		return;
>
> This accesses the mutex after writing a value to the lock
> word which allows other threads to lock it. A use after free may result,
> since it is valid for another thread to lock, unlock and destroy the
> mutex (assuming the mutex is not used otherwise later).
>
> Memory ordering may permit the load of m->m_lock.m_flags to be moved
> before the actual unlock, so this issue may not actually appear.
>
> Given that the overhead of a system call on every robust mutex unlock is
> not desired, the kernel's unlock of a terminated thread's mutexes will
> unavoidably have this use after free. However, for non-robust mutexes
> the previous guarantees should be kept.

I agree that this is a bug, and agree that the kernel accesses to the
curthread->inact_mtx are potentially unsafe. I also did not want to
issue a syscall for unlock of a robust mutex, as you noted.

I fixed the bug with the is_robust_mutex() test in
_mutex_leave_robust() by caching the robust status.

I was indeed worried by the kernel access issue you mentioned, but the
kernel is immune to 'bad' memory accesses. What bothered me is the ill
ABA situation, where the lock memory is freed and repurposed, and then
the lock word is written with one of the two specific values which give
the same state as for a locked mutex. This would cause the kernel to
'unlock' it (but not to follow the invalid m_rb_lnk).

But for this to happen, we must have a situation where a thread is
being terminated before mutex_unlock_common() reached the
_mutex_leave_robust() call. This is async thread termination, which
then must be either process termination (including execve()), or a call
to thr_exit() from a signal handler in our thread (including async
cancellation).
I am sure that the thr_exit() situation is non-conforming, so the only
concern is the process exit, and then only for a shared robust mutex,
because for a private mutex only the exiting process state is affected.

I can verify in umtx_handle_rb(), for instance, that for a
USYNC_PROCESS_SHARED object, the underlying memory is backed by the
umtx shm page. This would close the race. But this would interfere
with libthr2, if that ever happens.

>
> > +	int defered, error;
>
> Typo, should be "deferred".

I also changed PMUTEX_FLAG_DEFERED.

>
> > +.It Bq Er EOWNERDEAD
> > +The argument
> > +.Fa mutex
> > +points to the robust mutex and the previous owning thread terminated
> > +while holding the mutex lock.
> > +The lock was granted to the caller and it is up to the new owner
> > +to make the state consistent.
>
> "points to a robust mutex".
>
> Perhaps add a See .Xr pthread_mutex_consistent 3 here.

Both done.

>
> > diff --git a/share/man/man3/pthread_mutex_consistent.3 b/share/man/man3/pthread_mutex_consistent.3
>
> This man page should mention that pthread_mutex_consistent() can be
> called when a mutex lock or condition variable wait failed with
> [EOWNERDEAD].

I took introductory text from the POSIX page for the function.

>
> > @@ -37,7 +37,11 @@ struct umutex {
> > +	__uintptr_t m_rb_lnk; /* Robust linkage */
>
> Although Linux also stores the robust list nodes in the mutexes like
> this, I think it increases the chance of strange memory corruption.
> Making the robust list an array in the thread's memory would be more
> reliable. If the maximum number of owned robust mutexes can be small,
> this can have a fixed size; otherwise, it needs to grow as needed (which
> does add an allocation that may fail to the pthread_mutex_lock path,
> bad).

I gave this proposal some thought.

I very much dislike the idea of calling the memory allocator on the
lock, and esp. the trylock, path. The latter could need to obtain
allocator locks which (sometimes partially) defeat the trylock purpose.
I can use mmap(2) directly there, similarly to how pthread_setspecific()
was changed recently, which would avoid the issue of calling the
userspace allocator. Still, the problems of an additional syscall, the
resulting ENOMEM, and also the time to copy the current robust owned
list to the grown location are there (I do not see that using chunked
allocations is reasonable, since it would be the same list as the
current m_rb_lnk, but at a different level).

I prefer to keep the robust linked list for these reasons.

In fact, the deciding argument would be actual application usage of
the robustness. I thought, when writing the patch, about when and how
I would use the feature, and did not see compelling arguments to even
try to use it. My stumbling block is the user data consistency
recovery: for instance, I tried to write a loop which would increment
a shared variable N times, and I was not able to end up with any simple
recovery mechanism from the aborted iteration, except writing the
iteration in assembly and having a parallel tick variable which
enumerates each iteration action.

>
> > + * The umutex.m_lock values and bits. The m_owner is the word which
> > + * serves as the lock. It's high bit is the contention indicator,
> > + * rest of bits records the owner TID. TIDs values start with PID_MAX
> > + * + 2 and ends by INT32_MAX, the low range [1..PID_MAX] is guaranteed
> > + * to be useable as the special markers.
>
> Typo, "It's" should be "Its" and "ends" should be "end".
>
> Bruce Evans would probably complain about comma splice (two times).

I tried to reword this.

>
> > +#define UMTX_OP_ROBUST 26
>
> The name is rather vague. Perhaps this can be something like
> UMTX_OP_INIT_ROBUST.

I renamed this to UMTX_OP_ROBUST_LISTS, together with the parameters
structure.

Updated patch is at https://kib.kiev.ua/kib/pshared/robust.3.patch
I did not add the check for umtx shm into umtx_handle_rb() yet,
waiting for your opinion.
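
The data-recovery difficulty described above (an increment interrupted by
owner death) can be illustrated with one possible approach: a small redo
record written before the protected value is touched. This is only a
sketch under stated assumptions and is not something proposed in the
thread; struct counter, counter_increment() and counter_recover() are
hypothetical names, and C11 atomic stores stand in for the ordering that
the message suggests would otherwise need hand-written assembly.

```c
#include <stdatomic.h>

/* Hypothetical shared state, protected by a robust mutex. */
struct counter {
	_Atomic long	value;		/* the protected data */
	_Atomic long	pending;	/* value an in-flight update will store */
	_Atomic int	dirty;		/* nonzero while an update is in flight */
};

/*
 * Increment with a redo record. Call with the mutex held. The stores
 * are sequentially consistent, so a later owner never sees dirty set
 * without the matching pending value.
 */
void
counter_increment(struct counter *c)
{

	atomic_store(&c->pending, atomic_load(&c->value) + 1);
	atomic_store(&c->dirty, 1);
	atomic_store(&c->value, atomic_load(&c->pending));
	atomic_store(&c->dirty, 0);
}

/*
 * Recovery step for a new owner that got EOWNERDEAD. Redoing the store
 * is idempotent: it finishes an interrupted update or repeats a
 * completed one with the same result.
 */
void
counter_recover(struct counter *c)
{

	if (atomic_load(&c->dirty)) {
		atomic_store(&c->value, atomic_load(&c->pending));
		atomic_store(&c->dirty, 0);
	}
}
```

The sketch only restores the counter to a state in which each increment
either fully happened or did not happen. It does not tell the survivor
how many of the N iterations the dead thread had intended to finish,
which is the harder part of the problem described in the message.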