From owner-freebsd-threads@freebsd.org Thu Nov 26 12:36:33 2015
Return-Path:
Delivered-To: freebsd-threads@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 82456A39FF0
 for ; Thu, 26 Nov 2015 12:36:33 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6EA4713B1
 for ; Thu, 26 Nov 2015 12:36:33 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tAQCaXPC024298
 for ; Thu, 26 Nov 2015 12:36:33 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 204178] stress2 on arm64 thr1 hangs after printing pthread_join error
Date: Thu, 26 Nov 2015 12:36:33 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: threads
X-Bugzilla-Version: 11.0-CURRENT
X-Bugzilla-Keywords:
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: Andrew@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: Andrew@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 26 Nov 2015 12:36:33 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204178

--- Comment #1 from Andrew Turner ---
I'm not sure if the message is related to the hang. I've seen each occur
independently of the other.

It seems the process is stuck in the kernel waiting on a mutex:

# procstat -t 19405
  PID    TID COMM TDNAME CPU PRI STATE WCHAN
19405 100607 thr1 -       -1 120 sleep umtxn
19405 101334 thr1 -       -1 152 sleep umtxn

# procstat -k 19405
  PID    TID COMM TDNAME KSTACK
19405 100607 thr1 -      mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex do_el0_sync
19405 101334 thr1 -      mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_lock_umutex __umtx_op_wait_umutex do_el0_sync

--
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-threads@freebsd.org Sat Nov 28 07:24:51 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 204178] stress2 on arm64 thr1 hangs after printing pthread_join error
Date: Sat, 28 Nov 2015 07:24:51 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204178

Peter Holm changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pho@FreeBSD.org

--- Comment #2 from Peter Holm ---
Could you try running this test, to narrow down the test scenario a bit?
I have tried this on amd64/i386 without finding any issues.

Place this in stress2/misc as thr1.sh and run it:

#!/bin/sh

. ../default.cfg

export runRUNTIME=1h
export thr1LOAD=100
export TESTPROGS="
testcases/swap/swap
testcases/thr1/thr1
"

(cd ..; ./testcases/run/run $TESTPROGS)

Thank you.
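
For readers without the stress2 sources at hand: comment #4 in this thread describes the thr1 scenario as creating many threads that return almost immediately. The following is a minimal sketch of that kind of workload and is not the actual thr1 source. The names churn_threads, noop_thread and MAX_BATCH are hypothetical, and the sketch does not generate the VM pressure that the swap test case contributes.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define	MAX_BATCH	64	/* arbitrary cap chosen for this sketch */

/* Thread body: return at once, so the thread's lifetime is all setup/teardown. */
static void *
noop_thread(void *arg)
{
	return (arg);
}

/*
 * Create and join `rounds` batches of `nthreads` short-lived threads.
 * Returns 0 on success, EINVAL for bad arguments, or the first error
 * reported by pthread_create()/pthread_join().
 */
int
churn_threads(int rounds, int nthreads)
{
	pthread_t tids[MAX_BATCH];
	int created, error, i, jerr, r;

	if (rounds < 0 || nthreads < 1 || nthreads > MAX_BATCH)
		return (EINVAL);

	for (r = 0; r < rounds; r++) {
		error = 0;
		for (created = 0; created < nthreads; created++) {
			error = pthread_create(&tids[created], NULL,
			    noop_thread, NULL);
			if (error != 0)
				break;
		}
		assert(created <= nthreads);

		/* Join whatever was created, even if creation failed part-way. */
		for (i = 0; i < created; i++) {
			jerr = pthread_join(tids[i], NULL);
			if (error == 0)
				error = jerr;
		}
		if (error != 0)
			return (error);
	}
	return (0);
}
```

Calling churn_threads(100000, 8) from a small main() would exercise the pthread_create/pthread_join path repeatedly. Whether that alone reproduces the hang is not something this thread establishes.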

From owner-freebsd-threads@freebsd.org Sat Nov 28 10:31:07 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 204178] stress2 on arm64 thr1 hangs after printing pthread_join error
Date: Sat, 28 Nov 2015 10:31:07 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204178

--- Comment #3 from Andrew Turner ---
With this script I can reproduce the issue. It can take a few hours to show up
so I increased the runtime to 24 hours.

From owner-freebsd-threads@freebsd.org Sat Nov 28 11:18:49 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 204178] stress2 on arm64 thr1 hangs after printing pthread_join error
Date: Sat, 28 Nov 2015 11:18:49 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204178

--- Comment #4 from Peter Holm ---
Great! So the scenario is creating many threads, each of which returns almost
immediately, while the system is under VM pressure.

From owner-freebsd-threads@freebsd.org Sat Nov 28 11:41:17 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 204178] stress2 on arm64 thr1 hangs after printing pthread_join error
Date: Sat, 28 Nov 2015 11:41:17 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204178

--- Comment #5 from Andrew Turner ---
I have some code that inspects the state when the issue shows up. Below is a
dump of the registers of the only thread in the process.

 x0 = 000000004048fd50   x1 = 0000000000000011   x2 = 0000000000000000   x3 = 0000000000000000
 x4 = 0000000000000000   x5 = 0000000000000001   x6 = 0000000000000001   x7 = 000000000000007f
 x8 = 00000000000001c6   x9 = 0000000080000000  x10 = 00000000000187dd  x11 = 00000000000187dd
x12 = 0000000000000001  x13 = 000000004048fcd8  x14 = 00000000000187dd  x15 = 0000000000000000
x16 = 0000000040485df8  x17 = 00000000404fe8dc  x18 = 0000000040801530  x19 = 000000004048fd50
x20 = 00000000000187dd  x21 = 0000000040490000  x22 = 000000004048fd50  x23 = 0000000000412000
x24 = 0000000000000000  x25 = 00000000004014f0  x26 = 0000000000000000  x27 = 0000000000000000
x28 = 0000000000000000  x29 = 0000007fffffee50   lr = 0000000040466eb0   sp = 0000007fffffee40
elr = 00000000404fe8e0 spsr = 90000000

I looked at the data passed to the kernel in x0 and found the owner of the lock
to be the current thread. I also looked at a stack trace and found we reached
the lock through the following calls:

_pthread_create -> _thr_alloc -> __thr_umutex_lock -> _umtx_op

As far as I can tell, this is the only place in _thr_alloc where the lock is
acquired, and it protects the call to _tcb_ctor.

From owner-freebsd-threads@freebsd.org Sat Nov 28 13:14:14 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-threads@FreeBSD.org
Subject: [Bug 204178] stress2 on arm64 thr1 hangs after printing pthread_join error
Date: Sat, 28 Nov 2015 13:14:14 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204178

Konstantin Belousov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kib@FreeBSD.org

--- Comment #6 from Konstantin Belousov ---
(In reply to Andrew Turner from comment #5)

Could you instrument the tcb_lock with atomic counters for acquires and
releases? Then we would see the generation counts for acq/rel on tcb_lock, and
in particular whether something was missed at unlock or, e.g., a thread was
terminated without unlocking (weird).
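
The instrumentation requested in comment #6 could look roughly like the sketch below. It is not a patch against libthr: a pthread mutex stands in for tcb_lock, and struct counted_lock and the counted_lock_* functions are hypothetical names. The idea is only to show one atomic counter bumped per acquire and one per release, so that a mismatch between the two can be read out of a hung process.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: a lock plus generation counters for acq/rel. */
struct counted_lock {
	pthread_mutex_t		mtx;		/* stand-in for tcb_lock */
	atomic_uint_fast64_t	acquires;	/* completed lock operations */
	atomic_uint_fast64_t	releases;	/* started unlock operations */
};

void
counted_lock_init(struct counted_lock *cl)
{
	assert(cl != NULL);
	pthread_mutex_init(&cl->mtx, NULL);
	atomic_init(&cl->acquires, 0);
	atomic_init(&cl->releases, 0);
}

void
counted_lock_acquire(struct counted_lock *cl)
{
	pthread_mutex_lock(&cl->mtx);
	/* Count only after the lock is actually held. */
	atomic_fetch_add(&cl->acquires, 1);
}

void
counted_lock_release(struct counted_lock *cl)
{
	/* Count while still holding the lock, then drop it. */
	atomic_fetch_add(&cl->releases, 1);
	pthread_mutex_unlock(&cl->mtx);
}

/*
 * Acquires not yet matched by a release: 0 when the lock is idle and 1
 * while it is held. A value that stays at 1 while no thread is inside
 * the critical section points at a missed unlock.
 */
int64_t
counted_lock_outstanding(struct counted_lock *cl)
{
	return ((int64_t)(atomic_load(&cl->acquires) -
	    atomic_load(&cl->releases)));
}
```

In a hung process the two counters can then be inspected from a debugger or a state dump. Equal counts with the lock still marked as owned would suggest the owner word was left stale, while acquires exceeding releases by one would suggest an unlock never ran.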