From: "Peter Wemm"
To: "John Baldwin"
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, David Xu
Date: Tue, 4 Nov 2008 14:51:57 -0800
Subject: Re: svn commit: r184216 - head/sys/kern
In-Reply-To: <200811041707.26052.jhb@freebsd.org>
References: <200810240103.m9O13V7f071075@svn.freebsd.org> <200811041707.26052.jhb@freebsd.org>

On Tue, Nov 4, 2008 at 2:07 PM, John Baldwin wrote:
> On Thursday 23 October 2008 09:03:31 pm David Xu wrote:
>> Author: davidxu
>> Date: Fri Oct 24 01:03:31 2008
>> New Revision: 184216
>> URL: http://svn.freebsd.org/changeset/base/184216
>>
>> Log:
>>   partly revert revision 184199, because TDF_NEEDSIGCHK is persitent
>>   when thread is in kernel mode, it can cause dead loop, now unlock
>>   process lock after acquired sleep queue lock and thread lock to
>>   avoid the problem. This means TDF_NEEDSIGCHK and TDF_NEEDSUSPCHK must
>>   be set with process lock and thread lock being hold at same time.
>
> You can't unlock the proc lock while holding the thread_lock(). This will
> lead to deadlock due to the way that thread_lock() works. This is different
> from the rules in 6.x where you could drop a mutex while holding sched_lock.
> You will need to revert this.
>
> --
> John Baldwin

I had to back out rev 184216 and 184199 in total in order to stop my
machine from dying.

Compile this dumb program:
http://people.freebsd.org/~peter/pth.c
$ cc -pthread -o pth pth.c

run in a shell while loop so that the entire thing is execed and exits
repeatedly.
$ while true; do date; ./pth; done

On my 2-core athlon64 box at home, and the 8-core ref8-i386 in the
freebsd.org cluster, this causes a lockup in mere seconds. Backing
out these two changes solves it.

my machine:
spin lock 0xffffff00a4037000 (turnstile lock) held by
0xffffff01045746e0 (tid 100355) too long
panic: spin lock held too long

ref8-i386:
spin lock 0xc06436c0 (sched lock 5) held by 0xd374f690 (tid 100249) too long
panic: spin lock held too long

--
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell