From: Ben Kelly
To: Artem Belevich
Cc: freebsd-current@freebsd.org
Subject: Re: [patch] zfs livelock and thread priorities
Date: Tue, 14 Apr 2009 22:32:25 -0400

On Apr 14, 2009, at 11:50 AM, Ben Kelly wrote:
> On Apr 13, 2009, at 7:36 PM, Artem Belevich wrote:
>> Tried your patch that used PRIBIO+{1,2} for priorities with -current
>> r191008 and the kernel died with "spinlock held too long" panic.
>> Actually, there apparently were two instances of panic on different
>> cores..
>>
>> Here's output of "alltrace" and "ps" after the crash:
>> http://pastebin.com/f140f4596
>>
>> I've reverted the change and kernel booted just fine.
>>
>> The box is quad-core with two ZFS pools -- one single-disk and another
>> one is a two-disk mirror. Freebsd is installed on UFS partitions, ZFS
>> is used for user stuff only.
>
> Thanks for the report!
>
> I don't have a lot of time to look at this today, but it appears
> that there is a race condition on SMP machines when setting the
> priority immediately after the kproc is spawned. As a quick hack I
> tried adding a pause between the kproc_create() and the
> sched_prio(). Can you try this patch?
>
> http://www.wanderview.com/svn/public/misc/zfs_livelock/zfs_thread_priority.diff
>
> I'll try to take a closer look at this later in the week.

Sorry for replying to my own e-mail, but I've updated the patch again with a less hackish approach (at the same URL above). I added a new kproc_create_priority() function to set the priority of the new thread before it is first scheduled. This should avoid any SMP races with setting the priority from an external thread.

If you would be willing to try the test again with this new patch I would appreciate it.

Thanks!

- Ben
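
For anyone who wants to see the idea outside the kernel, here is a minimal userland sketch of the same pattern using pthreads. It is only an analogy and is not the code in the patch: struct worker, worker_main(), worker_create_priority() and worker_join() are hypothetical names, and the "priority" is just a field in a struct rather than a real scheduler priority. The point it shows is that the new thread is held back until its creator has finished configuring it, so the thread never runs with a half-set priority and nobody has to adjust it from outside after it is already running.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for a new thread's control block. */
struct worker {
	pthread_t	tid;
	pthread_mutex_t	lock;
	pthread_cond_t	runnable_cv;
	bool		runnable;	/* false until the creator releases us */
	int		priority;	/* set by the creator before release */
	int		seen_priority;	/* value the worker saw on its first run */
};

static void *
worker_main(void *arg)
{
	struct worker *w = arg;

	/* Do no real work until the creator has finished configuring us. */
	pthread_mutex_lock(&w->lock);
	while (!w->runnable)
		pthread_cond_wait(&w->runnable_cv, &w->lock);
	w->seen_priority = w->priority;
	pthread_mutex_unlock(&w->lock);
	return (NULL);
}

/*
 * Create the worker held back, set its priority, and only then let it run.
 * Returns 0 on success or a pthread error number.
 */
static int
worker_create_priority(struct worker *w, int priority)
{
	int error;

	assert(w != NULL);
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->runnable_cv, NULL);
	w->runnable = false;
	w->priority = 0;
	w->seen_priority = -1;

	error = pthread_create(&w->tid, NULL, worker_main, w);
	if (error != 0) {
		pthread_cond_destroy(&w->runnable_cv);
		pthread_mutex_destroy(&w->lock);
		return (error);
	}

	/* The priority and the "go" flag are published under one lock. */
	pthread_mutex_lock(&w->lock);
	w->priority = priority;
	w->runnable = true;
	pthread_cond_signal(&w->runnable_cv);
	pthread_mutex_unlock(&w->lock);
	return (0);
}

/* Wait for the worker; returns the priority it observed, or -1 on error. */
static int
worker_join(struct worker *w)
{
	int error;

	error = pthread_join(w->tid, NULL);
	pthread_cond_destroy(&w->runnable_cv);
	pthread_mutex_destroy(&w->lock);
	return (error == 0 ? w->seen_priority : -1);
}
```

Calling worker_create_priority(&w, 42) followed by worker_join(&w) should hand back the value that was passed in, since the worker cannot get past its wait loop before the priority has been stored.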