From owner-freebsd-current@FreeBSD.ORG Fri Apr 17 01:30:24 2009
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7FA49106566B
	for ; Fri, 17 Apr 2009 01:30:24 +0000 (UTC)
	(envelope-from ben@wanderview.com)
Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102])
	by mx1.freebsd.org (Postfix) with ESMTP id 1B4F28FC15
	for ; Fri, 17 Apr 2009 01:30:23 +0000 (UTC)
	(envelope-from ben@wanderview.com)
Received: from harkness.in.wanderview.com (harkness.in.wanderview.com [10.76.10.150])
	(authenticated bits=0)
	by mail.wanderview.com (8.14.3/8.14.3) with ESMTP id n3H1ULo9033189
	(version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO);
	Fri, 17 Apr 2009 01:30:21 GMT
	(envelope-from ben@wanderview.com)
Message-Id: <7DEF7288-1304-4A64-8A23-BB03349653EA@wanderview.com>
From: Ben Kelly
To: Artem Belevich
In-Reply-To:
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v930.3)
Date: Thu, 16 Apr 2009 21:30:21 -0400
References: <49C2CFF6.8070608@egr.msu.edu> <08D7DC2A-68BE-47B6-8D5D-5DE6B48F87E5@wanderview.com>
X-Mailer: Apple Mail (2.930.3)
X-Spam-Score: -1.44 () ALL_TRUSTED
X-Scanned-By: MIMEDefang 2.64 on 10.76.20.1
Cc: freebsd-current@freebsd.org
Subject: Re: [patch] zfs livelock and thread priorities
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 17 Apr 2009 01:30:24 -0000

On Apr 15, 2009, at 12:35 AM, Artem Belevich wrote:
> I'll give it a try in a few days. I'll let you know how it went.

Just FYI, I was able to reproduce some of the failures with the
original patch using an SMP vmware image.
The new patch seems to fix these problems and I was able to
successfully mount a zfs pool.

> BTW, now that you're tinkering with ZFS threads and priorities, would
> you by any chance have any idea why zfs scrub is so painfully slow on
> -current?
> When I start scrub on my -stable box, it pretty much runs full speed
> -- I can see disks under load all the time.
> However on -current scrub seems to run in small bursts. Disks get busy
> for a second or so and then things get quiet for about five seconds or
> so and this pattern repeats over and over.

I don't know. I haven't had to scrub my devices very often. I ran a
couple here locally and did not see the behavior you describe. There
is a significant delay between typing zpool scrub and when it actually
begins disk I/O, but after that it completes without pause. If I get a
chance I'll try to look at what the scrub code is doing.

Thanks again.

- Ben

> --Artem
>
>
>
> On Tue, Apr 14, 2009 at 7:32 PM, Ben Kelly wrote:
>> On Apr 14, 2009, at 11:50 AM, Ben Kelly wrote:
>>>
>>> On Apr 13, 2009, at 7:36 PM, Artem Belevich wrote:
>>>>
>>>> Tried your patch that used PRIBIO+{1,2} for priorities with -current
>>>> r191008 and the kernel died with "spinlock held too long" panic.
>>>> Actually, there apparently were two instances of panic on different
>>>> cores.
>>>>
>>>> Here's output of "alltrace" and "ps" after the crash:
>>>> http://pastebin.com/f140f4596
>>>>
>>>> I've reverted the change and kernel booted just fine.
>>>>
>>>> The box is quad-core with two ZFS pools -- one single-disk and another
>>>> one is a two-disk mirror. FreeBSD is installed on UFS partitions, ZFS
>>>> is used for user stuff only.
>>>
>>> Thanks for the report!
>>>
>>> I don't have a lot of time to look at this today, but it appears that
>>> there is a race condition on SMP machines when setting the priority
>>> immediately after the kproc is spawned.
>>> As a quick hack I tried adding a
>>> pause between the kproc_create() and the sched_prio(). Can you try this
>>> patch?
>>>
>>> http://www.wanderview.com/svn/public/misc/zfs_livelock/zfs_thread_priority.diff
>>>
>>> I'll try to take a closer look at this later in the week.
>>
>> Sorry for replying to my own e-mail, but I've updated the patch again with a
>> less hackish approach. (At the same URL above.) I added a new
>> kproc_create_priority() function to set the priority of the new thread
>> before it's first scheduled. This should avoid any SMP races with setting
>> the priority from an external thread.
>>
>> If you would be willing to try the test again with this new patch I would
>> appreciate it.
>>
>> Thanks!
>>
>> - Ben
>>