Date: Thu, 16 Apr 2009 21:30:21 -0400
From: Ben Kelly <ben@wanderview.com>
To: Artem Belevich <fbsdlist@src.cx>
Cc: freebsd-current@freebsd.org
Subject: Re: [patch] zfs livelock and thread priorities
Message-ID: <7DEF7288-1304-4A64-8A23-BB03349653EA@wanderview.com>
In-Reply-To: <ed91d4a80904142135n429dea52o672abf51116fa707@mail.gmail.com>
References: <DC9F2088-A0AF-467D-8574-F24A045ABD81@wanderview.com> <49C2CFF6.8070608@egr.msu.edu> <BDABA909-C2AE-4A55-869B-CA01BE778A82@wanderview.com> <ed91d4a80904131636u18c90474w7cdaa57bc7000e02@mail.gmail.com> <08D7DC2A-68BE-47B6-8D5D-5DE6B48F87E5@wanderview.com> <AC3C4C3F-40C6-4AF9-BAF3-2C4D1E444839@wanderview.com> <ed91d4a80904142135n429dea52o672abf51116fa707@mail.gmail.com>
On Apr 15, 2009, at 12:35 AM, Artem Belevich wrote:
> I'll give it a try in a few days. I'll let you know how it went.
Just FYI, I was able to reproduce some of the failures with the
original patch using an SMP vmware image. The new patch seems to fix
these problems and I was able to successfully mount a zfs pool.
> BTW, now that you're tinkering with ZFS threads and priorities, would
> you by any chance have any idea why zfs scrub is so painfully slow on
> -current?
> When I start scrub on my -stable box, it pretty much runs full speed
> -- I can see disks under load all the time.
> However on -current scrub seems to run in small bursts. Disks get busy
> for a second or so and then things get quiet for about five seconds or
> so and this pattern repeats over and over.
I don't know. I haven't had to scrub my devices very often. I ran a
couple here locally and did not see the behavior you describe. There
is a significant delay between typing zpool scrub and when it actually
begins disk I/O, but after that it completes without pause. If I get
a chance I'll try to look at what the scrub code is doing.
Thanks again.
- Ben
> --Artem
>
>
>
> On Tue, Apr 14, 2009 at 7:32 PM, Ben Kelly <ben@wanderview.com> wrote:
>> On Apr 14, 2009, at 11:50 AM, Ben Kelly wrote:
>>>
>>> On Apr 13, 2009, at 7:36 PM, Artem Belevich wrote:
>>>>
>>>> Tried your patch that used PRIBIO+{1,2} for priorities with
>>>> -current r191008 and the kernel died with a "spinlock held too
>>>> long" panic. Actually, there apparently were two instances of the
>>>> panic on different cores.
>>>>
>>>> Here's output of "alltrace" and "ps" after the crash:
>>>> http://pastebin.com/f140f4596
>>>>
>>>> I've reverted the change and kernel booted just fine.
>>>>
>>>> The box is quad-core with two ZFS pools -- one single-disk and the
>>>> other a two-disk mirror. FreeBSD is installed on UFS partitions;
>>>> ZFS is used for user stuff only.
>>>
>>> Thanks for the report!
>>>
>>> I don't have a lot of time to look at this today, but it appears
>>> that there is a race condition on SMP machines when setting the
>>> priority immediately after the kproc is spawned. As a quick hack I
>>> tried adding a pause between the kproc_create() and the
>>> sched_prio(). Can you try this patch?
>>>
>>>
>>> http://www.wanderview.com/svn/public/misc/zfs_livelock/zfs_thread_priority.diff
>>>
>>> I'll try to take a closer look at this later in the week.
>>
>> Sorry for replying to my own e-mail, but I've updated the patch again
>> with a less hackish approach. (At the same URL above.) I added a new
>> kproc_create_priority() function to set the priority of the new
>> thread before it is first scheduled. This should avoid any SMP races
>> with setting the priority from an external thread.
>>
>> If you would be willing to try the test again with this new patch, I
>> would appreciate it.
>>
>> Thanks!
>>
>> - Ben
>>
