Date: Thu, 16 Apr 2009 21:30:21 -0400
From: Ben Kelly <ben@wanderview.com>
To: Artem Belevich <fbsdlist@src.cx>
Cc: freebsd-current@freebsd.org
Subject: Re: [patch] zfs livelock and thread priorities
Message-ID: <7DEF7288-1304-4A64-8A23-BB03349653EA@wanderview.com>
In-Reply-To: <ed91d4a80904142135n429dea52o672abf51116fa707@mail.gmail.com>
References: <DC9F2088-A0AF-467D-8574-F24A045ABD81@wanderview.com> <49C2CFF6.8070608@egr.msu.edu> <BDABA909-C2AE-4A55-869B-CA01BE778A82@wanderview.com> <ed91d4a80904131636u18c90474w7cdaa57bc7000e02@mail.gmail.com> <08D7DC2A-68BE-47B6-8D5D-5DE6B48F87E5@wanderview.com> <AC3C4C3F-40C6-4AF9-BAF3-2C4D1E444839@wanderview.com> <ed91d4a80904142135n429dea52o672abf51116fa707@mail.gmail.com>
On Apr 15, 2009, at 12:35 AM, Artem Belevich wrote:
> I'll give it a try in a few days. I'll let you know how it went.
Just FYI, I was able to reproduce some of the failures with the
original patch using an SMP vmware image. The new patch seems to fix
these problems and I was able to successfully mount a zfs pool.
> BTW, now that you're tinkering with ZFS threads and priorities, would
> you by any chance have any idea why zfs scrub is so painfully slow on
> -current?
> When I start scrub on my -stable box, it pretty much runs full speed
> -- I can see disks under load all the time.
> However on -current scrub seems to run in small bursts. Disks get busy
> for a second or so and then things get quiet for about five seconds or
> so and this pattern repeats over and over.
I don't know. I haven't had to scrub my devices very often. I ran a
couple here locally and did not see the behavior you describe. There
is a significant delay between typing zpool scrub and when it actually
begins disk I/O, but after that it completes without pause. If I get
a chance I'll try to look at what the scrub code is doing.
Thanks again.
- Ben
> --Artem
>
>
>
> On Tue, Apr 14, 2009 at 7:32 PM, Ben Kelly <ben@wanderview.com> wrote:
>> On Apr 14, 2009, at 11:50 AM, Ben Kelly wrote:
>>>
>>> On Apr 13, 2009, at 7:36 PM, Artem Belevich wrote:
>>>>
>>>> Tried your patch that used PRIBIO+{1,2} for priorities with
>>>> -current r191008 and the kernel died with a "spinlock held too
>>>> long" panic. Actually, there apparently were two instances of the
>>>> panic on different cores.
>>>>
>>>> Here's output of "alltrace" and "ps" after the crash:
>>>> http://pastebin.com/f140f4596
>>>>
>>>> I've reverted the change and kernel booted just fine.
>>>>
>>>> The box is quad-core with two ZFS pools -- one single-disk and the
>>>> other a two-disk mirror. FreeBSD is installed on UFS partitions;
>>>> ZFS is used for user stuff only.
>>>
>>> Thanks for the report!
>>>
>>> I don't have a lot of time to look at this today, but it appears
>>> that there is a race condition on SMP machines when setting the
>>> priority immediately after the kproc is spawned. As a quick hack I
>>> tried adding a pause between the kproc_create() and the
>>> sched_prio(). Can you try this patch?
>>>
>>>
>>> http://www.wanderview.com/svn/public/misc/zfs_livelock/zfs_thread_priority.diff
>>>
>>> I'll try to take a closer look at this later in the week.
>>
>> Sorry for replying to my own e-mail, but I've updated the patch again
>> with a less hackish approach. (At the same URL above.) I added a new
>> kproc_create_priority() function to set the priority of the new
>> thread before it is first scheduled. This should avoid any SMP races
>> with setting the priority from an external thread.
>>
>> If you would be willing to try the test again with this new patch, I
>> would appreciate it.
>>
>> Thanks!
>>
>> - Ben
>>
