Date:      Mon, 29 Jul 2013 06:57:05 -0400
From:      Travis Mikalson <bofh@terranova.net>
To:        d@delphij.net
Cc:        freebsd-fs@freebsd.org, kib@freebsd.org
Subject:   Re: Report: ZFS deadlock in 9-STABLE
Message-ID:  <51F64A81.5010404@terranova.net>
In-Reply-To: <51D47A5F.3030501@delphij.net>
References:  <51D45401.5050801@terranova.net> <51D47A5F.3030501@delphij.net>

Xin Li wrote:
> Hi,
> 
> Sorry for the top posting but I am quite convinced that this is a
> known issue that we have seen with our customer.  Please try applying
> this patch [1] and please report back if that fixes your problem.

It has been 21 days since I booted a kernel with your patch applied and
so far so good. This is the longest this system has ever gone without
livelocking.

I'll report back one last time if this system makes it another few weeks
without a livelock, since that will be an extremely strong indication
that I was having the problem that you seem to have resolved with this
patch.

> Note that if you would like to provide more help, we would appreciate
> that you test Konstantin's patch as well, at:
> 
> http://lists.freebsd.org/pipermail/freebsd-hackers/2013-May/042876.html
> 
> [1] See attachment; the commit is
> https://github.com/trueos/trueos/commit/f678ae7c7f72fba577b00e3d0c237c4f297575c6
> 
> Cheers,
> 
> On 07/03/13 09:40, Travis Mikalson wrote:
>> Hello,
> 
>> To cut to the chase, I have a procstat -kk -a captured during a
>> livelock for you here: 
>> http://tog.net/freebsd/zfsdeadlock-storage1-20130703
> 
>> The other relevant configurations I could think of to show you are 
>> available within that http://tog.net/freebsd/ directory.
> 
>> If you want any additional information that I haven't given here
>> please let me know!
> 
>> This is a FreeBSD 9-STABLE AMD64 system currently at: r250777: Sat
>> May 18 17:41:39 EDT 2013
> 
>> I didn't see too many relevant ZFS-related fixes after that date so
>> am waiting for another round of interesting commits to update
>> again.
> 
>> Unfortunately, this system has been livelocking on average about
>> once every 7-14 days. Its lot in life is a ZFS storage server
>> serving NFS and istgt traffic.
> 
>> It has 32GB of RAM and is an 8-core 2.6GHz Opteron 6212. The zpool
>> looks like this; it has eight 1TB SAS drives and two SSDs being
>> used for log and cache.
> 
>>   pool: storage1
>>  state: ONLINE
>> status: The pool is formatted using a legacy on-disk format.  The pool
>>         can still be used, but some features are unavailable.
>> action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
>>         pool will no longer be accessible on software that does not
>>         support feature flags.
>>   scan: scrub repaired 0 in 6h4m with 0 errors on Sun Jan  6 06:39:38 2013
>> config:
> 
>>         NAME        STATE     READ WRITE CKSUM
>>         storage1    ONLINE       0     0     0
>>           raidz1-0  ONLINE       0     0     0
>>             da0     ONLINE       0     0     0
>>             da2     ONLINE       0     0     0
>>             da4     ONLINE       0     0     0
>>             da6     ONLINE       0     0     0
>>           raidz1-1  ONLINE       0     0     0
>>             da1     ONLINE       0     0     0
>>             da3     ONLINE       0     0     0
>>             da5     ONLINE       0     0     0
>>             da7     ONLINE       0     0     0
>>         logs
>>           mirror-2  ONLINE       0     0     0
>>             da8p2   ONLINE       0     0     0
>>             da9p2   ONLINE       0     0     0
>>         cache
>>           da8p3     ONLINE       0     0     0
>>           da9p3     ONLINE       0     0     0
> 
>> errors: No known data errors
