Date: Fri, 7 Dec 2012 10:21:01 -0800
From: Garrett Cooper <yanegomi@gmail.com>
To: Fabian Keil <freebsd-listen@fabiankeil.de>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS hang
Message-ID: <79E03100-A019-4579-8762-50FBB82DD356@gmail.com>
In-Reply-To: <20121207172240.037306e1@fabiankeil.de>
References: <50C1CB34.3000308@icritical.com> <50C1DDE8.9030503@icritical.com> <20121207172240.037306e1@fabiankeil.de>

On Dec 7, 2012, at 8:22 AM, Fabian Keil <freebsd-listen@fabiankeil.de> wrote:

> Matt Burke <mattblists@icritical.com> wrote:
>
>> Obviously, the cause of my problems would seem to be a hosed disk. However
>> the kernel msgbuf shows no complaints from the drive before reboot.
>>
>> da8 is a 60GB OCZ Agility 3 SSD (purchased prior to realising just how
>> unreliable they are). According to the SMART data, it's had just 146GB of
>> reads and 278GB writes over 3 power cycles with only 3 months power on
>> time, similar to the others that have failed (~60% failure rate for ours)
>>
>> I can understand the drive failing, I just can't understand how it hung the
>> system. I have had a similar thing happen on one of these machines before
>> (with GENERIC and no dumpdev, so no debugging) with one of these disks on
>> an Areca HBA.
>
> In CURRENT, parts of the cam layer can silently hang under certain
> circumstances and this can negatively affect various other subsystems
> including ZFS:
> http://lists.freebsd.org/pipermail/freebsd-current/2012-October/037413.html
>
> I suppose this regression is old enough to have trickled down
> to the stable branches by now.
>
> I'm not saying that this is definitively the problem you are
> seeing, but I think it would explain the symptoms.
>
>> Could there be a problem with ATA devices on SCSI controllers which is
>> causing failures to be silently dropped? Is ZFS lacking a timeout on IO calls?
>
> I believe ZFS is designed with the expectation that timeouts are
> handled by the layers below it, so technically it doesn't "lack"
> the timeouts for IO calls ...

I've noticed hangs on reboot as well recently (in the last 2-3 months) with my
ata single disk pools and my mfi pool. All storage disks seem healthy... The
pools were running v28 with the zfs features upgrade.

Thanks,
-Garrett