Date: Fri, 10 Oct 2014 04:02:29 +0000
From: Steve Wills <swills@freebsd.org>
To: Steven Hartland <killing@multiplay.co.uk>
Cc: fs@freebsd.org, current@freebsd.org, Andriy Gapon <avg@freebsd.org>
Subject: Re: zfs hang
Message-ID: <20141010040228.GI79158@mouf.net>
In-Reply-To: <F93FC06BE5854556BF1F4318690C728C@multiplay.co.uk>
References: <20141008004045.GA24762__48659.9047123038$1412728878$gmane$org@mouf.net> <5434D1CE.8010801@FreeBSD.org> <20141010012724.GD79158@mouf.net> <F93FC06BE5854556BF1F4318690C728C@multiplay.co.uk>
On Fri, Oct 10, 2014 at 02:35:14AM +0100, Steven Hartland wrote:
>
> ----- Original Message -----
> From: "Steve Wills" <swills@freebsd.org>
> To: "Andriy Gapon" <avg@freebsd.org>
> Cc: <current@freebsd.org>; <fs@freebsd.org>
> Sent: Friday, October 10, 2014 2:27 AM
> Subject: Re: zfs hang
>
>
> > On Wed, Oct 08, 2014 at 08:55:26AM +0300, Andriy Gapon wrote:
> >> On 08/10/2014 03:40, Steve Wills wrote:
> >> > Hi,
> >> >
> >> > Not sure which thread this belongs to, but I have a zfs hang on one of my boxes
> >> > running r272152. Running procstat -kka looks like:
> >> >
> >> > http://pastebin.com/szZZP8Tf
> >> >
> >> > My zpool commands seem to be hung in spa_errlog_lock while others are hung in
> >> > zfs_lookup. Suggestions?
> >>
> >> There are several threads in zio_wait. If this is their permanent state then
> >> there is some problem with I/O somewhere below ZFS.
> >
> > Thanks for the feedback. It seems one of my disks is dying, I rebooted and it
> > came up OK, but today I got:
> >
> > panic: I/O to pool 'rpool' appears to be hung on vdev guid ..... at '/dev/ada0p3'
> >
> > I have screenshots and backtrace if anyone is interested. Dying drives
> > shouldn't cause panic, right?
>
> Its the deadman timer kicking in so yes, thats expected.
>
> The following sysctls control this behaviour if you want to try and recover:
> vfs.zfs.deadman_synctime_ms: 1000000
> vfs.zfs.deadman_checktime_ms: 5000
> vfs.zfs.deadman_enabled: 1

Ah, ok. This pool has two disks, mirrored. I think one of them is dying, the
BIOS gives a SMART error on startup, but it still uses the disk fine. From what
I read of the zfs deadman design, it's for when the controller is acting up. So
I'm confused. Maybe this means both disks are dying?

Steve
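
For anyone reading the sysctl values quoted above, here is a minimal sketch that parses those three lines and converts the millisecond settings into more readable units. It only works on the text as quoted in this thread and does not query a live system. The helper names (parse_sysctls, ms_to_seconds) are hypothetical, and the comments on what each tunable does reflect my understanding of the deadman tunables rather than anything stated in the thread.

```python
# The three sysctl lines exactly as quoted in the message above.
SYSCTL_OUTPUT = """\
vfs.zfs.deadman_synctime_ms: 1000000
vfs.zfs.deadman_checktime_ms: 5000
vfs.zfs.deadman_enabled: 1
"""


def parse_sysctls(text):
    """Parse 'name: value' lines into a dict of integer values (hypothetical helper)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, raw = line.partition(":")
        values[name.strip()] = int(raw.strip())
    return values


def ms_to_seconds(ms):
    """Convert milliseconds to seconds."""
    return ms / 1000.0


settings = parse_sysctls(SYSCTL_OUTPUT)

# Assumption: synctime is how long an I/O may stay outstanding before the
# deadman treats it as hung; checktime is how often that check runs.
sync_s = ms_to_seconds(settings["vfs.zfs.deadman_synctime_ms"])
check_s = ms_to_seconds(settings["vfs.zfs.deadman_checktime_ms"])
enabled = settings["vfs.zfs.deadman_enabled"] == 1

print(f"deadman enabled: {enabled}")
print(f"sync time: {sync_s:.0f} s (about {sync_s / 60:.1f} minutes)")
print(f"check interval: {check_s:.0f} s")
```

With the quoted values, the sync time works out to 1000 seconds, a little under 17 minutes, and the check interval to 5 seconds.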
