Date: Tue, 10 Jul 2012 11:32:39 +0300
From: George Kontostanos <gkontos.mail@gmail.com>
To: Dennis Glatting <freebsd@pki2.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS hanging
Message-ID: <CA%2BdUSypXZ5qq9vj%2BkaVicfkUHEQEeaCt46F%2Bg%2B7wDyATbw9UbA@mail.gmail.com>
In-Reply-To: <1341864787.32803.43.camel@btw.pki2.com>
References: <1341864787.32803.43.camel@btw.pki2.com>
On Mon, Jul 9, 2012 at 11:13 PM, Dennis Glatting <freebsd@pki2.com> wrote:
> I have a ZFS array of disks where the system simply stops as if forever
> blocked by some IO mutex. This happens often and the following is the
> output of top:
>
> last pid:  6075;  load averages:  0.00,  0.00,  0.00  up 0+16:54:41  13:04:10
> 135 processes: 1 running, 134 sleeping
> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Mem: 47M Active, 24M Inact, 18G Wired, 120M Buf, 44G Free
> Swap: 32G Total, 32G Free
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>  2410 root        1  33    0 11992K  2820K zio->i  7 331:25  0.00% bzip2
>  2621 root        1  52    4 28640K  5544K tx->tx 24 245:33  0.00% john
>  2624 root        1  48    4 28640K  5544K tx->tx  4 239:08  0.00% john
>  2623 root        1  49    4 28640K  5544K tx->tx  7 238:44  0.00% john
>  2640 root        1  42    4 28640K  5420K tx->tx 23 206:51  0.00% john
>  2638 root        1  42    4 28640K  5420K tx->tx 28 206:34  0.00% john
>  2639 root        1  42    4 28640K  5420K tx->tx  9 206:30  0.00% john
>  2637 root        1  42    4 28640K  5420K tx->tx 18 206:24  0.00% john
>
>
> This system is presently resilvering a disk but these stops have
> happened before.
>
>
> iirc# zpool status disk-1
>   pool: disk-1
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Sun Jul  8 13:07:46 2012
>         104G scanned out of 12.4T at 1.73M/s, (scan is slow, no estimated time)
>         10.3G resilvered, 0.82% done
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         disk-1                      DEGRADED     0     0     0
>           raidz2-0                  DEGRADED     0     0     0
>             da1                     ONLINE       0     0     0
>             da2                     ONLINE       0     0     0
>             da10                    ONLINE       0     0     0
>             da9                     ONLINE       0     0     0
>             da5                     ONLINE       0     0     0
>             da6                     ONLINE       0     0     0
>             da7                     ONLINE       0     0     0
>             replacing-7             DEGRADED     0     0     0
>               17938531774236227186  UNAVAIL      0     0     0  was /dev/da8
>               da3                   ONLINE       0     0     0  (resilvering)
>             da8                     ONLINE       0     0     0
>             da4                     ONLINE       0     0     0
>         logs
>           ada2p1                    ONLINE       0     0     0
>         cache
>           ada1                      ONLINE       0     0     0
>
> errors: No known data errors
>
>
> This system has dissimilar disks, which I understand should not be a
> problem but the stopping also happened before I started the slow disk
> upgrade process.
>
> The disks are served by:
>
>  * A LSI 9211 flashed to IT, and
>  * A LSI 2008 controller on the motherboard also flashed to IT.
>
> The 2008 BIOS and firmware is the most recent from LSI. The motherboard
> is a Supermicro H8DG6-F.
>
>
> My question is what should I be looking at and how should I look at it?
> There is nothing in the logs or the console, rather the system is
> forever paused and entering commands results in no response (it's as if
> everything is deadlocked).
>
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

Can you post the output of 'dmesg | grep mps' and the FreeBSD version you
run? Also, is there any chance that those disks are 4K?

-- 
George Kontostanos
Aicom telecoms ltd
http://www.aisecure.net
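
For reference, the details requested above could be gathered roughly as
follows. This is only a sketch, not part of the original exchange: it assumes
a FreeBSD host using the mps(4) driver, picks da1 purely as an example member
of the pool shown in the quoted zpool status, and the helper
ashift_for_sector is a hypothetical name introduced here for illustration.
Note that many 4K drives report 512-byte logical sectors, so the reported
sector size alone may not settle the question.

```sh
#!/bin/sh
# Sketch only: collect controller, OS version, and sector-size details.
# Assumptions: FreeBSD, mps(4) driver, pool "disk-1", example disk da1.

# Hypothetical helper: print log2 of a sector size in bytes, i.e. the
# ashift value that matches that sector size (512 -> 9, 4096 -> 12).
ashift_for_sector() {
    size=$1
    bits=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        bits=$((bits + 1))
    done
    echo "$bits"
}

# Only run the FreeBSD-specific commands on a FreeBSD host.
if [ "$(uname -s)" = "FreeBSD" ]; then
    uname -a                       # FreeBSD version and kernel build
    dmesg | grep mps               # LSI controller probe and firmware messages
    diskinfo -v da1                # sectorsize and stripesize reported by the disk
    zdb -C disk-1 | grep ashift    # ashift recorded in the pool configuration
fi
```

An ashift of 9 on a pool built from 4K-sector disks would mean the pool is
misaligned for those drives, which is one reason the sector size is worth
checking.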