Date:      Tue, 10 Jul 2012 11:32:39 +0300
From:      George Kontostanos <gkontos.mail@gmail.com>
To:        Dennis Glatting <freebsd@pki2.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS hanging
Message-ID:  <CA%2BdUSypXZ5qq9vj%2BkaVicfkUHEQEeaCt46F%2Bg%2B7wDyATbw9UbA@mail.gmail.com>
In-Reply-To: <1341864787.32803.43.camel@btw.pki2.com>
References:  <1341864787.32803.43.camel@btw.pki2.com>

On Mon, Jul 9, 2012 at 11:13 PM, Dennis Glatting <freebsd@pki2.com> wrote:
> I have a ZFS array of disks where the system simply stops as if forever
> blocked by some IO mutex. This happens often and the following is the
> output of top:
>
> last pid:  6075;  load averages:  0.00,  0.00,  0.00    up 0+16:54:41  13:04:10
> 135 processes: 1 running, 134 sleeping
> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Mem: 47M Active, 24M Inact, 18G Wired, 120M Buf, 44G Free
> Swap: 32G Total, 32G Free
>
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>  2410 root          1  33    0 11992K  2820K zio->i  7 331:25  0.00% bzip2
>  2621 root          1  52    4 28640K  5544K tx->tx 24 245:33  0.00% john
>  2624 root          1  48    4 28640K  5544K tx->tx  4 239:08  0.00% john
>  2623 root          1  49    4 28640K  5544K tx->tx  7 238:44  0.00% john
>  2640 root          1  42    4 28640K  5420K tx->tx 23 206:51  0.00% john
>  2638 root          1  42    4 28640K  5420K tx->tx 28 206:34  0.00% john
>  2639 root          1  42    4 28640K  5420K tx->tx  9 206:30  0.00% john
>  2637 root          1  42    4 28640K  5420K tx->tx 18 206:24  0.00% john
>
>
> This system is presently resilvering a disk but these stops have
> happened before.
>
>
> iirc#  zpool status disk-1
>   pool: disk-1
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Sun Jul  8 13:07:46 2012
>         104G scanned out of 12.4T at 1.73M/s, (scan is slow, no estimated time)
>         10.3G resilvered, 0.82% done
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         disk-1                      DEGRADED     0     0     0
>           raidz2-0                  DEGRADED     0     0     0
>             da1                     ONLINE       0     0     0
>             da2                     ONLINE       0     0     0
>             da10                    ONLINE       0     0     0
>             da9                     ONLINE       0     0     0
>             da5                     ONLINE       0     0     0
>             da6                     ONLINE       0     0     0
>             da7                     ONLINE       0     0     0
>             replacing-7             DEGRADED     0     0     0
>               17938531774236227186  UNAVAIL      0     0     0  was /dev/da8
>               da3                   ONLINE       0     0     0  (resilvering)
>             da8                     ONLINE       0     0     0
>             da4                     ONLINE       0     0     0
>         logs
>           ada2p1                    ONLINE       0     0     0
>         cache
>           ada1                      ONLINE       0     0     0
>
> errors: No known data errors
>
>
> This system has dissimilar disks, which I understand should not be a
> problem but the stopping also happened before I started the slow disk
> upgrade process.
>
> The disks are served by:
>
> * An LSI 9211 flashed to IT, and
> * An LSI 2008 controller on the motherboard also flashed to IT.
>
> The 2008 BIOS and firmware are the most recent from LSI. The motherboard
> is a Supermicro H8DG6-F.
>
>
> My question is what should I be looking at and how should I look at it?
> There is nothing in the logs or the console, rather the system is
> forever paused and entering commands results in no response (it's as if
> everything is deadlocked).

Can you post the output of 'dmesg | grep mps' and the FreeBSD version you
are running? Also, is there any chance that those disks use 4K sectors?
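
The items asked for above could be gathered in one pass with something
like the following. This is only a sketch: run_if_present and
collect_info are made-up helper names, the disk names are copied from the
zpool status output quoted earlier, and it assumes diskinfo and zdb are
available on the box. Many 4K drives report 512-byte logical sectors, so
the diskinfo sectorsize/stripesize values are a hint rather than proof.

```shell
#!/bin/sh
# Sketch only: run_if_present and collect_info are hypothetical helper names.

# Print a header, then run the command if it is installed.
run_if_present() {
    label=$1
    shift
    echo "== $label =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(skipped: $1 not found)"
    fi
}

# Gather the items asked for: OS version, mps messages, sector sizes.
collect_info() {
    pool=${1:-disk-1}
    run_if_present "FreeBSD version" uname -a

    echo "== mps driver messages =="
    dmesg 2>/dev/null | grep mps || echo "(no mps lines found)"

    # Disk names taken from the zpool status output quoted earlier.
    for d in da1 da2 da3 da4 da5 da6 da7 da8 da9 da10; do
        run_if_present "diskinfo $d" diskinfo -v "$d"
    done

    # ashift=9 means 512-byte alignment, ashift=12 means 4K alignment.
    echo "== ashift for $pool =="
    if command -v zdb >/dev/null 2>&1; then
        zdb -C "$pool" 2>&1 | grep ashift || echo "(no ashift reported)"
    else
        echo "(skipped: zdb not found)"
    fi
}
```

Calling `collect_info disk-1 > report.txt` as root would write everything
to a file that can be pasted into a reply.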

-- 
George Kontostanos
Aicom telecoms ltd
http://www.aisecure.net


