From: George Kontostanos
To: Dennis Glatting
Cc: freebsd-fs@freebsd.org
Date: Tue, 10 Jul 2012 11:32:39 +0300
Subject: Re: ZFS hanging

On Mon, Jul 9, 2012 at 11:13 PM, Dennis Glatting wrote:
> I have a ZFS array of disks where the system simply stops as if forever
> blocked by some IO mutex. This happens often and the following is the
> output of top:
>
> last pid:  6075;  load averages:  0.00,  0.00,  0.00  up 0+16:54:41  13:04:10
> 135 processes: 1 running, 134 sleeping
> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Mem: 47M Active, 24M Inact, 18G Wired, 120M Buf, 44G Free
> Swap: 32G Total, 32G Free
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>  2410 root        1  33    0 11992K  2820K zio->i  7 331:25  0.00% bzip2
>  2621 root        1  52    4 28640K  5544K tx->tx 24 245:33  0.00% john
>  2624 root        1  48    4 28640K  5544K tx->tx  4 239:08  0.00% john
>  2623 root        1  49    4 28640K  5544K tx->tx  7 238:44  0.00% john
>  2640 root        1  42    4 28640K  5420K tx->tx 23 206:51  0.00% john
>  2638 root        1  42    4 28640K  5420K tx->tx 28 206:34  0.00% john
>  2639 root        1  42    4 28640K  5420K tx->tx  9 206:30  0.00% john
>  2637 root        1  42    4 28640K  5420K tx->tx 18 206:24  0.00% john
>
> This system is presently resilvering a disk but these stops have
> happened before.
>
> iirc# zpool status disk-1
>   pool: disk-1
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Sun Jul  8 13:07:46 2012
>         104G scanned out of 12.4T at 1.73M/s, (scan is slow, no estimated time)
>         10.3G resilvered, 0.82% done
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         disk-1                      DEGRADED     0     0     0
>           raidz2-0                  DEGRADED     0     0     0
>             da1                     ONLINE       0     0     0
>             da2                     ONLINE       0     0     0
>             da10                    ONLINE       0     0     0
>             da9                     ONLINE       0     0     0
>             da5                     ONLINE       0     0     0
>             da6                     ONLINE       0     0     0
>             da7                     ONLINE       0     0     0
>             replacing-7             DEGRADED     0     0     0
>               17938531774236227186  UNAVAIL      0     0     0  was /dev/da8
>               da3                   ONLINE       0     0     0  (resilvering)
>             da8                     ONLINE       0     0     0
>             da4                     ONLINE       0     0     0
>         logs
>           ada2p1                    ONLINE       0     0     0
>         cache
>           ada1                      ONLINE       0     0     0
>
> errors: No known data errors
>
> This system has dissimilar disks, which I understand should not be a
> problem but the stopping also happened before I started the slow disk
> upgrade process.
>
> The disks are served by:
>
>  * A LSI 9211 flashed to IT, and
>  * A LSI 2008 controller on the motherboard also flashed to IT.
>
> The 2008 BIOS and firmware is the most recent from LSI. The motherboard
> is a Supermicro H8DG6-F.
>
> My question is what should I be looking at and how should I look at it?
> There is nothing in the logs or the console, rather the system is
> forever paused and entering commands results in no response (it's as if
> everything is deadlocked).

Can you post the output of 'dmesg | grep mps' and the FreeBSD version you
are running? Also, is there any chance that those disks use 4K sectors?

-- 
George Kontostanos
Aicom telecoms ltd
http://www.aisecure.net