Date: Sun, 23 Oct 2011 16:02:22 +0200
From: Pawel Jakub Dawidek
To: Harold Paulson
Cc: freebsd-fs@freebsd.org
Subject: Re: Damaged directory on ZFS
Message-ID: <20111023140222.GG1697@garage.freebsd.pl>
In-Reply-To: <4D8047A6-930E-4DE8-BA55-051890585BFE@internal.org>

On Mon, Oct 17, 2011 at 05:17:31PM -0700, Harold Paulson wrote:
> Hello,
>
> I've had a server that boots from ZFS panicking for a couple days. I have worked around the problem for now, but I hope someone can give me some insight into what's going on, and how I can solve it properly.
>
> The server is running 8.2-STABLE (zfs v28) with 8G of RAM and 4 SATA disks in a raid10 type arrangement:
>
> # uname -a
> FreeBSD jane.sierraweb.com 8.2-STABLE-201105 FreeBSD 8.2-STABLE-201105 #0: Tue May 17 05:18:48 UTC 2011 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>
> And zpool status:
>
>         NAME           STATE     READ WRITE CKSUM
>         tank           ONLINE       0     0     0
>           mirror       ONLINE       0     0     0
>             gpt/disk0  ONLINE       0     0     0
>             gpt/disk1  ONLINE       0     0     0
>           mirror       ONLINE       0     0     0
>             gpt/disk2  ONLINE       0     0     0
>             gpt/disk3  ONLINE       0     0     0
>
> It started panicking under load a couple days ago. We replaced RAM and motherboard, but problems persisted. I don't know if a hardware issue originally caused the problem or what. When it panics, I get the usual panic message, but I don't get a core file, and it never reboots itself.
>
> http://pastebin.com/F1J2AjSF
>
> While I was trying to figure out the source of the problem, I noticed various stuck processes that peg a CPU and can't be killed, such as:
>
>   PID JID USERNAME THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
> 48735   0 root       1  46    0 11972K   924K CPU3     3 415:14 100.00% find
>
> They are not marked zombie, but I can't kill them, and restarting the jail they are in won't even get rid of them. truss just hangs with no output on them. On different occasions, I noticed pop3d processes for the same user getting stuck in this way. On a hunch I ran a "find" through the files in the user's Maildir and got a panic. I disabled this account and now the server is stable again. At least until locate.updatedb walks through that directory, I suppose. Evidently, there is some kind of hole in the file system below that directory tree causing the panic.
>
> I can move that directory out of the way, and carry on, but is there anything I can do to really *repair* the problem?

Could you run these commands:

	objdump -D /boot/kernel/zfs.ko.symbols | egrep '^[0-9a-f]{8,16} ' | awk '{printf("0x%s\n", $1)}' | xargs -J ADDR printf "%u + %u\n" ADDR 0x111 | bc | xargs printf "0x%x\n" | xargs addr2line -e /boot/kernel/zfs.ko.symbols

They should convert fzap_cursor_retrieve+0x111 into file:line.
Send it here once you obtain it. Thanks.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://yomoli.com
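
The pipeline in the reply above comes down to one step of arithmetic: take the start address of fzap_cursor_retrieve from the objdump listing, add the 0x111 offset, and hand the sum to addr2line. Note that the egrep pattern as it appears here would match every symbol header line in the listing, so the symbol name was presumably part of it. What follows is a minimal sketch of the same idea, assuming objdump's usual "<address> <name>:" header format. The function name sym_plus_offset is hypothetical and not from the original mail.

```shell
#!/bin/sh
# Hypothetical helper (not from the original mail): read `objdump -D`
# output on stdin and print the absolute address of SYMBOL+OFFSET in hex.
# Usage: sym_plus_offset SYMBOL OFFSET   (OFFSET as 0x-prefixed hex or decimal)
sym_plus_offset() {
    sym=$1
    off=$2
    # Assumption: function headers look like "<hex address> <name>:",
    # so field 1 is the address and field 2 is the bracketed name.
    base=$(awk -v pat="<${sym}>:" '$2 == pat { print $1; exit }')
    # Fail if the symbol was not found in the listing.
    [ -n "$base" ] || return 1
    # Shell arithmetic accepts 0x-prefixed hexadecimal constants.
    printf '0x%x\n' "$(( 0x$base + $off ))"
}
```

Under those assumptions it could be used as `objdump -D /boot/kernel/zfs.ko.symbols | sym_plus_offset fzap_cursor_retrieve 0x111 | xargs addr2line -e /boot/kernel/zfs.ko.symbols`, which should print a single file:line pair for that one symbol.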