Date: Mon, 17 Oct 2011 17:54:48 -0700
From: Jeremy Chadwick
To: Harold Paulson
Cc: freebsd-fs@freebsd.org
Subject: Re: Damaged directory on ZFS
Message-ID: <20111018005448.GA2855@icarus.home.lan>
In-Reply-To: <4D8047A6-930E-4DE8-BA55-051890585BFE@internal.org>

On Mon, Oct 17, 2011 at 05:17:31PM -0700, Harold Paulson wrote:
> I've had a server that boots from ZFS panicking for a couple days. I
> have worked around the problem for now, but I hope someone can give me
> some insight into what's going on, and how I can solve it properly.
>
> The server is running 8.2-STABLE (zfs v28) with 8G of ram and 4 SATA
> disks in a raid10 type arrangement:
>
> # uname -a
> FreeBSD jane.sierraweb.com 8.2-STABLE-201105 FreeBSD 8.2-STABLE-201105 #0: Tue May 17 05:18:48 UTC 2011 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

First thing to do is to consider upgrading to a newer RELENG_8 date.
There have been *many* ZFS fixes since May.

> And zpool status:
>
>         NAME           STATE     READ WRITE CKSUM
>         tank           ONLINE       0     0     0
>           mirror       ONLINE       0     0     0
>             gpt/disk0  ONLINE       0     0     0
>             gpt/disk1  ONLINE       0     0     0
>           mirror       ONLINE       0     0     0
>             gpt/disk2  ONLINE       0     0     0
>             gpt/disk3  ONLINE       0     0     0
>
> It started panicking under load a couple days ago. We replaced RAM and
> motherboard, but problems persisted. I don't know if a hardware issue
> originally caused the problem or what. When it panics, I get the usual
> panic message, but I don't get a core file, and it never reboots
> itself.
>
> http://pastebin.com/F1J2AjSF

ZFS developers will need to comment on the state of the backtrace. You
may be asked to examine the core using kgdb and be given some commands
to run on it.

> While I was trying to figure out the source of the problem, I noticed
> various stuck processes that peg a CPU and can't be killed, such as:
>
>   PID JID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
> 48735   0 root        1  46    0 11972K   924K CPU3    3 415:14 100.00% find

Had you done procstat -k -k 48735 (the "double -k" is not a typo), you
probably would have seen from the kernel stack that the process was
"stuck" in ZFS-related code. These are processes which the kernel is
hanging on to and will not let go of, so even kill -9 won't get rid of
them.

It would also have been worthwhile to get the "process tree" of what
spawned the PID. (Solaris has ptree; I think we have something similar
under FreeBSD but I forget what.) The reason that matters is that it's
probably a periodic job (there are many which use find) traversing your
ZFS filesystems and tickling a bug/issue somewhere.
You even hint at this in your next paragraph, re: locate.updatedb.

> They are not marked zombie, but I can't kill them, and restarting the
> jail they are in won't even get rid of them. truss just hangs with no
> output on them. On different occasions, I noticed pop3d processes for
> the same user getting stuck in this way. On a hunch I ran a "find"
> through the files in the user's Maildir and got a panic. I disabled
> this account and now the server is stable again. At least until
> locate.updatedb walks through that directory, I suppose. Evidently,
> there is some kind of hole in the file system below that directory
> tree causing the panic.

The fact that jails are involved complicates things even more.

truss and ktrace won't show anything going on because of what I said
above: the kernel bits associated with the process are hung or spinning,
not the actual syscall/userland bits. Furthermore, truss on FreeBSD is
basically worthless; use ktrace.

> I can move that directory out of the way, and carry on, but is there
> anything I can do to really *repair* the problem?

I would recommend starting with "zpool scrub" on the pool which holds
the Maildir/ directory of the account you disabled. I will not be
surprised if it comes back 100% clean.

Given what the backtrace looks like, I would say the Maildir/ has a ton
of files in it. Is that the case? Does "echo *" say something about
argument list too long?

You should also be aware that Maildir on ZFS performs horribly. I've
experienced this, and there are old discussions about it as well. Here
are some of my findings.

http://koitsu.wordpress.com/2009/06/01/freebsd-and-zfs-horrible-raidz1-read-speed/
http://koitsu.wordpress.com/2009/06/01/freebsd-and-zfs-horrible-raidz1-speed-part-2/
http://koitsu.wordpress.com/2009/10/29/unix-mail-format-annoyances/

The state of mail spools on UNIX is a complete disgrace, and everyone
involved in it should feel ashamed.
MIX is probably the best solution to this problem, but it's not being
adopted by all the major players, which is very sad.

I realise that doesn't solve your problem, but my strong recommendation
is to use classic UNIX mail spools (one file for many messages) when the
filesystem is ZFS-based.

However, someone familiar with the ZFS internals, as I said, should
investigate the crash you're experiencing regardless.

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                    Mountain View, CA, US |
| Making life hard for others since 1977.                PGP 4BD6C0CB |
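
Addendum, re: the "process tree" of a stuck PID mentioned above. What
follows is only a minimal sketch, assuming a POSIX sh and a ps(1) that
accepts "-o comm=" and "-o ppid=" (FreeBSD and Linux both do). The
function name "ancestry" is made up for illustration and is not an
existing tool. It walks from a PID up through its parents, printing one
"PID COMMAND" line per step, which should be enough to tell whether a
periodic job started the find.

```shell
#!/bin/sh
# Hypothetical helper: print "PID COMMAND" for a process and each of
# its ancestors, starting at the given PID and walking upward.
ancestry() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
        # "comm=" and "ppid=" print the value with no header line.
        # Stop if ps cannot find the process (it may have exited).
        cmd=$(ps -o comm= -p "$pid") || break
        printf '%s %s\n' "$pid" "$cmd"
        # Move up to the parent; strip the padding ps adds.
        pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
    done
}
```

Usage would be "ancestry 48735" for the find process quoted earlier.
This only reads the process table, so it should not need to touch the
damaged directory itself. For the kernel side of a stuck process,
procstat -k -k PID remains the thing to run.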