From: Harold Paulson <haroldp@internal.org>
To: freebsd-fs@freebsd.org
Date: Sat, 22 Oct 2011 23:54:20 -0700
Subject: Re: Damaged directory on ZFS
In-Reply-To: <20111018005448.GA2855@icarus.home.lan>

Jeremy,

If I've taken a while to respond it was because there was a ton of
great information in your post and I've spent a lot of time testing
stuff out.

On Oct 17, 2011, at 5:54 PM, Jeremy Chadwick wrote:

> On Mon, Oct 17, 2011 at 05:17:31PM -0700, Harold Paulson wrote:
>> I've had a server that boots from ZFS panicking for a couple days.
>> I have worked around the problem for now, but I hope someone can
>> give me some insight into what's going on, and how I can solve it
>> properly.
>>
>> The server is running 8.2-STABLE (zfs v28) with 8G of ram and 4 SATA
>> disks in a raid10 type arrangement:
>>
>> # uname -a
>> FreeBSD jane.sierraweb.com 8.2-STABLE-201105 FreeBSD 8.2-STABLE-201105 #0: Tue May 17 05:18:48 UTC 2011 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>
> First thing to do is to consider upgrading to a newer RELENG_8 date.
> There have been *many* ZFS fixes since May.

I've done this, ran a scrub that completed without error, and still,
listing that directory panics the machine.

>> It started panicking under load a couple days ago. We replaced RAM
>> and motherboard, but problems persisted. I don't know if a hardware
>> issue originally caused the problem or what. When it panics, I get
>> the usual panic message, but I don't get a core file, and it never
>> reboots itself.
>>
>> http://pastebin.com/F1J2AjSF
>
> ZFS developers will need to comment on the state of the backtrace. You
> may be requested to examine the core using kgdb and be given some
> commands to run on it.

Yeah, I made a real effort to get a core, but I just don't think it's
going to happen. It's an all-zfs system for starters. I actually
pulled a drive out of the array and reformatted it to try to get a
core, but it freezes on panic and never reboots after 15 seconds or
any of that.
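(For anyone who finds this thread later: the standard recipe for
getting a crash dump is, as far as I understand it, something along
these lines; the device name ada4 below is only for illustration, so
substitute whatever your scratch disk shows up as:

  # gpart create -s gpt ada4
  # gpart add -t freebsd-swap ada4
  # dumpon /dev/ada4p1

plus dumpdev="/dev/ada4p1" and dumpdir="/var/crash" in /etc/rc.conf so
that savecore can write the dump out on the next boot, after which you
can poke at it with:

  # kgdb /boot/kernel/kernel /var/crash/vmcore.0

The catch here is that this box never finishes the panic/reboot cycle,
so savecore never gets a chance to run.)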
>> While I was trying to figure out the source of the problem, I
>> noticed various stuck processes that peg a CPU and can't be killed,
>> such as:
>>
>>   PID JID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
>> 48735   0 root        1  46    0 11972K   924K CPU3   3 415:14 100.00% find
>
> Had you done procstat -k -k 48735 (the "double -k" is not a typo), you
> probably would have seen that the process was "stuck" in a ZFS-related
> thread. These are processes which the kernel is hanging on to and will
> not let go of, so even kill -9 won't kill these.
>
> It would also have been worthwhile to get the "process tree" of what
> spawned the PID. (Solaris has ptree; I think we have something similar
> under FreeBSD but I forget what.) The reason that matters is that it's
> probably a periodic job that runs (there are many which use find),
> traversing your ZFS filesystems, and tickling a bug/issue somewhere.
> You even hint at this in your next paragraph, re: locate.updatedb.

The processes are just ones that touch that poison directory (or some
file within it): "pop3d", or "find" from the nightly periodic runs.
pstree is in ports and an old favorite of mine, and reports what I'd
expect from those.

procstat isn't any more interesting. Here was the one I managed to get:

# procstat -k -k 44571
  PID    TID COMM             TDNAME           KSTACK
44571 101006 find             -

>> I can move that directory out of the way, and carry on, but is there
>> anything I can do to really *repair* the problem?
>
> I would recommend starting with "zpool scrub" on the pool which is
> associated with the Maildir/ directory of the account you disabled. I
> will not be surprised if it comes back 100% clean.

Yep, scrubs complete without error.

> Given what the backtrace looks like, I would say the Maildir/ has a ton
> of files in it. Is that the case? Does "echo *" say something about
> argument list too long?

Nah, it's only like 12M of email (restored from a snap). Listing the
dir is an insta-panic.

> However, someone familiar with the ZFS internals, as I said, should
> investigate the crash you're experiencing regardless.

I'd still like to find a fix. I moved the dir to /var/blackhole and
excluded it from locate.updatedb and other periodic scans, so the
system isn't panicking, but it's a crummy situation.

 - H
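P.S. In case it helps anyone else in the same boat: one way to keep
locate.updatedb away from a directory like this is to add it to
PRUNEPATHS in /etc/locate.rc, something like the following (your
existing list will likely differ; just append the quarantined path to
whatever is already there):

  # /etc/locate.rc
  PRUNEPATHS="/tmp /usr/tmp /var/tmp /var/db/portsnap /var/blackhole"

and any hand-rolled find jobs can skip it with a prune expression:

  # find /var -path /var/blackhole -prune -o -type f -print

None of that repairs the underlying directory, obviously; it just
keeps the nightly jobs from tripping over it.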