Date: Mon, 17 Oct 2011 17:17:31 -0700
From: Harold Paulson <haroldp@internal.org>
To: freebsd-fs@freebsd.org
Subject: Damaged directory on ZFS
Message-ID: <4D8047A6-930E-4DE8-BA55-051890585BFE@internal.org>

Hello,

I've had a server that boots from ZFS panicking for a couple of days. I have worked around the problem for now, but I hope someone can give me some insight into what's going on, and how I can solve it properly.

The server is running 8.2-STABLE (ZFS v28) with 8 GB of RAM and 4 SATA disks in a RAID10-type arrangement:

# uname -a
FreeBSD jane.sierraweb.com 8.2-STABLE-201105 FreeBSD 8.2-STABLE-201105 #0: Tue May 17 05:18:48 UTC 2011 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

And zpool status:

        NAME           STATE     READ WRITE CKSUM
        tank           ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            gpt/disk0  ONLINE       0     0     0
            gpt/disk1  ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            gpt/disk2  ONLINE       0     0     0
            gpt/disk3  ONLINE       0     0     0

It started panicking under load a couple of days ago. We replaced the RAM and motherboard, but the problems persisted. I don't know whether a hardware issue originally caused the problem. When it panics, I get the usual panic message, but I don't get a core file, and it never reboots itself.

http://pastebin.com/F1J2AjSF

While I was trying to figure out the source of the problem, I noticed various stuck processes that peg a CPU and can't be killed, such as:

  PID   JID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
48735     0 root        1  46    0 11972K   924K CPU3    3 415:14 100.00% find

They are not marked zombie, but I can't kill them, and even restarting the jail they are in won't get rid of them. Running truss on them just hangs with no output. On different occasions, I noticed pop3d processes for the same user getting stuck in this way. On a hunch I ran a "find" through the files in the user's Maildir and got a panic. I disabled this account and now the server is stable again. At least until locate.updatedb walks through that directory, I suppose. Evidently, there is some kind of hole in the file system below that directory tree causing the panic.

I can move that directory out of the way and carry on, but is there anything I can do to really *repair* the problem?

Thanks.

- H