Date: Mon, 9 Jan 2006 21:50:35 -0500 (EST)
From: Charles Sprickman
To: hackers@freebsd.org
Subject: nullfs/quota/jail interaction

Hi all,

I know nullfs is not to be relied upon, but I did hit an interesting bug the other day, and I was wondering if I should bother with a PR or not.

In short, doing the following seems to dirty the partition and leave the machine in a state where a hard reset is required to recover. This is -stable from 1/4/06.

- start a jail within a partition dedicated to jails, in my case it looks like:
      /jails
      /jails/jail1
      /jails/jail2
      etc...
- use nullfs to link the host's ports tree into the jail:
      mount_nullfs /usr/ports /jails/jail1/usr/ports
- enable quotas for the /jails partition
- stop the jails and run quotacheck to make sure everything is consistent:
      quotacheck -v /jails

At that point, the quotacheck command seems to deadlock on something. The process is not interruptible: CTRL-C and CTRL-Z do nothing but echo, and even a "kill -9" from the host has no effect. Any subsequent command that attempts to read anything from that partition also hangs in the same way. If a shutdown is issued, it does not kill off any of the deadlocked processes, and the machine must be manually reset (something of a pain if it's remote).

In order to get a clean boot I had to remove all jail startup commands from rc.conf and make sure nothing else (like syslog) was trying to do anything in /jails. Even then, the background fsck eventually deadlocked as well. I had to reboot once more with /jails commented out of fstab and then run the fsck manually to recover.

Should I write this up and send it in, in case one day someone decides to fix nullfs?

I'm also wondering about the separate issue of a deadlocked process preventing the machine from rebooting. I've seen similar reports of this behaviour with various other less-than-stable filesystems like vfs. It would be nice to have some way of telling the kernel "please stop waiting on this disk, it's not coming back, it's futile, please just reboot".

Thanks,

Charles
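
For reference, the reproduction steps above can be sketched as a small dry-run script. This is only a rough sketch: the /etc/rc.d/jail invocations, the use of quotaon to enable quotas, and the RUN switch are assumptions for illustration and were not part of the original steps. By default it only prints the commands; running it for real may hang the machine as described.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps. Prints each command unless RUN=1.
# WARNING: with RUN=1 this may deadlock the /jails partition as reported.
# Assumptions (not from the report): jails are controlled via /etc/rc.d/jail,
# and quotas for /jails are already configured so that quotaon can enable them.

JAILS=/jails
JAIL="$JAILS/jail1"

# Either execute the command (RUN=1) or just echo it (default).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

repro() {
    run /etc/rc.d/jail start                       # start the jail(s)
    run mount_nullfs /usr/ports "$JAIL/usr/ports"  # link host ports tree into the jail
    run quotaon "$JAILS"                           # enable quotas on the jails partition
    run /etc/rc.d/jail stop                        # stop the jails
    run quotacheck -v "$JAILS"                     # the step that appears to deadlock
}

repro
```

Keeping the default as a dry run makes it safe to paste into a PR and review before anyone tries it on a scratch machine.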