To: freebsd-jail@freebsd.org
Subject: Re: NFS + nullfs + jail = zombies?
Date: Sat, 09 Jul 2016 08:22:25 -0600
From: James Gritton
Message-ID: <596319b8dce811ea6e332c48d3542451@gritton.org>

On 2016-07-08 12:28, Thomas Johnson wrote:
> I am working on developing a clustered application utilizing jails and
> running into problems that seem to be NFS-related. I'm hoping that
> someone can point out my error.
>
> The jail images and my application data are served via NFS. The host
> mounts NFS at boot, and then uses nullfs mounts to assemble the jail
> tree when the jail is created (fstab files and jail.conf are below).
> This seems to work fine, the jail starts and is usable. The problem
> comes when I remove/restart the jail. Frequently (but not
> consistently), the jail gets stuck in a dying state, causing the
> unmount of the jail root (nullfs) to fail with a "device busy" error.
>
> # jail -f /var/local/jail.conf -r wds1-1a
> Stopping cron.
> Waiting for PIDS: 1361.
> .
> Terminated
> wds1-1a: removed
> umount: unmount of /var/jail/wds1-1a failed: Device busy
> # jls -av
> JID Hostname Path
>         Name State
>         CPUSetID
>         IP Address(es)
> 1 wds1-1a /var/jail/wds1-1a
>         wds1-1a DYING
>         2
>         2620:1:1:1:1a::1
>
> Through trial-and-error I have determined that forcing an unmount of
> the root works, but subsequent mounts to that mount point will fail to
> unmount with the same error. Deleting and recreating the mountpoint
> fixes the mounting issue, but the dying jail remains permanently.
>
> I have also found that if I copy the jail root to local storage and
> update the jail's fstab to nullfs mount this, the problem seems to go
> away. This leads me to believe that the issue is related to the NFS
> source for the nullfs mount. statd and lockd are both running on the
> host.
>
> My relevant configurations are below. I can provide any other
> information desired.
>
> # Host fstab line for jail root.
> #
> 10.219.212.1:/vol/dev/wds/jail_base /jail/base nfs ro 0 0
>
>
> # Jail fstab file (mount.fstab)
> #
> /jail/base /var/jail/wds1-1a nullfs ro 0 0
> # writable (UFS-backed) /var
> /var/jail-vars/wds1-1a /var/jail/wds1-1a/var nullfs rw 0 0
>
>
> # jail.conf file
> #
> * {
>     devfs_ruleset = "4";
>     mount.devfs;
>     exec.start = "/bin/sh /etc/rc";
>     exec.stop = "/bin/sh /etc/rc.shutdown";
>     interface = "vmx1";
>     allow.dying = 1;
>     exec.prestart = "/usr/local/bin/rsync -avC --delete
> /jail/${image}/var/ /var/jail-vars/${host.hostname}/";
> }
>
> # JMANAGE wds1-1a
> wds1-1a {
>     path = "/var/jail/wds1-1a";
>     ip6.addr = "2620:1:1:1:1a::1";
>     host.hostname = "wds1-1a";
>     host.domainname = "dev";
>     mount.fstab = "/var/local/fstab.wds1-1a";
>     $image = "base";
> }

What happens if you take jails out of the equation? I know this isn't
entirely a non-jail issue, but I wonder if a jail is required for the
mount point to be un-re-mountable. I've heard before of NFS-related
problems where a jail remains dying forever, but this has been more of
an annoyance than a real problem.

It's not so much that I want to absolve jails, as I want to see where
the main fight exists. It's tricky enough fixing an interface between
two systems, but we've got three here.

- Jamie
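
One way to run the no-jail test suggested above might look like the sketch below. It is not from the thread. It assumes /jail/base is already NFS-mounted as in the quoted host fstab. The scratch mount point in TEST_MNT and the DRY_RUN, CYCLES and SRC variables are hypothetical names introduced here for illustration. With DRY_RUN left at its default of 1, the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: repeat the nullfs-over-NFS mount/unmount cycle with no jail involved.
# Assumptions (not from the thread): SRC is the NFS-backed directory from the
# host fstab; TEST_MNT is a hypothetical scratch mount point; DRY_RUN=1 (the
# default) prints commands instead of executing them.

SRC="${SRC:-/jail/base}"
TEST_MNT="${TEST_MNT:-/var/jail/nullfs-test}"
DRY_RUN="${DRY_RUN:-1}"
CYCLES="${CYCLES:-3}"

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# One mount / touch / unmount cycle; stops at the first failing step.
cycle() {
    run mkdir -p "$TEST_MNT" || return 1
    run mount -t nullfs -o ro "$SRC" "$TEST_MNT" || return 1
    # Read the directory so the mount is actually used before unmounting.
    run ls "$TEST_MNT" || return 1
    run umount "$TEST_MNT" || return 1
}

i=1
while [ "$i" -le "$CYCLES" ]; do
    echo "cycle $i"
    cycle || { echo "cycle $i failed"; exit 1; }
    i=$((i + 1))
done
echo "all $CYCLES cycles completed"
```

Running it as root with DRY_RUN=0 performs the real mounts. If the second or third cycle fails at the umount step with "Device busy", that would suggest the problem can be reproduced without a jail; if every cycle succeeds, the jail is more likely part of the interaction.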