From: Terry Lambert
Message-Id: <199611061733.KAA08415@phaeton.artisoft.com>
Subject: Re: Davidg bug (was: mount panics & hangs)
To: julian@whistle.com (Julian Elischer)
Date: Wed, 6 Nov 1996 10:33:11 -0700 (MST)
Cc: archie@whistle.com, freebsd-hackers@freebsd.org
In-Reply-To: <327FE834.167EB0E7@whistle.com> from "Julian Elischer" at Nov 5, 96 05:21:56 pm

> dounmount(mp, flags, p)
>         register struct mount *mp;
>         int flags;
>         struct proc *p;
> {
>         struct vnode *coveredvp;
>         int error;
>
>         coveredvp = mp->mnt_vnodecovered;
>         if (vfs_busy(mp))
>                 return (EBUSY);
                  ^^^^^^^^^^^^^^^^^^^^^^^
>         mp->mnt_flag |= MNT_UNMOUNT;
>         error = vfs_lock(mp);
>         if (error)
>                 return (error);         <-------line "C"
                  ^^^^^^^^^^^^^^^^^^^^^^^
> BTW there is another small bug, which is.. the return at line "C"
> should also do a vfs_unbusy()
>
> suggestions?

Add a "NOWAIT" flags value, obey it at the indicated locations, and
don't pass it in this case (only on shutdown).

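As a rough illustration of that suggestion, here is a user-space sketch, not kernel code. Everything prefixed sk_ or SK_ is a hypothetical stand-in invented for the example: SK_NOWAIT plays the role of the proposed "NOWAIT" flag, and the stubs only model "someone else holds this for N more attempts". The waiting branches are plain retry loops where a kernel would sleep. It also folds in Julian's point that the return at line "C" should drop the busy reference; clearing the unmount flag on that path is an extra assumption of the sketch.

```c
#include <errno.h>

#define SK_MNT_UNMOUNT 0x01  /* stand-in for MNT_UNMOUNT */
#define SK_NOWAIT      0x02  /* hypothetical "NOWAIT" flag value */

/* Hypothetical model of a mount point; not the kernel's struct mount. */
struct sk_mount {
    int mnt_flag;
    int busy_for;    /* attempts for which another process keeps it busy */
    int locked_for;  /* attempts for which another process keeps it locked */
    int busied;      /* nonzero while we hold the busy reference */
    int locked;      /* nonzero while we hold the lock */
};

/* Stub: returns nonzero while the mount is busy elsewhere. */
int sk_vfs_busy(struct sk_mount *mp)
{
    if (mp->busy_for > 0) {
        mp->busy_for--;          /* models time passing between attempts */
        return 1;
    }
    mp->busied = 1;
    return 0;
}

/* Stub: drops the busy reference taken by sk_vfs_busy(). */
void sk_vfs_unbusy(struct sk_mount *mp)
{
    mp->busied = 0;
}

/* Stub: returns an errno value while the lock is held elsewhere. */
int sk_vfs_lock(struct sk_mount *mp)
{
    if (mp->locked_for > 0) {
        mp->locked_for--;
        return EBUSY;
    }
    mp->locked = 1;
    return 0;
}

/* Front half of an unmount that obeys SK_NOWAIT at both marked returns. */
int sk_dounmount(struct sk_mount *mp, int flags)
{
    int error;

    while (sk_vfs_busy(mp)) {
        if (flags & SK_NOWAIT)
            return EBUSY;        /* first marked location: fail at once */
        /* without SK_NOWAIT a kernel would sleep here, then retry */
    }

    mp->mnt_flag |= SK_MNT_UNMOUNT;

    while ((error = sk_vfs_lock(mp)) != 0) {
        if (flags & SK_NOWAIT) {
            mp->mnt_flag &= ~SK_MNT_UNMOUNT;  /* assumption: undo the flag */
            sk_vfs_unbusy(mp);   /* the missing vfs_unbusy() at line "C" */
            return error;
        }
        /* without SK_NOWAIT a kernel would sleep here, then retry */
    }
    return 0;                    /* caller now holds busy + lock */
}
```

In this model a shutdown path would call sk_dounmount(mp, SK_NOWAIT) and get EBUSY back immediately if another process is in the way, while an ordinary unmount would pass 0 and wait its turn.
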
In reality, there should be a mutex for the VFS structures, the list of
mounted fs's being one of them, where "dounmount" is called, so you never
have more than one process in the mount code.

The problem is that the vfs_busy/vfs_lock pair creates a race condition
because there is no imposed order of operation. That comes from mixing
the vfsop and vop layers without regard to structural call layering
(vfsop is hierarchically above vop).

So you're right: it's a "thundering herd" problem, where the wrong
process happens to win the race. I suspect that this requires the while
loop to happen so that the priority becomes inverted when both processes
are marked ready-to-run.

Has this problem *ever* been repeated without that while loop to torture
it into happening?


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.