Date:      Thu, 1 Dec 2011 20:34:37 -0800
From:      Devin Teske <devin.teske@fisglobal.com>
To:        <freebsd-rc@freebsd.org>
Cc:        Ken Smith <kensmith@buffalo.edu>, "Parker-Smith, Dave" <daveps@vicor.com>, phk@freebsd.org, 'Julian Elischer' <julian@freebsd.org>
Subject:   mount(8) bug? rc.d/mountlate bug? bug in both?
Message-ID:  <039201ccb0ab$b3db9470$1b92bd50$@fisglobal.com>

Hi -RC@, Julian, Poul, and Ken,

We need your help on FreeBSD-8.1!

Please read the following dossier on our issue with simply attempting to add a
single NFS mount to fstab(5) ***without*** the side-effect of rebooting into
single-user mode should that mount fail for ANY reason during boot.

FULL-DISCLOSURE: We've already tried marking the filesystem as "late" and/or
"bg" to no avail. We've traced the problem down to a possible bug in either
mount(8) or the `/etc/rc.d/mountlate' boot-script. Need confirmation that this
is a bug, OR a work-around to eliminate the numerous edge-cases where we can
reliably cause the system to boot into single-user mode.

ASIDE: We're longing for the days of FreeBSD-4 where NFS would simply fail and
boot would still continue. Meaning that eventually you could service the system
remotely -- logging in to fix the bad mounts (be it caused by typo, machine that
went missing on the net, or any other reason why the mount is no longer valid).

========= ISSUE DOSSIER BELOW (THANKS AS ALWAYS) =========

The desire:

You want to add an NFS mount to /etc/fstab so that it's mounted at boot.

However, you also want to make it so that in-NO-way can you end up dropping to
single-user mode because of this mount going bad (for any reason).



The problem:

Several scenarios can result in dropping to single-user mode... such as:

1. The hostname for the mount is not resolvable in DNS

2. The IP address for the mount is not routable

3. The machine providing the mount is not running NFS

4. The machine providing the mount returns "permission denied"

5. The machine providing the mount takes too long to respond

... any of which can happen in any number of ways, such as a typo in
`/etc/resolv.conf' before a remote reboot, et cetera (after which, good luck
getting back into the box to fix resolv.conf(5), as the system would now be in
single-user mode).

You might argue that all of these scenarios SHOULD result in dropping to
single-user mode on reboot. However, the topic of this e-mail is not whether
this should be the case, but rather HOW to make it NOT the case (if possible,
without a code change) for our needs.



Corollary:

Having a workstation 3000+ miles away in India reboot into single-user mode
simply because of a momentary network hiccup (or any other situation that could
cause failure of the NFS mount) at boot is what we're trying to avoid. That is,
we want to avoid the situation where a physically distant system becomes
permanently unresponsive, requiring significant expenditure/effort to rectify.



Discussion:

We're already aware of (and have tried) the "bg" NFS-specific filesystem flag.

According to the mount_nfs(8) manual, the "bg" option SHOULD be enough to make
the filesystem NOT be critical to booting, yet in practice adding this flag does
NOT prevent the system from dropping to single-user mode (more below).
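
For reference, an fstab(5) entry of the kind described above might look like
the following. This is only an illustration: the server name "nfshost", the
export path, and the mount point are hypothetical and do not come from our
actual configuration.

```
# Device            Mountpoint  FStype  Options     Dump  Pass#
nfshost:/export     /mnt/nfs    nfs     rw,bg,late  0     0
```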



Possible Bug:

As the system is booting, /etc/rc.d/mountcritremote attempts to mount the
filesystem. It fails. This is OK (because mountcritremote does not return
FAILURE status -- it returns SUCCESS and boot proceeds as expected).

Later, /etc/rc.d/mountlate runs and attempts to mount it again. It fails again,
except this time mountlate calls "stop_boot" after the failure (dropping us to
single-user mode).

The "possible bug" comes into play in reading /etc/rc.d/mountlate and finding
out just how exactly it determines that it should have been mounting this
filesystem in the first place.

mountlate calls "/sbin/mount -d -a -l" to determine if there are any "late"
filesystems to mount.

The filesystem is NOT marked as "late", but "/sbin/mount -d -a -l" will still
report it because it's not yet mounted.

This is where we need to read mount(8) to learn that "-l" doesn't mean it will
report on ONLY late filesystems, but rather ALSO report on late filesystems.
From mount(8):

	-l	When used in conjunction with -a option, also mount those
		file systems which are marked as ``late''.
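
To make the "also" versus "only" distinction concrete, here is a minimal sh
sketch of the selection rule as the manual describes it. It is not the mount(8)
implementation: the function name select_auto_mounts is hypothetical, it reads
fstab(5)-formatted lines on stdin, and it ignores details such as skipping
filesystems that are already mounted.

```shell
#!/bin/sh
# Hypothetical model of which fstab entries "mount -a" considers.
# Usage: select_auto_mounts [-l] < fstab-formatted-input
# Assumed field order: device, mount point, fstype, options, dump, pass.
select_auto_mounts()
{
	include_late=no
	if [ "$1" = "-l" ]; then
		include_late=yes
	fi
	while read -r dev mnt fstype opts rest; do
		case "$dev" in
		''|\#*) continue ;;		# skip blank lines and comments
		esac
		case ",$opts," in
		*,noauto,*) continue ;;		# never mounted by -a
		esac
		case ",$opts," in
		*,late,*)
			# "late" entries are included only when -l is given
			[ "$include_late" = yes ] || continue
			;;
		esac
		echo "$mnt"
	done
}
```

With "-l", the entries not marked "late" are still selected, which is why an
unmounted "bg" filesystem shows up in the list that mountlate examines.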

So it becomes clear that the "bg" option is not effective in making an NFS
filesystem non-critical because /etc/rc.d/mountlate isn't excluding filesystems
that have the "bg" flag.

So, the "possible bug" is that:

mountlate should go through the filesystems returned by mount(8) and check the
options itself for "bg", skipping those filesystems with this option.
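
A rough sh sketch of that check follows. It is only an illustration under
stated assumptions, not a patch to /etc/rc.d/mountlate: the function name
list_non_bg_mounts is hypothetical, and it reads fstab(5)-formatted lines on
stdin rather than parsing the output of "/sbin/mount -d -a -l", whose exact
format is not assumed here.

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of every fstab entry whose
# option list does NOT contain "bg". Entries with "bg" are treated as
# non-critical and skipped.
# Assumed field order: device, mount point, fstype, options, dump, pass.
list_non_bg_mounts()
{
	while read -r dev mnt fstype opts rest; do
		case "$dev" in
		''|\#*) continue ;;	# skip blank lines and comments
		esac
		case ",$opts," in
		*,bg,*) continue ;;	# background mount: do not treat as fatal
		esac
		echo "$mnt"
	done
}
```

A caller could then restrict the "stop_boot" decision to the mount points this
prints, so that a failed "bg" mount alone would not abort the boot.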
-- 
Devin



