Date:      Thu, 5 May 2016 13:41:23 +0200
From:      Edward Tomasz Napierała <trasz@FreeBSD.org>
To:        Graham Menhennitt <graham@menhennitt.com.au>
Cc:        freebsd-current@freebsd.org
Subject:   Re: boot fails "Can't stat /dev/da0a: No such file or directory"
Message-ID:  <20160505114123.GA1289@brick>
In-Reply-To: <4ec4e751-5ac7-d793-9356-5af4327b824d@menhennitt.com.au>
References:  <57247294.4050607@menhennitt.com.au> <57266A7E.1070500@menhennitt.com.au> <20160503084251.GB5892@brick> <57286DC6.3010403@menhennitt.com.au> <20160503095925.GC5892@brick> <4ec4e751-5ac7-d793-9356-5af4327b824d@menhennitt.com.au>

On 0505T1847, Graham Menhennitt wrote:
> On 3/05/2016 07:59 PM, Edward Tomasz Napierała wrote:
> > On 0503T1922, Graham Menhennitt wrote:
> >> On 3/05/2016 06:42 PM, Edward Tomasz Napierała wrote:
> >>> On 0502T0643, Graham Menhennitt wrote:
> >>>> On 30/04/2016 06:53 PM, Graham Menhennitt wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> I have a USB disk that I use for backup. Up till now, it's mounted
> >>>>> without any problems at boot time. After updating to -current as of
> >>>>> yesterday, it doesn't mount and causes the boot to fail.
> >>>>>
> >>>>> My /etc/fstab looks like:
> >>>>>
> >>>>>     # Device    Mountpoint    FStype    Options    Dump    Pass#
> >>>>>
> >>>>>     /dev/ada0s1a    /        ufs    rw    1    1
> >>>>>     /dev/ada0s1b    none        swap    sw    0    0
> >>>>>     /dev/da0a    /backup        ufs    rw,late    1    1
> >>>>>
> >>>>>
> >>>>> I tried adding the "late" to fix the problem, but it doesn't help.
> >>>>>
> >>>>> The error message is:
> >>>>>
> >>>>>     /dev/ada0s1a: clean...
> >>>>>     Can't stat /dev/da0a: No such file or directory
> >>>>>     Unknown error; help!
> >>>>>     ERROR: ABORTING BOOT (sending SIGTERM to parent)!
> >>>>>
> >>>>>
> >>>>> (hand transcribed - maybe typos)
> >>>>>
> >>>>> Can anybody help, please?
> >>>>>
> >>>>> Thanks,
> >>>>>     Graham
> >>>> Sorry, I forgot to mention...
> >>>>
> >>>> I commented out that line from fstab which allows the boot to complete.
> >>>> I can then manually mount it without any problems. It looks like the
> >>>> device doesn't get created early enough.
> >>> Have you run mergemaster after upgrade?  In particular, do you have
> >>> the current version of /etc/rc.d/mountcritlocal?
> >>>
> >> Thanks for replying, Edward. Yes I've installed that file. The delay
> >> that Dave told me about has fixed the problem.
> > Still, it would be nice if this worked by default.  The updated
> > mountcritlocal script should wait for USB to release root tokens
> > if the mount initially fails.
> >
> Ok, I tried to do a bit of diagnosis here. I took out the delay from
> /boot/loader.conf and I added "set -x" to the top of
> /etc/rc.d/mountcritlocal (before the start of the mountcritlocal_start()
> function definition). I then rebooted. I didn't see any shell command
> output from the "set -x" before the error occurred. That means that the
> error is happening before /etc/rc.d/mountcritlocal is being read.
> 
> When I put the delay back in and boot, I see the shell commands after
> the filesystems are mounted (and, hence, after the error would have
> occurred if the delay wasn't there). So I don't think mountcritlocal is
> going to help me.
> 
> I'm not sure what else to try. If you have any suggestions, I can do
> some experimenting. Is there a simple way to capture the output from the
> rc.d scripts?

Huh, you've nailed it: the culprit was a different script,
/etc/rc.d/fsck, which runs before mountcritlocal.  Could you try the
following patch?  You can apply it directly to /etc/rc.d/fsck:

Index: etc/rc.d/fsck
===================================================================
--- etc/rc.d/fsck	(revision 299115)
+++ etc/rc.d/fsck	(working copy)
@@ -14,6 +14,35 @@ desc="Run file system checks"
 start_cmd="fsck_start"
 stop_cmd=":"
 
+# Originally, root mount hold had to be released before mounting
+# the root filesystem.  This delayed the boot, so it was changed
+# to only wait if the root device isn't readily available.  This
+# can result in this script executing before all the devices - such
+# as graid(8) - are available.  Thus, should the mount fail,
+# we will wait for the root mount hold release and retry.
+root_hold_wait()
+{
+	waited=0
+	while true; do
+		holders="$(sysctl -n vfs.root_mount_hold)"
+		if [ -z "${holders}" ]; then
+			break;
+		fi
+		if [ ${waited} -eq 0 ]; then
+			echo -n "Waiting ${root_hold_delay}s" \
+			"for the root mount holders: ${holders}"
+		else
+			echo -n .
+		fi
+		if [ ${waited} -ge ${root_hold_delay} ]; then
+			echo
+			break
+		fi
+		sleep 1
+		waited=$(($waited + 1))
+	done
+}
+
 fsck_start()
 {
 	if [ "$autoboot" = no ]; then
@@ -31,7 +60,21 @@ fsck_start()
 			fsck -p
 		fi
 
-		case $? in
+		err=$?
+		if [ ${err} -eq 3 ]; then
+			echo "Warning! Some of the devices might not be" \
+			    "available; retrying"
+			root_hold_wait
+			check_startmsgs && echo "Restarting file system checks:"
+			if checkyesno background_fsck; then
+				fsck -F -p
+			else
+				fsck -p
+			fi
+			err=$?
+		fi
+
+		case ${err} in
 		0)
 			;;
 		2)

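The wait-and-retry logic in the patch can also be tried outside rc.d.
What follows is only a minimal sketch, not the committed code: HOLD_CMD,
HOLD_DELAY, CHECK_CMD, wait_for_holders and check_with_retry are
hypothetical names introduced here for illustration.  It assumes, as the
patch does, that exit status 3 from the check means a device was missing.

```shell
#!/bin/sh
# Hypothetical stand-ins, not part of the patch: HOLD_CMD prints the
# current root mount holders, HOLD_DELAY is the maximum wait in seconds,
# CHECK_CMD is the file system check to run.
: "${HOLD_CMD:=sysctl -n vfs.root_mount_hold}"
: "${HOLD_DELAY:=30}"
: "${CHECK_CMD:=fsck -p}"

# Poll once per second until HOLD_CMD prints nothing or the delay expires.
# Returns 0 if the holders went away, 1 on timeout.
wait_for_holders()
{
	waited=0
	while [ "${waited}" -lt "${HOLD_DELAY}" ]; do
		holders="$(${HOLD_CMD} 2>/dev/null)"
		[ -z "${holders}" ] && return 0
		sleep 1
		waited=$((waited + 1))
	done
	return 1
}

# Run the check; on exit status 3, wait for the holders and retry once.
# Returns the exit status of the last check that was run.
check_with_retry()
{
	err=0
	${CHECK_CMD} || err=$?
	if [ "${err}" -eq 3 ]; then
		wait_for_holders || true
		err=0
		${CHECK_CMD} || err=$?
	fi
	return "${err}"
}
```

Overriding HOLD_CMD and CHECK_CMD with harmless commands makes it
possible to exercise the retry path without touching a real disk.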