From owner-svn-src-all@freebsd.org Fri Mar 9 22:09:45 2018
Sender: Mark Johnston
Date: Fri, 9 Mar 2018 17:09:40 -0500
From: Mark Johnston
To: David Bright
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r328013 - head/sbin/fsck_ffs
Message-ID: <20180309220940.GG6174@raichu>
References: <201801151925.w0FJPCKA019434@repo.freebsd.org>
In-Reply-To: <201801151925.w0FJPCKA019434@repo.freebsd.org>

On Mon, Jan 15, 2018 at 07:25:11PM +0000, David Bright wrote:
> Author: dab
> Date: Mon Jan 15 19:25:11 2018
> New Revision: 328013
> URL: https://svnweb.freebsd.org/changeset/base/328013
>
> Log:
>   Exit fsck_ffs with non-zero status when file system is not repaired.
>
>   When the fsck_ffs program cannot fully repair a file system, it will
>   output the message PLEASE RERUN FSCK.
>   However, it does not exit with a
>   non-zero status in this case (contradicting the man page claim that it
>   "exits with 0 on success, and >0 if an error occurs." The fsck
>   rc-script (when running "fsck -y") tests the status from fsck (which
>   passes along the exit status from fsck_ffs) and issues a "stop_boot"
>   if the status fails. However, this is not effective since fsck_ffs can
>   return zero even on (some) errors. Effectively, it is left to a later
>   step in the boot process when the file systems are mounted to detect
>   the still-unclean file system and stop the boot.
>
>   This change modifies fsck_ffs so that when it cannot fully repair the
>   file system and issues the PLEASE RERUN FSCK message it also exits
>   with a non-zero status.
>
>   While here, the fsck_ffs man page has also been updated to document
>   the failing exit status codes used by fsck_ffs. Previously, only exit
>   status 7 was documented. Some of these exit statuses are tested for in
>   the fsck rc-script, so they are clearly depended upon and deserve
>   documentation.

etc/rc.d/fsck doesn't know how to interpret the new exit code and now
just drops to a single-user shell when it is encountered. This is
happening to me semi-regularly when my test systems crash, especially
when I test kernel panic handling. :)

Is there any reason etc/rc.d/fsck shouldn't automatically retry (up to
some configurable number of retries) when the new error code is seen?
The patch below seems to do the trick for me:

diff --git a/etc/defaults/rc.conf b/etc/defaults/rc.conf
index 584e842bba2c..63d2fcc0be8d 100644
--- a/etc/defaults/rc.conf
+++ b/etc/defaults/rc.conf
@@ -95,6 +95,7 @@ root_rw_mount="YES"		# Set to NO to inhibit remounting root read-write.
 root_hold_delay="30"		# Time to wait for root mount hold release.
 fsck_y_enable="NO"		# Set to YES to do fsck -y if the initial preen fails.
 fsck_y_flags="-T ffs:-R -T ufs:-R"	# Additional flags for fsck -y
+fsck_retries="3"		# Number of times to retry fsck before giving up.
 background_fsck="YES"		# Attempt to run fsck in the background where possible.
 background_fsck_delay="60"	# Time to wait (seconds) before starting the fsck.
 growfs_enable="NO"		# Set to YES to attempt to grow the root filesystem on boot
diff --git a/etc/rc.d/fsck b/etc/rc.d/fsck
index bd3122a20110..708d92228e3d 100755
--- a/etc/rc.d/fsck
+++ b/etc/rc.d/fsck
@@ -14,8 +14,82 @@ desc="Run file system checks"
 start_cmd="fsck_start"
 stop_cmd=":"
 
+_fsck_run()
+{
+	local err
+
+	if checkyesno background_fsck; then
+		fsck -F -p
+	else
+		fsck -p
+	fi
+
+	err=$?
+	if [ ${err} -eq 3 ]; then
+		echo "Warning! Some of the devices might not be" \
+		    "available; retrying"
+		root_hold_wait
+		check_startmsgs && echo "Restarting file system checks:"
+		if checkyesno background_fsck; then
+			fsck -F -p
+		else
+			fsck -p
+		fi
+		err=$?
+	fi
+
+	case ${err} in
+	0)
+		;;
+	2)
+		stop_boot
+		;;
+	4)
+		echo "Rebooting..."
+		reboot
+		echo "Reboot failed; help!"
+		stop_boot
+		;;
+	8)
+		if checkyesno fsck_y_enable; then
+			echo "File system preen failed, trying fsck -y ${fsck_y_flags}"
+			fsck -y ${fsck_y_flags}
+			case $? in
+			0)
+				;;
+			*)
+				echo "Automatic file system check failed; help!"
+				stop_boot
+				;;
+			esac
+		else
+			echo "Automatic file system check failed; help!"
+			stop_boot
+		fi
+		;;
+	12)
+		echo "Boot interrupted."
+		stop_boot
+		;;
+	16)
+		echo "File system check retry requested."
+		;;
+	130)
+		stop_boot
+		;;
+	*)
+		echo "Unknown error ${err}; help!"
+		stop_boot
+		;;
+	esac
+
+	return $err
+}
+
 fsck_start()
 {
+	local err tries
+
 	if [ "$autoboot" = no ]; then
 		echo "Fast boot: skipping disk checks."
 	elif [ ! -r /etc/fstab ]; then
@@ -25,67 +99,13 @@ fsck_start()
 		trap : 3
 
 		check_startmsgs && echo "Starting file system checks:"
-		if checkyesno background_fsck; then
-			fsck -F -p
-		else
-			fsck -p
-		fi
-
-		err=$?
-		if [ ${err} -eq 3 ]; then
-			echo "Warning! Some of the devices might not be" \
-			    "available; retrying"
-			root_hold_wait
-			check_startmsgs && echo "Restarting file system checks:"
-			if checkyesno background_fsck; then
-				fsck -F -p
-			else
-				fsck -p
-			fi
+		tries=$fsck_retries
+		while [ $tries -gt 0 ]; do
+			_fsck_run
 			err=$?
-		fi
-
-		case ${err} in
-		0)
-			;;
-		2)
-			stop_boot
-			;;
-		4)
-			echo "Rebooting..."
-			reboot
-			echo "Reboot failed; help!"
-			stop_boot
-			;;
-		8)
-			if checkyesno fsck_y_enable; then
-				echo "File system preen failed, trying fsck -y ${fsck_y_flags}"
-				fsck -y ${fsck_y_flags}
-				case $? in
-				0)
-					;;
-				*)
-					echo "Automatic file system check failed; help!"
-					stop_boot
-					;;
-				esac
-			else
-				echo "Automatic file system check failed; help!"
-				stop_boot
-			fi
-			;;
-		12)
-			echo "Boot interrupted."
-			stop_boot
-			;;
-		130)
-			stop_boot
-			;;
-		*)
-			echo "Unknown error ${err}; help!"
-			stop_boot
-			;;
-		esac
+			[ $err -eq 16 ] || break
+			tries=$(($tries - 1))
+		done
 	fi
 }
 
diff --git a/share/man/man5/rc.conf.5 b/share/man/man5/rc.conf.5
index c27a2134e6bc..c9a16ca9f65c 100644
--- a/share/man/man5/rc.conf.5
+++ b/share/man/man5/rc.conf.5
@@ -24,7 +24,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd February 15, 2018
+.Dd March 9, 2018
 .Dt RC.CONF 5
 .Os
 .Sh NAME
@@ -2053,6 +2053,11 @@
 will be run with the
 .Fl y
 flag if the initial preen of the file systems fails.
+.It Va fsck_retries
+.Pq Vt int
+Maximum number of times to re-run
+.Xr fsck 8
+if its exit status indicates that a re-run is required.
 .It Va background_fsck
 .Pq Vt bool
 If set to