From owner-freebsd-bugs@FreeBSD.ORG  Wed Nov 29 21:51:01 2006
Date: Wed, 29 Nov 2006 13:45:31 -0800 (PST)
From: mjacob@freebsd.org
X-X-Sender: mjacob@ns1.feral.com
To: Remko Lodder <remko@freebsd.org>
In-Reply-To: <200611292130.kATLUI7L073624@freefall.freebsd.org>
Message-ID: <20061129133807.A26564@ns1.feral.com>
References: <200611292130.kATLUI7L073624@freefall.freebsd.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-bugs@freebsd.org
Subject: Re: kern/106030: panic while rebooting with a dead disk


On Wed, 29 Nov 2006, Remko Lodder wrote:
> > I had a mounted UFS disk that went away. I rebooted so as to avoid a panic. Too bad. GEOM panicked
> > on me anyway:
> >
> > Syncing disks, vnodes remaining...2 (da8:isp1:0:6:2): Invalidating pack
> > g_vfs_done():da8a[WRITE(offset=81920, length=4096)]error = 6
>
> Well, it wants to synchronise the data in the caches to the
> disk and cannot find it. I think a panic is the best thing
> to do to prevent any weird things happening.  What else

A panic should be the last resort. If an I/O comes back with an error
indicating that the device has gone, the reasonable thing to do is a
binval on all cached data, a forced close of the affected file table
entries, and notification of all user processes. Most real Unixes that
were hardened from the original v7 product learned to do this. FreeBSD
hasn't.
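
As a rough illustration of that sequence (invalidate the cached data,
force-close the file table entries, let the owners see an error), here is
a user-space toy model in C. It is not FreeBSD kernel code: every name in
it (toy_kernel, toy_binval, toy_force_close, toy_io_done, toy_write) is
made up for this sketch, and the "notification" is assumed to be a sticky
errno left in the closed file slot. The "error = 6" in the log above is
ENXIO, which is what the model keys on.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NBUF  4   /* size of the toy buffer cache */
#define NFILE 4   /* size of the toy open-file table */

struct toy_buf  { int dev; bool valid; bool dirty; };
struct toy_file { int dev; bool open; int error; };

struct toy_kernel {
    struct toy_buf  bufs[NBUF];
    struct toy_file files[NFILE];
};

/* Invalidate every cached block of dev; dirty data is discarded, not retried.
 * Returns the number of dirty blocks that were lost. */
static int toy_binval(struct toy_kernel *k, int dev)
{
    int lost = 0;
    for (int i = 0; i < NBUF; i++) {
        if (k->bufs[i].valid && k->bufs[i].dev == dev) {
            if (k->bufs[i].dirty)
                lost++;
            k->bufs[i].valid = false;
            k->bufs[i].dirty = false;
        }
    }
    return lost;
}

/* Force-close every open file on dev and leave a sticky error behind,
 * which is how the owning process gets notified in this model. */
static void toy_force_close(struct toy_kernel *k, int dev, int error)
{
    for (int i = 0; i < NFILE; i++) {
        if (k->files[i].open && k->files[i].dev == dev) {
            k->files[i].open = false;
            k->files[i].error = error;
        }
    }
}

/* I/O completion: a vanished device is cleaned up, never panicked on.
 * Returns the number of dirty blocks lost so the caller can report it. */
static int toy_io_done(struct toy_kernel *k, int dev, int error)
{
    if (error != ENXIO)
        return 0;               /* not a vanished device: nothing to tear down */
    int lost = toy_binval(k, dev);
    toy_force_close(k, dev, error);
    return lost;
}

/* A later write through a file table slot: 0 on success, else an errno. */
static int toy_write(const struct toy_kernel *k, int fd)
{
    assert(fd >= 0 && fd < NFILE);
    if (k->files[fd].open)
        return 0;
    return k->files[fd].error != 0 ? k->files[fd].error : EBADF;
}
```

In this model a process that still holds a descriptor on the dead device
gets ENXIO from its next write, while files and cached blocks on other
devices are left alone.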

> should be done when the disk it once had mounted goes away?
> You have different problems already when that happens.

As I've repeatedly said, mostly to deaf ears in FreeBSD, a device error 
should never be the cause for panic *unless* there is absolutely no way 
to notify user processes of the error *and* data corruption may have 
silently occurred. Inconvenience to an existing design is not really a 
good argument.

A read error on a device that has disappeared shouldn't cause a panic,
even with a filesystem mounted. A write error on the same device
shouldn't cause a panic either: the error propagates back up the stack to
the actual I/O invocation. If it was write-behind or dirty paging
activity that can no longer be associated with any thread, then whether
to panic is a policy decision that only the invoker of the I/O can make.
Not the device driver. Not the volume manager (which is what GEOM is).
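
The layering described above can be sketched as another user-space toy
in C. None of this is GEOM or CAM code: io_request, the orphan policy
hook, and the toy_* functions are hypothetical names, and the fallback
used when no policy is registered is an assumption of the sketch. The
point it tries to show is only that the lower layers return an errno and
the decision is made where the issuer of the I/O is known.

```c
#include <errno.h>
#include <stddef.h>

enum io_action { IO_OK, IO_RETURN_ERROR, IO_DROP_AND_LOG, IO_PANIC };

/* Policy hook supplied by whoever issued the I/O (e.g. the filesystem). */
typedef enum io_action (*orphan_policy_fn)(int error);

struct io_request {
    int owner_tid;            /* waiting thread, or -1 for write-behind/paging */
    orphan_policy_fn policy;  /* consulted only when there is no owner */
};

/* The driver reports what happened and nothing more. */
static int toy_driver_write(int device_present)
{
    return device_present ? 0 : ENXIO;
}

/* The volume manager passes the driver's result up unchanged. */
static int toy_volmgr_write(int device_present)
{
    return toy_driver_write(device_present);
}

/* Two example policies an issuer might choose between. */
static enum io_action toy_policy_lenient(int error)
{
    (void)error;
    return IO_DROP_AND_LOG;
}

static enum io_action toy_policy_strict(int error)
{
    (void)error;
    return IO_PANIC;
}

/* Completion at the top of the stack, where the issuer is known. */
static enum io_action toy_complete(const struct io_request *req, int error)
{
    if (error == 0)
        return IO_OK;
    if (req->owner_tid >= 0)
        return IO_RETURN_ERROR;      /* a thread is waiting: hand it the errno */
    if (req->policy != NULL)
        return req->policy(error);   /* orphaned I/O: the issuer's policy decides */
    return IO_DROP_AND_LOG;          /* assumed default when no policy is set */
}
```

A synchronous write with a waiting thread always ends in
IO_RETURN_ERROR here; only an orphaned request whose issuer chose
toy_policy_strict can end in IO_PANIC, and neither toy_driver_write nor
toy_volmgr_write has any way to request one.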