From owner-freebsd-hackers  Fri Oct 18 11:35:57 2002
Date: Fri, 18 Oct 2002 11:35:54 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200210181835.g9IIZsBX061970@apollo.backplane.com>
To: Maxim Sobolev <sobomax@FreeBSD.org>
Cc: hackers@FreeBSD.org
Subject: Re: Patch to allow a driver to report unrecoverable write errors to the buf layer
References: <3DB048B5.21097613@FreeBSD.org> <200210181807.g9II7cBY024485@apollo.backplane.com> <3DB0516F.9BE00F57@FreeBSD.org>


:> :
:> :There is a very easy way to trigger the problem: insert blank floppy
:> :...
:> 
:>     Your patch looks slightly incomplete to me, but the concept is reasonable.
:>     The BIO_NORETRY test that sets B_INVAL should probably be done in
:>     brelse(), not in bufwait().  It is the code in brelse() that actually
:>     does the re-dirtying of the buffer in case of a write-error.
:
:Ah, actually I initially put it into brelse(), but then reconsidered
:and moved it down into bufwait(). I'll move it back. ;)

    Heh heh.  Well, it seems to me that since it is the BUF abstraction
    that has the error check / redirtying / retry code, then the BUF
    abstraction should probably be responsible for the no-retry case as
    well.  The BIO abstraction is really designed to hold an I/O operation,
    not really to hold meta operations.  You could still specify a BIO
    flag for it since it's a media hack of sorts, but the BUF code should
    be responsible for processing it.
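
    To make that split concrete, here is a minimal user-space sketch of
    the release-time decision.  It is not the kernel code.  The struct,
    the SK_-prefixed flag names and their values, and sk_brelse() are all
    hypothetical stand-ins (for struct buf, B_DELWRI, B_INVAL, BIO_ERROR,
    the proposed BIO_NORETRY, and brelse()), and everything else brelse()
    does is left out.  The driver only sets a flag on the I/O; the buffer
    layer decides what happens to the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch; NOT the real kernel definitions. */
#define SK_B_DELWRI    0x01u  /* buffer is dirty, a write is still pending */
#define SK_B_INVAL     0x02u  /* buffer contents are to be thrown away */
#define SK_BIO_ERROR   0x01u  /* the I/O operation failed */
#define SK_BIO_NORETRY 0x02u  /* driver reports that a retry cannot succeed */

enum sk_cmd { SK_READ, SK_WRITE };

/* Simplified stand-in for struct buf: only the fields the decision needs. */
struct sk_buf {
    enum sk_cmd cmd;       /* which operation was issued */
    unsigned    b_flags;   /* buffer-layer state */
    unsigned    b_ioflags; /* result flags filled in on I/O completion */
    int         b_error;   /* errno-style error code, 0 if none */
};

/*
 * Model of the release path: on a failed write, either re-dirty the
 * buffer so the write is retried later, or invalidate it when the
 * driver has flagged the error as unrecoverable.
 */
void sk_brelse(struct sk_buf *bp)
{
    assert(bp != NULL);

    bool write_failed =
        bp->cmd == SK_WRITE && (bp->b_ioflags & SK_BIO_ERROR) != 0;

    if (!write_failed || (bp->b_flags & SK_B_INVAL) != 0)
        return;                        /* nothing special to do */

    if (bp->b_ioflags & SK_BIO_NORETRY) {
        /* No-retry case: give up on the data and drop the buffer. */
        bp->b_flags |= SK_B_INVAL;
        bp->b_flags &= ~SK_B_DELWRI;
    } else {
        /* Default case: clear the error and keep the data dirty. */
        bp->b_ioflags &= ~SK_BIO_ERROR;
        bp->b_error = 0;
        bp->b_flags |= SK_B_DELWRI;
    }
}
```

    Calling sk_brelse() on a failed write without SK_BIO_NORETRY leaves
    the buffer marked SK_B_DELWRI; with SK_BIO_NORETRY set it ends up
    SK_B_INVAL instead, and the error code stays in place for the caller.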

    I dunno about a formal abstraction.  We need to differentiate between
    media which can and cannot remap blocks.  A 'perfect' solution
    would be far more complex.  File data blocks would have to be
    remapped at the filesystem level and meta-data would have to be
    invalidated in-core (bitmap, inode blocks with write errors), and
    the filesystem would have to be marked dirty on unmount.  Then unmount
    could safely destroy the buffers representing the write-error'd
    meta-data.

    The VFS layer would definitely need to be involved.  We have the
    advantage in that the buffer cache is already logically mapped, but
    it would still be a fairly sophisticated piece of work.

:>     This re-dirtying is necessary in most cases to prevent filesystem
:>     corruption.  Otherwise the buffer may be thrown away and a re-read
:>     may return the original pre-modified data, causing massive filesystem
:>     corruption elsewhere (consider what that would mean for a bitmap block).
:> 
:>     I think it's perfectly reasonable to do away with the buffer in the
:>     case of a floppy error, though.
:
:Thanks!
:
:-Maxim
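
    The bitmap-block concern quoted above can be shown with a toy model.
    This is only an illustration under made-up assumptions: a one-byte
    allocation bitmap, a single cached copy, and a write that always
    fails.  None of the names below exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-byte allocation bitmap with a single cached copy. */
struct sk_bitmap_cache {
    uint8_t on_disk;  /* what the media currently holds */
    uint8_t in_core;  /* cached copy, possibly modified */
    bool    valid;    /* true if in_core holds usable data */
};

/* Allocate the lowest free bit; returns its index, or -1 if none is free. */
int sk_alloc_block(struct sk_bitmap_cache *c)
{
    assert(c != NULL);

    if (!c->valid) {              /* cache miss: re-read from the media */
        c->in_core = c->on_disk;
        c->valid = true;
    }
    for (int i = 0; i < 8; i++) {
        if ((c->in_core & (1u << i)) == 0) {
            c->in_core |= (uint8_t)(1u << i);
            return i;
        }
    }
    return -1;
}

/*
 * Model a write that fails, so on_disk is never updated.  'discard'
 * selects the policy: throw the buffer away, or keep it dirty in core.
 */
void sk_failed_write(struct sk_bitmap_cache *c, bool discard)
{
    assert(c != NULL);

    if (discard)
        c->valid = false;  /* modification is lost; next use re-reads */
    /* otherwise the modified copy stays cached for a later retry */
}
```

    With the buffer kept, two successive allocations around a failed
    write return different blocks.  With the buffer discarded, the
    re-read brings back the pre-modified bitmap and the same block is
    handed out twice.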

    Just a bit of history.  Originally the buffer cache did not retry error'd
    out writes.  I changed it several years ago because the mechanism
    was producing massive filesystem corruption in the face of disk write
    errors.  The floppy issue was a known issue at the time and I am quite
    happy that someone is tackling the problem now!

						-Matt

