From owner-freebsd-stable  Mon Feb 11 12: 6:47 2002
Delivered-To: freebsd-stable@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP
	id 63D1637B400; Mon, 11 Feb 2002 12:06:22 -0800 (PST)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.11.6/8.9.1) id g1BK6Dw56980;
	Mon, 11 Feb 2002 12:06:13 -0800 (PST)
	(envelope-from dillon)
Date: Mon, 11 Feb 2002 12:06:13 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200202112006.g1BK6Dw56980@apollo.backplane.com>
To: Bruce Evans <bde@zeta.org.au>
Cc: Matteo <drummino@yahoo.com>, <freebsd-stable@FreeBSD.ORG>,
	<freebsd-bugs@FreeBSD.ORG>, <aiuto@gufi.org>
Subject: Re: Crash System with MSDOS file-system driver!
References:  <20020211221141.R682-100000@gamplex.bde.org>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-stable.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-stable>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-stable>
X-Loop: FreeBSD.ORG

    Well, I think you did find a bug in VFS/BIO.  Even UFS will unnecessarily
    break if a fully valid buffer is written and the write fails, due to
    b_resid being left non-zero (B_ERROR is cleared in that case and the
    write is re-queued).

    But we have a problem here.  You can't gratuitously clear b_resid for
    the B_CACHE case.  NFS depends on it being persistent for VLNK handling.
    In fact, NFS even assumes it is persistent for directory reading but
    since VMIO-backed buffers can be dropped and later reconstituted (zeroing
    b_resid), there is a hack in there to magically fix it for the VDIR
    case.

    This ability to drop and reconstitute the buffer implies that we can
    legally zero b_resid in the B_CACHE|B_VMIO case, but we cannot legally
    clear b_resid in the non-VMIO case.

    I also don't think it's a good idea to just munge breadn() and leave
    bread() alone.

    What I recommend is that the b_resid clearing code actually be put in
    getblk() itself but only for the B_CACHE|B_VMIO case, and with a big
    comment describing why it is there.  If you want to work up a patch
    for that, Bruce, I'll be happy to help test it.  Or I can just do it
    if you don't want to.  The code is almost exactly the same.

    I do not understand the last bit of your patch that converts a non-error'd
    short read into an EIO.  This case cannot happen on the first bp with
    either UFS or MSDOSFS because they explicitly conditionalize the breadn()
    call for the case where there is at least one full buffer's worth of data
    is left in the file or directory.  Consequently the first buffer will
    never have a short read.  The API, in general, does not explicitly
    disallow EOF occuring in the first buffer so unless you believe an API
    change is in order and clearly document it in breadn()'s comments, you 
    should not commit this patch.  (For example, NFS doesn't use breadn()
    but if it did it would likely assume that EOF could occur anywhere, even
    in the first buffer).

						-Matt

:On Fri, 8 Feb 2002, Matteo wrote:
:
:> Hi all,
:>
:> I've 4.5-RELEASE and 4.4-STABLE.
:>
:> Try to follow this steps:
:>
:> 1) Mount a msdos floppy write protected in /floppy
:> 2) cd /floppy
:> 3) rm somefile
:> At this point kernel show this message:
:>
:> fd0c: hard error writing fsbn 56 of 56-63 (ST0
:> 40<abnrml> ST1 2<write_prosend out a error messages...
:> etc etc
:>
:> Next Step:
:>
:> 4) ls
:>
:> And system halt! I try with a UFS floppy and it's all
:> ok.
:
:%%%
:The hang is caused by msdosfs_readdir() "handling" reads of 0 bytes by
:spinning forever.  msdosfs_read() seems to have the same bug.
:ufs_readdir() works because it calls VOP_READ() == ffs_read() which
:treats reads of 0 bytes as EOF instead of spinning forever.
:
:Reads of 0 bytes and other short reads shouldn't happen, but they
:happen because failed writes clobber b_resid and breadn() later returns
:with the clobbered value.  Filesystems use b_resid to determine the
:extent of i/o in a few places (all (?) cloned from ufs_readwrite.c).
:This is probably wrong and very incomplete -- most places, including
:critical block allocation code, just assume that breadn() returns an
:error if everything could not be read.
:
:The patch mainly just resets b_resid in breadn().  Determining the
:value after the i/o that filled the buffer doesn't seem to be easy,
:but it should be 0.
:
:Index: vfs_bio.c
:===================================================================
:RCS file: /home/ncvs/src/sys/kern/vfs_bio.c,v
:retrieving revision 1.296
:diff -u -r1.296 vfs_bio.c
:--- vfs_bio.c	31 Jan 2002 18:39:44 -0000	1.296
:+++ vfs_bio.c	11 Feb 2002 11:09:22 -0000
:@@ -614,6 +614,13 @@
: 		vfs_busy_pages(bp, 0);
: 		VOP_STRATEGY(vp, bp);
: 		++readwait;
:+	} else {
:+		/*
:+		 * Recover from failed writes clobbering b_resid.  b_resid is
:+		 * strictly only valid after i/o, but some filesystems expect
:+		 * it to give the part of `size' that was not readable here.
:+		 */
:+		bp->b_resid = 0;
: 	}
:
: 	for (i = 0; i < cnt; i++, rablkno++, rabsize++) {
:@@ -640,6 +647,16 @@
:
: 	if (readwait) {
: 		rv = bufwait(bp);
:+		/*
:+		 * Convert short reads to i/o errors, since short reads aren't
:+		 * useful for the buffer cache and they are mostly not handled
:+		 * in callers.
:+		 * XXX we don't always get here for read-ahead blocks.
:+		 * b_resid for read-ahead blocks may be clobbered by failed
:+		 * attempts to write the blocks.
:+		 */
:+		if (rv == 0 && bp->b_resid != 0)
:+			rv = EIO;
: 	}
: 	return (rv);
: }
:%%%
:
:Bruce
:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message