Date: Mon, 11 Feb 2002 12:06:13 -0800 (PST) From: Matthew Dillon <dillon@apollo.backplane.com> To: Bruce Evans <bde@zeta.org.au> Cc: Matteo <drummino@yahoo.com>, <freebsd-stable@FreeBSD.ORG>, <freebsd-bugs@FreeBSD.ORG>, <aiuto@gufi.org> Subject: Re: Crash System with MSDOS file-system driver! Message-ID: <200202112006.g1BK6Dw56980@apollo.backplane.com> References: <20020211221141.R682-100000@gamplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Well, I think you did find a bug in VFS/BIO. Even UFS will unnecessarily break if a fully valid buffer is written and the write fails, due to b_resid being left non-zero (B_ERROR is cleared in that case and the write is re-queued). But we have a problem here. You can't gratuitously clear b_resid for the B_CACHE case. NFS depends on it being persistent for VLNK handling. In fact, NFS even assumes it is persistent for directory reading but since VMIO-backed buffers can be dropped and later reconstituted (zeroing b_resid), there is a hack in there to magically fix it for the VDIR case. This ability to drop and reconstitute the buffer implies that we can legally zero b_resid in the B_CACHE|B_VMIO case, but we cannot legally clear b_resid in the non-VMIO case. I also don't think it's a good idea to just munge breadn() and leave bread() alone. What I recommend is that the b_resid clearing code actually be put in getblk() itself but only for the B_CACHE|B_VMIO case, and with a big comment describing why it is there. If you want to work up a patch for that, Bruce, I'll be happy to help test it. Or I can just do it if you don't want to. The code is almost exactly the same. I do not understand the last bit of your patch that converts a non-error'd short read into an EIO. This case cannot happen on the first bp with either UFS or MSDOSFS because they explicitly conditionalize the breadn() call for the case where there is at least one full buffer's worth of data is left in the file or directory. Consequently the first buffer will never have a short read. The API, in general, does not explicitly disallow EOF occuring in the first buffer so unless you believe an API change is in order and clearly document it in breadn()'s comments, you should not commit this patch. (For example, NFS doesn't use breadn() but if it did it would likely assume that EOF could occur anywhere, even in the first buffer). -Matt :On Fri, 8 Feb 2002, Matteo wrote: : :> Hi all, :> :> I've 4.5-RELEASE and 4.4-STABLE. :> :> Try to follow this steps: :> :> 1) Mount a msdos floppy write protected in /floppy :> 2) cd /floppy :> 3) rm somefile :> At this point kernel show this message: :> :> fd0c: hard error writing fsbn 56 of 56-63 (ST0 :> 40<abnrml> ST1 2<write_prosend out a error messages... :> etc etc :> :> Next Step: :> :> 4) ls :> :> And system halt! I try with a UFS floppy and it's all :> ok. : :%%% :The hang is caused by msdosfs_readdir() "handling" reads of 0 bytes by :spinning forever. msdosfs_read() seems to have the same bug. :ufs_readdir() works because it calls VOP_READ() == ffs_read() which :treats reads of 0 bytes as EOF instead of spinning forever. : :Reads of 0 bytes and other short reads shouldn't happen, but they :happen because failed writes clobber b_resid and breadn() later returns :with the clobbered value. Filesystems use b_resid to determine the :extent of i/o in a few places (all (?) cloned from ufs_readwrite.c). :This is probably wrong and very incomplete -- most places, including :critical block allocation code, just assume that breadn() returns an :error if everything could not be read. : :The patch mainly just resets b_resid in breadn(). Determining the :value after the i/o that filled the buffer doesn't seem to be easy, :but it should be 0. : :Index: vfs_bio.c :=================================================================== :RCS file: /home/ncvs/src/sys/kern/vfs_bio.c,v :retrieving revision 1.296 :diff -u -r1.296 vfs_bio.c :--- vfs_bio.c 31 Jan 2002 18:39:44 -0000 1.296 :+++ vfs_bio.c 11 Feb 2002 11:09:22 -0000 :@@ -614,6 +614,13 @@ : vfs_busy_pages(bp, 0); : VOP_STRATEGY(vp, bp); : ++readwait; :+ } else { :+ /* :+ * Recover from failed writes clobbering b_resid. b_resid is :+ * strictly only valid after i/o, but some filesystems expect :+ * it to give the part of `size' that was not readable here. :+ */ :+ bp->b_resid = 0; : } : : for (i = 0; i < cnt; i++, rablkno++, rabsize++) { :@@ -640,6 +647,16 @@ : : if (readwait) { : rv = bufwait(bp); :+ /* :+ * Convert short reads to i/o errors, since short reads aren't :+ * useful for the buffer cache and they are mostly not handled :+ * in callers. :+ * XXX we don't always get here for read-ahead blocks. :+ * b_resid for read-ahead blocks may be clobbered by failed :+ * attempts to write the blocks. :+ */ :+ if (rv == 0 && bp->b_resid != 0) :+ rv = EIO; : } : return (rv); : } :%%% : :Bruce : To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-stable" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200202112006.g1BK6Dw56980>