Date: Sun, 17 Feb 2013 09:48:32 +0200 From: Konstantin Belousov <kostikbel@gmail.com> To: Bruce Evans <brde@optusnet.com.au> Cc: fs@freebsd.org Subject: Re: cleaning files beyond EOF Message-ID: <20130217074832.GA2598@kib.kiev.ua> In-Reply-To: <20130217172928.C1900@besplex.bde.org> References: <20130217113031.N9271@besplex.bde.org> <20130217055528.GB2522@kib.kiev.ua> <20130217172928.C1900@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--mP3DRpeJDSE+ciuQ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Feb 17, 2013 at 06:01:50PM +1100, Bruce Evans wrote: > On Sun, 17 Feb 2013, Konstantin Belousov wrote: >=20 > > On Sun, Feb 17, 2013 at 11:33:58AM +1100, Bruce Evans wrote: > >> I have a (possibly damaged) ffs data block with nonzero data beyond > >> EOF. Is anything responsible for clearing this data when the file > >> is mmapped()? > >> > >> At least old versions of gcc mmap() the file and have a bug checking > >> for EOF. They read the garbage beyond the end and get confused. > > > > Does the 'damaged' status of the data block mean that it contain the > > garbage after EOF on disk ? >=20 > Yes, it's at most software damage. I used a broken version of > vfs_bio_clrbuf() for a long time and it probably left some unusual > blocks. This matters suprisingly rarely. I recently had to modify the vfs_bio_clrbuf(). For me, a bug in the function did matter a lot, because the function is used, in particular, to clear the indirect blocks. The bug caused quite random filesystem failures until I figured it out. My version of vfs_bio_clrbuf() is at the end of the message, it avoids accessing b_data. >=20 > I forgot to mention that this is with an old version of FreeBSD, > where I changed vfs_bio.c a lot but barely touched vm. >=20 > > UFS uses a small wrapper around vnode_generic_getpages() as the > > VOP_GETPAGES(), the wrapping code can be ignored for the current > > purpose. > > > > vnode_generic_getpages() iterates over the the pages after the bstrateg= y() > > and marks the part of the page after EOF valid and zeroes it, using > > vm_page_set_valid_range(). >=20 > The old version has a large non-wrapper in ffs, and vnode_generic_getpage= s() > uses vm_page_set_validclean(). Maybe the bug is just in the old > ffs_getpages(). It seems to do only DEV_BSIZE'ed zeroing stuff. It > begins with the same "We have to zero that data" code that forms most > of the wrapper in the current version. It normally only returns > vnode_pager_generic_getpages() after that if bsize < PAGE_SIZE. > However, my version has a variable which I had forgotten about to > control this, and the forgotten setting of this variable results in > always using vnode_pager_generic_getpages(), as in -current. I probably > copied some fixes in -current for this. So the bug can't be just in > ffs_getpages(). >=20 > The "damaged" block is at the end of vfs_default.c. The file size is > 25 * PAGE_SIZE + 16. It is in 7 16K blocks, 2 full 2K frags, and 1 frag > with 16 bytes valid in it. But the ffs_getpages() might be indeed the culprit. It calls vm_page_zero_invalid(), which only has DEV_BSIZE granularity. I think that ffs_getpages() also should zero the after eof part of the last page of the file to fix your damage, since device read cannot read less than DEV_BSIZE. diff --git a/sys/ufs/ffs/ffs_vnops.c b/sys/ufs/ffs/ffs_vnops.c index ef6194c..4240b78 100644 --- a/sys/ufs/ffs/ffs_vnops.c +++ b/sys/ufs/ffs/ffs_vnops.c @@ -844,9 +844,9 @@ static int ffs_getpages(ap) struct vop_getpages_args *ap; { - int i; vm_page_t mreq; - int pcount; + uint64_t size; + int i, pcount; =20 pcount =3D round_page(ap->a_count) / PAGE_SIZE; mreq =3D ap->a_m[ap->a_reqpage]; @@ -861,6 +861,9 @@ ffs_getpages(ap) if (mreq->valid) { if (mreq->valid !=3D VM_PAGE_BITS_ALL) vm_page_zero_invalid(mreq, TRUE); + size =3D VTOI(ap->a_vp)->i_size; + if (mreq->pindex =3D=3D OFF_TO_IDX(size)) + pmap_zero_page_area(mreq, size & PAGE_MASK, PAGE_SIZE); for (i =3D 0; i < pcount; i++) { if (i !=3D ap->a_reqpage) { vm_page_lock(ap->a_m[i]); On the other hand, it is not clear should we indeed protect against such case, or just declare the disk data broken. >=20 > I have another problem that is apparently with > vnode_pager_generic_getpages() and now affects -current from about a > year ago in an identical way with the old version: mmap() is very slow > in msdosfs. cmp uses mmap() too much, and reading files sequentially > using mmap() is 3.4 times slower than reading them using read() on my > DVD media/drive. The i/o seems to be correctly clustered for both. > with average transaction sizes over 50K but tps much lower for mmap(). > Similarly on a (faster) hard disk except the slowness is not as noticeable > (drive buffering might hide it completely). However, for ffs files on > the hard disk, mmap() is as fast as read(). diff --git a/sys/kern/vfs_bio.c b/sys/kern/vfs_bio.c index 6393399..83d3609 100644 --- a/sys/kern/vfs_bio.c +++ b/sys/kern/vfs_bio.c @@ -3704,8 +4070,7 @@ vfs_bio_set_valid(struct buf *bp, int base, int size) void vfs_bio_clrbuf(struct buf *bp)=20 { - int i, j, mask; - caddr_t sa, ea; + int i, j, mask, sa, ea, slide; =20 if ((bp->b_flags & (B_VMIO | B_MALLOC)) !=3D B_VMIO) { clrbuf(bp); @@ -3723,39 +4088,69 @@ vfs_bio_clrbuf(struct buf *bp) if ((bp->b_pages[0]->valid & mask) =3D=3D mask) goto unlock; if ((bp->b_pages[0]->valid & mask) =3D=3D 0) { - bzero(bp->b_data, bp->b_bufsize); + pmap_zero_page_area(bp->b_pages[0], 0, bp->b_bufsize); bp->b_pages[0]->valid |=3D mask; goto unlock; } } - ea =3D sa =3D bp->b_data; - for(i =3D 0; i < bp->b_npages; i++, sa =3D ea) { - ea =3D (caddr_t)trunc_page((vm_offset_t)sa + PAGE_SIZE); - ea =3D (caddr_t)(vm_offset_t)ulmin( - (u_long)(vm_offset_t)ea, - (u_long)(vm_offset_t)bp->b_data + bp->b_bufsize); + sa =3D bp->b_offset & PAGE_MASK; + slide =3D 0; + for (i =3D 0; i < bp->b_npages; i++) { + slide =3D imin(slide + PAGE_SIZE, bp->b_bufsize + sa); + ea =3D slide & PAGE_MASK; + if (ea =3D=3D 0) + ea =3D PAGE_SIZE; if (bp->b_pages[i] =3D=3D bogus_page) continue; - j =3D ((vm_offset_t)sa & PAGE_MASK) / DEV_BSIZE; + j =3D sa / DEV_BSIZE; mask =3D ((1 << ((ea - sa) / DEV_BSIZE)) - 1) << j; VM_OBJECT_LOCK_ASSERT(bp->b_pages[i]->object, MA_OWNED); if ((bp->b_pages[i]->valid & mask) =3D=3D mask) continue; if ((bp->b_pages[i]->valid & mask) =3D=3D 0) - bzero(sa, ea - sa); + pmap_zero_page_area(bp->b_pages[i], sa, ea - sa); else { for (; sa < ea; sa +=3D DEV_BSIZE, j++) { - if ((bp->b_pages[i]->valid & (1 << j)) =3D=3D 0) - bzero(sa, DEV_BSIZE); + if ((bp->b_pages[i]->valid & (1 << j)) =3D=3D 0) { + pmap_zero_page_area(bp->b_pages[i], + sa, DEV_BSIZE); + } } } bp->b_pages[i]->valid |=3D mask; + sa =3D 0; } unlock: VM_OBJECT_UNLOCK(bp->b_bufobj->bo_object); bp->b_resid =3D 0; } =20 +void +vfs_bio_bzero_buf(struct buf *bp, int base, int size) +{ + vm_page_t m; + int i, n; + + if ((bp->b_flags & B_UNMAPPED) =3D=3D 0) { + BUF_CHECK_MAPPED(bp); + bzero(bp->b_data + base, size); + } else { + BUF_CHECK_UNMAPPED(bp); + n =3D PAGE_SIZE - (base & PAGE_MASK); + VM_OBJECT_LOCK(bp->b_bufobj->bo_object); + for (i =3D base / PAGE_SIZE; size > 0 && i < bp->b_npages; ++i) { + m =3D bp->b_pages[i]; + if (n > size) + n =3D size; + pmap_zero_page_area(m, base & PAGE_MASK, n); + base +=3D n; + size -=3D n; + n =3D PAGE_SIZE; + } + VM_OBJECT_UNLOCK(bp->b_bufobj->bo_object); + } +} + /* * vm_hold_load_pages and vm_hold_free_pages get pages into * a buffers address space. The pages are anonymous and are --mP3DRpeJDSE+ciuQ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJRIItPAAoJEJDCuSvBvK1BLxoP/RBTTxKS5x4P4Y8eh72d80PC Q7FmAKW62SjIFapsHMYVoIdsfi7QYBWW8RGLI9LSjHfz5ouwb4p7LwR5NjcFCjw3 2G/wv9cTgkHv7/+QVqWNXBi0FquLVPY9oPwe2reFz9MZpHJeN0j/5rYKzCWGLxkC EMog2CdI5UBleLRTHCeXvHiSss75W39GAvefijo1C0/soAESCfSBLiITLZu3ZRjE +8sLS4mJ7pGHJ2AQdbVKGb1dkt6XHoZ2T8Jc3C/GwyoYE4CnUXe/2mssSCjePEng ho5eElyqW+3bwtMJKSQAuS+rQDhUvTLQFHHLlkZLyaDHbIVB0mkC03WQGRkdLi34 0Y+bzfNjcncyM197xUaG828DrpwKQZdsfkG685VzWBFWtzqz5epNVjjLWPBt+ReO HBB5FNAU9FXa0u8/amkpnexYTbfgnRx0h8mtd3B7eviKeZ7o2XrMwUAWVihYvZQN gAl/np41x8WXuIZ0BO3itc4LFFlSYj84AloyP3Tw9LT1zuk8bCcl3Vz0Xp6WAce/ tKfskZIMcgUn02wO3XOkHviwfWsi1IJ+eZhgqg7KWefj80WwtWTM8WR83Q0yhcBO rtTfsgFYPJdhF3dAK7EZBcASSs/xYOL74/qcyUVQwUZ4Wg2AA2ew1VdvSPm10qUu Rsd8TZkYmljlzb1BKkAK =Zg5p -----END PGP SIGNATURE----- --mP3DRpeJDSE+ciuQ--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130217074832.GA2598>