Date: Thu, 13 Aug 1998 20:40:01 -0700 (PDT)
From: Matthew Dillon <dillon@backplane.com>
To: freebsd-bugs@FreeBSD.ORG
Subject: Re: kern/7418 (file corruption on mmap-based-read during file write())
Message-ID: <199808140340.UAA15933@freefall.freebsd.org>
The following reply was made to PR kern/7418; it has been noted by GNATS.
From: Matthew Dillon <dillon@backplane.com>
To: Luoqi Chen <luoqi@chen.ml.org>
Cc: freebsd-gnats-submit@freebsd.org, luoqi@watermarkgroup.com
Subject: Re: kern/7418 (file corruption on mmap-based-read during file write())
Date: Thu, 13 Aug 1998 20:33:41 -0700 (PDT)
I'm trying to track down the second PR I sent in, kern/7418. I am
concentrating on what happens when a page fault on an mmap'd file
occurs in one process while a second process is blocked in
ufs/ufs_readwrite() on the same file.
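A minimal user-space sketch of the scenario described above, not the actual test program from the PR: a child process appends nonzero bytes with write() while the parent repeatedly maps the file with mmap() and counts zero bytes inside the current file size. The function name mmap_read_during_write, the temporary file path, and the sizes are all hypothetical, and whether this triggers the problem on any given kernel is timing-dependent.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the number of zero bytes the mmap reader
 * saw inside the file while the writer was running (0 is the expected
 * result on a correct kernel), or -1 if the setup or the writer failed.
 */
long mmap_read_during_write(size_t total, size_t chunk)
{
    char path[] = "/tmp/mmapwriteXXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0 || chunk == 0)
        return -1;
    unlink(path);                           /* file lives until fd closes */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Writer: append only nonzero data, in small write() calls. */
        char *buf = malloc(chunk);
        if (buf == NULL)
            _exit(1);
        memset(buf, 'x', chunk);
        for (size_t done = 0; done < total; ) {
            size_t n = total - done < chunk ? total - done : chunk;
            ssize_t w = write(fd, buf, n);
            if (w <= 0)
                _exit(1);
            done += (size_t)w;
        }
        _exit(0);
    }

    /* Reader: fault the file in through mmap() until the writer exits. */
    long zeros = 0;
    int status = 0, finished = 0, reaped = 0;
    for (;;) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            finished = reaped = 1;
        else if (r < 0)
            finished = 1;                   /* child already gone */

        struct stat st;
        if (fstat(fd, &st) == 0 && st.st_size > 0) {
            size_t len = (size_t)st.st_size;
            unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
            if (p != MAP_FAILED) {
                for (size_t i = 0; i < len; i++)
                    if (p[i] == 0)
                        zeros++;
                munmap(p, len);
            }
        }
        if (finished)
            break;                          /* last scan ran after writer exit */
    }
    close(fd);
    if (reaped && !(WIFEXITED(status) && WEXITSTATUS(status) == 0))
        return -1;
    return zeros;
}
```

Because the writer only ever stores 'x', any zero byte counted by the reader lies inside the file size reported by fstat() and so should have held valid data.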
The file corruption that I am seeing is approximately this:
[this represents one page of memory]
[data][data][data..00 00 00 00][data]
where the 00's replace what should have been valid data in the file.
Data written into the locations where the corruption occurs is being
replaced by 00, i.e. I might see
'abcdef [00 00 00] (PAGE BOUNDARY) jklmnop'.
The most interesting item is that the 00 corruption always *ends* at
a page boundary in the file.
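As an illustration of the pattern just described, here is a small sketch of a checker that looks for a run of 00 bytes ending exactly at a page boundary. The function name and the page_size parameter are hypothetical, and the check only makes sense when the test data is known to contain no legitimate zero bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper: scan buf for the first run of zero bytes that
 * ends exactly at a multiple of page_size. On success returns 1 and
 * stores the run's start offset and length; otherwise returns 0.
 * A run is only traced back as far as the start of its own page.
 */
int find_zero_run_at_page_end(const unsigned char *buf, size_t len,
                              size_t page_size,
                              size_t *start, size_t *length)
{
    if (page_size == 0)
        return 0;
    for (size_t end = page_size; end <= len; end += page_size) {
        size_t page_start = end - page_size;
        size_t i = end;
        /* Walk backwards from the boundary while the bytes are zero. */
        while (i > page_start && buf[i - 1] == 0)
            i--;
        if (i < end) {
            *start = i;
            *length = end - i;
            return 1;
        }
    }
    return 0;
}
```

Zero bytes in the middle of a page are ignored on purpose, since the corruption reported here always stops at the boundary.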
--
So here is my question. Refer to ufs/ufs_readwrite.c around line 337,
as shown below. What happens if VOP_BALLOC() blocks or uiomove()
blocks during the discrete write() and, while blocked, another process
has a read fault on precisely the same logical page via mmap()?
Is it possible for ufs_readwrite to obtain a bp, copy the write() data
to it, but for the bp to then somehow be thrown away?
I also don't understand why B_RELBUF is being set for the bp. Can't
this cause the bp to be completely thrown away (see kern/vfs_bio.c line
710)? It makes sense for a READ, but I don't understand why B_RELBUF
is being set for the bp in the WRITE.
    for (error = 0; uio->uio_resid > 0;) {
        lbn = lblkno(fs, uio->uio_offset);
        blkoffset = blkoff(fs, uio->uio_offset);
        xfersize = fs->fs_bsize - blkoffset;
        if (uio->uio_resid < xfersize)
            xfersize = uio->uio_resid;
        if (uio->uio_offset + xfersize > ip->i_size)
            vnode_pager_setsize(vp, uio->uio_offset + xfersize);
        if (fs->fs_bsize > xfersize)
            flags |= B_CLRBUF;
        else
            flags &= ~B_CLRBUF;
        /* XXX is uio->uio_offset the right thing here? */
        error = VOP_BALLOC(vp, uio->uio_offset, xfersize,
            ap->a_cred, flags, &bp);
        if (error != 0)
            break;
        if (uio->uio_offset + xfersize > ip->i_size) {
            ip->i_size = uio->uio_offset + xfersize;
            extended = 1;
        }
        size = BLKSIZE(fs, ip, lbn) - bp->b_resid;
        if (size < xfersize)
            xfersize = size;
        error =
            uiomove((char *)bp->b_data + blkoffset, (int)xfersize, uio);
        if ((ioflag & IO_VMIO) &&
            (LIST_FIRST(&bp->b_dep) == NULL))
            bp->b_flags |= B_RELBUF;
        ...
    }
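To make the per-iteration arithmetic in the loop above easier to follow, here is a user-space sketch of how a write is cut into block-sized pieces and when the B_CLRBUF condition holds. It is not kernel code: struct chunk and plan_chunk are hypothetical names, plain division and modulo stand in for the lblkno() and blkoff() macros, and the later clamp against BLKSIZE() and b_resid is left out.

```c
#include <stddef.h>

/* Hypothetical record of one loop iteration's transfer. */
struct chunk {
    size_t lbn;        /* logical block number containing offset */
    size_t blkoffset;  /* byte offset of the write within that block */
    size_t xfersize;   /* bytes copied in this iteration */
    int    clrbuf;     /* nonzero when the loop would set B_CLRBUF */
};

/*
 * Mirror the first few statements of the loop for a write of resid
 * bytes starting at file offset `offset`, with filesystem block size
 * bsize (assumed nonzero).
 */
struct chunk plan_chunk(size_t offset, size_t resid, size_t bsize)
{
    struct chunk c;

    c.lbn = offset / bsize;
    c.blkoffset = offset % bsize;
    c.xfersize = bsize - c.blkoffset;   /* room left in this block */
    if (resid < c.xfersize)
        c.xfersize = resid;             /* write ends inside the block */
    /* A partial-block write asks for the rest of the buffer to be cleared. */
    c.clrbuf = bsize > c.xfersize;
    return c;
}
```

With an assumed 8192-byte block size, a 20000-byte write at offset 10000 is split into a partial block (6384 bytes), one full block (8192 bytes), and a final partial block (5424 bytes); only the two partial pieces meet the B_CLRBUF condition.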
Finally, I tried looking in the other filesystem code for comparable
source. The ext2fs code seems to simply copy the ufs code:
    error =
        uiomove((char *)bp->b_data + blkoffset, (int)xfersize, uio);
    if ((ioflag & IO_VMIO) &&
        (LIST_FIRST(&bp->b_dep) == NULL)) /* in ext2fs? */
        bp->b_flags |= B_RELBUF;
The msdosfs code does not set B_RELBUF in either its read or its write
path, nor does the nfs code. It's all very confusing.
-Matt
Matthew Dillon Engineering, HiWay Technologies, Inc. & BEST Internet
Communications
<dillon@backplane.com> (Please include original email in any response)
