Date: Wed, 23 Sep 1998 23:54:05 -0700 (PDT)
From: Matthew Dillon <dillon@backplane.com>
To: committers@FreeBSD.ORG
Subject: Having some serious file write / mmap inconsistancy problems
Message-ID: <199809240654.XAA17499@apollo.backplane.com>
I've been trying to track this down without much luck. I'm
worried that this problem is more serious than just the case that's
coming up in my news code.
This problem relates back to a PR I submitted a while back. What happens
is that data written to a file with write() is getting lost when the
file is also mmap()'d.
I believe I have ruled out ftruncate() and file extension as the
cause: I think I found a case where a file was pre-created and a small
write into the middle of the file 'disappeared' after a while. I have
not been able to write a program that reproduces the bug reliably,
but it seems to occur quite often. It only occurs when the file
is also mmap()'d at some point, possibly before and possibly after the
write().
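For reference, a minimal sketch of the kind of check involved. This is not
the news code and it is not a known reproducer; the function name, the
file size, the offset, and the marker string are all assumptions made for
illustration. It pre-creates a file, maps it read-only, does a small
write() into the middle, and then compares what the mapping and read()
return against what was written.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define FILE_SIZE	(64 * 1024)		/* assumed size of the pre-created file */
#define WRITE_OFF	(20 * 1024 + 123)	/* arbitrary offset in the middle */

/*
 * Hypothetical helper: returns 0 if the small write is visible through
 * both the mapping and read(), 1 if it was lost, -1 on a setup error.
 */
int
check_small_write(const char *path)
{
	static const char marker[] = "MARKER";
	char block[4096], back[sizeof(marker)];
	char *map;
	volatile char touch;
	int fd, lost = -1;
	size_t i;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return (-1);

	/* Pre-create the file at full size so the later write never extends it. */
	memset(block, 'o', sizeof(block));
	for (i = 0; i < FILE_SIZE / sizeof(block); i++)
		if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block))
			goto out;

	/* Map the file read-only and touch the target page to fault it in. */
	map = mmap(NULL, FILE_SIZE, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;
	touch = map[WRITE_OFF];
	(void)touch;

	/* The small write into the middle of the file. */
	if (lseek(fd, WRITE_OFF, SEEK_SET) == -1 ||
	    write(fd, marker, sizeof(marker)) != (ssize_t)sizeof(marker))
		goto unmap;

	/* Read the same range back through the file descriptor. */
	if (lseek(fd, WRITE_OFF, SEEK_SET) == -1 ||
	    read(fd, back, sizeof(back)) != (ssize_t)sizeof(back))
		goto unmap;

	/* The write is "lost" if either view still shows the old contents. */
	lost = memcmp(map + WRITE_OFF, marker, sizeof(marker)) != 0 ||
	    memcmp(back, marker, sizeof(marker)) != 0;
unmap:
	munmap(map, FILE_SIZE);
out:
	close(fd);
	return (lost);
}

#ifdef STANDALONE
int
main(void)
{
	int r = check_small_write("/tmp/mmap_write_check.dat");

	printf("result: %d\n", r);
	return (r != 0);
}
#endif
```

Compiling with -DSTANDALONE gives a program that can be run directly. A
single pass like this is unlikely to trigger the problem by itself, since
the loss shows up only 'after a while'; it would probably have to be
looped, or run under memory pressure, to have any chance of catching it.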
I'm down to trying to figure out the ffs_write code in
ufs/ufs/ufs_readwrite.c. I've traced it to some rather suspicious
code that I believe may cause a problem when the file is simultaneously
mmap()'d, and I would appreciate it if someone familiar with the VFS/VM
system could take a look at it.
Specifically, kern/vfs_bio.c around line 485, in the bdwrite() code.
This code calls vfs_setdirty(bp) and then calls vfs_clean_pages(bp) and
bqrelse()'s the bp.
I'm worried that this sequence is somehow allowing the page to be thrown
away when it is also mmap()'d read-only by another process, causing the
write to 'disappear' and the page to revert to its previous contents.
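The two-process case can be sketched from userland as follows. This is
only an illustration under assumed names and sizes (the function name,
MAP_LEN, MARK_OFF and the marker string are hypothetical), not a known
reproducer: a child maps the file read-only and faults the page in, the
parent then write()s into that page, and the child checks whether its
mapping shows the new data.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAP_LEN		(16 * 1024)	/* assumed size of the pre-created file */
#define MARK_OFF	(5 * 1024)	/* arbitrary offset inside the file */

/*
 * Hypothetical helper: returns 0 if the reader's mapping shows the
 * parent's write, 1 if it still shows the old contents, -1 on error.
 */
int
check_reader_sees_write(const char *path)
{
	static const char mark[] = "NEWDATA";
	char block[1024], c = 0;
	int fd, to_child[2], to_parent[2], status, i, ok;
	pid_t pid;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return (-1);
	/* Pre-create the file so the later write does not extend it. */
	memset(block, 'o', sizeof(block));
	for (i = 0; i < MAP_LEN / (int)sizeof(block); i++)
		if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block)) {
			close(fd);
			return (-1);
		}
	if (pipe(to_child) == -1 || pipe(to_parent) == -1 ||
	    (pid = fork()) == -1) {
		close(fd);
		return (-1);
	}

	if (pid == 0) {
		/* Reader: map read-only and fault in the target page. */
		char *map = mmap(NULL, MAP_LEN, PROT_READ, MAP_SHARED, fd, 0);

		close(to_child[1]);
		close(to_parent[0]);
		if (map == MAP_FAILED)
			_exit(2);
		c = map[MARK_OFF];
		/* Tell the parent the page is mapped, then wait for its write. */
		if (write(to_parent[1], &c, 1) != 1 ||
		    read(to_child[0], &c, 1) != 1)
			_exit(2);
		_exit(memcmp(map + MARK_OFF, mark, sizeof(mark)) == 0 ? 0 : 1);
	}

	/* Writer: wait for the reader, then write() into the mapped page. */
	close(to_child[0]);
	close(to_parent[1]);
	ok = read(to_parent[0], &c, 1) == 1 &&
	    lseek(fd, MARK_OFF, SEEK_SET) != -1 &&
	    write(fd, mark, sizeof(mark)) == (ssize_t)sizeof(mark) &&
	    write(to_child[1], &c, 1) == 1;
	close(to_child[1]);
	close(to_parent[0]);
	close(fd);
	if (waitpid(pid, &status, 0) == -1 || !ok || !WIFEXITED(status))
		return (-1);
	return (WEXITSTATUS(status) <= 1 ? WEXITSTATUS(status) : -1);
}

#ifdef STANDALONE
int
main(void)
{
	int r = check_reader_sees_write("/tmp/mmap_reader_check.dat");

	printf("result: %d\n", r);
	return (r != 0);
}
#endif
```

The pipes are there only to order the steps (map, then write, then
compare). The failure described above would show up as a result of 1,
but since the page has to be thrown away first, a check made immediately
after the write like this one may well never see it.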
In my prior PR, kern/7418, I describe this corruption. It is definitely
happening, and it's happening at least two or three dozen times a day on
my news spool... but just today I found a case where a *tiny* write into
the middle of a pre-created file has apparently been 'lost' by the system,
which greatly narrows down the cases that could cause the bug.
-Matt
Matthew Dillon Engineering, HiWay Technologies, Inc. & BEST Internet
Communications & God knows what else.
<dillon@backplane.com> (Please include original email in any response)
