Date: Wed, 23 Sep 1998 23:54:05 -0700 (PDT)
From: Matthew Dillon <dillon@backplane.com>
To: committers@FreeBSD.ORG
Subject: Having some serious file write / mmap inconsistancy problems
Message-ID: <199809240654.XAA17499@apollo.backplane.com>

I've been trying to track this down without much luck. I'm worried that this problem is more serious than just the case that's coming up in my news code. It relates back to a PR I submitted a while back.

What happens is that data written to a file with write() is getting lost when the file is also being mmap()'d. I believe I have ruled out ftruncate() or file extension as the cause: I think I found a case where a file was pre-created and a small write into the middle of the file 'disappeared' after a while. I have not been able to write a program that reproduces the bug dependably, but it seems to occur quite often. It only occurs when the file is also mmap()'d at some point, possibly before or possibly after the write().

I'm down to trying to figure out the ffs_write code in ufs/ufs/ufs_readwrite.c. I've traced the problem to some rather suspicious code that I believe may cause trouble when the file is simultaneously mmap()'d, and I would appreciate it if someone familiar with the VFS/VM system could take a look at it. Specifically, kern/vfs_bio.c around line 485, in the bdwrite() code. This code calls vfs_setdirty(bp), then calls vfs_clean_pages(bp), and then bqrelse()'s the bp. I'm worried that this sequence is somehow allowing the page to be thrown away when it is also mmap()'d read-only by another process, causing the write to 'disappear' and the page to revert to its previous contents.

In my prior PR, kern/7418, I describe this corruption. It is definitely happening, and it's happening at least two or three dozen times a day on my news spool... but just today I found a case where a *tiny* write into the middle of a pre-created file has apparently been 'lost' by the system, which greatly narrows down the cases that could cause the bug.

					-Matt

    Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet
                    Communications & God knows what else.
    <dillon@backplane.com> (Please include original email in any response)
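
Editorial note: the scenario described in the message (a small write() into the middle of a pre-created file that is also mapped read-only) can be sketched as a consistency check like the one below. This is not the author's test program; the message states that no dependable reproduction existed. The function name check_write_visible, the file size, the offset, and the payload are all hypothetical, and the sketch uses a single process, whereas the report involves a mapping held by another process and a loss that shows up only "after a while". On a correctly behaving system it should simply return 0.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define FILE_SIZE 65536   /* hypothetical size of the pre-created file */
#define PATCH_OFF 30000   /* hypothetical offset in the middle of the file */

/*
 * Pre-create a file, map it read-only and shared, write() a few bytes
 * into the middle, then check that both read() and the mapping see them.
 * Returns 0 if both views show the new data, 1 if either view still
 * shows the old contents, and -1 on a setup error.
 */
int check_write_visible(const char *path)
{
    static const char patch[] = "tiny-write";
    char block[4096], back[sizeof(patch)];
    const char *map = MAP_FAILED;
    off_t done;
    int fd, result = -1;

    assert(PATCH_OFF + sizeof(patch) <= FILE_SIZE);

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Pre-create the file at full size, filled with 'A'. */
    memset(block, 'A', sizeof(block));
    for (done = 0; done < FILE_SIZE; done += sizeof(block))
        if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block))
            goto out;

    /* Map the file read-only, as the second process in the report does. */
    map = mmap(NULL, FILE_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    /* Touch the target page so it is resident before the write. */
    if (map[PATCH_OFF] != 'A')
        goto out;

    /* The small write into the middle of the file. */
    if (lseek(fd, PATCH_OFF, SEEK_SET) != PATCH_OFF)
        goto out;
    if (write(fd, patch, sizeof(patch)) != (ssize_t)sizeof(patch))
        goto out;

    /* Read the same range back through the file descriptor. */
    if (lseek(fd, PATCH_OFF, SEEK_SET) != PATCH_OFF)
        goto out;
    if (read(fd, back, sizeof(back)) != (ssize_t)sizeof(back))
        goto out;

    /* Compare both views against what was written. */
    result = (memcmp(back, patch, sizeof(patch)) != 0 ||
              memcmp(map + PATCH_OFF, patch, sizeof(patch)) != 0);

out:
    if (map != MAP_FAILED)
        munmap((void *)map, FILE_SIZE);
    close(fd);
    return result;
}
```

Calling check_write_visible() in a loop on a scratch file on the affected filesystem, ideally while a second process holds its own read-only mapping and the machine is under memory pressure, would come closer to the conditions in the report; a single pass like this is only a starting point.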