Date: Mon, 21 Sep 1998 05:51:22 -0700
From: Don Lewis <Don.Lewis@tsc.tdk.com>
To: Luoqi Chen <luoqi@watermarkgroup.com>, current@FreeBSD.ORG
Subject: Re: Yet another patch to try for softupdates panic
Message-ID: <199809211251.FAA14043@salsa.gv.tsc.tdk.com>
In-Reply-To: Luoqi Chen <luoqi@watermarkgroup.com> "Yet another patch to try for softupdates panic" (Sep 18, 3:41pm)
On Sep 18, 3:41pm, Luoqi Chen wrote:
} Subject: Yet another patch to try for softupdates panic
} This patch could be the real cure for the `initiate_write_filepage' panic
} people were seeing during make -j# world. I have posted another patch
} about a week ago (in fact, I have committed it), but it turned out to be
} no more than a no-op (thanks to Bruce for pointing it out, it was an
} embarrassing silly mistake of mine). I certainly hope this patch will do
} its work: this patch should fix a race condition between directory truncation
} and file creation that could lead to the `initiate_write_filepage' panic.

Yeah, it looks like it might fix the problem. I tracked down the
brokenness that I have been seeing to what looks like concurrent
directory access, even though directories are supposed to be locked
while they are being fiddled with.

It looks like the initiate_write_filepage panic is caused by two
processes trying to store directory entries in the same slot. I've seen
one process start a ufs_lookup() in a directory while another process
was doing a ufs_direnter() on that directory. This shouldn't be possible,
because ufs_direnter() should only happen while the directory is locked,
and ufs_lookup() shouldn't be called until its caller can lock the
directory.

If ufs_direnter() decides to compact the directory, it calls
UFS_TRUNCATE(), which ends up calling softdep_fsync() if softupdates are
enabled. In turn, softdep_fsync() unlocks the directory, which is evil,
and your patch should prevent this.

What bothers me is that the directory truncation doesn't happen until
after ufs_direnter() has stored the new directory entry, so I don't see
how the softdep_fsync() unlocking bug causes the symptoms. It looks to
me like the directory is somehow getting unlocked before the new
directory entry is installed.
It seems like the first process which wants to create a directory entry
finds a free directory slot and calls ufs_direnter(), which somehow
unlocks the directory and goes to sleep for a while. Meanwhile, another
process finds the same directory slot and fills it. The first process
then wakes up and overwrites the directory slot used by the second
process. BOOM!

If the directory slot is written before the lock is released, then the
lookup() in the second process shouldn't find that slot free and there
shouldn't be a collision.
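To make the suspected interleaving concrete, here is a minimal user-space sketch that replays it step by step. It is a toy model only, not the UFS code: struct toy_dir, find_free_slot(), store_entry(), count_entries() and replay() are all hypothetical names invented for this illustration, and the "lock" is represented purely by the order in which the two processes' steps run.

```c
#include <assert.h>
#include <string.h>

#define NSLOTS  4
#define NAMELEN 16

/* Hypothetical toy directory: an empty name marks a free slot. */
struct toy_dir {
    char slot[NSLOTS][NAMELEN];
};

/* Stand-in for the lookup that picks a slot: first free index, or -1. */
static int find_free_slot(const struct toy_dir *d)
{
    for (int i = 0; i < NSLOTS; i++)
        if (d->slot[i][0] == '\0')
            return i;
    return -1;
}

/* Stand-in for installing the entry: writes name into slot i. */
static void store_entry(struct toy_dir *d, int i, const char *name)
{
    assert(i >= 0 && i < NSLOTS);
    strncpy(d->slot[i], name, NAMELEN - 1);
    d->slot[i][NAMELEN - 1] = '\0';
}

/* Number of slots currently holding an entry. */
static int count_entries(const struct toy_dir *d)
{
    int n = 0;
    for (int i = 0; i < NSLOTS; i++)
        if (d->slot[i][0] != '\0')
            n++;
    return n;
}

/*
 * Replay two creators, A and B, on an empty directory.
 * lock_dropped_early != 0: A loses the lock between choosing its slot
 * and writing it, so B runs in the gap (the suspected bug).
 * lock_dropped_early == 0: A writes its slot before B gets to look.
 * Returns the number of entries left: 2 is correct, 1 means one entry
 * was overwritten.
 */
static int replay(int lock_dropped_early, struct toy_dir *d)
{
    int a, b;

    memset(d, 0, sizeof(*d));
    if (lock_dropped_early) {
        a = find_free_slot(d);          /* A picks slot 0 */
        /* A sleeps here with the directory unlocked */
        b = find_free_slot(d);          /* B also sees slot 0 free */
        store_entry(d, b, "file_b");
        store_entry(d, a, "file_a");    /* A wakes up and overwrites B */
    } else {
        a = find_free_slot(d);
        store_entry(d, a, "file_a");    /* written before the lock goes */
        b = find_free_slot(d);          /* slot 0 is taken, B picks 1 */
        store_entry(d, b, "file_b");
    }
    return count_entries(d);
}
```

Calling replay(1, &d) leaves a single entry, because file_a lands on top of file_b in slot 0, while replay(0, &d) leaves both entries in separate slots. That matches the argument above: the collision needs the lock to be dropped after the slot is chosen but before it is written.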