Date: Mon, 21 Sep 1998 12:24:13 -0400 (EDT)
From: Luoqi Chen
Message-Id: <199809211624.MAA19384@lor.watermarkgroup.com>
To: Don.Lewis@tsc.tdk.com, current@FreeBSD.ORG, luoqi@watermarkgroup.com
Subject: Re: Yet another patch to try for softupdates panic

> Yeah, it looks like it might fix the problem.  I tracked down the
> brokenness that I have been seeing to what looks like concurrent
> directory access, even though directories are supposed to be locked
> while they are being fiddled with.  It looks like the
> initiate_write_filepage panic is caused by two processes trying to
> store directory entries in the same slot.  I've seen one process start

Exactly.

> a ufs_lookup() in a directory while another process was doing a
> ufs_direnter() on that directory.  This shouldn't be possible because
> ufs_direnter() should only happen if the directory is locked, and
> ufs_lookup() shouldn't be called until its caller can lock the directory.
>
> If ufs_direnter() decides to compact the directory, it calls
> UFS_TRUNCATE(), which ends up calling softdep_fsync() if softupdates
> are enabled.  Softdep_fsync() will unlock the directory, which is evil,
> and your patch should prevent this.
> What bothers me is that the
> directory truncation doesn't happen until after ufs_direnter() has
> stored the new directory entry, so I don't see how the softdep_fsync()
> unlocking bug causes the symptoms.  It looks to me like the directory

It doesn't lead to a panic immediately, but it leaves the system in an
inconsistent state: the real size of the directory is in fact larger
than what the i_size field of the inode says it is. The next time a file
is created in this directory, ufs_lookup() will only search for an empty
slot within the first i_size bytes of the directory. If it can't find
one, the new entry will be placed at the first slot beyond i_size, and
there's already a valid entry there!

> is somehow getting unlocked before the new directory entry is
> installed.  It seems like the first process which wants to create a
> directory entry finds a free directory slot and calls ufs_direnter()
> which somehow unlocks the directory and goes to sleep for a while.
> Meanwhile another process finds the same directory slot and fills it.
> The first process then wakes up and overwrites the directory slot used
> by the second process.  BOOM!  If the directory slot is written before
> the lock is released, then the lookup() in the second process shouldn't
> find that slot free and there shouldn't be a collision.

-lq
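
To make the stale-i_size scenario above concrete, here is a rough
user-space sketch in C. It is only a toy model, not the UFS code: the
struct and every toy_* name are made up for illustration, slots are
fixed-size strings rather than real directory entries, and i_size is
counted in slots instead of bytes. It shows only that a free-slot search
bounded by a too-small i_size ends up picking a slot that already holds
a valid entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NSLOTS  8
#define TOY_NAMELEN 16

/* Toy stand-in for a directory; NOT the real UFS on-disk layout. */
struct toy_dir {
    char   slot[TOY_NSLOTS][TOY_NAMELEN]; /* empty string == free slot */
    size_t i_size;                        /* slots the inode claims exist */
};

/*
 * Scan only the first i_size slots for a free one, as the lookup
 * described above does. If none is free, return i_size, meaning
 * "append at the end of the directory".
 */
static size_t toy_find_slot(const struct toy_dir *d)
{
    for (size_t i = 0; i < d->i_size; i++)
        if (d->slot[i][0] == '\0')
            return i;
    return d->i_size;
}

/*
 * Store a name in the slot chosen by toy_find_slot().
 * Returns true if a live entry was overwritten.
 */
static bool toy_enter(struct toy_dir *d, const char *name)
{
    size_t i = toy_find_slot(d);
    assert(i < TOY_NSLOTS);

    bool clobbered = d->slot[i][0] != '\0';
    strncpy(d->slot[i], name, TOY_NAMELEN - 1);
    d->slot[i][TOY_NAMELEN - 1] = '\0';
    if (i == d->i_size)
        d->i_size++;            /* directory grew by one slot */
    return clobbered;
}

/* Build a directory whose i_size is stale (too small), then create a file. */
static bool toy_demo_collision(void)
{
    struct toy_dir d;
    memset(&d, 0, sizeof d);
    strcpy(d.slot[0], "a");
    strcpy(d.slot[1], "b");
    strcpy(d.slot[2], "c");     /* valid entry really on disk ...       */
    d.i_size = 2;               /* ... but the inode says only 2 slots  */
    return toy_enter(&d, "new");
}
```

In this model toy_demo_collision() reports that the entry "c" was
overwritten, because the search never looks past the stale i_size. With
i_size equal to the real number of slots, the same toy_enter() call
appends cleanly instead.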