Date: Thu, 4 May 2006 12:33:28 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200605041933.k44JXS7N033059@apollo.backplane.com>
To: freebsd-gnats-submit@freebsd.org, freebsd-current@freebsd.org
Subject: Re: kern/93942: panic: ufs_dirbad: bad dir
List-Id: Discussions about the use of FreeBSD-current

I've found three additional issues which might be related to ufs_dirbad
panics. Again, unfortunately, no smoking gun.

First, if B_NOCACHE gets set on a B_DIRTY buffer, the buffer can be lost
without the data being written under certain conditions due to brelse()
mechanics. B_NOCACHE is typically set by softupdates-related code but can
be set by other things as well (in particular, when a buffer is resized,
and by certain write/read combinations). One might think that calling
bwrite() after setting B_NOCACHE would be safe, but that is not
necessarily true.
If a buffer is redirtied (B_DIRTY set) during the write, something which
softupdates does all the time, B_NOCACHE almost certainly has to be
cleared. Of the three issues I found, this is the most likely cause.

Second, vnode_pager_setsize() is being called too late in
ufs/ufs/ufs_lookup.c (line 733 in FreeBSD-current). It is being called
after the buffer has been instantiated. This could create problems with
the VMIO backing store for the buffer created by the UFS_BALLOC call.

Third, vnode_pager_setsize() is being called too late in
ufs/ufs/ufs_vnops.c (line 1557 in FreeBSD-current). It is being called
after the buffer has been instantiated by UFS_BALLOC() in ufs_mkdir(),
which could create problems with the buffer's VMIO backing store.

--

Having examined over a dozen kernel cores, I now believe, based on the
M.O. of this corruption, that it occurs when the kernel attempts to
append a full block to a directory. The bitmaps are all good... it is as
though the directory block never got written and the data we are seeing
is data that existed in that block before the directory allocated it.

But the issue has also occurred with different disk drivers, so I think
we can rule out a disk driver failure. The issue also seems to occur most
often with large, 'busy' buffers (lots of directory operations going on).
Since no similar corruption has ever been reported for heavily used
files, this supports the idea that it is *not* the disk driver.

I believe that the data is getting written to the filesystem buffer
representing the new block, but the buffer or its backing store is
somehow getting thrown away without being written, or getting thrown away
and then reinstantiated without being read. The areas I indicate in the
above list are areas where data can potentially get thrown away or lost
prior to a write.
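
As a rough illustration of the first issue, here is a toy user-space C model
of a buffer that is released while both dirty and marked no-cache. This is
not kernel code. Every name in it (toy_buf, toy_disk, toy_bdirty, toy_brelse,
toy_append_dirblock, TB_DIRTY, TB_NOCACHE) and the flag values are
hypothetical, and the model makes unconditional what, in the real brelse(),
happens only under certain conditions. "Written back" in the model simply
stands in for the buffer surviving long enough to be flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag values for this toy model; not the kernel's. */
#define TB_DIRTY    0x1u   /* buffer holds data not yet on disk */
#define TB_NOCACHE  0x2u   /* throw the buffer away on release */

#define TOY_BLKSIZE 16

struct toy_buf {
	unsigned flags;
	char data[TOY_BLKSIZE];
};

struct toy_disk {
	char block[TOY_BLKSIZE];
};

/*
 * Mark a buffer dirty. With guard set, TB_NOCACHE is cleared first,
 * loosely mirroring the check added to bdirty() in the patch.
 */
static void
toy_bdirty(struct toy_buf *bp, bool guard)
{
	assert(bp != NULL);
	if (guard && (bp->flags & TB_NOCACHE))
		bp->flags &= ~TB_NOCACHE;
	bp->flags |= TB_DIRTY;
}

/*
 * Release a buffer. A TB_NOCACHE buffer is discarded whether or not it
 * is dirty (the suspected failure mode); otherwise dirty data is
 * written back to the toy disk.
 */
static void
toy_brelse(struct toy_buf *bp, struct toy_disk *dk)
{
	if (bp->flags & TB_NOCACHE) {
		memset(bp->data, 0, sizeof bp->data);	/* contents lost */
		bp->flags = 0;
		return;
	}
	if (bp->flags & TB_DIRTY) {
		memcpy(dk->block, bp->data, sizeof dk->block);
		bp->flags &= ~TB_DIRTY;
	}
}

/* Returns true if the new directory data reached the toy disk. */
static bool
toy_append_dirblock(bool guard)
{
	struct toy_disk dk;
	struct toy_buf bp = { .flags = TB_NOCACHE };

	memset(dk.block, 'X', sizeof dk.block);	/* stale data from a prior owner */
	memset(bp.data, 'D', sizeof bp.data);	/* new directory entries */
	toy_bdirty(&bp, guard);
	toy_brelse(&bp, &dk);
	return dk.block[0] == 'D';
}
```

In this model, toy_append_dirblock(false) leaves the stale contents on the
toy disk, which resembles the symptom of old data showing up in a newly
appended directory block. toy_append_dirblock(true) gets the new data
written, because the no-cache flag is dropped at the moment the buffer is
dirtied.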

					-Matt
					Matthew Dillon

(Patch against DragonFly, will not apply to FreeBSD directly, included
for reference only):

Index: kern/vfs_bio.c
===================================================================
RCS file: /cvs/src/sys/kern/vfs_bio.c,v
retrieving revision 1.53.2.1
diff -u -r1.53.2.1 vfs_bio.c
--- kern/vfs_bio.c	18 Apr 2006 17:12:25 -0000	1.53.2.1
+++ kern/vfs_bio.c	24 Apr 2006 19:22:04 -0000
@@ -972,6 +972,13 @@
 bdirty(struct buf *bp)
 {
 	KASSERT(bp->b_qindex == BQUEUE_NONE, ("bdirty: buffer %p still on queue %d", bp, bp->b_qindex));
+	if (bp->b_flags & B_NOCACHE) {
+		printf("bdirty: clearing B_NOCACHE on buf %p\n", bp);
+		bp->b_flags &= ~B_NOCACHE;
+	}
+	if (bp->b_flags & B_INVAL) {
+		printf("bdirty: warning, dirtying invalid buffer %p\n", bp);
+	}
 	bp->b_flags &= ~(B_READ|B_RELBUF);
 	if ((bp->b_flags & B_DELWRI) == 0) {
@@ -1096,6 +1103,11 @@
 	crit_enter();
+	if ((bp->b_flags & (B_NOCACHE|B_DIRTY)) == (B_NOCACHE|B_DIRTY)) {
+		printf("warning: buf %p marked dirty & B_NOCACHE, clearing B_NOCACHE\n", bp);
+		bp->b_flags &= ~B_NOCACHE;
+	}
+
 	if (bp->b_flags & B_LOCKED)
 		bp->b_flags &= ~B_ERROR;
Index: vfs/ufs/ufs_lookup.c
===================================================================
RCS file: /cvs/src/sys/vfs/ufs/ufs_lookup.c,v
retrieving revision 1.18
diff -u -r1.18 ufs_lookup.c
--- vfs/ufs/ufs_lookup.c	14 Sep 2005 01:13:48 -0000	1.18
+++ vfs/ufs/ufs_lookup.c	24 Apr 2006 19:22:23 -0000
@@ -716,6 +716,7 @@
 		 */
 		if (dp->i_offset & (DIRBLKSIZ - 1))
 			panic("ufs_direnter: newblk");
+		vnode_pager_setsize(dvp, dp->i_offset + DIRBLKSIZ);
 		flags = B_CLRBUF;
 		if (!DOINGSOFTDEP(dvp) && !DOINGASYNC(dvp))
 			flags |= B_SYNC;
@@ -727,7 +728,6 @@
 		}
 		dp->i_size = dp->i_offset + DIRBLKSIZ;
 		dp->i_flag |= IN_CHANGE | IN_UPDATE;
-		vnode_pager_setsize(dvp, (u_long)dp->i_size);
 		dirp->d_reclen = DIRBLKSIZ;
 		blkoff = dp->i_offset &
 		    (VFSTOUFS(dvp->v_mount)->um_mountp->mnt_stat.f_iosize - 1);
Index: vfs/ufs/ufs_vnops.c
===================================================================
RCS file: /cvs/src/sys/vfs/ufs/ufs_vnops.c,v
retrieving revision 1.32
diff -u -r1.32 ufs_vnops.c
--- vfs/ufs/ufs_vnops.c	17 Sep 2005 07:43:12 -0000	1.32
+++ vfs/ufs/ufs_vnops.c	24 Apr 2006 19:22:42 -0000
@@ -1420,12 +1420,12 @@
 	dirtemplate = *dtp;
 	dirtemplate.dot_ino = ip->i_number;
 	dirtemplate.dotdot_ino = dp->i_number;
+	vnode_pager_setsize(tvp, DIRBLKSIZ);
 	if ((error = VOP_BALLOC(tvp, (off_t)0, DIRBLKSIZ, cnp->cn_cred, B_CLRBUF, &bp)) != 0)
 		goto bad;
 	ip->i_size = DIRBLKSIZ;
 	ip->i_flag |= IN_CHANGE | IN_UPDATE;
-	vnode_pager_setsize(tvp, (u_long)ip->i_size);
 	bcopy((caddr_t)&dirtemplate, (caddr_t)bp->b_data, sizeof dirtemplate);
 	if (DOINGSOFTDEP(tvp)) {
 		/*