From owner-freebsd-bugs@FreeBSD.ORG Tue Feb 28 15:40:12 2006
Return-Path:
X-Original-To: freebsd-bugs@hub.freebsd.org
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6C84516A420
	for ; Tue, 28 Feb 2006 15:40:12 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5545043D5D
	for ; Tue, 28 Feb 2006 15:40:10 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id k1SFe7G8067982
	for ; Tue, 28 Feb 2006 15:40:07 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.13.4/8.13.4/Submit) id k1SFe7DW067981;
	Tue, 28 Feb 2006 15:40:07 GMT
	(envelope-from gnats)
Resent-Date: Tue, 28 Feb 2006 15:40:07 GMT
Resent-Message-Id: <200602281540.k1SFe7DW067981@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Yarema
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0203C16A420
	for ; Tue, 28 Feb 2006 15:35:39 +0000 (GMT)
	(envelope-from root@CoolRat.org)
Received: from CoolRat.org (c-69-242-5-144.hsd1.pa.comcast.net [69.242.5.144])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DBC1543D45
	for ; Tue, 28 Feb 2006 15:35:37 +0000 (GMT)
	(envelope-from root@CoolRat.org)
Received: from localhost (localhost [127.0.0.1]) (uid 0)
	by CoolRat.org with local; Tue, 28 Feb 2006 10:35:36 -0500
	id 003CBC1F.44046DC8.000006A2
Message-Id:
Date: Tue, 28 Feb 2006 10:35:36 -0500
From: Yarema
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Cc: Dennis Koegel, Doug White, Martin Machacek
Subject: kern/93942: panic: ufs_dirbad: bad dir
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Yarema
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 28 Feb 2006 15:40:12 -0000

>Number:         93942
>Category:       kern
>Synopsis:       panic: ufs_dirbad: bad dir
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Feb 28 15:40:06 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Yarema
>Release:        FreeBSD 6.1-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD 6.1-PRERELEASE #0: Mon Feb 27 04:52:11 EST 2006 i386

>Description:
This is at least the third file system which got hosed for me by the
ufs_dirbad bug on three different hard drives since 5.3 STABLE.
I suspect this is related to the following PRs:

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=49079
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=51001

In every case a process would lock up making the whole system
unresponsive.  A reboot, fsck -y in single user mode and another
reboot would produce the following during the mount of the corrupt
fs in rw mode:

bad dir ino 2 at offset 16384: mangled entry
panic: ufs_dirbad: bad dir
cpuid = 0

Another reboot, fsck -y in single user mode and reboot produces the
same results repeatedly.  Previously I had recovered by mounting the
corrupt fs in ro mode, backup, newfs, restore.

Recently I noticed Matthew Dillon commit the following to the
DragonFly src repository:

http://leaf.DragonFlyBSD.org/mailarchive/commits/2006-02/msg00057.html

dillon      2006/02/21 10:46:56 PST

DragonFly src repository

Modified files:
  sys/kern             vfs_cluster.c
Log:
bioops.io_start() was being called in a situation where the buffer
could be brelse()'d afterwords instead of I/O being initiated.  When
this occurs, the buffer may contain softupdates-modified data which
is never reverted, resulting in serious filesystem corruption.
When io_start is called on a buffer, I/O MUST be initiated and
terminated with a biodone() or the buffer's data may not be
properly reverted.

Solve the problem by moving the io_start() call a little further on
in the code, after the potential brelse().

There is a possibility that this bug is responsible for the 'dirbad'
panics often reported in DragonFly and FreeBSD circles.

Revision  Changes    Path
1.16      +7 -6      src/sys/kern/vfs_cluster.c

http://www.DragonFlyBSD.org/cvsweb/src/sys/kern/vfs_cluster.c.diff?r1=1.15&r2=1.16&f=u

Below is the equivalent patch to the FreeBSD RELENG_6 branch of
src/sys/kern/vfs_cluster.c

Hope this helps track down the problem.
>How-To-Repeat:
mount
>Fix:
--- src/sys/kern/vfs_cluster.c.orig	Fri Oct 28 03:28:27 2005
+++ src/sys/kern/vfs_cluster.c	Tue Feb 28 09:27:20 2006
@@ -881,11 +881,6 @@
 				bremfree(tbp);
 				tbp->b_flags &= ~B_DONE;
 			} /* end of code for non-first buffers only */
-			/* check for latent dependencies to be handled */
-			if ((LIST_FIRST(&tbp->b_dep)) != NULL) {
-				tbp->b_iocmd = BIO_WRITE;
-				buf_start(tbp);
-			}
 			/*
 			 * If the IO is via the VM then we do some
 			 * special VM hackery (yuck).  Since the buffer's
@@ -933,6 +928,11 @@
 			BUF_KERNPROC(tbp);
 			TAILQ_INSERT_TAIL(&bp->b_cluster.cluster_head,
 				tbp, b_cluster.cluster_entry);
+			/* check for latent dependencies to be handled */
+			if ((LIST_FIRST(&tbp->b_dep)) != NULL) {
+				tbp->b_iocmd = BIO_WRITE;
+				buf_start(tbp);
+			}
 		}
 	finishcluster:
 		pmap_qenter(trunc_page((vm_offset_t) bp->b_data),
>Release-Note:
>Audit-Trail:
>Unformatted:
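The ordering problem described in the quoted commit log can be
illustrated with a small userland toy model.  This is only a sketch
under stated assumptions: struct toy_buf and every toy_* and cluster_*
name below are hypothetical and exist only for this example; none of
it is kernel code, and the real softupdates rollback is far more
involved.  The model assumes that io_start rewrites the buffer
contents for the pending write, that only biodone restores them, and
that brelse releases the buffer without restoring anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy buffer; NOT the kernel's struct buf. */
struct toy_buf {
	int  data;         /* current in-memory contents */
	int  saved;        /* contents stashed by toy_io_start() */
	bool rolled_back;  /* true while data holds the rewritten copy */
};

/* Stand-in for io_start: rewrite the contents for the pending write. */
static void
toy_io_start(struct toy_buf *b)
{
	assert(!b->rolled_back);
	b->saved = b->data;
	b->data = 0;
	b->rolled_back = true;
}

/* Stand-in for biodone: the I/O finished, restore the contents. */
static void
toy_biodone(struct toy_buf *b)
{
	if (b->rolled_back) {
		b->data = b->saved;
		b->rolled_back = false;
	}
}

/* Stand-in for brelse: release without I/O; nothing is restored. */
static void
toy_brelse(struct toy_buf *b)
{
	(void)b;
}

/* Old order: io_start runs before the point where we may bail out. */
static int
cluster_buggy(struct toy_buf *b, bool must_release)
{
	toy_io_start(b);
	if (must_release) {
		toy_brelse(b);   /* rewritten contents are left behind */
		return (b->data);
	}
	toy_biodone(b);
	return (b->data);
}

/* Patched order: io_start runs only once I/O is certain to happen. */
static int
cluster_fixed(struct toy_buf *b, bool must_release)
{
	if (must_release) {
		toy_brelse(b);   /* contents were never touched */
		return (b->data);
	}
	toy_io_start(b);
	toy_biodone(b);
	return (b->data);
}
```

In this model a buffer that takes the release path keeps its original
contents only with the patched ordering, which is the same effect the
diff above aims for by moving the buf_start() call after the point
where the buffer can still be released.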