From owner-freebsd-bugs@FreeBSD.ORG  Tue Feb 28 15:40:12 2006
Return-Path: <owner-freebsd-bugs@FreeBSD.ORG>
X-Original-To: freebsd-bugs@hub.freebsd.org
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6C84516A420
	for <freebsd-bugs@hub.freebsd.org>;
	Tue, 28 Feb 2006 15:40:12 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5545043D5D
	for <freebsd-bugs@hub.freebsd.org>;
	Tue, 28 Feb 2006 15:40:10 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id k1SFe7G8067982
	for <freebsd-bugs@freefall.freebsd.org>; Tue, 28 Feb 2006 15:40:07 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.13.4/8.13.4/Submit) id k1SFe7DW067981;
	Tue, 28 Feb 2006 15:40:07 GMT (envelope-from gnats)
Resent-Date: Tue, 28 Feb 2006 15:40:07 GMT
Resent-Message-Id: <200602281540.k1SFe7DW067981@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Yarema <yds@CoolRat.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0203C16A420
	for <FreeBSD-gnats-submit@freebsd.org>;
	Tue, 28 Feb 2006 15:35:39 +0000 (GMT)
	(envelope-from root@CoolRat.org)
Received: from CoolRat.org (c-69-242-5-144.hsd1.pa.comcast.net [69.242.5.144])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DBC1543D45
	for <FreeBSD-gnats-submit@freebsd.org>;
	Tue, 28 Feb 2006 15:35:37 +0000 (GMT)
	(envelope-from root@CoolRat.org)
Received: from localhost (localhost [127.0.0.1]) (uid 0)
	by CoolRat.org with local; Tue, 28 Feb 2006 10:35:36 -0500
	id 003CBC1F.44046DC8.000006A2
Message-Id: <courier.44046DC8.000006A2@CoolRat.org>
Date: Tue, 28 Feb 2006 10:35:36 -0500
From: Yarema <yds@CoolRat.org>
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Cc: Dennis Koegel <amf@hobbit.neveragain.de>, Doug White <dwhite@gumbysoft.com>,
	Martin Machacek <m@m3a.net>
Subject: kern/93942: panic: ufs_dirbad: bad dir
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Yarema <yds@CoolRat.org>
List-Id: Bug reports <freebsd-bugs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-bugs>
List-Post: <mailto:freebsd-bugs@freebsd.org>
List-Help: <mailto:freebsd-bugs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 28 Feb 2006 15:40:12 -0000


>Number:         93942
>Category:       kern
>Synopsis:       panic: ufs_dirbad: bad dir
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Feb 28 15:40:06 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Yarema <yds@CoolRat.org>
>Release:        FreeBSD 6.1-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD 6.1-PRERELEASE #0: Mon Feb 27 04:52:11 EST 2006 i386

>Description:

This is at least the third file system, on three different hard drives,
that the ufs_dirbad bug has hosed for me since 5.3-STABLE.
I suspect this is related to the following PRs:
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=49079
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=51001

In every case a process would lock up, making the whole system
unresponsive.  After a reboot, fsck -y in single-user mode, and another
reboot, mounting the corrupt fs in rw mode would produce the following:

bad dir ino 2 at  offset 16384: mangled entry
panic: ufs_dirbad: bad dir
cpuid = 0

Repeating the reboot, fsck -y in single-user mode, and reboot produces
the same result every time.  Previously I had recovered by mounting the
corrupt fs in ro mode, backing it up, running newfs, and restoring.

Recently I noticed that Matthew Dillon committed the following to the
DragonFly src repository:

http://leaf.DragonFlyBSD.org/mailarchive/commits/2006-02/msg00057.html

dillon      2006/02/21 10:46:56 PST

DragonFly src repository

  Modified files:
    sys/kern             vfs_cluster.c 
  Log:
  bioops.io_start() was being called in a situation where the buffer could
  be brelse()'d afterwords instead of I/O being initiated.  When this occurs,
  the buffer may contain softupdates-modified data which is never reverted,
  resulting in serious filesystem corruption.  When io_start is called on a
  buffer, I/O MUST be initiated and terminated with a biodone() or the buffer's
  data may not be properly reverted.
  
  Solve the problem by moving the io_start() call a little further on in the
  code, after the potential brelse().
  
  There is a possibility that this bug is responsible for the 'dirbad' panics
  often reported in DragonFly and FreeBSD circles.
  
  Revision  Changes    Path
  1.16      +7 -6      src/sys/kern/vfs_cluster.c

http://www.DragonFlyBSD.org/cvsweb/src/sys/kern/vfs_cluster.c.diff?r1=1.15&r2=1.16&f=u
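
To make the ordering problem described in the log easier to see, here is
a toy C model of it.  This is only a sketch under stated assumptions:
struct toy_buf and every toy_* and cluster_add_* name are hypothetical
stand-ins invented for illustration, not the kernel's struct buf,
buf_start(), brelse() or biodone(), and the "rollback" is reduced to
overwriting a single int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer; NOT the kernel's struct buf. */
struct toy_buf {
	int  data;        /* in-memory contents */
	int  saved;       /* contents stashed by toy_io_start() */
	bool rolled_back; /* data currently holds the temporary image */
	bool has_deps;    /* buffer carries pending dependencies */
	bool busy;        /* buffer will be released instead of written */
};

enum { ROLLED_BACK = -1 };

/* Hypothetical io_start hook: temporarily rewrites the buffer. */
static void toy_io_start(struct toy_buf *bp)
{
	if (bp->has_deps && !bp->rolled_back) {
		bp->saved = bp->data;
		bp->data = ROLLED_BACK;
		bp->rolled_back = true;
	}
}

/* Hypothetical I/O completion: restores the original contents. */
static void toy_biodone(struct toy_buf *bp)
{
	if (bp->rolled_back) {
		bp->data = bp->saved;
		bp->rolled_back = false;
	}
}

/* Release without I/O: nothing here undoes toy_io_start(). */
static void toy_brelse(struct toy_buf *bp)
{
	(void)bp;
}

/* Old order: io_start runs before the point where we may bail out. */
static void cluster_add_early(struct toy_buf *bp)
{
	toy_io_start(bp);
	if (bp->busy) {
		toy_brelse(bp);
		return;
	}
	toy_biodone(bp); /* models I/O being issued and completed */
}

/* New order: bail out first, io_start only once I/O is certain. */
static void cluster_add_late(struct toy_buf *bp)
{
	if (bp->busy) {
		toy_brelse(bp);
		return;
	}
	toy_io_start(bp);
	toy_biodone(bp);
}

/* True if the buffer still holds its original contents afterwards. */
static bool survives(void (*add)(struct toy_buf *), bool busy)
{
	struct toy_buf b = { .data = 42, .has_deps = true, .busy = busy };

	assert(add != NULL);
	add(&b);
	return b.data == 42 && !b.rolled_back;
}
```

In this model survives(cluster_add_early, true) is false, because the
buffer is released while still holding the temporary image, whereas
cluster_add_late leaves the contents intact whether or not the buffer
is busy.  The real code paths are far more involved; the model only
shows why the hook has to come after the last possible release.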

Below is the equivalent patch to the FreeBSD RELENG_6 branch of
src/sys/kern/vfs_cluster.c

Hope this helps track down the problem.

>How-To-Repeat:
	mount <corrupt ufs>
>Fix:
--- src/sys/kern/vfs_cluster.c.orig	Fri Oct 28 03:28:27 2005
+++ src/sys/kern/vfs_cluster.c	Tue Feb 28 09:27:20 2006
@@ -881,11 +881,6 @@
 				bremfree(tbp);
 				tbp->b_flags &= ~B_DONE;
 			} /* end of code for non-first buffers only */
-			/* check for latent dependencies to be handled */
-			if ((LIST_FIRST(&tbp->b_dep)) != NULL) {
-				tbp->b_iocmd = BIO_WRITE;
-				buf_start(tbp);
-			}
 			/*
 			 * If the IO is via the VM then we do some
 			 * special VM hackery (yuck).  Since the buffer's
@@ -933,6 +928,11 @@
 			BUF_KERNPROC(tbp);
 			TAILQ_INSERT_TAIL(&bp->b_cluster.cluster_head,
 				tbp, b_cluster.cluster_entry);
+			/* check for latent dependencies to be handled */
+			if ((LIST_FIRST(&tbp->b_dep)) != NULL) {
+				tbp->b_iocmd = BIO_WRITE;
+				buf_start(tbp);
+			}
 		}
 	finishcluster:
 		pmap_qenter(trunc_page((vm_offset_t) bp->b_data),
>Release-Note:
>Audit-Trail:
>Unformatted: