From: Eric Anderson
Date: Tue, 14 Mar 2006 07:32:06 -0600
To: Uwe Doering
Cc: freebsd-stable@freebsd.org
Subject: Re: panic: ffs_valloc: dup alloc
Message-ID: <4416C5D6.3020501@centtech.com>
In-Reply-To: <441684F3.8030401@geminix.org>
References: <4415E8BB.1080602@centtech.com> <441684F3.8030401@geminix.org>

Uwe Doering wrote:
> Eric Anderson wrote:
>> I get the above panic after NFS clients attach to this NFS server and
>> begin read/write ops on it after an unclean shutdown. I've fsck'ed
>> the fs, and it marks it as clean, but I get this every time. It's an
>> NFS share of a GEOM stripe (about 2TB).
>>
>> mode = 0100600, inum = 58456203, fs = /mnt
>> panic: ffs_valloc: dup alloc
>
> Do you happen to have disk mirroring on this server (RAID 1)? At
> work, on a workstation with RAID 1, we once had a case where after a
> power failure fsck would succeed, but subsequently, when mounting and
> using the partitions, the kernel still panicked because of a corrupt
> filesystem. Repeatedly.
>
> This caused some major head scratching on our part until we figured
> out what was happening. The mirrored disks had gone out of sync. For
> performance reasons, a RAID 1 controller reads data from one disk
> drive or the other, depending on which drive is less busy at that
> particular moment. So while fsck was able to find and fix some
> filesystem inconsistencies, there were still some more left in disk
> sectors it didn't access.
>
> The RAID controller we used turned out to have a verification mode
> where it would scan the disks and re-synchronize them. Afterwards we
> did another fsck run, and this fixed the remaining filesystem
> inconsistencies. The kernel panics were gone.
>
> Now, with the information you've provided I can't tell whether these
> findings apply to your case, but perhaps this story helps at least
> others in a similar situation.

I do have mirroring enabled on the OS drives, but this is happening with
an external Fibre Channel array of SATA disks, striped using gstripe.

Eric

-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------
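
For anyone reading this in the archive, the out-of-sync mirror story quoted above can be illustrated with a toy model. This is only a sketch under assumptions: the Mirror class, toy_fsck(), the "active" disk selector and the block contents are all made up for illustration, and none of it reflects how a real RAID 1 controller or fsck_ffs works internally. It also does not model the gstripe setup described in this message.

```python
GOOD, BAD = "ok", "corrupt"


class Mirror:
    """Toy RAID 1 set (hypothetical): writes hit both disks, reads hit one."""

    def __init__(self, disk0, disk1):
        self.disks = [list(disk0), list(disk1)]
        self.active = 0  # stands in for "whichever drive is less busy"

    def read(self, block):
        # A read is served by only one of the two copies.
        return self.disks[self.active][block]

    def write(self, block, value):
        # A write updates both copies.
        for disk in self.disks:
            disk[block] = value

    def resync(self, source):
        # Controller "verification mode": copy one disk over the other.
        self.disks[1 - source] = list(self.disks[source])


def toy_fsck(mirror, nblocks):
    """Read every block once and repair the bad ones it happens to see."""
    repaired = 0
    for block in range(nblocks):
        if mirror.read(block) == BAD:
            mirror.write(block, GOOD)
            repaired += 1
    return repaired


def demo():
    # After the power failure the two copies disagree in blocks 1 and 3.
    m = Mirror([GOOD, BAD, GOOD, GOOD], [GOOD, GOOD, GOOD, BAD])

    m.active = 0
    first = toy_fsck(m, 4)   # sees and repairs block 1 only

    m.active = 1
    later = m.read(3)        # a later read lands on the stale copy

    m.resync(source=1)       # both disks now agree, bad block included
    m.active = 0
    second = toy_fsck(m, 4)  # the bad block is now visible and gets fixed

    m.active = 1
    final = [m.read(b) for b in range(4)]
    return first, later, second, final


if __name__ == "__main__":
    print(demo())
```

The first toy_fsck() pass reports success while a bad block survives on the disk it never read from. Only after the resync does a second pass see and fix it, which matches the sequence of events in the quoted story.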