From owner-freebsd-bugs@FreeBSD.ORG  Tue Mar 22 12:46:38 2011
Return-Path: <owner-freebsd-bugs@FreeBSD.ORG>
Delivered-To: freebsd-bugs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5F65E106564A
	for <freebsd-bugs@freebsd.org>; Tue, 22 Mar 2011 12:46:38 +0000 (UTC)
	(envelope-from gpalmer@freebsd.org)
Received: from noop.in-addr.com (mail.in-addr.com [IPv6:2001:470:8:162::1])
	by mx1.freebsd.org (Postfix) with ESMTP id 3353D8FC0C
	for <freebsd-bugs@freebsd.org>; Tue, 22 Mar 2011 12:46:38 +0000 (UTC)
Received: from gjp by noop.in-addr.com with local (Exim 4.74 (FreeBSD))
	(envelope-from <gpalmer@freebsd.org>)
	id 1Q20yp-000Le7-SE; Tue, 22 Mar 2011 08:46:35 -0400
Date: Tue, 22 Mar 2011 08:46:35 -0400
From: Gary Palmer <gpalmer@freebsd.org>
To: Mickaël Canévet <canevet@embl.fr>
Message-ID: <20110322124635.GA1618@in-addr.com>
References: <1300791194.2566.37.camel@pc286.embl.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1300791194.2566.37.camel@pc286.embl.fr>
X-SA-Exim-Connect-IP: <locally generated>
X-SA-Exim-Mail-From: gpalmer@freebsd.org
X-SA-Exim-Scanned: No (on noop.in-addr.com); SAEximRunCond expanded to false
Cc: freebsd-bugs@freebsd.org
Subject: Re: "Fatal double fault" panic
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports <freebsd-bugs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-bugs>
List-Post: <mailto:freebsd-bugs@freebsd.org>
List-Help: <mailto:freebsd-bugs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 22 Mar 2011 12:46:38 -0000

On Tue, Mar 22, 2011 at 11:53:14AM +0100, Mickaël Canévet wrote:
> Hi,
> 
> I have a redundant NAS made of FreeBSD + HAST + ZFS and 24TB of disks.
> 
> This morning my primary node crashed around 4:20am.
> 
> On the console I can see:
> 
> Fatal double fault
> rip = 0xffffffff805e78b8
> rsp = 0xffffff8485d43fc0
> rbp = 0xffffff8485d44010
> cpuid = 1; apic id = 12
> panic: double fault
> cpuid = 1
> KDB: stack backstrace:
> #0 0xffffffff805f4e0e at kdb_backtrace+0x5e
> #1 0xffffffff805c2d07 at panic+0x187
> #2 0xffffffff808ac366 at dblfault_handler+0x96
> #3 0xffffffff808950bd at Xdblfault+0xad
> Uptime: 4d14h7m5s
> Cannot dump. Device not defined or unavailable.
> 
> The only thing I can see on my munin graphs is some strange I/O activity
> (disk, and network over my HAST link) that starts at 3am every morning
> and lasts about an hour and a half (so it was still running when the
> node crashed this morning). I double-checked my scheduled scripts and I
> do not run anything at that time, so I suspect a system script is
> responsible for this activity. I'm not sure that this I/O activity
> caused the crash, but it's the only lead I have.

3am (03:01, to be exact) is when the scripts in /etc/periodic/daily fire:

# grep daily /etc/crontab
# Perform daily/weekly/monthly maintenance.
1       3       *       *       *       root    periodic daily
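
To narrow down which daily job is generating the I/O, something along
these lines may help. It is only a sketch: it assumes the stock config
locations (/etc/defaults/periodic.conf, overridden by /etc/periodic.conf),
and the function name enabled_daily_jobs is made up for illustration.
It lists the daily_*_enable knobs that end up set to YES. The security
checks, which walk the local filesystems, are a usual suspect for heavy
disk activity, but that is a guess and not something the panic output
shows.

```shell
#!/bin/sh
# Print the daily periodic(8) knobs that are set to YES in the given
# config files. Later files override earlier ones, mirroring how
# /etc/periodic.conf overrides /etc/defaults/periodic.conf.
# NOTE: enabled_daily_jobs is a hypothetical helper, not a system tool.
enabled_daily_jobs() {
    cat "$@" 2>/dev/null |
    awk -F= '
        /^daily_[A-Za-z0-9_]*_enable=/ {
            val = $2
            sub(/[ \t]*#.*/, "", val)   # drop any trailing comment
            gsub(/"/, "", val)          # drop the surrounding quotes
            state[$1] = toupper(val)    # last assignment wins
        }
        END {
            for (k in state)
                if (state[k] == "YES")
                    print k
        }' | sort
}

# Assumed default locations; a missing file is silently skipped.
enabled_daily_jobs /etc/defaults/periodic.conf /etc/periodic.conf
```

Setting a suspect knob to "NO" in /etc/periodic.conf for a night and
comparing the munin graphs would show whether that job is the source.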


Regards,

Gary