From: Scott Oertel
Date: Sun, 11 Mar 2007 22:58:57 -0500
To: freebsd-hackers@freebsd.org
Subject: 6.2-RELEASE Kernel Panic (thread taskq)
Message-ID: <45F4D001.4030901@scottevil.com>

Hello all,

I have 8 machines running 6.2-RELEASE, all under pretty heavy load. All
but one run dual Opterons on Tyan motherboards; the other runs 2x
dual-core Xeons on a Supermicro motherboard. I am getting this panic on
all of the machines: on some it happens once a month, on others every
few days.

The only things they all have in common now are 3ware 9550SX SATA RAID
controllers and a PAE/SMP kernel.

I have done a lot of searching and found other people with the same
issue, but nobody seems to have a fix for it, or the threads eventually
die out. I'm trying to get a dump, but the dumps seem to be corrupted
when I load them into the debugger. I am going to attempt some online
debugging the next time I hit one of these panics.

Anyway, the panic message is below. I have access to a serial console on
all the machines, and I've enabled the kernel option to drop to the
debugger on panic. Does anyone have any advice, or a fix, for this
issue?

Here is the closest thing I got with nm from the instruction pointer:

[root@xxx ~]# nm -n /boot/kernel/kernel | grep c03f60
c03f6050 T _mtx_lock_sleep

-------------

kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 07
fault virtual address   = 0x104
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc03f60ed
stack pointer           = 0x28:0xe8e05c90
frame pointer           = 0x28:0xe8e05c9c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 5 (thread taskq)
trap number             = 12
panic: page fault
cpuid = 3
Uptime: 2d19h50m39s
Cannot dump. No dump device defined.   ### NOTE: I've defined the dump device now.
Automatic reboot in 15 seconds - press a key on the console to abort

Thanks,
Scott Oertel
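
For anyone repeating the nm lookup above on their own panic, here is a rough sketch of doing it without grepping on an address prefix: it takes `nm -n` style output and finds the last symbol at or below the instruction pointer, plus the offset into it. The helper names (`parse_nm`, `resolve`) are made up for illustration, and the two `hypothetical_*` entries in the sample are placeholders, not real kernel symbols; only the `_mtx_lock_sleep` line and the instruction pointer come from the report.

```python
import bisect


def parse_nm(text):
    """Parse `nm -n` output lines of the form '<hex addr> <type> <name>'."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and undefined symbols (no address)
        addr, _sym_type, name = parts
        try:
            symbols.append((int(addr, 16), name))
        except ValueError:
            continue  # first field was not a hex address
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) of the closest symbol at or below addr, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address is below every known symbol
    base, name = symbols[i]
    return name, addr - base


# Only the middle line is from the report; the neighbours are invented
# placeholders so the search has something to choose between.
SAMPLE = """\
c03f5f00 T hypothetical_prev_symbol
c03f6050 T _mtx_lock_sleep
c03f6200 T hypothetical_next_symbol
"""

if __name__ == "__main__":
    name, offset = resolve(parse_nm(SAMPLE), 0xC03F60ED)
    print(f"{name}+{offset:#x}")
```

In practice the input would be the full output of `nm -n /boot/kernel/kernel` rather than the sample string. With the values from the report, the instruction pointer 0xc03f60ed is 0x9d bytes past the start of _mtx_lock_sleep at 0xc03f6050.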