From owner-freebsd-bugs@FreeBSD.ORG Thu May 17 09:44:32 2012
Return-Path:
Delivered-To: freebsd-bugs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id C33E8106566C
	for ; Thu, 17 May 2012 09:44:32 +0000 (UTC)
	(envelope-from Kashyap.Desai@lsi.com)
Received: from sam.nabble.com (sam.nabble.com [216.139.236.26])
	by mx1.freebsd.org (Postfix) with ESMTP id 9D1368FC0A
	for ; Thu, 17 May 2012 09:44:32 +0000 (UTC)
Received: from [192.168.236.26] (helo=sam.nabble.com)
	by sam.nabble.com with esmtp (Exim 4.72)
	(envelope-from ) id 1SUxG3-0002mn-Uj
	for freebsd-bugs@freebsd.org; Thu, 17 May 2012 02:44:31 -0700
Date: Thu, 17 May 2012 02:44:31 -0700 (PDT)
From: kashyap
To: freebsd-bugs@freebsd.org
Message-ID: <1337247871932-5709169.post@n5.nabble.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: FREEBSD-8.3 Fatal trap 28: machine check trap while in kernel mode
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 17 May 2012 09:44:32 -0000

The following observation comes from the LSI test lab:

On FreeBSD, while running I/O on bare drives together with an HBA reset every 30 seconds, a kernel panic is hit after 3 to 5 hours. We observed this issue in two different test setups. The test setup details are given below.

Topology 1:
-----------
Server: Supermicro MSIx
OS: FreeBSD 9.0 i386
Controller: 2308
Drives: Milehigh enclosure with 512native SAS drives.
Connectivity: HBA -> Enclosure Drives
Test: I/O on bare drives together with an HBA reset every 30 seconds.
Result: A kernel panic is hit after 3 to 5 hours, with the error message given below.
============================================================================================
Fatal trap 28: machine check trap while in kernel mode cpuid = 15; Fatal trap 28: machine check trap while in kernel mode apic id = 17 cpuid = 14; instruction pointer = 0x20:0xc0d304c5 apic id = 16 stack pointer = 0x28:0xc5246c54 instruction pointer = 0x20:0xc0d304c5 frame pointer = 0x28:0xc5246c54 MCA: CPU 1 SourceMCA: Global Cap 0x0000000001000c16, Status 0x0000000000000005 code segment = base 0x0, limit 0xfffff, type 0x1b stack pointer = 0x28:0xc5243c54 = DPL 0, pres 1, def32 1, gran 1 frame pointer = 0x28:0xc5243c54 processor eflags = code segment = base 0x0, limit 0xfffff, type 0x1b interrupt enabled, DWR BUSL2 UNCOR IOPL = 0 = DPL 0, pres 1, def32 1, gran 1 PCC MCA: Vendor "GenuineIntel", ID 0x206e6, APIC ID 3 MCA: Bank 18, Status 0xb60000000000094a MCA: CPU 3 MCA: Global Cap 0x0000000001000c16, Status 0x0000000000000005 UNCOR MCA: Vendor "GenuineIntel", ID 0x206e6, APIC ID 2 PCC MCA: CPU 2 BUSL2 UNCOR SourcePCC DWR BUSL2 I/OSource timed out DWR processor eflags = current process = interrupt enabled, 11 (idle: cpu15) IOPL = 0 trap number = 28 current process = panic: machine check trap 11 (idle: cpu14) cpuid = 15 trap number = 28 KDB: stack backtrace: SourceI/O DWR timed outI/O timed outMCA: Address 0xfaef00c0 BUSL2 I/O#0 0xc0a4b157 at kdb_backtrace+0x47 MCA: Address 0xfaef00c0 Fatal trap 28: machine check trap while in kernel mode Fatal trap 28: machine check trap while in kernel mode cpuid = 12; cpuid = 13; apic id = 14 apic id = 15 instruction pointer = 0x20:0xc0a081d0 instruction pointer = 0x20:0xc0d304c5 stack pointer = 0x28:0xdf1dfc48 stack pointer = 0x28:0xc5240c54 frame pointer = 0x28:0xdf1dfc60 frame pointer = 0x28:0xc5240c54 code segment = base 0x0, limit 0xfffff, type 0x1b code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32
1, gran 1 processor eflags = processor eflags = interrupt enabled, interrupt enabled, IOPL = 0 IOPL = 0 current process = current process = 2 (mps_scan0) 11 (idle: cpu13) trap number = 28 trap number = 28 #1 0xc0a186b7 at panic+0x117 Source timed outMCA: Address 0xfaef00c0 MCA: Address 0xfaef00c0
==========================================================================================

Topology 2:
-----------
Server: DELL T410
OS: FreeBSD 8.3 i386
Controller: 2208
Drives: Camdon enclosure with 512native SAS drives.
Connectivity: HBA -> Enclosure Drives
Test: I/O on bare drives together with an HBA reset every 30 seconds.
Result: After 3 to 5 hours the system hits a kernel panic, as shown below.
=======================================================================================
Fatal trap 28: machine check trap while in kernel mode Fatal trap 28: machine check trap while in kernel mode cpuid = 1; cpuid = 3; apic id = 22 apic id = 34 instruction pointer = 0x20:0xc0c27b85 instruction pointer = 0x20:0xc0c27b85 stack pointer = 0x28:0xc6f7ac64 stack pointer = 0x28:0xc6f74c64 frame pointer = 0x28:0xc6f7ac64 frame pointer = 0x28:0xc6f74c64 code segment = base 0x0, limit 0xfffff, type 0x1b code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 = DPL 0, pres 1, def32 1, gran 1 processor eflags = processor eflags = interrupt enabled, interrupt enabled, IOPL = 0 IOPL = 0 current process = current process = 11 (idle: cpu1) 11 (idle: cpu3) trap number = 28 trap number = 28 panic: machine check trap cpuid = 1 KDB: stack backtrace: #0 0xc090f017 at kdb_backtrace+0x47 Fatal trap 28: machine check trap while in kernel mode #1 0xc08df457 at panic+0x117 cpuid = 2; #2 0xc0c46753 at trap_fatal+0x323 apic id = 32 #3 0xc0c471d2 at trap+0xc2 instruction pointer = 0x20:0xc0c27b85 #4 0xc0c2d8dc at calltrap+0x6 stack pointer = 0x28:0xc6f77c64 #5 0xc04f4eb9 at acpi_cpu_idle+0xe9 frame pointer = 0x28:0xc6f77c64 #6 0xc0c3762b at cpu_idle_acpi+0x1b
code segment = base 0x0, limit 0xfffff, type 0x1b #7 0xc0c38ebb at cpu_idle+0x1b = DPL 0, pres 1, def32 1, gran 1 #8 0xc0901c51 at sched_idletd+0x231 processor eflags = #9 0xc08b2987 at fork_exit+0x97 interrupt enabled, #10 0xc0c2d954 at fork_trampoline+0x8 IOPL = 0 Uptime: 3h59m32s current process = 11 (idle: cpu2) trap number = 28 Fatal trap 28: machine check trap while in kernel mode Fatal trap 28: machine check trap while in kernel mode cpuid = 1; cpuid = 3; apic id = 22
=========================================================================================

Is there any reason why we started seeing this issue when we moved from FreeBSD 8.2 to FreeBSD 8.3? This is the first time we have observed it, and we suspect that something in FreeBSD 8.3 is causing it.

We are now planning to try the hw.mca.enable=0 option.

Any help will be appreciated.

Kashyap

--
View this message in context: http://freebsd.1045724.n5.nabble.com/FREEBSD-8-3-Fatal-trap-28-machine-check-trap-while-in-kernel-mode-tp5709169.html
Sent from the freebsd-bugs mailing list archive at Nabble.com.
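
For reference, the "MCA: Bank 18, Status 0xb60000000000094a" value in the Topology 1 output can be decoded by hand. The following is a minimal sketch, not part of the original report. It assumes the IA32_MCi_STATUS bit layout and the bus/interconnect error-code format described in the Intel SDM machine-check chapter. The function name and the lookup tables are hypothetical and exist only for this illustration.

```python
# Sketch: decode an IA32_MCi_STATUS value such as the one printed in
# "MCA: Bank 18, Status 0xb60000000000094a".
# ASSUMPTION: bit positions and the bus/interconnect error-code format
# (0000 1PPT RRRR IILL) follow the Intel SDM machine-check chapter.
# All names below are hypothetical helpers, not a FreeBSD API.

FLAG_BITS = {63: "VAL", 62: "OVER", 61: "UNCOR", 60: "EN",
             59: "MISCV", 58: "ADDRV", 57: "PCC"}
PARTICIPATION = ["Source", "Responder", "Observer", "Generic"]
REQUEST = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
MEM_IO = ["Memory", "Reserved", "I/O", "Other"]
LEVEL = ["L0", "L1", "L2", "LG"]


def decode_mci_status(status):
    """Return the status flags and, for bus errors, the decoded sub-fields."""
    # Collect the names of the flag bits that are set, highest bit first.
    flags = [name for bit, name in sorted(FLAG_BITS.items(), reverse=True)
             if status & (1 << bit)]
    code = status & 0xFFFF  # MCA error code lives in the low 16 bits
    result = {"flags": flags, "mca_error_code": code}
    # Bus/interconnect errors have bits 15:11 equal to 00001.
    if (code & 0xF800) == 0x0800:
        result["bus"] = {
            "participation": PARTICIPATION[(code >> 9) & 0x3],
            "timed_out": bool((code >> 8) & 0x1),
            "request": REQUEST.get((code >> 4) & 0xF, "unknown"),
            "mem_io": MEM_IO[(code >> 2) & 0x3],
            "level": LEVEL[code & 0x3],
        }
    return result


if __name__ == "__main__":
    print(decode_mci_status(0xB60000000000094A))
```

Under these assumptions the decoded fields line up with the tokens the kernel printed in the log above (UNCOR, PCC, BUSL2, Source, DWR, I/O, timed out), and the ADDRV flag is consistent with the "MCA: Address 0xfaef00c0" lines. The decode only restates what the hardware reported. It does not by itself identify the cause.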