Date:      Thu, 17 May 2012 02:44:31 -0700 (PDT)
From:      kashyap <Kashyap.Desai@lsi.com>
To:        freebsd-bugs@freebsd.org
Subject:   FREEBSD-8.3 Fatal trap 28: machine check trap while in kernel mode
Message-ID:  <1337247871932-5709169.post@n5.nabble.com>

The following observation was made at the LSI test lab:

On FreeBSD, while running I/O on bare drives together with an HBA reset every
30 seconds, a kernel panic occurs after 3 to 5 hours.

We observed this issue on two different test setups. The setup details are
given below.


Topology 1:
-----------
Server: Supermicro MSIx
OS: FreeBSD 9.0 i386
Controller: 2308
Drives: Milehigh enclosure with 512native SAS drives.

Connectivity: HBA -> Enclosure Drives

Test: I/O on bare drives along with an HBA reset every 30 seconds.

Result: A kernel panic occurs after 3 to 5 hours with the error message
given below.

============================================================================================
Fatal trap 28: machine check trap while in kernel mode

cpuid = 15; Fatal trap 28: machine check trap while in kernel mode
apic id = 17
cpuid = 14; instruction pointer    = 0x20:0xc0d304c5
apic id = 16
stack pointer            = 0x28:0xc5246c54
instruction pointer          = 0x20:0xc0d304c5
frame pointer            = 0x28:0xc5246c54
MCA: CPU 1 SourceMCA: Global Cap 0x0000000001000c16, Status
0x0000000000000005
code segment                = base 0x0, limit 0xfffff, type 0x1b
stack pointer            = 0x28:0xc5243c54
                                    = DPL 0, pres 1, def32 1, gran 1
frame pointer            = 0x28:0xc5243c54
processor eflags            = code segment            = base 0x0, limit
0xfffff, type 0x1b
interrupt enabled,  DWR BUSL2 UNCOR IOPL = 0
                                    = DPL 0, pres 1, def32 1, gran 1
PCC MCA: Vendor "GenuineIntel", ID 0x206e6, APIC ID 3
MCA: Bank 18, Status 0xb60000000000094a
MCA: CPU 3 MCA: Global Cap 0x0000000001000c16, Status 0x0000000000000005
UNCOR MCA: Vendor "GenuineIntel", ID 0x206e6, APIC ID 2
PCC MCA: CPU 2 BUSL2 UNCOR SourcePCC  DWR BUSL2 I/OSource timed out DWR
processor eflags  = current process                        = interrupt
enabled, 11 (idle: cpu15)
IOPL = 0
trap number                  = 28
current process              = panic: machine check trap
11 (idle: cpu14)
cpuid = 15
trap number                  = 28
KDB: stack backtrace:
SourceI/O DWR  timed outI/O
timed outMCA: Address 0xfaef00c0

BUSL2
I/O#0 0xc0a4b157 at kdb_backtrace+0x47
MCA: Address 0xfaef00c0

Fatal trap 28: machine check trap while in kernel mode
Fatal trap 28: machine check trap while in kernel mode
cpuid = 12; cpuid = 13; apic id = 14
apic id = 15
instruction pointer          = 0x20:0xc0a081d0
instruction pointer          = 0x20:0xc0d304c5
stack pointer            = 0x28:0xdf1dfc48
stack pointer            = 0x28:0xc5240c54
frame pointer            = 0x28:0xdf1dfc60
frame pointer            = 0x28:0xc5240c54
code segment                = base 0x0, limit 0xfffff, type 0x1b
code segment                = base 0x0, limit 0xfffff, type 0x1b
                                    = DPL 0, pres 1, def32 1, gran 1
                                    = DPL 0, pres 1, def32 1, gran 1
processor eflags            = processor eflags        = interrupt enabled,
interrupt enabled, IOPL = 0
IOPL = 0
current process              = current process                      = 2
(mps_scan0)
11 (idle: cpu13)
trap number                  = 28
trap number                  = 28
#1 0xc0a186b7 at panic+0x117
Source timed outMCA: Address 0xfaef00c0


MCA: Address 0xfaef00c0

==========================================================================================
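The MCA lines in the trace above can be cross-checked by decoding the raw
register values by hand. Below is a minimal Python sketch, assuming the
IA32_MCi_STATUS, IA32_MCG_CAP and IA32_MCG_STATUS bit layouts described in the
Intel SDM; the function and table names are only illustrative and are not
taken from the FreeBSD sources. It handles only the "bus and interconnect"
compound error code, which is the form that appears in this log.

```python
# Flag bits of IA32_MCi_STATUS (bit number -> name), per the Intel SDM layout.
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # IA32_MCi_MISC is valid
    58: "ADDRV",  # IA32_MCi_ADDR is valid
    57: "PCC",    # processor context corrupt
}

# Field encodings of the bus/interconnect compound error code.
PP = {0: "SRC", 1: "RES", 2: "OBS", 3: "generic"}
RRRR = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
        5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
II = {0: "M", 1: "reserved", 2: "IO", 3: "other"}
LL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = [name for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if status & (1 << bit)]
    code = status & 0xFFFF
    result = {"flags": flags, "mca_error_code": code}
    # Bus/interconnect errors match the pattern 0000 1PPT RRRR IILL.
    if code & 0xF800 == 0x0800:
        result["bus"] = {
            "participation": PP[(code >> 9) & 0x3],
            "timeout": bool(code & 0x100),
            "request": RRRR.get((code >> 4) & 0xF, "unknown"),
            "memory_or_io": II[(code >> 2) & 0x3],
            "level": LL[code & 0x3],
        }
    return result


def decode_mcg_cap(cap):
    """Extract the bank count and a few capability bits from MCG_CAP."""
    return {
        "banks": cap & 0xFF,
        "MCG_CTL_P": bool(cap & (1 << 8)),
        "MCG_CMCI_P": bool(cap & (1 << 10)),
        "MCG_TES_P": bool(cap & (1 << 11)),
        "MCG_SER_P": bool(cap & (1 << 24)),
    }


def decode_mcg_status(status):
    """Extract the RIPV, EIPV and MCIP bits from MCG_STATUS."""
    return {
        "RIPV": bool(status & 0x1),
        "EIPV": bool(status & 0x2),
        "MCIP": bool(status & 0x4),
    }


if __name__ == "__main__":
    # Values copied from the "MCA:" lines of the trace above.
    print(decode_mci_status(0xb60000000000094a))
    print(decode_mcg_cap(0x0000000001000c16))
    print(decode_mcg_status(0x0000000000000005))
```

For the bank 18 status in the trace, this yields an uncorrected (UC) error
with processor context corrupt (PCC) and a valid address, and a bus error code
of "SRC, timed out, DWR, IO, L2". That agrees with the fragments printed in
the trace ("UNCOR", "PCC", "BUSL2", "Source", "DWR", "I/O", "timed out").
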


Topology 2:
-----------
Server: DELL T410
OS: FreeBSD 8.3 i386
Controller: 2208
Drives: Camdon enclosure with 512native SAS drives.

Connectivity: HBA -> Enclosure Drives

Test: I/O on bare drives along with an HBA reset every 30 seconds.

Result: After 3 to 5 hours the system hits a kernel panic, as given below.

=======================================================================================

Fatal trap 28: machine check trap while in kernel mode
Fatal trap 28: machine check trap while in kernel mode
cpuid = 1; cpuid = 3; apic id = 22
apic id = 34
instruction pointer    = 0x20:0xc0c27b85
instruction pointer    = 0x20:0xc0c27b85
stack pointer          = 0x28:0xc6f7ac64
stack pointer          = 0x28:0xc6f74c64
frame pointer          = 0x28:0xc6f7ac64
frame pointer          = 0x28:0xc6f74c64
code segment            = base 0x0, limit 0xfffff, type 0x1b
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = processor eflags      = interrupt enabled,
interrupt enabled, IOPL = 0
IOPL = 0
current process        = current process              = 11 (idle: cpu1)
11 (idle: cpu3)
trap number            = 28
trap number            = 28
panic: machine check trap
cpuid = 1
KDB: stack backtrace:

#0 0xc090f017 at kdb_backtrace+0x47
Fatal trap 28: machine check trap while in kernel mode
#1 0xc08df457 at panic+0x117
cpuid = 2; #2 0xc0c46753 at trap_fatal+0x323
apic id = 32
#3 0xc0c471d2 at trap+0xc2
instruction pointer    = 0x20:0xc0c27b85
#4 0xc0c2d8dc at calltrap+0x6
stack pointer          = 0x28:0xc6f77c64
#5 0xc04f4eb9 at acpi_cpu_idle+0xe9
frame pointer          = 0x28:0xc6f77c64
#6 0xc0c3762b at cpu_idle_acpi+0x1b
code segment            = base 0x0, limit 0xfffff, type 0x1b
#7 0xc0c38ebb at cpu_idle+0x1b
                        = DPL 0, pres 1, def32 1, gran 1
#8 0xc0901c51 at sched_idletd+0x231
processor eflags        = #9 0xc08b2987 at fork_exit+0x97
interrupt enabled, #10 0xc0c2d954 at fork_trampoline+0x8
IOPL = 0
Uptime: 3h59m32s
current process        = 11 (idle: cpu2)
trap number            = 28

Fatal trap 28: machine check trap while in kernel mode
Fatal trap 28: machine check trap while in kernel mode
cpuid = 1; cpuid = 3; apic id = 22
=========================================================================================
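Because several CPUs write to the console at the same time, the backtrace
frames in the trace above are interleaved with the register dump of another
CPU. A small Python sketch like the following can pull the frames back out of
the mixed text. The regular expression and the function name are assumptions
made for illustration, and the approach only works when a single CPU printed a
backtrace; frames from two overlapping backtraces would be merged.

```python
import re

# Matches frames of the form "#5 0xc04f4eb9 at acpi_cpu_idle+0xe9",
# wherever they appear inside a console line.
FRAME_RE = re.compile(r"#(\d+) (0x[0-9a-f]+) at (\w+)\+(0x[0-9a-f]+)")


def extract_backtrace(console_text):
    """Return (index, address, function, offset) tuples ordered by index."""
    frames = [
        (int(index), int(addr, 16), func, int(offset, 16))
        for index, addr, func, offset in FRAME_RE.findall(console_text)
    ]
    return sorted(frames)


# A few lines copied from the Topology 2 trace.
SAMPLE = """\
#4 0xc0c2d8dc at calltrap+0x6
stack pointer          = 0x28:0xc6f77c64
#5 0xc04f4eb9 at acpi_cpu_idle+0xe9
processor eflags        = #9 0xc08b2987 at fork_exit+0x97
interrupt enabled, #10 0xc0c2d954 at fork_trampoline+0x8
"""

if __name__ == "__main__":
    for index, addr, func, offset in extract_backtrace(SAMPLE):
        print(f"#{index} {addr:#x} at {func}+{offset:#x}")
```

Applied to the full Topology 2 trace, the recovered frames run from
kdb_backtrace (#0) to fork_trampoline (#10), with acpi_cpu_idle at #5, which
matches the "current process = 11 (idle: cpu1)" line of the panicking CPU.
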



Is there any reason why we started seeing this issue when we moved from
FreeBSD 8.2 to FreeBSD 8.3?
This is the first time we have observed this issue, and we suspect that
something in FreeBSD 8.3 is causing it.

We now plan to retry the test with the hw.mca.enabled=0 option.

Any help will be appreciated.

`Kashyap

--
View this message in context: http://freebsd.1045724.n5.nabble.com/FREEBSD-8-3-Fatal-trap-28-machine-check-trap-while-in-kernel-mode-tp5709169.html
Sent from the freebsd-bugs mailing list archive at Nabble.com.


