Date:      Sun,  9 Feb 2003 14:06:39 -0800 (PST)
From:      Arun Sharma <adsharma@sharma-home.net>
To:        FreeBSD-gnats-submit@FreeBSD.org
Cc:        smp@FreeBSD.org
Subject:   kern/48117: SMP machine hang during boot related to idle proc and sched_lock
Message-ID:  <20030209220639.A55C02E@astra.mirabella.net>

>Number:         48117
>Category:       kern
>Synopsis:       SMP machine hang during boot related to idle proc and sched_lock
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Feb 09 14:10:04 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Arun Sharma
>Release:        FreeBSD 5.0-CURRENT i386
>Organization:
>Environment:
System: FreeBSD astra.mirabella.net 5.0-CURRENT FreeBSD 5.0-CURRENT #16: Sat Feb 8 09:08:58 PST 2003 root@astra.mirabella.net:/usr/src/sys/i386/compile/astra i386


>Description:

The machine hangs intermittently during boot on a 2-way SMP box. In some of those hangs it drops into ddb, where I collected the following:

db> show pcpu
cpuid        = 0
curthread    = 0xc0d19380: pid 46 "sh"
curpcb       = 0xcad54da0
fpcurthread  = none
idlethread   = 0xc0d18b60: pid 12 "idle: cpu0"
currentldt   = 0x28
db> tr
Debugger(c0364696,0,c036423d,cad54a64,1) at Debugger+0x55
panic(c036423d,c036426b,c0d18a80,0,cad54af8) at panic+0x11f
_mtx_lock_spin(c038b6c0,2,0,0,c1fc4dc8) at _mtx_lock_spin+0x93
hardclock_process(cad54af8,0,c02f682b,20,0) at hardclock_process+0x76
hardclock(cad54af8,c0cf239c,c0334d57,c0829000,c1fc8b28) at hardclock+0x18
clkintr(0) at clkintr+0xec
Xfastintr0() at Xfastintr0+0xba
--- interrupt, eip = 0xc01cc580, esp = 0xcad54b3c, ebp = 0xcad54b58 ---
_mtx_lock_spin(c038b6c0,0,0,0,0) at _mtx_lock_spin+0x50
vm_fault(c0d1f114,80f8000,2,8,c0d19380) at vm_fault+0x1379
trap_pfault(cad54d48,1,80f8a78,202,80f8a78) at trap_pfault+0x125
trap(2f,2f,2f,2f,80fc000) at trap+0x2a3
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0x8052653, esp = 0xbfbff304, ebp = 0xbfbff308 ---
db> show pcpu 1
cpuid        = 1
curthread    = 0xc0d18a80: pid 11 "idle: cpu1"
curpcb       = 0xcad36da0
fpcurthread  = none
idlethread   = 0xc0d18a80: pid 11 "idle: cpu1"
currentldt   = 0x28
db> show msgbuf
[...]
panic: spin lock sched lock held by 0xc0d18a80 for > 5 seconds
cpuid = 0; lapic.id = 00000000

The only piece not captured above is the stack of the
idle process (pid 11, "idle: cpu1"), which was in mi_switch().

The kernel was built without INVARIANTS and WITNESS.
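
For readers unfamiliar with the "held by ... for > 5 seconds" message in
the msgbuf, here is a minimal user-space sketch in C of how a spin-lock
acquire path can give up after a bound and name the current holder. It is
not the FreeBSD _mtx_lock_spin() code: struct spin_lock, spin_init(),
spin_lock_bounded(), spin_unlock(), and the spin-count bound (standing in
for the 5-second limit) are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space model; NOT the kernel's struct mtx. */
#define SPIN_UNOWNED ((uintptr_t)0)

struct spin_lock {
	const char *name;          /* label used when reporting a stuck lock */
	_Atomic uintptr_t owner;   /* id of the holder, or SPIN_UNOWNED */
};

static void
spin_init(struct spin_lock *l, const char *name)
{
	l->name = name;
	atomic_init(&l->owner, SPIN_UNOWNED);
}

/*
 * Try to take the lock for at most max_spins attempts.
 * Returns 0 on success. On timeout returns -1 and stores the last
 * observed holder in *holder so the caller can report who has the lock.
 */
static int
spin_lock_bounded(struct spin_lock *l, uintptr_t self,
    unsigned long max_spins, uintptr_t *holder)
{
	uintptr_t seen = atomic_load(&l->owner);

	for (unsigned long i = 0; i < max_spins; i++) {
		uintptr_t expected = SPIN_UNOWNED;

		/* Strong CAS: fails only if someone else owns the lock. */
		if (atomic_compare_exchange_strong(&l->owner, &expected, self))
			return (0);
		seen = expected;   /* a failed CAS leaves the owner here */
	}
	*holder = seen;
	return (-1);
}

static void
spin_unlock(struct spin_lock *l, uintptr_t self)
{
	/* Only the holder may release the lock. */
	assert(atomic_load(&l->owner) == self);
	atomic_store(&l->owner, SPIN_UNOWNED);
}
```

In this model, a second caller that tries to take a lock already owned by
id 0xc0d18a80 gets -1 back with *holder set to 0xc0d18a80, which is the
same shape of information the panic string above reports.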

>How-To-Repeat:

	Boot the SMP kernel repeatedly.

>Fix:

	Not known. The next step is to find out why the idle proc (cpu1) sat
	in mi_switch(), holding sched_lock, for more than 5 seconds.

>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message