Date:      Sun, 10 Dec 2006 12:57:14 +0000 (GMT)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Maxim Konovalov <maxim@macomnet.ru>
Cc:        freebsd-current@FreeBSD.org, yal <yal@yal.hopto.org>
Subject:   Re: CURRENT freezes on Latitude D520
Message-ID:  <20061210125011.F2296@fledge.watson.org>
In-Reply-To: <20061210123204.V52497@mp2.macomnet.net>
References:  <52944.192.168.1.110.1165679313.squirrel@yal.hopto.org> <20061209195519.B60055@mp2.macomnet.net> <20061209204924.N9926@fledge.watson.org> <20061210013735.D11309@mp2.macomnet.net> <20061210083752.G9926@fledge.watson.org> <20061210123204.V52497@mp2.macomnet.net>

On Sun, 10 Dec 2006, Maxim Konovalov wrote:

>>> I didn't suggest turning off mpsafenet forever and forgetting about it; I 
>>> just wanted to check my guess.  I would like to help debug the problem, 
>>> but I need some initial instructions to start.  There is a firewire 
>>> console.  What do I need to check?
>>
>> Start with the information in my followup e-mail to Andrew:
>>
>> - Configure WITNESS and see if you get any console output regarding
>>   lock order problems.
>
> Yes, there is one:
>
> lock order reversal
> 1st 0xd0f277c8 inp (rawinp) @ /usr/src/sys/netinet/raw_ip.c
> 2nd 0xd0ecbb54 wi0 (network driver) @ /usr/src/sys/modules/wi/../../dev/wi/if_wi.c
> KDB
> db_trace_self_wrapper(ce626f9d) at db_trace_self_wrapper+0x25
> kdb_backtrace(ffffffff,ce6a6378,ce6a6b20,ce65bd24,ce6e4ed0,...) at kdb_backtrace+0x29
> witness_checkorder(d0ecbb54,9,d0e73d13,388) at witness_checkorder+0x4db
> _mtx_lock_flags(d0ecbb54,0,d0e73d13,388,ce4d8cdd,...) at _mtx_lock_flags+0x1e
> wi_start(d0e05800) at wi_start+0x32
> if_start(d0e05800) at if_start+0x53
> ether_output_frame(d0e05800,d0d18100,0,1,0,...) at ether_output_frame+0x180
> ether_output(d0e05800,d0d18100,d0e652b0,d0e61bb8,ce6e6b18,...) at ether_output+0x3c0
> ieee80211_output(d0e05800,d0d18100,d0e652b0,d0e61bb8,0,...) at ieee80211_output+0x33
> ip_output(d0d18100,0,e1afbb38,20,0,...) at ip_output+0x7f0
> rip_output(d0d18100,d102ee44,1d2722c3,2000,e1afbbf0,...) at rip_output+0x29b
> rip_send(d102ee44,0,d0d18100,0,0,...) at rip_send+0x4f
> sosend_generic(d102ee44,0,0,d0d18100,0,...) at sosend_generic+0x3e1
> sosend(d102ee44,0,0,d0d18100,0,...) at sosend+0x22
> ng_ksocket_rcvdata(d10ab280,d104f750,1,e1afbc78,0,...) at ng_ksocket_rcvdata+0xa3
> ng_apply_item(d10ab200,d104f750,0,0,d10ab200,...) at ng_apply_item+0xf8
> ngintr(0) at ngintr+0x13d
> swi_net(0) at swi_net+0xba
> ithread_execute_handlers(d09acb40,d09dba00) at ithread_execute_handlers+0xce
> ithread_loop(d09dc180,e1afbd38,ce697af0,0,ce622832,328) at ithread_loop+0x4f
> fork_exit(ce4cdf0c,d09dc180,e1afbd38) at fork_exit+0x68
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xe1afbd6c, ebp = 0 ---
>
> At this point ifconfig wlan0 hangs, reboot hangs.
>
>> - Try setting net.isr.direct=0 and see if the problem goes away.
>
> This indeed helps.  The LOR is gone and wireless works.
>
>> - Try removing options PREEMPTION and see if the problem goes away.
>
> Haven't tried yet.

As speculated by others, this is a bug in the if_wi driver, which improperly 
holds a device driver lock over a call into the network stack.  While this can 
result in a deadlock under other circumstances, net.isr.direct makes the 
chances of that deadlock much greater.  It appears also that you have netgraph 
in the mix somehow, which might well also increase the chances of the deadlock 
triggering.  Someone(tm) needs to fix if_wi to operate properly with respect 
to the network stack lock order; another feature likely to trigger the same 
device driver bug is IP fast forwarding from a wireless interface.  Sam has 
mentioned to me that this same bug exists in several wireless drivers.
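
For readers unfamiliar with this class of bug, the usual remedy is for the 
driver to release its own lock before handing packets up to the stack.  The 
following is a minimal user-space sketch of that pattern using pthreads.  It is 
not code from if_wi or from the FreeBSD kernel: drv_softc, drv_rx_intr, 
stack_input and the lock_held flag are hypothetical names invented for 
illustration, and the pthread mutex merely stands in for a driver mutex.

```c
#include <assert.h>
#include <pthread.h>

#define RXQ_MAX 8

/* Hypothetical driver state; the mutex stands in for the driver lock. */
struct drv_softc {
	pthread_mutex_t	lock;
	int		rxq[RXQ_MAX];	/* pending "packets" */
	int		rxq_len;
	int		lock_held;	/* bookkeeping for this demo only */
};

static int delivered;			/* packets handed to the stack */
static int delivered_with_lock_held;	/* deliveries made under the lock */

/*
 * Stand-in for the network stack input path.  In a real kernel this may
 * acquire protocol locks, so the caller must not hold the driver lock.
 */
static void
stack_input(struct drv_softc *sc, int pkt)
{
	(void)pkt;
	delivered++;
	if (sc->lock_held)
		delivered_with_lock_held++;
}

/*
 * Receive path: copy pending packets to a private batch while holding
 * the driver lock, drop the lock, then call into the stack.
 */
static void
drv_rx_intr(struct drv_softc *sc)
{
	int batch[RXQ_MAX];
	int i, n;

	pthread_mutex_lock(&sc->lock);
	sc->lock_held = 1;
	n = sc->rxq_len;
	assert(n >= 0 && n <= RXQ_MAX);
	for (i = 0; i < n; i++)
		batch[i] = sc->rxq[i];
	sc->rxq_len = 0;
	sc->lock_held = 0;
	pthread_mutex_unlock(&sc->lock);

	/* The driver lock is no longer held from here on. */
	for (i = 0; i < n; i++)
		stack_input(sc, batch[i]);
}
```

With this shape the driver lock is only ever a leaf lock on the receive path, 
so the stack is free to acquire the driver lock while holding its own locks on 
the transmit path (as in the rawinp -> wi0 order in the trace above) without 
creating a cycle.  A caller would initialise the mutex with 
pthread_mutex_init(), fill rxq, and invoke drv_rx_intr().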

Robert N M Watson
Computer Laboratory
University of Cambridge
