FreeBSD Mail Archives

Date:      Fri, 25 May 2007 17:17:02 +0200
From:      Volker <volker@vwsoft.com>
To:        freebsd-stable@FreeBSD.ORG
Subject:   LORs (was Re: ghosthunting: machine freeze 6.2R)
Message-ID:  <4656FDEE.7020002@vwsoft.com>
In-Reply-To: <4656CC57.7010705@vwsoft.com>
References:  <200705230717.l4N7HuPW010071@lurza.secnetix.de>	<465408F9.6080302@vwsoft.com>	<4654C0C4.2030405@vwsoft.com>	<20070523215818.GB64723@xor.obsecurity.org>	<4656A73E.9040109@vwsoft.com> <4656CC57.7010705@vwsoft.com>

On 05/25/07 13:45, Volker wrote:
> Using a debug kernel, the machine came up quickly with this LOR after 
> the reboot:
> 
> lock order reversal:
>  1st 0xc077078c tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:625
>  2nd 0xc4f18180 pf task mtx (pf task mtx) @ 
> /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
> KDB: stack backtrace:
> kdb_backtrace(0,ffffffff,c072fcd0,c072e1c8,c06f6124,...) at 
> kdb_backtrace+0x29
> witness_checkorder(c4f18180,9,c4f1536e,18f2) at witness_checkorder+0x578
> _mtx_lock_flags(c4f18180,0,c4f1536e,18f2,c4f18180,...) at 
> _mtx_lock_flags+0x78
> pf_test(2,c4bdec00,e35c5ac4,0,0,...) at pf_test+0x81
> pf_check_out(0,e35c5ac4,c4bdec00,2,0) at pf_check_out+0x3d
> pfil_run_hooks(c0770340,e35c5b40,c4bdec00,2,0,...) at pfil_run_hooks+0xc9
> ip_output(c50c8200,0,e35c5b0c,0,0,0) at ip_output+0x83a
> tcp_respond(0,c4f85810,c4f85824,c50c8200,0,7a481ad6,4) at tcp_respond+0x3e1
> tcp_input(c50c8200,14,1,93d306d9,0,...) at tcp_input+0x3124
> ip_input(c50c8200) at ip_input+0x785
> netisr_processqueue(c076dfd8) at netisr_processqueue+0x6e
> swi_net(0) at swi_net+0xc2
> ithread_execute_handlers(c4afca78,c4b4b180) at 
> ithread_execute_handlers+0xe6
> ithread_loop(c4adb990,e35c5d38,c4adb990,c0505918,0,...) at 
> ithread_loop+0x67
> fork_exit(c0505918,c4adb990,e35c5d38) at fork_exit+0xa0
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xe35c5d6c, ebp = 0 ---
> Expensive timeout(9) function: 0xc0528fb4(0) 0.002565972 s
> 

This first one appeared at 13:22 (shortly after bootup).

OK, here are the next two LORs (similar to the first):


At 13:28 this one showed up in the logs:

lock order reversal:
  1st 0xc077078c tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:625
  2nd 0xc4f18180 pf task mtx (pf task mtx) @ 
/usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c072fcd0,c072e1c8,c06f6124,...) at 
kdb_backtrace+0x29
witness_checkorder(c4f18180,9,c4f1536e,18f2) at witness_checkorder+0x578
_mtx_lock_flags(c4f18180,0,c4f1536e,18f2,c4f18180,...) at 
_mtx_lock_flags+0x78
pf_test(2,c4bdec00,e35c5ac4,0,0,...) at pf_test+0x81
pf_check_out(0,e35c5ac4,c4bdec00,2,0) at pf_check_out+0x3d
pfil_run_hooks(c0770340,e35c5b40,c4bdec00,2,0,...) at 
pfil_run_hooks+0xc9
ip_output(c50c8200,0,e35c5b0c,0,0,0) at ip_output+0x83a
tcp_respond(0,c4f85810,c4f85824,c50c8200,0,7a481ad6,4) at 
tcp_respond+0x3e1
tcp_input(c50c8200,14,1,93d306d9,0,...) at tcp_input+0x3124
ip_input(c50c8200) at ip_input+0x785
netisr_processqueue(c076dfd8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c4afca78,c4b4b180) at 
ithread_execute_handlers+0xe6
ithread_loop(c4adb990,e35c5d38,c4adb990,c0505918,0,...) at 
ithread_loop+0x67
fork_exit(c0505918,c4adb990,e35c5d38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe35c5d6c, ebp = 0 ---
Expensive timeout(9) function: 0xc0528fb4(0) 0.002565972 s

At 16:55 I caught this message:

kernel: acpi: suspend request ignored (not ready yet)

A minute (or seconds?) later the machine died, and I did not get anything 
into the logs around that time. What's the reason for this ACPI message?

After bootup (reset key pressed by an operator), the machine produced 
this LOR:

lock order reversal:
  1st 0xc4f68180 pf task mtx (pf task mtx) @ 
/usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
  2nd 0xc077078c tcp (tcp) @ 
/usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:2744
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c072e1c8,c072fcd0,c06f6124,...) at 
kdb_backtrace+0x29
witness_checkorder(c077078c,9,c4f6536e,ab8) at witness_checkorder+0x578
_mtx_lock_flags(c077078c,0,c4f6536e,ab8,c077078c,...) at 
_mtx_lock_flags+0x78
pf_socket_lookup(e35c5b00,e35c5b04,1,e35c5bc0,0,...) at 
pf_socket_lookup+0x1d3
pf_test_tcp(e35c5b70,e35c5b68,1,c4ee0e00,c4d6f400,...) at 
pf_test_tcp+0x11e6
pf_test(1,c4c11c00,e35c5c5c,0,0,...) at pf_test+0xb8b
pf_check_in(0,e35c5c5c,c4c11c00,1,0) at pf_check_in+0x37
pfil_run_hooks(c0770340,e35c5cb4,c4c11c00,1,0) at pfil_run_hooks+0xc9
ip_input(c4d6f400) at ip_input+0x272
netisr_processqueue(c076dfd8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c4afca78,c4b4b180) at 
ithread_execute_handlers+0xe6
ithread_loop(c4adb990,e35c5d38,c4adb990,c0505918,0,...) at 
ithread_loop+0x67
fork_exit(c0505918,c4adb990,e35c5d38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe35c5d6c, ebp = 0 ---

My assumption: the LORs are pf related but are not related to the 
lockup of the system. Am I correct? What might be the reason for 
that ACPI message, and could ACPI be a cause of the lockup? What 
might be a possible reason for WITNESS and INVARIANTS being unable 
to catch whatever causes the freeze?
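
For readers comparing the reports: the first two traces take the tcp lock and then pf task mtx, while the last one takes pf task mtx and then tcp. The following is only a toy sketch of how an order checker could flag such a pair. It is not the WITNESS implementation; the class and function names are hypothetical, and it assumes that the first order seen for a pair of locks becomes the expected one.

```python
# Toy lock-order checker; an illustration only, not the WITNESS implementation.
class OrderChecker:
    def __init__(self):
        self.seen = set()   # (earlier, later) lock-name pairs observed so far
        self.held = []      # locks currently held, in acquisition order

    def acquire(self, name):
        """Record an acquisition; return the reversals it causes as (1st, 2nd)."""
        reversals = []
        for h in self.held:
            # A reversal: the opposite order was seen before, this one was not.
            if (name, h) in self.seen and (h, name) not in self.seen:
                reversals.append((h, name))
            self.seen.add((h, name))
        self.held.append(name)
        return reversals

    def release(self, name):
        self.held.remove(name)


def replay():
    """Replay the two acquisition orders that appear in the traces."""
    chk = OrderChecker()
    reports = []
    # Outbound path (tcp_input -> ip_output -> pf_test): tcp, then pf task mtx.
    reports += chk.acquire("tcp")
    reports += chk.acquire("pf task mtx")
    chk.release("pf task mtx")
    chk.release("tcp")
    # Inbound path (ip_input -> pf_test -> pf_socket_lookup): pf task mtx, then tcp.
    reports += chk.acquire("pf task mtx")
    reports += chk.acquire("tcp")
    return reports
```

Under that assumption, `replay()` yields a single reversal with pf task mtx as the 1st lock and tcp as the 2nd, which has the same shape as the last report. A reversal like this means two code paths disagree on ordering; on its own it does not show that a deadlock actually occurred.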

Thx

Volker

PS: Sorry for flooding this list. Should I direct these postings to hackers@?
PPS: Is anybody able to provide me patches for these LORs?

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4656FDEE.7020002>
