From owner-freebsd-stable@FreeBSD.ORG Tue Mar 25 19:21:02 2008
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C48171065677
	for ; Tue, 25 Mar 2008 19:21:02 +0000 (UTC)
	(envelope-from razor@dataxnet.ro)
Received: from mail.dataxnet.ro (datax28.mediasat.ro [80.96.28.28])
	by mx1.freebsd.org (Postfix) with SMTP id A62AF8FC27
	for ; Tue, 25 Mar 2008 19:20:58 +0000 (UTC)
	(envelope-from razor@dataxnet.ro)
Received: (qmail 62126 invoked by uid 1001); 25 Mar 2008 21:21:15 +0200
Date: Tue, 25 Mar 2008 21:21:15 +0200
From: Alex Popa
To: Max Laier
Message-ID: <20080325192113.GA61579@dataxnet.ro>
References: <20080314192359.GA4677@dataxnet.ro>
	<200803152217.02568.max@love2party.net>
	<20080322102933.GA76747@dataxnet.ro>
	<200803221655.28975.max@love2party.net>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="k+w/mQv8wyuph6w0"
Content-Disposition: inline
In-Reply-To: <200803221655.28975.max@love2party.net>
User-Agent: Mutt/1.4.2.2i
Cc: freebsd-stable@freebsd.org, Robert Watson
Subject: Re: Lock Order Reversal on 7.0-STABLE with pf and ipfw / dummynet
	(traces)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 25 Mar 2008 19:21:02 -0000

--k+w/mQv8wyuph6w0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Mar 22, 2008 at 04:55:28PM +0100, Max Laier wrote:
> Hi Alex,
>
> On Saturday 22 March 2008 11:29:33 Alex Popa wrote:
> > Sorry for the big delay, but here are the traces you requested.
>
> don't worry, you are a great help!
>
> Could you try the attached patch?  I missed the fact that you are using
> FASTROUTE in your setup.  There is obviously a problem with it, but the
> attached patch should work around that.  The other LOR really is harmless
> and rather an oversight in WITNESS: a LOR with a shared/read lock can't
> cause a deadlock (unless there is also a LOR with the same lock in
> exclusive mode).  But this is rather complex to check and might not be
> easily implemented in WITNESS.
>
> Anyways - I believe this patch should work around your problem.  Let us
> know your findings - thanks.

Hello.

I have tested the patch, booted with a WITNESS kernel including that
patch, and it has locked up (solid again, no numlock or console
changing, no control-alt-esc to debugger) after about 41 minutes
(timestamps in /var/log/all.log go from 19:12:57 to 19:53:40).

I did get two LOR reports in dmesg, they are attached.

> --
> /"\  Best regards,                      | mlaier@freebsd.org
> \ /  Max Laier                          | ICQ #67774661
>  X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
> / \  ASCII Ribbon Campaign              | Against HTML Mail and News

Alex

--
"Computer science is no more about computers than astronomy is about
telescopes" -- E. W. Dijkstra

--k+w/mQv8wyuph6w0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="patched-dmesg.txt"

lock order reversal:
 1st 0xffffffff8096ebc8 PFil hook read/write mutex (PFil hook read/write mutex) @ /usr/src/sys/net/pfil.c:73
 2nd 0xffffffff8096f8e8 udp (udp) @ /usr/src/sys/netinet/udp_usrreq.c:385
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x539
_mtx_lock_flags() at _mtx_lock_flags+0x1f
udp_input() at udp_input+0x1f7
ip_input() at ip_input+0xa7
dummynet_send() at dummynet_send+0xde
dummynet_io() at dummynet_io+0x587
ipfw_check_in() at ipfw_check_in+0x241
pfil_run_hooks() at pfil_run_hooks+0xac
ip_input() at ip_input+0x292
ether_demux() at ether_demux+0x1ac
ether_input() at ether_input+0x1bf
em_handle_rxtx() at em_handle_rxtx+0x1d2
taskqueue_run() at taskqueue_run+0x95
taskqueue_thread_loop() at taskqueue_thread_loop+0x53
fork_exit() at fork_exit+0x112
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffa057ad30, rbp = 0 ---

[...]

lock order reversal:
 1st 0xffffff00018ff690 inp (rawinp) @ /usr/src/sys/netinet/raw_ip.c:281
 2nd 0xffffffff8096ebc8 PFil hook read/write mutex (PFil hook read/write mutex) @ /usr/src/sys/net/pfil.c:73
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x539
_rw_rlock() at _rw_rlock+0x25
pfil_run_hooks() at pfil_run_hooks+0x44
ip_output() at ip_output+0x35a
rip_output() at rip_output+0x1eb
sosend_generic() at sosend_generic+0x289
kern_sendit() at kern_sendit+0x122
sendit() at sendit+0xc6
sendto() at sendto+0x4d
syscall() at syscall+0x1b5
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (133, FreeBSD ELF64, sendto), rip = 0x80091132c, rsp = 0x7ffffffee6e8, rbp = 0x40 ---

--k+w/mQv8wyuph6w0--