Date: Sat, 15 Mar 2008 20:36:41 +0000 (GMT)
From: Robert Watson
To: Alex Popa
Cc: mlaier@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject: Re: Lock Order Reversal on 7.0-STABLE with pf and ipfw / dummynet
Message-ID: <20080315203121.I42065@fledge.watson.org>
In-Reply-To: <20080314192359.GA4677@dataxnet.ro>
References: <20080314192359.GA4677@dataxnet.ro>

On Fri, 14 Mar 2008, Alex Popa wrote:

> World was cvsupped on March 6th, around 18:00 GMT.
>
> Built and installed kernel + world, with options WITNESS and
> WITNESS_SKIPSPIN.
>
> Short background: 7.0-RELEASE had excellent performance on the machine, but
> it would randomly lock up after some hours (usually over 10 hours). The
> lockups were hard, meaning nothing seemed to work (NumLock didn't toggle the
> keyboard LED, no replies to ping, no disk activity). We changed the
> motherboard and RAM and had the same behaviour.
> 6.2-REL is rock solid on
> this machine (had over 50 days uptime), but upgrading to 6.3-REL made it
> lock up just like 7.0 (so we put 6.2 back and accepted the lower performance
> for the time being).
>
> The LOR messages from dmesg of 7.0-STABLE are as follows:
>
> lock order reversal:
>  1st 0xffffffffb19e0680 pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6729
>  2nd 0xffffff00042ea0f0 radix node head (radix node head) @ /usr/src/sys/net/route.c:147
> lock order reversal:
>  1st 0xffffffff80938508 PFil hook read/write mutex (PFil hook read/write mutex) @ /usr/src/sys/net/pfil.c:73
>  2nd 0xffffffff80938c48 tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:400

Dear Alex,

Thanks for this report, and sorry about the problem. It could well be that
the lock order warning from WITNESS is related to the hang, and might
reflect a recursion-related bug in the pf policy routing code. I'm not sure
to what extent you can tolerate further downtime, but it would be useful to
gather some more information about the hang itself to try and confirm the
involvement of lock order. In particular, if it's feasible, it would be
very helpful if you could boot back to the 7-STABLE kernel (keeping the
6.2-STABLE userspace should be fine, I think), and when the hang occurs, use
the console debugger (ideally hooked up to serial or firewire) to run the
following debugging commands:

   show pcpu
   show allpcpu
   trace
   alltrace
   show allocks
   show witness
   show lockedvnods
   show uma
   show malloc

A shot-in-the-dark guess is that something about pf's interactions with the
protocol stack is involved here, but unfortunately I suspect we'll need some
more information to track it down.

Also, could you confirm if you're using any credential-related firewall
rules with either ipfw or pf? These would be uid/gid/jail matching rules.

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> More details about the machine in the attached dmesg.
> It's a SMP with
> 4GB of RAM, 3 gigabit cards (em0, em1 and, depending on the motherboard
> we used, either bge0 or msk0). Only em0 is linked to a gigabit port,
> the others are 100Mbits/s
>
> My setup has in-kernel IPFIREWALL, IPFIREWALL_VERBOSE,
> IPFIREWALL_DEFAULT_TO_ACCEPT, DUMMYNET. I have commented out INET6,
> SCTP and the wireless interfaces. WITNESS and WITNESS_SKIPSPIN were
> only added in the hope of figuring out what locks it up, and they did
> signal these 2 LORs.
>
> pf and pflog are loaded as modules (pf_enable and pflog_enable set to
> yes in rc.conf).
>
> - The ipfw/dummynet side:
>
> I use net.link.ether.ipfw = 1 for MAC address checking, ipfw + dummynet
> for traffic shaping (4 queues at 95Mbits/s for the 2 external interfaces
> in/out, and 4 more queues for traffic that goes outside the AS group for
> which we have fast access). Deciding which queue traffic goes in
> depends on its source address and whether its destination is in ipfw
> tables 1, 2 or none. These tables are synchronized from pf tables via a
> custom script in crontab, which runs every 3 minutes. The pf tables
> used as source for these are controlled by OpenBGPD.
>
> - The pf side:
>
> Filtering is done here, as is policy routing. Filtering also contains
> redirecting to a transparent squid proxy of traffic destined to port 80
> but not bound for networks received via BGP and saved to tables
> and . Metro and special port 80 traffic goes directly to
> the destination server.
>
> Traffic from net1 and net2 is routed via the "other" external interface,
> which doesn't contain the default route... with the exception of traffic
> to pf table (from BGP, same as table 2 in ipfw). Traffic to
> is routed via fastroute in pf (meaning using the default
> route).
>
> Attached are full dmesg and the kernel config.
>
> I still have access to the hard drive with 7.0-STABLE on it, but not the
> motherboard/CPU and the network cards... they are running off the hard
> drive with 6.2 on it.
>
> --
> "Computer science is no more about computers
> than astronomy is about telescopes" -- E. W. Dijkstra
>