Date: Sat, 15 Mar 2008 22:16:54 +0100
From: Max Laier <max@love2party.net>
To: Alex Popa <razor@dataxnet.ro>
Cc: freebsd-stable@freebsd.org, Robert Watson <rwatson@freebsd.org>
Subject: Re: Lock Order Reversal on 7.0-STABLE with pf and ipfw / dummynet
Message-ID: <200803152217.02568.max@love2party.net>
In-Reply-To: <20080315203121.I42065@fledge.watson.org>
References: <20080314192359.GA4677@dataxnet.ro> <20080315203121.I42065@fledge.watson.org>

On Saturday 15 March 2008, Robert Watson wrote:
> On Fri, 14 Mar 2008, Alex Popa wrote:
> > World was cvsupped on March 6th, around 18:00 GMT.
> >
> > Built and installed kernel + world, with options WITNESS and
> > WITNESS_SKIPSPIN.
> >
> > Short background: 7.0-RELEASE had excellent performance on the
> > machine, but it would randomly lock up after some hours (usually over
> > 10 hours).  The lockups were hard, meaning nothing seemed to work
> > (NumLock didn't toggle the keyboard LED, no replies to ping, no disk
> > activity).  We changed the motherboard and RAM and had the same
> > behaviour.  6.2-REL is rock solid on this machine (had over 50 days
> > uptime), but upgrading to 6.3-REL made it lock up just like 7.0 (so
> > we put 6.2 back and accepted the lower performance for the time
> > being).
> >
> > The LOR messages from dmesg of 7.0-STABLE are as follows:
> >
> > lock order reversal:
> >  1st 0xffffffffb19e0680 pf task mtx (pf task mtx) @
> > /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6729
> >  2nd 0xffffff00042ea0f0 radix node head (radix node head) @
> > /usr/src/sys/net/route.c:147

I haven't seen this one before, can you obtain the trace for this, please?
You might need KDB & DDB for that - not sure.

> > lock order reversal:
> >  1st 0xffffffff80938508 PFil hook read/write mutex (PFil hook
> > read/write mutex) @ /usr/src/sys/net/pfil.c:73
> >  2nd 0xffffffff80938c48 tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:400

This one is most certainly harmless and can be ignored.  It is caused by
user/group rules, but a LOR with the read instance of a rwlock will not
lead to a deadlock.

> Dear Alex,
>
> Thanks for this report, and sorry about the problem.
> It could well be
> that the lock order warning from WITNESS is related to the hang, and
> might reflect a recursion-related bug in the pf policy routing code.
> I'm not sure to what extent you can tolerate further downtime, but it
> would be useful to gather some more information about the hang itself
> to try and confirm the involvement of lock order.  In particular, if
> it's feasible, it would be very helpful if you could boot back to the
> 7-STABLE kernel (keeping the 6.2-STABLE userspace should be fine, I

you'll need at least a new pfctl, because the ioctl interface to /dev/pf
has changed.

> think), and when the hang occurs, use the console debugger (ideally
> hooked up to serial or firewire) to run the following debugging
> commands:
>
>      show pcpu
>      show allpcpu
>      trace
>      alltrace
>      show alllocks
>      show witness
>      show lockedvnods
>      show uma
>      show malloc
>
> A shot-in-the-dark guess is that something about pf's interactions with
> the protocol stack is involved here, but unfortunately I suspect we'll
> need some more information to track it down.
>
> Also, could you confirm if you're using any credential-related firewall
> rules with either ipfw or pf?  These would be uid/gid/jail matching
> rules.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
>
> > More details about the machine in the attached dmesg.  It's a SMP
> > with 4GB of RAM, 3 gigabit cards (em0, em1 and, depending on the
> > motherboard we used, either bge0 or msk0).  Only em0 is linked to a
> > gigabit port, the others are 100Mbits/s.
> >
> > My setup has in-kernel IPFIREWALL, IPFIREWALL_VERBOSE,
> > IPFIREWALL_DEFAULT_TO_ACCEPT, DUMMYNET.  I have commented out INET6,
> > SCTP and the wireless interfaces.  WITNESS and WITNESS_SKIPSPIN were
> > only added in the hope of figuring out what locks it up, and they did
> > signal these 2 LORs.
> >
> > pf and pflog are loaded as modules (pf_enable and pflog_enable set to
> > yes in rc.conf).
> >
> > - The ipfw/dummynet side:
> >
> > I use net.link.ether.ipfw = 1 for MAC address checking, ipfw +
> > dummynet for traffic shaping (4 queues at 95Mbits/s for the 2
> > external interfaces in/out, and 4 more queues for traffic that goes
> > outside the AS group for which we have fast access).  Deciding which
> > queue traffic goes in depends on its source address and whether its
> > destination is in ipfw tables 1, 2 or none.  These tables are
> > synchronized from pf tables via a custom script in crontab, which
> > runs every 3 minutes.  The pf tables used as source for these are
> > controlled by OpenBGPD.
> >
> > - The pf side:
> >
> > Filtering is done here, as is policy routing.  Filtering also
> > contains redirecting to a transparent squid proxy of traffic destined
> > to port 80 but not bound for networks received via BGP and saved to
> > tables <metro> and <special>.  Metro and special port 80 traffic goes
> > directly to the destination server.
> >
> > Traffic from net1 and net2 is routed via the "other" external
> > interface, which doesn't contain the default route... with the
> > exception of traffic to pf table <special> (from BGP, same as table 2
> > in ipfw).  Traffic to <special> is routed via fastroute in pf
> > (meaning using the default route).

That's quite a complex setup.  It would really be interesting to get the
trace for the first LOR in order to figure out which code path we are
looking at.  I have a feeling that it might be the dummynet entry point,
but w/o the trace this is only speculation.

> > Attached are full dmesg and the kernel config.
> >
> > I still have access to the hard drive with 7.0-STABLE on it, but not
> > the motherboard/CPU and the network cards... they are running off the
> > hard drive with 6.2 on it.

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
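
For readers unfamiliar with "lock order reversal" reports, the idea can be
sketched in a few lines of Python.  This is only a toy illustration and not
how WITNESS is implemented: the OrderChecker class and its methods are made
up for this example, it assumes a single thread, and "path A" below is a
hypothetical earlier code path (the thread does not say where the first
ordering was established).  The lock names are taken from the first LOR
quoted above.

```python
from collections import defaultdict


class OrderChecker:
    """Toy lock-order tracker (hypothetical, single-threaded)."""

    def __init__(self):
        # name -> set of lock names that were acquired while `name` was held
        self._after = defaultdict(set)
        self._held = []        # locks currently held, in acquisition order
        self.reversals = []    # (1st, 2nd) pairs that contradict an earlier order

    def acquire(self, name):
        for held in self._held:
            if held in self._after[name]:
                # Earlier we saw `name` taken before `held`; now it is the
                # other way round, so report a reversal.
                self.reversals.append((held, name))
            else:
                self._after[held].add(name)
        self._held.append(name)

    def release(self, name):
        self._held.remove(name)


checker = OrderChecker()

# Path A (hypothetical): radix node head first, then pf task mtx.
checker.acquire("radix node head")
checker.acquire("pf task mtx")
checker.release("pf task mtx")
checker.release("radix node head")

# Path B: the order from the report, pf task mtx first.
checker.acquire("pf task mtx")
checker.acquire("radix node head")
checker.release("radix node head")
checker.release("pf task mtx")

for first, second in checker.reversals:
    print("lock order reversal:")
    print(" 1st", first)
    print(" 2nd", second)
```

A reversal recorded this way only says that two code paths disagree about
the order; whether it can actually deadlock depends on the lock types and
on whether both paths can run at the same time, which is why the trace for
the first LOR matters.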