From: ian j hart
To: freebsd-stable@freebsd.org
Cc: Max Laier, Robert Watson, Alex Popa
Date: Sun, 16 Mar 2008 22:16:20 +0000
Subject: Re: Lock Order Reversal on 7.0-STABLE with pf and ipfw / dummynet

On Sunday 16 March 2008 21:16:16 Alex Popa wrote:
> This is a mixed reply to both the previous mails, bear with me please.
>
> On Sat, Mar 15, 2008 at 10:16:54PM +0100, Max Laier wrote:
> > On Saturday 15 March 2008, Robert Watson wrote:
> > > On Fri, 14 Mar 2008, Alex Popa wrote:
> > > > [snip]
> > > > The LOR messages from dmesg of 7.0-STABLE are as follows:
> > > >
> > > > lock order reversal:
> > > > 1st 0xffffffffb19e0680 pf task mtx (pf task mtx) @
> > > > /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6729 2nd
> > > > 0xffffff00042ea0f0 radix node head (radix node head) @
> > > > /usr/src/sys/net/route.c:147
> >
> > I haven't seen this one before, can you obtain the trace for this,
> > please? You might need KDB & DDB for that - not sure.
>
> I'll do my best (see below for my questions about getting a trace).
>
> > > > lock order reversal:
> > > > 1st 0xffffffff80938508 PFil hook read/write mutex (PFil hook
> > > > read/write mutex) @ /usr/src/sys/net/pfil.c:73 2nd 0xffffffff80938c48
> > > > tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:400
> >
> > This one is most certainly harmless and can be ignored. It is caused by
> > user/group rules, but a LOR with the read instance of a rwlock will not
> > lead to a deadlock.
>
> I'm not using uid/gid/jail rules as far as I remember. I'll add another
> reply with pf.conf and the script I use to generate and reload the ipfw
> rules (but I'll anonymize them).
>
> > > Dear Alex,
> > >
> > > Thanks for this report, and sorry about the problem. It could well be
> > > that the lock order warning from WITNESS is related to the hang, and
> > > might reflect a recursion-related bug in the pf policy routing code.
> > > I'm not sure to what extent you can tolerate further downtime, but it
> > > would be useful to gather some more information about the hang itself
> > > to try and confirm the involvement of lock order. In particular, if
> > > it's feasible, it would be very helpful if you could boot back to the
> > > 7-STABLE kernel (keeping the 6.2-STABLE userspace should be fine, I
> >
> > you'll need at least a new pfctl, because the ioctl interface to /dev/pf
> > has changed.
>
> Switching between 6.2-RELEASE-p7 (not STABLE, because as I said 6.3
> exhibited the lockups too) and 7-STABLE isn't that much of a problem.
> The upgrade path was "buy a new hard drive, set up everything and then
> adapt the old config files"... actually we bought 2 harddrives, and I
> set them up one with amd64 and another with i386. I think /etc and
> /usr/local/etc are perfectly identical on these 2 (I adapted the configs
> from 6.2 to 7.0, but I just copied them from amd64 to i386).
>
> So, actions needed to switch: Backup the database on 6.2 (with IP/MAC
> mappings and a bit more), put in the 7.0 hard drive, boot off 7.0,
> restore DB, let it run. Total downtime should be around 7 minutes tops.
>
> > > think), and when the hang occurs, use the console debugger (ideally
> > > hooked up to serial or firewire) to run the following debugging
> > > commands:
> > >
> > > show pcpu
> > > show allpcpu
> > > trace
> > > alltrace
> > > show allocks
> > > show witness
> > > show lockedvnods
> > > show uma
> > > show malloc
>
> This is where things get a bit tricky, and I need advice.
>
> As I said, my observation is that the keyboard seems to stop working
> when the lockup occurs, that is, pressing Num Lock won't toggle the
> state of the LED. Thus I have some doubts that trying the good-old
> Control-Alt-ESC would have the desired effect (dropping me into the
> debugger). However, I'm not that familiar with the FreeBSD
> architecture, and wouldn't be surprised if the LED toggling would be in
> another thread and the machine will actually respond to the keyboard
> interrupt and drop me into ddb. Also, judging by the lack of NumLock
> activity (it works fine when the system's up), would serial console or
> firewire be functional during the lockup?

Keyboard LEDs are broken for me on 6.3 amd64 (kbdmux). I'd double check they
work before you rely on this as a diagnostic tool.

>
> Also, a bit of explanations:
>
> Why I'm asking the above: The current motherboard has a serial port
> (and it works, we've used it), but not a firewire port. The other
> motherboard we tried has firewire, but no serial. As a console
> workstation, I can get a few with serials, but not so easy with
> firewire. The null modem cable might be a problem too, depending on
> length.
>
> Also, since the lockup isn't easily reproducible, I'll probably need to
> spend some hours on location and if I'm going to do that, I'd like a
> degree of hope that either keyboard, serial console or firewire will
> work. Also, firewire will require me to switch motherboards, but that
> can be done together with the hard drive swapping, during the night.
>
> After a bit of studying NOTES, I was wondering if a combination of
> serial console (or just plain console) with "options WITNESS_KDB" would
> help get a "good enough" trace. The upside of this is that both LORs
> usually occur early (not much later than the login prompt, usually
> earlier) as opposed to after 12...18 hours, and I can either force a
> panic after each and get 2 core dumps, or run the debug commands
> suggested (either as debug LOR1 / continue / debug LOR2, or debug LOR1 /
> reboot / "continue" LOR1 / debug LOR2 - whichever is more appropriate).
>
> For the moment I have both hard drives (7.0-STABLE/amd64 and
> 7.0-RELEASE/i386) and the new motherboard (no serial, but with firewire)
> as a working computer under my desk. I can prepare for the night-time
> switch and debug by compiling kernel and/or world and doing some
> preliminary testing here. If I really need to test null modem console,
> I can put the hdd in my own desktop and test with another machine.
>
> > > A shot-in-the-dark guess is that something about pf's interactions with
> > > the protocol stack is involved here, but unfortunately I suspect we'll
> > > need some more information to track it down.
> > >
> > > Also, could you confirm if you're using any credential-related firewall
> > > rules with either ipfw or pf? These would be uid/gid/jail matching
> > > rules.
>
> As I said above, I don't use any uid/gid/jail rules. Mail with pf.conf
> and ipfw config incoming shortly after this one.
>
> > > Robert N M Watson
> > > Computer Laboratory
> > > University of Cambridge
> > [snip]
>
> > That's quite a complex setup. It would really be interesting to get the
> > trace for the first LOR in order to figure out which code path we are
> > looking at. I have a feeling that it might be the dummynet entry point,
> > but w/o the trace this is only speculation.
>
> Working on it.
>
> > --
> > /"\  Best regards,                      | mlaier@freebsd.org
> > \ /  Max Laier                          | ICQ #67774661
> >  X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
> > / \  ASCII Ribbon Campaign              | Against HTML Mail and News
>
> I'd like suggestions / comments about the kernel config I'm thinking
> about for debugging purposes:
>
> - take my KERNEL (GENERIC + IPFW - IPv6 and SCTP and wireless), and add:
>
> options WITNESS
> options WITNESS_KDB # only if debug-on-first-warn is wanted
> options WITNESS_SKIPSPIN
> options KDB
> #options KDB_TRACE # not needed since I'll trace anyway?
> options DDB
> #options BREAK_TO_DEBUGGER # would that work for my kind of lockup?
> options MSGBUF_SIZE=409600
>
>
> Ideally I would like to hear that the manual tracing and debugging with
> a keyboard console would provide enough info. I'll increase the kernel
> buffer size to 400k as above, so I don't lose info when I continue and
> dmesg > log.txt.
>
> Just as easily, I can try forcing a panic at the LORs and keeping the
> kernel dumps (with optional debugging in ddb like above). The advantage
> is that this might answer supplementary questions after the deed is
> done.
>
> Both the above options should be possible this week.
>
> The serial console part may or may not happen this week, and I'm quite
> positive it will take another week before I find the time to spend 16+
> hours on location, waiting for a lockup (which might happen at a busy
> time and therefore I'll have very little time to do all the debugging).
>
> Tips / suggestions are most welcome!
>
> Thanks for the help!
> Alex

-- 
ian j hart