From: Julian Elischer
Date: Wed, 18 Apr 2007 01:03:39 -0700
To: Robert Watson
Cc: Tillman Hodgson, current@freebsd.org
Subject: Re: Panic on boot with April 16 src (lengthy info attached)
Message-ID: <4625D0DB.1080902@elischer.org>
In-Reply-To: <20070418084345.H2913@fledge.watson.org>

Robert Watson wrote:
>
> On Tue, 17 Apr 2007, Robert Watson wrote:
>
>>> I originally put it in there to work around a LOR that I was
>>> experiencing (based on you mentioning
>>> it in an email to current@ Sun 18 Mar 2007 15:50).
>>> http://sources.zabbadoz.net/freebsd/lor/191.html
>>> doesn't show any changes to that particular LOR, do you happen to
>>> know if there's any ongoing work on this? I'm very willing to act as
>>> a test system.
>>
>> I chatted with Andre about the panic earlier this afternoon, and it
>> sounds like the fix is straightforward. I would anticipate seeing it
>> committed in the near future.
>>
>> I'll send out an e-mail explaining the above lock order reversal
>> tomorrow morning. I understand that several people have been looking
>> at this, so perhaps one of those people will reply talking about it
>> before then. :-)
>
> The essential problem of this lock order reversal has to do with the
> fact that higher network stack layers hold locks over lower network
> stack layers. For example, the lock for a TCP connection is held over
> the operation to enqueue the TCP packet for transmission at a lower
> layer. This is necessary in order to maintain TCP transmission order
> into the transmission queue between multiple threads operating on the
> same TCP connection, as if the "transmit and enqueue" operation were
> non-atomic with respect to the same TCP connection in another thread,
> quite damaging reordering could take place. We directly dispatch the
> entire outbound network stack from that enqueue point, meaning that the
> per-TCP connection lock is held over that processing path, including the
> firewall. As a result, PCB locks (TCP connection locks) precede the
> firewall in the lock order.
>
> Firewall locks are about protecting the rule state of the firewall from
> corruption when firewall rules are updated, allowing readers to
> interpret the rules using a static snapshot, and writers to avoid
> mangling the rules via simultaneous non-atomic update. As such, when
> the firewall code is entered, the firewall lock is acquired, and held
> until the packet has been completely processed.
> Things get sticky deep in the firewall code because our firewalls
> include credential-aware rules, which essentially "peek up the stack"
> in order to decide what user is associated with a packet before
> delivery to the connection is done. The firewall rule lock is held
> over this lookup and inspection of TCP-layer state. In the out-bound
> path, we pass down the TCP state reference (PCB pointer) and guarantee
> the lock is already held. However, in the in-bound direction, the
> firewall has to do the full lookup and lock acquisition. Which
> reverses the lock order, and can lead to deadlocks.

I am doing work on fixing this for ipfw. It involves moving ipfw to a
lockless method of operation. (More info will be on the ipfw list in a
few days.)

>
> debug.mpsafenet=0 places the Giant lock in front of all network stack
> lock acquisition, which effectively serializes all of the above. It
> doesn't remove the lock order reversal, but it does eliminate
> simultaneous lock acquisition, removing one of the necessary causes of
> deadlock. This trick of a serializing "global" lock in order to prevent
> lock order between "leaf" locks is not an uncommon technique, but in
> this case has a significant overhead (requiring non-parallelism in
> network processing), and needs to be fixed.
>
> The key is to guarantee that the acquisition of the firewall reference
> will never be blocked waiting on a PCB lock -- i.e., that the firewall
> "lock" isn't a lock so much as a reference count that will never have to
> wait, removing the waiting requirement from the deadlock equation. I
> know that Julian Elischer has been looking at doing this, and others may
> have also. The model is essentially that you either starve writers to
> the firewall data, or you create a read-only snapshot to be used by
> readers in the event a writer arrives, allowing readers to pick up the
> new rules if available, or the old rules if not, and never wait
> indefinitely either way.

yep..
I have detailed plans afoot, but not for pf. I wouldn't know pf if it
came up and kicked me in the shins, so I'll be leaving that to someone
else.

>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
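
The "read-only snapshot" model described in the quoted text can be sketched roughly as follows. This is a userland illustration using C11 atomics, not ipfw code and not the planned design: the names struct ruleset, rules_publish() and packet_allowed() are hypothetical, writers are assumed to be serialized by some other means, and old snapshots are deliberately never freed to keep the sketch short.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_RULES 8

/* Hypothetical rule table. It is never modified once published. */
struct ruleset {
	size_t	nrules;
	int	blocked_ports[MAX_RULES];
};

/* The currently published snapshot; NULL until the first publish. */
static _Atomic(struct ruleset *) current_rules;

/*
 * Writer side: build a complete new table off to the side, then make
 * it visible with a single atomic store.  Readers see either the old
 * table or the new one, never a half-updated mix.
 * Assumption: only one writer runs at a time.
 */
int
rules_publish(const int *ports, size_t n)
{
	struct ruleset *rs;
	size_t i;

	if (n > MAX_RULES)
		return (-1);
	rs = calloc(1, sizeof(*rs));
	if (rs == NULL)
		return (-1);
	rs->nrules = n;
	for (i = 0; i < n; i++)
		rs->blocked_ports[i] = ports[i];
	atomic_store_explicit(&current_rules, rs, memory_order_release);
	/* The previous snapshot is intentionally leaked in this sketch. */
	return (0);
}

/*
 * Reader side: take whatever snapshot is current and evaluate against
 * it.  No lock is acquired, so this path can never wait on a writer.
 * Returns 1 if the port is allowed, 0 if a rule blocks it.
 */
int
packet_allowed(int port)
{
	const struct ruleset *rs;
	size_t i;

	rs = atomic_load_explicit(&current_rules, memory_order_acquire);
	if (rs == NULL)
		return (1);	/* no rules published yet: allow */
	for (i = 0; i < rs->nrules; i++)
		if (rs->blocked_ports[i] == port)
			return (0);
	return (1);
}
```

Because the reader takes no lock, it cannot be part of a lock order reversal with a PCB lock. What the sketch leaves out is the hard part: a real implementation needs some way to know when no reader is still using an old snapshot before freeing it, for example a per-snapshot reference count.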