From owner-freebsd-current@FreeBSD.ORG Wed Sep 24 08:08:14 2008 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AAB6E106567B for ; Wed, 24 Sep 2008 08:08:14 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 86C748FC22 for ; Wed, 24 Sep 2008 08:08:14 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [65.122.17.41]) by cyrus.watson.org (Postfix) with ESMTP id 1D6A646B37; Wed, 24 Sep 2008 04:08:14 -0400 (EDT) Date: Wed, 24 Sep 2008 09:08:13 +0100 (BST) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Stefan Ehmann In-Reply-To: Message-ID: References: <200809231851.42849.shoesoft@gmx.net> User-Agent: Alpine 1.10 (BSF 962 2008-03-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-current@freebsd.org Subject: Re: ipfw: LOR/panic with uid rules X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Sep 2008 08:08:14 -0000 On Tue, 23 Sep 2008, Robert Watson wrote: > On Tue, 23 Sep 2008, Stefan Ehmann wrote: > >> Also posted about this problem recently in stable@. But got no replies >> there. So I tried on a recent CURRENT but the problem persists: >> >> ipfw rules using uid are causing a deadlock. eg. allow ip from any to any >> uid root A simple HTTP fetch triggers this problem nearly instantly. >> >> For me, this problem existed in 6.x with PREEMPTION enabled. It was fixed >> in 7.0. But in RELENG_7 and head it's back. This is a single processor i386 >> machine. > > This is an interesting edge case -- to prevent lookup of an inpcb in the > output path, we normally pass the inpcb reference down to the firewall so it > can directly access the cred rather than looking it up. Thus, we don't > recurse the global tcbinfo or inpcb locks normally on the transmit path. > However, it looks like we have an edge case here where we've freed the inpcb > but not yet unlocked the tcbinfo, and since the inpcb is freed we don't pass > it down--the firewall code tries to look up the inpcb and improperly > recurses the tcbinfo lock, boom. > > The uid/gid/jail code in ipfw is undesirable for a number of reasons, not > least because it's a layering violation. Historically, layering violations > meant slightly awkward and risky recursion, but now they also mean lock > recursion, which has more serious consequences. I'll investigate tomorrow > and see what the best solution is -- probably to drop the lock before > calling tcp_dropwithreset() on a NULL inpcb, which is a workaround/hack, but > I think our hands are forced in this case. I'll follow up with a patch > then. Here is a possible candidate patch, could you see if it resolves all of the issues you reported? (Possibly I have missed other similar cases as well...) Index: tcp_input.c =================================================================== --- tcp_input.c (revision 183235) +++ tcp_input.c (working copy) @@ -2472,12 +2472,19 @@ dropwithreset: KASSERT(headlocked, ("%s: dropwithreset: head not locked", __func__)); - tcp_dropwithreset(m, th, tp, tlen, rstreason); - - if (tp != NULL) + /* + * If tp is non-NULL, we call tcp_dropwithreset() holding both inpcb + * and global locks. However, if NULL, we must hold neither as + * firewalls may acquire the global lock in order to look for a + * matching inpcb. + */ + if (tp != NULL) { + tcp_dropwithreset(m, th, tp, tlen, rstreason); INP_WUNLOCK(tp->t_inpcb); - if (headlocked) - INP_INFO_WUNLOCK(&V_tcbinfo); + } + INP_INFO_WUNLOCK(&V_tcbinfo); + if (tp == NULL) + tcp_dropwithreset(m, th, NULL, tlen, rstreason); return; drop: