From owner-freebsd-pf@FreeBSD.ORG Tue Jan 9 14:23:20 2007 Return-Path: X-Original-To: freebsd-pf@freebsd.org Delivered-To: freebsd-pf@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BDA0B16A47C; Tue, 9 Jan 2007 14:23:20 +0000 (UTC) (envelope-from avatar@mmlab.cse.yzu.edu.tw) Received: from www.mmlab.cse.yzu.edu.tw (www.mmlab.cse.yzu.edu.tw [140.138.150.166]) by mx1.freebsd.org (Postfix) with ESMTP id 3859413C448; Tue, 9 Jan 2007 14:23:20 +0000 (UTC) (envelope-from avatar@mmlab.cse.yzu.edu.tw) Received: by www.mmlab.cse.yzu.edu.tw (qmail, from userid 1000) id 44C718C9C86; Tue, 9 Jan 2007 22:23:18 +0800 (CST) Received: from localhost (localhost [127.0.0.1]) by www.mmlab.cse.yzu.edu.tw (qmail) with ESMTP id E76D88C98F9; Tue, 9 Jan 2007 22:23:18 +0800 (CST) Date: Tue, 9 Jan 2007 22:23:18 +0800 (CST) From: Tai-hwa Liang To: "Christian S.J. Peron" In-Reply-To: <45953727.7020405@FreeBSD.ORG> Message-ID: <0701092218228.1404@www.mmlab.cse.yzu.edu.tw> References: <200612161335.kBGDZkMj012022@freefall.freebsd.org> <200612161709.48875.max@love2party.net> <45953727.7020405@FreeBSD.ORG> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-pf@freebsd.org Subject: Re: debug.mpsafenet=1 vs. user/group rules [Re: kern/106805: ...] X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jan 2007 14:23:20 -0000 On Fri, 29 Dec 2006, Christian S.J. Peron wrote: > Max, > > I have replied to this mail and I guess it has been lost, as I have had no > response. Although this technically makes > the problem harmless, all you are doing is moving the lock order reversal > from pf+inpcb to pfil+inpcb. The I probably missed something; however, with Max's patch applied, I did not see any pf related LOR on a WITNESS + INVARIANT enabled -STABLE box during last two weeks. > only major difference is that we can have multiple readers of the pfil lock, > making the LOR harmless, in this path. > > In IPFW, I changed the chain locks to use rw_lock(9), so we can get rid of > the mpsafenet requirement for IPFW. > > I've thought about doing this in IPFW (looking up the ucred if there are any > uid/gid/jail rules) prior to picking up the > chain locks, but realized we could remove the lock ordering issue all > together if we anihilated the pfil lock. > > One idea I had was introducing PFIL_STATIC which requires that modules > wishing to register pfil hooks did so at > boot-time only. Something similar was done for the MAC framework to avoid > having to pickup policy list locks > for each security decision. > > This would also have the nice side effect of eliminating a couple of atomic > instructions per packet in the fast path. > Taking this approach along with moving the inpcb lookup prior to the firewall > chain locks allows us to actually > eliminate the lock ordering issues. However, the layering violation still > exists. But it's harmless. > > However, having said all that. This works, too. > > > Max Laier wrote: >> Okay, spoken too quick ... I just had an idea (enlightment you might say - >> given the time of year), that might finally get us rid of this symptom (not >> of the problem though). >> >> Short recap on why this is happening: Checking socket credentials on the >> IP layer (where pf lives) is a layering violation. A useful one, you may >> argue, but nontheless - it causes a lock order reversal: In order to walk >> the pf rules we need to hold the pf lock, in order to walk the socket hash >> we need to hold the "inp" lock. >> >> The attached diff circumvents the problem by **always** doing the >> credential lookup *before* walking the pf rules. This has the benefit, >> that it works (at least I think it should), but there is a price to pay. >> Now we have to pay for the socket lookup for *every* tcp and udp packet >> instead of just for those that really hit uid/gid rules. That's why I >> decided to make is a config option "PF_MPFSAFE_UGID" which you can turn on >> if you are running a setup that will benefit. The patch turns it on for >> the module-built by default. >> >> A possible scenario that should benefit is a big iron SMP box running lot >> of services that you want to filter using *stateful* uid/gid rules. For >> this setup where a huge percentage of the packets that are not captured by >> states eventually match a uid/gid rule, you will even get added parallelism >> with this patch. >> >> On every other typical setup, it should be better to avoid user/group rules >> or to disable mpsafenet. >> >> In order for this to hit the tree, I need tests confirming that it really >> helps and possibly benchmarks that qualify the impact of it. Thanks. >> >> ------------------------------------------------------------------------ >> >> Index: conf/options >> =================================================================== >> RCS file: /usr/store/mlaier/fcvs/src/sys/conf/options,v >> retrieving revision 1.567 >> diff -u -r1.567 options >> --- conf/options 10 Dec 2006 04:23:23 -0000 1.567 >> +++ conf/options 16 Dec 2006 15:36:08 -0000 >> @@ -349,6 +349,7 @@ >> DEV_PF opt_pf.h >> DEV_PFLOG opt_pf.h >> DEV_PFSYNC opt_pf.h >> +PF_MPSAFE_UGID opt_pf.h >> ETHER_II opt_ef.h >> ETHER_8023 opt_ef.h >> ETHER_8022 opt_ef.h >> Index: contrib/pf/net/pf.c >> =================================================================== >> RCS file: /usr/store/mlaier/fcvs/src/sys/contrib/pf/net/pf.c,v >> retrieving revision 1.42 >> diff -u -r1.42 pf.c >> --- contrib/pf/net/pf.c 22 Oct 2006 11:52:11 -0000 1.42 >> +++ contrib/pf/net/pf.c 16 Dec 2006 15:34:52 -0000 >> @@ -3032,6 +3032,12 @@ >> return (PF_DROP); >> } >> +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID) >> + PF_UNLOCK(); >> + lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp); >> + PF_LOCK(); >> +#endif >> + >> r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); >> if (direction == PF_OUT) { >> @@ -3428,6 +3434,12 @@ >> return (PF_DROP); >> } >> +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID) >> + PF_UNLOCK(); >> + lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp); >> + PF_LOCK(); >> +#endif >> + >> r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); >> if (direction == PF_OUT) { >> Index: modules/pf/Makefile >> =================================================================== >> RCS file: /usr/store/mlaier/fcvs/src/sys/modules/pf/Makefile,v >> retrieving revision 1.12 >> diff -u -r1.12 Makefile >> --- modules/pf/Makefile 12 Sep 2006 04:25:12 -0000 1.12 >> +++ modules/pf/Makefile 16 Dec 2006 15:41:00 -0000 >> @@ -10,7 +10,7 @@ >> in4_cksum.c \ >> opt_pf.h opt_inet.h opt_inet6.h opt_bpf.h opt_mac.h >> -CFLAGS+= -I${.CURDIR}/../../contrib/pf >> +CFLAGS+= -I${.CURDIR}/../../contrib/pf -DPF_MPSAFE_UGID >> .if !defined(KERNBUILDDIR) >> opt_inet.h: