Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 29 Dec 2006 09:41:27 -0600
From:      "Christian S.J. Peron" <csjp@FreeBSD.ORG>
To:        Max Laier <max@love2party.net>
Cc:        avatar@mmlab.cse.yzu.edu.tw, freebsd-pf@freebsd.org
Subject:   Re: debug.mpsafenet=1 vs. user/group rules [Re: kern/106805: ...]
Message-ID:  <45953727.7020405@FreeBSD.ORG>
In-Reply-To: <200612161709.48875.max@love2party.net>
References:  <200612161335.kBGDZkMj012022@freefall.freebsd.org> <200612161709.48875.max@love2party.net>

next in thread | previous in thread | raw e-mail | index | archive | help
Max,

I have replied to this mail and I guess it has been lost, as I have had 
no response.  Although this technically makes
the problem harmless, all you are doing is moving the lock order 
reversal from pf+inpcb to pfil+inpcb. The
only major difference is that we can have multiple readers of the pfil 
lock, making the LOR harmless, in this path.

In IPFW, I changed the chain locks to use rw_lock(9), so we can get rid 
of the mpsafenet requirement for IPFW.

I've thought about doing this in IPFW (looking up the ucred if there are 
any uid/gid/jail rules) prior to picking up the
chain locks, but realized we could remove the lock ordering issue all 
together if we anihilated the pfil lock.

One idea I had was introducing PFIL_STATIC which requires that modules 
wishing to register pfil hooks did so at
boot-time only. Something similar was done for the MAC framework to 
avoid having to pickup policy list locks
for each security decision.

This would also have the nice side effect of eliminating a couple of 
atomic instructions per packet in the fast path.
Taking this approach along with moving the inpcb lookup prior to the 
firewall chain locks allows us to actually
eliminate the lock ordering issues.  However, the layering violation 
still exists. But it's harmless.

However, having said all that. This works, too.


Max Laier wrote:
> Okay, spoken too quick ... I just had an idea (enlightment you might say - 
> given the time of year), that might finally get us rid of this symptom 
> (not of the problem though).
>
> Short recap on why this is happening:  Checking socket credentials on the 
> IP layer (where pf lives) is a layering violation.  A useful one, you may 
> argue, but nontheless - it causes a lock order reversal:  In order to 
> walk the pf rules we need to hold the pf lock, in order to walk the 
> socket hash we need to hold the "inp" lock.
>
> The attached diff circumvents the problem by **always** doing the 
> credential lookup *before* walking the pf rules.  This has the benefit, 
> that it works (at least I think it should), but there is a price to pay.  
> Now we have to pay for the socket lookup for *every* tcp and udp packet 
> instead of just for those that really hit uid/gid rules.  That's why I 
> decided to make is a config option "PF_MPFSAFE_UGID" which you can turn 
> on if you are running a setup that will benefit.  The patch turns it on 
> for the module-built by default.
>
> A possible scenario that should benefit is a big iron SMP box running lot 
> of services that you want to filter using *stateful* uid/gid rules.  For 
> this setup where a huge percentage of the packets that are not captured 
> by states eventually match a uid/gid rule, you will even get added 
> parallelism with this patch.
>
> On every other typical setup, it should be better to avoid user/group 
> rules or to disable mpsafenet.
>
> In order for this to hit the tree, I need tests confirming that it really 
> helps and possibly benchmarks that qualify the impact of it.  Thanks.
>
>   
> ------------------------------------------------------------------------
>
> Index: conf/options
> ===================================================================
> RCS file: /usr/store/mlaier/fcvs/src/sys/conf/options,v
> retrieving revision 1.567
> diff -u -r1.567 options
> --- conf/options	10 Dec 2006 04:23:23 -0000	1.567
> +++ conf/options	16 Dec 2006 15:36:08 -0000
> @@ -349,6 +349,7 @@
>  DEV_PF			opt_pf.h
>  DEV_PFLOG		opt_pf.h
>  DEV_PFSYNC		opt_pf.h
> +PF_MPSAFE_UGID		opt_pf.h
>  ETHER_II		opt_ef.h
>  ETHER_8023		opt_ef.h
>  ETHER_8022		opt_ef.h
> Index: contrib/pf/net/pf.c
> ===================================================================
> RCS file: /usr/store/mlaier/fcvs/src/sys/contrib/pf/net/pf.c,v
> retrieving revision 1.42
> diff -u -r1.42 pf.c
> --- contrib/pf/net/pf.c	22 Oct 2006 11:52:11 -0000	1.42
> +++ contrib/pf/net/pf.c	16 Dec 2006 15:34:52 -0000
> @@ -3032,6 +3032,12 @@
>  		return (PF_DROP);
>  	}
>  
> +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID)
> +	PF_UNLOCK();
> +	lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp);
> +	PF_LOCK();
> +#endif
> +
>  	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
>  
>  	if (direction == PF_OUT) {
> @@ -3428,6 +3434,12 @@
>  		return (PF_DROP);
>  	}
>  
> +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID)
> +	PF_UNLOCK();
> +	lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp);
> +	PF_LOCK();
> +#endif
> +
>  	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
>  
>  	if (direction == PF_OUT) {
> Index: modules/pf/Makefile
> ===================================================================
> RCS file: /usr/store/mlaier/fcvs/src/sys/modules/pf/Makefile,v
> retrieving revision 1.12
> diff -u -r1.12 Makefile
> --- modules/pf/Makefile	12 Sep 2006 04:25:12 -0000	1.12
> +++ modules/pf/Makefile	16 Dec 2006 15:41:00 -0000
> @@ -10,7 +10,7 @@
>  	in4_cksum.c \
>  	opt_pf.h opt_inet.h opt_inet6.h opt_bpf.h opt_mac.h
>  
> -CFLAGS+=  -I${.CURDIR}/../../contrib/pf
> +CFLAGS+=  -I${.CURDIR}/../../contrib/pf -DPF_MPSAFE_UGID
>  
>  .if !defined(KERNBUILDDIR)
>  opt_inet.h:
>   




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?45953727.7020405>