Date:      Fri, 8 Jun 2012 12:39:43 +0200
From:      Ermal Luçi <eri@freebsd.org>
To:        pf@freebsd.org
Cc:        net@freebsd.org
Subject:   Re: [CFT] SMP-friendly pf
Message-ID:  <CAPBZQG363E7jNoQUCBOZr7A+gbUrBdFuCfaymd-c7Dh+s7r+0Q@mail.gmail.com>
In-Reply-To: <20120608061737.GA28197@glebius.int.ru>
References:  <20120608061737.GA28197@glebius.int.ru>

On Fri, Jun 8, 2012 at 8:17 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:
>  Hello, networkers!
>
>  [net@ in Cc, but further discussion should go on pf@]
>
>  As you probably already know, though some may not yet, the pf(4)
> subsystem in FreeBSD currently works under a single mutex. This mutex
> is acquired right at the beginning of any packet processing, and is dropped
> at the end. While one thread is in pf(4), all other threads are blocked on
> that mutex.
>
>  Meanwhile, modern computers are getting more and more cores, and modern
> network cards are getting more MSI interrupts, each serviced by a separate
> kernel thread in FreeBSD. So the single pf lock, which I call "the pf Giant" :),
> is becoming a point of hard contention.
>
>  Three and a half months ago I started a project, "SMP-friendly pf",
> which has recently entered the alpha stage. As you can see from the subject of
> this mail, this is a call for testing.
>
>
>  Willing to test?
>

As I already asked in private: without documentation or a schema
describing how you protect the various elements in pf(4), this is very
hard to review.
- What do you do to keep the statistics correct?
- What do you do about table protection? Are tables under the same lock as rules?
- How are if-bound versus floating states maintained?
- What protects the scrub ruleset?
- What protects the nat ruleset?
- ...
- How did you solve synproxy? Is it scalable?
- Do you think the taskqueues you introduce open up the possibility of
security issues?

There are many "how?" questions in this implementation that are difficult
to answer without you explaining!

>  The code lives in the projects/pf/head branch in SVN, and can be checked
> out with:
>
>  svn checkout http://svn.freebsd.org/base/projects/pf/head pflock
>
> where the argument "pflock" is just the directory name for the checked-out sources.
>  Then you need to build world and kernel from that branch and install them.
> The branch projects/pf/head gets head merged into it quite often, so if you
> run a head world at a revision equal (or at least close) to the last merge, then
> you don't need to install world; however, rebuilding pfctl and snmp_pf from
> that branch is necessary.
>  If you are going to run this alpha pf on any important box, then you
> definitely need to establish safety measures: have a second box running
> stable/9 or head as a carp(4) backup, ready to kick in if the new pf
> panics. A pfsync(4) connection should also be established between the new and
> backup boxes. pfsync(4) in the new code is wire compatible with stable/9
> or head.
>  I'm already running it on routers with 100k - 200k state entries,
> forwarding 20k - 40k pps. If you are brave, you should try, too :) Good
> luck and report any problems to me!
>
>
>  Interested in details?
>
>  From the very beginning of the project it was clear that the code was going
> to diverge significantly from the original OpenBSD code. OpenBSD has always
> developed pf without taking into account that the code could ever become
> multithreaded, so quite a lot needed to be changed. Thus, I started
> by removing the "#ifdef __FreeBSD__" from the code, and later I didn't
> hesitate even a fraction of a second if I wanted to toss some code. The pro
> is that the code is now much more readable and understandable than in head;
> the con is that the diff between us and OpenBSD is huge, although the amount
> of shared code is huge, too. So, from now on only manual merging of features
> from OpenBSD is possible, and bulk imports of the entire pf into FreeBSD are
> no longer possible.
>
>  The locking scheme is the following:
> - There is an rwlock(9) that protects rules and all kinds of data that aren't
>   modified by forwarding threads. Forwarding threads read-lock it; ioctl()
>   and other reconfiguring events write-lock it.
> - The states and key states storage has moved from RB-trees to hashes, with
>   separate mutexes per hash slot. This should give us decent parallelism
>   when forwarding packets.
> - Source nodes storage moved to a hash with per-slot locking.
> - pfsync(4) got a separate mutex.
> - Fragment reassembly got a separate mutex.
>
>  Apart from the above key changes, many other optimisations and fixes were done.
> The entire diff is 22k lines long. You can view the project's history here:
>
> http://svnweb.freebsd.org/base/projects/pf/head/?view=log
>
> (the beginning is on page 2 now, at r232042). I have tried to write informative
> commit messages.
>
> --
> Totus tuus, Glebius.



-- 
Ermal


