Date:      Fri, 07 Sep 2012 14:05:11 +0200
From:      Ian FREISLICH <ianf@clue.co.za>
To:        Ermal Luçi <eri@freebsd.org>
Cc:        pf@freebsd.org
Subject:   Re: [HEADS UP] merging projects/pf into head
Message-ID:  <E1T9xJ9-0000pZ-Mg@clue.co.za>
In-Reply-To: <CAPBZQG2b1AAdNBT9NVve8kzzxF%2Bu2T5Kgs10jO92nmZegvWebw@mail.gmail.com>
References:  <CAPBZQG2b1AAdNBT9NVve8kzzxF%2Bu2T5Kgs10jO92nmZegvWebw@mail.gmail.com> <20120905115140.GF15915@FreeBSD.org> <50476187.8000303@gibfest.dk> <20120905183607.GI15915@glebius.int.ru> <CAPBZQG0a4WVB4W4OwF3CAJH-G4DTDan-Nz1HR1SFAgFOfe%2Ba=Q@mail.gmail.com> <20120906064640.GL15915@glebius.int.ru> <CAPBZQG1iQ31bxMkKOKUUFpfOt15YMxgx1hmnj3HsQSj%2B%2BGJYqw@mail.gmail.com> <E1T9upR-0000bK-SI@clue.co.za> 

Ermal Luçi wrote:
> > - the "pf: state key linking mismatch" which affects pf as far back
> > as we've been prepared to test (FreeBSD-8.0).  Although it only
> > became visible in the logs in -CURRENT before 9-RELEASE with the
> > pf import then.  It manifests as connections stalling randomly.
> >
> This has been an issue since new pf(4) import.

My contention is that this issue is also present in earlier versions
of pf.  It's just not logged verbosely there:

[firewall1.jnb1] ~ # uname -a
FreeBSD firewall1.jnb1.gp-online.net 8.1-RELEASE FreeBSD 8.1-RELEASE #23: Tue Aug  7 20:21:54 SAST 2012     ianf@firewall1.jnb1.gp-online.net:/usr/obj/usr/src/sys/FIREWALL  amd64
[firewall1.jnb1] ~ # pfctl -s inf
Status: Enabled for 30 days 16:27:26          Debug: Urgent

State Table                          Total             Rate
  current entries                   377102               
  searches                    126189706387        47596.4/s
  inserts                       6358571792         2398.3/s
  removals                      6358194690         2398.2/s
Counters
  match                        23798723897         8976.4/s
  bad-offset                             0            0.0/s
  fragment                           29807            0.0/s
  short                              76362            0.0/s
  normalize                            234            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                          78290            0.0/s
  proto-cksum                     11023818            4.2/s
  state-mismatch                   4799367            1.8/s
  state-insert                       75295            0.0/s
  state-limit                           22            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s

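For anyone who wants to track these counters over time rather than
eyeballing them, here is a minimal sketch in Python.  It assumes the
output has the same shape as the excerpt above (an indented name, an
integer total, an optional rate).  The function name parse_pf_counters
and the SAMPLE text are illustrative only; they are not part of pf or
pfctl.

```python
import re

def parse_pf_counters(text):
    """Return {name: total} for indented lines shaped like
    '  name   <integer>   [<rate>/s]' (layout assumed from the excerpt)."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"^\s+([a-z][a-z -]*?)\s+(\d+)(?:\s+[\d.]+/s)?\s*$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters

# A shortened copy of the output quoted above, used as sample input.
SAMPLE = """\
State Table                          Total             Rate
  current entries                   377102
  searches                    126189706387        47596.4/s
  inserts                       6358571792         2398.3/s
Counters
  state-mismatch                   4799367            1.8/s
"""

# Two snapshots; the second simulates the counter having grown by 11589.
before = parse_pf_counters(SAMPLE)
after = parse_pf_counters(SAMPLE.replace("4799367", "4810956"))
delta = after["state-mismatch"] - before["state-mismatch"]
print("state-mismatch grew by", delta)
```

Feeding it two real snapshots taken some minutes apart would give the
growth of state-mismatch over that interval.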
Every time the state-mismatch counter increments, a connection
stalls.  This manifests as web pages sometimes needing to be reloaded
in order to finish downloading, or as ssh connections being
reset.  Although 4799367 is a small fraction of the total searches,
the chance of your flow being bitten is multiplied by each hop
through a FreeBSD router running pf.  While composing this email,
the state-mismatch counter increased by 11589.
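
To put rough numbers on that, here is a small sketch in Python using
the totals from the pfctl output above.  Two things in it are
assumptions rather than measurements: that each state insert
corresponds to one connection, and that hops fail independently of one
another.  The helper stall_probability is a made-up name for
illustration.

```python
# Totals copied from the pfctl output quoted above.
searches = 126_189_706_387
inserts = 6_358_571_792
mismatches = 4_799_367

# Mismatches as a fraction of state searches and of state inserts.
per_search = mismatches / searches
# Assumption: one state insert is roughly one connection.
per_insert = mismatches / inserts

def stall_probability(p, hops):
    """Chance that a flow is hit at least once across `hops` pf routers,
    assuming each hop fails independently with probability p."""
    return 1.0 - (1.0 - p) ** hops

print(f"per search: {per_search:.2e}")
print(f"per insert: {per_insert:.4%}")
for hops in (1, 2, 3, 5):
    print(hops, f"{stall_probability(per_insert, hops):.4%}")
```

For probabilities this small the result grows almost linearly with the
number of hops, which is the "multiplied by each hop" effect.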

We don't see this issue at all with Gleb's patches applied and
forwarding performance is greatly improved.

Whatever happens, I'd like a way forward to be found, because pf
deployed at the scale we're using it is unusable post 2011-06-28
(and not ideal before).

> > There's not been a fix since it was first reported.  We're seeing
> > 0.08% of our connections dropped on the floor or about 4 per second.
> > As a result, we've been seriously considering replacing our FreeBSD
> > routers.
> 
> I have missed the report of this, can you point to details?

This PR comes to mind:

http://www.freebsd.org/cgi/query-pr.cgi?pr=163208

I'm sure there were some earlier reports, but I can't find them in
a hurry.  I'm also pretty sure there have been reports on current@.

I also posted to current@:
http://www.freebsd.org/cgi/getmsg.cgi?fetch=164206+169604+/usr/local/www/db/text/2012/freebsd-current/20120812.freebsd-current

That is how I came to this list, following mail from Gleb.

I can tell you that this is not peculiar to 9 and later.  pf before 9
was simply silent about dropping the flows, although the problem
occurs less frequently there.

Ian

-- 
Ian Freislich
