Date: Tue, 3 May 2011 12:17:21 +0200
From: Vlad Galu <dudu@dudu.ro>
To: Vincent Hoffman <vince@unsane.co.uk>
Cc: freebsd-stable@freebsd.org, Jeremy Chadwick <freebsd@jdc.parodius.com>, freebsd-pf@freebsd.org
Subject: Re: RELENG_8 pf stack issue (state count spiraling out of control)
Message-ID: <BANLkTim9o0dCef_BVT29YviSiR0oXyk6VQ@mail.gmail.com>
In-Reply-To: <BANLkTi=DB7xw57LPCyUDzzeGYqX=j6Ju4w@mail.gmail.com>
References: <20110503015854.GA31444@icarus.home.lan> <20110503084800.GB9657@insomnia.benzedrine.cx> <20110503091619.GA39329@icarus.home.lan> <4DBFCB8D.10105@unsane.co.uk> <BANLkTi=DB7xw57LPCyUDzzeGYqX=j6Ju4w@mail.gmail.com>

On Tue, May 3, 2011 at 12:12 PM, Vlad Galu <dudu@dudu.ro> wrote:

> On Tue, May 3, 2011 at 11:31 AM, Vincent Hoffman <vince@unsane.co.uk> wrote:
>
>> On 03/05/2011 10:16, Jeremy Chadwick wrote:
>>
>> <snip lots of data relevant to the discussion but not my answer>
>> > Sadly I don't see a way with bsnmpd(8) to monitor things like interrupt
>> > usage, etc. otherwise I'd be graphing that. The more monitoring the
>> > better; at least then I could say "wow, interrupts really did shoot
>> > through the roof -- the box went crazy!" and RMA the thing. :-)
>> >
>> you could use net-mgmt/bsnmp-regex although I dont know what the
>> overhead for that is like.
>>
>
> I use munin for graphing, as it allows easy scripting without using SNMP.
>
> My case is a bit different from Jeremy's. Every once in a while there is a
> sudden traffic spike which impacts pf performance as well. However, the
> graphed figures are nowhere near what I'd consider alarming levels (this box
> has withstood more in the past). I was able to coincidentally log in after
> such a spike and noticed the pfpurge thread eating up about 30% of the CPU
> while using the normal optimization policy. In my case, it could be related
> to another issue I'm seeing on this box - mbuma allocation failures. Here
> are my graphs:
>
> http://dl.dropbox.com/u/14650083/PF/bge_bits_1-week.png
> http://dl.dropbox.com/u/14650083/PF/bge_packets_1-week.png
> http://dl.dropbox.com/u/14650083/PF/bge_stats_1-week.png
> http://dl.dropbox.com/u/14650083/PF/load-week.png
> http://dl.dropbox.com/u/14650083/PF/mbuf_errors-week.png
> http://dl.dropbox.com/u/14650083/PF/mbuf_usage-week.png
> http://dl.dropbox.com/u/14650083/PF/pf_inserts-week.png
> http://dl.dropbox.com/u/14650083/PF/pf_matches-week.png
> http://dl.dropbox.com/u/14650083/PF/pf_removals-week.png
> http://dl.dropbox.com/u/14650083/PF/pf_searches-week.png
> http://dl.dropbox.com/u/14650083/PF/pf_src_limit-week.png
> http://dl.dropbox.com/u/14650083/PF/pf_states-week.png
> http://dl.dropbox.com/u/14650083/PF/pf_synproxy-week.png
>
> I'll wait for the next time the symptom occurs to switch to a stateless
> configuration.
>

I forgot to mention this is a UP box using TSC for timekeeping and running ntpd.

-- /boot/loader.conf --
hint.p4tcc.0.disabled="1"
hint.acpi_throttle.0.disabled="1"
debug.acpi.disabled="timer"
-- /boot/loader.conf --

-- sysctl output --
kern.timecounter.choice: TSC(800) i8254(0) dummy(-1000000)
kern.timecounter.hardware: TSC
-- sysctl output --

-- 
Good, fast & cheap. Pick any two.
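
For anyone wanting to graph pf counters with munin in a similar way, the sketch below shows roughly what such a plugin could look like. It is not the script used for the graphs above, which is not shown in this message. It assumes the usual "State Table" layout printed by `pfctl -si`; the function names and the munin field names (`states`, `searches`, `inserts`, `removals`) are made up for illustration.

```python
import re
import subprocess

# Maps row labels in the "State Table" section of `pfctl -si`
# to hypothetical munin field names.
FIELDS = {
    "current entries": "states",
    "searches": "searches",
    "inserts": "inserts",
    "removals": "removals",
}

# A row looks like: "  searches    123456    12.3/s"
LINE_RE = re.compile(r"^\s*(current entries|searches|inserts|removals)\s+(\d+)")


def parse_pf_info(text):
    """Extract the state table totals from `pfctl -si` style text."""
    values = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            # Keep only the first occurrence, so a later table with the
            # same row labels (e.g. in verbose output) does not overwrite it.
            values.setdefault(FIELDS[match.group(1)], int(match.group(2)))
    return values


def munin_lines(values):
    """Format the counters as munin 'field.value N' lines."""
    return ["%s.value %d" % (name, values[name]) for name in sorted(values)]


def read_pf_info():
    """Run pfctl; this needs pf enabled and sufficient privileges."""
    return subprocess.check_output(["pfctl", "-si"]).decode("utf-8", "replace")
```

A real plugin would print `"\n".join(munin_lines(parse_pf_info(read_pf_info())))` on each poll and also answer the `config` argument with graph titles and labels; both are left out here to keep the sketch short.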