Date: Tue, 3 May 2011 09:00:42 +0200
From: Daniel Hartmeier <daniel@benzedrine.cx>
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc: freebsd-stable@freebsd.org, freebsd-pf@freebsd.org
Subject: Re: RELENG_8 pf stack issue (state count spiraling out of control)
Message-ID: <20110503070042.GA9657@insomnia.benzedrine.cx>
In-Reply-To: <20110503015854.GA31444@icarus.home.lan>
References: <20110503015854.GA31444@icarus.home.lan>

I read those graphs differently: the problem doesn't arise slowly, but
rather seems to start suddenly at 13:00.

Right after 13:00, traffic on em0 drops, i.e. the firewall seems to stop
forwarding packets completely.

Yet, at the same time, the states start to increase, almost linearly at
about one state every two seconds, until the limit of 10,000 is reached.
Reaching the limit seems to be only a side-effect of a problem that
started at 13:00.

> Here's one piece of core.0.txt which makes no sense to me -- the "rate"
> column. I have a very hard time believing that was the interrupt rate
> of all the relevant devices at the time (way too high). Maybe this data
> becomes wrong only during a coredump? The total column I could believe.
>
> ------------------------------------------------------------------------
> vmstat -i
>
> interrupt                          total       rate
> irq4: uart0                        54768        912
> irq6: fdc0                             1          0
> irq17: uhci1+                        172          2
> irq23: uhci3 ehci1+                 2367         39
> cpu0: timer                  13183882632  219731377
> irq256: em0                    260491055    4341517
> irq257: em1                    127555036    2125917
> irq258: ahci0                  225923164    3765386
> cpu2: timer                  13183881837  219731363
> cpu1: timer                  13002196469  216703274
> cpu3: timer                  13183881783  219731363
> Total                        53167869284  886131154
> ------------------------------------------------------------------------

I find this suspect as well, but I don't have an explanation yet.

Are you using anything non-GENERIC related to timers, like changing HZ
or enabling polling?

Are you sure that the problem grew gradually worse, rather than starting
right at 13:00 and causing complete packet loss for the entire period?

Daniel
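
The "rate" column quoted above can be checked with simple arithmetic. The sketch below assumes that vmstat -i derives each rate as the total divided by an elapsed time in seconds, using integer division; that assumption is not confirmed in this thread. The function names are hypothetical, and the numbers are copied from the quoted output.

```python
# (total, rate) pairs copied from the quoted "vmstat -i" output.
VMSTAT_ROWS = {
    "irq4: uart0": (54768, 912),
    "irq6: fdc0": (1, 0),
    "irq17: uhci1+": (172, 2),
    "irq23: uhci3 ehci1+": (2367, 39),
    "cpu0: timer": (13183882632, 219731377),
    "irq256: em0": (260491055, 4341517),
    "irq257: em1": (127555036, 2125917),
    "irq258: ahci0": (225923164, 3765386),
    "cpu2: timer": (13183881837, 219731363),
    "cpu1: timer": (13002196469, 216703274),
    "cpu3: timer": (13183881783, 219731363),
    "Total": (53167869284, 886131154),
}


def implied_interval(total, rate):
    """Seconds the total would have to span to yield the reported rate."""
    return total / rate


def rows_matching_divisor(rows, divisor):
    """Names of rows whose rate equals total // divisor (integer truncation)."""
    return [name for name, (total, rate) in rows.items()
            if total // divisor == rate]


if __name__ == "__main__":
    for name, (total, rate) in VMSTAT_ROWS.items():
        if rate:  # skip rows with a zero rate to avoid dividing by zero
            print(f"{name:22s} {implied_interval(total, rate):10.4f} s")
    matches = rows_matching_divisor(VMSTAT_ROWS, 60)
    print(f"{len(matches)} of {len(VMSTAT_ROWS)} rows match total // 60")
```

Every row, including the Total, satisfies rate == total // 60, so the rate column looks like the totals divided by a 60-second interval rather than a measured interrupt rate. Why the divisor would be 60 is not established here.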