Date: Sat, 11 Oct 2008 14:06:49 -0700
From: "Michael K. Smith" <mksmith@adhost.com>
To: Jeremy Chadwick <koitsu@FreeBSD.org>
Cc: questions@freebsd.org
Subject: Re: FreeBSD as PF/Router/Firewall dying on the vine
Message-ID: <C5166379.1DDB9%mksmith@adhost.com>
In-Reply-To: <20081007043009.GA38719@icarus.home.lan>
Hello Jeremy:

On 10/6/08 9:30 PM, "Jeremy Chadwick" <koitsu@FreeBSD.org> wrote:

> On Mon, Oct 06, 2008 at 06:08:50PM -0700, Michael K. Smith - Adhost wrote:
>> Hello All:
>>
>> We have a load balanced pair of PF boxes sitting in front of a whole bunch of
>> servers doing all manner of things! It's been working great up until today
>> when it, well, didn't. Here's what I see in top -S.
>>
>>   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
>>    14 root        1 -44 -163     0K     8K CPU1   0  44:21 88.18% swi1: net
>>    11 root        1 171   52     0K     8K RUN    0  24:58 53.32% idle: cpu0
>>    10 root        1 171   52     0K     8K RUN    1  17:44 35.50% idle: cpu1
>>    24 root        1 -68 -187     0K     8K *Giant 0   5:30 11.62% irq16: em2 uhci3
>>    23 root        1 -68 -187     0K     8K WAIT   0   1:27  3.08% irq25: em1
>>    25 root        1 -68 -187     0K     8K WAIT   1   1:16  2.64% irq17: em3
>>
>> This is 6.3 with Intel 1000 Fiber and Copper interfaces, all using the 'em'
>> driver. Also, there are 15 VLANs configured on one of the NICs for subnet
>> separation.
>>
>> If anyone has any ideas I'm all ears. My google-fu is coming up empty with
>> the swi1: net
>
> Can you explain what the problem is?

Sorry it took so long to reply. We actually got the issue resolved, but I
wanted to make sure our fix actually worked. Here are the problem and the
solution.

The symptom was significant packet loss and connectivity issues to and
through the PF server. Even pinging the loopback address on the server
itself was returning 4 ms times.

The cause was a very busy NFS server with clients on the same VLAN, but on
a different subnet. We had a VLAN interface on em1 with two address ranges
attached, 10.255.0.0/16 and 10.212.6.0/16. The NFS server was on the 10.255
range and the clients were on the 10.212 range. Even though they were on the
same VLAN, they weren't directly ARP'able, so all traffic between them
(400 - 600 Mb/sec) had to be routed by the PF server. When we moved the
clients on to the same subnet as the server, everything stabilized.
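For anyone hitting the same thing, here is a rough Python sketch of the
on-link decision each host makes, which is why the NFS traffic ended up
crossing the firewall. The two address ranges are the ones from the setup
above; the host and gateway addresses and the next_hop() helper are made up
purely for illustration. Note that 10.212.6.0/16 has host bits set, so
strict=False normalizes it to 10.212.0.0/16.

```python
import ipaddress

# Address ranges attached to the VLAN interface (from the description above).
SERVER_NET = ipaddress.ip_network("10.255.0.0/16")
# strict=False accepts host bits and normalizes this to 10.212.0.0/16.
CLIENT_NET = ipaddress.ip_network("10.212.6.0/16", strict=False)


def next_hop(src_net, dst_ip, gateway):
    """Describe how a host configured with src_net reaches dst_ip.

    A host only ARPs directly for destinations inside its own subnet;
    everything else goes to its gateway, even on a shared VLAN.
    """
    dst = ipaddress.ip_address(dst_ip)
    if dst in src_net:
        return "direct (ARP)"
    return f"via gateway {gateway}"


# Hypothetical addresses: an NFS client, the NFS server, and a gateway
# address on the PF box.
print(next_hop(CLIENT_NET, "10.255.0.10", "10.212.0.1"))  # client -> server
print(next_hop(SERVER_NET, "10.255.0.20", "10.255.0.1"))  # same-subnet peer
```

The first call reports the gateway path (the situation before the fix), and
the second reports direct delivery (the situation after the clients were
moved onto the server's subnet).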
I think this was an issue of bad design on my part.

Regards,

Mike