From: khatfield@socllc.net
To: Adrian Chadd
Cc: Barney Cordoba, Jim Thompson, Alfred Perlstein, freebsd-net@freebsd.org
Date: Tue, 20 Nov 2012 20:16:49 -0600
Subject: Re: FreeBSD boxes as a 'router'...

I may be misstating.

Specifically, under high-burst floods, either routed or being dropped by pf, we would see the system go unresponsive to user-level applications, SSH for example.

The system would still function, but it was inaccessible. To clarify, this was with any number of floods or attacks against any ports; the behavior remained the same. (These were not the SSH ports being hit.)

We did a lot of sysctl resource tuning to correct this, which helped with some floods, but high rates would still cause the behavior. Other times the system would simply drop all traffic, as if a buffer had filled or a connection limit had been hit, but neither was the case.

The attacks were also well within the bandwidth capabilities of the pipe and the network gear.

All of these issues stopped once we added polling, or at least the threshold at which they appeared was raised tremendously with polling.

That said, polling has some downsides, not necessarily due to FreeBSD but to application issues. Haproxy is one example: we had handshakes/premature connections terminated with polling that were not present with polling disabled.

So that is my reasoning for saying it was perfect for some things and not for others.

In the end, we spent years tinkering and it was always satisfactory but never perfect. Finally we grew to the point of replacing the edge with MX80s and left BSD to load balancing and the like, which finally resolved all the issues for us.

Admittedly, we were a DDoS mitigation company running high PPS and lots of bursting. BSD was beautiful until we ended up needing 10Gbps+ on the edge and it was time to go Juniper.

I still say BSD took us from nothing to a $30M company.
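For context, the kind of tuning I mean looked roughly like the following. This is an illustrative sketch, not our production config; the values are placeholders and the right ones depend on the release, the driver, and the traffic mix:

  # kernel built with: options DEVICE_POLLING and options HZ=1000
  ifconfig em0 polling                    # enable polling on the interface
  sysctl kern.polling.burst_max=1000      # max packets handled per poll pass
  sysctl kern.polling.user_frac=50        # CPU share reserved for userland (keeps SSH responsive)

  # resource limits we raised for flood conditions
  sysctl kern.ipc.nmbclusters=262144      # mbuf clusters (also settable from loader.conf)
  sysctl net.inet.ip.intr_queue_maxlen=4096
  sysctl net.inet.tcp.syncookies=1
  sysctl net.inet.icmp.icmplim=50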
So despite some things requiring tinkering, I think it is still worth the effort to put in the testing to find what is best for your gear and environment.

I got off track, but we did find one other thing: ipfw did seem to reduce interrupt load (likely because we couldn't do nearly the scrubbing with it that we could with pf). At any rate, less filtering may also fix the issue for the OP.

On your forwarding: we found that forwarding via a simple pf rule and a GRE tunnel to an app server, or running a tool like haproxy on the router itself, eliminated a large majority of our original stability issues versus pure firewall-based packet forwarding. (There is a rough sketch of that setup at the end of this mail.)

*I also agree, because as I mentioned in a previous email, our overall PPS seemed to decrease from FreeBSD 7 to 9. No idea why, but we seemed to get less benefit from polling than we did on 7.4.

Not to say that this wasn't due to error on our part or some issue with the Juniper switches, but we seemed to run into more issues with newer releases when it came to performance with Intel 1Gbps NICs. This later caused us to move more app servers to Linux, because we never could get to the bottom of some of those things. We do intend to revisit BSD with our new CDN company to see if we can restandardize on it for high-volume traffic servers.

Best,
Kevin

On Nov 20, 2012, at 7:19 PM, "Adrian Chadd" wrote:

> Ok, so since people are talking about it, and i've been knee deep in
> at least the older intel gige interrupt moderation - at maximum pps,
> how exactly is the interrupt moderation giving you a livelock
> scenario?
>
> The biggest benefit I found when doing some forwarding work a few
> years ago was to write a little daemon that actually sat there and
> watched the interrupt rates and packet drop rates per-interface - and
> then tuned the interrupt moderation parameters to suit. So at the
> highest pps rates I wasn't swamped with interrupts.
>
> I think polling here is hiding some poor choices in driver design and
> network stack design..
>
> adrian
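P.S. For anyone curious, the pf + GRE setup mentioned above looked roughly like this. It's a from-memory sketch with made-up addresses and a plain port-80 redirect, not our actual config:

  # on the FreeBSD router: GRE tunnel to the app server
  kldload if_gre                                      # if gre(4) isn't compiled in
  ifconfig gre0 create
  ifconfig gre0 tunnel 198.51.100.1 198.51.100.2      # outer (public) endpoints
  ifconfig gre0 inet 10.10.10.1 10.10.10.2 netmask 255.255.255.252 up

  # pf.conf fragment: redirect inbound web traffic down the tunnel
  #   rdr on $ext_if proto tcp from any to $vip port 80 -> 10.10.10.2 port 80
  #   pass on gre0 all
  pfctl -f /etc/pf.conf                               # reload the ruleset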
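And on Adrian's point about a watcher daemon: something in that spirit can be as small as a shell loop. The sketch below is only my guess at the shape of it, not Adrian's actual tool; the em(4) delay sysctl, the thresholds, and the exact vmstat/netstat column positions are illustrative and vary by driver and release:

  #!/bin/sh
  # Crude interrupt-moderation governor: raise the receive interrupt delay
  # when the interrupt rate spikes, lower it when new packet drops appear.
  IF=em0
  prev_drops=0
  while :; do
      rate=$(vmstat -i | awk "/$IF/ {print \$NF; exit}")       # interrupts/sec column
      drops=$(netstat -I $IF -d | awk 'NR==2 {print $NF}')     # cumulative drop column
      if [ "${rate:-0}" -gt 8000 ]; then
          sysctl dev.em.0.rx_int_delay=128                     # too many interrupts: slow them down
      elif [ "${drops:-0}" -gt "$prev_drops" ]; then
          sysctl dev.em.0.rx_int_delay=32                      # new drops: service the ring sooner
      fi
      prev_drops=${drops:-0}
      sleep 1
  done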