From: Terry Lambert <tlambert2@mindspring.com>
Date: Thu, 18 Oct 2001 07:05:28 -0700
To: Mike Silbersack
Cc: freebsd-current@freebsd.org
Subject: Re: Some interrupt coalescing tests

Mike Silbersack wrote:
> What probably should be done, if you have time, is to add a bit of
> profiling to your patch to find out how it helps most.  I'm curious
> how many times it ends up looping, and also why it is looping
> (whether this is due to receive or transmit).  I think knowing this
> information would help optimize the drivers further, and perhaps
> suggest a tack we haven't thought of.

At 960 megabits per second on a Tigon III (full wire speed,
non-jumbogram), the looping is almost entirely (~85%) on the receive
side.  It loops for 75% of the hardware interrupts in the LRP case
(a reduction in interrupts from 12,000 to 8,000 -- 33%).

This is expected, since in the LRP case the receive processing load
is significantly higher, and even in that case we are not driving the
CPU to the wall in interrupt processing.

In the non-LRP case, the percentage drop in interrupt overhead is
~10% (as has been observed by others).  This makes sense, too, if you
consider that NETISR driving of receives means less time spent in
interrupt processing.

If we multiply the 15% in transmit (100% - 85% = 15%) by 3
(12,000/(12,000 - 8,000) = 100%/33% = 3), we get 45% in transmit in
the non-LRP case.  It would be nice if someone could confirm that
slightly less than half of the looping is on the transmit side for a
non-LRP kernel, but that's about what we should expect.  (A rough
sketch of where the profiling counters could hang is below.)

> > I don't know if anyone has tested what happens to apache in
> > a denial of service attack consisting of a huge number of
> > partial "GET" requests that are incomplete, and so leave state
> > hanging around in the HTTP server...
>
> I'm sure it would keel over and die, since it needs a process
> per socket.  If you're talking about sockets in TIME_WAIT or
> such, see netkill.pl.

I was thinking in terms of connections not getting dropped.  The most
correct way to handle this is probably an accept filter for
<CRLF><CRLF>, indicating a complete GET request (that still leaves
POST, though, which has a body), with dropping of long-duration
incomplete requests.

Unfortunately, without going into "Content-Length:" parsing, we are
pretty much screwed on POST, and very big POSTs still screw you badly
(imagine a "Content-Length: 1000000000").  You can mitigate that by
limiting request size, but then you are talking about putting HTTP
parsing in the kernel, above and beyond simple accept filters.
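Going back to the profiling question above: here is roughly where I
would hang the counters.  This is a sketch only -- all of the xx_*
names are invented for illustration, not taken from the real if_ti
code; xx_rxeof()/xx_txeof() stand in for the driver's "drain the rx
ring"/"reap the tx ring" routines, each returning nonzero if the
hardware posted more work while we were in there.

struct xx_softc;			/* hypothetical per-device state */

static int xx_rxeof(struct xx_softc *);	/* nonzero: rx ring refilled */
static int xx_txeof(struct xx_softc *);	/* nonzero: tx ring advanced */

static unsigned long xx_intr_total;	/* hardware interrupts taken */
static unsigned long xx_loop_rx;	/* extra passes forced by rx work */
static unsigned long xx_loop_tx;	/* extra passes forced by tx work */

static void
xx_intr(void *arg)
{
	struct xx_softc *sc = arg;
	int rx_more, tx_more;

	xx_intr_total++;
	do {
		rx_more = xx_rxeof(sc);
		tx_more = xx_txeof(sc);
		if (rx_more)
			xx_loop_rx++;	/* rx work forced another pass */
		if (tx_more)
			xx_loop_tx++;	/* tx work forced another pass */
	} while (rx_more || tx_more);
}

Dumping the three counters (e.g. via a sysctl) gives you exactly the
receive/transmit split being asked about.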
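As for the accept filter piece, the socket-level plumbing already
exists as SO_ACCEPTFILTER.  A minimal sketch, assuming the accf_http
module ("httpready") is loaded, with error handling abbreviated:

#include <sys/types.h>
#include <sys/socket.h>
#include <string.h>

int
set_http_filter(int lsock)
{
	struct accept_filter_arg afa;

	/* lsock must already be a listening socket. */
	memset(&afa, 0, sizeof(afa));
	strcpy(afa.af_name, "httpready");	/* needs accf_http loaded */
	return (setsockopt(lsock, SOL_SOCKET, SO_ACCEPTFILTER,
	    &afa, sizeof(afa)));
}

Note that this only defers accept() until a full request has been
buffered; it does nothing about the POST/"Content-Length:" problem
above.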
I'm really surprised abuse of the HTTP protocol itself in denial of
service attacks isn't more common.

> > Yes.  Floyd and Druschel recommend using high and low
> > watermarks on the amount of data pending processing in
> > user space.  The most common approach is to use a fair
> > share scheduling algorithm, which reserves a certain
> > amount of CPU for user space processing, but this is
> > somewhat wasteful, if there is no work, since it denies
> > quantum to the interrupt processing, potentially wrongly.
>
> I'm not sure such an algorithm would be wasteful - there must be data
> coming in to trigger such a huge amount of interrupts.  I guess this
> would depend on how efficient your application is, how you set the
> limits, etc.

Yes.  The "waste" comment is aimed at the idea that you will most
likely have a heterogeneous load, so you cannot accurately predict
ahead of time that you will spend 80% of your time in the kernel and
20% processing in user space, or whatever ratio you come up with.
This becomes much more of an issue under an attack, which will, by
definition, end up being asymmetric.

In practice, however, no one outside of a lab has a pipe size in
excess of 400 Mbits, so most people never really need 1 Gbit of
throughput anyway.  If you can make your system handle full wire
speed at 1 Gbit, you are pretty much safe from any attack someone
might want to throw at you, at least until the pipes get larger.
Even ignoring this, there's a pretty clear off-the-shelf hardware
path to a full 10 gigabits with PCI-X (8 gigabits times 2 busses
gets you there, which is 25 times the largest UUNet hosting center
pipe size today).

Fair share is more of a problem for slower interfaces without
hardware coalescing, and software is an OK band-aid for them (IMO).
I suspect that you will want to spend most of your CPU time doing
processing rather than interrupt handling, in any case (a rough
sketch of the watermark idea is in the P.S. below).

--
Terry
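P.S.: for concreteness, a rough sketch of the high/low watermark idea
mentioned above.  All the names here are made up (this is not from
any real driver); the idea is just to mask receive interrupts when
user space falls too far behind, and unmask them once it catches up.

#define XX_HIWAT	(512 * 1024)	/* bytes queued: stop rx interrupts */
#define XX_LOWAT	(128 * 1024)	/* bytes queued: resume rx interrupts */

static unsigned long xx_backlog;	/* bytes awaiting user-space work */
static int xx_rx_intr_enabled = 1;

static void
xx_backlog_grow(unsigned long nbytes)	/* called as rx data is queued */
{
	xx_backlog += nbytes;
	if (xx_rx_intr_enabled && xx_backlog > XX_HIWAT) {
		xx_rx_intr_enabled = 0;
		/* xx_hw_rx_intr(0): mask rx interrupts in hardware */
	}
}

static void
xx_backlog_drain(unsigned long nbytes)	/* called as user space consumes */
{
	xx_backlog -= nbytes;
	if (!xx_rx_intr_enabled && xx_backlog < XX_LOWAT) {
		xx_rx_intr_enabled = 1;
		/* xx_hw_rx_intr(1): unmask rx interrupts */
	}
}

The gap between the two watermarks is what keeps the thing from
flapping; no CPU reservation is needed when there is no backlog.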