Date: Wed, 03 Apr 2002 12:34:18 -0800
From: Terry Lambert
To: Robert Watson
Cc: Stefan Saroiu, freebsd-smp@FreeBSD.org
Subject: Re: Interrupt Handlers and Multiple CPUs

Robert Watson wrote:
> On Tue, 2 Apr 2002, Stefan Saroiu wrote:
> > The application still gets 20% of the CPU, which is quite good actually.
> > Although I'm not familiar with Druschel's work, I'm not sure whether
> > better scheduling will help me here.
> >
> > I've been toying with the idea of changing the driver to raise
> > interrupts only once every 100 packets or something like that.
> > Currently it is 1 interrupt per 1 packet.
>
> Ouch. No wonder you're having problems. You definitely want to implement
> either coalesced interrupt delivery or polled device access. In theory,
> we have both in 4.x and 5.x, but support for coalesced delivery is on a
> per-card basis. 5.x will allow you to do the kinds of things you want
> (eventually) once the network stack is fine-grained enough, but it sounds
> like the big problem is the driver model. I believe the fxp and em
> drivers support this, and they might be a good model to look at.
> You might want to consider posting to freebsd-net for pointers in this
> space.

I missed this part. Your best bet is coalesced interrupt delivery in hardware, which should be handled in all gigabit drivers already. I personally provided patches for soft interrupt coalescing for some of the more popular 10/100 drivers. Basically, it takes advantage of Bill Paul's separation of interrupt processing into rx_eof and tx_eof routines, and then adds a return code to indicate whether they did any work. If they did, it recalls them from the interrupt routine until there is no more work to do. You can put a high watermark on this by counting the number of times the handler loops, and bailing out once the count is reached, if that's all it's doing.

In addition, you might want to look at Luigi Rizzo's patches for polling, and Jonathan Lemon's patches for diminishing the NETISR requirements by processing some operations to completion at interrupt time (a "poor man's LRP"). Both of these are in -current (5.x).

The problem with both Jonathan's and Luigi's patches, when used with a user-space program, is that they do not provide weighted fair-share scheduling to ensure that a user-space program requiring an arbitrary number of cycles gets the CPU time it needs. In effect, you must manually tune the CPU-time ratio so that the user-space program doesn't starve, and likewise so that you don't stall the kernel's polling of packets "because it's time to do user-space processing" when you have no user-space processing to do.

The Druschel and Aron references I made deal with this issue exactly the way the Mogul reference I made suggests: measure the depth of the queue to user space, disable interrupts when it hits a high watermark, and re-enable them when it hits a low watermark (indicating that the user-space application has worked off the backlog). Using LRP itself has the additional benefit of removing latency from the processing path.
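To make the soft-coalescing idea above concrete, here is a minimal user-space sketch of the pattern: rx_eof/tx_eof routines that report whether they did work, and an interrupt handler that recalls them until the work runs out or a high watermark of passes is hit. The device structure, burst size, and the SOFT_COALESCE_MAX value are all hypothetical illustrations, not the actual patched driver code.

```c
#include <assert.h>

/*
 * Sketch of soft interrupt coalescing, assuming a driver split (in
 * Bill Paul's style) into rx_eof()/tx_eof() completion routines.
 * All names and constants here are illustrative.
 */
#define SOFT_COALESCE_MAX 6   /* high watermark: bail after this many passes */
#define RX_BURST          4   /* packets drained per rx_eof() call (made up) */

struct fakedev {
    int rx_pending;  /* packets waiting in the receive ring */
    int tx_pending;  /* transmit descriptors waiting to be reclaimed */
};

/* Process up to RX_BURST received packets; return 1 if any work was done. */
static int
rx_eof(struct fakedev *sc)
{
    int n = sc->rx_pending < RX_BURST ? sc->rx_pending : RX_BURST;

    sc->rx_pending -= n;
    return (n > 0);
}

/* Reclaim completed transmit descriptors; return 1 if any work was done. */
static int
tx_eof(struct fakedev *sc)
{
    int n = sc->tx_pending;

    sc->tx_pending = 0;
    return (n > 0);
}

/*
 * Interrupt handler: keep recalling rx_eof()/tx_eof() while either
 * reports work, but bail at the high watermark so a busy device
 * cannot monopolize the CPU.  Returns the number of passes made.
 * Note the bitwise `|`: both routines must run on every pass.
 */
static int
dev_intr(struct fakedev *sc)
{
    int passes = 0;

    while (rx_eof(sc) | tx_eof(sc)) {
        if (++passes >= SOFT_COALESCE_MAX)
            break;      /* leave the rest for the next interrupt */
    }
    return (passes);
}
```

Under light load the loop drains everything in a few passes and exits; under heavy load it bails at the watermark and lets the next interrupt (or a poll) pick up the remainder, which is the manual-tuning trade-off discussed above.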
The performance improvement is about a factor of four: measured, it went from 7,000 connections a second to 32,000 connections a second, processing through the TCP stack to completion rather than at NETISR, and that's without the SYN cache stuff.

If you simply don't have time to do this work, because your work is only tangentially related (i.e., you need the performance in order to investigate a network on which other research is taking place), then FreeBSD may not be the answer. You could apply the LRP patches to FreeBSD rather trivially; here are the patches for FreeBSD 4.3 (a port of the Rice University code, made at Duke University):

	http://www.cs.duke.edu/~anderson/freebsd/rescon/

I dislike resource containers, because they are primarily an accounting nicety that has no bearing on an embedded-system application, other than to slow it down.

If you want to ignore this problem entirely, then you might want to use QLinux instead. QLinux uses a number of the ideas I've noted so far in this thread (plus a couple of others, which I personally consider bad ideas) to improve overall performance. Here is a pointer to QLinux:

	http://lass.cs.umass.edu/~lass/software/qlinux/

In general, if you hit performance problems with QLinux, throwing more CPUs at the problem isn't going to help you out any more there than it does in FreeBSD. The problem you have is not amenable to being forcibly scaled by being spammed with more hardware.

-- Terry