From: dima <_pppp@mail.ru>
To: Luigi Rizzo
Cc: net@freebsd.org
Date: Fri, 04 Mar 2005 16:32:43 +0300
Subject: Re: Polling objectives (was Re: Giant-free polling [PATCH])
In-Reply-To: <20050304025942.E134@xorpc.icir.org>
List-Id: Networking and TCP/IP with FreeBSD

-----Original Message-----
From: Luigi Rizzo
To: Pawel Jakub Dawidek
Date: Fri, 4 Mar 2005 02:59:42 -0800
Subject: Polling objectives (was Re: Giant-free polling [PATCH])

>
> On Fri, Mar 04, 2005 at 12:24:58AM +0100, Pawel Jakub Dawidek wrote:
> ....
> (luigi)
>> +> this said, if the lock requests are blocking, you basically end
>> +> up with the polling loops always contending for the locks, with only one
>> +> doing actual work and the other one always busy-waiting.
> ....
>
> (pawel)
>> I think we should just implement per-interface idlepoll threads, so we can
>> run polling code on many CPUs for many interfaces.
>
> no this won't work, because this would leave the problem of
> scheduling the idlepoll threads unresolved, and
> you would end up with a huge overhead context-switching the
> idlepoll threads, or with one (or a few) interfaces getting
> polled and saturating resources.
>
> For the record, even the UP case had this problem -- in the initial
> implementation of polling, the first busy interface encountered in
> the polling loop would easily saturate the entire ipintrq causing
> other packets to be dropped systematically.
>
> I would like to restate the motivations for using polling instead
> of interrupts. Among the advantages of polling there were:
>
> 1. reduction of context switches: not one per interrupt, but
>    one per timer tick (this is orders of magnitude smaller on
>    a busy box, where you do care for the overhead);
>
>    If you don't implement this, you will not improve the
>    throughput of the box under load, and possibly will not
>    even be able to prevent livelock
>
> 2. predictable scheduling of kernel vs userland work;
>
>    If you don't implement this, once again you won't be able
>    to prevent livelock
>
> 3. predictable scheduling of work among the various interfaces.
>
>    if you don't implement this, you might risk unfairness in
>    the handling of traffic, which can even lead to systematic
>    starvation of certain interfaces.
>
> The UP polling code did implement all of the above.
>
> I would suggest people interested in implementing SMP polling
> to make sure that their _design_ covers the above issues _before_
> coming up with patches. I _do not_ have a complete solution.

Well, my primary idea was to replace Giant with a dedicated lock for
polling, so that polling becomes independent of any blocking elsewhere
in the kernel and its performance is more predictable. That's why I
didn't change *your* design in any way. With Giant removed, we can let
SMP people use polling as well.
Let's discuss the scheduling sanity then...
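As a rough illustration of property 3 in the quoted list (predictable scheduling of work among the interfaces), here is a minimal userland sketch of a polling pass with a per-interface budget. Everything in it is hypothetical: struct fake_if, poll_pass() and the budget parameter are stand-ins invented for the example and are not taken from the kernel polling code or from the patch under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a polled interface; not the kernel's struct ifnet. */
struct fake_if {
	int pending;	/* packets waiting in the receive ring */
	int handled;	/* packets processed so far */
};

/*
 * One polling pass, e.g. one per timer tick. Each interface may process
 * at most `budget` packets, so a single busy interface cannot consume
 * the whole pass. Returns the total number of packets processed.
 */
static int
poll_pass(struct fake_if *ifs, size_t nifs, int budget)
{
	int total = 0;

	assert(budget > 0);
	for (size_t i = 0; i < nifs; i++) {
		/* Take what is pending, but never more than the budget. */
		int n = ifs[i].pending < budget ? ifs[i].pending : budget;

		ifs[i].pending -= n;
		ifs[i].handled += n;
		total += n;
	}
	return (total);
}
```

With one interface holding 1000 pending packets and another holding 3, a pass with a budget of 5 handles 5 packets on the first and all 3 on the second, so the lightly loaded interface is not starved by the busy one.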
The polling loop can be scheduled from 3 points:

1. hardclock_process() -- this one is per-CPU, so it's quite ok as long
   as all CPUs can poll different interfaces (this is possible in the
   second version of the patch because of mtx_trylock() in the loop).

2. trap() -- I don't like the current implementation; it should
   probably check the source of the interrupt, so that, say, a
   disk-intensive task wouldn't bring polling overhead. But note that
   the scheduling frequency isn't affected.

3. poll_idle() -- this is the strangest one, since it isn't mentioned
   anywhere else in the kernel... I don't like Pawel's idea of
   per-interface threads either...

PS: my question about locking in ether_poll_register() still stands.
I think pr[] should be protected by an sx lock while a new handler is
being added.

>
> Just coming up with something that is called polling but
> has none of the above properties would be misleading for
> the users (who do associate features to names) and a regression
> for the project. IMHO.
>
> thanks
> luigi
>
>> --
>> Pawel Jakub Dawidek                       http://www.wheel.pl
>> pjd@FreeBSD.org                           http://www.FreeBSD.org
>> FreeBSD committer                         Am I Evil? Yes, I Am!
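To illustrate the mtx_trylock() point made in the reply above, here is a minimal userland sketch that uses POSIX pthread_mutex_trylock() as a stand-in for the kernel's mtx_trylock(). The per-interface mutex, struct polled_if, poll_all_trylock() and the budget parameter are assumptions made for the example; the message does not show how the patch actually lays out its locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical polled interface with its own lock (an assumption). */
struct polled_if {
	pthread_mutex_t lock;
	int pending;	/* packets waiting to be processed */
};

/*
 * Walk all interfaces without ever blocking. If another caller already
 * holds an interface's lock, skip that interface instead of
 * busy-waiting on it. Returns how many interfaces this caller polled.
 */
static int
poll_all_trylock(struct polled_if *ifs, size_t nifs, int budget)
{
	int polled = 0;

	assert(budget > 0);
	for (size_t i = 0; i < nifs; i++) {
		if (pthread_mutex_trylock(&ifs[i].lock) != 0)
			continue;	/* busy elsewhere; move on */

		/* Process at most `budget` packets while holding the lock. */
		int n = ifs[i].pending < budget ? ifs[i].pending : budget;

		ifs[i].pending -= n;
		pthread_mutex_unlock(&ifs[i].lock);
		polled++;
	}
	return (polled);
}
```

The point of the non-blocking attempt is that two callers running the loop at the same time end up working on different interfaces rather than one doing the work while the other spins, which is the contention problem raised in the quoted text for blocking lock requests. It does not by itself address the fairness and kernel-versus-userland scheduling concerns.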