From owner-freebsd-net Sat Oct 27 11:55:16 2001
Message-ID: <3BDB033C.98ED2BDF@mindspring.com>
Date: Sat, 27 Oct 2001 11:55:56 -0700
From: Terry Lambert
Reply-To: tlambert2@mindspring.com
To: Mike Silbersack
Cc: Alfred Perlstein, Soren Kristensen, Luigi Rizzo, net@FreeBSD.ORG
Subject: Re: NEW CODE: polling support for device drivers.
References: <20011027044854.X88536-100000@achilles.silby.com>
Sender: owner-freebsd-net@FreeBSD.ORG

Mike Silbersack wrote:
> > > Is it possible to implement all the basic packet forwarding to run to
> > > completion at interrupt, i.e., when a packet comes in, the interrupt code
> > > would run until the packet has been sent out on another interface, and
> > > then loop back to see if there are more incoming packets, in a polling
> > > fashion.
> > >
> > > That would give the advantage of the polling, but without the latency.
> > >
> > > I'm mostly a hardware guy, so bear with me if it's not possible at
> > > all....
> >
> > Actually your idea is sort of what Terry Lambert posted about a couple
> > of weeks ago. I have no idea what happened in the end, the flameage
> > went on and on and I lost interest.
> >
> > Can anyone summarize for the benefit of the list?
>
> Summary: The patch Terry posted was to loop a few more times in the
> interrupt handler. I was going to commit it this weekend for the dc
> driver, but it looks like Luigi's work overshadows that.
>
> The idea proposed above is (similar to) LRP, which Terry implemented for
> clickarray. It is not in his patchset.

Luigi's stuff is rather complementary, IMO, and would require the
same thing, in order to not lose an inordinate number of packets to
other high latency processing.

On your backhanded comment on the unavailability of the LRP code...

Rice University has made an LRP implementation available for
integration into FreeBSD on no fewer than two occasions.

Realize that the LRP I implemented for ClickArray is based on the
2.x FreeBSD work they did for LRP. I _strongly_ disagree with their
resource container design, which they have built for FreeBSD 4.x,
and which is an easy port to the RELENG_4 branch.

There are two problems with the resource container implementation:

1) It ignores some fundamental issues of SMP, kernel preemption,
   stack reentrancy, and container contention.

2) It has a very bad license, which requires you to get permission
   -- and pay -- for commercial use of the code.

Both the resource container and the earlier implementations are
research toys, as they are distributed. By this, I mean that they
are not commercial quality code; the way they implemented their
integration was to define an alternate protocol type, and in so
doing, they have created a method whereby they can run the LRP
stack in parallel with the BSD stack. The reason this makes them
toys is that they do not handle most of the required processing for
a huge number of boundary conditions, and rely on the fact that
they will only get a subset of the traffic on the LRP stack that
they would otherwise get on the wire.
They also disable support for a number of RFC mandated TCP options
and actual TCP implementation details; my lab testing with these
reenabled shows no reason for this, other than that they were
difficult to reenable. ClickArray has a number of similar "toys" in
their research projects, as well.

My implementation (which I indicated before I would like to get
freed up and donated back to FreeBSD -- if I can't, I will rewrite
it from scratch, on my own time and equipment) does not have these
problems, in that I have integrated into the real BSD stack. It can
peacefully coexist with other networking stacks, and it can handle
all of TCP, not just the expected traffic, without breaking on, for
example, new socket allocation for a kqueue based batch accept.

You will get it when I can give it to you, and not before;
meanwhile, I'll give you what I can when I can.

PS: Luigi had copies of some of my code well before I was able to
give copies of the coalescing code to FreeBSD; he had previously
signed a non-disclosure agreement which allowed me to hand him a
lot of code that I still have not handed to others. It's no wonder
that there is some overlap.

PPS: The polling support for device drivers was something I had
discussed with Bill Paul previously. He did not like the idea of
externalizing the entry points for the tx_eof and rx_eof routines,
as it changed the interface, and not all cards were capable of
supporting it, anyway.

PPPS: A more correct implementation of polling would probably be
implemented using opportunistic timers; due to an office move which
will not really be complete before Tuesday, I am unfortunately
unable to cite paper references on opportunistic timers today.

PPPPS: The techniques we are describing are nothing new; the people
at Rice University who have pioneered a number of them are actually
the founders of iMimic, which, if you were into such things, you
would know is the company that kicks everyone's butt on the
commercial benchmarks.
Regards,
-- Terry
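
For readers following the thread, here is a minimal user-space sketch of
the two ideas discussed above: an interrupt handler that loops a few more
times before returning, and a polling entry point that reuses the same
receive routine. It is not the posted patch, the dc driver, or Luigi's
polling code. Every name in it (fake_nic, fake_rx_eof, fake_intr,
fake_poll, MAX_INTR_LOOPS) is hypothetical, and the "hardware" is just a
counter of pending packets.

```c
/*
 * Illustrative simulation only. All identifiers are hypothetical and
 * do not correspond to any FreeBSD driver interface.
 */
#define MAX_INTR_LOOPS 4        /* assumed bound on passes per interrupt */

/* Simulated NIC: a counter stands in for the receive ring. */
struct fake_nic {
    int rx_pending;             /* packets the "hardware" has queued */
    int rx_handled;             /* packets the driver has processed */
};

/*
 * Stand-in for a driver's rx_eof routine: process at most 'burst'
 * packets that are pending right now and return how many were handled.
 */
int fake_rx_eof(struct fake_nic *nic, int burst)
{
    int n = nic->rx_pending < burst ? nic->rx_pending : burst;

    nic->rx_pending -= n;
    nic->rx_handled += n;
    return n;
}

/*
 * Interrupt handler that keeps looping while packets remain, but gives
 * up after MAX_INTR_LOOPS passes so other work is not starved.
 * Returns the number of passes taken.
 */
int fake_intr(struct fake_nic *nic, int burst)
{
    int passes = 0;

    while (passes < MAX_INTR_LOOPS && nic->rx_pending > 0) {
        fake_rx_eof(nic, burst);
        passes++;
    }
    return passes;
}

/*
 * Polling entry point: the same receive routine, called once from a
 * timer or idle loop instead of from an interrupt.
 */
int fake_poll(struct fake_nic *nic, int burst)
{
    return fake_rx_eof(nic, burst);
}
```

With 10 packets pending and a burst of 4, fake_intr() drains the ring in
three passes. With 100 pending it stops after four passes and leaves the
rest for a later interrupt or for fake_poll(), which is the trade-off
between latency and starving other processing that the thread is about.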