Date: Mon, 18 Nov 2002 17:48:39 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Luigi Rizzo <rizzo@icir.org>
Cc: David Gilbert <dgilbert@velocet.ca>, dolemite@wuli.nu, freebsd-hackers@FreeBSD.ORG, freebsd-net@FreeBSD.ORG
Subject: Re: Small initial LRP processing patch vs. -current
Message-ID: <3DD99877.2F7C6D12@mindspring.com>
References: <20021109180321.GA559@unknown.nycap.rr.com> <3DCD8761.5763AAB2@mindspring.com> <15823.51640.68022.555852@canoe.velocet.net> <3DD1865E.B9C72DF5@mindspring.com> <15826.24074.605709.966155@canoe.velocet.net> <3DD2F33E.BE136568@mindspring.com> <3DD96FC0.B77331A1@mindspring.com> <20021118151109.B19767@xorpc.icir.org> <3DD99018.73B703A@mindspring.com> <20021118173155.C29018@xorpc.icir.org>
Luigi Rizzo wrote:
> > > This patch will not make any difference if you have device_polling
> > > enabled, because polling already does this -- queues a small number
> > > of packets (default is max 5 per card) and calls ip_input on them
> > > right away.
> >
> > The problem with this is that it introduces a livelock point at
>
> no it doesn't because new packets are not grabbed off the card until
> the queue is empty.

It's still possible to run out of mbufs.

> > I do not understand this claim:
> >
> > > > The basic theory here is that ipintr processing can be delayed
> > > > indefinitely, if interrupt load is high enough, and there will
> > > > be a maximum latency of 10ms for IP processing after ether_input(),
> > > > in the normal stack case, without the patches.
> > >
> > > because netisr are not timer driven to the best of my knowledge --
> > > they just fire right after the cards' interrupts are complete.
> >
> > That's almost right.  The soft interrupt handlers run when you
> > splx() out of a raised priority level.  In fact, this happens at
> > the end of clockintr, so NETISR *is* timer driven, until you hit
> i think it happens at the end of the device interrupt!

It happens at splx().  That does happen at the end of a device
interrupt, but... acking the interrupt can result in another interrupt
before processing is complete to the point where the soft interrupts
will run.

See the Jeffrey Mogul paper on receiver livelock, and the Rice
University paper on LRP.

> > Polling changes this somewhat.  The top end is reduced, in exchange
> > for not dropping off as badly
>
> actually, this is not true in general, and not in the case of
> FreeBSD's DEVICE_POLLING.
>
> Polling-enabled drivers fetch the cards' state from in-memory
> descriptors, not from the interrupt status register across the
> PCI bus.  Also, they look for exceptions (which require going
> through the PCI bus) only every so often.  So the claim that the top
> end is reduced is not true in general -- it depends on how the
> interrupt vs. polling code is written and optimised.

No.  That's more of a side-issue, and it's dictated by the hardware
and firmware implementation more than anything else, I think.

The actual problem is that the balance between system time spent
polling in the kernel vs. running the application in user space is
based on reserving a fixed amount of time, rather than a
load-dependent amount of time, for processing.

I understand that DEVICE_POLLING is your baby; I'm not attacking your
implementation.  It does what it was supposed to do.  Things are
better with polling than without; all I am saying is that they could
be better still.

The reason I asked for the second set of numbers (polling with and
without the ip_input code path change) is actually to support the
idea that polling and/or additional patches are still required.

You really want to achieve the highest possible throughput without
ever dropping a packet.  If you drop a packet anywhere between the
network card and the application, then you are not handling the
highest load the hardware is capable of handling.

Polling only deals with this up to the top of the TCP stack, at a
cost of increased latency over interrupts in exchange for reduced
interrupt processing overhead (you get to wait until ether_poll()
runs, instead of handling the packet immediately, which introduces an
unavoidable hardclock/2 latency).
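To make the two code paths being argued about concrete, here is a
rough sketch.  It is not the actual patch, and it is not the
DEVICE_POLLING code; the names (ipintrq, schednetisr(), IF_ENQUEUE(),
ip_input(), splimp()) are the 4.x-era stack's, the lrp_direct_dispatch
knob is invented for illustration, and the headers are approximate.

/*
 * Sketch only: contrast the classic netisr path with LRP-style
 * direct dispatch.  The classic path queues the packet and waits for
 * the IP soft interrupt to run at the next splx() that drops back
 * down -- worst case, not until the end of a clock interrupt, which
 * is where the 10ms (HZ=100) latency bound comes from.  Direct
 * dispatch runs ip_input() in the context that received the packet.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>
#include <net/if.h>
#include <net/if_var.h>
#include <net/netisr.h>
#include <netinet/ip_var.h>

static int lrp_direct_dispatch = 1;	/* invented knob, for illustration */

static void
demux_ip_packet(struct mbuf *m)
{
	int s;

	if (lrp_direct_dispatch) {
		/* LRP-style: process to the top of IP right now. */
		ip_input(m);
		return;
	}

	/* Classic path: queue for ipintr() and schedule the soft interrupt. */
	s = splimp();
	if (IF_QFULL(&ipintrq)) {
		IF_DROP(&ipintrq);
		m_freem(m);		/* queue overflow: packet is lost */
	} else {
		IF_ENQUEUE(&ipintrq, m);
		schednetisr(NETISR_IP);
	}
	splx(s);
}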
You avoid interrupt livelock, while still risking a deadly embrace
waiting for applications to service the sockets (hence the need for
scheduler hacks, as well).

When it comes down to it, latency := pool retention time, and the
smaller your pool retention time, the more connections you can handle
simultaneously with a given pool size (a back-of-the-envelope example
follows below).

-- Terry
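To put rough numbers on that pool-retention-time point, here is a
back-of-the-envelope sketch.  All of the figures (pool size,
per-connection packet rate, retention times) are invented for
illustration; the only real relationship used is Little's law:
buffers pinned per connection = per-connection packet rate * average
pool retention time.

/*
 * Back-of-the-envelope only: with a fixed buffer pool, cutting the
 * pool retention time by 10x lets the same pool cover roughly 10x as
 * many simultaneous connections at the same per-connection rate.
 */
#include <stdio.h>

int
main(void)
{
	double pool_size = 25600.0;		/* mbuf clusters in the pool (assumed) */
	double pkts_per_conn = 1000.0;		/* packets/sec per connection (assumed) */
	double retention[] = { 0.010, 0.001 };	/* 10ms vs. 1ms pool retention time */
	int i;

	for (i = 0; i < 2; i++) {
		/* Little's law: buffers held on average per connection. */
		double pinned = pkts_per_conn * retention[i];

		printf("T = %.3fs: %4.1f clusters/conn -> ~%.0f connections\n",
		    retention[i], pinned, pool_size / pinned);
	}
	return (0);
}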