Date: Thu, 20 Jun 2002 19:47:10 -0700
From: Terry Lambert
To: Gary Thorpe
Cc: freebsd-arch@freebsd.org
Subject: Re: multiple threads for interrupts

Gary Thorpe wrote:
> >Seigo Tanimura wrote:
> > > One solution is to run multiple threads for each of the interrupt
> > > types. Since I noticed this issue first during my work of network
> > > locking, I have been tweaking the swi subsystem so that it runs
> > > multiple threads for an swi type. For those who are interested, the
> > > patch can be found at:
> > >
> > > http://people.FreeBSD.org/~tanimura/patches/swipool.diff.gz
> >
> >Benchmarks before and after, demonstrating an improvement?
> >
> >-- Terry
>
> I am not a kernel programmer, but I have read a paper which concludes
> that making threads have an "affinity" or "stickiness" to the last CPU
> they were run on is beneficial because it leads to less cache
> flushing/refilling. Maybe this will be a factor in having multiple
> threads for interrupt handling?

That's a general scheduling problem. The solution is well known, and has
been implemented in Dynix, then IRIX, and now Linux. Alfred Perlstein has
some patches that take it most of the way there, but they still leave an
interlock, because they maintain a global queue.

The solution I'm talking about is per-CPU scheduling queues, where threads
are only migrated between scheduling queues under extraordinary
conditions, so most of the scheduling never requires the locking that
FreeBSD-current has today. This solves the affinity problem.

I'm not sure the affinity fix solves the NETISR problem, because I think
the issue there is that the affinity you want in that case is mbuf<->CPU
affinity. Basically, if you take the network interrupt on CPU 3, then you
want to run the NETISR code associated with the protocol processing on
CPU 3 as well, to avoid cache busting.

The way I would suggest doing this is to run the protocol processing up
to user space at interrupt time (LRP). This gets rid of NETISR.

A lot of people complain that this won't allow you to receive as many
packets in a given period of time. They are missing the fact that this
only affects the burst rate until poll saturation occurs, at which point
the number of packets you receive is in fact clocked by buffer
availability; buffer availability is clocked by the ability of NETISR to
process the packets up to the user space boundary; and that, in turn, is
clocked by the ability to process the packets out to the user space
programs on the other end of the sockets.
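To make the LRP point concrete, here is a toy sketch of the two dispatch
styles. This is not kernel code, and none of the names in it are real
FreeBSD entry points; they are placeholders made up purely to show where
the protocol work happens relative to the interrupt:

    /*
     * Toy model of the two dispatch styles; all names are invented
     * for illustration, not taken from the kernel.
     */
    #include <stdio.h>

    struct mbuf { int len; };

    /* stand-in for the protocol stack running a packet up to the socket */
    static void protocol_input(struct mbuf *m)
    {
        printf("processed %d bytes up to the socket\n", m->len);
    }

    /* NETISR style: the interrupt only queues, the real work happens later */
    #define QLEN 32
    static struct mbuf *isr_queue[QLEN];
    static int isr_count;

    static void rx_intr_netisr(struct mbuf *m)
    {
        if (isr_count < QLEN)
            isr_queue[isr_count++] = m;   /* else: queue full, packet dropped */
    }

    static void run_netisr(void)          /* runs later, when splx() lets it */
    {
        for (int i = 0; i < isr_count; i++)
            protocol_input(isr_queue[i]);
        isr_count = 0;
    }

    /* LRP style: the protocol work is done in the interrupt itself */
    static void rx_intr_lrp(struct mbuf *m)
    {
        protocol_input(m);
    }

    int main(void)
    {
        struct mbuf a = { 64 }, b = { 1500 };

        rx_intr_netisr(&a);   /* nothing processed yet */
        run_netisr();         /* deferred protocol work happens here */
        rx_intr_lrp(&b);      /* processed at once, on the interrupted CPU */
        return 0;
    }

In the second style the packet is processed on the CPU that took the
interrupt, while its data is still hot in that CPU's cache, and there is
no separate soft interrupt pass to schedule at all.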
What this all boils down to is that you should only permit receive data
interrupts to occur at the rate at which you can move the data from the
wire, all the way through the system, to completion.

The feedback process in the absence of available mbufs is to take the
interrupt, and then replace the contents of the mbuf receive buffer ring
with the new contents. The mbufs only ever get pushed up the stack if
there is a replacement mbuf allocable from the system to put on the ring
in place of the received mbuf. Effectively, we are talking about receive
ring overflow here.

If you trace the dependency graph on mbuf availability all the way to
user space, you will see that if you are receiving packets faster than
you can process them, then you end up spending all your time servicing
interrupts, and that takes away from your time to actually push data
through. Jeff Mogul of DEC's Western Research Laboratory described this
as "receiver livelock" back in the early 1990s.

Luigi's and Jon Lemon's work only partially mitigates the problem.
Turning off interrupts doesn't deal with the NETISR triggering, which
only occurs when you splx() down from a hardware interrupt level so that
the SWI list is run. Running the packets partway up the stack doesn't
resolve the problems up to the user/kernel boundary. So both are only
partial solutions.

I'm convinced that CPU affinity needs to happen. I'm also convinced
that, for the most part, running NETISR in kernel threads, rather than
to completion at interrupt time, is the wrong way to go.

I'm currently agnostic on whether interrupt threads will help in areas
outside of networking. My instinct is that the added contention will
mean that they will not. I'm reserving judgement pending real
benchmarks.

To me, it looks like a lot of people believe something is better because
they are being told that it is better, not because they have personally
measured it, gotten better numbers, and proven to themselves that those
better numbers were a result of what they thought they were measuring,
rather than an artifact that could have been exploited in the old code
as well.

-- Terry
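P.S.: In case the ring-refill feedback above reads as hand-waving, here
is a toy model of the rule I mean. Again, this is not driver code; the
names are invented for illustration only, and the real thing lives in a
driver's receive interrupt path:

    /*
     * Toy model of the receive-ring refill rule: a received buffer is
     * only handed up the stack if a fresh buffer can be allocated to
     * take its place on the ring.
     */
    #include <stdio.h>
    #include <stdlib.h>

    struct mbuf { char data[2048]; };

    /* stand-in for the kernel mbuf allocator; it may fail under load */
    static struct mbuf *mbuf_alloc(void)
    {
        return malloc(sizeof(struct mbuf));
    }

    #define RING_SLOTS 4
    static struct mbuf *rx_ring[RING_SLOTS];

    /*
     * Receive completion for one ring slot.  If no replacement mbuf is
     * available, the same buffer stays on the ring and the packet is
     * dropped -- that drop is the feedback that throttles the receiver
     * instead of letting it livelock further up the stack.
     */
    static void rx_slot_complete(int slot, void (*deliver)(struct mbuf *))
    {
        struct mbuf *fresh = mbuf_alloc();

        if (fresh == NULL)
            return;                       /* recycle in place, drop packet */
        deliver(rx_ring[slot]);           /* push received data up the stack */
        rx_ring[slot] = fresh;            /* ring stays fully populated */
    }

    static void deliver_stub(struct mbuf *m)
    {
        (void)m;
        puts("delivered one packet");
    }

    int main(void)
    {
        for (int i = 0; i < RING_SLOTS; i++)
            rx_ring[i] = mbuf_alloc();
        rx_slot_complete(0, deliver_stub);
        return 0;
    }

When the allocator can't supply a replacement, the packet never leaves
the ring, so the cost of being overloaded is a cheap drop at the driver
rather than a pile of half-processed work queued further up the stack.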