Date: Wed, 21 Nov 2012 10:26:01 -0800
Subject: Re: FreeBSD boxes as a 'router'...
From: Adrian Chadd
To: Andre Oppermann
Cc: Barney Cordoba, Jim Thompson, Alfred Perlstein, khatfield@socllc.net, "freebsd-net@freebsd.org"

On 21 November 2012 00:30, Andre Oppermann wrote:
> On 21.11.2012 08:55, Adrian Chadd wrote:
>>
>> Something that has popped up a few times, even recently, is breaking
>> out of an RX loop after you service a number of frames.
>
> That is what I basically described.

Right, and this can be done right now without too much reworking,
right? I mean, people could begin by doing a drive-by on drivers for
this.

The RX path for a driver shouldn't be too difficult to do; the TX path
is the racy one.

>> During stupidly high levels of RX, you may find the NIC happily
>> receiving frames faster than you can service the RX queue. If this
>> occurs, you could end up just plain being stuck there.
>
> That's the live-lock.

And again you can solve this without having to devolve into polling.

Again, polling to me feels like a bludgeon beating around a system
that isn't really designed for the extreme cases it's facing.
Maybe your work in the tcp_taskqueue branch addresses the larger scale
issues here, but I've solved this relatively easily in the past.

>> So what I've done in the past is to loop over a certain number of
>> frames, then schedule a taskqueue to service whatever's left over.
>
> Taskqueue's shouldn't be used anymore. We've got ithreads now.
> Contrary to popular belief (and due to poor documentation) an
> ithread does not run at interrupt level. Only the fast interrupt
> handler does that. The ithread is a normal kernel thread tied to
> an fast interrupt handler and trailing it whenever it said
> INTR_SCHEDULE_ITHREAD.

Sure, but taskqueues are still useful if you want to serialise access
without relying on mutexes wrapping large parts of the packet handling
code to enforce said order.

Yes, normal ithreads don't run at interrupt level.

And we can change the priority of taskqueues in each driver, right?
And/or we could change the behaviour of driver ithreads/taskqueues to
be automatically reniced?

I'm not knocking your work here, I'm just trying to understand whether
we can do this stuff as small individual pieces of work rather than
one big subsystem overhaul.

And CoDel is interesting as a concept, but it's certainly not new. But
again, if you don't drop the frames during the driver receive path
(and try to do it higher up in the stack, e.g. as part of some firewall
rule) you still risk reaching a stable state where the CPU is 100%
pinned because you've wasted cycles pushing those frames into the
queue only to be dropped.

What _I_ had to do there was have a quick gate to look up if a frame
was part of an active session in ipfw and if it was, let it be queued
to the driver. I also had a second gate in the driver for new TCP
connections, but that was a separate hack. Anything else was dropped.

In any case, what I'm trying to say is this - when I was last doing
this kind of stuff, I didn't just subscribe to "polling will fix all."
I spent a few months knee deep in the public Intel e1000 documentation
and tuning guide, the em driver and the queue/firewall code, in order
to figure out how to attack this without using polling.

And yes, you've also just described NAPI. :-)


Adrian
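
For illustration only, the "service a bounded number of frames, then
defer the rest" idea discussed above, combined with an early drop gate
in the receive path, could be sketched in userspace C roughly as below.
This is not code from em(4), ipfw or any real driver. Every name in it
(struct rx_ring, rx_ring_push, rx_service, RX_BUDGET, the
session_active flag) is hypothetical, and the "deferred" flag merely
stands in for whatever a real driver would do to get the leftover work
run later, such as enqueueing a taskqueue task or rescheduling its
ithread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 256  /* hypothetical ring size */
#define RX_BUDGET 32   /* hypothetical per-pass frame limit */

struct frame {
    /* Stand-in for "this frame belongs to an active session". */
    bool session_active;
};

struct rx_ring {
    struct frame slots[RING_SIZE];
    size_t head;       /* next frame to service */
    size_t tail;       /* next free slot (producer side) */
    size_t delivered;  /* frames passed up the stack */
    size_t dropped;    /* frames dropped at the gate */
    bool deferred;     /* stand-in for "leftover work was scheduled" */
};

/* Producer side: what the NIC would do. Returns false if the ring is full. */
static bool rx_ring_push(struct rx_ring *r, bool session_active)
{
    size_t next = (r->tail + 1) % RING_SIZE;

    if (next == r->head)
        return false;
    r->slots[r->tail].session_active = session_active;
    r->tail = next;
    return true;
}

/*
 * Service at most `budget` frames and return how many were handled.
 * Frames that fail the gate are dropped here, before any further work
 * is spent on them. If frames remain once the budget is used up, mark
 * the ring as deferred instead of looping until it is empty.
 */
static size_t rx_service(struct rx_ring *r, size_t budget)
{
    size_t done = 0;

    assert(budget > 0);
    r->deferred = false;

    while (done < budget && r->head != r->tail) {
        const struct frame *f = &r->slots[r->head];

        if (f->session_active)
            r->delivered++;  /* a real driver would pass the frame up */
        else
            r->dropped++;    /* cheap early drop in the RX path */

        r->head = (r->head + 1) % RING_SIZE;
        done++;
    }

    /* Budget exhausted with work left: hand the rest to a later pass. */
    if (r->head != r->tail)
        r->deferred = true;

    return done;
}
```

The point of the budget is that a single pass can never spin forever,
however fast the producer fills the ring; the caller gets control back
after at most RX_BUDGET frames and decides when the next pass runs.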