From: Fabien Thomas
Date: Tue, 6 Nov 2012 14:16:29 +0100
Subject: Re: polling's future [was: Re: Dynamic Ticks/HZ]
To: Andre Oppermann
Cc: Davide Italiano, Luigi Rizzo, Joe Holden, Ryan Stone, FreeBSD Current

On 6 Nov 2012, at 12:42, Andre Oppermann wrote:

> On 06.11.2012 12:02, Fabien Thomas wrote:
>>>
>>> Hi Luigi,
>>>
>>> do you agree on polling having outlived its usefulness in the light
>>> of interrupt moderating NICs and SMP complications/disadvantages?
>>>
>> If you have only one interface, yes, polling is not really necessary.
>>
>> If you have 10 interfaces, it is hard to find an interrupt moderation
>> threshold that does not saturate the system.
>> Polling at 8000 Hz in that case is a lot better regarding the global interrupt level.
>
> OK.  Is the problem the interrupt load itself, or the taskqueues?

Both. Interrupt load will be higher if you want to keep latency low, and a
taskqueue is just polling without global fairness (if you have 10 interfaces
with 6 cores, this will give you 60 taskqueues). If you poll 16 packets at a
time from each interface, processing is fairer.

>
>> The problem is that in its current state polling does not work well, and people remember
>> the good old times when polling was better.
>
> Indeed.
>
>> rstone@ and I have made some improvements to polling.
>>
>> You can find a diff here for 8.3 with an updated Intel driver:
>> http://people.freebsd.org/~fabient/polling/patch-pollif_8.3_11052012
>>
>> - support multiqueue for ixgbe, igb, em.
>> - compat API for old driver
>> - keep interrupt for link / status
>> - user core mapping / auto mapping
>> - deadline to keep cpu available
>> - integrated to netisr
>> - deferred packet injection with optional prefetching
>
> This is a number of interesting but sometimes only tangentially
> related features.  Let's focus on the network cpu monopolization
> issue first.

This is what the deadline is:

The deadline is the maximum time spent polling over the scheduling period, in percent.
The scheduling period is a fraction of the polling period (100 Hz by default).
Each round is measured to estimate the time of a round (if some packets require
crypto, the load will increase, for example), and processing stops when the
deadline is reached (if no thread wants to run, the deadline is extended).

Hope this makes it clearer.
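To make the deadline accounting above concrete, here is a minimal sketch in C. It is not taken from the patch: the function names are hypothetical, the reading of sched_div as "scheduling periods per polling period" is an assumption, and so is the choice to extend the budget to the whole scheduling period when no other thread wants to run. It also stops before a round that is estimated not to fit, which is one possible way to use the measured round time.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Length of one scheduling period in nanoseconds.
 * Assumption: the polling period (poll_hz, 100 by default) is split
 * into sched_div scheduling periods.
 */
static uint64_t
sched_period_ns(unsigned poll_hz, unsigned sched_div)
{
	assert(poll_hz > 0 && sched_div > 0);
	return (1000000000ULL / ((uint64_t)poll_hz * sched_div));
}

/*
 * Count how many polling rounds fit in one scheduling period.
 *
 * period_ns            length of the scheduling period
 * deadline_pct         share of the period polling may use (e.g. 80)
 * round_cost_ns        measured cost of the previous round, used as
 *                      the estimate for the next one
 * other_thread_waiting non-zero if some other thread wants the CPU
 */
static unsigned
rounds_in_period(uint64_t period_ns, unsigned deadline_pct,
    uint64_t round_cost_ns, int other_thread_waiting)
{
	uint64_t limit_ns = period_ns * deadline_pct / 100;
	uint64_t used_ns = 0;
	unsigned rounds = 0;

	/* Assumption: with nobody else runnable, use the whole period. */
	if (!other_thread_waiting)
		limit_ns = period_ns;

	/* A round would poll a small batch (e.g. 16 packets) per interface. */
	while (round_cost_ns > 0 && used_ns + round_cost_ns <= limit_ns) {
		used_ns += round_cost_ns;
		rounds++;
	}
	return (rounds);
}
```

With poll_hz = 100 and sched_div = 80, sched_period_ns() gives 125000 ns, so a deadline of 80 leaves a 100000 ns budget. A round measured at 30000 ns then fits three times while another thread is waiting, and four times when the CPU is otherwise idle.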
Sample:

~$ sysctl kern.pollif
kern.pollif.map:
kern.pollif.stats_clear: 0
kern.pollif.stats:
Work queue 0:
    CPU load = 0 %
    pass = 80
    run overflow = 0
    Interface ix1.0
        resched rx = 0
    Interface ix0.0
        resched rx = 0
Work queue 1:
    CPU load = 0 %
    pass = 80
    run overflow = 0
    Interface ix1.1
        resched rx = 0
    Interface ix0.1
        resched rx = 0
Work queue 2:
    CPU load = 0 %
    pass = 80
    run overflow = 0
    Interface ix1.2
        resched rx = 0
    Interface ix0.2
        resched rx = 0
Work queue 3:
    CPU load = 0 %
    pass = 80
    run overflow = 0
    Interface ix1.3
        resched rx = 0
    Interface ix0.3
        resched rx = 0
kern.pollif.deadline: 80
kern.pollif.register_check: 10
kern.pollif.sched_div: 80
kern.pollif.packet_per_round: 16
kern.pollif.handlers: 8

>
>> Performance is on par with interrupts, but you can keep a system alive more easily
>> by accounting all network processing against the deadline (with direct dispatch).
>
> Would you be willing to work a solution with me with a load aware
> taskqueue as I proposed in a recent email to Luigi?  That way we
> don't need special cases or features or even a normal server under
> DDoS wouldn't go down.

The main problem I have with the current version is that it consumes a little
CPU when idle (99.8% idle with top, < 0.5% with PMC using CPU_CLK_UNHALTED.THREAD_P).
Kickstarting the polling with an interrupt is a good idea to reduce that, but
I have never tested it, so why not.

>
> --
> Andre
>