From: Eugene Grosbein <egrosbein@rdtc.ru>
Date: Fri, 31 Aug 2012 12:19:19 +0700
To: pyunyh@gmail.com
Cc: freebsd-net@freebsd.org, Lev Serebryakov, Ian Smith
Subject: Re: Bad routing performance on 500Mhz Geode LX with CURRENT, ipfw and mpd5
Message-ID: <50404957.302@rdtc.ru>
In-Reply-To: <20120831180721.GB3208@michelle.cdnetworks.com>
References: <1865271844.20120829131610@serebryakov.spb.ru>
 <1807373989.20120829223125@serebryakov.spb.ru>
 <20120830152726.A33776@sola.nimnet.asn.au>
 <534292400.20120830131158@serebryakov.spb.ru>
 <20120831180721.GB3208@michelle.cdnetworks.com>

On 01.09.2012 01:07, YongHyeon PYUN wrote:

> It would be interesting to know whether there is any difference
> before/after the taskq change made in r235334. I was told that the
> taskq conversion for vr(4) resulted in better performance, but I
> think a taskq adds more burden on slow hardware.
> The pre-r235334 interrupt handler still has issues, since it won't
> exit the interrupt handler while any interrupts are still pending.
> It would consume most of its CPU cycles in the interrupt handler
> under extreme network load. If pre-r235334 shows better results,
> you could probably implement interrupt mitigation by using the
> VT6102/VT6105's timer interrupt. I guess some frames would be lost
> with interrupt mitigation under high network load, but other parts
> of the kernel would have more chance to run important tasks.
> Anyway, vr(4) controllers are not one of the best choices for slow
> machines due to their DMA alignment limitation and driver-assisted
> padding requirement.
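Just to restate for the list what that handler style means in code: a
pre-r235334 style handler keeps re-reading the interrupt status register
and loops for as long as anything is pending, roughly like the sketch
below. All xx_*/XX_*/CSR_* names are made up for illustration; this is
only the general pattern, not the actual if_vr.c code.

static void
xx_intr(void *arg)
{
	struct xx_softc *sc = arg;	/* hypothetical per-device softc */
	uint16_t status;

	XX_LOCK(sc);
	/* Keep going for as long as the chip reports pending events. */
	while ((status = CSR_READ_2(sc, XX_ISR)) != 0) {
		CSR_WRITE_2(sc, XX_ISR, status);	/* acknowledge them */
		if (status & XX_ISR_RX_OK)
			xx_rxeof(sc);		/* drain the RX ring */
		if (status & XX_ISR_TX_OK)
			xx_txeof(sc);		/* reclaim TX descriptors */
	}
	/*
	 * There is no upper bound on the loop above: while frames keep
	 * arriving, a single slow CPU never leaves the ithread.
	 */
	XX_UNLOCK(sc);
}

With a handler like this, at least all of the CPU time is charged to the
NIC's interrupt thread, which matches the first top(1) snapshot further
down.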
I also have an AMD Geode LX8-based system with two on-board vr(4)
interfaces. I've just tried it with the vr(4) driver from HEAD, built as
a module for my 8.3-STABLE/i386. It builds just fine with one minor
change:

--- if_vr.c.orig	2012-08-29 23:36:05.000000000 +0700
+++ if_vr.c	2012-08-29 22:51:01.000000000 +0700
@@ -2176,7 +2176,7 @@
 	VR_LOCK(sc);
 	mii = device_get_softc(sc->vr_miibus);
 	LIST_FOREACH(miisc, &mii->mii_phys, mii_list)
-		PHY_RESET(miisc);
+		mii_phy_reset(miisc);
 	sc->vr_flags &= ~(VR_F_LINK | VR_F_TXPAUSE);
 	error = mii_mediachg(mii);
 	VR_UNLOCK(sc);

dmesg says:

vr0: port 0xe000-0xe0ff mem 0xef024000-0xef0240ff irq 10 at device 12.0 on pci0
vr0: Quirks: 0x0
vr0: Revision: 0x86
miibus0: on vr0
ukphy0: PHY 1 on miibus0
ukphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr0: Ethernet address: 00:10:f3:13:72:c6
vr0: [ITHREAD]
vr1: port 0xe400-0xe4ff mem 0xef025000-0xef0250ff irq 11 at device 13.0 on pci0
vr1: Quirks: 0x0
vr1: Revision: 0x86
miibus1: on vr1
ukphy1: PHY 1 on miibus1
ukphy1: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr1: Ethernet address: 00:10:f3:13:72:c7
vr1: [ITHREAD]

This is an industrial Nexcom NICE-3120-LX8 fanless PC used as a home
router; its only miniPCI expansion slot is occupied by an ath0 WiFi card:
http://www.orbitmicro.com/global/system-4423.html

I have to say that the HEAD driver runs MUCH worse. With the stock 8.3
driver I get the same 3.35 MByte/s single-stream http transfer through
this system, but LA is only 1.7 and userland is pretty responsive.
top(1) shows:

last pid: 29696;  load averages:  1.70,  1.08,  0.88   up 2+00:11:31  22:21:46
94 processes:  2 running, 78 sleeping, 14 waiting
CPU:  7.7% user,  0.0% nice,  0.0% system, 15.4% interrupt, 76.9% idle
Mem: 51M Active, 671M Inact, 188M Wired, 18M Cache, 110M Buf, 60M Free
Swap:

  PID USERNAME  PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
   11 root      -68    -     0K   112K WAIT  235:11  51.56% intr{irq11: vr1}
   10 root      171 ki31     0K     8K RUN    24.4H  31.15% idle
   11 root      -44    -     0K   112K WAIT    0:51   9.38% intr{swi1: netisr 0}
   11 root      -68    -     0K   112K WAIT    0:30   6.40% intr{irq10: vr0}
29688 root       44    0  3628K  1708K RUN     0:00   0.10% top

With the HEAD driver, for the same test, LA peaks at 8 and higher and it
takes up to 10 seconds for userland applications like the shell or
screen(1) to respond to physical console events:

last pid:  1335;  load averages:  8.27,  4.05,  2.04   up 0+00:14:21  23:31:18
97 processes:  2 running, 83 sleeping, 12 waiting
CPU:  0.1% user,  0.0% nice, 55.7% system, 43.6% interrupt,  0.6% idle
Mem: 40M Active, 21M Inact, 175M Wired, 2512K Cache, 109M Buf, 749M Free
Swap:

  PID USERNAME  PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
   12 root      -16    -     0K     8K sleep   1:12  44.87% ng_queue
   11 root      -28    -     0K    96K WAIT    1:45  35.60% intr{swi5: +}
   11 root      -44    -     0K    96K WAIT    1:03  18.80% intr{swi1: netisr 0}
   10 root      171 ki31     0K     8K RUN     6:34   0.39% idle
   13 root      -16    -     0K     8K -       0:07   0.10% yarrow

That's with direct NETISR mode; indirect mode only makes things worse
(LA goes higher for both drivers, up to 4.5 for the old one and up to 9+
for the new one). I ran the tests with the same custom kernel,
loading/unloading the old and new drivers as modules without rebooting.
The scheduler is the default SCHED_ULE.

Another note: I run mpd/PPPoE/ng0 over vr1, and the http transfers went
through ng0.

Eugene Grosbein
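P.S. For list readers not following HEAD: an r235334-style conversion
moves the per-interrupt work out of the interrupt handler into a
taskqueue, roughly along the lines of the sketch below. As before, the
xx_*/XX_* names are made up for illustration; this is only the general
pattern, not the actual if_vr.c code. Every interrupt now costs an extra
thread wakeup and context switch, which would explain at least part of
the extra system time this 500MHz Geode shows in the second top(1)
snapshot.

#include <sys/param.h>
#include <sys/bus.h>
#include <sys/taskqueue.h>

/*
 * Fast interrupt filter: only mask the chip and schedule a task.
 * The real RX/TX processing runs later in a taskqueue thread.
 */
static int
xx_intr_filter(void *arg)
{
	struct xx_softc *sc = arg;

	CSR_WRITE_2(sc, XX_IMR, 0);		/* mask chip interrupts */
	taskqueue_enqueue(sc->xx_tq, &sc->xx_inttask);
	return (FILTER_HANDLED);
}

static void
xx_int_task(void *arg, int pending __unused)
{
	struct xx_softc *sc = arg;
	uint16_t status;

	XX_LOCK(sc);
	status = CSR_READ_2(sc, XX_ISR);
	CSR_WRITE_2(sc, XX_ISR, status);	/* acknowledge events */
	if (status & XX_ISR_RX_OK)
		xx_rxeof(sc);			/* drain the RX ring */
	if (status & XX_ISR_TX_OK)
		xx_txeof(sc);			/* reclaim TX descriptors */
	CSR_WRITE_2(sc, XX_IMR, XX_INTRS);	/* unmask again */
	XX_UNLOCK(sc);
}

/*
 * Attach-time wiring (error handling omitted):
 *
 *	TASK_INIT(&sc->xx_inttask, 0, xx_int_task, sc);
 *	sc->xx_tq = taskqueue_create_fast("xx_taskq", M_WAITOK,
 *	    taskqueue_thread_enqueue, &sc->xx_tq);
 *	taskqueue_start_threads(&sc->xx_tq, 1, PI_NET, "%s taskq",
 *	    device_get_nameunit(dev));
 *	bus_setup_intr(dev, sc->xx_irq_res, INTR_TYPE_NET | INTR_MPSAFE,
 *	    xx_intr_filter, NULL, sc, &sc->xx_intrhand);
 */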