Date: Fri, 31 Aug 2012 12:19:19 +0700
From: Eugene Grosbein <egrosbein@rdtc.ru>
To: pyunyh@gmail.com
Cc: freebsd-net@freebsd.org, Lev Serebryakov <lev@freebsd.org>, Ian Smith <smithi@nimnet.asn.au>
Subject: Re: Bad routing performance on 500Mhz Geode LX with CURRENT, ipfw and mpd5
Message-ID: <50404957.302@rdtc.ru>
In-Reply-To: <20120831180721.GB3208@michelle.cdnetworks.com>
References: <1865271844.20120829131610@serebryakov.spb.ru>
 <CAHu1Y70MynCMQTrJUMwTZ0%2BLrM1JiZFt_B77028XHfoiRgzmaA@mail.gmail.com>
 <1807373989.20120829223125@serebryakov.spb.ru>
 <20120830152726.A33776@sola.nimnet.asn.au>
 <534292400.20120830131158@serebryakov.spb.ru>
 <20120831180721.GB3208@michelle.cdnetworks.com>
01.09.2012 01:07, YongHyeon PYUN wrote:

> It would be interesting to know whether there is any difference
> before/after taskq change made in r235334. I was told that taskq
> conversion for vr(4) resulted in better performance but I think
> taskq shall add more burden on slow hardware.
> Pre-r235334 interrupt handler still has issues since it wouldn't
> exit interrupt handler if there are any pending interrupts.
> It shall consume most of its CPU cycles in the interrupt handler
> under extreme network load. If pre-r235334 shows better result,
> you are probably able to implement interrupt mitigation by using
> VT6102/VT6105's timer interrupt. I guess some frames would be lost
> with the interrupt mitigation under high network load but other
> part of kernel would have more chance to run important tasks.
> Anyway, vr(4) controllers wouldn't be one of best choice for slow
> machines due to DMA alignment limitation and driver assisted
> padding requirement.

I also have an AMD Geode LX8-based system with two on-board vr(4)
interfaces. I've just tried it with the vr(4) driver from HEAD built
as a module for my 8.3-STABLE/i386.
It builds just fine with a minor change:

--- if_vr.c.orig	2012-08-29 23:36:05.000000000 +0700
+++ if_vr.c	2012-08-29 22:51:01.000000000 +0700
@@ -2176,7 +2176,7 @@
 	VR_LOCK(sc);
 	mii = device_get_softc(sc->vr_miibus);
 	LIST_FOREACH(miisc, &mii->mii_phys, mii_list)
-		PHY_RESET(miisc);
+		mii_phy_reset(miisc);
 	sc->vr_flags &= ~(VR_F_LINK | VR_F_TXPAUSE);
 	error = mii_mediachg(mii);
 	VR_UNLOCK(sc);

dmesg says:

vr0: <VIA VT6105 Rhine III 10/100BaseTX> port 0xe000-0xe0ff mem 0xef024000-0xef0240ff irq 10 at device 12.0 on pci0
vr0: Quirks: 0x0
vr0: Revision: 0x86
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> PHY 1 on miibus0
ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr0: Ethernet address: 00:10:f3:13:72:c6
vr0: [ITHREAD]
vr1: <VIA VT6105 Rhine III 10/100BaseTX> port 0xe400-0xe4ff mem 0xef025000-0xef0250ff irq 11 at device 13.0 on pci0
vr1: Quirks: 0x0
vr1: Revision: 0x86
miibus1: <MII bus> on vr1
ukphy1: <Generic IEEE 802.3u media interface> PHY 1 on miibus1
ukphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr1: Ethernet address: 00:10:f3:13:72:c7
vr1: [ITHREAD]

This is an industrial Nexcom NICE-3120-LX8 fanless PC used as a home
router; its only miniPCI expansion slot is occupied by the ath0 WiFi card.

http://www.orbitmicro.com/global/system-4423.html

I have to say that the HEAD driver runs MUCH worse. With the stock 8.3
driver I get the same 3.35 MByte/s for a single-thread HTTP transfer
through this system, but the load average is only 1.7 and userland is
pretty responsive.
top(1) shows:

last pid: 29696;  load averages:  1.70,  1.08,  0.88    up 2+00:11:31  22:21:46
94 processes:  2 running, 78 sleeping, 14 waiting
CPU:  7.7% user,  0.0% nice,  0.0% system, 15.4% interrupt, 76.9% idle
Mem: 51M Active, 671M Inact, 188M Wired, 18M Cache, 110M Buf, 60M Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   11 root     -68    -     0K   112K WAIT   235:11 51.56% intr{irq11: vr1}
   10 root     171 ki31     0K     8K RUN     24.4H 31.15% idle
   11 root     -44    -     0K   112K WAIT     0:51  9.38% intr{swi1: netisr 0}
   11 root     -68    -     0K   112K WAIT     0:30  6.40% intr{irq10: vr0}
29688 root      44    0  3628K  1708K RUN      0:00  0.10% top

With the HEAD driver, for the same test, the load average spikes to 8
and higher, and it takes up to 10 seconds for userland applications like
the shell or screen(1) to respond to physical console events:

last pid:  1335;  load averages:  8.27,  4.05,  2.04    up 0+00:14:21  23:31:18
97 processes:  2 running, 83 sleeping, 12 waiting
CPU:  0.1% user,  0.0% nice, 55.7% system, 43.6% interrupt,  0.6% idle
Mem: 40M Active, 21M Inact, 175M Wired, 2512K Cache, 109M Buf, 749M Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   12 root     -16    -     0K     8K sleep    1:12 44.87% ng_queue
   11 root     -28    -     0K    96K WAIT     1:45 35.60% intr{swi5: +}
   11 root     -44    -     0K    96K WAIT     1:03 18.80% intr{swi1: netisr 0}
   10 root     171 ki31     0K     8K RUN      6:34  0.39% idle
   13 root     -16    -     0K     8K -        0:07  0.10% yarrow

That's with direct NETISR mode; indirect mode only makes it worse (the
load average is higher for both drivers, up to 4.5 for the old one and
up to 9+ for the new one).

I ran the tests with the same custom kernel, loading/unloading the
old/new drivers as modules without a reboot. The scheduler is the
default SCHED_ULE.

Another note: I run mpd/PPPoE/ng0 over vr1, and the HTTP transfer went
through ng0.

Eugene Grosbein