Date: Fri, 27 Jul 2012 08:35:14 +0200
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: Adrian Chadd <adrian@freebsd.org>
Cc: current@freebsd.org
Subject: Re: RFC: use EM_LEGACY_IRQ in if_lem.c ?
Message-ID: <20120727063514.GA49988@onelab2.iet.unipi.it>
In-Reply-To: <CAJ-VmokmjVnFKr-MXEB55p7qDPAro6AUVHuDL5UGifUZ-W8Yfw@mail.gmail.com>
References: <20120724202019.GA22927@onelab2.iet.unipi.it> <CAJ-VmokG-%2BkjaOC2g2uvVX5z4eBtry_-L8nMFaOPBan9SSzyYQ@mail.gmail.com> <20120725151403.GA33640@onelab2.iet.unipi.it> <CAJ-VmokmjVnFKr-MXEB55p7qDPAro6AUVHuDL5UGifUZ-W8Yfw@mail.gmail.com>

On Thu, Jul 26, 2012 at 10:52:08PM -0700, Adrian Chadd wrote:
> On 25 July 2012 08:14, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>
> >> I suggest doing some digging to understand why. I bet we all know the
> >> answer, but it would be nice to have it documented and investigated. I
> >> bet em(4) isn't the only device that would benefit from this?
> >
> > I am not so sure i know the answer on bare iron (and my take is that the
> > difference is more or less irrelevant there), but in the virtualized case
> > the improvement is almost surely because the code used in FAST_INTR
> > has a couple of MMIO accesses to disable/enable interrupts on the
> > card while the taskqueue runs. These are expensive in a VM
> > (such accesses cost ~10K cycles each, even with hw support)
>
> Hm, really? Doing these register accesses to a virtualised em NIC in a
> VM is that expensive, or is there something else going on I don't
> understand?

Access to emulated registers is expensive because you need to exit
from the hardware VM before you can run the emulation code, and the
exit is incredibly expensive as it needs to save a ton of state.
Things are marginally better in qemu, but there everything is more
expensive, so you still see a measurable performance gain if you
save the accesses.

There is a recent paper/talk at usenix that briefly discusses the problem:

https://www.usenix.org/conference/usenixfederatedconferencesweek/software-techniques-avoiding-hardware-virtualization-exits

The ~10K cycles per access were measured in kvm on recent i5 and i7
processors. For qemu, the change moves the packet rate from 7.5 to
8.3 Kpps; if you do the math, the difference is about 13us per packet,
so we are in a similar ballpark.

(The numbers are low because the default emulation does not support
interrupt mitigation. I have patches for that, see the qemu-devel
mailing list

http://lists.nongnu.org/archive/html/qemu-devel/2012-07/msg03195.html

which bring the rate up to some 50 Kpps.)

cheers
luigi
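
For reference, the "about 13us per packet" figure above can be reproduced from the two packet rates quoted in the message. The sketch below is only an illustration of that arithmetic: the function name is made up here, and the 3.0 GHz clock used to turn the saving into cycles is an assumption, not a number from the message.

```python
def per_packet_saving_us(rate_before_pps, rate_after_pps):
    """Per-packet time saved, in microseconds, when the packet rate rises.

    Each packet costs 1/rate seconds, so the saving is the difference
    between the two per-packet costs.
    """
    return (1.0 / rate_before_pps - 1.0 / rate_after_pps) * 1e6


# Rates quoted in the message for qemu: 7.5 Kpps before, 8.3 Kpps after.
saving_us = per_packet_saving_us(7500.0, 8300.0)

# ASSUMPTION: a 3.0 GHz clock, chosen only to express the saving in cycles.
ASSUMED_CLOCK_GHZ = 3.0
saving_cycles = saving_us * ASSUMED_CLOCK_GHZ * 1000.0  # us * cycles per us

print(f"saving per packet: {saving_us:.2f} us")
print(f"roughly {saving_cycles:.0f} cycles at {ASSUMED_CLOCK_GHZ} GHz (assumed)")
```

The per-packet saving works out to a little under 13us. Under the assumed clock that is a few tens of thousands of cycles, the same order of magnitude as a couple of register accesses at ~10K cycles each.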
