Date:      Fri, 27 Jul 2012 08:35:14 +0200
From:      Luigi Rizzo <rizzo@iet.unipi.it>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        current@freebsd.org
Subject:   Re: RFC: use EM_LEGACY_IRQ in if_lem.c ?
Message-ID:  <20120727063514.GA49988@onelab2.iet.unipi.it>
In-Reply-To: <CAJ-VmokmjVnFKr-MXEB55p7qDPAro6AUVHuDL5UGifUZ-W8Yfw@mail.gmail.com>
References:  <20120724202019.GA22927@onelab2.iet.unipi.it> <CAJ-VmokG-+kjaOC2g2uvVX5z4eBtry_-L8nMFaOPBan9SSzyYQ@mail.gmail.com> <20120725151403.GA33640@onelab2.iet.unipi.it> <CAJ-VmokmjVnFKr-MXEB55p7qDPAro6AUVHuDL5UGifUZ-W8Yfw@mail.gmail.com>

On Thu, Jul 26, 2012 at 10:52:08PM -0700, Adrian Chadd wrote:
> On 25 July 2012 08:14, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
> 
> >> I suggest doing some digging to understand why. I bet we all know the
> >> answer, but it would be nice to have it documented and investigated. I
> >> bet em(4) isn't the only device that would benefit from this?
> >
> > I am not so sure I know the answer on bare iron (and my take is that the
> > difference is more or less irrelevant there), but in the virtualized case
> > the improvement is almost surely because the code used in FAST_INTR
> > has a couple of MMIO accesses to disable/enable interrupts on the
> > card while the taskqueue runs.  These are expensive in a VM
> > (such accesses cost ~10K cycles each, even with hw support)
> 
> Hm, really? Doing these register accesses to a virtualised em NIC in a
> VM is that expensive, or is there something else going on I don't
> understand?

Access to emulated registers is expensive because the guest must
exit from the hardware VM before the emulation code can run, and
the exit is incredibly expensive as it needs to save and restore
a ton of state. Things are marginally better in qemu, but there
everything is more expensive, so you still see a measurable
performance gain if you avoid the accesses.
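
To make the cost concrete, here is a minimal sketch of the
FAST_INTR + taskqueue pattern being discussed (illustrative only,
not the actual if_lem.c code; the foo_ softc, register names and
masks are made up). The two MMIO writes marked below are the
accesses that each cost ~10K cycles in a VM:

	#include <sys/param.h>
	#include <sys/systm.h>
	#include <sys/bus.h>
	#include <sys/taskqueue.h>

	static int
	foo_irq_fast(void *arg)
	{
		struct foo_softc *sc = arg;

		/* MMIO write #1: mask chip interrupts -> VM exit. */
		FOO_WRITE_REG(sc, FOO_IMC, 0xffffffff);

		/* Defer the real work; the filter itself stays tiny. */
		taskqueue_enqueue(sc->sc_tq, &sc->sc_rxtx_task);
		return (FILTER_HANDLED);
	}

	static void
	foo_handle_rxtx(void *context, int pending)
	{
		struct foo_softc *sc = context;

		foo_rxeof(sc);		/* process received packets */
		foo_txeof(sc);		/* reclaim transmitted descriptors */

		/* MMIO write #2: unmask chip interrupts -> VM exit. */
		FOO_WRITE_REG(sc, FOO_IMS, FOO_IMS_ENABLE_MASK);
	}

As I understand it, with EM_LEGACY_IRQ the driver registers a
plain ithread handler instead, so neither per-interrupt write is
needed: the interrupt stays masked at the interrupt-controller
level while the handler runs.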

There is a recent USENIX paper/talk that briefly discusses
the problem:

https://www.usenix.org/conference/usenixfederatedconferencesweek/software-techniques-avoiding-hardware-virtualization-exits

The ~10k cycles per access were measured in kvm on recent i5 and i7 processors.

For qemu, the change moves the packet rate from 7.5 to 8.3 Kpps;
if you do the math, the difference is about 13us per packet,
so we are in a similar ballpark.
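
For the record, spelling out the arithmetic (same numbers as
above; the 3 GHz clock below is my assumption, not a measurement):

	1/7500 pkts/s = 133.3 us/pkt
	1/8300 pkts/s = 120.5 us/pkt
	difference    =  12.8 us/pkt  (~13 us)

Two register accesses at ~10K cycles each are ~20K cycles, i.e.
roughly 7us at 3 GHz, so the same order of magnitude.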
(The numbers are low because the default emulation does not
support interrupt mitigation. I have patches for that, see the
qemu-devel mailing list:

http://lists.nongnu.org/archive/html/qemu-devel/2012-07/msg03195.html

which bring the rate up to some 50 Kpps.)


cheers
luigi


