Date:      Sun, 19 Apr 2020 17:49:28 +0200
From:      Harry Schmalzbauer <freebsd@omnilan.de>
To:        freebsd-virtualization@freebsd.org
Subject:   Re: bhyve win-guest benchmark comparing
Message-ID:  <f5a78199-9306-bccc-606a-23c30f56b0f1@omnilan.de>
In-Reply-To: <9e7f4c01-6cd1-4045-1a5b-69c804b3881b@omnilan.de>
References:  <9e7f4c01-6cd1-4045-1a5b-69c804b3881b@omnilan.de>

On 22.10.2018 at 13:26, Harry Schmalzbauer wrote:
…
>
> Test-Runs:
> Each hypervisor had only the single benchmark guest running; no other
> tasks/guests were active besides the system's native standard processes.
> Since the time between powering up the guest and finishing logon
> differed notably between the two hosts (~5s vs. ~20s), I ran a quick
> synthetic I/O test beforehand.
> I'm using IOmeter, since heise.de published a great test pattern called
> IOmix, about 18 years ago I guess.  This access pattern has always
> perfectly reflected system performance for human computer usage with
> non-calculation-centric applications, and it is still my favourite,
> even though throughput and latency have changed by some orders of
> magnitude during the last decade (I have also defined something for
> "fio" which mimics IOmix and shows reasonable relational results, but
> I still prefer IOmeter for homogeneous I/O benchmarking).
>
> The result is about a factor of 7 :-(
> ~3800 iops & 69 MB/s on bhyve (guest CPU usage: 42% IOmeter + 12% irq)
>                 vs.
> ~29000 iops & 530 MB/s on ESXi (guest CPU usage: 11% IOmeter + 19% irq)
>
>
>     [With the debug kernel and debug malloc, the numbers are
>      3000 iops & 56 MB/s; virtio-blk instead of ahci,hd: yields
>      5660 iops & 104 MB/s with the non-debug kernel: much better,
>      but with even higher CPU load and still a factor of 4 slower.]
>
> What I don't understand is why the IOmeter process differs that much
> in CPU utilization!?!  It's the same binary on the same (guest) OS
> with the same OS driver and the same underlying hardware; "just" the
> AHCI emulation and the vmm differ...
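
Regarding the fio definition I mentioned above: what I use is roughly
along the following lines.  This is a sketch from memory; the block-size
mix, read/write ratio, queue depth and file path are approximations and
placeholders, not the original heise.de IOmix definition:

    fio --name=iomix-like --filename=/tank/fio.test --size=4g \
        --ioengine=posixaio --direct=1 --rw=randrw --rwmixread=70 \
        --bssplit=4k/40:8k/25:16k/20:64k/15 --iodepth=8 \
        --time_based --runtime=120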

I repeated this test with a slightly different device backend (Samsung
850pro SSD on mps(4) instead of mfid(4)).
After applying r358848 to stable/12, the numbers changed dramatically on
the same Haswell-based Xeon E3 platform.

With the single SSD, the IOmeter numbers for ESXi as host drop from
~29000 iops & 530 MB/s to ~11000 iops & 205 MB/s.  But the numbers for
bhyve as host rise from ~3800 iops & 69 MB/s to ~8800 iops & 160 MB/s
at the same time!!!
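
Just to make the ratio behind the "~20%" below explicit (my arithmetic,
using the iops and MB/s figures above):

    $ echo "scale=2; 8800/11000; 160/205" | bc
    .80
    .78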

So there's still a penalty of ~20% for ahci-bhyve vs. ahci-esx, but this
is an enormous improvement.
Please don't skip the MFC for r358848!
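
For completeness, in case somebody wants to redo the ahci vs. virtio-blk
comparison from the quoted mail, the guest was started roughly like
this.  Slot numbers, the zvol path, CPU/memory sizes and the UEFI
firmware path are placeholders, not my exact configuration:

    bhyve -c 2 -m 4G -H -w \
        -s 0,hostbridge \
        -s 3,ahci,hd:/dev/zvol/tank/winguest \
        -s 10,virtio-net,tap0 \
        -s 31,lpc \
        -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
        winguest

    # virtio-blk variant: replace the ahci line with
    #   -s 3,virtio-blk,/dev/zvol/tank/winguest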

Thanks a lot for all the work!

-harry
