From nobody Wed Feb 28 19:31:06 2024
From: Vitaliy Gusev <gusev.vitaliy@gmail.com>
Message-Id: <1DAEB435-A613-4A04-B63F-D7AF7A0B7C0A@gmail.com>
Subject: Re: bhyve disk performance issue
Date: Wed, 28 Feb 2024 22:31:06 +0300
List-Archive: https://lists.freebsd.org/archives/freebsd-virtualization
In-Reply-To: <25ddf43d-f700-4cb5-af2a-1fe669d1e24b@shrew.net>
Cc: virtualization@freebsd.org
To: Matthew Grooms <mgrooms@shrew.net>

Hi, Matthew.

I still do not know what command line was used for bhyve; I couldn't find it in the thread, sorry. I also couldn't find the virtual disk size that you used.

Could you please simplify the bonnie++ output (it is hard to decode due to the alignment) and give exact numbers for:

READ seq  - I see you had 1.6 GB/s during the good periods and ~500 MB/s at the worst.
WRITE seq - ...

If you have slow results for both the read and write operations, you should probably test only READs first and not do anything else until the READs are fine.

Again, if you see slow performance for an ext4 filesystem in the guest VM placed on the passed disk image, you should try to test against the raw disk image, i.e. without ext4, because the filesystem could be a factor.
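
For instance, a read-only fio pass straight against the guest's disk device takes ext4 out of the picture entirely. A minimal sketch, assuming the guest sees the disk as /dev/vda:

fio --name=rawread --filename=/dev/vda --readonly --rw=read --bs=1M \
    --direct=1 --ioengine=libaio --iodepth=16 --runtime=60 --time_based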

If you run the test inside the VM on a filesystem, you may have to deal with filesystem bottlenecks, bugs, fragmentation, etc. Do you want to fix them all? I don't think so.

For example, if you pass a 40G disk image and create an ext4 filesystem on it, and during testing the filesystem becomes more than 80% full, I/O may not perform so well.
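
A quick way to check for that condition between runs, assuming the test filesystem is mounted at /mnt/test:

df -h /mnt/test          # fill level of the test filesystem
e4defrag -c /mnt/test    # report current ext4 fragmentation (from e2fsprogs)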

You should probably rule out that kind of guest filesystem behaviour when you hit the IO performance slowdown.

Also, please look at TRIM operations when you perform WRITE testing. They could also be related to the slow write I/O.
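
Inside a Linux guest, something along these lines shows whether discard/TRIM is being offered and lets you trigger it by hand (device name and mount point are assumptions):

lsblk --discard /dev/vda    # non-zero DISC-GRAN/DISC-MAX means TRIM is supported
fstrim -v /mnt/test         # issue TRIM for the mounted filesystem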

--
Vitaliy

On 28 Feb 2024, at 21:29, Matthew Grooms <mgrooms@shrew.net> wrote:

On 2/27/24 04:21, Vitaliy Gusev wrote:
Hi,


On 23 Feb 2024, at 18:37, Matthew Grooms <mgrooms@shrew.net> wrote:

...
The problem occurs when an image file is used on either ZFS or UFS. The problem also occurs when the virtual disk is backed by a raw disk partition or a ZVOL. This issue isn't related to a specific underlying filesystem.


Do I understand correctly that you ran the tests inside the guest VM, on an ext4 filesystem? If so, you should be aware of the additional overhead compared to when you were running the tests on the host.

Hi Vitaliy,

I appreciate you providing the feedback and suggestions. I spent over a week trying as many combinations of host and guest options as possible to narrow this issue down to a specific host storage or guest device model option. Unfortunately, the problem occurred with every combination I tested while running Linux as the guest. Note that I only tested RHEL8 & RHEL9 compatible distributions ( Alma & Rocky ). The problem did not occur when I ran FreeBSD as the guest. The problem did not occur when I ran KVM on the host and Linux as the guest.

I would suggest running fio (or even dd) on the raw disk device inside the VM, i.e. without a filesystem at all. Just do not forget to do "echo 3 > /proc/sys/vm/drop_caches" in the Linux guest VM before you run the tests.
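
A minimal sketch of that kind of raw-device check, assuming the guest sees the disk as /dev/vda:

sync
echo 3 > /proc/sys/vm/drop_caches
dd if=/dev/vda of=/dev/null bs=1M count=16384 iflag=direct status=progress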

The two servers I was using to test with are no longer available. However, I'll have two more identical servers arriving in the next week or so. I'll try to run additional tests and report back here. I used bonnie++ because it was easily installed from the package repos on all the systems I tested.


Could you also give more information about:

 1. What results did you get (decoded bonnie++ output)?

If you look back at this email thread, there are many examples of running bonnie++ on the guest. I first ran the tests on the host system using Linux + ext4 and FreeBSD 14 + UFS & ZFS to get a baseline of performance. Then I ran bonnie++ tests using bhyve as the hypervisor and Linux & FreeBSD as the guest. The combination of host and guest storage options included ...

1) block device + virtio blk
2) block device + nvme
3) UFS disk image + virtio blk
4) UFS disk image + nvme
5) ZFS disk image + virtio blk
6) ZFS disk image + nvme
7) ZVOL + virtio blk
8) ZVOL + nvme
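
For reference, those guest device models correspond to bhyve slot options roughly like this; the slot numbers and backing paths are assumptions, since the actual bhyve command line was not posted in the thread:

-s 3,virtio-blk,/dev/da0p1        # block device + virtio blk
-s 3,nvme,/dev/da0p1              # block device + nvme
-s 3,virtio-blk,/vms/disk0.img    # file-backed image + virtio blk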

In every instance, I observed that the Linux guest disk IO often performed very well for some time after the guest was first booted. Then the performance of the guest would drop to a fraction of the original performance. The benchmark test was run every 5 or 10 minutes in a cron job. Sometimes the guest would perform well for up to an hour before performance dropped off. Most of the time it would only perform well for a few cycles ( 10 - 30 mins ) before the drop occurred. The only way to restore the performance was to reboot the guest. Once I determined that the problem was not specific to a particular host or guest storage option, I switched my testing to use only a block device as backing storage on the host, to avoid hitting any system disk caches.

Here is the test script I used in the cron job ...

#!/bin/sh
# Append a timestamped run of bonnie++ (default options) to output.txt
FNAME='output.txt'

echo ================================================================================ >> $FNAME
echo Begin @ `/usr/bin/date` >> $FNAME
echo >> $FNAME
# grep strips the bonnie++ progress lines and the trailing CSV summary line
/usr/sbin/bonnie++ 2>&1 | /usr/bin/grep -v 'done\|,' >> $FNAME
echo >> $FNAME
echo End @ `/usr/bin/date` >> $FNAME
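
A crontab entry along these lines would reproduce the 10-minute cycle described above (the script path is an assumption):

*/10 * * * * /root/disk-bench.sh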


As you can see, I'm calling bonnie++ with the system defaults. That uses a data set size that's 2x the guest RAM in an attempt to minimize the effect of filesystem cache on results. Here is an example of the output that bonnie++ produces ...

Version  2.00       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
linux-blk    63640M  694k  99  1.6g  99  737m  76  985k  99  1.3g  69 +++++ +++
Latency              11579us     535us   11889us    8597us   21819us    8238us
Version  2.00       ------Sequential Create------ --------Random Create--------
linux-blk           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency               7620us     126us    1648us     151us      15us     633us

--------------------------------- speed drop ---------------------------------

Version  2.00       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
linux-blk    63640M  676k  99  451m  99  314m  93  951k  99  402m  99 15167 530
Latency              11902us    8959us   24711us   10185us   20884us    5831us
Version  2.00       ------Sequential Create------ --------Random Create--------
linux-blk           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16     0  96 +++++ +++ +++++ +++     0  96 +++++ +++     0  75
Latency                343us     165us    1636us     113us      55us    1836us

In the example above, the benchmark test was repeated about 20 times with results similar to the performance shown above the dotted line ( ~1.6g/s seq write and 1.3g/s seq read ). After that, the performance dropped to what's shown below the dotted line, which is less than 1/4 of the original speed ( ~451m/s seq write and 402m/s seq read ).

 2. What results were you expecting?

What I expect is that, when I perform the same test with the same parameters, the results will stay more or less consistent over time. This is true when KVM is used as the hypervisor on the same hardware with the same guest options. That said, I'm not worried about bhyve being consistently slower than KVM or a FreeBSD guest being consistently slower than a Linux guest. I'm concerned that the performance drop over time is indicative of an issue with how bhyve interacts with non-FreeBSD guests.

 3. VM configuration, virtio-blk disk size, etc.
 4. Full command for tests (including size of test set), bhyve, etc.

I believe this was answered above. Please let me know if you have additional questions.

 5. Did you pass virtio-blk as 512 or 4K? If 512, you should probably try 4K.
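
For the virtio-blk runs, the 4K variant would be expressed as a block-device option on the bhyve slot, something along these lines (slot number and backing path are assumptions):

-s 3,virtio-blk,/dev/da0p1,sectorsize=4096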

The testing performed was not exclusively with virtio-blk.

 6. Linux has several read-ahead options for the IO scheduler, and they could be related too.

I suppose it's possible that bhyve could be somehow causing the disk scheduler in the Linux guest to act differently. I'll see if I can figure out how to disable that in future tests.
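
In a RHEL-family guest, the scheduler and read-ahead can be inspected and changed per device, for example (the device name vda is an assumption):

cat /sys/block/vda/queue/scheduler    # e.g. [mq-deadline] kyber bfq none
echo none > /sys/block/vda/queue/scheduler
blockdev --getra /dev/vda             # current read-ahead, in 512-byte sectors
blockdev --setra 0 /dev/vda           # disable read-ahead for a test run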

Additionally, could you also play with the "sync=disabled" volume/zvol option? Of course, it is only for write testing.

The testing performed was not exclusively with zvols.
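
For the zvol-backed runs, that setting can be toggled on the host and reverted afterwards (the dataset name is an assumption):

zfs set sync=disabled tank/vmdisk0    # for write testing only
zfs inherit sync tank/vmdisk0         # revert to the inherited default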

Once I have more hardware available, I'll try to report back with more testing. It may be interesting to also see how a Windows guest performs compared to Linux & FreeBSD. I suspect that this issue may only be triggered when a fast disk array is in use on the host. My tests use a 16x SSD RAID 10 array. It's also quite possible that the disk IO slowdown is only a symptom of another issue that's triggered by the disk IO test (please see the end of my last post regarding the scheduler priority observations). All I can say for sure is that ...

1) There is a problem and it's reproducible across multiple = hosts
2) It affects RHEL8 & RHEL9 guests but not FreeBSD guests
3) It is not specific to any host or guest storage option

Thanks,

-Matthew

