Date: Fri, 23 Feb 2024 17:25:39 +1100
From: Jason Tubnor <jason@tubnor.net>
To: Matthew Grooms <mgrooms@shrew.net>, virtualization@freebsd.org
Subject: Re: bhyve disk performance issue
Message-ID: <f01a9bca-7023-40c0-93f2-8cdbe4cd8078@tubnor.net>
In-Reply-To: <6a4e7e1d-cca5-45d4-a268-1805a15d9819@shrew.net>
References: <6a128904-a4c1-41ec-a83d-56da56871ceb@shrew.net>
 <28ea168c-1211-4104-b8b4-daed0e60950d@app.fastmail.com>
 <0ff6f30a-b53a-4d0f-ac21-eaf701d35d00@shrew.net>
 <6f6b71ac-2349-4045-9eaf-5c50d42b89be@shrew.net>
 <50614ea4-f0f9-44a2-b5e6-ebb33cfffbc4@shrew.net>
 <6a4e7e1d-cca5-45d4-a268-1805a15d9819@shrew.net>
On 23/02/2024 4:51 pm, Matthew Grooms wrote:
> On 2/18/24 09:21, Matthew Grooms wrote:
>> On 2/17/24 15:53, Matthew Grooms wrote:
>>>
>>> Unfortunately, same story with 13.2. I'm going to try to downgrade
>>> to 12.4 and see if it gets any better.
>>>
>>> ================================================================================
>>> Begin @ Sat Feb 17 11:00:01 CST 2024
>>>
>>> Version 2.00       ------Sequential Output------ --Sequential Input- --Random-
>>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>>> Name:Size etc       /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>>> localhost.lo 63640M 690k  99  1.5g  97  727m  78  950k  99  1.3g  68 +++++ +++
>>> Latency             11759us   29114us    8098us    8649us   23413us    4540us
>>> Version 2.00       ------Sequential Create------ --------Random Create--------
>>> localhost.localdoma -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>>>              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>>>                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
>>> Latency              7791us    131us     1671us     464us      15us    1811us
>>>
>>> End @ Sat Feb 17 11:03:13 CST 2024
>>> ================================================================================
>>>
>>> Begin @ Sat Feb 17 11:10:01 CST 2024
>>>
>>> Version 2.00       ------Sequential Output------ --Sequential Input- --Random-
>>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>>> Name:Size etc       /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>>> localhost.lo 63640M 667k  99  449m  99  313m  94  940k  99  398m  99 16204 563
>>> Latency             12147us    1079us   24470us    8795us   17853us    4384us
>>> Version 2.00       ------Sequential Create------ --------Random Create--------
>>> localhost.localdoma -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>>>              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>>>                 16     0  93 +++++ +++ +++++ +++     0  96 +++++ +++ +++++ +++
>>> Latency               118us    159us     269us      164us      54us     844us
>>>
>>> End @ Sat Feb 17 11:18:43 CST 2024
>>>
>> I wasn't able to get a working 12.4 system in place due to a lack of
>> packages. However, I did fire up a FreeBSD 14 VM and let it run
>> overnight on the same SSD array. It consistently ran at a much higher
>> speed for 50+ runs @ 10m intervals ...
>>
>> ================================================================================
>> Begin @ Sun Feb 18 15:00:00 UTC 2024
>>
>> Version 1.98       ------Sequential Output------ --Sequential Input- --Random-
>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>> Name:Size etc       /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>> freebsd.shrew.l 64G 628k  99  1.6g  98  831m  60 1278k  99  1.1g  42 +++++ +++
>> Latency             13447us   68490us     207ms    7187us     195ms   17665us
>> Version 1.98       ------Sequential Create------ --------Random Create--------
>> freebsd.shrew.lab   -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>>              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>>                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
>> Latency             18225us     18us      28us    18812us     18us      25us
>>
>> End @ Sun Feb 18 15:03:11 UTC 2024
>>
>> I used identical options to run both that VM and the Linux VM I've
>> been testing. The backing store for each VM is a 1TB partition, and
>> the guest interfaces are NVMe. Now I'm really scratching my head.
>>
>> Chuck, were you testing disk performance in Linux VMs or only FreeBSD?
>>
>> Anyone have ideas on why Linux disk performance would drop off a
>> cliff over time?
>>
> I took a detour trying out Xen, but apparently that has its own set of
> performance issues related to the FreeBSD port missing newer features.
> I did install KVM on the same hardware for comparison. I then tested a
> guest provisioned identically to the bhyve VM with a virtio-blk
> device, which ran for 2.5 hours with consistently impressive output ...
>
> ================================================================================
> Begin @ Thu Feb 22 20:55:01 CST 2024
>
> Version 2.00       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Name:Size etc       /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> linux2.shrew.pr 31G 2191k  99  1.4g  55  1.1g  59 3484k  99  2.5g  83  7817 127
> Latency              4480us    2528us   17656us    2650us     102us    3568us
> Version 2.00       ------Sequential Create------ --------Random Create--------
> linux2.shrew.prv    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
> Latency              9722us     90us     123us       61us      20us     998us
>
> End @ Thu Feb 22 20:56:11 CST 2024
> ================================================================================
>
> For comparison, here is the output from a recent run of a Linux VM on
> bhyve using the virtio-blk device. Note the performance drop-off
> between the first and second runs ...
>
> ================================================================================
> Begin @ Thu Feb 22 22:00:02 CST 2024
>
> Version 2.00       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Name:Size etc       /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> linux.shrew. 63640M 694k  99  1.5g  97  772m  70  985k  99  1.4g  75  2302 115
> Latency             11557us   28959us   27540us    8308us   25227us    8605us
> Version 2.00       ------Sequential Create------ --------Random Create--------
> linux.shrew.lab     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
> Latency              4058us    125us    1651us       96us      23us     627us
>
> End @ Thu Feb 22 22:03:07 CST 2024
> ================================================================================
>
> Begin @ Thu Feb 22 22:10:02 CST 2024
>
> Version 2.00       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Name:Size etc       /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
> linux.shrew. 63640M 676k  99  406m  99  300m  92  952k  99  373m  99  2145 158
> Latency             11871us    154us    20673us    9926us   22765us   14034us
> Version 2.00       ------Sequential Create------ --------Random Create--------
> linux.shrew.lab     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                 16     0  95 +++++ +++     0  79     0  96 +++++ +++     0  76
> Latency              4426us    125us    1687us      576us      72us     654us
>
> End @ Thu Feb 22 22:19:19 CST 2024
> ================================================================================
>
> It certainly feels like a deficiency in bhyve that isn't tied to any
> particular storage device model. I did notice a pattern in top that I
> thought was peculiar.
> When watching the bhyve threads while the benchmark test is running, I
> see several vCPU threads running at the top of the list, followed by
> what I assume to be block-device threads. When the disk performance is
> high, it looks like this ...
>
>   PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
>  5628 root         68    0    32G    31G CPU24   24   0:53  88.90% bhyve{vcpu 5}
>  5628 root         36    0    32G    31G vmidle   7   0:18  17.86% bhyve{vcpu 15}
>  5628 root         34    0    32G    31G vmidle  26   1:06  16.76% bhyve{vcpu 8}
>  5628 root         21    0    32G    31G uwait   37   0:05   2.69% bhyve{blk-4:0-2}
>  5628 root         21    0    32G    31G uwait   60   0:04   2.64% bhyve{blk-4:0-0}
>  5628 root         21    0    32G    31G uwait   52   0:06   2.62% bhyve{blk-4:0-1}
>  5628 root         21    0    32G    31G uwait   14   0:05   2.58% bhyve{blk-4:0-6}
>  5628 root         21    0    32G    31G RUN     50   0:05   2.51% bhyve{blk-4:0-4}
>  5628 root         20    0    32G    31G uwait   38   0:05   2.51% bhyve{blk-4:0-5}
>  5628 root         21    0    32G    31G uwait   56   0:05   2.46% bhyve{blk-4:0-3}
>  5628 root         20    0    32G    31G uwait   22   0:06   2.45% bhyve{blk-4:0-7}
>
> When disk performance drops off, I see the PRI value of one of the
> bhyve vCPU threads climb quickly until it hits around 135. At that
> point, that one CPU thread is pegged at essentially 100% WCPU until
> the test is done. The other bhyve threads are much less busy ...
>
>   PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
>  5518 root        135    0    32G    31G CPU59   59   6:49  99.97% bhyve{vcpu 2}
>  5518 root         26    0    32G    31G vmidle  36   5:40   8.80% bhyve{vcpu 15}
>  5518 root         23    0    32G    31G vmidle  18   0:41   4.74% bhyve{vcpu 13}
>  5518 root         20    0    32G    31G vmidle  37   0:09   0.85% bhyve{vcpu 10}
>  5518 root         20    0    32G    31G uwait    6   0:20   0.72% bhyve{blk-4:0-2}
>  5518 root         20    0    32G    31G uwait    8   0:20   0.71% bhyve{blk-4:0-7}
>  5518 root         20    0    32G    31G uwait   43   0:22   0.70% bhyve{blk-4:0-0}
>  5518 root         20    0    32G    31G uwait   12   0:20   0.70% bhyve{blk-4:0-5}
>  5518 root         20    0    32G    31G uwait   43   0:19   0.68% bhyve{blk-4:0-3}
>  5518 root         20    0    32G    31G uwait   36   0:21   0.68% bhyve{blk-4:0-6}
>  5518 root         20    0    32G    31G uwait   46   0:20   0.68% bhyve{blk-4:0-4}
>  5518 root         20    0    32G    31G uwait   32   0:20   0.64% bhyve{blk-4:0-1}
>
> This pattern in top repeats each time the benchmark is run unless the
> guest is rebooted.
>
> Does bhyve call pthread_attr_setschedparam at run time under certain
> circumstances, or is that the scheduler doing its thing? Anyone have
> any ideas? I'm pretty much out of them :/
>
> Thanks in advance,
>
> -Matthew
>

Hi Matthew,

Have you tried using UFS2 or even memory-backed host storage for
testing yet? That would rule out whether this is a ZFS issue or a
bhyve issue. A few rough sketches follow in the postscript.

Cheers,

Jason.
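
P.S. First, the memory-backed test. This is an untested outline,
assuming a swap-backed md(4) device is acceptable and the image fits
in RAM; sizes, unit numbers, and paths are only placeholders:

  # Create a 64 GB swap-backed memory disk; this takes ZFS (and the
  # SSDs) out of the picture entirely.
  mdconfig -a -t swap -s 64g -u 1

  # Option 1: hand /dev/md1 to bhyve directly as the raw backing store.
  # Option 2: put UFS2 on it and keep the image file there instead.
  newfs -U /dev/md1
  mount /dev/md1 /mnt
  truncate -s 60g /mnt/disk.img

If the drop-off disappears on md(4) or UFS2, that points at ZFS; if it
persists, bhyve or the scheduler looks more likely.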
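
On the priority observation: two stock tools can log what top shows
without keeping it open interactively. Assuming 5518 stands in for
your bhyve PID:

  # Thread view of a single process, system threads included.
  top -HSp 5518

  # One-shot per-thread listing; the PRI column is what climbs to ~135.
  procstat -t 5518

Capturing procstat -t every few seconds across a run should show
exactly when the vcpu thread's priority starts to climb.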
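
As for the pthread_attr_setschedparam question, dtrace could answer it
directly rather than by inference. A sketch, assuming the pid provider
is usable against your bhyve process while the benchmark is running:

  # Fire on any run-time call into the pthread scheduling entry points
  # and print the calling stack; no output means bhyve never calls them.
  dtrace -p 5518 -n '
      pid$target::pthread_setschedparam:entry,
      pid$target::pthread_attr_setschedparam:entry { ustack(); }'

If nothing fires while PRI climbs, that would suggest it is just the
scheduler doing its thing, not something bhyve requests.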
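
And for anyone else on the list wanting to reproduce the timed runs, a
minimal loop along the lines of the "Begin @ / End @" stamps above,
assuming bonnie++ from ports and a scratch directory on the disk under
test:

  #!/bin/sh
  # Run bonnie++ every 10 minutes, bracketing each run with timestamps.
  while :; do
      echo "Begin @ $(date)"
      bonnie++ -d /mnt/test -u root
      echo "End @ $(date)"
      sleep 600
  done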