Date: Thu, 22 Feb 2024 23:51:06 -0600
From: Matthew Grooms <mgrooms@shrew.net>
To: virtualization@freebsd.org
Subject: Re: bhyve disk performance issue
Message-ID: <6a4e7e1d-cca5-45d4-a268-1805a15d9819@shrew.net>
In-Reply-To: <50614ea4-f0f9-44a2-b5e6-ebb33cfffbc4@shrew.net>
References: <6a128904-a4c1-41ec-a83d-56da56871ceb@shrew.net>
 <28ea168c-1211-4104-b8b4-daed0e60950d@app.fastmail.com>
 <0ff6f30a-b53a-4d0f-ac21-eaf701d35d00@shrew.net>
 <6f6b71ac-2349-4045-9eaf-5c50d42b89be@shrew.net>
 <50614ea4-f0f9-44a2-b5e6-ebb33cfffbc4@shrew.net>
On 2/18/24 09:21, Matthew Grooms wrote:
> On 2/17/24 15:53, Matthew Grooms wrote:
>>
>> Unfortunately same story with 13.2. I'm going to try and downgrade to
>> 12.4 and see if it gets any better.
>>
>> ================================================================================
>> Begin @ Sat Feb 17 11:00:01 CST 2024
>>
>> Version  2.00       ------Sequential Output------ --Sequential Input- --Random-
>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>> Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
>> localhost.lo 63640M  690k  99  1.5g  97  727m  78  950k  99  1.3g  68 +++++ +++
>> Latency             11759us   29114us    8098us    8649us   23413us    4540us
>> Version  2.00       ------Sequential Create------ --------Random Create--------
>> localhost.localdoma -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
>>                  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
>> Latency              7791us     131us    1671us     464us      15us    1811us
>>
>> End @ Sat Feb 17 11:03:13 CST 2024
>> ================================================================================
>> Begin @ Sat Feb 17 11:10:01 CST 2024
>>
>> Version  2.00       ------Sequential Output------ --Sequential Input- --Random-
>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>> Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
>> localhost.lo 63640M  667k  99  449m  99  313m  94  940k  99  398m  99 16204 563
>> Latency             12147us    1079us   24470us    8795us   17853us    4384us
>> Version  2.00       ------Sequential Create------ --------Random Create--------
>> localhost.localdoma -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
>>                  16     0  93 +++++ +++ +++++ +++     0  96 +++++ +++ +++++ +++
>> Latency               118us     159us     269us     164us      54us     844us
>>
>> End @ Sat Feb 17 11:18:43 CST 2024
>>
> I wasn't able to get a working 12.4 system in place due to lack of
> packages. However, I did fire up a FreeBSD 14 VM and let it run
> overnight on the same SSD array. It consistently ran at a much higher
> speed for 50+ runs @ 10m intervals ...
>
> ================================================================================
> Begin @ Sun Feb 18 15:00:00 UTC 2024
>
> Version  1.98       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
> freebsd.shrew.l 64G  628k  99  1.6g  98  831m  60 1278k  99  1.1g  42 +++++ +++
> Latency             13447us   68490us     207ms    7187us     195ms   17665us
> Version  1.98       ------Sequential Create------ --------Random Create--------
> freebsd.shrew.lab   -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
>                  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
> Latency             18225us      18us      28us   18812us      18us      25us
>
> End @ Sun Feb 18 15:03:11 UTC 2024
>
> I used identical options to run both that VM and the Linux VM I've
> been testing. The backing store for each VM has a 1TB partition and
> the guest interfaces are NVME. Now I'm really scratching my head.
>
> Chuck, were you testing disk performance in Linux VMs or only FreeBSD?
>
> Anyone have ideas on why Linux disk performance would drop off a cliff
> over time?
>
I took a detour trying out Xen, but apparently that has its own set of
performance issues related to the FreeBSD port missing newer features.
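In case anyone wants to reproduce the cadence of these runs, a minimal
wrapper like the one below, fired from cron every 10 minutes, produces
the same Begin/End framing. This is only a sketch; the target
directory, user and log path are placeholders, not my exact script:

#!/bin/sh
# bench.sh -- one timestamped bonnie++ run, meant to be appended to a log.
# Example crontab entry (every 10 minutes):
#   */10 * * * * root /root/bench.sh >> /root/bench.log 2>&1
# The target directory and user below are placeholders.
echo "================================================================================"
echo "Begin @ $(date)"
echo ""
bonnie++ -d /bench -u root
echo ""
echo "End @ $(date)"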
I did install KVM on the same hardware for comparison. I then tested a
guest provisioned identically to the bhyve VM, using a virtio blk
device. It ran for 2.5 hours with consistently impressive output ...

================================================================================
Begin @ Thu Feb 22 20:55:01 CST 2024

Version  2.00       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
linux2.shrew.pr 31G 2191k  99  1.4g  55  1.1g  59 3484k  99  2.5g  83  7817 127
Latency              4480us    2528us   17656us    2650us     102us    3568us
Version  2.00       ------Sequential Create------ --------Random Create--------
linux2.shrew.prv    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency              9722us      90us     123us      61us      20us     998us

End @ Thu Feb 22 20:56:11 CST 2024
================================================================================

For comparison, here is the output from a recent run of a Linux VM on
bhyve using the virtio blk device. Note the performance drop-off
between the first and second run ...

================================================================================
Begin @ Thu Feb 22 22:00:02 CST 2024

Version  2.00       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
linux.shrew. 63640M  694k  99  1.5g  97  772m  70  985k  99  1.4g  75  2302 115
Latency             11557us   28959us   27540us    8308us   25227us    8605us
Version  2.00       ------Sequential Create------ --------Random Create--------
linux.shrew.lab     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency              4058us     125us    1651us      96us      23us     627us

End @ Thu Feb 22 22:03:07 CST 2024
================================================================================
Begin @ Thu Feb 22 22:10:02 CST 2024

Version  2.00       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
linux.shrew. 63640M  676k  99  406m  99  300m  92  952k  99  373m  99  2145 158
Latency             11871us     154us   20673us    9926us   22765us   14034us
Version  2.00       ------Sequential Create------ --------Random Create--------
linux.shrew.lab     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP /sec %CP
                 16     0  95 +++++ +++     0  79     0  96 +++++ +++     0  76
Latency              4426us     125us    1687us     576us      72us     654us

End @ Thu Feb 22 22:19:19 CST 2024
================================================================================

It certainly feels like a deficiency in bhyve that isn't tied to any
particular storage device model.
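For reference, the guest storage setup amounts to a bhyve invocation
along these lines. Treat it as a sketch: the vCPU count, memory size,
slot numbers and backing partition path are approximations (slot 4 is
inferred from the blk-4:0-* thread names below), but the drop-off shows
up whether the disk is attached as virtio-blk or nvme:

#!/bin/sh
# Approximate bhyve invocation for the Linux test guest. Device paths,
# slot numbers and sizes are illustrative; the disk is a 1TB GPT
# partition handed to the guest as either virtio-blk or nvme.
bhyve -c 16 -m 32G -H -w \
    -s 0,hostbridge \
    -s 4,virtio-blk,/dev/gpt/linux-bench \
    -s 5,virtio-net,tap0 \
    -s 31,lpc \
    -l com1,stdio \
    -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
    linux
# NVMe variant: replace the disk slot with
#   -s 4,nvme,/dev/gpt/linux-bench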
I did notice a pattern in top that I thought was peculiar. When
watching the bhyve threads while the benchmark is running, I see
several vcpu threads running at the top of the list followed by what I
assume to be the block device threads. When disk performance is high,
it looks like this ...

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 5628 root         68    0    32G    31G CPU24   24   0:53  88.90% bhyve{vcpu 5}
 5628 root         36    0    32G    31G vmidle   7   0:18  17.86% bhyve{vcpu 15}
 5628 root         34    0    32G    31G vmidle  26   1:06  16.76% bhyve{vcpu 8}
 5628 root         21    0    32G    31G uwait   37   0:05   2.69% bhyve{blk-4:0-2}
 5628 root         21    0    32G    31G uwait   60   0:04   2.64% bhyve{blk-4:0-0}
 5628 root         21    0    32G    31G uwait   52   0:06   2.62% bhyve{blk-4:0-1}
 5628 root         21    0    32G    31G uwait   14   0:05   2.58% bhyve{blk-4:0-6}
 5628 root         21    0    32G    31G RUN     50   0:05   2.51% bhyve{blk-4:0-4}
 5628 root         20    0    32G    31G uwait   38   0:05   2.51% bhyve{blk-4:0-5}
 5628 root         21    0    32G    31G uwait   56   0:05   2.46% bhyve{blk-4:0-3}
 5628 root         20    0    32G    31G uwait   22   0:06   2.45% bhyve{blk-4:0-7}

When disk performance drops off, one of the bhyve vcpu threads shows
its PRI value climbing quickly until it hits around 135. At that
point, that one vcpu thread is basically pegged at 100% WCPU until the
test is done, while the other bhyve threads are much less busy ...

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 5518 root        135    0    32G    31G CPU59   59   6:49  99.97% bhyve{vcpu 2}
 5518 root         26    0    32G    31G vmidle  36   5:40   8.80% bhyve{vcpu 15}
 5518 root         23    0    32G    31G vmidle  18   0:41   4.74% bhyve{vcpu 13}
 5518 root         20    0    32G    31G vmidle  37   0:09   0.85% bhyve{vcpu 10}
 5518 root         20    0    32G    31G uwait    6   0:20   0.72% bhyve{blk-4:0-2}
 5518 root         20    0    32G    31G uwait    8   0:20   0.71% bhyve{blk-4:0-7}
 5518 root         20    0    32G    31G uwait   43   0:22   0.70% bhyve{blk-4:0-0}
 5518 root         20    0    32G    31G uwait   12   0:20   0.70% bhyve{blk-4:0-5}
 5518 root         20    0    32G    31G uwait   43   0:19   0.68% bhyve{blk-4:0-3}
 5518 root         20    0    32G    31G uwait   36   0:21   0.68% bhyve{blk-4:0-6}
 5518 root         20    0    32G    31G uwait   46   0:20   0.68% bhyve{blk-4:0-4}
 5518 root         20    0    32G    31G uwait   32   0:20   0.64% bhyve{blk-4:0-1}

This pattern in top repeats each time the benchmark is run unless the
guest is rebooted. Does bhyve call pthread_attr_setschedparam at run
time under certain circumstances, or is that the scheduler doing its
thing?
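In case it helps anyone poking at this, the priority creep is easy to
watch outside of a full top session with procstat. A rough sketch; the
VM name and sampling interval are placeholders:

#!/bin/sh
# Sample the per-thread scheduling priority of a bhyve guest every 10
# seconds. The process title "bhyve: linux" is a placeholder; substitute
# the real VM name. procstat -t prints one line per thread, including
# the thread name (vcpu N / blk-4:0-N) and its current PRI.
pid=$(pgrep -f 'bhyve: linux')
while : ; do
    date
    procstat -t "$pid" | grep -E 'vcpu|blk'
    sleep 10
done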
Anyone have any ideas? I'm pretty much out of them :/

Thanks in advance,

-Matthew