Date: Mon, 31 Oct 2022 16:03:17 +0800
From: Zhenlei Huang <zlei.huang@gmail.com>
To: Paul Procacci <pprocacci@gmail.com>
Cc: FreeBSD virtualization <freebsd-virtualization@freebsd.org>
Subject: Re: NFS in bhyve VM mounted via bridge interface
Message-ID: <3858240B-7225-4ECB-B4A6-4DE006ED869D@gmail.com>
In-Reply-To: <CAFbbPuiHZokb_7Q=TpXy7fFBNfJGtN=9Dt3T2+bx5OZUQOqjbg@mail.gmail.com>
References: <A4F5B9EF-AA2B-4F34-8F62-A12ECE4E9566@jld3.net>
 <CAFbbPuiHZokb_7Q=TpXy7fFBNfJGtN=9Dt3T2+bx5OZUQOqjbg@mail.gmail.com>
> On Oct 31, 2022, at 2:02 PM, Paul Procacci <pprocacci@gmail.com> wrote:
>
> On Mon, Oct 31, 2022 at 12:00 AM John Doherty <bsdlists@jld3.net> wrote:
>> I have a machine running FreeBSD 12.3-RELEASE with a zpool that consists
>> of 12 mirrored pairs of 14 TB disks. I'll call this the "storage
>> server." On that machine, I can write to ZFS file systems at around 950
>> MB/s and read from them at around 1450 MB/s. I'm happy with that.
>>
>> I have another machine running Alma linux 8.6 that mounts file systems
>> from the storage server via NFS over a 10 GbE network. On this machine,
>> I can write to and read from an NFS file system at around 450 MB/s. I
>> wish that this were better but it's OK.
>>
>> I created a bhyve VM on the storage server that also runs Alma linux
>> 8.6. It has a vNIC that is bridged with the 10 GbE physical NIC and a
>> tap interface:
>>
>> [root@ss3] # ifconfig vm-storage
>> vm-storage: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>         ether 82:d3:46:17:4e:ee
>>         id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>>         maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
>>         root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
>>         member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>>                 ifmaxaddr 0 port 10 priority 128 path cost 2000000
>>         member: ixl0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>>                 ifmaxaddr 0 port 5 priority 128 path cost 2000
>>         groups: bridge vm-switch viid-ddece@
>>         nd6 options=1<PERFORMNUD>
>>
>> I mount file systems from the storage server on this VM via NFS. I can
>> write to those file systems at around 250 MB/s and read from them at
>> around 280 MB/s. This surprised me a little: I thought that this might
>> perform better than or at least as well as the physical 10 GbE network
>> but find that it performs significantly worse.
>>
>> All my read and write tests here are stupidly simple, using dd to read
>> from /dev/zero and write to a file or to read from a file and write to
>> /dev/null.
>>
>> Is anyone else either surprised or unsurprised by these results?
>>
>> I have not yet tried passing a physical interface on the storage server
>> through to the VM with PCI passthrough, but the machine does have
>> another 10 GbE interface I could use for this. This stuff is all about
>> 3,200 miles away from me so I need to get someone to plug a cable in for
>> me. I'll be interested to see how that works out, though.
>>
>> Any comments much appreciated. Thanks.
>
> I was getting geared up to help you with this and then this happened:
>
> Host:
> # dd if=17-04-27.mp4 of=/dev/null bs=4096
> 216616+1 records in
> 216616+1 records out
> 887263074 bytes transferred in 76.830892 secs (11548259 bytes/sec)
>
> VM:
> dd if=17-04-27.mp4 of=/dev/null bs=4096
> 216616+1 records in
> 216616+1 records out
> 887263074 bytes transferred in 7.430017 secs (119416016 bytes/sec)
>
> I'm totally flabbergasted. These results are consistent and not at all what I expected to see.
> I even ran the tests on the VM first and the host second. Call me confused.

I think you should bypass the local cache while testing. Try iflag=direct; see dd(1).

If the input file 17-04-27.mp4 is on NFS, then you could also verify the network I/O with netstat.
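A minimal sketch of that uncached re-test, assuming GNU dd inside the Alma Linux guest, an illustrative NFS mount point /mnt/storage, and the ixl0 interface from the ifconfig output above:

  # In the Linux guest: read through O_DIRECT so the page cache is bypassed (GNU dd)
  dd if=/mnt/storage/17-04-27.mp4 of=/dev/null bs=1M iflag=direct

  # On the FreeBSD host, while the dd runs: per-second traffic on the 10 GbE NIC
  netstat -w 1 -I ixl0

  # In the guest: NFS client RPC counters, sampled before and after the run
  nfsstat -c

If the netstat byte counters stay near zero while dd still reports hundreds of MB/s, the data is being served from cache rather than crossing the wire.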
> Anyways, that's a problem for me to figure out.
>
> Back to your problem, I had something typed out concerning checking rxsum's and txsum's are turned off on
> the interfaces, or at least see if that makes a difference, trying to use a disk type of nvme, and trying ng_bridge
> w/ netgraph interfaces but now I'm concluding my house is made of glass -- Hah! -- so until I get my house in
> order I'm going to refrain from providing details.
>
> Sorry and thanks!
> ~Paul

Best regards,
Zhenlei
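For reference, a sketch of the offload and disk-type checks mentioned in the quoted paragraph above. The names ixl0 and tap1 come from the earlier ifconfig output; eth0 and the vm-bhyve/bhyve disk settings are illustrative assumptions, not the actual configuration in this thread:

  # FreeBSD host: turn off checksum/TSO/LRO offload on the physical bridge member,
  # then repeat the NFS throughput test
  ifconfig ixl0 -rxcsum -txcsum -tso -lro

  # Alma guest: the virtio NIC's offloads can be toggled with ethtool
  ethtool -K eth0 rx off tx off

  # vm-bhyve guest config: switch the disk emulation to NVMe
  #   disk0_type="nvme"
  # equivalent raw bhyve(8) slot, with a hypothetical zvol path:
  #   -s 4,nvme,/dev/zvol/tank/vm/alma/disk0

Whether any of these actually helps bridged tap throughput is exactly the question left open here; they are cheap to try and easy to revert.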
