Date: Tue, 29 Aug 2023 07:07:39 +0000
From: Wei Hu <weh@microsoft.com>
To: Mark Millard <marklmi@yahoo.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: RE: Very slow scp performance comparing to Linux
Message-ID: <SI2P153MB0441D7E1C0178139C687A340BBE7A@SI2P153MB0441.APCP153.PROD.OUTLOOK.COM>
In-Reply-To: <07C2C9E3-7317-43AF-A60C-393ADF90079D@yahoo.com>
References: <948CAEBD-EB60-46B9-96EE-FE41CA6C64A1@yahoo.com> <07C2C9E3-7317-43AF-A60C-393ADF90079D@yahoo.com>
Hi Mark,

Sorry for the top posting, but I don't want to make it look too messy. Here is the information that I had missed in my original email.

All VMs are running on Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU).

FreeBSD VMs are 16 vcpu with 128 GB memory, a non-debug build:
14.0-ALPHA1 FreeBSD 14.0-ALPHA1 amd64 1400094 #7 nodbg-n264692-59e706ffee52-dirty... /usr/obj/usr/src/main/amd64.amd64/sys/GENERIC-NODEBUG amd64

Ubuntu VMs are 4 vcpu with 32 GB memory, kernel version:
6.2.0-1009-azure #9~22.04.3-Ubuntu SMP Tue Aug 1 20:51:07 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

I did a couple more tests as suggested by others in this thread. To recap:

scp to localhost, FreeBSD (ufs) vs Ubuntu (ext4): 70 MB/s vs 550 MB/s
scp to localhost, FreeBSD (tmpfs) vs Ubuntu (tmpfs): 630 MB/s vs 660 MB/s
iperf3 single stream to localhost, FreeBSD vs Ubuntu: 30.9 Gb/s vs 48.8 Gb/s

Would these numbers suggest that

1. ext4 caches a lot more than ufs?
2. there is a TCP performance gap in the network stack between FreeBSD and Ubuntu?

Would you also try running scp on ufs on your bare-metal arm host? I am curious to know how different ufs and zfs are.

Thanks,
Wei

> -----Original Message-----
> From: Mark Millard <marklmi@yahoo.com>
> Sent: Tuesday, August 29, 2023 12:16 AM
> To: Wei Hu <weh@microsoft.com>; FreeBSD Hackers <freebsd-hackers@freebsd.org>
> Subject: Re: Very slow scp performance comparing to Linux
>
> On Aug 28, 2023, at 08:43, Mark Millard <marklmi@yahoo.com> wrote:
>
> > Wei Hu <weh_at_microsoft.com> wrote on
> > Date: Mon, 28 Aug 2023 07:32:35 UTC :
> >
> >> When I was testing a new NIC, I found the single-stream scp performance was almost 8 times slower than Linux on the RX side. Initially I thought it might be something with the NIC. But when I switched to sending the file on localhost, the numbers stayed the same.
> >>
> >> Here I was sending a 2GB file from sender to receiver using scp.
> >> FreeBSD is a recent NON-DEBUG build from CURRENT. The Ubuntu Linux kernel is 6.2.0. Both run in HyperV VMs on the same type of hardware. The FreeBSD VM has 16 vcpus, while the Ubuntu VM has 4 vcpus.
> >>
> >> Sender    Receiver    Throughput
> >> Linux     FreeBSD     70 MB/s
> >> Linux     Linux       550 MB/s
> >> FreeBSD   FreeBSD     70 MB/s
> >> FreeBSD   Linux       350 MB/s
> >> FreeBSD   localhost   70 MB/s
> >> Linux     localhost   550 MB/s
> >>
> >> From these tests, it seems I can rule out an issue with the NIC and its driver. It looks like the FreeBSD kernel network stack is much slower than Linux on single-stream TCP, or there is some problem with scp?
> >>
> >> I also tried turning on the following kernel parameters on the FreeBSD kernel. But they make no difference, and neither do the other tcp cc algorithms such as htcp and newreno.
> >>
> >> net.inet.tcp.soreceive_stream="1"
> >> net.isr.maxthreads="-1"
> >> net.isr.bindthreads="1"
> >>
> >> net.inet.ip.intr_queue_maxlen=2048
> >> net.inet.tcp.recvbuf_max=16777216
> >> net.inet.tcp.recvspace=419430
> >> net.inet.tcp.sendbuf_max=16777216
> >> net.inet.tcp.sendspace=209715
> >> kern.ipc.maxsockbuf=16777216
> >>
> >> Any ideas?
> >
> > You do not give explicit commands to try. Nor do you specify the
> > hardware context that is involved, just that HyperV is involved.
> >
> > So, on a HoneyComb (16 cortex-A72's) with Optane boot media in its
> > PCIe slot I, no HyperV or VM involved, tried:
>
> I should have listed the non-debug build in use:
>
> # uname -apKU
> FreeBSD CA72-16Gp-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023 root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000
>
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 120.2MB/s  00:42
> >
> > It is not a high performance system. 64 GiBytes of RAM.
> >
> > So instead trying a ThreadRipper 1950X that also has Optane in a PCIe
> > slot for its boot media, no HyperV or VM involved,
>
> I should have listed the non-debug build in use:
>
> # uname -apKU
> FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #116 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:20 PDT 2023 root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1500000 1500000
>
> (Same source tree content.)
>
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 299.7MB/s  00:17
> >
> > (These systems do not run with any tmpfs areas, not even /tmp. So I'm
> > not providing that kind of example, at least for now.)
> >
> > 128 GiBytes of RAM.
> >
> > Both systems are ZFS based but with a simple single partition.
> > (Used for bectl BE, not for other types of reasons to use ZFS.
> > I could boot UFS variants of the boot media and test that kind of
> > context.)
> >
> > So both show between your FreeBSD figure and the Linux figure.
> > I've no means of checking how reasonable the figures are relative to
> > your test context. I just know the results are better than you report
> > for localhost use.
>
> Adding a Windows Dev Kit 2023 booting via USB3 (but via a
> U.2 adapter to Optane media), again ZFS, again no VM involved:
>
> # uname -apKU
> FreeBSD CA78C-WDK23-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023 root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000
>
> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 168.7MB/s  00:30
>
> Note: the cortex-a72 and cortex-a78c/x1c builds were optimized via -mcpu= use. The ThreadRipper build was not.
>
> Note: I've not controlled for whether the reads of the input *.img data were gotten from memory caching of prior activity or not. I could do so if you want: reboot before the scp command.
>
> ===
> Mark Millard
> marklmi at yahoo.com
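The localhost scp measurements quoted in this thread can be reproduced with a small script along these lines. This is a sketch, not a command from the thread: the file paths, the 2 GiB size, and the mbps helper are illustrative assumptions.

```shell
#!/bin/sh
# Sketch of the localhost scp benchmark discussed in this thread.
# Paths, sizes, and the mbps helper are illustrative assumptions.
set -eu

# mbps SIZE_MB SECONDS -> whole MB/s
mbps() {
    echo $(( $1 / $2 ))
}

# The benchmark itself requires ssh access to localhost, so it is
# shown commented out (bs=1m is the FreeBSD dd spelling):
#   dd if=/dev/zero of=/tmp/scp-test.img bs=1m count=2048
#   t0=$(date +%s)
#   scp /tmp/scp-test.img root@localhost:/tmp/scp-test-copy.img
#   t1=$(date +%s)
#   echo "$(mbps 2048 $((t1 - t0))) MB/s"

# Sanity checks against figures reported in the thread:
mbps 2048 29    # roughly the 70 MB/s FreeBSD ufs localhost figure
mbps 5120 17    # roughly the ThreadRipper result (~300 MB/s)
```

Note that scp adds ssh encryption overhead on top of the TCP path, which is why the thread also uses iperf3 to measure the network stack alone.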