Date: Tue, 29 Aug 2023 12:55:35 +0000
From: Wei Hu <weh@microsoft.com>
To: Mark Millard <marklmi@yahoo.com>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: RE: Very slow scp performance comparing to Linux
Message-ID: <SI2P153MB0441C7B26F38519CAF179B9EBBE7A@SI2P153MB0441.APCP153.PROD.OUTLOOK.COM>
In-Reply-To: <6E433834-E192-44F1-9FF1-3814F13449FF@yahoo.com>
References: <98C8E07C-2247-4439-8836-ED350CC83F16@yahoo.com> <6E433834-E192-44F1-9FF1-3814F13449FF@yahoo.com>
Hi Mark,

Thanks for the update. Seems the numbers are the same on zfs and ufs. That's
good to know.

Yes, your numbers on ARM64 are better than mine on Intel. However, my original
intention was to find out why scp on Linux is performing much better than
FreeBSD under the same hardware env.

Is it possible to try Linux in your ARM64 setting? I am using Ubuntu 22.04 on
an ext4 file system.

Thanks,
Wei

> -----Original Message-----
> From: Mark Millard <marklmi@yahoo.com>
> Sent: Tuesday, August 29, 2023 7:22 PM
> To: Wei Hu <weh@microsoft.com>
> Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
> Subject: Re: Very slow scp performance comparing to Linux
>
> [Adding USB3/U.2 Optane UFS Windows Dev Kit 2023 scp examples, no VMs
> involved.]
>
> On Aug 29, 2023, at 03:27, Mark Millard <marklmi@yahoo.com> wrote:
>
> > Wei Hu <weh_at_microsoft.com> wrote on
> > Date: Tue, 29 Aug 2023 07:07:39 UTC :
> >
> >> Sorry for the top posting, but I don't want to make it look too
> >> messy. Here is the information that I missed in my original email.
> >>
> >> All VMs are running on Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz
> >> K8-class CPU).
> >>
> >> FreeBSD VMs are 16 vcpu with 128 GB memory, a non-debug build:
> >> 14.0-ALPHA1 FreeBSD 14.0-ALPHA1 amd64 1400094 #7
> >> nodbg-n264692-59e706ffee52-dirty...
> >> /usr/obj/usr/src/main/amd64.amd64/sys/GENERIC-NODEBUG amd64
> >>
> >> Ubuntu VMs are 4 vcpu with 32 GB memory, kernel version:
> >> 6.2.0-1009-azure #9~22.04.3-Ubuntu SMP Tue Aug 1 20:51:07 UTC 2023
> >> x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> I did a couple more tests as suggested by others in this thread. In recap:
> >>
> >> Scp to localhost, FreeBSD (ufs) vs Ubuntu (ext4): 70 MB/s vs 550 MB/s
> >> Scp to localhost, FreeBSD (tmpfs) vs Ubuntu (tmpfs): 630 MB/s vs 660 MB/s
> >>
> >> Iperf3 single stream to localhost, FreeBSD vs Ubuntu: 30.9 Gb/s vs
> >> 48.8 Gb/s
> >>
> >> Would these numbers suggest that
> >> 1. ext4 caches a lot more than ufs?
> >> 2. there is a TCP performance gap in the network stack between FreeBSD
> >> and Ubuntu?
> >>
> >> Would you also try running scp on ufs on your bare-metal arm host? I am
> >> curious to know how different ufs and zfs are.
> >
> >
> > For this round I'm rebooting between the unxz and the 1st scp.
> > So I'll also have zfs results again. I'll also do a 2nd scp (no
> > reboot) to see if it gets notably different results.
> >
> > . . .
> >
> > Well, I just got FreeBSD main [so: 15] running under HyperV on the
> > Windows Dev Kit 2023. So reporting for there first. This was via an
> > ssh session. The context is ZFS. The VM file size is fixed, as is the
> > RAM size: 6 cores (of 8) and 24576 MiBytes assigned (of 32
> > GiBytes) to the one FreeBSD instance. The VM file is on the internal
> > NVMe drive in the Windows 11 Pro file system in the default place.
> >
> > (I was having it copy the hard drive media to the VM file when I started
> > this process. Modern HyperV no longer seems to support direct use of
> > USB3 physical media. I first had to produce a copy of the material on
> > smaller media so that a fixed VM file size from a copy to create the
> > VM file would fit in the NVMe's free space.)
> >
> > # uname -apKU
> > FreeBSD CA78C-WDK23s-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023 root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000
> >
> > (The ZFS content is a copy of the USB3-interfaced ZFS Optane media's
> > content previously reported on.
> > So the installed system was built with -mcpu= based optimization, as
> > noted before.)
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 193.6MB/s   00:26
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 198.0MB/s   00:25
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > For reference:
> >
> > # gpart show -pl
> > =>        40  468862055    da0  GPT  (224G)
> >           40      32728         - free -  (16M)
> >        32768     102400  da0p1  wdk23sCA78Cefi  (50M)
> >       135168  421703680  da0p2  wdk23sCA78Czfs  (201G)
> >    421838848   47022080  da0p3  wdk23sCA78Cswp22  (22G)
> >    468860928       1167         - free -  (584K)
> >
> > # zpool list
> > NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
> > zwdk23s   200G  79.8G   120G        -         -     0%    39%  1.00x    ONLINE  -
> >
> > (UFS would have notably more allocated and less free for the same size
> > partition.)
> >
> >
> > The below is based on the HoneyComb (16 cortex-a72's) since I've
> > got the HyperV context going on the Windows Dev Kit 2023 at the
> > moment.
> >
> >
> > UFS first:
> >
> > # uname -apKU
> > FreeBSD HC-CA72-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023 root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 129.7MB/s   00:39
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 130.9MB/s   00:39
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > Note: This is via a U.2 Optane 960 GB media and an M.2 adapter instead
> > of being via a PCIe Optane 960 GB media in the PCIe slot.
> >
> >
> > ZFS second:
> >
> > # uname -apKU
> > FreeBSD CA72-16Gp-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023 root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 121.1MB/s   00:42
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > (root@localhost) Password for root@CA72-16Gp-ZFS:
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 124.6MB/s   00:41
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > Note: This is via a PCIe Optane 960 GB media in the PCIe slot.
> >
> >
> > UFS was slightly faster than ZFS for the HoneyComb context, but there
> > is the M.2 vs. PCIe difference as well.
> >
>
> # uname -apKU
> FreeBSD CA78C-WDK23-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023 root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000
>
> Again, a -mcpu= optimized build context for the FreeBSD in
> operation.
>
> (Still rebooting first. Then . . .)
>
> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 199.3MB/s   00:25
>
> # rm ~/FreeBSD-14-TEST.img
> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 204.9MB/s   00:24
>
>
> So, faster than what you are reporting for the
> Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU)
> context.
>
> The Windows Dev Kit 2023 figures are generally faster than the
> HoneyComb figures.
>
> ===
> Mark Millard
> marklmi at yahoo.com
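[For anyone repeating the comparison on another machine, the loopback procedure used throughout this thread boils down to a few commands. This is a sketch, not a verbatim transcript: it assumes sshd is running, root@localhost logins are permitted, and the 5120 MB test image from the thread is in the current directory; the live awk line just sanity-checks one of the reported rates.]

```shell
# Sketch of the localhost scp / iperf3 comparison from this thread.
#
#   scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img \
#       root@localhost:FreeBSD-14-TEST.img
#   rm ~/FreeBSD-14-TEST.img
#   # ...then repeat the scp for a warm-cache second run
#
#   iperf3 -s -D          # start a daemonized loopback server
#   iperf3 -c 127.0.0.1   # client: a single TCP stream by default
#
# Arithmetic check of a reported rate: 5120 MB transferred in 26 s.
awk 'BEGIN { printf "%.1f MB/s\n", 5120 / 26 }'
# prints "196.9 MB/s"
```

The computed ~196.9 MB/s agrees with the 193.6-199.3 MB/s range scp itself reported for the 26-second Windows Dev Kit 2023 runs above.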