Date: Tue, 29 Aug 2023 12:55:35 +0000
From: Wei Hu <weh@microsoft.com>
To: Mark Millard <marklmi@yahoo.com>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: RE: Very slow scp performance comparing to Linux
Message-ID: <SI2P153MB0441C7B26F38519CAF179B9EBBE7A@SI2P153MB0441.APCP153.PROD.OUTLOOK.COM>
In-Reply-To: <6E433834-E192-44F1-9FF1-3814F13449FF@yahoo.com>
References: <98C8E07C-2247-4439-8836-ED350CC83F16@yahoo.com> <6E433834-E192-44F1-9FF1-3814F13449FF@yahoo.com>
Hi Mark,

Thanks for the update. It seems the numbers are about the same on zfs and ufs.
That's good to know.

Yes, your numbers on ARM64 are better than mine on Intel. However, my original
intention was to find out why scp on Linux performs much better than FreeBSD
on the same hardware.

Is it possible to try Linux in your ARM64 setting? I am using Ubuntu 22.04 on an
ext4 file system.

Thanks,
Wei
> -----Original Message-----
> From: Mark Millard <marklmi@yahoo.com>
> Sent: Tuesday, August 29, 2023 7:22 PM
> To: Wei Hu <weh@microsoft.com>
> Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
> Subject: Re: Very slow scp performance comparing to Linux
>
> [Adding USB3/U.2 Optane UFS Windows Dev Kit 2023 scp examples, no VMs
> involved.]
>
> On Aug 29, 2023, at 03:27, Mark Millard <marklmi@yahoo.com> wrote:
>
> > Wei Hu <weh@microsoft.com> wrote on
> > Date: Tue, 29 Aug 2023 07:07:39 UTC :
> >
> >> Sorry for the top posting, but I don't want to make it look too
> >> messy. Here is the information that I missed in my original email.
> >>
> >> All VMs are running on Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU).
> >>
> >> FreeBSD VMs are 16 vcpu with 128 GB memory, in non-debug build:
> >> 14.0-ALPHA1 FreeBSD 14.0-ALPHA1 amd64 1400094 #7
> >> nodbg-n264692-59e706ffee52-dirty...
> >> /usr/obj/usr/src/main/amd64.amd64/sys/GENERIC-NODEBUG amd64
> >>
> >> Ubuntu VMs are 4 vcpu with 32 GB memory, kernel version:
> >> 6.2.0-1009-azure #9~22.04.3-Ubuntu SMP Tue Aug 1 20:51:07 UTC 2023
> >> x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> I did a couple more tests as suggested by others in this thread. In recap:
> >>
> >> Scp to localhost, FreeBSD (ufs) vs Ubuntu (ext4): 70 MB/s vs 550 MB/s
> >> Scp to localhost, FreeBSD (tmpfs) vs Ubuntu (tmpfs): 630 MB/s vs 660 MB/s
> >>
> >> Iperf3 single stream to localhost: FreeBSD vs Ubuntu: 30.9 Gb/s vs 48.8 Gb/s
> >>
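[For reference, the localhost tests in the recap above were run along these
lines; the file name below is just a placeholder for the multi-GB test file,
and the iperf3 options are only a sketch of a single-stream loopback run:

  # scp over loopback (any sufficiently large file works):
  scp ./testfile.img root@localhost:/root/testfile-copy.img

  # iperf3 single stream over loopback:
  iperf3 -s -D              # start the server in the background
  iperf3 -c 127.0.0.1 -P 1  # single-stream client against loopback
]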
> >> Would these numbers suggest that
> >> 1. ext4 caches a lot more than ufs?
> >> 2. there is a TCP performance gap in the network stack between FreeBSD and Ubuntu?
> >>
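[On question 1: one crude way to gauge how much write-back caching is involved
might be to time the scp and then time a sync immediately afterwards; a long
sync would suggest much of the data was still cached when scp reported
completion. This is only an untested sketch with a placeholder file name:

  /usr/bin/time -h scp ./testfile.img root@localhost:/root/testfile-copy.img
  /usr/bin/time -h sync
  # (-h is the FreeBSD time(1) flag; GNU time on Ubuntu uses different options)
]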
> >> Would you also try running scp on ufs on your bare metal arm host? I am
> >> curious to know how different ufs and zfs are.
> >
> >
> > For this round I'm rebooting between the unxz and the 1st scp.
> > So I'll also have zfs results again. I'll also do a 2nd scp (no
> > reboot) to see if it gets notably different results.
> >
> > . . .
> >
> > Well, I just got FreeBSD main [so: 15] running under HyperV on the
> > Windows Dev Kit 2023, so I'm reporting on that first. This was via an
> > ssh session. The context is ZFS. The VM file size is fixed, as is the
> > RAM size. 6 cores (of 8) and 24576 MiBytes (of 32 GiBytes) are assigned
> > to the one FreeBSD instance. The VM file is on the internal NVMe drive
> > in the Windows 11 Pro file system in the default place.
> >
> > (I was having it copy the hard drive media to the VM file when I started
> > this process. Modern HyperV no longer seems to support direct use of
> > USB3 physical media. I first had to produce a copy of the material on
> > smaller media so that the fixed-size VM file created from that copy
> > would fit in the NVMe's free space.)
> >
> > # uname -apKU
> > FreeBSD CA78C-WDK23s-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023 root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000
> >
> > (The ZFS content is a copy of the USB3-interfaced ZFS Optane media's
> > content previously reported on. So the installed system was built with
> > -mcpu= based optimization, as noted before.)
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 193.6MB/s   00:26
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 198.0MB/s   00:25
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > For reference:
> >
> > # gpart show -pl
> > =>         40  468862055    da0  GPT  (224G)
> >            40      32728         - free -  (16M)
> >         32768     102400  da0p1  wdk23sCA78Cefi  (50M)
> >        135168  421703680  da0p2  wdk23sCA78Czfs  (201G)
> >     421838848   47022080  da0p3  wdk23sCA78Cswp22  (22G)
> >     468860928       1167         - free -  (584K)
> >
> > # zpool list
> > NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> > zwdk23s   200G  79.8G   120G        -         -     0%    39%  1.00x  ONLINE  -
> >
> > (UFS would have notably more allocated and less free for the same size
> > partition.)
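[In case it helps the comparison: the analogous allocation numbers on the UFS
hosts should be visible with something like the following untested sketch (the
mount point is illustrative):

  # ZFS side:
  zpool list
  # UFS side, used/available space on the file system holding the test file:
  df -h /
]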
> >
> >
> >
> > The below is based on the HoneyComb (16 cortex-a72's) since I've
> > got the HyperV context going on the Windows Dev Kit 2023 at the
> > moment.
> >
> >
> > UFS first:
> >
> > # uname -apKU
> > FreeBSD HC-CA72-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023 root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 129.7MB/s   00:39
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 130.9MB/s   00:39
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > Note: This is via a U.2 Optane 960 GB media and an M.2 adapter instead
> > of being via a PCIe Optane 960 GB media in the PCIe slot.
> >
> >
> > ZFS second:
> >
> > # uname -apKU
> > FreeBSD CA72-16Gp-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #110 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:19:53 PDT 2023 root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1500000 1500000
> >
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > . . .
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 121.1MB/s   00:42
> >
> > # rm ~/FreeBSD-14-TEST.img
> > # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> > (root@localhost) Password for root@CA72-16Gp-ZFS:
> > FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 124.6MB/s   00:41
> >
> >
> > So, faster than what you are reporting for the
> > Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU) context.
> >
> > Note: This is via a PCIe Optane 960 GB media in the PCIe slot.
> >
> >
> > UFS was slightly faster than ZFS for the HoneyComb context, but there
> > is the M.2 vs. PCIe difference as well.
> >
>
> # uname -apKU
> FreeBSD CA78C-WDK23-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT aarch64 1500000 #13 main-n265027-2f06449d6429-dirty: Fri Aug 25 09:20:31 PDT 2023 root@CA78C-WDK23-ZFS:/usr/obj/BUILDs/main-CA78C-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA78C arm64 aarch64 1500000 1500000
>
> Again, a -mcpu= optimized build context for the FreeBSD in operation.
>
> (Still rebooting first. Then . . .)
>
> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 199.3MB/s   00:25
>
> # rm ~/FreeBSD-14-TEST.img
> # scp FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img root@localhost:FreeBSD-14-TEST.img
> . . .
> FreeBSD-14.0-ALPHA2-arm-armv7-GENERICSD-20230818-77013f29d048-264841.img  100% 5120MB 204.9MB/s   00:24
>
>
> So, faster than what you are reporting for the
> Intel(R) Xeon(R) Platinum 8473C (2100.00-MHz K8-class CPU)
> context.
>
> The Windows Dev Kit 2023 figures are generally faster than the
> HoneyComb figures.
>
> ===
> Mark Millard
> marklmi at yahoo.com
