Date: Thu, 20 May 2021 11:25:30 +0100
From: Steven Hartland <killing@multiplay.co.uk>
To: mike tancsa <mike@sentex.net>
Cc: Alan Somers <asomers@freebsd.org>, freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: speeding up zfs send | recv (update)
Message-ID: <CAHEMsqYf4JOL22R3%2B13kqSOaDMydnpsq9Z2mR4EFBg7u78FjSQ@mail.gmail.com>
In-Reply-To: <f6ea3387-faf8-4c63-d1e7-906fa397b00b@sentex.net>
References: <866d6937-a4e8-bec3-d61b-07df3065fca9@sentex.net>
 <CAOtMX2gifUmgqwSKpRGcfzCm_=BX_szNF1AF8WTMfAmbrJ5UWA@mail.gmail.com>
 <f6ea3387-faf8-4c63-d1e7-906fa397b00b@sentex.net>

What is your pool structure / disk types?

On Mon, 17 May 2021 at 16:58, mike tancsa <mike@sentex.net> wrote:
> On 5/13/2021 11:37 AM, Alan Somers wrote:
> > On Thu, May 13, 2021 at 8:45 AM mike tancsa <mike@sentex.net
> > <mailto:mike@sentex.net>> wrote:
> >
> >     For offsite storage, I have been doing a zfs send across a 10G
> >     link and noticed something I don't understand with respect to
> >     speed.  I have a
> >
> > Is this a high latency link?  ZFS send streams can be bursty.  Piping
> > the stream through mbuffer helps with that.  Just google "zfs send
> > mbuffer" for some examples.  And be aware that your speed may be
> > limited by the sender.  Especially if those small files are randomly
> > spread across the platter, your sending server's disks may be the
> > limiting factor.  Use gstat to check.
> > -Alan
>
> Just a quick follow up.  I did some tests with just mbuffer, with
> mbuffer and ssh, and with just ssh (aes128-gcm), for both an
> uncompressed and a compressed stream (zfs send vs zfs send -c).
> Generally, I didn't find much of a difference.  I was testing on a
> production server that is generally uniformly busy, so the numbers
> won't be 100% reliable, but I think they are close enough, as there is
> not much variance in the background load or in the results.
>
> I tried this both with datasets that are backups of mail spools (LOTS
> of little files and big directories) and with datasets holding a few
> big files.
>
> On the mail spool, just via mbuffer (no ssh involved at all):
>
> zfs send
> summary:  514 GiByte in 1h 09min 35.9sec - average of 126 MiB/s
> zfs send -c
> summary:  418 GiByte in 1h 05min 58.5sec - average of 108 MiB/s
>
> The same dataset sent just through OpenSSH took 1h:06m (zfs send) and
> 1h:01m (zfs send -c).
>
> On the large dataset (large VMDK files) the pattern was similar.  I
> did find one interesting thing when testing with a smaller dataset of
> just 12G: as the server has 65G of RAM, 29 of which is allocated to
> ARC, sending a stream with -c made a giant difference.  I guess there
> is some efficiency in sending something that is already compressed in
> ARC?  Or maybe it is just all cache effect.
>
> Testing with a dataset of about 1TB of referenced data, using mbuffer
> with and without ssh, and just ssh:
>
> zfs send with mbuffer and ssh
> summary:  772 GiByte in 51min 06.2sec - average of 258 MiB/s
> zfs send -c
> summary:  772 GiByte in 1h 22min 09.3sec - average of 160 MiB/s
>
> The same dataset just with ssh: 53min for zfs send and 55min for
> zfs send -c.
>
> And just mbuffer (no ssh):
>
> summary:  772 GiByte in 56min 45.7sec - average of 232 MiB/s (zfs send -c)
> summary: 1224 GiByte in 53min 20.4sec - average of 392 MiB/s (zfs send)
>
> This seems to imply the disk is the bottleneck.  mbuffer doesn't seem
> to make much of a difference either way.  Straight up ssh looks to be
> fine / best.
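>
> For reference, the three variants compared above look roughly like
> this (pool, snapshot, and host names are placeholders, and the mbuffer
> buffer sizes are only examples):
>
>   # ssh only (aes128-gcm), plain vs compressed stream
>   zfs send tank/data@snap | \
>       ssh -c aes128-gcm@openssh.com backuphost zfs recv -u backup/data
>   zfs send -c tank/data@snap | \
>       ssh -c aes128-gcm@openssh.com backuphost zfs recv -u backup/data
>
>   # mbuffer + ssh: smooth out the bursty send stream before it hits
>   # the network
>   zfs send tank/data@snap | mbuffer -s 128k -m 1G | \
>       ssh -c aes128-gcm@openssh.com backuphost zfs recv -u backup/data
>
>   # mbuffer only (no ssh): mbuffer carries the stream over TCP itself,
>   # with "mbuffer -s 128k -m 1G -I 9090 | zfs recv -u backup/data"
>   # running on the receiving side
>   zfs send tank/data@snap | mbuffer -s 128k -m 1G -O backuphost:9090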
>
> Next step is to allocate a pair of SSDs as special allocation class
> vdevs to see if it starts to make a difference for all that metadata.
> I guess I will have to send/resend the datasets to make sure they make
> full use of the special vdevs.
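>
> A rough sketch of that step, with made-up device and dataset names
> (special vdevs only affect blocks written after they are added, which
> is why the datasets need to be re-sent):
>
>   # add a mirrored pair of SSDs as a special allocation class vdev
>   zpool add tank special mirror ada4 ada5
>
>   # optionally also steer small data blocks to the SSDs, per dataset
>   zfs set special_small_blocks=32K tank/mailspool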
>
>     ----Mike
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"