To: freebsd-fs <freebsd-fs@freebsd.org>
From: mike tancsa <mike@sentex.net>
Subject: speeding up zfs send | recv
Message-ID: <866d6937-a4e8-bec3-d61b-07df3065fca9@sentex.net>
Date: Thu, 13 May 2021 10:45:50 -0400

For offsite storage, I have been doing a zfs send across a 10G link and noticed something I don't understand with respect to speed. I have a number of datasets I am sending: one contains a great many Maildirs / files, and others have a few large 30-60G files (VM disk images). When I am sending the dataset with the mail spool (many small files and directories), the transfer tends to be markedly slower. Looking at the cacti graph, it hovers around 500Mb/s through an aes128-gcm cipher when sending the mail spool, vs around 2.5Gb/s when sending the dataset that holds the VMs (both on a 5min average). Why would the mail spool send be so slow compared to the sends where datasets only have a few large files?

One thing I am wondering is, could it be due to the number of snapshots I have? For each dataset, I have about 60-100 snapshots. I am only sending a copy based on the latest snapshot, but I guess that's a lot of calculation to go through in order to get a complete image. However, I would have thought that would impact both types of datasets equally? e.g.
on my oldest mail spool snapshot, I see 60G of difference from the oldest snapshot, on a dataset that's about 600GB in size. By contrast, the dataset with the VM images is 300G, and its oldest snapshot shows just 16G of difference, with a total of 93 snapshots.

Is there anything I can do to speed up the send? The recv side has lots of spare CPU, and I don't see the disk blocking at all. The sending side is pretty busy, but I would imagine it is equally busy across all datasets. The sender is a recent RELENG_12, the recv side is RELENG_13.

As a side note, is zstd ever nice! On a different dataset that has a lot of big-ass JSON files, I am seeing refcompressratio of 22.15x vs 13.19x for the old lz4.

    ---Mike
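P.S. What I'm planning to try next to narrow down where the bottleneck is (traversal on the sender, the cipher, or the network). This is just a sketch; the pool, dataset, snapshot, and host names are placeholders for my own, and it assumes pv and mbuffer are installed from ports:

```shell
# 1. Raw send throughput on the sending box alone -- no network, no
#    cipher involved. If this is already slow for the mail spool, the
#    bottleneck is block traversal on the sender, not the transport.
zfs send tank/mailspool@latest | pv > /dev/null

# 2. Same send, but with blocks kept compressed as stored on disk
#    (zfs send -c), which cuts the amount of data actually pushed:
zfs send -c tank/mailspool@latest | pv > /dev/null

# 3. Over the wire, with buffering on both ends so a bursty,
#    metadata-heavy send doesn't stall the pipe waiting on the network:
zfs send -c tank/mailspool@latest | mbuffer -s 128k -m 1G \
    | ssh -c aes128-gcm@openssh.com recvhost \
      'mbuffer -s 128k -m 1G | zfs recv -u backup/mailspool'
```

Comparing the pv numbers from steps 1 and 2 against what cacti shows for the ssh transfer should at least say whether the slowdown is on-disk layout or the pipeline.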