From: Alan Somers
Date: Thu, 13 May 2021 09:37:24 -0600
Subject: Re: speeding up zfs send | recv
To: mike tancsa
Cc: freebsd-fs

On Thu, May 13, 2021 at 8:45 AM mike tancsa wrote:

> For offsite storage, I have been doing a zfs send across a 10G link and
> noticed something I don't understand with respect to speed. I have a
> number of datasets I am sending, one which contains a great many
> Maildirs / files and others that a few large 30-60G files (vm disk
> images). When I am sending the data set with the mail spool (many small
> files and directories), transfer tends to be markedly slower. Looking
> at the cacti graph, it seems to hover around 500Mb/s through an
> aes128-gcm cipher when sending the mail spool, vs sending the dataset
> that has the VMs on it, around 2.5Gb/s (both on a 5min average)...
>
> Why would the mail spool send be so slow compared to the sends where
> datasets only have a few large files ?
>
> One thing I am wondering is, could it be due to the amount of snapshots
> I have ? For each, I have about 60-100 snapshots. I am only sending a
> copy based on the latest snapshot, but I guess that's a lot of
> calculations to go through in order to get a complete image. However, I
> would have thought that would impact both types of datasets equally ?
> e.g. on my oldest mailspool snapshot, I see 60G of difference from the
> oldest snapshot on a dataset that's about 600GB in size
>
> By contrast, the dataset with VM images, is 300G and the oldest snapshot
> shows just 16G of difference and has a total of 93 snapshots.
>
> Is there anything I can do to speed up the send ? The recv side has lots
> of spare CPU. I dont see the disk blocking at all. The sending side is
> pretty busy, but I would imagine equally busy across all data sets.
>
> sender is a recent RELENG_12, recv side is RELENG_13
>
> as a side note, is zstd ever nice! On a different dataset that has a
> lot of big ass json files. I am seeing refcompressratio 22.15x vs
> 13.19x for the old lz4.
>
> ---Mike
>

Is this a high latency link? ZFS send streams can be bursty. Piping the
stream through mbuffer helps with that. Just google "zfs send mbuffer" for
some examples. And be aware that your speed may be limited by the sender.
Especially if those small files are randomly spread across the platter, your
sending server's disks may be the limiting factor. Use gstat to check.

-Alan
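
For anyone who wants a starting point for the mbuffer suggestion, here is a minimal sketch that only builds the pipeline as a string, so it can be reviewed before anything is run. The snapshot name, host name, target dataset, and the 128k / 1G buffer sizes are illustrative assumptions, not values from this thread, and the helper `build_send_pipeline` is hypothetical. It assumes mbuffer is installed on both ends and that the stream travels over ssh.

```python
import shlex


def build_send_pipeline(snapshot, remote_host, remote_dataset,
                        block_size="128k", buffer_mem="1G"):
    """Return a shell pipeline that buffers a zfs send stream on both ends.

    All arguments are placeholders chosen by the caller; nothing is executed.
    """
    # mbuffer: -q quiet, -s block size, -m total memory used for buffering.
    buf = shlex.join(["mbuffer", "-q", "-s", block_size, "-m", buffer_mem])

    # Remote side: smooth out the bursty stream again before zfs recv.
    remote = buf + " | " + shlex.join(["zfs", "recv", remote_dataset])

    # Local side: zfs send -> mbuffer -> ssh, with the remote pipeline quoted
    # as a single argument to ssh.
    stages = [
        shlex.join(["zfs", "send", snapshot]),
        buf,
        shlex.join(["ssh", remote_host, remote]),
    ]
    return " | ".join(stages)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_send_pipeline("tank/mail@latest",
                              "backup.example.org",
                              "backup/mail"))
```

Printing the command rather than running it keeps the sketch safe to try anywhere. While a real send is running, something like `gstat -p` on the sender should show whether the physical disks are close to 100% busy, which would point at the sender rather than the link.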