Date: Mon, 23 Aug 2021 09:54:39 +1000
From: George Michaelson <ggm@algebras.org>
To: Alan Somers <asomers@freebsd.org>
Cc: Ben RUBSON <ben.rubson@gmx.com>, freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: ZFS on high-latency devices
Message-ID: <CAKr6gn2RG_PQCj=2dT7Ntqp0nZRLGOOhdXgbm2e++tJvL5WfKg@mail.gmail.com>
In-Reply-To: <CAOtMX2hRuh_9ZOOoQufNT2QG3Ui0S3rJq+L-ox2kxsq1oJMSMA@mail.gmail.com>
References: <YR4mY+b6o7fBJqEN@server.rulingia.com>
 <023225AD-2A97-47C5-9FE4-3ABF1BFD66F1@gmx.com>
 <CAKr6gn0r8xG9HNGOFh1A_usU4tPAYezeZv1chOG_bBMqy_HtXw@mail.gmail.com>
 <CAOtMX2hRuh_9ZOOoQufNT2QG3Ui0S3rJq+L-ox2kxsq1oJMSMA@mail.gmail.com>
I don't think it's sensible to mesh long-delay file constructs into a
pool. Maybe there is a model which permits this, but I think higher
abstractions like Ceph may be a better fit for the kind of distributed
filestore this implies.

I use ZFS send/receive to make "clones" of a zpool/zfs structure, and
they are not bound by the delay problem of live FS delivery: you use
mbuffer to make the transport of the entire ZFS data state more
efficient.

A long time ago, 25+ years ago, I ran NFS over X.25, as well as SMB
(IP over X.25), and it was very painful. Long-haul, long-delay networks
are not a good fit for direct FS read/write semantics if you care about
speed. There isn't enough caching in the world to make the distance
fully transparent.

When I have to use FS-over-<thing> now, it's typically to mount virtual
ISO images for console-FS recovery of a host, and it's bad enough over
150ms-delay fibre links that I don't want to depend on it beyond the
bootstrap phase.

-G

On Mon, Aug 23, 2021 at 9:48 AM Alan Somers <asomers@freebsd.org> wrote:
>
> mbuffer is not going to help the OP.  He's trying to create a pool on
> top of a networked block device.  And if I understand correctly, he's
> connecting over a WAN, not a LAN.  ZFS will never achieve decent
> performance in such a setup.  It's designed as a local file system,
> and assumes it can quickly read metadata off of the disks at any time.
> The OP's best option is to go with "a": encrypt each dataset and send
> them with "zfs send --raw".  I don't know why he thinks that it would
> be "very difficult".  It's quite easy, if he doesn't care about old
> snapshots.  Just:
>
> $ zfs create <crypto options> pool/new_dataset
> $ cp -a pool/old_dataset/* pool/new_dataset/
>
> -Alan
>
> On Sun, Aug 22, 2021 at 5:40 PM George Michaelson <ggm@algebras.org> wrote:
>>
>> I don't want to abuse the subject line too much, but I can highly
>> recommend the mbuffer approach; I've used this repeatedly, BSD-BSD
>> and BSD-Linux.  It definitely feels faster than SSH, since the "no
>> cipher" options were removed and in the confusion of the HPN buffer
>> changes.  But it's not encrypted on the wire.
>>
>> Mbuffer tuning is a bit of a black art: it would help enormously if
>> there were some guidance on this, and personally I've never found the
>> mbuffer -v option to work well: I get no real sense of how full or
>> empty the buffer is, or whether the use of sendmsg/recvmsg-type
>> buffer chains is better or worse.
>>
>> -G
>>
>> On Fri, Aug 20, 2021 at 6:19 PM Ben RUBSON <ben.rubson@gmx.com> wrote:
>> >
>> > > On 19 Aug 2021, at 11:37, Peter Jeremy <peter@rulingia.com> wrote:
>> > >
>> > > (...) or a way to improve throughput doing "zfs recv" to a pool with a high RTT.
>> >
>> > You should use zfs send/receive through mbuffer, which will allow
>> > you to sustain better throughput over high-latency links.
>> > Feel free to play with its buffer size parameters to find the best
>> > settings, depending on your link characteristics.
>> >
>> > Ben
>> >
>> >
>>
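
Concretely, the mbuffer approach Ben and George describe looks roughly
like the sketch below. The host name, dataset names, snapshot name,
port, and the -s/-m sizes are placeholder assumptions to be tuned for
the link, not values from the thread:

  # On the receiving host (assumed here to be "backuphost"): listen on
  # a TCP port, buffer the incoming stream in 1 GB of RAM, and feed it
  # to zfs receive (-u: don't mount the received dataset).
  $ mbuffer -s 128k -m 1G -I 9090 | zfs receive -u backup/mydata

  # On the sending host: stream the snapshot into mbuffer, which ships
  # it to the receiver (-s: block size, -m: total buffer memory).
  $ zfs send mypool/mydata@snap1 | mbuffer -s 128k -m 1G -O backuphost:9090

As George notes, this runs in the clear over a raw TCP socket; piping
through ssh instead trades the speed back for confidentiality.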
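
Alan's option "a" above, sketched end to end with hypothetical dataset
and host names (the encryption properties shown are one common choice,
not the only one):

  # Create an encrypted dataset and copy the data into it; with
  # keyformat=passphrase, zfs create prompts for a passphrase.
  $ zfs create -o encryption=on -o keyformat=passphrase pool/new_dataset
  $ cp -a /pool/old_dataset/* /pool/new_dataset/

  # Snapshot it, then send the raw (still-encrypted) stream; with
  # --raw the receiving side never needs the key.
  $ zfs snapshot pool/new_dataset@backup1
  $ zfs send --raw pool/new_dataset@backup1 | \
      ssh backuphost zfs receive remotepool/new_dataset

Because the blocks stay encrypted both in transit and at rest on the
remote pool, this avoids any need for an encrypted transport.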
