Date:      Thu, 18 Aug 2016 12:50:06 +0200
From:      Borja Marcos <borjam@sarenet.es>
To:        linda@kateley.com
Cc:        Chris Watson <bsdunix44@gmail.com>, freebsd-fs@freebsd.org
Subject:   Re: HAST + ZFS + NFS + CARP
Message-ID:  <354253C2-E42E-4B9C-9931-9135A5A7DFD9@sarenet.es>
In-Reply-To: <6b866b6e-1ab3-bcc5-151b-653e401742bd@kateley.com>
References:  <61283600-A41A-4A8A-92F9-7FAFF54DD175@ixsystems.com> <20160704183643.GI41276@mordor.lan> <AE372BF0-02BE-4BF3-9073-A05DB4E7FE34@ixsystems.com> <20160704193131.GJ41276@mordor.lan> <E7D42341-D324-41C7-B03A-2420DA7A7952@sarenet.es> <20160811091016.GI70364@mordor.lan> <1AA52221-9B04-4CF6-97A3-D2C2B330B7F9@sarenet.es> <472bc879-977f-8c4c-c91a-84cc61efcd86@internetx.com> <20160817085413.GE22506@mordor.lan> <465bdec5-45b7-8a1d-d580-329ab6d4881b@internetx.com> <20160817095222.GG22506@mordor.lan> <52d5b687-1351-9ec5-7b67-bfa0be1c8415@kateley.com> <92F4BE3D-E4C1-4E5C-B631-D8F124988A83@gmail.com> <6b866b6e-1ab3-bcc5-151b-653e401742bd@kateley.com>


> On 17 Aug 2016, at 20:03, Linda Kateley <lkateley@kateley.com> wrote:
>
> You do risk losing data if you batch zfs send. It is very hard to run
> that real time. You have to take the snap then send the snap. Most
> people run in cron, even if it's not in cron, you would want one to
> finish before you started the next. If you lose the sending host before
> the receive is complete you won't have a full copy. With zfs though you
> will probably still have the data on the sending host, however long it
> takes to bring it back up. RSF-1 runs in the zfs stack and send the
> writes to the second system. It's kind of pricey, but actually much less
> expensive than commercial alternatives.

Running somewhat critical jobs from cron is usually not a good idea. I do
ZFS replication with a custom program that takes care of a few important
things:

- Using holds, so that an accidental snapshot deletion can’t force a
full send/receive.

- Not starting a new send/receive on a dataset if the previous one
hasn’t finished for whatever reason (the main problem with cron).

- Allowing some random variation in the replication period so that, if
several replications happen to start simultaneously, you don’t end up
with a periodically overloaded system.

- Avoiding mounting the replicas so that the receive won’t need a
rollback, which would be potentially risky.

- Supporting one-to-many replicas, with a different period for each
destination if required.
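
To illustrate the no-overlap and random-variation points above, here is
a minimal sketch. It is not the program mentioned in this message; the
function names, the lock-file naming and the use of flock(2) are all
assumptions made for the example. It takes one lock per
dataset/destination pair, which also fits one-to-many replication where
each destination has its own period.

```python
import fcntl
import os
import random


def jittered_period(period_s, jitter_fraction, rng=random):
    """Return the period varied by up to +/- jitter_fraction of itself."""
    spread = period_s * jitter_fraction
    return period_s + rng.uniform(-spread, spread)


def try_lock(dataset, destination, lock_dir):
    """Take an exclusive, non-blocking lock for one dataset/destination pair.

    Returns an open file object that holds the lock, or None if a
    previous replication run still holds it. Closing the returned file
    releases the lock.
    """
    # Hypothetical naming scheme: one lock file per dataset/destination.
    name = f"{dataset}@{destination}".replace("/", "_") + ".lock"
    handle = open(os.path.join(lock_dir, name), "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else is still replicating this pair: skip this round.
        handle.close()
        return None
    return handle
```

A scheduler loop would call try_lock() before each round, skip the round
when it returns None, and sleep for jittered_period() between rounds
instead of a fixed interval.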

I am sorry I can’t share it (company property), but the program is
rather silly anyway. The important work was deciding to have the
features above, together with a design decision to avoid destructive
and potentially error-prone operations such as rollbacks.
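
As a rough sketch of what a non-destructive replication step might look
like (again, not the actual program), the functions below only build
the zfs command lines and run nothing. The dataset names, the snapshot
names and the "replica" hold tag are hypothetical. The sketch assumes
that "zfs receive -u" is how the replica is kept unmounted, and it
deliberately never passes -F, so the receive cannot roll anything back.

```python
def snapshot_cmds(dataset, snap, hold_tag):
    """Create a snapshot and place a hold on it right away."""
    full = f"{dataset}@{snap}"
    return [
        ["zfs", "snapshot", full],
        # The hold makes "zfs destroy" of this snapshot fail until released.
        ["zfs", "hold", hold_tag, full],
    ]


def send_cmd(dataset, prev_snap, new_snap):
    """Incremental send from the last replicated snapshot to the new one."""
    return ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]


def receive_cmd(target):
    """Receive without mounting (-u) and without forcing a rollback (no -F)."""
    return ["zfs", "receive", "-u", target]


def release_cmd(dataset, old_snap, hold_tag):
    """Release the hold on a snapshot that is no longer the incremental base."""
    return ["zfs", "release", hold_tag, f"{dataset}@{old_snap}"]
```

The old hold would only be released once the receive has completed, so
that there is always a held snapshot common to both sides to use as the
base of the next incremental send.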


Most applications that require real-time replication are databases, and
they usually include a clustering option which can be much simpler to
manage (and more robust in this case) than filesystem replication.

For other cases, you can often design around the loss of a small amount
of data. I understand that in some cases you have no other option, but
the benefits of asynchronous send/receive are so many, especially if
you are on a tight budget, that it’s well worth trying to make the most
of it.






Borja.




