Date: Thu, 18 Aug 2016 12:50:06 +0200
From: Borja Marcos <borjam@sarenet.es>
To: linda@kateley.com
Cc: Chris Watson <bsdunix44@gmail.com>, freebsd-fs@freebsd.org
Subject: Re: HAST + ZFS + NFS + CARP
Message-ID: <354253C2-E42E-4B9C-9931-9135A5A7DFD9@sarenet.es>
In-Reply-To: <6b866b6e-1ab3-bcc5-151b-653e401742bd@kateley.com>
References: <61283600-A41A-4A8A-92F9-7FAFF54DD175@ixsystems.com>
 <20160704183643.GI41276@mordor.lan>
 <AE372BF0-02BE-4BF3-9073-A05DB4E7FE34@ixsystems.com>
 <20160704193131.GJ41276@mordor.lan>
 <E7D42341-D324-41C7-B03A-2420DA7A7952@sarenet.es>
 <20160811091016.GI70364@mordor.lan>
 <1AA52221-9B04-4CF6-97A3-D2C2B330B7F9@sarenet.es>
 <472bc879-977f-8c4c-c91a-84cc61efcd86@internetx.com>
 <20160817085413.GE22506@mordor.lan>
 <465bdec5-45b7-8a1d-d580-329ab6d4881b@internetx.com>
 <20160817095222.GG22506@mordor.lan>
 <52d5b687-1351-9ec5-7b67-bfa0be1c8415@kateley.com>
 <92F4BE3D-E4C1-4E5C-B631-D8F124988A83@gmail.com>
 <6b866b6e-1ab3-bcc5-151b-653e401742bd@kateley.com>

> On 17 Aug 2016, at 20:03, Linda Kateley <lkateley@kateley.com> wrote:
>
> You do risk losing data if you batch zfs send. It is very hard to run
> that real time. You have to take the snap then send the snap. Most
> people run in cron, even if it's not in cron, you would want one to
> finish before you started the next. If you lose the sending host before
> the receive is complete you won't have a full copy. With zfs though you
> will probably still have the data on the sending host, however long it
> takes to bring it back up. RSF-1 runs in the zfs stack and send the
> writes to the second system. It's kind of pricey, but actually much less
> expensive than commercial alternatives.

Doing somewhat critical stuff off cron is not usually a good idea. I do
ZFS replication with a custom program which takes care of some important
points:

- Using holds, so that an accidental snapshot deletion cannot force a
  full send/receive.

- Not starting a new send/receive on a dataset if the previous one
  didn’t finish for whatever reason (the main problem with cron).

- Offering some random variation on the replication period so that, if
  several replications happen to start simultaneously, you don’t end up
  with a periodically overloaded system.

- Not mounting the replicas, so that the receive won’t need a rollback,
  which would be potentially risky.

- Supporting one-to-many replicas, with a different period for each
  destination if required.

I am sorry I can’t share it (company property), but the program is
rather silly anyway. The important work was the decision to include the
features above, and a design decision to avoid destructive and
potentially error-prone operations such as rollbacks.

Most applications that require real time replication are databases, and
they usually include a clustering option which can be much simpler to
manage (and more robust in this case) than filesystem replication.
For other cases, you can often design around the loss of a small amount
of data. I understand that in some cases you have no other option, but
the benefits of asynchronous send/receive are so many, especially if you
are on a tight budget, that it’s well worth trying to make the most of
it.

Borja.
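
The safeguards listed in the message (holds, no overlapping runs, a
randomised period, unmounted replicas) could be sketched roughly as
below. This is not the program described above, which was not shared; it
is only a minimal Python illustration under assumptions. The hold tag,
the lock file location, the use of ssh as transport and every function
name are hypothetical. The only ZFS commands used are `zfs snapshot`,
`zfs hold`, `zfs release`, `zfs send -i` and `zfs receive -u` (receive
without mounting).

```python
import fcntl
import os
import random
import subprocess

HOLD_TAG = "repl"  # hypothetical hold tag; any tag name would do


def jittered_period(period, jitter, rng=random):
    """Return period +/- up to `jitter` seconds, never below zero."""
    return max(0.0, period + rng.uniform(-jitter, jitter))


def build_commands(dataset, prev_snap, new_snap, dest_host, dest_dataset,
                   tag=HOLD_TAG):
    """Return the command lines for one incremental replication step."""
    prev_full = f"{dataset}@{prev_snap}"
    new_full = f"{dataset}@{new_snap}"
    return {
        "snapshot": ["zfs", "snapshot", new_full],
        # The hold makes 'zfs destroy' of this snapshot fail until released.
        "hold": ["zfs", "hold", tag, new_full],
        "send": ["zfs", "send", "-i", prev_full, new_full],
        # -u: do not mount the received file system on the replica.
        "receive": ["ssh", dest_host, "zfs", "receive", "-u", dest_dataset],
        # Run only after a successful receive of new_snap.
        "release": ["zfs", "release", tag, prev_full],
    }


def try_lock(path):
    """Take an exclusive non-blocking lock; return the file, or None if held."""
    fh = open(path, "w")
    try:
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        fh.close()
        return None
    return fh


def replicate_once(dataset, prev_snap, new_snap, dest_host, dest_dataset,
                   lock_dir="/var/run"):
    """Run one step; return False if skipped or failed, True on success."""
    name = f"{dataset}_{dest_host}".replace("/", "_") + ".repl.lock"
    lock = try_lock(os.path.join(lock_dir, name))
    if lock is None:
        return False  # a previous send/receive is still running: skip
    try:
        cmds = build_commands(dataset, prev_snap, new_snap,
                              dest_host, dest_dataset)
        subprocess.run(cmds["snapshot"], check=True)
        subprocess.run(cmds["hold"], check=True)
        send = subprocess.Popen(cmds["send"], stdout=subprocess.PIPE)
        recv = subprocess.run(cmds["receive"], stdin=send.stdout)
        send.stdout.close()
        if send.wait() != 0 or recv.returncode != 0:
            return False  # keep both holds so a later incremental can retry
        subprocess.run(cmds["release"], check=True)
        return True
    finally:
        lock.close()  # closing the file drops the flock
```

A caller would sleep for `jittered_period(period, jitter)` seconds
between calls to `replicate_once`, with one loop per destination so that
each replica can have its own period. No rollback is ever issued; a
failed step simply leaves the held snapshots in place.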
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?354253C2-E42E-4B9C-9931-9135A5A7DFD9>