From: Peter Maloney <peter.maloney@brockmann-consult.de>
Date: Fri, 05 Apr 2013 13:36:36 +0200
To: Damien Fleuriot
Cc: freebsd-fs@freebsd.org
Subject: Re: Regarding regular zfs

On 2013-04-05 13:07, Damien Fleuriot wrote:
> - I've implemented mbuffer for the zfs send / receive operations.
>   With mbuffer the sync went a lot faster, but I still got the same
>   symptoms: when the zfs receive is done, the hang / unresponsiveness
>   returns for 5-20 seconds.
> - I've upgraded to 8.3-RELEASE (+ zpool upgrade and zfs upgrade to
>   V28), same symptoms.
> - I've upgraded to 9.1-RELEASE, still the same symptoms.
>
> So my question(s) to the list would be:
> In my setup, have I taken the use case for zfs send / receive too far?
> As in, it's not meant for this kind of syncing this often, so there's
> actually nothing 'wrong'.

I do the same thing on an 8.3-STABLE system, with replication every 20
minutes (compared to your 15 minutes), and it has worked flawlessly for
over a year.

Before that, it hung often, until I realized that all the hangs happened
when more than one writing "zfs" command was running at the same time
(snapshot, send, destroy, rename, etc., but not list, get, etc.). So now
*all my scripts have a common lock between them* (just a pid file, like
in /var/run; that cured the hangs), and I don't run manual zfs commands
without stopping my cron jobs first.

If the hang was caused by a destroy or something else during a send, I
think it would usually unhang when the send is done, do the destroy or
whatever else was blocking, and then continue working smoothly. In other
cases, I think it would be deadlocked.

NAME         USED   REFER  USEDCHILD  USEDDS  USEDSNAP  AVAIL  MOUNTPOINT
tank         38.5T  487G   37.4T      487G    635G      9.54T  /tank
tank/backup  7.55T  1.01T  5.08T      1.01T   1.46T     9.54T  /tank/backup
...

Sends are still quick with 38 T in the pool. The last replication run
started 2013-04-05 13:20:00 +0200 and finished 2013-04-05 13:22:18
+0200. I have 234 snapshots at the moment (one per 20 minutes today,
plus one daily for a few months).
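For reference, the mbuffer setup quoted at the top is usually wired up
something like the following. This is only an illustrative sketch: the
host name, port, buffer sizes, and dataset names are made up, not taken
from either of our setups.

```shell
# On the receiving box: listen on a port, buffer the stream in RAM, and
# feed it to zfs receive. (backup/tank, port 9090, and the 1G buffer
# are example values.)
mbuffer -s 128k -m 1G -I 9090 | zfs receive -F backup/tank

# On the sending box: incremental send between two snapshots, buffered
# over the network to the receiver started above.
zfs send -i tank@snap1 tank@snap2 | mbuffer -s 128k -m 1G -O backuphost:9090
```

The point of the buffer is to decouple the bursty zfs send output from
the network, so neither side stalls waiting on the other.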
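In case it helps, the common lock I mean is nothing fancy; a minimal
/bin/sh sketch is below. This is a hypothetical reconstruction, not my
actual script: the lock path and function names are made up, and in
practice the pid file sits under /var/run.

```shell
#!/bin/sh
# Hypothetical pid-file lock shared by every zfs cron script, so that
# only one writing "zfs" command (snapshot, send, destroy, rename, ...)
# runs at a time. /tmp is used here only so the sketch runs unprivileged.
LOCKFILE="${LOCKFILE:-/tmp/zfs-replication.pid}"

acquire_lock() {
    # Refuse to run if another script's pid file exists and that process
    # is still alive; a stale pid file is silently taken over.
    if [ -e "$LOCKFILE" ] && kill -0 "$(cat "$LOCKFILE")" 2>/dev/null; then
        return 1
    fi
    echo "$$" > "$LOCKFILE"
}

release_lock() {
    rm -f "$LOCKFILE"
}

# Each cron job wraps its writing zfs commands like this:
#   acquire_lock || exit 0   # another job holds the lock; skip this run
#   zfs snapshot tank@$(date +%Y%m%d-%H%M)
#   release_lock
```

On FreeBSD you can get the same effect without hand-rolled pid handling
by running each job under lockf(1), e.g.
lockf -t 0 /var/run/zfs-replication.lock /path/to/replicate.sh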