Date: Fri, 5 Apr 2013 15:02:12 +0200
From: Joar Jegleim <joar.jegleim@gmail.com>
To: Peter Maloney <peter.maloney@brockmann-consult.de>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: Regarding regular zfs
Message-ID: <CAFfb-hpz2wR9ad0yqO7vip6m5nGwmniwsZpf2UZ2pTgVfrFSOQ@mail.gmail.com>
In-Reply-To: <515EB744.5000607@brockmann-consult.de>
References: <CAFfb-hpt4iKSb0S2fgQ16Hp51KLWJew1Se32yX1cUPYi6pp72g@mail.gmail.com> <8B0FFF01-B8CC-41C0-B0A2-58046EA4E998@my.gd> <515EB744.5000607@brockmann-consult.de>

You make some interesting points. I don't _think_ the script causes more
than one writing zfs command at a time, and I'm sure 'nothing else' is
doing that either. But I'm gonna check that out, because it does sound
like a logical explanation.

I'm wondering if the rsync from the receiving server (that is: the backup
server is doing rsync from the zfs receive server) could cause the same
problem. It's only reading, though ...

-- 
----------------------
Joar Jegleim
Homepage: http://cosmicb.no
Linkedin: http://no.linkedin.com/in/joarjegleim
fb: http://www.facebook.com/joar.jegleim
AKA: CosmicB @Freenode
----------------------

On 5 April 2013 13:36, Peter Maloney <peter.maloney@brockmann-consult.de> wrote:

> On 2013-04-05 13:07, Damien Fleuriot wrote:
>
> -I've implemented mbuffer for the zfs send / receive operations. With
> mbuffer the sync went a lot faster, but still got the same symptoms
> when the zfs receive is done, the hang / unresponsiveness returns for
> 5-20 seconds
> -I've upgraded to 8.3-RELEASE ( + zpool upgrade and zfs upgrade to
> V28), same symptoms
> -I've upgraded to 9.1-RELEASE, still same symptoms
>
>
> So my question(s) to the list would be:
> In my setup have I taken the use case for zfs send / receive too far
> (?) as in, it's not meant for this kind of syncing and this often, so
> there's actually nothing 'wrong'.
>
>
> I do the same thing on an 8.3-STABLE system, with replication every 20
> minutes (compared to your 15 minutes), and it has worked flawlessly for
> over a year. Before that point, it was hanging often, until I realized that
> all hangs were from when there was more than 1 writing "zfs" command
> running at the same time (snapshot, send, destroy, rename, etc. but not
> list, get, etc.). So now *all my scripts have a common lock between them*
> (just a pid file like in /var/run; cured the hangs), and I don't run manual
> zfs commands without stopping my cronjobs.
> If the hang was caused by a destroy or something during a send, I think
> it would usually unhang when the send is done, do the destroy or whatever
> else was blocking, then be unhung completely, smoothly working. In other
> cases, I think it would be deadlocked.
>
>
> NAME          USED  REFER  USEDCHILD  USEDDS  USEDSNAP  AVAIL  MOUNTPOINT
> tank         38.5T   487G      37.4T    487G      635G  9.54T  /tank
> tank/backup  7.55T  1.01T      5.08T   1.01T     1.46T  9.54T  /tank/backup
> ...
>
> Sends are still quick with 38 T to send. The last replication run started
> 2013-04-05 13:20:00 +0200 and finished 2013-04-05 13:22:18 +0200. I have
> 234 snapshots at the moment (one per 20 min today + one daily for a few
> months).
>
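
For anyone wanting to try the "common lock between all scripts" idea quoted above, here is a minimal sketch of one way it could look. It is not Peter's actual script: the lock path, the function names, and the use of flock(2) on the file (rather than only checking a pid file) are all assumptions made for illustration. The idea is simply that every job that runs a writing zfs command (snapshot, send, destroy, rename, ...) takes the same lock first, so only one of them runs at a time.

```python
import contextlib
import fcntl
import os
import subprocess

# Hypothetical path; the thread only says "a pid file like in /var/run".
LOCK_PATH = "/var/run/zfs-write.lock"


@contextlib.contextmanager
def zfs_write_lock(path=LOCK_PATH, blocking=True):
    """Hold an exclusive advisory lock on `path` for the duration of the with-block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        # With LOCK_NB this raises BlockingIOError if another holder exists.
        fcntl.flock(fd, flags)
        # Record our pid for anyone inspecting the file; the flock is the real lock.
        os.ftruncate(fd, 0)
        os.write(fd, f"{os.getpid()}\n".encode())
        yield
    finally:
        # Closing the descriptor also releases the lock if we held it.
        os.close(fd)


def run_zfs_write(args, path=LOCK_PATH):
    """Run one writing zfs subcommand (hypothetical helper) while holding the lock."""
    with zfs_write_lock(path):
        return subprocess.run(["zfs", *args], check=True)
```

A cron job would then call something like run_zfs_write(["snapshot", "tank/backup@example"]) (dataset and snapshot name made up here), and a send/receive pipeline would be started inside a "with zfs_write_lock():" block so the lock is held until the receive has finished. Read-only commands such as list and get would not need the lock, going by the description above.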