Date: Mon, 8 Apr 2013 10:29:52 +0200
From: Joar Jegleim <joar.jegleim@gmail.com>
To: Peter Jeremy <peter@rulingia.com>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: Regarding regular zfs
Message-ID: <CAFfb-hqeKqY-8j09BfocDxw5VaBuA=tzS6CZr5Kzor8ZCrBDow@mail.gmail.com>
In-Reply-To: <20130405211249.GB31958@server.rulingia.com>
References: <CAFfb-hpt4iKSb0S2fgQ16Hp51KLWJew1Se32yX1cUPYi6pp72g@mail.gmail.com> <20130405211249.GB31958@server.rulingia.com>

[...]"Are you deleting old snapshots after the newer snapshots have been sent?"[...]

Yeah, the script deletes old snapshots. The slave will usually hold 2 snapshots (the first being the initial snapshot received via zfs send from the master, the second being the latest snapshot received from the master).

[...]"Can you clarify which machine you mean by server in the last line above. I presume you mean the slave machine running "zfs recv". If you monitor the "server" with "vmstat -v 1", "gstat -a" and "zfs-mon -a" (the latter is part of ports/sysutils/zfs-stats) during the "freeze", what do you see? Are the disks saturated or idle? Are the "cache" or "free" values close to zero?"[...]

By the last line, "Everything on the server halts / hangs completely.", I'm talking about the 'slave' (the receiving end). I'll check how the cache is doing, but as I wrote in my previous reply, the 'slave' server is completely unresponsive; nothing works at all for 5-15 seconds. When the server is responsive again (I can ssh in and so on), I can't find anything in dmesg or any log hinting that anything went 'wrong'.

"There was a bug in interface between ZFS ARC and FreeBSD VM that resulted in ARC starvation. This was fixed between 8.2 and 8.3/9.0."

Ah, OK.

"Do you have atime enabled or disabled? What happens when you don't run rsync at the same time? Are you able to break into DDB?"

atime is disabled. When I don't run rsync the server seems OK. I've tried to detect any hang (by ssh'ing into the server and issuing various commands such as top, ls and so on) while not rsync'ing, and there might have been a really minor 'glitch', but it was hardly noticeable and nothing compared to those 5-15 seconds when the backup server is doing the rsync (from the live volume, not a snapshot).

I could try DDB, but I'm going to have to get back to you on that. I haven't debugged the FreeBSD kernel before and the system is in production, so I would have to be cautious. I might be able to try that out during this week.
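
Rather than detecting the hangs by hand over ssh, a small watchdog loop could put timestamps and durations on them, so they can be lined up against the zfs receive and rsync runs. This is only a rough sketch: the function names, the 3 second threshold and the log path are made up for illustration, and nothing here is specific to ZFS. It just notices when a one-second sleep comes back late.

```shell
#!/bin/sh
# Hypothetical stall detector (names and threshold are illustrative assumptions).

# gap_exceeds PREV NOW THRESHOLD
# Succeeds when NOW - PREV (epoch seconds) is greater than THRESHOLD.
gap_exceeds() {
    [ $(( $2 - $1 )) -gt "$3" ]
}

# watch_stalls THRESHOLD
# Sleeps one second at a time and prints a line whenever the wake-up
# arrives more than THRESHOLD seconds after the previous one.
watch_stalls() {
    threshold=$1
    prev=$(date +%s)
    while :; do
        sleep 1
        now=$(date +%s)
        if gap_exceeds "$prev" "$now" "$threshold"; then
            echo "$(date) stall: $(( now - prev ))s since last wake-up"
        fi
        prev=$now
    done
}

# Usage (runs until interrupted; the log path is an assumption):
#   watch_stalls 3 >> /var/tmp/stalls.log
```

If the whole machine freezes, the loop freezes with it and reports the gap once it wakes up again, which is the point. Writing the log to the affected pool may itself be delayed, so a different filesystem is probably the safer place for it.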

[...]Apart from the rsync whilst receiving, everything sounds OK. It's possible that the rsync whilst receiving is triggering a bug.[...]

I sort of think so too, at least since the whole OS is unresponsive / hangs for anything from 5-15 seconds.

--
----------------------
Joar Jegleim
Homepage: http://cosmicb.no
Linkedin: http://no.linkedin.com/in/joarjegleim
fb: http://www.facebook.com/joar.jegleim
AKA: CosmicB @Freenode
----------------------

On 5 April 2013 23:12, Peter Jeremy <peter@rulingia.com> wrote:

> On 2013-Apr-05 12:17:27 +0200, Joar Jegleim <joar.jegleim@gmail.com> wrote:
> >I've got this script that initially zfs send's a whole zfs volume, and
> >for every send after that only sends the diff . So after the initial zfs
> >send, the diff's usually take less than a minute to send over.
>
> Are you deleting old snapshots after the newer snapshots have been sent?
>
> >I've had increasing problems on the 'slave', it seem to grind to a
> >halt for anything between 5-20 seconds after every zfs receive . Everything
> >on the server halts / hangs completely.
>
> Can you clarify which machine you mean by server in the last line above.
> I presume you mean the slave machine running "zfs recv".
>
> If you monitor the "server" with "vmstat -v 1", "gstat -a" and "zfs-mon -a"
> (the latter is part of ports/sysutils/zfs-stats) during the "freeze",
> what do you see? Are the disks saturated or idle? Are the "cache" or
> "free" values close to zero?
>
> ># 16GB arc_max ( server got 30GB of ram, but had a couple 'freeze'
> >situations, suspect zfs.arc ate too much memory)
>
> There was a bug in interface between ZFS ARC and FreeBSD VM that resulted
> in ARC starvation. This was fixed between 8.2 and 8.3/9.0.
>
> >I suspect it may have something to do with the zfs volume being sent
> >is mount'ed on the slave, and I'm also doing the backups from the
> >slave, which means a lot of the time the backup server is rsyncing the
> >zfs volume being updated.
>
> Do you have atime enabled or disabled? What happens when you don't run
> rsync at the same time?
>
> Are you able to break into DDB?
>
> >In my setup have I taken the use case for zfs send / receive too far
> >(?) as in, it's not meant for this kind of syncing and this often, so
> >there's actually nothing 'wrong'.
>
> Apart from the rsync whilst receiving, everything sounds OK. It's
> possible that the rsync whilst receiving is triggering a bug.
>
> --
> Peter Jeremy
>
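
For readers following the thread, the send-the-diff-then-prune cycle discussed above could look roughly like the sketch below. It is not the script from this thread, which was never posted here. The dataset name tank/data, the host name slave and both function names are hypothetical, and the sketch assumes the receiving dataset has the same name as the sending one. Both helpers only print or filter text, so nothing is sent or destroyed.

```shell
#!/bin/sh
# Hypothetical helpers; dataset, host and function names are assumptions.

# old_snapshots KEEP
# Reads snapshot names on stdin, oldest first, one per line, and prints
# every name except the newest KEEP (the candidates for deletion).
old_snapshots() {
    awk -v keep="$1" '
        { name[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }
    '
}

# incremental_send_cmd PREV NEW HOST
# Prints (does not run) a pipeline that sends the difference between
# snapshots PREV and NEW to HOST. The target dataset is taken to be the
# part of NEW before the "@", which is an assumption of this sketch.
incremental_send_cmd() {
    printf 'zfs send -i %s %s | ssh %s zfs recv %s\n' \
        "$1" "$2" "$3" "${2%@*}"
}

# Usage (prints only):
#   zfs list -H -t snapshot -o name -s creation -r tank/data | old_snapshots 2
#   incremental_send_cmd tank/data@b tank/data@c slave
```

Keeping 2 snapshots matches what the slave is described as holding. Reviewing the printed names and command before feeding them to zfs destroy or a shell is the cautious choice on a production box.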