Date:      Fri, 20 May 2011 08:33:59 +0100
From:      Luke Marsden <luke-lists@hybrid-logic.co.uk>
To:        Borja Marcos <borjam@sarenet.es>
Cc:        Charles Sprickman <spork@bway.net>, stable@FreeBSD.org, Andriy Gapon <avg@FreeBSD.org>, Jeremy Chadwick <freebsd@jdc.parodius.com>
Subject:   Re: 8.1R possible zfs snapshot livelock?
Message-ID:  <1305876839.13971.5.camel@pow>
In-Reply-To: <FCE5F082-A3BF-4A21-B2E3-FEF3EA715F2C@sarenet.es>
References:  <alpine.OSX.2.00.1105170120510.1983@hotlap.nat.fasttrackmonkey.com> <20110517073029.GA44359@icarus.home.lan> <4DD25264.8040305@FreeBSD.org> <20110517112952.GA48610@icarus.home.lan> <FCE5F082-A3BF-4A21-B2E3-FEF3EA715F2C@sarenet.es>

On Wed, 2011-05-18 at 14:05 +0200, Borja Marcos wrote:
> On May 17, 2011, at 1:29 PM, Jeremy Chadwick wrote:
>
> > * ZFS send | ssh zfs recv results in ZFS subsystem hanging; 8.1-RELEASE;
> >   February 2011:
> >   http://lists.freebsd.org/pipermail/freebsd-fs/2011-February/010602.html
>
> I found a reproducible deadlock condition actually. If you keep some
> I/O activity on a dataset on which you are receiving a ZFS incremental
> snapshot at the same time, it can deadlock.
>
> Imagine this situation: Two servers, A and B. A dataset on server A is
> replicated at regular intervals to B, so that you keep a reasonably up
> to date copy.
>
> Something like:
>
> (Running on server A):
>
> zfs snapshot thepool/thedataset@thistime
> zfs send -Ri thepool/thedataset@previoustime \
>     thepool/thedataset@thistime | ssh serverB zfs receive -d thepool
>
> It works, but I suffered a deadlock when one of the periodic "daily"
> scripts was running. Doing some tests, I saw that ZFS can deadlock if
> you do a zfs receive onto a dataset which has some read activity.
> Disabling atime didn't help either.
>
> But if you make sure *not* to access the replicated dataset it works,
> I haven't seen it failing otherwise.
>
> If you wish to reproduce it, try creating a dataset for /usr/obj,
> running make buildworld on it, replicating at, say, 30 or 60 second
> intervals, and keep several scripts (or rsync) reading the target
> dataset files and just copying them to another place in the usual,
> "classic" way. (example: tar cf - . | ( cd /destination && tar xf - ))
>

Is there a PR for this?  I'd like to see it addressed, since read-only
I/O on a dataset which is being updated by `zfs recv` is an important
part of what we plan to do with ZFS on FreeBSD.
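
For anyone trying to reproduce this, here is a rough sketch of the
replication step from Borja's description. The dataset, snapshot and host
names are his placeholders; the DRY_RUN switch and the function names are
my own assumptions, not anything from the thread. By default it only
prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the snapshot + incremental send/receive step quoted above.
# Assumptions (not from the thread): DRY_RUN, replication_cmd, replicate_once.

POOL="thepool"
DATASET="thepool/thedataset"
REMOTE="serverB"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the incremental send/receive pipeline for two snapshot names.
replication_cmd() {
    prev="$1"
    cur="$2"
    printf 'zfs send -Ri %s@%s %s@%s | ssh %s zfs receive -d %s\n' \
        "$DATASET" "$prev" "$DATASET" "$cur" "$REMOTE" "$POOL"
}

# Snapshot the source dataset, then replicate the increment to the remote.
replicate_once() {
    prev="$1"
    cur="$2"
    snap="zfs snapshot ${DATASET}@${cur}"
    send="$(replication_cmd "$prev" "$cur")"
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n%s\n' "$snap" "$send"
    else
        $snap && sh -c "$send"
    fi
}

replicate_once previoustime thistime
```

To trigger the hang as described, this would run on server A at 30 or 60
second intervals while something on server B keeps reading the received
dataset, e.g. the tar pipeline Borja mentions.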

-- 
Best Regards,
Luke Marsden
CTO, Hybrid Logic Ltd.

Web: http://www.hybrid-cluster.com/
Hybrid Web Cluster - cloud web hosting

Phone: +447791750420




