From: Borja Marcos <borjam@sarenet.es>
In-Reply-To: <089F63A7-574B-4646-97C7-D82B226CD4CF@sarenet.es>
Date: Tue, 29 Sep 2009 10:43:41 +0200
Content-Transfer-Encoding: 7bit
Message-Id: <6C7DE346-65C5-4130-86B8-56A60A1DAC28@sarenet.es>
References: <089F63A7-574B-4646-97C7-D82B226CD4CF@sarenet.es>
To: Borja Marcos <borjam@sarenet.es>
X-Mailer: Apple Mail (2.1076)
Cc: freebsd-stable@freebsd.org
Subject: Re: 8.0RC1, ZFS: deadlock


On Sep 29, 2009, at 10:29 AM, Borja Marcos wrote:

>
> Hello,
>
> I have observed a deadlock when using ZFS. We make heavy use of
> zfs send/zfs receive to keep a replica of a dataset on a remote
> machine, which can be done at one-minute intervals. This may be a
> somewhat atypical use of ZFS, but it looks like a great way to keep
> filesystem replicas once this is sorted out.
>
>
> How to reproduce:
>
> Set up two systems. A dataset with heavy I/O activity is replicated
> from the first to the second one. I've used a dataset containing
> /usr/obj while I did a make buildworld.
>
> Replicate the dataset from the first machine to the second one using  
> an incremental send
>
> zfs send -i pool/dataset@Nminus1 pool/dataset@N | ssh destination zfs receive -d pool
>
> When there is read activity on the second system, that is, reads
> from the replicated dataset while zfs receive is updating it, a
> deadlock can occur. We discovered this while testing a server with
> 8 GB RAM that will hopefully go into production soon. A Bacula
> backup agent was running and ZFS deadlocked.
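
For illustration, the incremental send quoted above could be wrapped in a small helper and run once a minute. This is only a sketch under stated assumptions, not the script used in the report: the dataset, host, and pool names are the placeholders from the quoted command, and the "repl-" snapshot prefix and the function names are hypothetical.

```shell
#!/bin/sh
# Placeholders taken from the quoted command; adjust for a real setup.
DATASET="pool/dataset"
DEST_HOST="destination"
DEST_POOL="pool"

# Print the pipeline for one incremental step from snapshot $1 to $2.
# Useful as a dry run before touching any pool.
replication_cmd() {
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs receive -d %s\n' \
        "$DATASET" "$1" "$DATASET" "$2" "$DEST_HOST" "$DEST_POOL"
}

# Take a new snapshot, send the increment since snapshot $1, and print
# the new snapshot name so the caller can use it as the next base.
# Note: the pipeline's exit status is that of the ssh side only.
replicate_once() {
    prev="$1"
    next="repl-$(date +%Y%m%d%H%M%S)"   # hypothetical naming scheme
    zfs snapshot "$DATASET@$next" || return 1
    zfs send -i "$DATASET@$prev" "$DATASET@$next" |
        ssh "$DEST_HOST" zfs receive -d "$DEST_POOL" || return 1
    echo "$next"
}
```

A driver loop would call replicate_once with the last snapshot name, sleep 60 seconds, and repeat. Calling replication_cmd Nminus1 N prints the same pipeline as the one quoted above.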

Sorry, I forgot to explain what was happening on the second system (the
one receiving the incremental snapshots) when the deadlock occurred.

It was just running an endless loop that copied the contents of /usr/obj
to another dataset, in order to keep the read activity going.

That is how it deadlocked. On the original test system, an rsync had
the same effect.
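
A read load of this kind might look roughly like the sketch below. The message does not say which copy tool was used, so the tar pipe is an assumption, as are the destination path /pool/scratch and the function names.

```shell
#!/bin/sh
# Copy the whole tree under $1 into the existing directory $2.
# Every file under $1 is read once per pass.
read_load_pass() {
    tar -cf - -C "$1" . | tar -xf - -C "$2"
}

# Repeat the copy forever to keep read activity on the replicated
# dataset while zfs receive updates it. Not called automatically.
read_load_loop() {
    while :; do
        read_load_pass "$1" "$2"
    done
}
```

On the receiving system this would be started with something like read_load_loop /usr/obj /pool/scratch while the incremental receives are running.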


Borja