From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 22:00:35 2009
From: Martin Matuska <mm@FreeBSD.org>
Date: Mon, 14 Dec 2009 22:39:26 +0100
To: Borja Marcos
Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek, Ronald Klop
Subject: Re: zfs receive gives: internal error: Argument list too long
Message-ID: <4B26B08E.5000203@FreeBSD.org>
In-Reply-To: <495F94EF-8F57-440D-8810-F40E40DE69D5@sarenet.es>
References: <20091029205121.GB3418@garage.freebsd.pl> <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> <20091214154750.GF1666@garage.freebsd.pl> <495F94EF-8F57-440D-8810-F40E40DE69D5@sarenet.es>

I was unable to reproduce
the panic (with 8.0-RELEASE-p1 + Pawel's patch or with my patch).

My patch can be split into two OpenSolaris changesets. One is 8986, which is exactly pjd's patch. The other is 7994, covering bug ID 6764159: restore_object() makes a call that can block while having a tx open but not yet committed. So to make life easier, I have split this and use two patches (which together make up my old patch):

a) 6764159_restore_blocking.patch
b) zfs_recv_E2BIG.patch

I have also encountered a problem with recursive zfs snapshots of previously transferred datasets. On many of my systems, "zfs snapshot -r tank@xyz" just did not work, failing with the following error:

zfs snapshot -r failed because filesystem was busy

Patch links:
http://mfsbsd.vx.sk/patches/6764159_restore_blocking.patch
http://mfsbsd.vx.sk/patches/6462803_zfs_snapshot_busy.patch
http://people.freebsd.org/~pjd/patches/zfs_recv_E2BIG.patch

Related OpenSolaris links:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6462803 (zfs snapshot busy)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6764159 (restore_object blocking)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6801979 (zfs receive E2BIG)

I have been running all three patches for several days on about 30-40 amd64 servers with 8 CPU cores each, under intensive zfs snapshot -r and zfs send/receive load. No panics or other problems so far.

Borja Marcos wrote:
> On Dec 14, 2009, at 4:47 PM, Pawel Jakub Dawidek wrote:
>
>> On Tue, Nov 03, 2009 at 12:08:54PM +0100, Borja Marcos wrote:
>>
>>> On Oct 29, 2009, at 9:51 PM, Pawel Jakub Dawidek wrote:
>>>
>>> It's caused a panic for me on 8.0-RC2/amd64. Seems to be a new
>>> problem; I never saw a panic in this situation before.
>>>
>>> How to reproduce: with /usr/src and /usr/obj in a dataset, just
>>>
>>> cd /usr/src
>>> make clean
>>>
>>> Instant panic, in less than 20 seconds.
>>>
>>> Trying to get panic information, unfortunately I'm running on VMware
>>> Fusion and the silly thing doesn't offer the equivalent of a serial
>>> console.
>>>
>> Martin, this is the panic report I was referring to. Could you please
>> try to reproduce it? Maybe first with my patch to confirm it is
>> reproducible, and then with your patch to confirm it has no such
>> problem? I'd be very grateful if you could do that. I don't want
>> something to go into the tree if there might be a problem with the
>> patch.
>
> It was me, not Martin :)
>
> I will try to reproduce again. By the way, any news about the zfs
> receive deadlock when accessing the target dataset?
>
> Borja.