From owner-freebsd-fs@FreeBSD.ORG  Thu Jul 18 09:34:58 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id C5400504;
 Thu, 18 Jul 2013 09:34:58 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
 by mx1.freebsd.org (Postfix) with ESMTP id B404C1AA;
 Thu, 18 Jul 2013 09:34:57 +0000 (UTC)
Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua
 [212.40.38.100])
 by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA19305;
 Thu, 18 Jul 2013 12:34:55 +0300 (EEST)
 (envelope-from avg@FreeBSD.org)
Received: from localhost ([127.0.0.1])
 by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
 id 1Uzkbv-0004kP-I2; Thu, 18 Jul 2013 12:34:55 +0300
Message-ID: <51E7B686.4090509@FreeBSD.org>
Date: Thu, 18 Jul 2013 12:33:58 +0300
From: Andriy Gapon <avg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/20130708 Thunderbird/17.0.7
MIME-Version: 1.0
To: Adrian Chadd <adrian@FreeBSD.org>, Konstantin Belousov <kib@FreeBSD.org>
Subject: Re: Deadlock in nullfs/zfs somewhere
References: <CAJ-Vmomy3MrkSwJLQUGnDuD3EC3HzrudEghSDMeDwzVdaFNpLg@mail.gmail.com>
 <51DCFEDA.1090901@FreeBSD.org>
 <CAJ-VmokctCmV4+y17uvqO9wXEyh0s+aXZ9nggvoAgP5+ZHSgFA@mail.gmail.com>
 <51E59FD9.4020103@FreeBSD.org>
 <CAJ-VmokR8jJpdRc_kBJzhW4_R1pJnj3UPfsG5ANpq-kEGwCP9g@mail.gmail.com>
 <51E67F54.9080800@FreeBSD.org>
 <CAJ-Vmonk2HAzX38-mbL8hwxiUfL6JyJrMTq0dTBctW=P4dfyEQ@mail.gmail.com>
In-Reply-To: <CAJ-Vmonk2HAzX38-mbL8hwxiUfL6JyJrMTq0dTBctW=P4dfyEQ@mail.gmail.com>
X-Enigmail-Version: 1.5.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
X-List-Received-Date: Thu, 18 Jul 2013 09:34:58 -0000

on 17/07/2013 20:19 Adrian Chadd said the following:
> On 17 July 2013 04:26, Andriy Gapon <avg@freebsd.org> wrote:
>> One possibility is to add getnewvnode_reserve() calls before the ZFS transaction
>> beginnings in the places where a new vnode/znode may have to be allocated within
>> a transaction.
>> This looks like a quick and cheap solution but it makes the code somewhat messier.
>>
>> Another possibility is to change something in VFS machinery, so that VOP_RECLAIM
>> getting blocked for one filesystem does not prevent vnode allocation for other
>> filesystems.
>>
>> I could think of other possible solutions via infrastructural changes in VFS or
>> ZFS...
> 
> Well, what do others think? This seems like a showstopper for systems
> with lots and lots of ZFS filesystems doing lots and lots of activity.
> 

Looks like others are not speaking yet :-)

My current idea is that ZFS should set MNTK_SUSPEND in the zfs_suspend_fs() path
before acquiring its z_teardown* locks.  This would make ZFS's intentions
visible to VFS.  VFS should then avoid calling VOP_RECLAIM on a suspended ZFS
filesystem, which in turn should keep vnlru_free() from getting stuck.
Hopefully that breaks the deadlock cycle.
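
To make the intended ordering concrete, here is a rough userland model of the
idea, not kernel code.  Everything prefixed with toy_/TOY_ is made up for
illustration, and the assumption that the vnode recycler simply skips vnodes
of a mount with the suspend flag set is mine; the real VFS paths are more
involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MNTK_SUSPEND; the value is arbitrary. */
#define TOY_MNTK_SUSPEND 0x1u

struct toy_mount {
	unsigned kern_flag;	/* models mnt_kern_flag */
	bool teardown_locked;	/* models the z_teardown* locks being held */
};

struct toy_vnode {
	struct toy_mount *mp;
	bool reclaimed;
};

/* Proposed ordering: publish the suspension first, then take the lock. */
void
toy_suspend_fs(struct toy_mount *mp)
{
	mp->kern_flag |= TOY_MNTK_SUSPEND;
	mp->teardown_locked = true;
}

/* Reverse order on resume: drop the lock, then clear the flag. */
void
toy_resume_fs(struct toy_mount *mp)
{
	mp->teardown_locked = false;
	mp->kern_flag &= ~TOY_MNTK_SUSPEND;
}

/*
 * Models a vnlru_free()-like pass over a free list: reclaim up to `want`
 * vnodes, skipping any vnode whose mount is marked suspended.  Returns the
 * number reclaimed, or -1 where a real reclaim would have blocked (teardown
 * lock held but the suspension not visible through the flag).
 */
int
toy_vnlru_free(struct toy_vnode *list, size_t n, int want)
{
	int done = 0;

	for (size_t i = 0; i < n && done < want; i++) {
		struct toy_vnode *vp = &list[i];

		if (vp->reclaimed)
			continue;
		if (vp->mp->kern_flag & TOY_MNTK_SUSPEND)
			continue;	/* suspended: do not try to reclaim */
		if (vp->mp->teardown_locked)
			return (-1);	/* this is where we would get stuck */
		vp->reclaimed = true;
		done++;
	}
	return (done);
}
```

With one suspended "ZFS" mount and one other mount on the list, the pass still
reclaims the other mount's vnodes instead of blocking.  Without the flag, the
same pass hits the -1 case on the first ZFS vnode, which is the stuck
vnlru_free() in this model.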

Kostik,

what is your opinion?
For your convenience, here is a message with my analysis of this issue:
http://thread.gmane.org/gmane.os.freebsd.current/150889/focus=18534
-- 
Andriy Gapon