From: Andriy Gapon
Date: Thu, 18 Jul 2013 12:33:58 +0300
To: Adrian Chadd, Konstantin Belousov
Cc: freebsd-fs@FreeBSD.org
Subject: Re: Deadlock in nullfs/zfs somewhere
Message-ID: <51E7B686.4090509@FreeBSD.org>
References: <51DCFEDA.1090901@FreeBSD.org> <51E59FD9.4020103@FreeBSD.org> <51E67F54.9080800@FreeBSD.org>

on 17/07/2013 20:19 Adrian Chadd said the following:
> On 17 July 2013 04:26, Andriy Gapon wrote:
>> One possibility is to add getnewvnode_reserve() calls before the ZFS transaction
>> beginnings in the places where a new vnode/znode may have to be allocated within
>> a transaction.
>> This looks like a quick and cheap solution but it makes the code somewhat messier.
>>
>> Another possibility is to change something in VFS machinery, so that VOP_RECLAIM
>> getting blocked for one filesystem does not prevent vnode allocation for other
>> filesystems.
>>
>> I could think of other possible solutions via infrastructural changes in VFS or
>> ZFS...
>
> Well, what do others think? This seems like a showstopper for systems
> with lots and lots of ZFS filesystems doing lots and lots of activity.
>

Looks like others are not speaking yet :-)

My current idea is that ZFS should set MNTK_SUSPEND in the zfs_suspend_fs()
path before acquiring its z_teardown* locks. This should make the intentions
of ZFS visible to VFS. It should thus prevent a VOP_RECLAIM call on a
suspended ZFS filesystem, which in turn should prevent vnlru_free() from
getting stuck. Hopefully this will break the deadlock cycle.

Kostik,
what is your opinion?

For your convenience here is a message with my analysis of this issue:
http://thread.gmane.org/gmane.os.freebsd.current/150889/focus=18534

--
Andriy Gapon
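
To illustrate the ordering proposed above, here is a minimal userspace sketch in C. It is a toy model, not kernel code: every `toy_*` name, the struct layouts, and the `-1` "would block" return value are assumptions made for illustration, and only the names MNTK_SUSPEND, zfs_suspend_fs(), vnlru_free() and VOP_RECLAIM come from the message. The model assumes, as the message expects, that a reclaim pass skips vnodes whose mount advertises the suspend flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the MNTK_SUSPEND kernel flag. */
#define TOY_MNTK_SUSPEND 0x1u

/* Hypothetical, heavily simplified mount: not the real struct mount. */
struct toy_mount {
    unsigned kern_flag;     /* models the mount's kernel flags */
    bool teardown_locked;   /* models ZFS holding its z_teardown* locks */
};

/* Hypothetical, heavily simplified vnode. */
struct toy_vnode {
    struct toy_mount *mp;
    bool reclaimed;
};

/* Proposed ordering: advertise the suspension first, then take the locks. */
static void toy_zfs_suspend_fs(struct toy_mount *mp)
{
    mp->kern_flag |= TOY_MNTK_SUSPEND;
    mp->teardown_locked = true;
}

/* Reverse order on resume: drop the locks, then clear the flag. */
static void toy_zfs_resume_fs(struct toy_mount *mp)
{
    mp->teardown_locked = false;
    mp->kern_flag &= ~TOY_MNTK_SUSPEND;
}

/*
 * Models a vnlru_free()-style pass over n candidate vnodes.
 * Returns the number of vnodes reclaimed (at most `want`), or -1 if the
 * pass reached a vnode whose reclaim would sleep on a teardown lock,
 * which is the situation that leads to the deadlock.
 */
static int toy_vnlru_free(struct toy_vnode *vp, size_t n, int want)
{
    int freed = 0;

    assert(vp != NULL || n == 0);
    for (size_t i = 0; i < n && freed < want; i++) {
        if (vp[i].reclaimed)
            continue;
        /* Suspension is visible: skip instead of calling reclaim. */
        if (vp[i].mp->kern_flag & TOY_MNTK_SUSPEND)
            continue;
        /* Suspension is invisible but the lock is held: would block. */
        if (vp[i].mp->teardown_locked)
            return -1;
        vp[i].reclaimed = true;
        freed++;
    }
    return freed;
}
```

In this model, a pass that runs after toy_zfs_suspend_fs() simply steps over the suspended filesystem's vnodes and keeps freeing vnodes from other mounts. If the lock is held without the flag being set, the pass reports that it would have blocked.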