From: Adrian Chadd
To: Andriy Gapon
Cc: freebsd-fs@freebsd.org, freebsd-current
Date: Tue, 16 Jul 2013 12:40:35 -0700
Subject: Re: Deadlock in nullfs/zfs somewhere

On 16 July 2013 12:32, Andriy Gapon wrote:
> vmcore.0 was useless for some reason - an interesting address was not accessible.

Eek.

> vmcore.1 seems to be very similar and is actually useful.

Oh good.

> This problem looks like an interesting deadlock involving ZFS and VFS and vnode
> shortage.
> The most obvious things are that many threads could not allocate a new vnode and
> are waiting in getnewvnode_reserve and also many threads are stuck waiting on
> vnode locks held by the former threads.
> In effect, they all wait for vnlru, which in turn is stuck in
> zfs_freebsd_reclaim on z_teardown_lock.
> That lock is held by a thread doing a rollback ioctl.
> And that thread waits for zfs sync thread to actually perform the rollback.
> The sync thread waits on zfs quiesce thread to declare the current transaction
> group as quiesced.
> The quiesce thread, obviously, waits for all operations running in the current
> transaction group to complete.
> Some of those operations are e.g. VOP_CREATE -> zfs_create. They already
> started a zfs transaction (as a part of the current transaction group) and they
> execute zfs_mknode which needs a new vnode. So these threads are waiting for a
> new vnode and do not let the current transaction group become quiesced.
> GOTO beginning.
>
> Compressing the above description to the extreme, it boils down to: ZFS needs a
> new vnode from vnlru and is waiting on it, while vnlru has to wait on ZFS. :(

So it's a deadlock. Ok, so what's next?

-adrian
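
For reference, the chain of waits described in the quoted analysis can be written down as a wait-for graph and checked for a cycle. This is only an illustrative sketch: the node names are informal labels taken from the description above, not kernel symbols, and `find_cycle` is a hypothetical helper, not part of any FreeBSD tool.

```python
# Wait-for graph transcribed from the quoted analysis.
# Each key is blocked waiting on its value. Names are informal labels.
WAITS_FOR = {
    "threads blocked on vnode locks": "zfs_create threads",
    "zfs_create threads": "vnlru",                # need a new vnode
    "vnlru": "rollback ioctl thread",             # z_teardown_lock in zfs_freebsd_reclaim
    "rollback ioctl thread": "zfs sync thread",   # rollback is performed by sync
    "zfs sync thread": "zfs quiesce thread",      # txg must be declared quiesced
    "zfs quiesce thread": "zfs_create threads",   # open transactions must finish
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle reached, or None."""
    path = []   # nodes visited, in order
    seen = {}   # node -> index in path
    node = start
    while node in waits_for:
        if node in seen:
            # Came back to a node already on the path: that suffix is the cycle.
            return path[seen[node]:]
        seen[node] = len(path)
        path.append(node)
        node = waits_for[node]
    return None  # reached a node that waits on nothing, so no cycle


if __name__ == "__main__":
    cycle = find_cycle(WAITS_FOR, "zfs_create threads")
    if cycle:
        print(" -> ".join(cycle + cycle[:1]))
```

Starting from any node in the graph ends up in the same loop, which is the "GOTO beginning" in the description: the threads merely blocked on vnode locks are not part of the cycle themselves, they just feed into it.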