Message-ID: <48D7D212.7090908@bqinternet.com>
Date: Mon, 22 Sep 2008 13:12:50 -0400
From: Scott Burns
To: freebsd-current@freebsd.org
References: <48D4E974.2020008@bqinternet.com>
In-Reply-To: <48D4E974.2020008@bqinternet.com>
Subject: Re: ZFS panic in zone_dataset_visible

Scott Burns wrote:
> Hello,
>
> I am running several servers using Pawel's July 27 ZFS patchset, applied
> against 8-current source from the same day.
> I have seen a similar panic
> on two different servers:
...
> Stopped at _mtx_lock_flags+0x15: lock cmpxchgq %rsi,0x18(%rdi)
> db> bt
> Tracing pid 95276 tid 100432 td 0xffffff010b3cc000
> _mtx_lock_flags() at _mtx_lock_flags+0x15
> zone_dataset_visible() at zone_dataset_visible+0x94
> zfs_mount() at zfs_mount+0x3e5
...

With a bit of testing, I found that this panic is easily reproducible.
Simply try to list the contents of a snapshot from within a jail, as long
as the snapshot isn't already mounted, and the system panics. If I mount
the snapshot from outside of the jail first, and then list it inside the
jail, it does not panic.

I spent a bit of time debugging this weekend. Trying to list an unmounted
snapshot triggers a zfs_mount() for the snapshot, which calls
zone_dataset_visible() to determine if the snapshot should be visible in
the current zone. When it is run outside of a jail, it returns true early
on because INGLOBALZONE(curproc) is true; otherwise it takes another code
path. The panic is happening after that check, at mtx_lock(&pr->cr_mtx),
because (pr = curthread->td_ucred->cr_prison) is NULL. Interestingly, it's
not NULL if zone_dataset_visible() is triggered by a "zfs list" command,
but it is NULL if zone_dataset_visible() is called from zfs_mount().

As a temporary workaround, I modified my copy of
cddl/compat/opensolaris/kern/opensolaris_zone.c to have
zone_dataset_visible() return true if it is being called for a snapshot.
I modified it as below:

-if (INGLOBALZONE(curproc))
+if (INGLOBALZONE(curproc) || strchr(dataset, '@'))

This is obviously not ideal, since it allows the manipulation of the
snapshot from another jail if the caller knows that it exists. Since I am
the only one with root access to any of the jails, I am not concerned with
that. "zfs list" continues to behave normally.

I will continue looking at this, but since my main goal of working around
the panic has been taken care of, I am not sure how long my attention span
will last.
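For illustration only, the control flow described above can be modelled in
user space. This is a sketch under stated assumptions, not the code in
opensolaris_zone.c: the struct and function names (prison_model, cred_model,
dataset_visible_model) are hypothetical, the jail is assumed to own a single
dataset matched by exact name, the mutex is omitted, and the NULL guard is an
assumed defensive alternative rather than a known fix.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct prison_model {
	const char *visible_dataset;	/* assumption: one dataset per jail */
};

struct cred_model {
	struct prison_model *cr_prison;	/* NULL models the state seen in the panic */
};

/* Returns 1 if the name refers to a snapshot, i.e. contains '@'. */
static int
is_snapshot(const char *name)
{
	return (strchr(name, '@') != NULL);
}

/*
 * Simplified model of the visibility check.
 *   in_global_zone  models INGLOBALZONE(curproc)
 *   snapshot_bypass models the temporary workaround from the message
 */
static int
dataset_visible_model(const struct cred_model *cred, int in_global_zone,
    int snapshot_bypass, const char *dataset)
{
	const struct prison_model *pr;
	size_t len;

	/* Early return: global zone, or any snapshot when the bypass is on. */
	if (in_global_zone || (snapshot_bypass && is_snapshot(dataset)))
		return (1);

	pr = cred->cr_prison;
	if (pr == NULL)
		return (0);	/* assumption: refuse instead of dereferencing NULL */

	/* Assumption: compare only the filesystem part, before any '@'. */
	len = strcspn(dataset, "@");
	return (strlen(pr->visible_dataset) == len &&
	    strncmp(pr->visible_dataset, dataset, len) == 0);
}
```

With cr_prison set to NULL and in_global_zone set to 0, the model returns 0
at the point where the kernel would have dereferenced pr. With
snapshot_bypass set to 1, any name containing '@' is reported visible
regardless of which jail owns it, which is the trade-off noted above.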
If the cause of curthread->td_ucred->cr_prison being NULL under these
conditions is obvious to anyone, please let me know.

-- 
Scott Burns
System Administrator
BQ Internet Corporation