From owner-freebsd-stable@FreeBSD.ORG Mon Jun 10 16:21:11 2013
From: John Baldwin
To: Julian Stecklina
Subject: Re: Reproducable Infiniband panic
Date: Mon, 10 Jun 2013 12:15:11 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; )
References: <51B07705.207@os.inf.tu-dresden.de>
 <201306071206.52994.jhb@freebsd.org>
 <51B5C0BC.2000402@os.inf.tu-dresden.de>
In-Reply-To: <51B5C0BC.2000402@os.inf.tu-dresden.de>
Message-Id: <201306101215.11640.jhb@freebsd.org>
Cc: freebsd-stable@freebsd.org
List-Id: Production branch of FreeBSD source code

On Monday, June 10, 2013 8:04:12 am Julian Stecklina wrote:
> On 06/07/2013 06:06 PM, John Baldwin wrote:
> > On Friday, June 07, 2013 5:07:34 am Julian Stecklina wrote:
> >> On 06/06/2013 08:57 PM, John Baldwin wrote:
> >>> On Thursday, June 06, 2013 9:54:35 am Andriy Gapon wrote:
> >> [...]
> >>>> The problem seems to be in incorrect interaction between
> >>>> devfs_close_f and linux_file_dtor. The latter expects
> >>>> curthread->td_fpop to have a valid reasonable value. But the
> >>>> former sets curthread->td_fpop to fp only around vnops.fo_close()
> >>>> call and then restores it back to some (what?) previous value
> >>>> before calling devfs_fpdrop->devfs_destroy_cdevpriv. In this case
> >>>> the previous value is NULL.
> >>>
> >>> It is normally NULL in this case. Why does linux_file_dtor even
> >>> look at td_fpop?
> >>>
> >>> Ah. I think it should not do that and make the data it uses in the
> >>> dtor more self-contained:
> [...]
>
> Seems to fix my panic. Thanks!

Can you please retest this updated version? I had thought that I didn't need
a reference count on the vnode, but devfs drops its reference count before
the cdevpriv destructor is called.

Index: sys/ofed/include/linux/fs.h
===================================================================
--- sys/ofed/include/linux/fs.h	(revision 251604)
+++ sys/ofed/include/linux/fs.h	(working copy)
@@ -73,6 +73,7 @@
 	struct dentry	f_dentry_store;
 	struct selinfo	f_selinfo;
 	struct sigio	*f_sigio;
+	struct vnode	*f_vnode;
 };
 
 #define	file	linux_file
Index: sys/ofed/include/linux/linux_compat.c
===================================================================
--- sys/ofed/include/linux/linux_compat.c	(revision 251604)
+++ sys/ofed/include/linux/linux_compat.c	(working copy)
@@ -212,7 +212,8 @@
 	struct linux_file *filp;
 
 	filp = cdp;
-	filp->f_op->release(curthread->td_fpop->f_vnode, filp);
+	filp->f_op->release(filp->f_vnode, filp);
+	vdrop(filp->f_vnode);
 	kfree(filp);
 }
 
@@ -232,6 +233,8 @@
 	filp->f_dentry = &filp->f_dentry_store;
 	filp->f_op = ldev->ops;
 	filp->f_flags = file->f_flag;
+	vhold(file->f_vnode);
+	filp->f_vnode = file->f_vnode;
 	if (filp->f_op->open) {
 		error = -filp->f_op->open(file->f_vnode, filp);
 		if (error) {

-- 
John Baldwin
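
For anyone following the thread, the idea behind the patch (remember the vnode in the per-open data at open time, hold it, and drop the hold in the destructor instead of looking at curthread->td_fpop) can be modelled in userland. This is only a rough sketch, not kernel code: every toy_* name below is hypothetical, the hold count is a plain int, and a counter stands in for the f_op->release() call.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a vnode; only the hold count is modelled (assumption). */
struct toy_vnode {
	int holdcnt;		/* what vhold()/vdrop() would adjust */
	int release_calls;	/* how often the release step saw this vnode */
};

/* Toy stand-in for struct linux_file with the added f_vnode member. */
struct toy_file {
	struct toy_vnode *f_vnode;
};

void
toy_vhold(struct toy_vnode *vp)
{
	vp->holdcnt++;
}

void
toy_vdrop(struct toy_vnode *vp)
{
	assert(vp->holdcnt > 0);
	vp->holdcnt--;
}

/* Open path: take a hold and remember the vnode in the per-open data. */
struct toy_file *
toy_open(struct toy_vnode *vp)
{
	struct toy_file *filp;

	filp = calloc(1, sizeof(*filp));
	if (filp == NULL)
		return (NULL);
	toy_vhold(vp);
	filp->f_vnode = vp;
	return (filp);
}

/* Destructor: relies only on the per-open data, not on per-thread state. */
void
toy_file_dtor(struct toy_file *filp)
{
	filp->f_vnode->release_calls++;	/* stands in for f_op->release() */
	toy_vdrop(filp->f_vnode);
	free(filp);
}
```

In this model the opener's own reference can go away before toy_file_dtor() runs, which mirrors the ordering described above, and the destructor still has a held vnode to pass to the release step.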