From: John Baldwin
To: Kostik Belousov
Cc: freebsd-stable@freebsd.org
Date: Thu, 19 Mar 2009 12:46:05 -0400
Subject: Re: in recent 7-STABLE: VOP_WRITE...is not exclusive locked but should be
Message-Id: <200903191246.05641.jhb@freebsd.org>
In-Reply-To: <20090319160251.GJ7716@deviant.kiev.zoral.com.ua>
References: <200903191001.44491.jhb@freebsd.org> <20090319160251.GJ7716@deviant.kiev.zoral.com.ua>

On Thursday 19 March 2009 12:02:51 pm Kostik Belousov wrote:
> On Thu, Mar 19, 2009 at 10:01:44AM -0400, John Baldwin wrote:
> > On Thursday 19 March 2009 8:05:34 am Tim Chase wrote:
> > > Hello,
> > >
> > > I have a system that had been running quite well with an oldish 7-STABLE
> > > (from around August 7, 2008) but has started deadlocking within the past
> > > week or so.
> > >
> > > I updated the kernel to a newer 7-STABLE (Mar 15, 2009) and enabled
> > > INVARIANTS, INVARIANT_SUPPORT, WITNESS, DEBUG_LOCKS, DEBUG_VFS_LOCKS and
> > > DIAGNOSTIC and the message indicated in the subject line has now appeared
> > > 3 times as shown below.  Is this something to be terribly concerned about?
> > > Is there anything I can do to further track down the cause?  Since the
> > > system is a production mail server, I have it set to not drop into DDB
> > > when this happens.
> > >
> > > The machine is a 4-core Xeon X5450 with 8G of RAM running FreeBSD
> > > amd64 and in userland it's pretty much just cyrus imapd and apache/php.
> > > The file systems are all ZFS on a bunch of SAS drives connected to a
> > > LSI Logic 1068 controller.
> > >
> > > As to the deadlock that started this exercise, if the machine follows its
> > > recent pattern, that should happen within the next 2-4 hours.
> >
> > Err, the vn_write() routine should be using an exclusive vnode lock:
> >
> > vn_write()
> > {
> >
> >         ...
> >         vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
> >         if ((flags & FOF_OFFSET) == 0)
> >                 uio->uio_offset = fp->f_offset;
> >         ioflag |= sequential_heuristic(uio, fp);
> > #ifdef MAC
> >         error = mac_check_vnode_write(active_cred, fp->f_cred, vp);
> >         if (error == 0)
> > #endif
> >                 error = VOP_WRITE(vp, uio, ioflag, fp->f_cred);
> >         ...
> > }
> >
> > Can you check your /sys/kern/vfs_vnops.c and verify that LK_EXCLUSIVE is
> > present in your vn_write() routine?  If so, then perhaps run memtest?
>
> Note that this happens on the ZFS. Might be, ZFS unlocks the vnode and
> then relocks it with some invalid flags ?

Hmm, it depends on where the check is I guess (before or after zfs_write()).
Tim, can you do 'l *VOP_WRITE_APV+0x155' on your kernel from gdb?

-- 
John Baldwin
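
For readers following the "before or after zfs_write()" question, here is a minimal user-space sketch of why the position of the check matters. It is a toy model and not FreeBSD kernel code: every toy_* name is hypothetical, and it only assumes that a debug wrapper checks for an exclusive lock both before and after calling into the filesystem. A write routine that drops the lock and retakes it shared (the possibility raised above for ZFS) passes the first check and trips only the second.

```c
/*
 * Toy user-space model of a "should be exclusively locked" check.
 * NOT FreeBSD kernel code: every toy_* name below is hypothetical.
 */
#include <assert.h>
#include <stdio.h>

enum toy_lock_state { TOY_UNLOCKED, TOY_SHARED, TOY_EXCLUSIVE };

struct toy_vnode {
	const char *name;
	enum toy_lock_state lock;
	int violations;		/* number of failed lock checks so far */
};

typedef int (*toy_write_fn)(struct toy_vnode *vp);

/* Report (rather than panic) when the vnode is not exclusively locked. */
static int
toy_check_elocked(struct toy_vnode *vp, const char *where)
{
	if (vp->lock == TOY_EXCLUSIVE)
		return (1);
	vp->violations++;
	fprintf(stderr, "%s: %s is not exclusive locked but should be\n",
	    where, vp->name);
	return (0);
}

/* A well-behaved filesystem write: leaves the lock state alone. */
static int
toy_fs_write_ok(struct toy_vnode *vp)
{
	(void)vp;
	return (0);
}

/* A misbehaving write: drops the lock, then retakes it shared. */
static int
toy_fs_write_relock_shared(struct toy_vnode *vp)
{
	vp->lock = TOY_UNLOCKED;
	vp->lock = TOY_SHARED;
	return (0);
}

/* Debug wrapper: check the lock before and after the filesystem call. */
static int
toy_vop_write(struct toy_vnode *vp, toy_write_fn fs_write)
{
	int error;

	assert(vp != NULL && fs_write != NULL);
	toy_check_elocked(vp, "toy_vop_write (before)");
	error = fs_write(vp);
	toy_check_elocked(vp, "toy_vop_write (after)");
	return (error);
}
```

In this model, a caller that takes the lock exclusive and calls toy_vop_write() with toy_fs_write_relock_shared sees exactly one report, from the "after" check. A caller that never took the exclusive lock sees the report from the "before" check as well. That distinction is what resolving the address inside the real wrapper would reveal.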