From: "David O'Brien"
Date: Fri, 3 Sep 2010 16:30:44 -0700
To: Jeff Roberson
Cc: freebsd-current@freebsd.org
Subject: Re: SUJ deadlock
Message-ID: <20100903233038.GA1383@dragon.NUXI.org>
References: <4BDF2A4D.3030706@gmail.com>

On Wed, May 05, 2010 at 12:54:07PM -1000, Jeff Roberson wrote:
> On Mon, 3 May 2010, Fabien Thomas wrote:
>>>> I'm with r207548 now and since some days i've system deadlock.
>>>> It seems related to SUJ with process waiting on suspfs or ppwait.
>>>
>>> I've also seen it stalled in suspfs, but this information is way better
>>> than what I was able to garner.  I was only able to tell via ctrl-t on
>>> a stalled 'ls' process in a terminal before hard booting.
[..]
> Can anyone who has experienced this hang test this patch:
>
> Thanks,
> Jeff

> Index: ffs_softdep.c
> ===================================================================
> --- ffs_softdep.c     (revision 207480)
> +++ ffs_softdep.c     (working copy)
> @@ -9301,7 +9301,7 @@
>                 hadchanges = 1;
>         }
>         /* Leave this inodeblock dirty until it's in the list. */
> -       if ((inodedep->id_state & (UNLINKED | DEPCOMPLETE)) == UNLINKED)
> +       if ((inodedep->id_state & (UNLINKED | UNLINKONLIST)) == UNLINKED)

Hi Jeff,
I didn't seem to experience this problem back in May, but I'm now
experiencing it on a regular basis.  I seem to trigger it almost every
other or 3rd day during the daily run.  I wind up with cvsup or svnsync
stalled and any 'ls' of my sources partition waiting on suspfs.
(note, I am also running diskcheckd from ports.)

My kernel sources are at:
    Last Changed Author: davidxu
    Last Changed Rev: 211534
    Last Changed Date: 2010-08-20 16:51:34 -0700 (Fri, 20 Aug 2010)

I have also experienced it back to at least:
    Last Changed Author: yongari
    Last Changed Rev: 210152
    Last Changed Date: 2010-07-15 16:34:58 -0700 (Thu, 15 Jul 2010)

Weird thing is - I can still access this partition across NFS without
problems.

    dragon$ cd /src/fbsd
    Filesystem      Size    Used   Avail Capacity  Mounted on
    /dev/da31s1f    271G    119G    130G    48%    /src
    dragon$ ls
    load: 0.12  cmd: ls 77901 [suspfs] 2.26r 0.00u 0.00s 0% 1212k

    quynh$ cd /src/fbsd
    quynh$ df .
    Filesystem      Size    Used   Avail Capacity  Mounted on
    dragon:/src     271G    119G    130G    48%    /src
    quynh$ ls
    .svn/           lib/
    COPYRIGHT       libexec/
    ..snip..

Processes also have a tendency to complete quite slowly at times -
waiting in vlruwk.

When I reboot, usually / and /src (but not 3 other partitions) give
a "Bad cg number {negative number}" error from fsck; so a full fsck is run.
This results in what seems like tens of thousands of iterations of:

    UNREF FILE I=[..snip..]
    RECONNECT? yes
    SORRY no space in lost+found directory
    unexpected soft update inconsistency
    CLEAR? yes

Thoughts?

-- 
-- David  (obrien@FreeBSD.org)