From owner-freebsd-current@FreeBSD.ORG Sun Apr 2 18:01:58 2006
Date: Sun, 02 Apr 2006 18:01:53 +0000 (UTC)
Message-Id: <20060402.180153.74658240.Tor.Egge@cvsup.no.freebsd.org>
To: peter@holm.cc
From: Tor Egge
Cc: truckman@freebsd.org, current@freebsd.org
Subject: Re: Livelock / softdep_flush "loop"
In-Reply-To: <20060402094431.GA81954@peter.osted.lan>
References: <20060402094431.GA81954@peter.osted.lan>
Mime-Version: 1.0
Content-Type: Multipart/Mixed;
 boundary="--Next_Part(Sun_Apr__2_18:01:53_2006_708)--"
Content-Transfer-Encoding: 7bit
List-Id: Discussions about the use of FreeBSD-current

----Next_Part(Sun_Apr__2_18:01:53_2006_708)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

> I managed to zoom in on the livelocks I've been seeing lately.

According to the console log, process 708 has marked one dependency as
being in progress, locked a vnode and then slept waiting for the softdep
lock.

softdep_flush() doesn't take into account that some of the remaining
dependencies cannot be processed at once.  Process 45 ended up looping
inside softdep_flush(), never sleeping, always believing that more work
could be done.

The enclosed patch might help.

- Tor Egge

----Next_Part(Sun_Apr__2_18:01:53_2006_708)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="softdep.diff2"

Index: sys/ufs/ffs/ffs_softdep.c
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_softdep.c,v
retrieving revision 1.193
diff -u -r1.193 ffs_softdep.c
--- sys/ufs/ffs/ffs_softdep.c	12 Mar 2006 05:25:16 -0000	1.193
+++ sys/ufs/ffs/ffs_softdep.c	2 Apr 2006 17:21:13 -0000
@@ -718,6 +718,7 @@
 {
 	struct mount *nmp;
 	struct mount *mp;
+	struct ufsmount *ump;
 	struct thread *td;
 	int remaining;
 	int vfslocked;
@@ -752,7 +753,9 @@
 				continue;
 			vfslocked = VFS_LOCK_GIANT(mp);
 			softdep_process_worklist(mp, 0);
-			remaining += VFSTOUFS(mp)->softdep_on_worklist;
+			ump = VFSTOUFS(mp);
+			remaining += ump->softdep_on_worklist -
+				ump->softdep_on_worklist_inprogress;
 			VFS_UNLOCK_GIANT(vfslocked);
 			mtx_lock(&mountlist_mtx);
 			nmp = TAILQ_NEXT(mp, mnt_list);
@@ -914,11 +917,13 @@
 		if ((flags & LK_NOWAIT) == 0 || wk->wk_type != D_DIRREM)
 			break;
 		wk->wk_state |= INPROGRESS;
+		ump->softdep_on_worklist_inprogress++;
 		FREE_LOCK(&lk);
 		ffs_vget(mp, WK_DIRREM(wk)->dm_oldinum,
 		    LK_NOWAIT | LK_EXCLUSIVE, &vp);
 		ACQUIRE_LOCK(&lk);
 		wk->wk_state &= ~INPROGRESS;
+		ump->softdep_on_worklist_inprogress--;
 		if (vp != NULL)
 			break;
 	}
Index: sys/ufs/ufs/ufsmount.h
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ufs/ufsmount.h,v
retrieving revision 1.36
diff -u -r1.36 ufsmount.h
--- sys/ufs/ufs/ufsmount.h	8 Mar 2006 23:43:39 -0000	1.36
+++ sys/ufs/ufs/ufsmount.h	2 Apr 2006 17:21:13 -0000
@@ -76,6 +76,7 @@
 	struct	workhead softdep_workitem_pending; /* softdep work queue */
 	struct	worklist *softdep_worklist_tail; /* Tail pointer for above */
 	int	softdep_on_worklist;		/* Items on the worklist */
+	int	softdep_on_worklist_inprogress;	/* Busy items on worklist */
 	int	softdep_deps;			/* Total dependency count */
 	int	softdep_accdeps;		/* accumulated dep count */
 	int	softdep_req;			/* Wakeup when deps hits 0. */

----Next_Part(Sun_Apr__2_18:01:53_2006_708)----
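
For anyone following the thread without the kernel source at hand, the
accounting change in the patch can be modelled in user space roughly as
below.  This is only a simplified sketch, not the kernel code: struct
mount_counters and the three helper functions are hypothetical names made
up for illustration, and the assumption that the flusher sleeps exactly
when "remaining" is zero is a simplification of what softdep_flush() does.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two per-mount counters the patch uses
 * (softdep_on_worklist and softdep_on_worklist_inprogress).
 */
struct mount_counters {
	int on_worklist;		/* items on the worklist */
	int on_worklist_inprogress;	/* items another thread marked INPROGRESS */
};

/* Old accounting: every queued item counts as work that can be done now. */
static int
remaining_old(const struct mount_counters *mc)
{
	return (mc->on_worklist);
}

/* Patched accounting: items held INPROGRESS elsewhere are not counted. */
static int
remaining_patched(const struct mount_counters *mc)
{
	assert(mc->on_worklist_inprogress >= 0);
	assert(mc->on_worklist_inprogress <= mc->on_worklist);
	return (mc->on_worklist - mc->on_worklist_inprogress);
}

/* Assumed model: the flusher sleeps only when it sees nothing left to do. */
static bool
flusher_would_sleep(int remaining)
{
	return (remaining == 0);
}
```

With one item queued and that same item marked in progress by another
thread (the situation described for process 708), remaining_old() returns
1, so the modelled flusher keeps spinning, while remaining_patched()
returns 0 and it is allowed to sleep.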