Message-Id: <199804161300.VAA16055@spinner.netplex.com.au>
X-Mailer: exmh version 2.0.2 2/24/98
To: Bruce Evans <bde@zeta.org.au>
cc: cvs-all@FreeBSD.ORG, cvs-committers@FreeBSD.ORG, cvs-sys@FreeBSD.ORG,
        dyson@FreeBSD.ORG
Subject: Re: cvs commit: src/sys/kern vfs_subr.c 
In-reply-to: Your message of "Thu, 16 Apr 1998 18:51:05 +1000."
             <199804160851.SAA04053@godzilla.zeta.org.au> 
Date: Thu, 16 Apr 1998 21:00:29 +0800
From: Peter Wemm <peter@netplex.com.au>
Sender: owner-cvs-all@FreeBSD.ORG
Precedence: bulk

Bruce Evans wrote:
> >  Modified files:
> >    sys/kern             vfs_subr.c 
> >  Log:
> >  When the softdep conversion took place, the periodic vfs_msync() from
> >  update got lost.  This is responsible for ensuring that dirty mmap() pages
> >  get periodically written to disk.  Without it, long-lived mmap()s might not
> >  have their dirty pages written out at all if the system crashes or isn't
> >  cleanly shut down.  This could be nasty if you've got a long-running
> >  process writing via mmap(); dirty pages used to get written to disk
> >  within 30 seconds or so.
> 
> sync_fsync() seems to be called too often for this (approx. every 3
> seconds on an idle system).

Well, from what I could see, it's called as often as needed to sync each 
filesystem, one at a time, over a 30 second period.  The old update 
process called vfs_msync() n times (n = number of mountpoints) every 30 
seconds.  The way I understand things, it's now being called at the same 
overall rate, just spread more evenly.

>  vfs_msync() seems to do a lot every time it is called.

Yes..  It walks through the vnodes attached to a given filesystem.  From 
the look of it, there could be some optimizations there..  We could 
eliminate uninteresting vnodes before wasting time doing VOPs, taking 
simple_locks, etc.

Index: vfs_subr.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.150
diff -u -2 -r1.150 vfs_subr.c
--- vfs_subr.c	1998/04/16 03:31:26	1.150
+++ vfs_subr.c	1998/04/16 12:57:05
@@ -2405,4 +2405,5 @@
 vfs_msync(struct mount *mp, int flags) {
 	struct vnode *vp, *nvp;
+	struct vm_object *obj;
 	int anyio, tries;
 
@@ -2417,4 +2418,9 @@
 			goto loop;
 		}
+
+		/* skip vnodes that are uninteresting before locking */
+		obj = vp->v_object;
+		if (obj == NULL || (obj->flags & OBJ_MIGHTBEDIRTY) == 0)
+			continue;
 
 		if ((vp->v_flag & VXLOCK) ||

With a few thousand vnodes, all those VOP_ISLOCKED() and 
simple_lock()/simple_unlock() calls have got to add up.  It should be 
harmless to test v_object and the object's flags outside of any critical 
section protection.
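
A minimal userland sketch of the same idea, not kernel code: the toy_* names, 
the lock counter and the flag value below are stand-ins invented for 
illustration.  Only the shape of the test (NULL object or flag not set, then 
skip) is taken from the patch.  It shows that vnodes which fail the cheap test 
never reach the lock/unlock pair.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in value; the real OBJ_MIGHTBEDIRTY lives in the kernel VM headers. */
#define OBJ_MIGHTBEDIRTY 0x1

/* Hypothetical toy versions of the kernel structures. */
struct toy_object { int flags; };
struct toy_vnode {
	struct toy_object *v_object;
	struct toy_vnode *next;
};

/* Counts lock and unlock operations so the saving can be observed. */
static int lock_ops;

static void toy_lock(struct toy_vnode *vp) { (void)vp; lock_ops++; }
static void toy_unlock(struct toy_vnode *vp) { (void)vp; lock_ops++; }

/*
 * Walk the list and return how many vnodes were locked and examined.
 * Vnodes with no object, or whose object is not marked possibly dirty,
 * are skipped before any lock is taken.
 */
static int
toy_msync(struct toy_vnode *head)
{
	struct toy_vnode *vp;
	struct toy_object *obj;
	int examined = 0;

	for (vp = head; vp != NULL; vp = vp->next) {
		/* cheap unlocked test, same shape as in the patch */
		obj = vp->v_object;
		if (obj == NULL || (obj->flags & OBJ_MIGHTBEDIRTY) == 0)
			continue;

		toy_lock(vp);
		/* re-test under the lock before doing real work */
		obj = vp->v_object;
		if (obj != NULL && (obj->flags & OBJ_MIGHTBEDIRTY) != 0)
			examined++;
		toy_unlock(vp);
	}
	return examined;
}
```

With three vnodes (no object, clean object, dirty object), only the dirty one 
costs a lock/unlock pair.  The sketch re-tests under the lock because the 
unlocked read is only a hint; the patch itself relies on the existing checks 
further down the loop.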

> We also lost control of the update interval.

Yes, but it is compile-time tweakable, so those with a desperate need to 
tweak it can still do so, although on a power-of-two basis.
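
One plausible reason for the power-of-two restriction, sketched under an 
assumption the message does not state: that the syncer keeps its pending work 
in a circular array of slots and picks a slot with a bit mask rather than a 
modulo.  WHEEL_SLOTS, WHEEL_MASK and wheel_slot() are hypothetical names, not 
identifiers from vfs_subr.c.

```c
#include <assert.h>

/* Hypothetical compile-time knob; the mask trick needs a power of two. */
#define WHEEL_SLOTS 32
#define WHEEL_MASK  (WHEEL_SLOTS - 1)

/* Returns nonzero if n is a power of two. */
static int
is_power_of_two(unsigned n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Slot in which work due "delay" ticks after "now" would be queued.
 * The mask wraps the index around the wheel; this only equals
 * (now + delay) % WHEEL_SLOTS when WHEEL_SLOTS is a power of two.
 */
static unsigned
wheel_slot(unsigned now, unsigned delay)
{
	assert(is_power_of_two(WHEEL_SLOTS));
	return (now + delay) & WHEEL_MASK;
}
```

Under that assumption, changing the interval means changing the slot count to 
another power of two (16, 32, 64, ...) and rebuilding.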

> We also lost waking the sync daemon in vm_pageout_scan().  There is
> still a wakeup on vfs_update_wakeup, but nothing sleeps on it any
> more.
> 
> A better quick fix for most of this is probably to resurrect the
> update daemon and only call vfs_msync() from it.

I don't think there's much to be gained overall..  The old update called 
sync() every 30 seconds.  sync() does:

        simple_lock(&mountlist_slock);
        for (mp = mountlist.cqh_first; mp != (void *)&mountlist; mp = nmp) {
[..]
                vfs_msync(mp, MNT_NOWAIT);
[..]
        }
        simple_unlock(&mountlist_slock);

So, if you have 10 filesystems, you get 20 vfs_msync() calls per minute 
under update (10 back-to-back every 30 seconds), which means a whole glut 
of disk writes in one major hit.

Under syncer, you get 1 vfs_msync() every 3 seconds, stepping through each 
filesystem one at a time.  You still have 20 vfs_msync() calls per minute.
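
The arithmetic above, as a small sketch.  The function names are made up for 
illustration; the only inputs taken from this message are the 30 second 
period and one vfs_msync() per filesystem per period.

```c
#include <assert.h>

/* Each filesystem is visited once per period (from the message). */
#define SYNC_PERIOD_SECS 30

/*
 * vfs_msync() calls per minute.  This is the same for update and for
 * syncer, since both visit every filesystem once per period.
 */
static int
msync_calls_per_minute(int nmounts)
{
	assert(nmounts >= 0);
	return nmounts * (60 / SYNC_PERIOD_SECS);
}

/* Back-to-back calls in one burst under the old update process. */
static int
update_burst_size(int nmounts)
{
	assert(nmounts >= 0);
	return nmounts;
}

/* Seconds between calls when the visits are spread evenly over the period. */
static int
syncer_spacing_secs(int nmounts)
{
	assert(nmounts > 0);
	return SYNC_PERIOD_SECS / nmounts;
}
```

For 10 filesystems this gives 20 calls per minute either way; the difference 
is a burst of 10 calls under update versus one call every 3 seconds under 
syncer.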

> Bruce
> 

Cheers,
-Peter
--
Peter Wemm <peter@netplex.com.au>   Netplex Consulting

