Date:      Mon, 28 Sep 1998 16:32:23 +0800
From:      Peter Wemm <peter@netplex.com.au>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        toasty@home.dragondata.com (Kevin Day), eivind@yes.no, street@iname.com, freebsd-current@FreeBSD.ORG
Subject:   Re: Softupdates panics 
Message-ID:  <199809280832.QAA03301@spinner.netplex.com.au>
In-Reply-To: Your message of "Mon, 28 Sep 1998 01:05:11 GMT." <199809280105.SAA08374@usr05.primenet.com> 

Terry Lambert wrote:
> > > > The dependencies for "noatime" are not switchable, and by enabling
> > > > it, you are breaking the dependency graph into separate pieces!
> > > > 
> > > > The bug here is that it didn't ignore your request for "noatime".
> > > 
> > > Return error.  noatime requests a particular behaviour for the mount;
> > > ignoring that request on the assumption that people only do it for
> > > speed is IMO totally bogus.
> > 
> > I've gotta agree here - softupdates looked good for my palmtop that uses
> > flash - noatime is essentially required for flash media. :)
> 
> Why?

To limit write cycles.  If you have a slow continuous read on a very large 
file, the inode gets written back many, many times, contributing to the 
wear-out.

If my memory serves me correctly, the reason DG implemented it in the 
first place was after measuring freefall's disk access.  Something like 60 
or 70% of disk activity was caused by atime updates.

I am really puzzled as to why this is required by the code.  If I open,
read() and close a file with noatime active, there should be no write
activity at all for any of the metadata related to the file.  Softupdates
shouldn't even be involved because there are no write order dependencies to
maintain.

Turning on MNT_NOATIME causes ufs_inactive() to not call UFS_UPDATE().  
ffs_update() would have tested the bits (IN_ACCESS and friends) and if any 
were set, it would touch the inode, bread() the on-disk "block" containing 
the inode group, call softdep_update_inodeblock(), then copy the in-core 
inode to the disk buffer and then write it out (presumably calling another 
softdep function to monitor or order the writeback).
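
The update path described above can be sketched as a small userland model.
This is only an illustration of the sequence in the paragraph, not the kernel
code: every toy_* name, the TOY_* flag values and the counters are assumptions
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy flag values for illustration only; not taken from the kernel headers. */
#define TOY_IN_ACCESS	0x01	/* access time needs updating */
#define TOY_IN_CHANGE	0x02	/* inode change time needs updating */
#define TOY_IN_UPDATE	0x04	/* modification time needs updating */

/* Hypothetical in-core inode: just the pending bits and one timestamp. */
struct toy_inode {
	unsigned flags;
	long atime;
};

/* Hypothetical disk state: the on-disk inode copy plus activity counters. */
struct toy_disk {
	struct toy_inode ondisk;
	int softdep_calls;	/* times the dependency hook was consulted */
	int writes;		/* times the inode block was written back */
};

/* Model of the update path.  Returns true if a write-back happened. */
static bool
toy_update(struct toy_inode *ip, struct toy_disk *dp, long now)
{
	assert(ip != NULL && dp != NULL);

	/* Nothing pending: no I/O and no softdep involvement at all. */
	if ((ip->flags & (TOY_IN_ACCESS | TOY_IN_CHANGE | TOY_IN_UPDATE)) == 0)
		return (false);

	/* "Touch" the in-core inode. */
	if (ip->flags & TOY_IN_ACCESS)
		ip->atime = now;
	ip->flags = 0;

	/* Stands in for softdep_update_inodeblock(). */
	dp->softdep_calls++;

	/* Copy the in-core inode to the disk buffer and write it out. */
	memcpy(&dp->ondisk, ip, sizeof(*ip));
	dp->writes++;
	return (true);
}
```

In this model, an inode whose flags were never set (the noatime read case)
returns early, so neither the dependency hook nor the write-back is reached.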

If we are not modifying the in-core inode or the disk block, then why does 
softdep need this to happen?  There are no dependencies to maintain.  Open/
close dependency information should already be handled.  I don't see why 
close(open("foo", O_RDWR)); should have to be treated any differently by 
softupdates than fd = open("foo", O_RDWR); read(fd, buf, 1); close(fd); 
while MNT_NOATIME is active.
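
For anyone who wants to compare the two sequences from userland, a minimal
sketch follows.  The helper names (open_close, open_read_close, get_atime)
are made up for the example; whether the access time actually moves depends
on the mount options of the filesystem the file lives on, so no particular
result is claimed here.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Sequence A: close(open(path, O_RDWR)).  Returns 0 on success, -1 on error. */
static int
open_close(const char *path)
{
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return (-1);
	return (close(fd));
}

/* Sequence B: open, read one byte, close.  Returns bytes read, or -1. */
static int
open_read_close(const char *path)
{
	char buf[1];
	ssize_t n;
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return (-1);
	n = read(fd, buf, sizeof(buf));
	if (close(fd) != 0)
		return (-1);
	return (n < 0 ? -1 : (int)n);
}

/* Fetch the file's access time so the caller can compare before and after. */
static int
get_atime(const char *path, time_t *out)
{
	struct stat sb;

	assert(out != NULL);
	if (stat(path, &sb) != 0)
		return (-1);
	*out = sb.st_atime;
	return (0);
}
```

Calling get_atime() before and after each sequence on a noatime mount should
show whether the read in sequence B leaves any trace in the inode.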

The thing that I wonder is whether a VOP_READ() on directory nodes might be
involved somehow.  Since directories seem to be related to the panics,
perhaps only allow NOATIME to skip the IN_ACCESS flag set on VREG vnodes
and see if that has any effect.  ufs_rename() also might call VOP_READ() (I
think) indirectly via ufs_checkpath() and then vn_rdwr().

ufs_lookup() doesn't use VOP_READ (that I can see) and a path search 
doesn't touch access timestamps so a 'cat /etc/passwd' won't hit the atime 
for /etc.  If a hack workaround patch along these lines works, then that's 
probably good enough.  An 'ls /etc' would still touch /etc's atime though.
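
The workaround idea described above (NOATIME only suppresses the access mark
on regular files, so directory reads still mark the inode) can be written as
a single predicate.  This is a sketch of the prose only: the TOY_* names and
values are invented for the example and the model covers just the two vnode
types discussed here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; the values are illustrative, not the kernel's. */
#define TOY_MNT_NOATIME	0x1

enum toy_vtype { TOY_VREG, TOY_VDIR };

/*
 * Decide whether a read should mark the inode as accessed.
 * NOATIME is honoured for regular files only; any other type is
 * always marked, exactly as if the mount option were absent.
 */
static bool
toy_should_mark_access(unsigned mnt_flag, enum toy_vtype type)
{
	/* The model only knows about regular files and directories. */
	assert(type == TOY_VREG || type == TOY_VDIR);

	return (!(mnt_flag & TOY_MNT_NOATIME) || type != TOY_VREG);
}
```

With this predicate a read of a regular file on a noatime mount is not
marked, while a directory read (the 'ls /etc' case) still is.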

> > But is that what terry was saying? :)
> 
> I was saying that you should ignore the flag.
> 
> Eivind corrected me (I agree with him) that it's not enough to
> ignore it, you need to actually barf on the flag.

I think it would be better to refuse a mount if async and/or noatime 
was requested.

The flag is *needed* by some people, not just wanted (for hardware
preservation, not just speed) and the best fix is to make softupdates work
with it.

Cheers,
-Peter

Anyway, here's a patch to disable NOATIME on VOP_READ() of a directory. 
I'd be interested to know if it has any effect.
Index: ufs_readwrite.c
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ufs/ufs_readwrite.c,v
retrieving revision 1.52
diff -u -r1.52 ufs_readwrite.c
--- ufs_readwrite.c	1998/09/07 11:50:19	1.52
+++ ufs_readwrite.c	1998/09/28 08:05:37
@@ -115,7 +122,8 @@
 		if (toread >= PAGE_SIZE) {
 			error = uioread(toread, uio, object, &nread);
 			if ((uio->uio_resid == 0) || (error != 0)) {
-				if (!(vp->v_mount->mnt_flag & MNT_NOATIME))
+				if (!(vp->v_mount->mnt_flag & MNT_NOATIME) ||
+				    vp->v_type == VDIR)
 					ip->i_flag |= IN_ACCESS;
 				if (object)
 					vm_object_vndeallocate(object);
@@ -137,7 +145,8 @@
 			if (toread >= PAGE_SIZE) {
 				error = uioread(toread, uio, object, &nread);
 				if ((uio->uio_resid == 0) || (error != 0)) {
-					if (!(vp->v_mount->mnt_flag & MNT_NOATIME))
+					if (!(vp->v_mount->mnt_flag & MNT_NOATIME) ||
+					    vp->v_type == VDIR)
 						ip->i_flag |= IN_ACCESS;
 					if (object)
 						vm_object_vndeallocate(object);
@@ -230,7 +239,7 @@
 
 	if (object)
 		vm_object_vndeallocate(object);
-	if (!(vp->v_mount->mnt_flag & MNT_NOATIME))
+	if (!(vp->v_mount->mnt_flag & MNT_NOATIME) || vp->v_type == VDIR)
 		ip->i_flag |= IN_ACCESS;
 	return (error);
 }





