Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 8 Mar 1998 23:31:19 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        dima@tejblum.dnttm.rssi.ru (Dmitrij Tejblum)
Cc:        tlambert@primenet.com, current@FreeBSD.ORG
Subject:   Re: vnode_pager: *** WARNING *** stale FS code in system
Message-ID:  <199803082331.QAA08328@usr08.primenet.com>
In-Reply-To: <199803072354.CAA02807@tejblum.dnttm.rssi.ru> from "Dmitrij Tejblum" at Mar 8, 98 02:54:05 am

next in thread | previous in thread | raw e-mail | index | archive | help
> > The reason I went the way I did was that putting the vnode_pager.c
> > code in the defaultops vector does nothing toward making the code
> > in vnode_pager.c _go away_, which is the eventual goal.  
> 
> Your changes do same nothing toward this goal. Everyone can insert a 
> function into a vnops table. It is trivial. Almost nobody can write a 
> real getpages or putpages routine. For example, you failed even to 
> write a readonly cd9660_putpages, while it would be, apparently, 
> trivial. (I suspect, simple 'return (VM_PAGER_BAD);' would do a right 
> thing.) You also failed to mention this problem in your standard comment. 
> So, when real implementation for getpages and putpages for all these 
> filesystem will written?

Most assuredly, at some time *after* a skeleton versions of FS specific
getpages/putpages have been written.  Organizationally, the patches I
gave *do* move toward that goal.

If you insist, I can duplicate the generic code into the FS's themselves
immediately, and remove it from the generic location.  I don't see this
action as being useful.  What I *do* see as the next logical progression
is to move the code into *one* FS, and make it work.

The reason this is more logical is that it provides a staged reference
to the generic routines, so that we can tell the difference between
an FS which has been migrated, and one that still needs work.  It also
gives other people the chance to work on FS specific code: it broadens
the potential for participation, even though I suspect I know the one
or two people who might help already -- at least the potential is there.


> Filesystem-specific getpages/putpages possible since end of 1995. You 
> saved a filesystem writer from writing 7 obvious lines of each 
> implementations (even these lines contain style bugs, so actual number 
> is slightly less). Is this the big win?

Codewise, no.  That's why I would have preferred to leave the other FS's
issuing the warning until such time as I could fix the code; after all,
I am a purist.  But pragmatically, it makes the desired organization
obvious (instead of clouding it unnecessarily), and the patches were
necessary, given the stabilization requirements now in force.

I would cetrainly welcome your assitance in unstubbing the local media
getpages/putpages, if you were willing to give it.  This would resolve
your implementation issues, without stepping on my organizational issues.


> > FS's which
> > can act as backing store need to have FS specific backing store
> > management, 
> 
> A problem is that every leaf filesystem with 'regular' files may be 
> asked to act as backing store. For example, try 
> cmp /proc/1/map /proc/curproc/map

Which calls mmap, and invokes a vnode pager, because it things these
are normal files.

	mmap()
	  vm_mmap()
	    vm_pager_allocate()
	      pagertab[ OBJT_VNODE]->pgo_alloc

> After you 'fix' getpages on procfs with your usual method, you will 
> notice that procfs does not implement VOP_STRATEGY. (and that pfs_type 
> is not a string, like procfs_print thinks). But why something ever 
> try to call STRATEGY on procfs? Because procfs implement BMAP in some 
> 'specific' way. This serves as a little example of usefulness of some 
> filesystem-specific code.

Clearly you misjudge how I would fix this problem.

This code is, obviously, in error.  The vm_mmap() should probably
dereference vp->v_mount->mnt_kern_flag looking for MNT_PSEUDO, and
if present, use a seperate pseduo-local-media pager that's not the
same thing (ie: the pages in procfs are backed by kernel memory, not
physical storage -- the vnode_pager is anppropriate for them).

The quick hack in this case is to either treat the procfs vnodes
as something other than normal vnodes (avoiding the cmp code
triggering the mmap()) or simply to fail the mmap (since the data
represents derived data, in any case).

One problem here is that cmp (stupidly) does not fall back to
c_special() in c_regular() in the case that the mmap() fails.  This
is a bug in cmp.


> > for a large variety of reasons (I can enumerate these
> > reasons again, if you need me to).
> 
> Well, actually I don't know any reasons for it, other than reference to
> John Dyson's authority and complexity of the generic code. So it would 
> be interesting. But it is outside of scope of this discussion.

Then casting aspersions on my reasoning by implying that my only
validation is lodged in "appeal to authority" is probably *also*
outside the scope of this discussion, don't you think?


> > A secondary reason is that it means I have to make special code in
> > stacking FS's in order to be able to access the bypass (which I think
> > should be placed in the defaultops vector instead).  
> 
> I am not sure if I can parse this. Anyway, I claim that my way has 
> exactly same results as your in this aspect.

In terms of eventual results, yes.

> From point of view of a 
> stacking layer, the filesystem below provide a getpages/putpages 
> implementation, and this is all that you want.

Actually, I want to use the local media FS's pager.  The difference here
is that if I have a crypto FS stacked on top of a transaction FS stacked
on top of a ACL FS stacked on top of a quota FS on top of an FFS FS,
when I try to getpages on the crypto FS, it should try to getpages on the
underlying FFS.

In your suggested implementation, I can't collapse the call graph
because in the stacking FS's will each have a getpages/putpages to
access the bypass explicitly, instead of accessing it implicitly.

> Also you want that you 
> will get a warning if your stacking layer does not bypass or implement
> the getpages/putpages correctly. Then make vop_nogetpages and 
> vop_noputpages, which will print the warning and the bad vnode and 
> return VM_PAGER_BAD (which is at least a correct getpages/putpages return 
> value, unlike EOPNOTSUPP), and put vop_stdgetpages and vop_stdputpages 
> to vnops table of affected local medial file systems.

Here's my preferred soloution, since it will pacify you as well: make
the warning non-fatal *for now*; ie:
=========================================================================
Index: vnode_pager.c
===================================================================
RCS file: /b/cvstree/ncvs/src/sys/vm/vnode_pager.c,v
retrieving revision 1.87
diff -c -r1.87 vnode_pager.c
*** 1.87	1998/02/26 06:39:58
--- vnode_pager.c	1998/03/08 23:22:24
***************
*** 531,545 ****
  {
  	int rtval;
  	struct vnode *vp;
  
  	vp = object->handle;
  	/* 
  	 * XXX temporary diagnostic message to help track stale FS code,
  	 * Returning EOPNOTSUPP from here may make things unhappy.
  	 */
! 	rtval = VOP_GETPAGES(vp, m, count*PAGE_SIZE, reqpage, 0);
! 	if (rtval == EOPNOTSUPP)
! 	    printf("vnode_pager: *** WARNING *** stale FS code in system.\n");
  	return rtval;
  }
  
--- 531,548 ----
  {
  	int rtval;
  	struct vnode *vp;
+ 	int bytes = count * PAGE_SIZE;
  
  	vp = object->handle;
  	/* 
  	 * XXX temporary diagnostic message to help track stale FS code,
  	 * Returning EOPNOTSUPP from here may make things unhappy.
  	 */
! 	rtval = VOP_GETPAGES(vp, m, bytes, reqpage, 0);
! 	if (rtval == EOPNOTSUPP) {
! 	    printf("vnode_pager: *** WARNING *** stale FS getpages\n");
! 	    rtval = vnode_pager_generic_getpages( vp, m, bytes, reqpage);
! 	}
  	return rtval;
  }
  
***************
*** 803,811 ****
  {
  	int rtval;
  	struct vnode *vp;
  
  	vp = object->handle;
! 	return VOP_PUTPAGES(vp, m, count*PAGE_SIZE, sync, rtvals, 0);
  }
  
  
--- 806,820 ----
  {
  	int rtval;
  	struct vnode *vp;
+ 	int bytes = count * PAGE_SIZE;
  
  	vp = object->handle;
! 	rtval = VOP_PUTPAGES(vp, m, bytes, sync, rtvals, 0);
! 	if (rtval == EOPNOTSUPP) {
! 	    printf("vnode_pager: *** WARNING *** stale FS putpages\n");
! 	    rtval = vnode_pager_generic_putpages( vp, m, bytes, sync, rtvals);
! 	}
! 	return rtval;
  }
  
  
=========================================================================


> > A tertiary reason is that if the code is in the defaultops vector,
> > I can't know when it becomes safe to remove from vnode_pager.c
> > and the defaultops vector.  There's no way to measure usage of
> > the defaultops code (that's kind of the point, really).
> 
> OK, don't put it to defaultops vector, put vop_stdgetpages and 
> vop_stdputpages to vnops table of every affected filesystem. I suggest 
> it in every mail on this topic :-).

Now you are basically griping only about naming.

The problem with calling it standard is that *it's not supposed to be
the standard, correct way of doing things*.

The blunt fact is that the way it was being done is screwed up, and
*something* needs to be done to unscrew it, and done in such a way
as it doesn't screw the FS stacking at the same time.


> One may say that my complains are more about decorative issues that 
> about real programming issues. Sorry. It is because all discussing 
> changes actually more decorative than real code. And as decorative, 
> they are IMHO bad. Also, unfortunately, most points in this discussion 
> were repeated several times...

I don't know how many times I should have to repeat this, but the code
*should not be in vnode_pager.c* and was, in fact *designed to not be
in vnode_pager.c*.

There is a discrepancy between design and implementation that we need
to reconcile.

If this counts as "decorative" in your book, so be it.  I think our
disagreement hinges more on order and granularity of implementation
of this change than it does on whether or not the goal is reasonable.

I think the goal is reasonable.  If you agree, and don't like the
granularity of implementation, then start implementing.  I could use
the help.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199803082331.QAA08328>