From owner-freebsd-current Sun Mar 8 15:31:50 1998 Return-Path: Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id PAA29842 for freebsd-current-outgoing; Sun, 8 Mar 1998 15:31:50 -0800 (PST) (envelope-from owner-freebsd-current@FreeBSD.ORG) Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id PAA29836 for ; Sun, 8 Mar 1998 15:31:48 -0800 (PST) (envelope-from tlambert@usr08.primenet.com) Received: (from daemon@localhost) by smtp02.primenet.com (8.8.8/8.8.8) id QAA23781; Sun, 8 Mar 1998 16:31:45 -0700 (MST) Received: from usr08.primenet.com(206.165.6.208) via SMTP by smtp02.primenet.com, id smtpd023772; Sun Mar 8 16:31:40 1998 Received: (from tlambert@localhost) by usr08.primenet.com (8.8.5/8.8.5) id QAA08328; Sun, 8 Mar 1998 16:31:19 -0700 (MST) From: Terry Lambert Message-Id: <199803082331.QAA08328@usr08.primenet.com> Subject: Re: vnode_pager: *** WARNING *** stale FS code in system To: dima@tejblum.dnttm.rssi.ru (Dmitrij Tejblum) Date: Sun, 8 Mar 1998 23:31:19 +0000 (GMT) Cc: tlambert@primenet.com, current@FreeBSD.ORG In-Reply-To: <199803072354.CAA02807@tejblum.dnttm.rssi.ru> from "Dmitrij Tejblum" at Mar 8, 98 02:54:05 am X-Mailer: ELM [version 2.4 PL25] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-current@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.ORG > > The reason I went the way I did was that putting the vnode_pager.c > > code in the defaultops vector does nothing toward making the code > > in vnode_pager.c _go away_, which is the eventual goal. > > Your changes do same nothing toward this goal. Everyone can insert a > function into a vnops table. It is trivial. Almost nobody can write a > real getpages or putpages routine. For example, you failed even to > write a readonly cd9660_putpages, while it would be, apparently, > trivial. (I suspect, simple 'return (VM_PAGER_BAD);' would do a right > thing.) You also failed to mention this problem in your standard comment. > So, when real implementation for getpages and putpages for all these > filesystem will written? Most assuredly, at some time *after* a skeleton versions of FS specific getpages/putpages have been written. Organizationally, the patches I gave *do* move toward that goal. If you insist, I can duplicate the generic code into the FS's themselves immediately, and remove it from the generic location. I don't see this action as being useful. What I *do* see as the next logical progression is to move the code into *one* FS, and make it work. The reason this is more logical is that it provides a staged reference to the generic routines, so that we can tell the difference between an FS which has been migrated, and one that still needs work. It also gives other people the chance to work on FS specific code: it broadens the potential for participation, even though I suspect I know the one or two people who might help already -- at least the potential is there. > Filesystem-specific getpages/putpages possible since end of 1995. You > saved a filesystem writer from writing 7 obvious lines of each > implementations (even these lines contain style bugs, so actual number > is slightly less). Is this the big win? Codewise, no. That's why I would have preferred to leave the other FS's issuing the warning until such time as I could fix the code; after all, I am a purist. But pragmatically, it makes the desired organization obvious (instead of clouding it unnecessarily), and the patches were necessary, given the stabilization requirements now in force. I would cetrainly welcome your assitance in unstubbing the local media getpages/putpages, if you were willing to give it. This would resolve your implementation issues, without stepping on my organizational issues. > > FS's which > > can act as backing store need to have FS specific backing store > > management, > > A problem is that every leaf filesystem with 'regular' files may be > asked to act as backing store. For example, try > cmp /proc/1/map /proc/curproc/map Which calls mmap, and invokes a vnode pager, because it things these are normal files. mmap() vm_mmap() vm_pager_allocate() pagertab[ OBJT_VNODE]->pgo_alloc > After you 'fix' getpages on procfs with your usual method, you will > notice that procfs does not implement VOP_STRATEGY. (and that pfs_type > is not a string, like procfs_print thinks). But why something ever > try to call STRATEGY on procfs? Because procfs implement BMAP in some > 'specific' way. This serves as a little example of usefulness of some > filesystem-specific code. Clearly you misjudge how I would fix this problem. This code is, obviously, in error. The vm_mmap() should probably dereference vp->v_mount->mnt_kern_flag looking for MNT_PSEUDO, and if present, use a seperate pseduo-local-media pager that's not the same thing (ie: the pages in procfs are backed by kernel memory, not physical storage -- the vnode_pager is anppropriate for them). The quick hack in this case is to either treat the procfs vnodes as something other than normal vnodes (avoiding the cmp code triggering the mmap()) or simply to fail the mmap (since the data represents derived data, in any case). One problem here is that cmp (stupidly) does not fall back to c_special() in c_regular() in the case that the mmap() fails. This is a bug in cmp. > > for a large variety of reasons (I can enumerate these > > reasons again, if you need me to). > > Well, actually I don't know any reasons for it, other than reference to > John Dyson's authority and complexity of the generic code. So it would > be interesting. But it is outside of scope of this discussion. Then casting aspersions on my reasoning by implying that my only validation is lodged in "appeal to authority" is probably *also* outside the scope of this discussion, don't you think? > > A secondary reason is that it means I have to make special code in > > stacking FS's in order to be able to access the bypass (which I think > > should be placed in the defaultops vector instead). > > I am not sure if I can parse this. Anyway, I claim that my way has > exactly same results as your in this aspect. In terms of eventual results, yes. > From point of view of a > stacking layer, the filesystem below provide a getpages/putpages > implementation, and this is all that you want. Actually, I want to use the local media FS's pager. The difference here is that if I have a crypto FS stacked on top of a transaction FS stacked on top of a ACL FS stacked on top of a quota FS on top of an FFS FS, when I try to getpages on the crypto FS, it should try to getpages on the underlying FFS. In your suggested implementation, I can't collapse the call graph because in the stacking FS's will each have a getpages/putpages to access the bypass explicitly, instead of accessing it implicitly. > Also you want that you > will get a warning if your stacking layer does not bypass or implement > the getpages/putpages correctly. Then make vop_nogetpages and > vop_noputpages, which will print the warning and the bad vnode and > return VM_PAGER_BAD (which is at least a correct getpages/putpages return > value, unlike EOPNOTSUPP), and put vop_stdgetpages and vop_stdputpages > to vnops table of affected local medial file systems. Here's my preferred soloution, since it will pacify you as well: make the warning non-fatal *for now*; ie: ========================================================================= Index: vnode_pager.c =================================================================== RCS file: /b/cvstree/ncvs/src/sys/vm/vnode_pager.c,v retrieving revision 1.87 diff -c -r1.87 vnode_pager.c *** 1.87 1998/02/26 06:39:58 --- vnode_pager.c 1998/03/08 23:22:24 *************** *** 531,545 **** { int rtval; struct vnode *vp; vp = object->handle; /* * XXX temporary diagnostic message to help track stale FS code, * Returning EOPNOTSUPP from here may make things unhappy. */ ! rtval = VOP_GETPAGES(vp, m, count*PAGE_SIZE, reqpage, 0); ! if (rtval == EOPNOTSUPP) ! printf("vnode_pager: *** WARNING *** stale FS code in system.\n"); return rtval; } --- 531,548 ---- { int rtval; struct vnode *vp; + int bytes = count * PAGE_SIZE; vp = object->handle; /* * XXX temporary diagnostic message to help track stale FS code, * Returning EOPNOTSUPP from here may make things unhappy. */ ! rtval = VOP_GETPAGES(vp, m, bytes, reqpage, 0); ! if (rtval == EOPNOTSUPP) { ! printf("vnode_pager: *** WARNING *** stale FS getpages\n"); ! rtval = vnode_pager_generic_getpages( vp, m, bytes, reqpage); ! } return rtval; } *************** *** 803,811 **** { int rtval; struct vnode *vp; vp = object->handle; ! return VOP_PUTPAGES(vp, m, count*PAGE_SIZE, sync, rtvals, 0); } --- 806,820 ---- { int rtval; struct vnode *vp; + int bytes = count * PAGE_SIZE; vp = object->handle; ! rtval = VOP_PUTPAGES(vp, m, bytes, sync, rtvals, 0); ! if (rtval == EOPNOTSUPP) { ! printf("vnode_pager: *** WARNING *** stale FS putpages\n"); ! rtval = vnode_pager_generic_putpages( vp, m, bytes, sync, rtvals); ! } ! return rtval; } ========================================================================= > > A tertiary reason is that if the code is in the defaultops vector, > > I can't know when it becomes safe to remove from vnode_pager.c > > and the defaultops vector. There's no way to measure usage of > > the defaultops code (that's kind of the point, really). > > OK, don't put it to defaultops vector, put vop_stdgetpages and > vop_stdputpages to vnops table of every affected filesystem. I suggest > it in every mail on this topic :-). Now you are basically griping only about naming. The problem with calling it standard is that *it's not supposed to be the standard, correct way of doing things*. The blunt fact is that the way it was being done is screwed up, and *something* needs to be done to unscrew it, and done in such a way as it doesn't screw the FS stacking at the same time. > One may say that my complains are more about decorative issues that > about real programming issues. Sorry. It is because all discussing > changes actually more decorative than real code. And as decorative, > they are IMHO bad. Also, unfortunately, most points in this discussion > were repeated several times... I don't know how many times I should have to repeat this, but the code *should not be in vnode_pager.c* and was, in fact *designed to not be in vnode_pager.c*. There is a discrepancy between design and implementation that we need to reconcile. If this counts as "decorative" in your book, so be it. I think our disagreement hinges more on order and granularity of implementation of this change than it does on whether or not the goal is reasonable. I think the goal is reasonable. If you agree, and don't like the granularity of implementation, then start implementing. I could use the help. Regards, Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message