Date:      Fri, 16 May 1997 10:34:27 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        james@westongold.com (James Mansion)
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: mmap()
Message-ID:  <199705161734.KAA17584@phaeton.artisoft.com>
In-Reply-To: <337C3FAE.4295@westongold.com> from "James Mansion" at May 16, 97 12:06:22 pm

> > so John would have to make a decision to accept the non-0 degradation
> > on that basis.
> 
> I take issue with moans about 'it costs more than zero' and
> being so defensive about a few cycles.  It's not the first time this
> attitude has been apparent on this list.

[ ... ]

> I know that VM pagein code is used a lot, but how much of the total of
> (say) ftp.cdrom.com is in this code?  Say 5%?  Heck, you could add 20%
> to the code path before you get a 'measurable' difference of 1%.  (And
> do I believe it's 5%?  No.  It'll be less than this).
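
The arithmetic in the quoted estimate can be checked directly.  What
follows is a minimal sketch, assuming the 5% share and the 20% path
growth given in the quote (illustrative figures from the message, not
measurements).  The function name total_slowdown() is hypothetical.

```c
/*
 * Relative growth in total run time when one code path that accounts
 * for `share` of the total (0.0 to 1.0) becomes `growth` longer
 * (0.20 means 20% longer).  Hypothetical helper, for illustration only.
 */
double total_slowdown(double share, double growth)
{
    return share * growth;
}
```

With share = 0.05 and growth = 0.20 the result is 0.01, which is the
1% figure in the quote.  A smaller share shrinks the effect in
proportion.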

[ ... ]

I am not opposed to this change.

I am suggesting that you write the code.

I am also cautioning you as to who will have the final say on whether
or not your code is integrated, and the issues he may be looking at
after having been beaten up on various performance issues vis-a-vis
Linux for about three years now (making that particular code path a
bit sensitive).


> It's much more likely in my view that the kind of graph closure coding
> that you propose for a fine grained SMP system will have a measurable
> impact.

You're right.  It will.  It will make the UP kernel preemptive, and
then FreeBSD will get the same 60% improvement in I/O latency that
UnixWare got when we did it at USL.  Only FreeBSD will probably be
better than 60%, because it will use Soft Updates instead of Delayed
Ordered Writes within the concurrency graph.  And it will have the
effect of reducing interprocessor synchronization (whether by IPI
or by MESI hardware coherency) and therefore FreeBSD will be SMP
scalable to more processors than UnixWare before bus effects make it
cost-ineffective to add more processors.


Not that this has anything to do with predictive fault-ahead in a VM
system.


> Please, people, be reasonable in your opposition to extending code
> paths.

Again, I'm not opposed.  Feel free to implement your code changes, and
after doing so, feel free to submit them with as flimsy or as robust
a set of performance tradeoff benchmarks as you see fit.

It's not like I'll have any part in the decision to adopt or not adopt
the code.


The real problem I have is that you seem to be asking someone else to
do the code.  It would be a different story if you had done the code,
and were simply asking that it be integrated (I happen to have this
particular problem myself).


> Well, clearly you have to check to see if the flag is set, but then
> you've got a whole bunch of flags to check anyway.
> 
> Not sure why you need to save the last value - why not just page in
> multiple pages right away?  If the latency for getting a subsequent
> page is low, this will probably be cheaper than trying to set up an
> async request.
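
The "save the last value" idea questioned in the quote can be sketched
as a small sequential-fault detector.  This is only an assumption about
what such bookkeeping might look like: struct fault_state and
pages_to_fetch() are hypothetical names, not FreeBSD kernel interfaces,
and a real pager would also have to deal with mapping bounds and I/O
clustering.

```c
#include <stddef.h>

/* Hypothetical per-mapping state; not a FreeBSD kernel structure. */
struct fault_state {
    size_t next_expected;  /* page index just past the last pages fetched */
    int    valid;          /* nonzero once a fault has been recorded */
};

/*
 * Return how many pages to bring in for a fault on page index `page`.
 * If the fault lands exactly where the previous fetch ended, treat the
 * access as sequential and fetch `window` pages; otherwise fetch 1.
 */
size_t pages_to_fetch(struct fault_state *st, size_t page, size_t window)
{
    size_t n = 1;

    if (window > 0 && st->valid && page == st->next_expected)
        n = window;

    /* Remember where a sequential reader would fault next. */
    st->next_expected = page + n;
    st->valid = 1;
    return n;
}
```

The alternative proposed in the quote, paging in several pages
unconditionally whenever the flag is set, would drop the state and
always return `window`.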

Because MADV_SEQUENTIAL is the wrong flag to use for this.  Really,
what the flag does is deprioritize the pages behind the current access.
Arguably, this should be named something else, and MADV_SEQUENTIAL
should be reserved for meaning what you apparently want it to mean...
it's broken to deprioritize the pages if your sequential access is
(for example) occurring on anything more than 4k blocks.


> > No.  You'd think data pages would, though.  Data pages might benefit from
> > this optimization as well, so you may be able to sell it that way.  It's
> > just that your case is odd, so it's not likely to attract a lot of effort
> > to optimize.  I'd suggest you make the changes yourself, if you can, and
> > then see what it does as far as performance.
> 
> (Can't, I'm afraid, owing to hardware failure at the moment.  Not to
> mention imminent arrival of triplets.)

Well, if you are asking someone else to do the code, you need to ask
John Dyson, directly.  Or on -current, not on -hackers.


> I would have thought that code pages would benefit too, especially if
> you have an opportunity to perform reordering of the functions so that
> there is good locality of reference.  Admittedly, ld doesn't do this now.

Well, you are assuming predictive forward locality, actually.  So ld is
not the only thing that would be involved... the compiler code generator
ought to generate function blocks out of order relative to their physical
location in the source file, ideally, if this were the case.  Alternately,
functions should get their own ELF sections and always be PIC so that the
sections could be reordered for page preference.  Even then, you are at
best engaged in statistical branch path prediction... when do you turn
"learning" off?  It's like the problem with back-propagation neural nets
once they've been trained.  You have to reset the data state for each
sample so that the training is not adversely affected by the output no
longer being clamped; so when do you "clamp" the object order?
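
As an illustration of the "own ELF sections" idea, the sketch below
places two functions in separately named sections.  It assumes GCC or
Clang targeting ELF; the section names and function names are
hypothetical.  GCC's -ffunction-sections option does the same thing
automatically for every function, and -fPIC requests position-
independent code.  Deciding the final order of the sections is left to
the linker (for example, through a linker script) and is not shown.

```c
/*
 * Assumes GCC or Clang on an ELF target.  The section names below are
 * made up for this example; a linker script would decide where each
 * section ends up relative to the others.
 */
__attribute__((section(".text.hot_path")))
int parse_args(int argc)
{
    return argc > 1;    /* called on every run */
}

__attribute__((section(".text.cold_path")))
int report_error(int code)
{
    return -code;       /* called rarely */
}
```

Grouping the frequently called functions into adjacent sections is what
would let them share pages.  It does not answer the question of when
the chosen order should be frozen.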


> I think this is relevant, though Win32 targeted:
> http://www.cs.washington.edu/homes/bershad/etch/index.html
> 
> I'm sure it's not the only such system.

No.  The University of Utah reference I gave in the CLUSTERING discussion
is similar, but for UNIX and UNIX-like systems (thanks for the pointer
to this one, though).


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


