Date: Wed, 26 Jun 2002 00:58:25 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: Peter Wemm <peter@wemm.org>, Alfred Perlstein <bright@mu.org>, Patrick Thomas <root@utility.clubscholarship.com>, freebsd-hackers@FreeBSD.ORG
Subject: Re: tunings for many httpds...
Message-ID: <200206260758.g5Q7wPgZ019100@apollo.backplane.com>
References: <20020625222632.B7C7D3811@overcee.wemm.org> <3D1970E7.697D4A49@mindspring.com>
    Hmm.  I'm fairly sure that Linux does not quite do it that way.  I
    believe the 2-level page tables are copy-on-write, but that only gives
    you shareability across a fork(), and then only for a little while.
    I'm fairly certain that Linux cannot share page tables after post-fork
    modifications (such as when you mmap() something or attach a SysV
    shared segment).  The rmap patches are roughly equivalent to our i386
    pmap code and allow Rik to implement page queues and proper page
    aging.

					-Matt
					Matthew Dillon
					<dillon@backplane.com>

:Peter Wemm wrote:
:> > Even more importantly, it would be nice if we could share compatible
:> > pmap pages; then we would have no need for 4MB pages... 50 mappings
:> > of the same shared memory segment would wind up using the same pmap
:> > pages as if only one mapping had been made.  Such a feature would work
:> > for SysV shared memory and for mmap()s.  I've looked at doing this
:> > off and on for two years but do not have a sufficient chunk of time
:> > available yet.
:>
:> SVR4/Solaris/Digital Unix^H^H^H^H^H^HTru64 do this by having an additional
:> layer between the VM system and the pmap.  The equivalent of our pmap is
:> just another one of the address space handlers.  The SHM code is often
:> implemented such that it grabs blocks of 4MB address space to manage in
:> a way that it likes.  This means it constructs its own page tables in
:> such a way that they are suitable for common use.  *If* I recall
:> correctly, in SunOS/SVR4/Solaris parlance this is the segment layer.
:> Naturally there is quite a bit of variation; it has been a long, long
:> time since I looked at this.
:
:Linux 2.4 has this with their "rmap" patch.  Alan Cox describes the
:VM system's performance as "similar to what I see with FreeBSD".
:
:They are coming from a perspective of sharing all page mappings
:by pointing them at the same entries, without a reverse lookup
:mechanism (the reverse lookup is what the "rmap" patches add;
:Linux has always shared equivalent page mappings).
:
:The reverse lookup maintains a linked list (for some reason, the
:entries are 12 bytes -- I don't know why yet) of the PTE references
:to the mapping.  So a reverse lookup means going backwards and doing
:a linear list traversal if the pages are shared (they usually are,
:e.g. for the code pages of any program that's running more than one
:instance).
:
:For page waits, they use a shared hash and then wake up some processes
:unnecessarily, but they expect the contention to be minimal (they
:estimate 4-8% overhead under extreme load with the quantum at 100ms).
:
:Doing this in FreeBSD would probably confuse the heck out of the
:existing page discard code's LRU determination (among other things),
:but it's probably worth it for the cases you've mentioned.  I think
:the extra overhead in the unloaded case is in the noise, and in the
:loaded case it is well worth the trade.
:
:I don't know where the PAE code you were rumored to be working on
:stands; if the plan was to put the PTEs for a process in the same
:memory bank as the program pages running there, then sharing page
:tables might prevent that from working very well, since those entries
:would have to be shared with entries in another bank.
:
:-- Terry
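
    To make the page-table sharing idea above concrete, here is a minimal
    sketch assuming i386 two-level paging.  The names (shared_pt_segment,
    pmap_install_shared_pt) are invented for illustration; this is not
    real FreeBSD pmap code.  The point is that if a shared segment sits
    on a 4MB boundary in every address space that maps it, each process's
    page-directory entry can reference the same physical page-table page:

	/*
	 * On i386, each page-directory entry (PDE) maps a 4MB-aligned
	 * region through one page-table page of 1024 PTEs.  If every
	 * mapping of a shm segment is placed on a 4MB boundary, all of
	 * the mapping address spaces can share one page-table page.
	 */
	#include <stdint.h>

	#define PDE_SHIFT	22		/* 4MB regions on i386 */
	#define PDE_MASK	((1u << PDE_SHIFT) - 1)
	#define PG_V		0x001u		/* valid */
	#define PG_RW		0x002u		/* writable */
	#define PG_U		0x004u		/* user accessible */

	struct shared_pt_segment {
		uintptr_t pt_page_pa;	/* phys addr of shared PT page */
		int	  refcount;	/* address spaces referencing it */
	};

	/*
	 * Map 'seg' at user virtual address 'va' in page directory
	 * 'pdir'.  Returns 0 on success, -1 if 'va' is misaligned.
	 */
	static int
	pmap_install_shared_pt(uint32_t *pdir, uintptr_t va,
	    struct shared_pt_segment *seg)
	{
		if (va & PDE_MASK)
			return (-1);	/* must sit on a 4MB boundary */

		/* Every process's PDE points at the one shared PT page. */
		pdir[va >> PDE_SHIFT] =
		    (uint32_t)(seg->pt_page_pa | PG_V | PG_RW | PG_U);
		seg->refcount++;
		return (0);
	}

    Teardown would decrement the refcount and free the page-table page
    when it reaches zero; per-process wired-page accounting and TLB
    invalidation are omitted here.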
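
    Similarly, a minimal sketch of the reverse mapping Terry describes:
    each physical page keeps a chain of back-pointers to the PTEs that
    map it, so the pageout code can find and invalidate every mapping
    without scanning address spaces.  The pte_chain name is modeled
    loosely on Rik's patch but the layout here is simplified and
    hypothetical (the real entries are the 12-byte ones noted above):

	#include <stddef.h>
	#include <stdint.h>

	typedef uint32_t pte_t;

	struct pte_chain {
		struct pte_chain *next;
		pte_t		 *ptep;	/* a PTE that maps this page */
	};

	struct phys_page {
		struct pte_chain *rmap;	/* head of reverse-map chain */
	};

	/*
	 * Record that *ptep now maps 'pg'; called when a mapping is
	 * entered.  'node' is caller-allocated (allocation omitted).
	 */
	static void
	rmap_add(struct phys_page *pg, pte_t *ptep, struct pte_chain *node)
	{
		node->ptep = ptep;
		node->next = pg->rmap;
		pg->rmap = node;
	}

	/*
	 * Invalidate every mapping of 'pg': the linear walk Terry
	 * mentions.  Widely shared pages make this chain long.
	 */
	static void
	rmap_unmap_all(struct phys_page *pg)
	{
		struct pte_chain *pc;

		for (pc = pg->rmap; pc != NULL; pc = pc->next)
			*pc->ptep = 0;	/* TLB shootdown omitted */
		pg->rmap = NULL;
	}

    The cost Terry points out shows up in rmap_unmap_all(): a page
    mapped by N processes requires an O(N) chain walk, which is the
    price of being able to find all mappings of a physical page at all.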