Date: Mon, 8 Aug 2005 09:49:53 -0700
From: Marcel Moolenaar <marcel@xcllnt.net>
To: Doug Rabson <dfr@nlsystems.com>
Cc: cvs-src@FreeBSD.org, Marcel Moolenaar <marcel@FreeBSD.org>, cvs-all@FreeBSD.org, src-committers@FreeBSD.org
Subject: Re: cvs commit: src/sys/ia64/ia64 exception.S interrupt.c machdep.c mp_machdep.c pmap.c trap.c vm_machdep.c src/sys/ia64/include proc.h smp.h
Message-ID: <749ADAD2-6AF5-4412-A880-22812A2C634C@xcllnt.net>
In-Reply-To: <200508080911.32706.dfr@nlsystems.com>
References: <200508062028.j76KSJtM019032@repoman.freebsd.org> <200508070941.33821.dfr@nlsystems.com> <FAABF8ED-0FC7-4FBB-98D2-3A9F2618480F@xcllnt.net> <200508080911.32706.dfr@nlsystems.com>

On Aug 8, 2005, at 1:11 AM, Doug Rabson wrote:

>> I'd like to do is get a better sense of how critical it is if there's
>> a VHPT miss. Maybe we can implement the code that handles it in C,
>> use locks and open the doors to having various different hash bucket
>> implementations to play with. I still have my concerns about the
>> assembly in exception.S and the lack of locking therein. This in the
>> context of having spurious core dumps.
>
> If you make it a spin mutex, I think it might be possible to take the
> mutex from exception.s safely. The uses of this mutex should be
> extremely short (and collisions rare).

I made them spin mutexes already, for the reasons you mentioned.
I'll play with it a bit.

>> In parallel, I'm measuring the effect on performance of bumping up
>> the page size to 16K and 32K. I suspect the cost of a VHPT miss is
>> mostly due to us needing to find the PTE in the hash bucket by
>> walking a linked list. Keeping the average length of the list small
>> may improve our overall performance.
>>
>> Lots to learn...
>
> How about the effect of different VHPT sizes?

A larger VHPT does not necessarily improve performance. I think
I got the best results with a 64K VHPT in a 2GB machine. The
performance deltas were really small, but that might be due to
the particulars of the load I put onto the machine. The effects
on large databases may be better, for example.

> A long time ago I
> experimented with different ways of assigning region IDs to processes
> in an attempt to reduce collisions (and therefore reduce collision
> chain length). I think there still might be some mileage in that
> direction.

I think the algorithm is defined in the architecture specification.
It should be possible to analyze it and determine if we get a good
distribution. We probably get better results if we share translations
across processes. For this to work, we need to use the permission
keys so that we can assign different permissions per process without
having to create new translations for it.

-- 
Marcel Moolenaar         USPA: A-39004
marcel@xcllnt.net
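
To make the collision-chain point concrete, here is a minimal sketch in C of a hash bucket whose chain is walked under a spin lock. It is not the ia64 pmap or exception.S code: the names (vhpt_bucket, pte_entry, bucket_insert, bucket_lookup), the tag format, and the use of a C11 atomic_flag in place of a kernel spin mutex are all assumptions made for illustration. The lookup reports how many entries it had to visit, which is the cost that grows with chain length.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; not the FreeBSD/ia64 structures. */
struct pte_entry {
    uint64_t tag;             /* identifies the virtual page (assumed format) */
    uint64_t pte;             /* translation payload */
    struct pte_entry *next;   /* collision chain link */
};

struct vhpt_bucket {
    atomic_flag lock;         /* stand-in for a spin mutex */
    struct pte_entry *chain;  /* head of the collision chain */
    unsigned length;          /* number of chained entries */
};

static void bucket_lock(struct vhpt_bucket *b)
{
    /* Spin until the flag was previously clear; hold times should be short. */
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;
}

static void bucket_unlock(struct vhpt_bucket *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Push an entry onto the front of the bucket's chain. */
void bucket_insert(struct vhpt_bucket *b, struct pte_entry *e)
{
    bucket_lock(b);
    e->next = b->chain;
    b->chain = e;
    b->length++;
    bucket_unlock(b);
}

/*
 * Walk the chain looking for tag. Returns 1 and stores the PTE on a hit,
 * 0 on a miss. *steps receives the number of entries visited.
 */
int bucket_lookup(struct vhpt_bucket *b, uint64_t tag,
                  uint64_t *pte, unsigned *steps)
{
    int found = 0;
    unsigned visited = 0;

    bucket_lock(b);
    for (struct pte_entry *e = b->chain; e != NULL; e = e->next) {
        visited++;
        if (e->tag == tag) {
            *pte = e->pte;
            found = 1;
            break;
        }
    }
    bucket_unlock(b);

    *steps = visited;
    return found;
}
```

A bucket would be initialised as { ATOMIC_FLAG_INIT, NULL, 0 }. A miss always visits the whole chain, so anything that shortens chains (larger pages, a better spread of region IDs, shared translations) reduces the work done while the lock is held.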