Date:      Mon, 8 Aug 2005 09:49:53 -0700
From:      Marcel Moolenaar <marcel@xcllnt.net>
To:        Doug Rabson <dfr@nlsystems.com>
Cc:        cvs-src@FreeBSD.org, Marcel Moolenaar <marcel@FreeBSD.org>, cvs-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: cvs commit: src/sys/ia64/ia64 exception.S interrupt.c machdep.c mp_machdep.c pmap.c trap.c vm_machdep.c src/sys/ia64/include proc.h smp.h
Message-ID:  <749ADAD2-6AF5-4412-A880-22812A2C634C@xcllnt.net>
In-Reply-To: <200508080911.32706.dfr@nlsystems.com>
References:  <200508062028.j76KSJtM019032@repoman.freebsd.org> <200508070941.33821.dfr@nlsystems.com> <FAABF8ED-0FC7-4FBB-98D2-3A9F2618480F@xcllnt.net> <200508080911.32706.dfr@nlsystems.com>

On Aug 8, 2005, at 1:11 AM, Doug Rabson wrote:

>> What I'd like to do is get a better sense of how critical it is if
>> there's a VHPT miss. Maybe we can implement the code that handles it
>> in C, use locks, and open the doors to having various different hash
>> bucket implementations to play with. I still have my concerns about
>> the assembly in exception.S and the lack of locking therein. This is
>> in the context of having spurious core dumps.
>>
>
> If you make it a spin mutex, I think it might be possible to take the
> mutex from exception.S safely. The uses of this mutex should be
> extremely short (and collisions rare).

I made them spin mutexes already, for the reasons you mentioned. I'll
play with it a bit.
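
To make the idea concrete, here is a rough userland sketch of a hash
bucket whose collision chain is protected by a spin lock. It is not the
code in the tree: struct pte_entry, struct bucket and the bucket_*
functions are hypothetical names, and a C11 atomic_flag stands in for a
kernel spin mutex. The lookup also reports how many chain entries it
visited, since that is the cost being discussed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a translation chained off a hash bucket. */
struct pte_entry {
    uint64_t tag;               /* identifies the virtual page */
    uint64_t pte;               /* translation payload */
    struct pte_entry *next;     /* next entry in the collision chain */
};

/* Hypothetical bucket: a spin lock plus the head of the chain. */
struct bucket {
    atomic_flag lock;           /* stands in for a kernel spin mutex */
    struct pte_entry *head;
    unsigned length;            /* current chain length */
};

static void bucket_init(struct bucket *b)
{
    atomic_flag_clear(&b->lock);
    b->head = NULL;
    b->length = 0;
}

static void bucket_lock(struct bucket *b)
{
    /* Busy-wait; acceptable only because the critical sections are short. */
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;
}

static void bucket_unlock(struct bucket *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Push a new entry onto the front of the chain. */
static void bucket_insert(struct bucket *b, struct pte_entry *e)
{
    assert(e != NULL);
    bucket_lock(b);
    e->next = b->head;
    b->head = e;
    b->length++;
    bucket_unlock(b);
}

/*
 * Walk the chain looking for tag. Returns 1 and stores the translation in
 * *pte on a hit, 0 on a miss. *steps receives the number of entries visited.
 */
static int bucket_lookup(struct bucket *b, uint64_t tag, uint64_t *pte,
    unsigned *steps)
{
    int found = 0;
    unsigned n = 0;

    bucket_lock(b);
    for (struct pte_entry *e = b->head; e != NULL; e = e->next) {
        n++;
        if (e->tag == tag) {
            *pte = e->pte;
            found = 1;
            break;
        }
    }
    bucket_unlock(b);
    *steps = n;
    return found;
}
```

The lock is held only for the duration of one chain walk, which is why
the hold time stays short as long as the chains do.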

>> In parallel, I'm measuring the effect on performance of bumping up
>> the page size to 16K and 32K. I suspect the cost of a VHPT miss is
>> mostly due to us needing to find the PTE in the hash bucket by
>> walking a linked list. Keeping the average length of the list small
>> may improve our overall performance.
>>
>> Lots to learn...
>>
>
> How about the effect of different VHPT sizes?

A larger VHPT does not necessarily improve performance. I think I got
the best results with a 64K VHPT in a 2GB machine. The performance
deltas were really small, but that might be due to the particulars of
the load I put onto the machine. The effect may be more pronounced for
large databases, for example.

> A long time ago I
> experimented with different ways of assigning region IDs to processes
> in an attempt to reduce collisions (and therefore reduce collision
> chain length). I think there still might be some mileage in that
> direction.

I think the algorithm is defined in the architecture specification. It
should be possible to analyze it and determine if we get a good
distribution.
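
One way to do that kind of analysis offline is to feed a set of region
IDs and page numbers through the hash and count how many entries land
in each bucket. The sketch below does this with a toy hash; toy_hash is
an assumption for illustration only and is NOT the function from the
architecture specification, and NBUCKETS, struct chain_stats and
measure_chains are likewise hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 1024u          /* hypothetical table size, power of two */

/* Toy stand-in hash; replace with the real function to get real numbers. */
static uint32_t toy_hash(uint32_t rid, uint64_t vpn)
{
    return (uint32_t)((vpn ^ rid) & (NBUCKETS - 1));
}

struct chain_stats {
    unsigned used;              /* buckets holding at least one entry */
    unsigned longest;           /* length of the longest collision chain */
};

/*
 * Map pages 0..pages_per_rid-1 for every region ID in rids[] and report
 * how the resulting entries spread over the buckets.
 */
static struct chain_stats measure_chains(const uint32_t *rids, size_t nrids,
    uint64_t pages_per_rid)
{
    static unsigned count[NBUCKETS];
    struct chain_stats s = { 0, 0 };

    assert(rids != NULL);
    for (size_t i = 0; i < NBUCKETS; i++)
        count[i] = 0;

    for (size_t r = 0; r < nrids; r++)
        for (uint64_t vpn = 0; vpn < pages_per_rid; vpn++)
            count[toy_hash(rids[r], vpn)]++;

    for (size_t i = 0; i < NBUCKETS; i++) {
        if (count[i] > 0)
            s.used++;
        if (count[i] > s.longest)
            s.longest = count[i];
    }
    return s;
}
```

With this toy hash, four processes given region IDs 0, 1, 2 and 3 and
mapping 16 pages each pile into 16 buckets with chains of length 4,
while region IDs 0, 16, 32 and 48 spread over 64 buckets with no
collisions. The real hash will behave differently, but the same harness
shows whether a given region ID assignment policy distributes well.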

We probably get better results if we share translations across
processes. For this to work, we need to use the permission keys so that
we can assign different permissions per process without having to create
new translations for it.

-- 
  Marcel Moolenaar         USPA: A-39004          marcel@xcllnt.net




