Date:      Mon, 03 Jan 2005 20:58:50 +0100
From:      Søren Schmidt <sos@DeepCore.dk>
To:        Nate Lawson <nate@root.org>
Cc:        FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: ATA rman performance enhancement
Message-ID:  <41D9A3FA.20204@DeepCore.dk>
In-Reply-To: <41D9A2CF.30704@root.org>
References:  <41D984A5.7010408@root.org> <41D99FBD.8070500@DeepCore.dk> <41D9A2CF.30704@root.org>

Nate Lawson wrote:
> Søren Schmidt wrote:
> 
>> Nate Lawson wrote:
>>
>>> While doing some benchmarking of other code, I noticed that there 
>>> were a lot of calls to rman_get_bustag/handle().  They weren't taking 
>>> up much actual time since they're pretty lightweight but seemed to be 
>>> unnecessary.
>>>
>>> I worked up the attached diff and benchmarked it.  There are about 
>>> 1000 calls a second to the rman routines without this patch and 
>>> essentially none with it.  It makes about a 1% difference in 
>>> throughput under some IO loads.  It is only for non-Promise or 
>>> non-SII controllers right now since I didn't extend the 
>>> initialization step to more than ata-pci.c. The same approach could 
>>> be used for the other INW/OUTW calls as well but they're not in the 
>>> fast path.  I think it may make more of a difference with small reads.
>>
>> I had something similar to this once, way back, but since HW got a lot 
>> faster I couldn't measure it anymore. Maybe things have changed...
>>
>> Anyhow, it needs to be applied to ata-isa.c, ata-card.c, ata-cbus.c, etc. 
>> as well, at the very least so that nothing breaks.
> 
> Yes, I agree.  I limited the change only to the IDX macros that were 
> used in the fast path although it could apply to all of them.

It *must* be applied to those files; otherwise the handle & tag fields 
are not initialized when using non-PCI devices with your patch.
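
To make the concern concrete, here is a minimal user-space sketch in C. It
is not the actual patch and does not use the real rman(9) or bus_space(9)
API: every mock_* name, the chan_* helpers and the "cached" flag are
hypothetical stand-ins. It only illustrates the idea under discussion:
look up the tag and handle once at attach time, reuse the cached copies in
the fast path, and note that any front end that skips the init step leaves
those fields unset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; NOT the real rman(9) API. */
typedef uintptr_t mock_tag_t;
typedef uintptr_t mock_handle_t;

struct mock_resource {
    mock_tag_t    tag;
    mock_handle_t handle;
};

/* Counts how often the accessor routines are called. */
static unsigned long accessor_calls;

static mock_tag_t
mock_get_bustag(const struct mock_resource *r)
{
    accessor_calls++;
    return r->tag;
}

static mock_handle_t
mock_get_bushandle(const struct mock_resource *r)
{
    accessor_calls++;
    return r->handle;
}

/* Pretend register read; just combines its arguments. */
static uint32_t
mock_bus_read(mock_tag_t t, mock_handle_t h, unsigned off)
{
    return (uint32_t)(t + h + off);
}

struct mock_channel {
    struct mock_resource *res;
    mock_tag_t            tag;     /* cached copy, set at attach time */
    mock_handle_t         handle;  /* cached copy, set at attach time */
    int                   cached;  /* nonzero once the copies are valid */
};

/* Assumption: every bus front end calls this once when it attaches. */
static void
chan_cache_init(struct mock_channel *ch)
{
    ch->tag = mock_get_bustag(ch->res);
    ch->handle = mock_get_bushandle(ch->res);
    ch->cached = 1;
}

/* Slow variant: two accessor calls on every register access. */
static uint32_t
chan_read_uncached(struct mock_channel *ch, unsigned off)
{
    return mock_bus_read(mock_get_bustag(ch->res),
        mock_get_bushandle(ch->res), off);
}

/* Fast variant: no accessor calls, but only valid after chan_cache_init(). */
static uint32_t
chan_read_cached(struct mock_channel *ch, unsigned off)
{
    assert(ch->cached);  /* trips if a front end forgot the init step */
    return mock_bus_read(ch->tag, ch->handle, off);
}
```

In this sketch a channel set up without chan_cache_init() still has zeroed
tag and handle fields, which is the failure mode described above for the
attachments the patch does not touch.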

>> I'll think about it and eventually do something about it in ATA-mkIII 
>> if it really is measurable again.
> 
> It is a minimal difference but with small transfers on a fast drive with 
> a low-powered CPU, it's measurable.  I made the same change a while back 
> in acpi_timer.c since the resource is only allocated once but accessed 
> very often and in a low-latency context.

Right, but I'd like to catch the much bigger fish first before doing 
micro-optimizations; that's what ATA-mkIII is all about in the first 
place. However, I'll put it back on the TODO list as something to look 
into when the time comes...

-- 

-Søren



