Date: Sun, 29 May 2005 11:36:07 -0600
From: Scott Long <scottl@samsco.org>
To: "M. Warner Losh" <imp@bsdimp.com>
Cc: "Justin T. Gibbs" <gibbs@scsiguy.com>, arch@freebsd.org, nyan@jp.FreeBSD.org
Subject: Re: [RFC] remove bus_memio.h and bus_pio.h
Message-ID: <4299FD87.1000505@samsco.org>
In-Reply-To: <20050525.111945.41668351.imp@bsdimp.com>
References: <20050525.212009.71136852.nyan@jp.FreeBSD.org> <20050525.111945.41668351.imp@bsdimp.com>
M. Warner Losh wrote:
> In message: <20050525.212009.71136852.nyan@jp.FreeBSD.org>
>             Takahashi Yoshihiro <nyan@jp.FreeBSD.org> writes:
> : The bus_memio.h and bus_pio.h headers, which exist for a
> : micro-optimization, depend on the implementation of bus_space on
> : i386 and amd64, so they are meaningless files on the other archs.
> : I'd like to remove an MD part like this from MI drivers at least.
> :
> : I think that the performance gain from this method is very
> : trivial on recent machines.  If there is no strong objection, I'll
> : remove bus_{mem,p}io.h and related code from all archs.
> :
> : Comments?
>
> Short answer:
>
>     Great idea.  aac and bfe should be tested after the change to
>     see if there is any benefit for them.  Other drivers almost
>     certainly will see no benefit from this.
>
> Longer, more detailed answer.
>
> The original idea was to provide a hint to bus_space that this driver
> only ever used a certain subset of the available mappings, so it should
> assume that subset and aggressively optimize the code.  The assumption
> was that one could know at compile time that one would never use
> certain features.  In an i386-centric world, this made good sense,
> especially since the bus_space_* macros expanded to inb or whatever
> and nothing else (compiler technology innovations may have changed
> this over time).
>
> You are correct in that other architectures might have more than two
> kinds of address space, or might have other complicating factors.  pc98
> has, as you know, an indirect vector because devices on the
> motherboard and cbus are rarely mapped at contiguous locations due to
> the dual 8-bit bus nature of the internal buses.  In that case it
> makes no sense to do any optimization at all, and these files should
> be empty for such an implementation.
>
> Alpha, sparc64, powerpc and arm all have much more complex bus space
> implementations due to their greater intra-architectural differences,
> as well as their large differences from i386.  To similarly optimize
> these architectures, one would need additional MD info to know how to
> inline things.  None of them have chosen to support this level of
> optimization.  It is unclear to me how big a win such optimization
> would be, even on the slower CPUs some of these platforms support.
>
> The lowest end of FreeBSD/i386 these days[*] is likely a Pentium II
> running at 300MHz or a soekris box.  The 4510 box is still only
> 166MHz.  However, the only device it has that is likely to
> benefit from this is sio.  Well, in extreme cases, one could make the
> case for any pci card or pccard, but I think that's too extreme to
> consider.  Since the soekris box has only one free serial port, we
> need only keep up with a ppp connection on that serial port, so I'm
> pretty sure we're OK.
>
> A number of drivers include only one of these two include files:
>     ti, bfe, trm, stg, scd, aac, kbd, ie, idt, hfa, gfb, fb, dpt,
>     cnw, aic, aha, ahb, adv
> and some of the mii phy drivers, plus some other trivial uses.
>
> The only ones on the list that stand out are bfe and aac (the dpt
> optimization is only for EISA cards, and only for the EISA-specific
> portions of the driver).  I do not know how much this optimization
> helps these devices, but they are the only ones that I see might be
> affected.  Simple benchmarks should be easy enough to do on aac and
> bfe.
>
> Warner
>
> [*] Yes, I know that slower CPUs are supported, and do still perform
> decently if you have enough memory.  This is an arbitrary cutoff for
> the cost-benefit analysis.

This kind of makes me sad.  I don't see how this was harming anything;
it just wasn't documented, so people didn't know how to use it.  If it
didn't apply outside i386 and amd64, fine, just don't implement it for
those platforms.  This optimization might have seemed trivial, but it's
all of the little trivial optimizations that add up to make a nice
system.  I'm guessing that Justin only put effort into this originally
because he did see a benefit; discounting it without doing any testing
of your own is a bit disingenuous.

Scott
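
For readers unfamiliar with the mechanism under discussion, the compile-time hint Warner describes in the quoted text can be sketched roughly as follows. This is an illustration only, not the contents of the real bus_memio.h, bus_pio.h or bus.h: every name prefixed with demo_ or DEMO_ is hypothetical, and port I/O is simulated with an array because a real inb needs I/O privilege. The idea is that a driver which promises to use only one kind of mapping gets an accessor with no run-time check of the tag, while a driver that makes no promise pays for a branch on every access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag values; the real i386 constants are not reproduced here. */
enum demo_bus_tag { DEMO_BUS_SPACE_IO = 0, DEMO_BUS_SPACE_MEM = 1 };

#define DEMO_NPORTS 16
static uint8_t demo_ports[DEMO_NPORTS];	/* stand-in for the I/O port space */

/* Stand-in for inb(): reads the simulated port table instead of hardware. */
static inline uint8_t
demo_inb(uintptr_t port)
{
	assert(port < DEMO_NPORTS);
	return (demo_ports[port]);
}

/* Memory-mapped access: a plain volatile load from handle + offset. */
static inline uint8_t
demo_memread(uintptr_t addr)
{
	return (*(volatile uint8_t *)addr);
}

#if defined(DEMO_BUS_PIO_ONLY)
/* Driver promised port I/O only: the tag is ignored, no branch is emitted. */
#define	demo_bus_read_1(t, h, o)	((void)(t), demo_inb((h) + (o)))
#elif defined(DEMO_BUS_MEMIO_ONLY)
/* Driver promised memory-mapped I/O only: again no branch on the tag. */
#define	demo_bus_read_1(t, h, o)	((void)(t), demo_memread((h) + (o)))
#else
/* No hint given: the tag must be tested at run time on every access. */
#define	demo_bus_read_1(t, h, o)					\
	((t) == DEMO_BUS_SPACE_IO ? demo_inb((h) + (o)) :		\
	    demo_memread((h) + (o)))
#endif
```

A driver source file would define DEMO_BUS_PIO_ONLY or DEMO_BUS_MEMIO_ONLY before this point to select a specialized accessor, which is the role the thread attributes to including only one of the two headers. Whether dropping the branch is measurable on a given driver is exactly the open question in the thread; the sketch makes no claim either way.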