Date: Fri, 29 Nov 1996 16:53:01 +1100
From: Bruce Evans <bde@zeta.org.au>
To: bde@zeta.org.au, toor@dyson.iquest.net
Cc: current@freebsd.org, phk@critter.tfs.com
Subject: Re: users of "ft" tapes, please test!
Message-ID: <199611290553.QAA15253@godzilla.zeta.org.au>
>I built the system with a de-inlined splx() and found an approx 10K savings.
This saves 20K out of 1096K here (I have a lot of rarely used drivers and
file systems in my kernel for testing).
>make the changes. It would be very surprising to see that an appropriately
>coded splvm/splimp/splxxx would be much smaller than the subroutine call...
Yes, it would be surprising :-). A function call with no args takes 5
bytes. Just referencing 2 different memory locations (cpl and xxx_imask)
takes a minimum of 10 bytes unless pointers to the locations are kept
in registers. A function call with args takes many more bytes but the
inline code to handle the args is likely to take even more.
>Have you considered coding the splxxx inlines in tight asm? Would that help?
Yes. No. For the simplest case (splhigh()), inline asm can't possibly be
tighter than:
	movl	$_cpl,%eax		# 5 bytes
	movl	(%eax),%another_reg	# 2 bytes
	movl	$0xffffffff,(%eax)	# 6 bytes
Writing it in C allows generation of code like:
	movl	(%reg1),%reg2		# 2 bytes ($_cpl already in %reg1)
	movl	%reg3,(%reg1)		# 2 bytes ($0xffffffff already in %reg3)
gcc doesn't actually generate code like this. There usually aren't enough
registers, but gcc doesn't even generate it for:
	for (i = 0; i < 1000; ++i) {
		s = splhigh();
		foo();
		splx(s);
	}
gcc apparently thinks that loading address constants into registers is a
waste of time on x86's. It's right on x86's with no cache :-).
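
For readers who want to see the C side of the comparison, here is a
minimal user-space sketch of the splhigh()/splx() pair discussed above.
It is only a model under stated assumptions: the model_* names and the
demo function are hypothetical, cpl is an ordinary variable standing in
for the kernel's mask, and the real splx() presumably does more than a
plain store (such as dealing with interrupts that became pending), which
is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's current priority level mask. */
static volatile uint32_t cpl;

/*
 * Model of splhigh(): one load of cpl and one store of all-ones.
 * These are the two memory accesses counted in the asm above.
 */
static inline uint32_t
model_splhigh(void)
{
	uint32_t old = cpl;

	cpl = 0xffffffffu;
	return (old);
}

/* Model of splx(): restore the saved mask (assumed to be a plain store). */
static inline void
model_splx(uint32_t s)
{
	cpl = s;
}

/* Run one protected section and return the mask seen afterwards. */
uint32_t
model_demo(uint32_t initial)
{
	uint32_t s;

	cpl = initial;
	s = model_splhigh();
	assert(cpl == 0xffffffffu);	/* everything masked inside the section */
	model_splx(s);
	return (cpl);
}
```

Compiling this with optimization and looking at the generated code for
model_demo() is one way to check whether a given compiler keeps the
address of cpl in a register or reloads it as an immediate each time.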
Bruce
