Date: Fri, 29 Nov 1996 16:53:01 +1100
From: Bruce Evans <bde@zeta.org.au>
To: bde@zeta.org.au, toor@dyson.iquest.net
Cc: current@freebsd.org, phk@critter.tfs.com
Subject: Re: users of "ft" tapes, please test!
Message-ID: <199611290553.QAA15253@godzilla.zeta.org.au>

>I built the system with a de-inlined splx() and found an approx 10K savings.

This saves 20K out of 1096K here (I have a lot of rarely used drivers
and file systems in my kernel for testing).

>make the changes. It would be very surprising to see that an appropriately
>coded splvm/splimp/splxxx would be much smaller than the subroutine call...

Yes, it would be surprising :-). A function call with no args takes
5 bytes. Just referencing 2 different memory locations (cpl and
xxx_imask) takes a minimum of 10 bytes unless pointers to the locations
are kept in registers. A function call with args takes many more bytes,
but the inline code to handle the args is likely to take even more.

>Have you considered coding the splxxx inlines in tight asm? Would that help?

Yes. No. For the simplest case (splhigh()), inline asm can't possibly
be tighter than:

        movl    $_cpl,%eax              # 5 bytes
        movl    (%eax),%another_reg     # 2 bytes
        movl    $0xffffffff,(%eax)      # 6 bytes

Writing it in C allows generation of code like:

        movl    (%reg1),%reg2           # 2 bytes ($_cpl already in %reg1)
        movl    %reg3,(%reg1)           # 2 bytes ($0xffffffff already in %reg3)

gcc doesn't actually generate code like this. There usually aren't
enough registers, but gcc doesn't even generate it for:

        for (i = 0; i < 1000; ++i) {
                s = splhigh();
                foo();
                splx(s);
        }

gcc apparently thinks that loading address constants into registers is
a waste of time on x86's. It's right on x86's with no cache :-).

Bruce
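
The loop in the message can be tried outside the kernel with a small user-space model. This is only a sketch under stated assumptions: `cpl` here is an ordinary variable, `foo()`, `foo_calls` and `run_loop()` are hypothetical helpers, and these `splhigh()`/`splx()` bodies are simplified stand-ins, not the kernel's definitions. Compiling it with `gcc -O2 -S` and reading the assembly is one way to check whether the address of `cpl` is kept in a register across iterations.

```c
#include <stdint.h>

/* Stand-in for the kernel's current priority mask (assumption: a plain
 * 32-bit variable is enough to model it in user space). */
static volatile uint32_t cpl;

/* Hypothetical counter so the effect of foo() can be observed. */
static int foo_calls;

/* Simplified splhigh(): save the old mask, then block everything. */
static inline uint32_t splhigh(void)
{
    uint32_t old = cpl;
    cpl = 0xffffffffu;
    return old;
}

/* Simplified splx(): restore a previously saved mask.
 * Assumption: the kernel's splx() also deals with interrupts that
 * became pending while masked; that part is left out here. */
static inline void splx(uint32_t s)
{
    cpl = s;
}

/* Hypothetical placeholder for the work done at raised priority. */
static void foo(void)
{
    ++foo_calls;
}

/* The loop from the message, wrapped in a function; returns the mask
 * that is in effect after the last iteration. */
static uint32_t run_loop(int n)
{
    uint32_t s;
    int i;

    for (i = 0; i < n; ++i) {
        s = splhigh();
        foo();
        splx(s);
    }
    return cpl;
}
```

Each iteration reads and writes `cpl` twice, so this is the case where keeping `$_cpl` in a register would pay off most if the compiler chose to do it.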