Date:      Fri, 29 Nov 1996 16:53:01 +1100
From:      Bruce Evans <bde@zeta.org.au>
To:        bde@zeta.org.au, toor@dyson.iquest.net
Cc:        current@freebsd.org, phk@critter.tfs.com
Subject:   Re: users of "ft" tapes, please test!
Message-ID:  <199611290553.QAA15253@godzilla.zeta.org.au>

>I built the system with a de-inlined splx() and found an approx 10K savings.

This saves 20K out of 1096K here (I have a lot of rarely used drivers and
file systems in my kernel for testing).

>make the changes.  It would be very surprising to see that an appropriately
>coded splvm/splimp/splxxx would be much smaller than the subroutine call...

Yes, it would be surprising :-).  A function call with no args takes 5
bytes.  Just referencing 2 different memory locations (cpl and xxx_imask)
takes a minimum of 10 bytes unless pointers to the locations are kept
in registers.  A function call with args takes many more bytes but the
inline code to handle the args is likely to take even more.
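
To make the "2 different memory locations" point concrete, here is a rough
user-space sketch of what an inlined splxxx() has to do.  The variable
definitions and the mask value are assumptions for illustration only, not
the kernel's actual definitions:

```c
#include <stdint.h>

/* Stand-ins for the kernel globals named above (assumed types and values). */
static volatile uint32_t cpl = 0;           /* currently masked interrupt bits */
static volatile uint32_t xxx_imask = 0x0c;  /* hypothetical mask for one interrupt class */

/* Sketch of an inlined splxxx(): returns the old level and raises cpl. */
static inline uint32_t splxxx(void)
{
    uint32_t old = cpl;      /* first memory reference: load cpl */
    cpl = old | xxx_imask;   /* second: load xxx_imask, then store cpl */
    return old;
}
```

Each absolute-address reference carries a 4-byte address in the instruction,
so every inlined copy already costs at least as much as two 5-byte calls.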

>Have you considered coding the splxxx inlines in tight asm?  Would that help?

Yes.  No.  For the simplest case (splhigh()), inline asm can't possibly be
tighter than:

	movl	$_cpl,%eax		# 5 bytes
	movl	(%eax),%another_reg	# 2 bytes
	movl	$0xffffffff,(%eax)	# 6 bytes

Writing it in C allows generation of code like:

	movl	(%reg1),%reg2		# 2 bytes ($_cpl already in %reg1)
	movl	%reg3,(%reg1)		# 2 bytes ($0xffffffff already in %reg3)

gcc doesn't actually generate code like this.  There usually aren't enough
registers, but gcc doesn't even generate it for:

	for (i = 0; i < 1000; ++i) {
		s = splhigh();
		foo();
		splx(s);
	}

gcc apparently thinks that loading address constants into registers is a
waste of time on x86's.  It's right on x86's with no cache :-).
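
For anyone who wants to look at the generated code themselves, the loop
above can be turned into a small stand-alone test case along these lines.
This is only a sketch: cpl is an ordinary global here, foo() is a
hypothetical placeholder, and the splx() shown only restores the mask,
which is a simplification of whatever the kernel version does:

```c
#include <stdint.h>

/* Stand-in for the kernel's cpl (assumption: a plain 32-bit mask). */
static volatile uint32_t cpl = 0;

/* Hypothetical work done while everything is masked. */
static unsigned foo_calls = 0;
static void foo(void) { foo_calls++; }

/* Simplest case: save the old mask and set every bit. */
static inline uint32_t splhigh(void)
{
    uint32_t old = cpl;
    cpl = 0xffffffffu;
    return old;
}

/* Restore a previously saved mask (simplified). */
static inline void splx(uint32_t s)
{
    cpl = s;
}

/* The loop from the message; returns how often foo() ran fully masked. */
static unsigned run_critical_sections(unsigned n)
{
    unsigned masked = 0;
    for (unsigned i = 0; i < n; ++i) {
        uint32_t s = splhigh();
        if (cpl == 0xffffffffu)
            masked++;
        foo();
        splx(s);
    }
    return masked;
}
```

Call run_critical_sections(1000) from a main(), compile with something like
"cc -O2 -S", and check whether the address of cpl and the 0xffffffff
constant are kept in registers across iterations.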

Bruce
