From owner-freebsd-smp  Tue Dec  3 19:25:42 1996
Return-Path: owner-smp
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id TAA02342 for smp-outgoing; Tue, 3 Dec 1996 19:25:42 -0800 (PST)
Received: from base486.synet.net (imdave@DIAL10.SYNET.NET [168.113.1.12]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id TAA02328 for ; Tue, 3 Dec 1996 19:25:37 -0800 (PST)
Received: (from imdave@localhost) by base486.synet.net (8.8.4/8.8.4) id VAA13549; Tue, 3 Dec 1996 21:25:20 -0600 (CST)
Date: Tue, 3 Dec 1996 21:25:20 -0600 (CST)
From: Dave Bodenstab
Message-Id: <199612040325.VAA13549@base486.synet.net>
To: ccsanady@friley216.res.iastate.edu
Subject: Re: Finished locks..
Cc: freebsd-smp@freebsd.org
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>
> For what it's worth, I have completed a simple implementation of spin
> locks.  (However, I'm sure someone will tell me different.. :)  Either
> way, it was nice to learn a bit of assembly.
>
> If these are not sufficient, how can I improve upon them?  I'd like to
> move some of the locking around, and they would help.  In particular, I
> am planning to move the global lock out into the syscalls, and perhaps
> create a separate lock for the run queues and such.  Then from there, we
> can reduce the graininess a bit. :)

Hi,

Just thought I'd try to contribute a bit... maybe next year I'll get an
MP board and try to dig back into kernel hacking...

Back around 1988-1991 (or thereabouts) I worked on the SVR3 kernel for
AT&T's 3B2 non-symmetric multiprocessor system.  It was non-symmetric
because only the `main' processor had access to the I/O bus.  We
initially modified the scheduler to run only user-level code on the
`co-processors'.  Then, bit by bit, we began to move the locking deeper
into the kernel.

As for the locks, we found it very useful during our development to
have debugging versions of them.  We used in-line assembler -- AT&T's
assembler allowed this, and so does GNU's for FreeBSD.  The debugging
versions of the spin locks maintained simple counts and statistics for
each lock (which we gave a global name):

	how many times it was hit
	how many times it blocked
	number of spins while blocked
	the instruction pointer of the lock's current owner
	the pid of the lock's current owner
	etc.

We then wrote a simple utility that used nlist to dump out the counts
from a running kernel or from a crash dump image.  (The utility had the
names of the locks built in.)  It was *very* useful to actually measure
-- although crudely -- what was going on.  You might want to consider
doing this for your spin locks (there's a rough sketch of the idea at
the end of this message).

Good luck!  I'm looking forward to running an SMP kernel someday!

	Dave Bodenstab
	imdave@synet.net

PS.  Another *very* useful thing is to implement a `trace' package.  We
sprinkled calls that looked like:

	monitor( char id, long arg1, long arg2, long arg3, long arg4 );

in key areas of the kernel:

	interrupt entry and exit
	scheduler entry and exit
	system call entry and exit
	trap entry and exit
	signal delivery

For each call, the argN's were pertinent information relating to that
particular kernel routine; for the scheduler, for instance, we'd save
the old and new pids and perhaps some of the scheduling criteria.  A
production kernel had the calls #ifdef'ed out.

The monitor package stored the arguments into a large (2048+ slot)
circular buffer, adding the current lbolt value and the return address
of the caller.  Several of the `id' characters were pre-defined; for
instance, we used 'P' for scheduler entry and 'p' for scheduler exit.
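To make that concrete, here is a rough sketch of what such a monitor()
might look like -- this is a from-memory simplification, not the original
AT&T code; the names, the buffer layout, and the use of GCC's
__builtin_return_address() are just assumptions for illustration:

	/* Hypothetical trace buffer -- names and layout are made up. */
	#define MON_SLOTS	2048

	struct mon_entry {
		char	id;			/* 'P', 'p', ... */
		long	arg1, arg2, arg3, arg4;
		long	ticks;			/* lbolt at time of call */
		void	*caller;		/* return address of caller */
	};

	static struct mon_entry	mon_buf[MON_SLOTS];
	static int		mon_next;	/* next slot; wraps around */

	extern volatile long	lbolt;		/* kernel clock tick counter */

	void
	monitor(char id, long arg1, long arg2, long arg3, long arg4)
	{
		struct mon_entry *e = &mon_buf[mon_next];

		/* note: on an MP kernel this slot allocation itself
		 * needs protection (or per-CPU buffers) */
		mon_next = (mon_next + 1) % MON_SLOTS;
		e->id = id;
		e->arg1 = arg1; e->arg2 = arg2;
		e->arg3 = arg3; e->arg4 = arg4;
		e->ticks = lbolt;
		/* GCC extension; use whatever your compiler provides
		 * to get at the caller's PC */
		e->caller = __builtin_return_address(0);
	}

A scheduler-entry call would then be something along the lines of
monitor('P', (long)oldpid, (long)newpid, 0, 0).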
For the interrupt routines, the code would check whether the last entry
stored was for the same interrupt routine and, if so, just bump a
counter -- this prevented the clock interrupts from flooding the buffer
and wiping out all the other information in cases where the system
locked up.

Another utility was written to dump the contents of the buffer (again,
either from a running system or from a crash image) and format it
properly -- we knew what each `id' character was used for and what
information each of the argN's held.

What this gives you is a real-time trace of key kernel activity.  It's
much like sticking printf's into the kernel, but *much* more practical,
and there is much less chance that the additional debugging code will
perturb the system the way printf's do.  This trace package proved even
more useful once we had multiple processors!  Any time we had to
investigate a problem, we could sprinkle additional calls throughout
the code as we zeroed in on it.  Post-mortem analysis was also
tremendously improved.
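Coming back to the debugging spin locks mentioned above: as a rough
illustration of the kind of thing I mean (not our actual code -- the
structure, the names, and the i386/GCC inline assembler are all just
assumptions here), a lock with its statistics hung off it might look
something like:

	/* Hypothetical debugging spin lock -- illustration only. */
	struct dlock {
		volatile int	lk_lock;	/* 0 = free, 1 = held */
		long		lk_hits;	/* times acquired */
		long		lk_blocks;	/* times found held */
		long		lk_spins;	/* total spin iterations */
		void		*lk_owner_ip;	/* caller holding the lock */
		int		lk_owner_pid;	/* pid holding the lock */
	};

	/* Atomically swap 1 into the lock word; returns nonzero if we
	 * got the lock (i.e. it was 0 before the exchange). */
	static __inline int
	try_lock(volatile int *lp)
	{
		int old;

		__asm __volatile("movl $1,%0; xchgl %0,%1"
		    : "=&r" (old), "=m" (*lp)
		    : "m" (*lp)
		    : "memory");
		return (old == 0);
	}

	void
	dlock_lock(struct dlock *l, int pid)
	{
		if (!try_lock(&l->lk_lock)) {
			l->lk_blocks++;		/* had to wait at least once */
			do {
				l->lk_spins++;
			} while (!try_lock(&l->lk_lock));
		}
		l->lk_hits++;
		l->lk_owner_ip = __builtin_return_address(0);
		l->lk_owner_pid = pid;
	}

	void
	dlock_unlock(struct dlock *l)
	{
		l->lk_owner_ip = 0;
		l->lk_owner_pid = -1;
		l->lk_lock = 0;			/* release */
	}

The blocked/spin counters are only approximate under contention, which
is fine for crude statistics.  With the locks given global names, a
small nlist-based utility can then find each dlock by symbol and print
the counters out of /dev/kmem or a crash dump.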