Date:      Tue, 29 Jun 1999 21:15:47 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        vanmaren@cs.utah.edu (Kevin Van maren)
Cc:        freebsd-smp@FreeBSD.ORG
Subject:   Re: high-efficiency SMP locks - submission for review
Message-ID:  <199906292115.OAA26919@usr08.primenet.com>
In-Reply-To: <199906291705.LAA20627@zane.cs.utah.edu> from "Kevin Van maren" at Jun 29, 99 11:05:02 am

> I'm really glad to see that there is so much activity on the list!
> 
> Just a quick summary (from my point of view):
> 1) Multi-threading the kernel will require locking, not just spl().
> 2) Locking will likely slow down uni-processor systems.

Locking in combination with multithreading should actually speed
up uniprocessor systems.

The SVR4.2 (UnixWare 2.0) release was ~15% faster on UP, *with*
locking, because a lot of the UP code benefited
from deserialization of multiple simultaneous kernel operations.


> 3) Removing the BGL will allow more parallelism in the kernel
> for multiple system-bound applications under SMP.

And for UP systems, which may wish to have multiple asynchronous
operations outstanding simultaneously.  Admittedly, the POSIX
implementation of this leaves a lot to be desired (specifically,
POSIX only deigned to "lower" itself to making asynchronous versions
of a subset of file I/O related system calls).


> 4) It will be a LOT of work to re-write the kernel to be thread-safe.
> Changing the execution environment will violate a lot of assumptions.

Yep.


> 5) Very few people both know enough about the kernel internals and
> multiprocessor/multi-threaded locking to do the job "right".

I don't think that's true.  I count at least 5 people in
the FreeBSD camp, not including myself (Simon Shapiro, Bakul Shah,
John Dyson, et al.), just off the top of my head (i.e. if your
name belongs here and I didn't put you here, I'm not snubbing you).


> 6) The method most likely to succeed will be evolutionary; there is
> simply too much code to change everything at once and get it all working.

I doubt this.  It's unlikely to be possible to achieve Solaris
or even NT level SMP performance without a willingness to rewire
the guts of things.

It's possible to break the tasks down by subsystem, but after you
do that, there's a limit to how divisible the tasks are.


> SMP will scale with a BGL as long as we minimize the system time.
> If system time with 4 CPUs is under 10%, the kernel is not the problem.
> Since it is not always practical to make the kernel fast enough to
> do that for many applications/workloads, we need to move enough out
> of the BGL so that we can get the necessary parallelism.

You need to distinguish "system time" from "active system time",
as a first approximation.  Things that are sleeping are accounted
as system time in both areas.  The only one that really matters
is active system time.

I think a good approach would be to divorce the idea of user space
process context (user stack and memory map) and kernel space
process context (kernel stack per kernel entry, user space memory
map, sleep context) as a first run.  Once we have the idea that
the things that run in the kernel aren't the same as the things in
user space, then kernel work can be scheduled separately from user
space work to achieve parallelism wins (e.g. your async read request
is serviced by CPU 2 while your program continues to run on CPU 0).
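
To make the split concrete, here is a minimal single-threaded sketch in
C.  It is not FreeBSD code: every name in it (user_ctx, kernel_ctx,
kwork_queue, kwork_submit, kwork_service) is hypothetical, and a real
kernel would need locking around the queue.  It only shows that once
the kernel-side context is its own object, it can be queued and picked
up by a different CPU than the one running the user-side context.

```c
#include <assert.h>
#include <stddef.h>

#define KWORK_MAX 8   /* queue capacity; a power of two, chosen arbitrarily */

/* Stand-in for the user space context (user stack and memory map). */
struct user_ctx {
    int pid;
    int running_on_cpu;
};

/* Stand-in for one kernel entry made on behalf of a user context. */
struct kernel_ctx {
    const struct user_ctx *owner;  /* whose memory map the work refers to */
    int request;                   /* placeholder for the pending operation */
    int serviced_by_cpu;           /* -1 until some CPU picks the work up */
};

/* Simple ring buffer of pending kernel work (no locking in this sketch). */
struct kwork_queue {
    struct kernel_ctx *items[KWORK_MAX];
    size_t head, tail;
};

/* Queue kernel work; returns 0 on success, -1 if the queue is full. */
int kwork_submit(struct kwork_queue *q, struct kernel_ctx *k)
{
    if (q->tail - q->head == KWORK_MAX)
        return -1;
    k->serviced_by_cpu = -1;
    q->items[q->tail++ % KWORK_MAX] = k;
    return 0;
}

/* Called by any CPU that has spare time; returns NULL if nothing is queued. */
struct kernel_ctx *kwork_service(struct kwork_queue *q, int cpu)
{
    struct kernel_ctx *k;

    if (q->head == q->tail)
        return NULL;
    k = q->items[q->head++ % KWORK_MAX];
    assert(k->serviced_by_cpu == -1);
    k->serviced_by_cpu = cpu;
    return k;
}
```

In the async read example, the program would call kwork_submit() from
CPU 0 and keep running there, while CPU 2 calls kwork_service() and
completes the request.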


> Linux lost to NT because of the slow Linux protocol stack tested.

This was their presumption.


> The review said that a multi-threaded stack would have made a difference.
> However, Linux lost on the UP case too, so I doubt that's the only problem.
> The fact is, if the protocol stack was fast enough, it wouldn't need
> to be multi-threaded.  I would like to see how FreeBSD does on
> that same test -- it has a much faster TCP/IP stack than Linux
> (especially using sendfile for the static pages!)

FreeBSD did worse than Linux, both SMP and UP.


> On the x86, you do need to lock the bus to guarantee operations
> are atomic, with the exception of xchg (but not the variants),
> which is guaranteed to be atomic.  They also must be naturally-aligned.

You lock the bus to mutex the soft lock on objects.  In general,
it's a good idea to use xchg for this, and not get too complicated
with the soft lock access.  The soft locks themselves can be as
complicated as necessary to support the architecture.  For a mostly
unchanged FreeBSD kernel, this would mean, minimally, intention
mode shared/exclusive multiple-reader/single-writer locks with
reader draining and queueing after writer pending.
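
A minimal sketch of that layering, in C11, under stated assumptions:
this is not FreeBSD code, all names (softlock, guard_enter, and so on)
are hypothetical, and intention modes are left out entirely.  A simple
guard word is taken with an atomic exchange (which compilers commonly
lower to xchg on x86), and the guard protects the more complicated
soft lock state.  New readers are refused once a writer is pending, so
the existing readers drain and the writer gets in.  The try-style
calls return immediately; a real lock would sleep or queue instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct softlock {
    atomic_int guard;     /* 0 = free, 1 = held; mutexes the fields below */
    int readers;          /* active shared holders */
    bool writer;          /* true while an exclusive holder is active */
    int writers_pending;  /* writers waiting for the readers to drain */
};

static void guard_enter(struct softlock *l)
{
    /* Spin until the exchange returns 0, i.e. we flipped it from free. */
    while (atomic_exchange_explicit(&l->guard, 1, memory_order_acquire) != 0)
        ;
}

static void guard_leave(struct softlock *l)
{
    atomic_store_explicit(&l->guard, 0, memory_order_release);
}

/* Shared access is refused while a writer is active or pending. */
bool softlock_try_shared(struct softlock *l)
{
    bool ok;

    guard_enter(l);
    ok = !l->writer && l->writers_pending == 0;
    if (ok)
        l->readers++;
    guard_leave(l);
    return ok;
}

void softlock_unshare(struct softlock *l)
{
    guard_enter(l);
    assert(l->readers > 0);
    l->readers--;
    guard_leave(l);
}

/* A writer first announces itself, which stops new readers getting in. */
void softlock_writer_announce(struct softlock *l)
{
    guard_enter(l);
    l->writers_pending++;
    guard_leave(l);
}

/* An announced writer succeeds once all readers have drained. */
bool softlock_try_exclusive(struct softlock *l)
{
    bool ok;

    guard_enter(l);
    assert(l->writers_pending > 0);
    ok = !l->writer && l->readers == 0;
    if (ok) {
        l->writer = true;
        l->writers_pending--;
    }
    guard_leave(l);
    return ok;
}

void softlock_unexclusive(struct softlock *l)
{
    guard_enter(l);
    assert(l->writer);
    l->writer = false;
    guard_leave(l);
}
```

The guard is only ever held for a few instructions, so the bus-locked
operation stays cheap, while the policy (draining, queueing) lives in
ordinary fields that can grow as complicated as the architecture needs.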



> To clarify: there DO exist some dual-processor 486 systems.  They
> use APICs, and in *theory* can run FreeBSD without too much difficulty
> (there is no mptable, the processor APICs are at different addresses,
> so you have to know which processor you are on to access the APIC,
> and the AP initialization is a little different).  I don't think
> anyone cares enough to implement the support code, and Intel discontinued
> the parts necessary to build them, so it probably won't be too painful
> to break possible support for them.

I don't think Intel discontinued production of external APICs.  They
are useful in embedded systems for non-Intel coprocessors.  8-).

The most interesting place is SMP SPARC, but I think that if someone
wants to port FreeBSD to their Sequent box, it should be possible (e.g.
don't architect against it).  Likewise, the BeBox, which uses an MEI
coherency model based on removing the L2 cache chips and replacing
them with specific arbitration logic.


> Because the cost of setting up the mappings, and the wasted page
> of memory, greatly exceeds the cost of a system call that is used
> at most (usually) once per process.  The only time getpid() is called
> several times is during (bad) synthetic benchmarks.  Having libc cache
> the value would be a more viable solution; it would have to trap fork()
> calls, however, to invalidate the stored pid.

Several libc implementations do exactly this...
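
A minimal sketch of the caching idea, in C.  It is not taken from any
particular libc: the names cached_getpid and caching_fork are
hypothetical, and the sketch assumes that all forks in the program go
through the wrapper.  The wrapper is the "trap" on fork(): it
invalidates the stored pid in the child so that the next lookup asks
the kernel again.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

static pid_t cached_pid = 0;   /* 0 means "nothing cached yet" */

/* Return the pid, making the real system call only on the first use. */
pid_t cached_getpid(void)
{
    if (cached_pid == 0) {
        cached_pid = getpid();
        assert(cached_pid > 0);
    }
    return cached_pid;
}

/* fork() wrapper: the child must not inherit the parent's cached pid. */
pid_t caching_fork(void)
{
    pid_t child = fork();

    if (child == 0)
        cached_pid = 0;   /* in the child: force a fresh getpid() later */
    return child;
}
```

The parent keeps its cached value across the fork; only the child pays
for one more system call.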



> Compiling and storing 8 kernels for the loader to choose from sounds
> like a bad idea as well; it may be practical for CD-ROM installation,
> although I think it is more likely the user will select the right one.

I definitely agree.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.





