Date:      Mon, 17 Mar 2003 05:50:06 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Petri Helenius <pete@he.iki.fi>
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: mbuf cache
Message-ID:  <3E75D28E.9AB91468@mindspring.com>
References:  <0ded01c2e295$cbef0940$932a40c1@PHE> <20030304164449.A10136@unixdaemons.com> <0e1b01c2e29c$d1fefdc0$932a40c1@PHE> <20030304173809.A10373@unixdaemons.com> <0e2b01c2e2a3$96fd3b40$932a40c1@PHE> <20030304182133.A10561@unixdaemons.com> <0e3701c2e2a7$aaa2b180$932a40c1@PHE> <20030304190851.A10853@unixdaemons.com> <001201c2e2ee$54eedfb0$932a40c1@PHE> <20030307093736.A18611@unixdaemons.com> <008101c2e4ba$53d875a0$932a40c1@PHE> <3E68ECBF.E7648DE8@mindspring.com> <3E70813B.7040504@he.iki.fi> <3E750D52.FFA28DA2@mindspring.com> <048601c2ec59$0696dd30$932a40c1@PHE> <3E75820D.C7EC28E1@mindspring.com> <001d01c2ec6c$4fa80630$932a40c1@PHE>

Petri Helenius wrote:
[ ... Citeseer search terms for professional strength networking ... ]

> These seem quite network-heavy, I was more interested in references
> of SMP stuff and how the coherency is maintained and what is
> the overhead of maintaining the coherency in read/write operations
> and how alignment helps/screws you with different word-sizes in IA32
> architecture.

Ah.  I misunderstood.  I thought you meant networking specifically,
because of the receiver livelock discussion context.

Let me change my answer... 8-).


Generally, there are not reference works available online unless
you are an IEEE member.

One of my favorite printed references is a "special order" title,
which is a collection of IEEE proceedings directly on the topic:

	Scheduling and Load Balancing in Parallel and Distributed
		Systems
	Behrooz A. Shirazi (Editor), Ali R. Hurson (Editor),
		Krishna M. Kavi (Editor)
	Wiley-IEEE Press; 1st edition (April 30, 1995)
	ISBN: 0818665874

It usually costs about US$30.

A couple of other good books (more general, but some coverage) are:

	UNIX(R) Systems for Modern Architectures: Symmetric
		Multiprocessing and Caching for Kernel Programmers
	Curt Schimmel
	Addison-Wesley Pub Co; 1st edition (June 30, 1994)
	ISBN: 0201633388


	UNIX Internals: The New Frontiers
	Uresh Vahalia
	Prentice Hall; 1st edition (October 23, 1995)
	ISBN: 0131019082


	Solaris Internals: Core Kernel Architecture
	Jim Mauro, Richard McDougall
	Prentice Hall PTR; 1st edition (October 5, 2000)
	ISBN: 0130224960

The first two are usually about US$70 each; the third is usually
about US$60.

Note: I am biased about the second; I did technical review on it
for Prentice Hall before it was published, and am mentioned in it,
so fair warning ;^).

You might also check out:

	The Magic Garden Explained: The Internals of Unix System V
		Release 4: An Open Systems Design
	Berny Goodheart, James Cox, John R. Mashey
	Prentice Hall; (August 1994)
	ASIN: 0130981389

I know Mashey in passing, and I know the other two by reputation; the
book is a good SVR4 book, but the index royally sucks: every time I
went to look something up that I was interested in seeing, I couldn't
find it.  Also, IMO, SVR4 is not that hot.  I'm sure this book will
end up introduced into evidence, though ;^).


> Writing a coarse SMP memory benchmark should be easy, I wonder if
> it has been done?

Sure.  Intel has written a lot of them in assembly language in
their server product division, and then shared only the results.
8-).
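
For what it's worth, a coarse single-CPU version is easy enough to
sketch in C.  What follows is only an illustration under stated
assumptions, not anything Intel published: the names (mem_bench,
struct mem_result, cpu_sec, mbps) are made up for the example, the
buffer size and pass count are whatever the caller picks, and timing
uses the standard clock() call, which is crude.  It sweeps a buffer
with a read loop, a write loop, and a word-by-word copy loop, and
reports MB/s for each.  It says nothing about SMP coherency by itself;
that needs one thread per CPU hitting shared lines.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Results of one run in MB/s (1 MB = 1e6 bytes).  Hypothetical names. */
struct mem_result {
    double   read_mbps;
    double   write_mbps;
    double   copy_mbps;   /* bytes copied per second, not bytes touched */
    uint64_t checksum;    /* sum from the read passes; keeps the loads live */
};

/* CPU time in seconds; coarse, but standard C and adequate for one thread. */
static double cpu_sec(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Bytes moved in dt seconds, as MB/s; 0 if the clock did not tick. */
static double mbps(double bytes, double dt)
{
    return dt > 0.0 ? bytes / dt / 1e6 : 0.0;
}

/*
 * Sweep a buffer of nbytes (rounded down to whole 64-bit words) `passes`
 * times each for read, write, and copy.  Returns 0 on success, -1 on
 * bad arguments or allocation failure.
 */
static int mem_bench(size_t nbytes, int passes, struct mem_result *out)
{
    size_t nwords = nbytes / sizeof(uint64_t);
    if (nwords == 0 || passes <= 0 || out == NULL)
        return -1;

    uint64_t *src = malloc(nwords * sizeof *src);
    uint64_t *dst = malloc(nwords * sizeof *dst);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }
    for (size_t i = 0; i < nwords; i++) {   /* also faults the pages in */
        src[i] = i;
        dst[i] = 0;
    }

    /* volatile views stop the compiler from deleting the loops */
    const volatile uint64_t *vsrc = src;
    volatile uint64_t *vdst = dst;
    double bytes = (double)nwords * sizeof(uint64_t) * passes;
    uint64_t sum = 0;
    double t0;

    t0 = cpu_sec();                          /* read-only sweep */
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < nwords; i++)
            sum += vsrc[i];
    out->read_mbps = mbps(bytes, cpu_sec() - t0);

    t0 = cpu_sec();                          /* write-only sweep */
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < nwords; i++)
            vdst[i] = (uint64_t)p;
    out->write_mbps = mbps(bytes, cpu_sec() - t0);

    t0 = cpu_sec();                          /* copy: one load, one store */
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < nwords; i++)
            vdst[i] = vsrc[i];
    out->copy_mbps = mbps(bytes, cpu_sec() - t0);

    out->checksum = sum;
    free(src);
    free(dst);
    return 0;
}
```

Call it with a buffer much larger than your L2 (say 64MB) to see
memory rather than cache, and with one that fits in L1 to see the
other extreme; small runs may report 0 because clock() is too coarse
to see them.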

Actually, Intel has a couple of good publications on compiler design
for the P4 (basically they say "don't do what GCC does"); they would
also apply to higher level design, I think.  You can find them as
PDFs on their web site under P4 programming for Hyperthreading.


> Judging from the profiling I've done on both kernel and userland things,
> copying memory around is among the most expensive things to do in modern
> multi-GHz machines. Doing arithmetic to decrease memory bandwidth
> requirements pays off very well. The thing I'm still wondering about is
> how expensive is writing compared to reading.

Depends on your cache configuration; write-through sucks, if the
other CPU's L1 is aware of the memory you are touching.  The L2
cache can hide some of it otherwise, up to the point that it has
to flush pages over the 133MHz DRAM bus.  8-(.
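
On the alignment side of the question, the usual layout trick (a
general technique, not something taken from the references above) is
to keep data written by different CPUs on different cache lines, so
one CPU's store does not invalidate a line another CPU is using.  A
minimal C11 sketch follows.  The 64-byte line size and the CPU count
of 4 are assumptions; check your processor.  All names here
(CACHE_LINE, NCPU, counters_packed, percpu_counter, line_of, and the
two helpers) are made up for the example, and the helpers assume the
enclosing object itself starts on a line boundary.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumption: cache line size in bytes */
#define NCPU       4    /* assumption: illustrative CPU count */

/* Packed layout: all four 32-bit counters fit in one cache line. */
struct counters_packed {
    uint32_t count[NCPU];
};

/* Padded layout: alignas rounds each counter up to a full line. */
struct percpu_counter {
    alignas(CACHE_LINE) uint32_t count;
};

struct counters_padded {
    struct percpu_counter cpu[NCPU];
};

/* Cache line index of byte offset `off` within a line-aligned object. */
static size_t line_of(size_t off)
{
    return off / CACHE_LINE;
}

/* Nonzero if CPUs a and b would write the same line in the packed layout. */
static int packed_shares_line(size_t a, size_t b)
{
    size_t base = offsetof(struct counters_packed, count);
    return line_of(base + a * sizeof(uint32_t)) ==
           line_of(base + b * sizeof(uint32_t));
}

/* Nonzero if CPUs a and b would write the same line in the padded layout. */
static int padded_shares_line(size_t a, size_t b)
{
    size_t base = offsetof(struct counters_padded, cpu);
    return line_of(base + a * sizeof(struct percpu_counter)) ==
           line_of(base + b * sizeof(struct percpu_counter));
}
```

The padded version spends 64 bytes per counter instead of 4, so it
only makes sense for data that really is written from several CPUs
at once.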

You should expect data copying to be *the* most expensive thing,
short of device I/O, BTW.  Arbitration is a bugger, and moving
data anywhere other than from L1 to L1 on the same CPU is going
to cost you an arm and a leg.  8-(.

This is why people are so hot-to-trot about "zero copy TCP", and
"zero copy NFS" and "zero copy floor wax"... ;^).

-- Terry
