Date: Mon, 17 Mar 2003 05:50:06 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Petri Helenius <pete@he.iki.fi>
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: mbuf cache
Message-ID: <3E75D28E.9AB91468@mindspring.com>
References: <0ded01c2e295$cbef0940$932a40c1@PHE>
 <20030304164449.A10136@unixdaemons.com>
 <0e1b01c2e29c$d1fefdc0$932a40c1@PHE>
 <20030304173809.A10373@unixdaemons.com>
 <0e2b01c2e2a3$96fd3b40$932a40c1@PHE>
 <20030304182133.A10561@unixdaemons.com>
 <0e3701c2e2a7$aaa2b180$932a40c1@PHE>
 <20030304190851.A10853@unixdaemons.com>
 <001201c2e2ee$54eedfb0$932a40c1@PHE>
 <20030307093736.A18611@unixdaemons.com>
 <008101c2e4ba$53d875a0$932a40c1@PHE>
 <3E68ECBF.E7648DE8@mindspring.com>
 <3E70813B.7040504@he.iki.fi>
 <3E750D52.FFA28DA2@mindspring.com>
 <048601c2ec59$0696dd30$932a40c1@PHE>
 <3E75820D.C7EC28E1@mindspring.com>
 <001d01c2ec6c$4fa80630$932a40c1@PHE>

Petri Helenius wrote:
[ ... Citeseer search terms for professional strength networking ... ]
> These seem quite network-heavy, I was more interested in references
> of SMP stuff and how the coherency is maintained and what is
> the overhead of maintaining the coherency in read/write operations
> and how alignment helps/screws you with different word-sizes in IA32
> architecture.

Ah.  I misunderstood.  I thought you meant networking specifically,
because of the receiver livelock discussion context.  Let me change
my answer... 8-).

Generally, there are not reference works available online unless
you are an IEEE member.  One of my favorite printed references is
a "special order" title, which is a collection of IEEE proceedings
directly on the topic:

	Scheduling and Load Balancing in Parallel and Distributed Systems
	Behrooz A. Shirazi (Editor), Ali R. Hurson (Editor),
	Krishna M. Kavi (Editor)
	Wiley-IEEE Press; 1st edition (April 30, 1995)
	ISBN: 0818665874

It usually costs about US$30.

A couple of other good books (more general, but some coverage) are:

	UNIX(R) Systems for Modern Architectures: Symmetric
	Multiprocessing and Caching for Kernel Programmers
	Curt Schimmel
	Addison-Wesley Pub Co; 1st edition (June 30, 1994)
	ISBN: 0201633388

	UNIX Internals: The New Frontiers
	Uresh Vahalia
	Prentice Hall; 1st edition (October 23, 1995)
	ISBN: 0131019082

	Solaris Internals: Core Kernel Architecture
	Jim Mauro, Richard McDougall
	Prentice Hall PTR; 1st edition (October 5, 2000)
	ISBN: 0130224960

The first is usually about US$70; the second is usually about US$70;
the third is usually about US$60.

Note: I am biased about the second; I did technical review on it
for Prentice Hall before it was published, and am mentioned in it,
so fair warning ;^).

You might also check out:

	The Magic Garden Explained: The Internals of Unix System V
	Release 4: An Open Systems Design
	Berny Goodheart, James Cox, John R. Mashey
	Prentice Hall; (August 1994)
	ASIN: 0130981389

I know Mashey in passing, and I know the other two by reputation;
the book is a good SVR4 book, but the index royally sucks: every
time I went to look something up that I was interested in seeing,
I couldn't find it.  Also, IMO, SVR4 is not that hot.  I'm sure
this book will end up introduced into evidence, though ;^).

> Writing a coarse SMP memory benchmark should be easy, I wonder if
> it has been done?

Sure.  Intel has written a lot of them in assembly language in their
server product division, and then shared only the results.  8-).

Actually, Intel has a couple of good publications on compiler
design for the P4 (basically they say "don't do what GCC does");
they would also apply to higher level design, I think.  You can
find them as PDFs on their web site under P4 programming for
Hyperthreading.

> Judging from the profiling I've done on both kernel and userland things,
> copying memory around is among the most expensive things to do in modern
> multi-GHz machines. Doing arithmetic to decrease memory bandwidth
> requirements pays off very well. The thing I'm still wondering about is
> how expensive is writing compared to reading.

Depends on your cache configuration; write-through sucks, if the
other CPU's L1 is aware of the memory you are touching.  The L2
cache can hide some of it otherwise, up to the point that it has
to flush pages over the 133MHz DRAM bus.  8-(.

You should expect data copying to be *the* most expensive thing,
short of device I/O, BTW.  Arbitration is a bugger, and moving
data anywhere from L1 to L1 on the same CPU is going to cost you
an arm and a leg.  8-(.  This is why people are so hot-to-trot
about "zero copy TCP", and "zero copy NFS" and "zero copy floor
wax"... ;^).

-- Terry
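
On the question of a coarse memory benchmark and the cost of writing
versus reading: the following is a minimal sketch in C, not anything
from Intel or from the FreeBSD tree.  It times a write pass, a copy
pass, and a read pass over buffers meant to be larger than the caches.
All names here (read_pass, write_pass, copy_pass, run_bench,
MEMBENCH_STANDALONE) and the buffer size are illustrative assumptions.
It is single-threaded, so it says nothing about coherency traffic
between CPUs, and an optimizing compiler may merge or drop repeated
passes, so treat the numbers as rough.

```c
/* Coarse read/write/copy memory benchmark.  Hypothetical sketch:
 * function names, sizes, and the MEMBENCH_STANDALONE switch are
 * assumptions made for illustration. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

struct bench_result {
    double write_mbps, copy_mbps, read_mbps;  /* 0.0 if too fast to time */
    uint64_t checksum;                        /* keeps the loads "live" */
};

/* Read pass: sum every word so the loads cannot simply be discarded. */
uint64_t read_pass(const uint64_t *buf, size_t nwords)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < nwords; i++)
        sum += buf[i];
    return sum;
}

/* Write pass: store one value into every word. */
void write_pass(uint64_t *buf, size_t nwords, uint64_t value)
{
    for (size_t i = 0; i < nwords; i++)
        buf[i] = value;
}

/* Copy pass: one read stream plus one write stream. */
void copy_pass(uint64_t *dst, const uint64_t *src, size_t nwords)
{
    memcpy(dst, src, nwords * sizeof(*dst));
}

/* CPU time in seconds; coarse, but standard C. */
static double cpu_sec(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

static double rate(double mbytes, double secs)
{
    return secs > 0.0 ? mbytes / secs : 0.0;
}

/* Run each pass `reps` times over buffers of `nwords` 64-bit words. */
struct bench_result run_bench(size_t nwords, int reps)
{
    struct bench_result r = {0.0, 0.0, 0.0, 0};
    uint64_t *src = malloc(nwords * sizeof(*src));
    uint64_t *dst = malloc(nwords * sizeof(*dst));
    double mbytes = (double)nwords * sizeof(*src) * reps / 1e6;

    assert(src != NULL && dst != NULL);

    double t0 = cpu_sec();
    for (int i = 0; i < reps; i++)
        write_pass(src, nwords, (uint64_t)i + 1);
    double t1 = cpu_sec();
    for (int i = 0; i < reps; i++)
        copy_pass(dst, src, nwords);
    double t2 = cpu_sec();
    for (int i = 0; i < reps; i++)
        r.checksum += read_pass(dst, nwords);
    double t3 = cpu_sec();

    r.write_mbps = rate(mbytes, t1 - t0);
    r.copy_mbps = rate(mbytes, t2 - t1);
    r.read_mbps = rate(mbytes, t3 - t2);
    free(src);
    free(dst);
    return r;
}

#ifdef MEMBENCH_STANDALONE
int main(void)
{
    /* 8M words = 64 MiB per buffer; assumed to exceed the caches. */
    struct bench_result r = run_bench((size_t)8 << 20, 10);

    printf("write %.0f MB/s  copy %.0f MB/s  read %.0f MB/s  (sum %llu)\n",
        r.write_mbps, r.copy_mbps, r.read_mbps,
        (unsigned long long)r.checksum);
    return 0;
}
#endif
```

To run it standalone, something like
"cc -O2 -DMEMBENCH_STANDALONE membench.c" should do (the file name is
arbitrary).  Extending it toward the SMP case would mean running the
passes from two threads pinned to different CPUs over shared and
unshared cache lines, which this sketch deliberately leaves out.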