Date: Fri, 8 Dec 2000 03:50:04 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: msmith@FreeBSD.ORG (Mike Smith)
Cc: tlambert@primenet.com (Terry Lambert), smp@FreeBSD.ORG
Subject: Re: Netgraph and SMP
Message-ID: <200012080350.UAA03298@usr08.primenet.com>
In-Reply-To: <200012080332.eB83WtF00456@mass.osd.bsdi.com> from "Mike Smith" at Dec 07, 2000 07:32:55 PM
> > In Solaris, the entry into the driver would hold a reference,
> > which would result in the reference count being incremented.
> > Only modules with a 0 reference count can be unloaded.  This
> > same mechanism is used for vnodes, and for modules on which
> > other modules depend.  It works well, and is very light weight.
>
> The whole problem is that it *isn't* very light weight.
>
> The reference count has to be atomic, which means that it ping-pongs
> around from CPU to CPU, causing a lot of extra cache traffic.
>
> OTOH, there's not much we can do about this short of going looking for
> better multi-CPU reference count implementations once we have time to
> worry about performance.

Actually, you can just put it in non-cacheable memory, and the penalty
will only be paid by the CPU(s) doing the referencing.  This means a
clock multiplier's worth of cycles, though, to get it in and out of
main memory from the CPU.  Back when all this started, clock
multipliers weren't 1/5th the problem they pose today...

Still, for a very large number of CPUs, this would work fine for all
but frequently contended objects.

I think that it is making more and more sense to lock interrupts to a
single CPU.  What happens if you write to a page that's marked
non-cacheable on the CPU on which you are running, but cacheable on
another CPU?  Does it do the right thing, and update the cache on the
caching CPU?  If so, locking the interrupt processing for each card to
a particular CPU could be very worthwhile, since you would never take
the hit unless you were doing something extraordinary.

BTW, you would want to grab a ref, check for the heavy lock, and back
off if it were held.  The unload would want to grab the heavy lock,
grab a ref, and do its work when everyone has backed off (ref == 1).
This ordering would ensure the least overhead for the normal case (no
heavy lock, ref > 0).

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
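[Editor's sketch of the ordering described in the last paragraph above:
grab a ref, check for the heavy lock, and back off if it is held, while
the unload path takes the heavy lock, takes its own ref, and waits for
ref == 1.  This uses C11 atomics purely for illustration; the names
(module_ref, module_enter, module_unload) are hypothetical and are not
FreeBSD kernel APIs, and a real kernel would sleep rather than spin.]

#include <stdatomic.h>
#include <stdbool.h>

struct module_ref {
	atomic_int  refs;       /* active entries into the module       */
	atomic_bool unloading;  /* the "heavy lock": an unload pending  */
};

/* Normal entry path: grab a ref, check the heavy lock, back off if held. */
static bool
module_enter(struct module_ref *m)
{
	atomic_fetch_add(&m->refs, 1);
	if (atomic_load(&m->unloading)) {
		/* Heavy lock held: back off and refuse the entry. */
		atomic_fetch_sub(&m->refs, 1);
		return (false);
	}
	return (true);          /* common case: one increment, no lock  */
}

/* Exit path: drop the reference taken by module_enter(). */
static void
module_exit(struct module_ref *m)
{
	atomic_fetch_sub(&m->refs, 1);
}

/* Unload path: take the heavy lock, take a ref, wait for refs == 1. */
static bool
module_unload(struct module_ref *m)
{
	bool expected = false;

	if (!atomic_compare_exchange_strong(&m->unloading, &expected, true))
		return (false); /* another unload is already in progress */
	atomic_fetch_add(&m->refs, 1);
	while (atomic_load(&m->refs) != 1)
		;               /* everyone else has now backed off      */
	/* ... tear the module down here ... */
	return (true);
}

[Note that module_enter() increments before testing the flag, which is
exactly the ordering argued for in the message: the common case (no
heavy lock, ref > 0) pays only the single increment and decrement, and
only an in-progress unload forces the back-off path.]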