Date: Mon, 6 Jan 1997 14:26:58 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: proff@iq.org (Julian Assange)
Cc: archie@whistle.com, hackers@FreeBSD.ORG
Subject: Re: divert code not thread/smp compatible
Message-ID: <199701062126.OAA12572@phaeton.artisoft.com>
In-Reply-To: <199701022007.HAA12208@profane.iq.org> from "Julian Assange" at Jan 3, 97 07:07:31 am
> Well, I looked at threading/locking issues in netinet/* generally and
> have come to view the chances of seeing more than one thread in anything
> close to the bsd44 inet code as zero.  The whole thing is locked by
> a few fat, totally non-granular splnet()s.

This is true for one CPU.  However, the two CPUs do not need to be at
the same SPL level, nor do both CPUs have to sequentially process
interrupts through a single queue in symmetric I/O mode.

Locking needs to be on the data object, not on subsystem entry...
though during push-down (the process of increasing the granularity of
kernel concurrency for multiple processors), if the interfaces are
sufficiently abstract (i.e.: call down, no violation of layering), then
a single global entrancy lock *could* be used per subsystem.  Actually,
this was my take on the "correct" way to do the pushdown through the
system call trap layer for FS's.

Interrupts (like network interrupts) are another way of entering the
kernel, as are exceptions (real, or page faults).  If you look at each
of these as separate mechanisms for entering kernel space, then you can
lock them as if they were separate things (assuming a hierarchical
locking system that prevents starvation and deadly-embrace deadlocks by
computing the transitive closure over the hierarchy for deadlock
avoidance).

As far as handling the code that runs at interrupt time goes, the most
effective method is to virtualize as much of the interrupt as you can,
so that long-delay operations block kernel threads instead of
processors (leaving CPUs as schedulable resources which can be applied
against ready-to-run lists).  Thus you would have a bottom-end handler
that gets the bus conflict out of the way as quickly as possible, and
then queues the rest of the operation to be completed *not* in
interrupt mode.  Both NT and SVR4 ES/MP do this.

Topologically, there is no difference between SMP and kernel threading
issues in this regard...
the only exception is inter-CPU cache synchronization events on locking
(and non-synchronizing versions of this type of locking are already
being discussed).

> While we are on the subject of locking, I notice a distinct lack
> of atomic test-and-set or test-and-inc instructions for struct
> usage counts (which are instead relying on non-atomic C).  Is there
> such an existing kernel macro that does:
>
> int
> tas(slock_t *m)
> {
> 	slock_t res;
>
> 	__asm__("xchgb %0,%1" : "=q" (res), "=m" (*m) : "0" (0x1));
> 	return(res);
> }

There is, but it's unreliable in an SMP case, and has a potential race
in a DMA case within a concurrency-optimized (top/bottom interrupt
handler) UP multithreaded kernel... mostly because the lock area can be
paged out if it doesn't apply to a locked-in-core subsystem.

Actually, it's not the locks themselves, but the cache lines you have
to worry about.  In general, the P5 and above will update their
instruction and data caches internally, but this has to work for 386's
and 486's, too... (Van Gilluwe, _The Undocumented PC_, "determining L1
and instruction cache depth").

An IPI based mechanism, which linearizes mutex (semaphore) access (as a
side effect of the MP spec and the vagaries of IPI delivery guarantees,
because Intel was not specific enough about implementation), has been
discussed on the SMP list (and is probably in the archives for that
list).


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
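[Editor's sketch: the bottom-end-handler/queued-work split described in
the message can be illustrated in portable C.  All names here
(isr_bottom, worker_drain, the ring buffer) are hypothetical and
illustrative only; real kernels need per-queue locking or interrupt
masking around the head/tail updates, which is elided here.]

```c
#include <stddef.h>

/* A minimal deferred-work queue: the "interrupt side" does the bare
 * minimum and queues the expensive part; a kernel thread completes it
 * later at non-interrupt priority. */

#define QLEN 16

struct work {
	int unit;		/* which device instance raised the interrupt */
};

static struct work queue[QLEN];
static int q_head, q_tail;	/* ring buffer indices */

/* Bottom-end handler: acknowledge the device, queue the rest, return. */
static int
isr_bottom(int unit)
{
	int next = (q_tail + 1) % QLEN;

	if (next == q_head)
		return (-1);	/* queue full: drop or coalesce */
	queue[q_tail].unit = unit;
	q_tail = next;
	return (0);
}

/* Thread side: drain the queue outside interrupt mode.  Long-delay
 * operations here block only this kernel thread, leaving the CPU as a
 * schedulable resource.  Returns the number of items processed. */
static int
worker_drain(void)
{
	int done = 0;

	while (q_head != q_tail) {
		/* protocol processing, DMA completion, etc. would go here */
		q_head = (q_head + 1) % QLEN;
		done++;
	}
	return (done);
}
```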
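[Editor's sketch: the quoted xchgb macro has the same semantics as a
C11 atomic_flag test-and-set, shown below for comparison.  This is an
assumption-laden modern rendering, not a FreeBSD API of the era, and it
does not address the paging and cache-line concerns the message raises.]

```c
#include <stdatomic.h>

typedef atomic_flag slock_t;

/* Return 0 if the lock was acquired, nonzero if it was already held --
 * the same convention as the xchgb version (old value of the byte). */
static int
tas(slock_t *m)
{
	return (atomic_flag_test_and_set_explicit(m, memory_order_acquire));
}

/* Release the lock so another acquirer's tas() can succeed. */
static void
s_unlock(slock_t *m)
{
	atomic_flag_clear_explicit(m, memory_order_release);
}
```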