Date: Sat, 26 Apr 2003 10:15:27 -0400 (EDT) From: Daniel Eischen <eischen@pcnet1.pcnet.com> To: David Xu <davidxu@viatech.com.cn> Cc: freebsd-threads@freebsd.org Subject: Re: SMPing libpthread Message-ID: <Pine.GSO.4.10.10304260947370.1812-100000@pcnet1.pcnet.com> In-Reply-To: <004001c30bef$648ff0b0$0701a8c0@tiger>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 26 Apr 2003, David Xu wrote: > ----- Original Message ----- > From: "Daniel Eischen" <eischen@pcnet1.pcnet.com> > To: "David Xu" <davidxu@freebsd.org> > Cc: <freebsd-threads@freebsd.org> > Sent: Saturday, April 26, 2003 12:52 PM > Subject: Re: SMPing libpthread > > > > On Thu, 24 Apr 2003, David Xu wrote: > > > > > I have put a patch to enable userland support SMP scheduling. > > > http://people.freebsd.org/~davidxu/libpthread_smp.diff > > > The patch works on my SMP machine, but not fully tested, > > > and I will work on idle kses stuffs. At least, it seems > > > the SMP speed is not slower than UP. :-) > > > > David, I noticed that we hold the scheduling lock before and > > after calling the scheduler. Is this necessary? And if so, > > is it necessary to hold hold it after return from the > > scheduling switch? One you're in the scheduler, and choose > > another thread (releasing the lock after doing so), shouldn't > > you be able to switch to it without having the lock? > > > > I am trying to do things in atomic way. because I found on SMP, thread > state is always out of synchronized in current code. I want to make > state change and context switch in an atomic operation, when thread is > in state transit, no other threads can use _thr_setrunnable_unlocked() etc > to change its state, this avoids lots of "thread already in priority queue" > messages, and if I test the "in priority queue" flag every where, I sometimes > lose chance to wakeup a thread because of some unknown races, whole > process just suddenly stops there. I think the problem is that the signal code doesn't properly check the thread's state and whether it's currently active or not. Sometimes, the signal code thinks that if the state is not PS_RUNNING that the thread is inactive. It should really be checking td->active and the UTS should make sure it sets and clears this when switching the thread in and out. The other problem is that the mutex and cond variable code doesn't take the scheduling lock when changing the thread's state (it relies on holding the mutex or cv low-level lock to protect the thread state). This is one of the things I was going to fix. It is OK to take the scheduling lock while holding the low-level mutex lock; we don't ever take them in reverse (that I can see). > I am just copying the idea from current kernel code, I don't intend to inventing > new things. :-) > > One deadlock I found is in kse_sched_multi(), there is an optimization: > if (curthread == NULL) > ; /* Nothing to do here. */ > else if ((curthread->need_switchout == 0) && > (curthread->blocked == 0) && (THR_IN_CRITICAL(curthread))) { > /* > * Resume the thread and tell it to yield when > * it leaves the critical region. > */ > > These code cause deadlock, at least there is a deadlock between malloc/free and > static mutex initializing code, I think unless we believe that thread is locking a leaf > node, it is not safe to give cpu to the thread again. I suggest comment out these > optimization code. I think that's because spinlock is really still a spinlock (we really have to get rid of these). I think this can be solved by putting the thread in a critical region while it holds the spinlock. Currently, a higher priority thread trying to get a spinlock that is held by a lower priority thread will cause a deadlock (if both threads are running on the same KSE). We could treat spinlocks as static mutexes or as low-level locks. There are 4 fields to the spinlock: typedef struct { volatile long access_lock; volatile long lock_owner; volatile char *fname; volatile int lineno; } spinlock_t; #define _SPINLOCK_INITIALIZER { 0, 0, 0, 0 } Since they are all initialized to 0, we could detect an uninitialized spinlock. Is there any platform where sizeof(long) != sizeof(void *)? Or I suppose we could use fname to point to a low-level lock or mutex. > I am still working on it, hope that I can post a patch soon. > A bad news is kernel seems too unstable when testing KSE program, it just silently > locks up, not panic, no dump, just locks up, this is newest kernel source code. :( Hmm, I haven't got that yet. I have had a lot of lock order reversals lately though. -- Dan Eischen
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.GSO.4.10.10304260947370.1812-100000>