From owner-freebsd-threads@FreeBSD.ORG  Mon Jul  5 11:44:26 2004
Date: Mon, 5 Jul 2004 07:44:11 -0400 (EDT)
From: Daniel Eischen <eischen@vigrid.com>
X-Sender: eischen@pcnet5.pcnet.com
To: Julian Elischer <julian@elischer.org>
In-Reply-To: <Pine.BSF.4.21.0407050014390.66234-100000@InterJet.elischer.org>
Message-ID: <Pine.GSO.4.10.10407050702500.5210-100000@pcnet5.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: Andrew Gallatin <gallatin@cs.duke.edu>
cc: freebsd-threads@freebsd.org
Subject: Re: pthread switch  (was Odd KSE panic)

On Mon, 5 Jul 2004, Julian Elischer wrote:
> 
> On Sun, 4 Jul 2004, Andrew Gallatin wrote:
> 
> > The problem turned out to be that the worker thread was sleeping in
> > cv_wait_sig() on a cv which was used elsewhere in the driver.  When I
> > fixed this, pretty much everything got better.  I still don't
> > understand exactly what happened.  I have no idea if the worker woke
> > too early, or if the other place this cv was (mis)used was where the
> > early wake happened.  (this would be where mx_free() is called).
> > 
> > Anyway, it no longer crashes the machine.   Sorry to have wasted your
> > time.  
> > 
> > Now to figure out why libthr does pthread_cond_signal() in my scenario
> > 47us faster than libpthread... (ULE, 1 HTT P4 running SMP kernel)
> > 
> > Scenario is that the mainline thread sleeps waiting for a packet
> > (using pthread_cond_timedwait()) and the worker thread is asleep in
> > kernel.  When a packet arrives, the worker wakes up, returns from the
> > ioctl, does a pthread_cond_signal() to wakeup the mainline thread, and
> > goes back into the kernel to sleep (via an ioctl).  This is the sort
> > of scenario where I thought KSE would be faster than a 1:1 lib..

ULE + HTT + KSE is probably the most inefficient combination.
I find true SMP with the 4BSD scheduler to be better for KSE (perhaps
after Julian finishes his cleanup, things will be better).
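
For reference, the scenario Drew describes can be sketched roughly as
below, so the signal-to-wakeup time can be compared under each library
and scheduler. This is only a sketch under assumptions: the names
measure_wake_latency_ns() and packet_ready are hypothetical, and a
10 ms nanosleep() stands in for the worker blocking in the driver
ioctl. The measured interval also includes the worker's unlock and the
waiter reacquiring the mutex.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static int packet_ready;		/* predicate, protected by m */
static struct timespec t_signal;	/* stamped just before the signal */

/* Worker: pretend to block in the driver, then wake the mainline thread. */
static void *
worker(void *arg)
{
	struct timespec nap = { 0, 10 * 1000 * 1000 };	/* 10 ms */

	(void)arg;
	nanosleep(&nap, NULL);		/* assumption: stand-in for the ioctl */
	pthread_mutex_lock(&m);
	packet_ready = 1;
	clock_gettime(CLOCK_MONOTONIC, &t_signal);
	pthread_cond_signal(&cv);
	pthread_mutex_unlock(&m);
	return (NULL);
}

/* Returns the signal-to-wakeup latency in ns, or -1 on timeout or error. */
long long
measure_wake_latency_ns(void)
{
	pthread_t tid;
	struct timespec deadline, t_wake;
	int rc = 0, ready;

	packet_ready = 0;
	if (pthread_create(&tid, NULL, worker, NULL) != 0)
		return (-1);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 5;

	pthread_mutex_lock(&m);
	while (!packet_ready && rc == 0)
		rc = pthread_cond_timedwait(&cv, &m, &deadline);
	clock_gettime(CLOCK_MONOTONIC, &t_wake);
	ready = packet_ready;
	pthread_mutex_unlock(&m);
	pthread_join(tid, NULL);

	if (!ready)
		return (-1);
	return ((long long)(t_wake.tv_sec - t_signal.tv_sec) * 1000000000LL +
	    (t_wake.tv_nsec - t_signal.tv_nsec));
}
```

Calling measure_wake_latency_ns() in a loop and averaging the results
would give a number to compare between libthr and libpthread.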

> Actually it is the sort of scenario where it is not clear what is best
> ;-)
> 
> The mainline thread is asleep in userland. thus to become active, if
> there is no running kernel thread to run it, one needs to be woken up
> from the kernel and made to upcall to userspace to be assigned to run
> the thread. This is pretty much what is happening in M:M threads. (1:1
> is, in my opinion, a non-threaded app :-) Except that in KSE there is
> more overhead to do it. One thing that could be tuned in KSE to make it
> a lot faster is to reverse the order that the threads are run in the
> following way:
> 
> ------ currently:
> 
> Kernel thread 1 returns from waiting for the packet, and as user thread
> A, signals (enters) the UTS to make user thread B runnable. The UTS
> realises that there is no kernel thread currently running to run thread
> B, and enters the kernel to wake one up, and returns to userland, in
> order to re-enter thread A and go back into the kernel to wait for the
> next packet.

Actually, the UTS doesn't really think of it in terms of creating
new kernel threads.  It only sees idle and non-idle KSEs.

> Kernel thread 2 is started up some time after it is awoken and enters
> userland via an upcall and is assigned thread B to run.
> 
> This is all assuming there are 2 (or more) CPUs.
> If there is only one CPU then when user thread A sleeps in the kernel,
> the kernel thread (1) is allowed to upcall back to the UTS and take on 
> user thread B.

This is true regardless of the number of CPUs, unless you have
system scope threads.  Any blocking in the kernel will result
in an upcall.

> ------ the changed version:
> 
> Kernel thread 1 returns from waiting for the packet, and as user thread
> A, signals (enters) the UTS to make user thread B runnable. The UTS
> realises that there is no kernel thread currently running to run thread
> B, so it enters the kernel briefly to make another thread wake up and
> upcall, and then suspends thread A, and takes on thread B.
> Eventually the upcall occurs and thread A is assigned to the newly
> awoken thread and re-enters the kernel, waiting for the next packet.
> 
> ------------
> 
> Dan, can the 2nd version happen now?
> Can Drew make it happen by assigning a higher priority to thread B?

No, not yet; pthread_cond_signal() and pthread_mutex_unlock() are
not currently preemption points.  I'm not sure that you would want
pthread_cond_signal() to be a preemption point -- I'd imagine that
a lot of applications would do something like this:

	/* sleeping */
	pthread_mutex_lock(&m);
	foo->sleeping_count++;
	pthread_cond_wait(&cv, &m);
	pthread_mutex_unlock(&m);

	/* waking up */
	pthread_mutex_lock(&m);
	if (foo->sleeping_count > 0) {
		foo->sleeping_count--;
		pthread_cond_signal(&cv);
	}
	pthread_mutex_unlock(&m);


If you make pthread_cond_signal() a preemption point, then you
would get a ping-pong effect: the newly awoken thread would run
immediately, only to block again on the mutex, which the signaling
thread still holds.
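
A runnable version of the pattern above might look something like
this. It is a sketch, not anything from an actual application: the
struct layout, the function names and the pending_wakeups counter are
all hypothetical. The counter is my addition so that the waiter loops
on a predicate and tolerates spurious wakeups. The point to notice is
that pthread_cond_signal() is called with the mutex still held, so a
waiter that preempted the signaler right there could not get past
reacquiring the mutex.

```c
#include <pthread.h>
#include <sched.h>

struct foo {
	pthread_mutex_t m;
	pthread_cond_t cv;
	int sleeping_count;	/* threads parked in foo_sleep() */
	int pending_wakeups;	/* assumption: predicate added for this sketch */
};

static struct foo the_foo = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0
};

/* Sleeping side: register as a sleeper, then wait for a wakeup token. */
static void
foo_sleep(struct foo *foo)
{
	pthread_mutex_lock(&foo->m);
	foo->sleeping_count++;
	while (foo->pending_wakeups == 0)
		pthread_cond_wait(&foo->cv, &foo->m);
	foo->pending_wakeups--;
	pthread_mutex_unlock(&foo->m);
}

/* Waking side: returns 1 if a sleeper was signaled, 0 if nobody slept. */
static int
foo_wake(struct foo *foo)
{
	int woke = 0;

	pthread_mutex_lock(&foo->m);
	if (foo->sleeping_count > 0) {
		foo->sleeping_count--;
		foo->pending_wakeups++;
		/* m is still held here; the waiter must reacquire it to run. */
		pthread_cond_signal(&foo->cv);
		woke = 1;
	}
	pthread_mutex_unlock(&foo->m);
	return (woke);
}

static void *
sleeper(void *arg)
{
	foo_sleep(arg);
	return (NULL);
}

/* Returns 0 when one sleeper was parked, woken and has consumed its token. */
int
run_sleep_wake_demo(void)
{
	pthread_t tid;

	if (pthread_create(&tid, NULL, sleeper, &the_foo) != 0)
		return (-1);
	while (!foo_wake(&the_foo))
		sched_yield();		/* sleeper has not registered yet */
	pthread_join(tid, NULL);
	return (the_foo.sleeping_count + the_foo.pending_wakeups);
}
```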

With a small number of threads, it probably doesn't make sense
to have more than one KSE (unless they are mostly CPU-bound).
In Drew's example, there are 2 KSEs (HTT enabled) and only 2 threads.
Each time a thread sleeps, the KSE enters the kernel to sleep
(kse_release()) because there are no other threads to run.

Drew, can you try lowering the concurrency?  You can
either try using pthread_setconcurrency(1) or setting
kern.threads.virtual_cpu=1.
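
Something along these lines early in main(), before the threads are
created, should do for the first option. The wrapper name
limit_concurrency() is hypothetical; pthread_setconcurrency() and
pthread_getconcurrency() are the standard calls. Keep in mind that the
level is only a hint to the threads library.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>

/*
 * Ask the threads library for the given concurrency level.
 * Returns the level reported back by the library, or -1 if the
 * request was rejected.
 */
int
limit_concurrency(int level)
{
	if (pthread_setconcurrency(level) != 0)
		return (-1);
	return (pthread_getconcurrency());
}
```

With limit_concurrency(1), the mainline and worker threads would be
multiplexed on a single KSE, assuming the library honors the hint.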

Julian, why does it take so long for the woken KSE to become
active?

-- 
Dan