Date:      Fri, 11 Feb 2005 17:41:26 -0500
From:      David Schultz <das@FreeBSD.ORG>
To:        Maxim Sobolev <sobomax@portaone.com>
Cc:        Peter Edwards <peadar.edwards@gmail.com>
Subject:   Re: Pthreads performance
Message-ID:  <20050211224126.GA43252@VARK.MIT.EDU>
In-Reply-To: <420CEC42.2070504@portaone.com>
References:  <420CC9F7.40802@portaone.com> <34cb7c840502110903356a5813@mail.gmail.com> <420CEC42.2070504@portaone.com>

On Fri, Feb 11, 2005, Maxim Sobolev wrote:
> Thank you for the analysis! Looks like you have at least some valid 
> points. I've modified the code to count how many times producer calls 
> malloc() to allocate a new slot, and got the following numbers:
> 
> -bash-2.05b$ ./aqueue_linuxthreads -n 10000000
> pusher started
> poper started
> total 237482 slots used
> -bash-2.05b$ ./aqueue_kse -n 10000000
> pusher started
> poper started
> total 403966 slots used
> -bash-2.05b$ ./aqueue_thr -n 10000000
> pusher started
> poper started
> total 223634 slots used
> -bash-2.05b$ ./aqueue_c_r -n 10000000
> pusher started
> poper started
> total 55589 slots used
> 
> This suggests that it is indeed unfair to compare KSE times to LT
> times, since KSE did almost 2x as many malloc()s as LT. However, as
> you can see, libthr did a comparable number of allocations, and
> c_r about 4 times fewer, so malloc() cost alone can't fully explain
> the difference in results.

The difference in the number of mallocs may be related to the way
mutex unlocks work.  Some systems do direct handoff to the next
waiting thread.  Suppose one thread does:

	pthread_mutex_lock()
	pthread_mutex_unlock()
	pthread_mutex_lock()

With direct handoff, if another thread is already waiting, the
unlock transfers ownership of the mutex to that thread, so the
second lock operation blocks and forces an immediate context
switch.  Without direct handoff, the thread may be able to get the
lock back immediately; in fact, this is almost certainly what will
happen on a uniprocessor.  Since the example code has no mechanism
to ensure fairness, without direct handoff one of the threads could
perform thousands of iterations before the other one gets to run,
and if that thread is the producer, this could explain all the
calls to malloc().
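
One way to check this on a given thread library would be to measure
how long a single thread can keep re-acquiring a contended mutex.
The following is only a rough sketch, not taken from the aqueue test
program: the names worker(), run_contention() and struct run_stats
are made up for illustration, and it allocates nothing; it just
repeats the lock/unlock/lock pattern above from two threads and
records the longest streak by one of them.

```c
/* Sketch only: all names below are hypothetical, not from aqueue. */
#include <assert.h>
#include <pthread.h>

struct run_stats {
	long total;        /* critical sections executed by both threads */
	long switches;     /* times the mutex changed hands */
	long longest_run;  /* longest streak of acquisitions by one thread */
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct run_stats stats;  /* protected by lock */
static int last_owner;          /* id of previous holder, 0 = none yet */
static long current_run;        /* length of the streak in progress */
static long iterations;         /* per-thread count, set before start */

static void *
worker(void *arg)
{
	int id = *(int *)arg;
	long i;

	for (i = 0; i < iterations; i++) {
		pthread_mutex_lock(&lock);
		if (last_owner == id) {
			current_run++;
		} else {
			if (last_owner != 0)
				stats.switches++;
			last_owner = id;
			current_run = 1;
		}
		if (current_run > stats.longest_run)
			stats.longest_run = current_run;
		stats.total++;
		pthread_mutex_unlock(&lock);
		/* No yield here: the next iteration relocks at once. */
	}
	return NULL;
}

/* Run two contending threads for n iterations each and return the stats. */
static struct run_stats
run_contention(long n)
{
	pthread_t t[2];
	int ids[2] = { 1, 2 };
	int i, rc;

	stats.total = stats.switches = stats.longest_run = 0;
	last_owner = 0;
	current_run = 0;
	iterations = n;
	for (i = 0; i < 2; i++) {
		rc = pthread_create(&t[i], NULL, worker, &ids[i]);
		assert(rc == 0);
	}
	for (i = 0; i < 2; i++) {
		rc = pthread_join(t[i], NULL);
		assert(rc == 0);
	}
	return stats;
}
```

Calling run_contention() from main() and printing longest_run and
switches for each thread library should show the difference, if
there is one.  I would expect streaks close to 1 where the unlock
hands the mutex to a waiter, and much longer streaks otherwise, but
the numbers will depend on the scheduler and the number of CPUs.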

The part of this picture that doesn't fit is that I was under the
impression that KSE uses direct handoff...

FWIW, there's a separate threads@ list for this sort of thing.


