Date:      26 Apr 1998 15:52:21 -0700
From:      Christoph Toshok <toshok@Hungry.COM>
To:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: threads performance
Message-ID:  <m2af98qjq2.fsf@terror.hungry.com>
In-Reply-To: tlambert@primenet.com's message of 25 Apr 1998 23:23:14 -0700
References:  <m23ef2vzca.fsf@terror.hungry.com> <199804260617.XAA27669@usr05.primenet.com>

tlambert@primenet.com (Terry Lambert) writes:
> 
> > I'm working on japhar (the hungry java vm) and I'm primarily using
> > freebsd for my work.  One of the central "features" of japhar is that
> > it uses the platform's thread library -- pthreads on freebsd and linux,
> > cthreads on nextstep and the hurd.
> > 
> > On freebsd the performance is just abysmal.  Really, it's *awful*.
> > Just for kicks, I ported the thread api to NSPR (Netscape's Portable
> > Runtime) and the runtime for javac compiling a trivial .java file
> > drops from 39 seconds to 18 seconds.
> 
> 
> This is a long post; sorry about that.  Please read the whole thing;
> I give a couple of possible causes for the problem, analysis, fixes
> and workarounds, an analysis of why you are probably getting the
> NSPR vs. pthreads results you are seeing, and an analysis of why
> they mean that kernel threads probably wouldn't help in two of the
> three possible root causes.
> 
> -- 1 --
> 
> I'll assume you are running FreeBSD 2.2.6 or -current (and you should
> be), since the pthreads before those releases are not Draft 4
> compliant.  If not, it may just be that you are using a buggy libc_r.

yup.  running 2.2.6 (well, post 2.2.6 -stable).

> -- 2 --
> 
> More likely you are doing a compute intensive task, it's not
> explicitly calling pthread_yield(), and other threads are not
> running concurrently.

but that's just it - the program (at least in this instance) isn't
running multiple threads at all.

> The pthreads implementation is a call conversion implementation.  It
> takes blocking calls and converts them into non-blocking calls plus a
> context switch.

right, this is exactly what NSPR does as well.
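
For anyone following along, here is a rough sketch of what call
conversion looks like for read(2).  This is not libc_r's or NSPR's
actual code: converted_read() and wait_until_readable() are made-up
names, and the select() call is only a placeholder for the point where
a real library would context switch to another runnable thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for the scheduler.  A real user-space thread
 * library would switch to another runnable thread here; with only one
 * thread, all we can do is wait until fd becomes readable. */
static void wait_until_readable(int fd)
{
    fd_set readable;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    (void)select(fd + 1, &readable, NULL, NULL, NULL);
}

/* Call-converted read(): looks blocking to the caller, but the
 * descriptor itself is switched to non-blocking mode (and left that
 * way in this sketch). */
ssize_t converted_read(int fd, void *buf, size_t len)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, len);

        if (n >= 0)
            return n;                 /* data or end-of-file */
        if (errno == EINTR)
            continue;                 /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                /* a real error */
        wait_until_readable(fd);      /* "context switch" goes here */
    }
}
```

Even with a single thread, every converted call pays for the extra
fcntl() and, when no data is ready, a select() on top of the read()
itself, so some overhead relative to a plain blocking read() is
expected in both libraries.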

> -- 3 --
> 
> The next most likely thing is that you are doing the same broken thing
> the LDAP implementors did in their code.  The problem is that it's not
> obviously broken, so it's hard to steer clear of it.
> 
> What they did was use getdtablesize(2) and/or sysctl(3) to get the
> maximum possible number of fd's, and then pass that as the first
> argument to select.
> 
> The number was larger than FD_SETSIZE, and, as a result, select(2)
> was returning "true" for the fd's off in space (some of which, when
> dereferenced, pointed to 0, 1, and 2 as far as the kernel could tell).

nope, nothing like this in japhar.
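
For reference, the usual guard against the pitfall Terry describes can
be sketched as below.  The helper names are hypothetical, and
sysconf(_SC_OPEN_MAX) is used here as a portable stand-in for
getdtablesize(2); the point is only that the first argument to select()
should never exceed FD_SETSIZE when an ordinary fd_set is used.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: the largest nfds that is safe to pass to
 * select() together with an ordinary fd_set. */
int safe_select_nfds(void)
{
    long table_size = sysconf(_SC_OPEN_MAX);

    /* An indeterminate limit (-1) or one beyond FD_SETSIZE is clamped. */
    if (table_size < 0 || table_size > FD_SETSIZE)
        return FD_SETSIZE;
    return (int)table_size;
}

/* Returns 1 if fd is readable right now, 0 if not, -1 on error or if
 * fd does not fit in an fd_set. */
int is_readable_now(int fd)
{
    fd_set readable;
    struct timeval no_wait = { 0, 0 };
    int nfds = safe_select_nfds();

    if (fd < 0 || fd >= nfds)
        return -1;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    if (select(nfds, &readable, NULL, NULL, &no_wait) < 0)
        return -1;
    return FD_ISSET(fd, &readable) ? 1 : 0;
}
```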

> > The fact that NSPR can drop 21 seconds off the
> > runtime (in this very contrived example) makes me think that there is
> > a lot going on in libc_r that is suboptimal, but perhaps there is just
> > no other way to implement things so they conform to the posix spec.
> 
> The fact that NSPR can drop 21 seconds off the runtime means that
> threading is not your bottleneck, and that kernel threads would
> probably help, but only because the code is badly behaved.

I'm about to delve into libc_r...  the only thing I can think of that
may be causing this disparity is the signal mask modification foo.  I
can only presume it's being invoked much more often than is necessary.
There really is just no other explanation for the large difference in
speed.

I mean, HelloWorld.class runs about twice as fast on NSPR as libc_r.

> NSPR can't implement kernel services that aren't there in the base
> OS.  That means that the best it can do is to build upon what's
> already there.

right.  NSPR uses a scheme very similar to libc_r's (setjmp/longjmp)
to implement threads on freebsd.
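
To illustrate the general idea of switching contexts in user space, here
is a small sketch.  Note the substitution: it uses the ucontext calls
(getcontext/makecontext/swapcontext) rather than setjmp/longjmp, because
giving a setjmp-based thread its own stack needs platform-specific
tricks.  It is not how NSPR or libc_r is written, and all the names are
made up.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];   /* private stack for the worker */
static int steps_run;

/* The "thread" body: does one step, yields, then does a second step. */
static void worker(void)
{
    steps_run++;                           /* first slice */
    swapcontext(&worker_ctx, &main_ctx);   /* voluntary yield */
    steps_run++;                           /* second slice */
    /* Returning here resumes uc_link, i.e. main_ctx. */
}

/* Runs the worker in two slices.  Returns the number of steps the
 * worker completed (2 on success), or -1 on error. */
int run_worker_in_two_slices(void)
{
    int after_first;

    steps_run = 0;
    if (getcontext(&worker_ctx) != 0)
        return -1;
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);

    /* Switch to the worker; control comes back when it yields. */
    if (swapcontext(&main_ctx, &worker_ctx) != 0)
        return -1;
    after_first = steps_run;

    /* Resume the worker; control comes back when it returns. */
    if (swapcontext(&main_ctx, &worker_ctx) != 0)
        return -1;

    return after_first == 1 ? steps_run : -1;
}
```

In the single-threaded case none of these switches should be needed at
all, which is why a slowdown there points at bookkeeping around the
switch rather than the switch itself.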

> Most likely, you either have a run-away program (because of the select()
> coding error or a similar problem), OR the NSPR implementation is making
> explicit yield calls that the native implementation doesn't because it
> assumes a kernel implementation of pthreads.

nope.  neither.  I mean, NSPR might be making explicit yield calls,
but those would only serve to slow things down in the single threaded
case.

Chris



