Date:      Thu, 29 Apr 2004 11:13:55 -0400
From:      Alexandre "Sunny" Kovalenko <Alex.Kovalenko@verizon.net>
To:        current@freebsd.org
Subject:   Is it possible to make -lpthread program to use 100% CPU?
Message-ID:  <20040429111355.7eb83170.Alex.Kovalenko@verizon.net>

Good people,

I apologize if this is off-topic, but since we have a new default threading
library in -current, I thought it proper to ask this question here.

Another (hopefully unnecessary) disclaimer -- I am not looking to spark
controversy over the comparative merits of libthr and the new
libpthread (libkse?).

I would just like a pointer to a tunable or a pthread attribute value which
would allow me to use my hardware at 100% with the new default threading
library.

The reason for the question is the dramatic difference in performance
between libthr and libpthread in my application. The application in
question performs heavy-duty computation on a relatively small amount of data.
For those familiar with Adobe PDF, it computes the O-Value, given a user
password. The computation involves RC4 and MD5 and uses the OpenSSL library.
Since the computation is performed over a range of passwords, the range can
be arbitrarily broken into smaller subranges and fed to independent
processing threads.
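
For illustration only, the partitioning described above might be sketched
as follows. This is not the actual application: the names subrange, worker
and run_partitioned, and the MAX_THREADS limit, are hypothetical, and the
worker body is a placeholder that only counts candidates instead of doing
the RC4/MD5 work.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 64          /* hypothetical upper bound for this sketch */

/* One contiguous slice of the candidate range, [first, last). */
struct subrange {
    uint64_t first;
    uint64_t last;
    uint64_t processed;         /* filled in by the worker */
};

/* Placeholder worker: a real one would test each candidate password. */
void *worker(void *arg)
{
    struct subrange *r = arg;
    uint64_t i, n = 0;

    assert(r != NULL);
    for (i = r->first; i < r->last; i++)
        n++;                    /* stand-in for the per-candidate computation */
    r->processed = n;
    return NULL;
}

/*
 * Split [0, total) into nthreads nearly equal slices, run one thread per
 * slice, and return the number of candidates processed. Returns 0 if
 * nthreads is out of range; returns less than total if a thread could
 * not be created.
 */
uint64_t run_partitioned(uint64_t total, int nthreads)
{
    pthread_t tid[MAX_THREADS];
    struct subrange slice[MAX_THREADS];
    uint64_t chunk, extra, next = 0, done = 0;
    int i, started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return 0;
    chunk = total / (uint64_t)nthreads;
    extra = total % (uint64_t)nthreads;

    for (i = 0; i < nthreads; i++) {
        slice[i].first = next;
        /* The first 'extra' slices take one additional candidate. */
        next += chunk + ((uint64_t)i < extra ? 1 : 0);
        slice[i].last = next;
        slice[i].processed = 0;
        if (pthread_create(&tid[i], NULL, worker, &slice[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        done += slice[i].processed;
    }
    return done;
}
```

Calling run_partitioned(1000, 4) would start four threads with 250
candidates each. Because the slices share no data, the threads need no
locking between creation and join.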

The number of computations per second for the program built with -lpthread
seems to be 20-30% lower compared to the same program built with -lthr.
Fittingly, 'top' reports 20-30% lower CPU utilization (basically, starting
with 4 threads it reports a solid 100% for libthr and 75-80% with
libpthread). At 32 threads with libpthread, CPU utilization is somewhere
around 90-95% and performance is about 25%.

The hardware in question is a dual 2.4GHz Xeon with HyperThreading enabled.

The kernel is built without WITNESS or INVARIANTS.

There seems to be no difference between SCHED_4BSD and SCHED_ULE.

Threads were created with SCOPE_SYSTEM for libthr and with both _SYSTEM
and _PROCESS for libpthread. With SCOPE_SYSTEM, libpthread starts catching
up to its own SCOPE_PROCESS performance at a much higher number of threads
(16+).
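
As a minimal sketch of how the contention scope mentioned above is
typically selected through the standard POSIX attribute calls: the helper
names create_scoped_thread and echo_arg are hypothetical, and an
implementation is allowed to reject a scope it does not support, in which
case the error code is returned to the caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body for demonstration: hands its argument back. */
void *echo_arg(void *arg)
{
    return arg;
}

/*
 * Create a thread with the requested contention scope, either
 * PTHREAD_SCOPE_SYSTEM or PTHREAD_SCOPE_PROCESS.
 * Returns 0 on success or an errno-style code on failure.
 */
int create_scoped_thread(pthread_t *tid, int scope,
                         void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc;

    assert(tid != NULL && fn != NULL);
    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setscope(&attr, scope);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);
    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller would pass PTHREAD_SCOPE_SYSTEM or PTHREAD_SCOPE_PROCESS as the
scope argument and later join the thread with pthread_join as usual.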

Any suggestions would be appreciated.

Alternatively, if there is a need to test a patch or time a program
in this environment, that can easily be done -- it is a non-production
box.

-- 
Alexandre "Sunny" Kovalenko.


