Date:      Wed, 13 Mar 2013 16:03:02 -0700
From:      Yuri <yuri@rawbw.com>
To:        FreeBSD Hackers <hackers@freebsd.org>
Subject:   top(1) doesn't report the correct CPU time for a multithreaded process
Message-ID:  <514105A6.40800@rawbw.com>

I have a process that is CPU bound with 1 thread for its first 5 seconds; 
it then creates 200 threads that all read from and write to the 
network, and it is network bound for the remaining 6.5 minutes.
When I look at this process in top(1) right after the 200 threads are 
created, I see WCPU and CPU values around 3400%, which then drop to 
values below 1% for the rest of the run:
50619 yuri          206  20    0   621M   555M uwait   7   0:31 0.68% myapp

At the end, after all threads have exited, the process measures its 
resource usage with getrusage(RUSAGE_SELF, &u), which reports the 
following CPU time:
user=104609ms sys=8758ms wall=395938ms

So the "real" average CPU percentage wasn't ~0.68%, but more like 29% 
((104609 + 8758) / 395938). Or about 7% if 400% is considered the 
maximum (there are 4 cores). I am inclined to trust getrusage(2).
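
The arithmetic behind these percentages can be sketched in C. This is only an illustration: the helper names (tv_to_ms, cpu_percent, self_cpu_ms) are made up for this example and are not taken from top(1) or from my application. With the numbers above, cpu_percent(104609, 8758, 395938) works out to roughly 28.6%, or about 7.2% when divided by the 4 cores.

```c
#include <assert.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Convert a struct timeval to whole milliseconds. */
static long tv_to_ms(struct timeval tv)
{
    return (long)tv.tv_sec * 1000L + (long)tv.tv_usec / 1000L;
}

/*
 * Average CPU utilization in percent over the whole run,
 * where 100% means one core was fully busy for the entire wall time.
 */
static double cpu_percent(long user_ms, long sys_ms, long wall_ms)
{
    assert(wall_ms > 0);
    return 100.0 * (double)(user_ms + sys_ms) / (double)wall_ms;
}

/*
 * Total CPU time (user + system) consumed so far by this process,
 * summed over all of its threads, in milliseconds. Returns -1 on failure.
 */
static long self_cpu_ms(void)
{
    struct rusage u;

    if (getrusage(RUSAGE_SELF, &u) != 0)
        return -1;
    return tv_to_ms(u.ru_utime) + tv_to_ms(u.ru_stime);
}
```

Note that this is an average over the whole run, while top(1) shows a recent, decaying estimate, so the two are not expected to match at every moment; they should still be of the same order over a long steady phase.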

There was this PR, which is now marked as closed with a patch checked in: 
http://www.freebsd.org/cgi/query-pr.cgi?pr=127331
But the code from that patch doesn't even seem to be in 
usr.bin/top/machine.c now (9.1-STABLE).
My original PR, which was considered a duplicate, is also closed: 
http://www.freebsd.org/cgi/query-pr.cgi?pr=135823

Why doesn't top(1) show the correct CPU time, aggregated over all threads? 
Is this a regression of the patch from PR#127331 above?
Also, why do I ever see 3400% CPU? That doesn't seem right in any case.

Yuri



