Date: Wed, 13 Mar 2013 16:03:02 -0700
From: Yuri <yuri@rawbw.com>
To: FreeBSD Hackers <hackers@freebsd.org>
Subject: top(1) doesn't report the correct CPU time for a multithreaded process
Message-ID: <514105A6.40800@rawbw.com>
I have a process that is CPU bound with 1 thread for its first 5 seconds. It then creates 200 threads that all read from and write to the network, and it is network bound for the remaining 6.5 minutes.

When I look at this process in top(1) right after the 200 threads are created, I see WCPU and CPU values around 3400%. They then drop below 1% and stay there for the rest of the run:

50619 yuri 206 20 0 621M 555M uwait 7 0:31 0.68% myapp

At the end, after all threads have quit, the process measures its resource usage with getrusage(RUSAGE_SELF, &u); and it shows that the CPU time consumed was:

user=104609ms sys=8758ms wall=395938ms

So the "real" CPU percentage wasn't ~0.68%, but closer to 29% ((104609 + 8758) / 395938). Or about 7%, if 400% is taken as the maximum (there are 4 cores). I am inclined to trust getrusage(2).

There was this PR, now marked as closed with a patch checked in:
http://www.freebsd.org/cgi/query-pr.cgi?pr=127331
But the code from that patch doesn't seem to be in usr.bin/top/machine.c now (9.1-STABLE).

My original PR, considered a duplicate, is also closed:
http://www.freebsd.org/cgi/query-pr.cgi?pr=135823

Why doesn't top(1) show the correct CPU time, aggregated over all threads? Is this a regression of the patch in the above PR#127331?

Also, why do I ever see 3400% CPU time? That doesn't seem right in any case.

Yuri