From: Robert Bonomi
Date: Thu, 6 May 2010 13:40:34 -0500 (CDT)
Message-Id: <201005061840.o46IeYvn003676@mail.r-bonomi.com>
To: freebsd-questions@freebsd.org
Subject: Re: User cpu time VS system cpu time
List-Id: User questions

> From: cronfy
> Date: Thu, 6 May 2010 15:33:40 +0400
> Subject: Re: User cpu time VS system cpu time
>
> Hello,
>
> >> I want to understand difference between user CPU time and system CPU
> >> time in system accounting.
> > But keep in mind that "kernel time" is a broad category - while IO time in
> > itself does not count as CPU time, file system operations for example do,
> > because they really can be CPU intensive.
>
> Ivan, thanks for the great explanation.
>
> I think that I can measure user filesystem usage with sa - it reports
> number of IO operations per user/command. In which other cases kernel
> time is used instead of user time for a process? I do not mean all of
> them - just that usually occur in practice.
>
> I've noticed that there are moments when system load in top for system
> time is very high (60-80% while user load is 15-25%, this produces
> very high LA also).
> All processes that were run at this time show high
> kernel time usage, although they usually do not. System is getting
> back to normal after Apache restart (I think this is related to Apache
> shared memory somehow, but not sure).
>
> This makes me suspect that system time in sa can not be relied on
> while measuring user system usage, because it notably varies under
> some circumstances for same operations. Am I wrong?

CPU time tracking is -really- simple to understand. Logically, you look
at where the PC (program counter) is, at regular intervals. It is in one
of 3 types of locations -- in 'user' space, somewhere 'inside' the kernel
itself, *OR* in the system 'idle loop'.

Time spent executing _most_ kernel functions (system calls) is _not_
strictly deterministic -- it depends on what all 'else' the O/S is doing
at the time, as well as that which is 'strictly necessary' to perform
just the user-initiated action.

Take a simple case of appending a block of data to a disk file (Berkeley
FFS). Assume the file pointer is already at EOF -- it may be that the
data fits into the unused part of the last already-allocated block for
the file, or it *MAY*NOT*. If not, there is extra work to do -- get a
block from the free list, zero it, copy the user data into it, and add
the block to the block-list for that file. It may be possible to record
that block's address in the space already allocated for the block list,
or it MAY NOT. If not, one has to get a block from the free list, and
add it to the 'meta-data' for the file. This _may_ necessitate adding a
2nd-level index block, which *MAY* necessitate adding a 3rd-level index
block (and possibly a 4th).

Adding to the 'uncertainty' of the numbers, _between_ two samples it is
possible for an interrupt to occur, be serviced, and return control to
the lower-priority code.
If the interrupt-service duration is _less_ than the sampling interval,
then the interrupt-service time gets counted as if it were part of the
'class' of code that was interrupted. This can result in small amounts
of what should be 'system' time for one user getting charged as time
(user -or- system) for a different user.

Similar things can happen when transferring data to/from other kinds of
devices, e.g. printers, terminals, etc.

On a busy system, there can be a variance of 20% or more between two
successive runs of the same job.
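
As a rough illustration of the sampling scheme described above, here is
a minimal Python sketch. It is a toy model, not how any real kernel is
written: the function names (sample_accounting, true_accounting), the
timeline, and the time units are all made up for the example. It charges
one full tick to whatever class of code is running at each sampling
instant, and shows a short interrupt that falls between two samples
being billed to the code it interrupted.

```python
from collections import Counter


def sample_accounting(timeline, tick):
    """Charge one full tick to whatever is running at each sample.

    timeline: list of (duration, label) segments in execution order.
    tick: sampling interval, in the same (arbitrary) units as durations.
    Hypothetical helper for illustration only.
    """
    # Turn durations into absolute end times for each segment.
    ends = []
    t = 0
    for duration, label in timeline:
        t += duration
        ends.append((t, label))
    total = t

    charged = Counter()
    idx = 0
    sample = 0  # samples are taken at t = 0, tick, 2*tick, ...
    while sample < total:
        # Skip segments that finished at or before this sample.
        while ends[idx][0] <= sample:
            idx += 1
        charged[ends[idx][1]] += tick
        sample += tick
    return charged


def true_accounting(timeline):
    """Sum the real time spent in each class (what an ideal meter sees)."""
    actual = Counter()
    for duration, label in timeline:
        actual[label] += duration
    return actual


# Assumed example: a 3-unit interrupt lands between the samples at
# t=10 and t=20, while user code is running.
timeline = [
    (12, "user"),
    (3, "interrupt"),
    (15, "user"),
    (10, "kernel"),
    (10, "idle"),
]
charged = sample_accounting(timeline, tick=10)
actual = true_accounting(timeline)

print(charged["user"], actual["user"])            # 30 27
print(charged["interrupt"], actual["interrupt"])  # 0 3
```

In this made-up run the sampler never sees the interrupt at all, so its
3 units show up as 'user' time. Shift the same interrupt so that it
covers a sampling instant and the error goes the other way, which is one
source of the run-to-run variance mentioned above.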