Date: Fri, 18 Mar 2005 20:06:15 +1100 (EST)
From: Bruce Evans
To: Jakub Kruszona-Zawadzki
cc: freebsd-bugs@freebsd.org
cc: freebsd-gnats-submit@freebsd.org
Subject: Re: kern/78957: time counter per process stops (syscall: getrusage)
Message-ID: <20050318194752.F1050@epsplex.bde.org>
In-Reply-To: <200503172047.j2HKlps4043804@www.freebsd.org>

On Thu, 17 Mar 2005, Jakub Kruszona-Zawadzki wrote:

>> Description:
> When a process is running for a long time (several days) time counter per process stops on value:
> ru_utime.tv_sec:305221
> ru_utime.tv_usec:322735

This may be the same bug as in PR 76972. Overflow occurs at about 48592008 ticks = 379625 seconds = 105 hours for a process that consumes 100% of the CPU if the statclock frequency is 128 Hz (which is the default and not easy to change). There is another overflow bug at 2^32 ticks = 388 days. This one is harder to fix.
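The tick figures above can be checked with a few lines of C. This is only a sketch of the unit conversion, assuming the 128 Hz statclock frequency quoted above; the names STATHZ, ticks_to_seconds, seconds_to_hours and seconds_to_days are made up for illustration and are not kernel identifiers.

```c
#include <stdint.h>

/* Assumed statclock frequency, as quoted in the message (128 Hz). */
#define STATHZ 128

/* Convert a statclock tick count to whole seconds (truncating). */
static uint64_t ticks_to_seconds(uint64_t ticks)
{
    return ticks / STATHZ;
}

/* Convert seconds to whole hours (truncating). */
static uint64_t seconds_to_hours(uint64_t seconds)
{
    return seconds / 3600;
}

/* Convert seconds to whole days (truncating). */
static uint64_t seconds_to_days(uint64_t seconds)
{
    return seconds / 86400;
}
```

With these helpers, 48592008 ticks comes out as 379625 seconds or 105 whole hours, and 2^32 ticks comes out as 33554432 seconds or 388 whole days, matching the figures quoted above.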
See PR 76972 for details and a fix for the first overflow bug.

379625 seconds is a little larger than 305221 seconds. The difference might be due to the other 70000+ seconds being in ru_stime.

The behaviour when overflow occurs is undefined, but stopping on a value is quite likely to occur due to the algorithm for updating ru_*time. Integer overflow tends to cause counters to reset to 0 (or INT_MIN), but the kernel enforces monotonicity of the usage times, so they will stick instead of going backwards to 0.

Bruce
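The "sticking" behaviour described above can be illustrated with a small sketch. This is not the kernel's code. It only assumes that the reported time is never allowed to go backwards, and the names usage_clock and report_monotonic are hypothetical.

```c
#include <stdint.h>

/* Hypothetical record of the last usage time handed back to userland. */
struct usage_clock {
    uint64_t last_reported_usec;
};

/*
 * Return the time to report for a freshly computed value.
 * A computed value smaller than the previous report (for example after
 * an overflow in the computation) is ignored, so the result never
 * decreases.
 */
static uint64_t report_monotonic(struct usage_clock *uc, uint64_t computed_usec)
{
    if (computed_usec > uc->last_reported_usec)
        uc->last_reported_usec = computed_usec;
    return uc->last_reported_usec;
}
```

If the computed value drops to something small after an overflow, every later call keeps returning the last good value until the computed value climbs past it again, which from userland looks like a counter that has stopped.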