From owner-freebsd-arch@FreeBSD.ORG Fri Jun  1 09:50:37 2007
Date: Fri, 1 Jun 2007 19:44:03 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Jeff Roberson
Cc: freebsd-arch@FreeBSD.org
Subject: Re: Updated rusage patch
Message-ID: <20070601192834.S4657@besplex.bde.org>
In-Reply-To: <20070531181228.W799@10.0.0.1>
References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1> <20070531181228.W799@10.0.0.1>
List-Id: Discussion related to FreeBSD architecture

On Thu, 31 May 2007, Jeff Roberson wrote:

> Now that I've said all of that and committed the
> patch, I just realized that there is still one race that is
> unacceptable.  When the thread exits in thread_exit() and adds the
> stats of both threads together, we could lose changes in the
> still-running thread.

I think I see.  The same problem seems to affect all calls to ruxagg()
and rucollect() for threads that aren't curthread.  You cannot control
the stats for other threads using a spinlock, since statclock() doesn't
use a spinlock for the tick counts and shouldn't (modulo this bug) use
one for the rss's.

Resetting the tick counts in ruxagg() is particularly dangerous.
Resetting the runtime in ruxagg() isn't a problem, because the runtime
isn't touched by statclock().  rucollect() only does insufficiently
locked accesses for reading the rss's, except in thread_exit().

It should be easy to avoid the resettings by accumulating into a local
rux, as is already done for ru's (put an rux in each thread and add
these up when required).  This reduces to the same problem as for the
rss's.

Bruce