From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 14:10:26 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B001E16A468 for ; Fri, 8 Jun 2007 14:10:26 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.168]) by mx1.freebsd.org (Postfix) with ESMTP id 1410A13C44B for ; Fri, 8 Jun 2007 14:10:25 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: by ug-out-1314.google.com with SMTP id u2so1095531uge for ; Fri, 08 Jun 2007 07:10:25 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type:content-transfer-encoding:sender; b=t9kejLvD/PqRW621ODbNLIhQjf1rO/oxOou1mN3K25eE0XG6JzMfjZeOi4uzgqvgtPGCwZo9hkcGUnsTxbQF7TxqJWYIAPMdWu9mcE2JiSZLQiwfoGw0bHtItsPQAN1K8zHlTK8uD8TTB7ubd4EP6N/FFk6NBAE8xXELM6QHvLU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type:content-transfer-encoding:sender; b=jDF8b/6wpJ/oj2N1T4OScYZmeNMfpd/Y6vAy12b9olaOU9av1OgU8LZC1cWlxZcTxBH0YmkqligHfuOFyR5MqW59Y6OVt5mphv7tpc5EBP6OxfR/SrBEfBkqcDqe5LEVTPbxpE/vn1uRAYQw1Q4d9SAqizMYoLNeX1h+Jq7odps= Received: by 10.82.189.6 with SMTP id m6mr5447021buf.1181311824737; Fri, 08 Jun 2007 07:10:24 -0700 (PDT) Received: from ?172.31.5.25? ( [89.97.252.178]) by mx.google.com with ESMTP id h6sm576509nfh.2007.06.08.07.10.23 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 08 Jun 2007 07:10:24 -0700 (PDT) Message-ID: <4669633D.9090809@FreeBSD.org> Date: Fri, 08 Jun 2007 16:10:05 +0200 From: Attilio Rao User-Agent: Thunderbird 1.5 (X11/20060526) MIME-Version: 1.0 To: Jeff Roberson References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1> <20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> <46652D17.5090903@FreeBSD.org> <20070605214404.X47001@delplex.bde.org> <20070606152352.H606@10.0.0.1> <20070607135511.P606@10.0.0.1> In-Reply-To: <20070607135511.P606@10.0.0.1> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: Attilio Rao Cc: freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: attilio@FreeBSD.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 14:10:26 -0000 Jeff Roberson wrote: > On Wed, 6 Jun 2007, Jeff Roberson wrote: > >> On Tue, 5 Jun 2007, Bruce Evans wrote: >> >>> >>> This can probably be fixed more simply by calling rufetch() to reset the >>> time state in threads as a side effect. Do this before resetting the >>> state in the process. >> >> Ok, I agree with bde here, just call rufetch and this will clear each >> thread, and then you can clear the rux in the proc. >> >> I'd like to make a list of the remaining problems with rusage and >> potential fixes. Then we can decide which ones myself and attilio >> will resolve immediately to clean up some of the effect of the sched >> lock changes. >> >> 1) The ruadd() in thread_exit() is not safe since we're accessing >> another thread's unlocked rusage structure. Potential solution is to >> allocate p_ru as part of the proc struct and add into there, which >> will be protected by the PROC_SLOCK, which bde seemed to like better >> anyway. >> >> 2) We may lose information between exit1() and thread_exit() due to >> the way p_ru is initialized before we're done exiting. There also >> seems to be a race where wait() operates on a process before it's done >> in thread_exit() which means wait may return rusage information >> without the child added in! The solution will be to fix this race, >> and then access p_ru directly in wait(). > > The patch at http://people.freebsd.org/~jeff/rusage3.diff fixes points 1 > and 2 as well as the p_runtime iniitialization problem. This moves the > collection of child rusage back into exit1() and changes the exiting > threads to accumulate their rusage into p_ru under protection of the > process spinlock. This also removes the gross lock/unlock of proc slock > (formerly sched_lock) from wait and implements something more sensible. > > Jeff > >> >> 3) There is no locking around rufetch() and calcru(). calcru() may >> apply new rux values to an old rusage, giving inaccurate results. The >> solution is to either require the proc slock around both calls, or >> provide a new routine which does the fetch and calc while grabbing the >> lock itself. And this should fix (3): http://users.gufi.org/~rookie/works/patches/schedlock/rusage2.diff (and reorders rucollect() declaration sorted by name). A thought: Shouldn't we actually remove in calcru() (and rufetchcalc()) the copy to the rux object? When sched_lock was there it would be useful since it had a lot of contention, but now that the per-proc spinlock is protecting those fields it is useless. And consider that calcru1() has no locking inside (so you won't expect particulary long execution times). Thanks, Attilio