From: Bruce Evans
Date: Wed, 30 May 2007 06:52:42 +1000 (EST)
To: Kip Macy
Cc: freebsd-arch@freebsd.org
Subject: Re: rusage breakdown and cpu limits.
Message-ID: <20070530062757.L93410@delplex.bde.org>
References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org>

On Tue, 29 May 2007, Kip Macy wrote:

>> I think using a per-process spin lock (or a pool of spin locks) would be a
>> good first step.  I wouldn't do anything more complicated unless the simple
>> approach doesn't work.
>> The only reason to not move the check into userret() would be if one is
>> worried about threads chewing up CPU time while they are in the kernel
>> w/o bouncing out to userland.  Also, it matters which one happens more
>> often (userret() vs mi_switch()).  If on average threads perform
>> multiple system calls during a single time slice (no idea if this is
>> true or not), then moving the check to userret() would actually hurt
>> performance.
>
> Processes can certainly make numerous system calls within a single
> time slice.

Not many more than a few hundred million syscalls can be made within a
timeslice of 100 ms.  FreeBSD does too many context switches for
interrupts, so the number in practice seems to be mostly in the range of
1-10, but I hope for 100-1000.

> However, in userret it would be protected by a per process
> or per thread blocking mutex as opposed to a global spin mutex. It
> would be surprising if it isn't a net win, although it is quite
> possible that on a 2-way system the extra locking could have an
> adverse effect on some workloads.

Any locking within userret() would be a good pessimization.  There is
none now, but still a lot of bloat.  In this case, correct proc locking
isn't even possible, since the runtime update must occur while something
like a global scheduler lock is held.  When a context switch occurs, the
lock must protect at least the old process and the new process, and
somehow prevent interference from other processes.  The update of the
runtime needs essentially the same lock.  Any locking in userret() would
need to use the same lock as the update to be perfectly correct.

Fortunately, the cpulimit check only needs to be correct to within
seconds or even minutes.  A sloppy unlocked check done often enough
would work OK, at least if you re-check with correct locking before
killing the process.
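
To make the "sloppy unlocked check, then re-check with correct locking" idea concrete, here is a minimal userland sketch in C11.  It is not FreeBSD kernel code: `struct proc_acct`, the `acct_*` functions and the toy spin lock are all hypothetical names invented for illustration, and the single per-record lock merely stands in for whatever lock really protects the runtime update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-process accounting record; NOT FreeBSD's struct proc. */
struct proc_acct {
    atomic_flag lock;              /* toy spin lock standing in for the "full" lock */
    _Atomic uint64_t runtime_us;   /* accumulated CPU time, updated under the lock */
    _Atomic uint64_t cpulimit_us;  /* CPU limit, changed only under the lock */
    bool killed;                   /* written only under the lock */
};

void acct_init(struct proc_acct *p, uint64_t limit_us)
{
    assert(limit_us > 0);
    atomic_flag_clear(&p->lock);
    atomic_init(&p->runtime_us, 0);
    atomic_init(&p->cpulimit_us, limit_us);
    p->killed = false;
}

static void acct_lock(struct proc_acct *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void acct_unlock(struct proc_acct *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Context-switch path: the lock is held here anyway, so charge the time. */
void acct_charge(struct proc_acct *p, uint64_t delta_us)
{
    acct_lock(p);
    atomic_fetch_add_explicit(&p->runtime_us, delta_us, memory_order_relaxed);
    acct_unlock(p);
}

/* Sloppy check: relaxed loads, no lock taken.  The answer may be stale. */
bool acct_over_limit_sloppy(struct proc_acct *p)
{
    return atomic_load_explicit(&p->runtime_us, memory_order_relaxed) >=
           atomic_load_explicit(&p->cpulimit_us, memory_order_relaxed);
}

/* Cheap check first; re-check under the lock before the irreversible step. */
bool acct_check_and_kill(struct proc_acct *p)
{
    bool over;

    if (!acct_over_limit_sloppy(p))
        return false;                   /* common case: no locking cost */
    acct_lock(p);
    over = acct_over_limit_sloppy(p);   /* same comparison, now serialized */
    if (over)
        p->killed = true;               /* stands in for signalling the process */
    acct_unlock(p);
    return over;
}
```

In this sketch the common path through `acct_check_and_kill()` takes no lock at all; the lock is acquired only once the cheap check already suggests the limit has been exceeded, which is the property wanted for a hot path such as userret().
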

Alternatively, the sloppiness can be due to delayed updates -- let the
rusage data lag by up to a second or so in the context of the check; the
runtime would accumulate accurately somewhere, but the check wouldn't
see it all.  The accumulation step would need the full lock for reading
and writing the scattered data and a lesser lock for updating the
accumulated data.  userret() still shouldn't be pessimized by acquiring
the lesser lock.

I still think this misses the point -- the check is the easy part, and
can be done at no extra locking cost while the full lock is held.

Bruce