From: Luigi Rizzo
To: Alexander Motin
Cc: Davide Italiano, arch@freebsd.org
Date: Tue, 5 Mar 2013 10:07:35 +0100
Subject: Re: tickless design guidelines

On Tue, Mar 05, 2013 at 10:41:17AM +0200, Alexander Motin wrote:
> On 05.03.2013 10:01, Luigi Rizzo wrote:
> > Hi,
> > i would like a bit of advice on how to get coarse time estimates
> > (such as "no need to do anything unless at least X msec have elapsed")
> > in a tickless kernel.
> >
> > In the past (and I mean a distant one, i got this habit back in 2.x),
> > rather than calling get*time(), i used to look at "ticks" to
> > see if it had changed, and only fetch the actual time accordingly.
> >
> > This worked well with HZ=1000 or above, when my required accuracy
> > was a few milliseconds. With tickless kernels, i am not sure anymore
> > how often ticks gets updated.
> >
> > I may need to do these tests quite frequently (e.g. up to a million
> > times per second), so I'd rather have some lightweight function.
>
> At this point each active CPU still executes hardclock() with HZ rate as
> before, skipping ticks only when the CPU is idle. As a result, both ticks
> and getbinuptime() are working as before. The first gives a resolution of
> 1/HZ, the second min(1/HZ, 1ms). The only place where it is unreliable to
> use them is hardware interrupt handlers (not interrupt threads), because
> if they fire during a long idle period the timers may not have been
> updated yet.

excellent - can you add the above paragraph to share/man/man9/microtime.9
so it is clearly documented?

Also i wonder if it may make sense to add a feature so that whenever
we get an interrupt and a fast and suitable timecounter is available,
some system-wide bintime is updated. So getbinuptime() could just
use this counter, becoming extremely cheap (perhaps it is already,
i am not sure) and in the long term, as CPUs with fixed frequency
TSC become ubiquitous, we would get higher resolution as the
interrupt load increases.

> Another way that may be better than polling is to let callout(9) manage
> time events. You can specify desired time and precision, and depending
> on precision it will use either binuptime(), or getbinuptime(), or a fast
> custom way equivalent to ticks. Recent tests show that on x86 with LAPIC
> timer and TSC timecounter present, the code can handle up to 600K events
> per second from user-level on one core. From kernel I think the rate
> could be even higher.

I will keep the suggestion in mind although this is not my current use case;
right now i need to get some coarse timestamps on incoming packets
to implement features such as "codel" (where i need to detect if a
packet has been queued for at least ~5ms, but do not care about
microsecond resolution or absolute timestamps).

thanks
luigi

> --
> Alexander Motin
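
For reference, a minimal userspace sketch of the coarse "has this packet
been queued for at least ~5ms" check discussed above, done on a tick
counter rather than a real timestamp. The names SKETCH_HZ, ticks_elapsed()
and queued_at_least_ms() are hypothetical and not kernel APIs; the tick
values are passed in as plain ints so the logic can be tried outside the
kernel, and a 32-bit int is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel tick rate; HZ=1000 is assumed. */
#define SKETCH_HZ 1000

/*
 * Ticks elapsed between two samples of a wrapping int counter.
 * The subtraction is done in unsigned arithmetic so that a counter
 * wrap between the two samples still yields the right difference.
 */
int
ticks_elapsed(int now, int then)
{
	return (int)((unsigned int)now - (unsigned int)then);
}

/*
 * Coarse check: has at least 'ms' milliseconds' worth of ticks gone by
 * since 'enqueue_ticks'? The answer is only accurate to about one tick
 * (1/SKETCH_HZ seconds), which is enough for a ~5ms threshold.
 */
bool
queued_at_least_ms(int now, int enqueue_ticks, int ms)
{
	assert(ms >= 0);
	/* Convert the threshold to whole ticks, rounding up. */
	int threshold = (ms * SKETCH_HZ + 999) / 1000;

	return ticks_elapsed(now, enqueue_ticks) >= threshold;
}
```

In this sketch the per-packet cost is one integer load at enqueue time
and one subtract-and-compare at dequeue time, so it stays cheap even at
a million checks per second. It inherits the caveat above: the counter
may be stale when read from a hardware interrupt handler after a long
idle period.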