From owner-freebsd-arch@FreeBSD.ORG  Tue Mar  5 09:06:59 2013
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: arch@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 1F4DE128;
 Tue,  5 Mar 2013 09:06:59 +0000 (UTC)
 (envelope-from luigi@onelab2.iet.unipi.it)
Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238])
 by mx1.freebsd.org (Postfix) with ESMTP id D4865A7E;
 Tue,  5 Mar 2013 09:06:58 +0000 (UTC)
Received: by onelab2.iet.unipi.it (Postfix, from userid 275)
 id 788457300A; Tue,  5 Mar 2013 10:07:35 +0100 (CET)
Date: Tue, 5 Mar 2013 10:07:35 +0100
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: Alexander Motin <mav@FreeBSD.org>
Subject: Re: tickless design guidelines
Message-ID: <20130305090735.GB18221@onelab2.iet.unipi.it>
References: <20130305080134.GC13187@onelab2.iet.unipi.it>
 <5135AFAD.70408@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5135AFAD.70408@FreeBSD.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: Davide Italiano <davide@FreeBSD.org>, arch@freebsd.org
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
X-List-Received-Date: Tue, 05 Mar 2013 09:06:59 -0000

On Tue, Mar 05, 2013 at 10:41:17AM +0200, Alexander Motin wrote:
> On 05.03.2013 10:01, Luigi Rizzo wrote:
> > Hi,
> > i would like a bit of advice on how to get coarse time estimates
> > (such as "no need to do anything unless at least X msec have elapsed")
> > in a tickless kernel.
> > 
> > In the past (and I mean a distant one, i got this habit back in 2.x),
> > rather than calling get*time(),  i used to look at "ticks" to
> > see if it had changed, and only fetch the actual time accordingly.
> > 
> > This worked well with HZ=1000 or above, when my required accuracy
> > was a few milliseconds. With tickless kernels, i am not sure anymore
> > how often ticks gets updated.
> > 
> > I may need to do these tests quite frequently (e.g. up to a million
> > times per second), so I'd rather have some lightweight function.
> 
> At this point each active CPU still executes hardclock() at the HZ rate
> as before, skipping ticks only when the CPU is idle. As a result, both
> ticks and getbinuptime() work as before: the first gives a resolution
> of 1/HZ, the second min(1/HZ, 1ms). The only place where it is
> unreliable to use them is hardware interrupt handlers (not interrupt
> threads), because if they fire during a long idle period the timers may
> not have been updated yet.

Excellent. Can you add the above paragraph to
share/man/man9/microtime.9 so it is clearly documented?
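
For reference, the ticks-gated check discussed above can be sketched in
userland roughly as follows. This is only an illustration: sim_ticks,
sim_now_us, precise_now_us() and struct coarse_clock are hypothetical
stand-ins for the kernel's ticks counter and a precise clock read such
as binuptime(); none of them are existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userland stand-ins (hypothetical), not kernel symbols. */
static volatile int sim_ticks;		/* bumped by a simulated hardclock() */
static uint64_t sim_now_us;		/* simulated precise clock, in usec */
static unsigned long precise_reads;	/* number of precise clock reads paid */

/* Models the expensive, precise clock read. */
static uint64_t
precise_now_us(void)
{
	precise_reads++;
	return (sim_now_us);
}

struct coarse_clock {
	int		last_ticks;	/* tick value at the last precise read */
	uint64_t	last_us;	/* time obtained at that tick */
};

static void
coarse_clock_init(struct coarse_clock *cc)
{
	assert(cc != NULL);
	cc->last_ticks = sim_ticks;
	cc->last_us = precise_now_us();
}

/*
 * Return a timestamp that is at most one tick stale.  The precise clock
 * is read only when the tick counter has changed since the last call.
 */
static uint64_t
coarse_clock_get(struct coarse_clock *cc)
{
	int now = sim_ticks;

	assert(cc != NULL);
	if (now != cc->last_ticks) {
		cc->last_ticks = now;
		cc->last_us = precise_now_us();
	}
	return (cc->last_us);
}
```

The hot path is one load and one compare; the cost of the precise read
is paid at most once per tick, however often coarse_clock_get() is
called.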

Also i wonder if it may make sense to add a feature so that whenever
we get an interrupt and a fast and suitable timecounter is available,
some system-wide bintime is updated.

getbinuptime() could then just read this value, becoming extremely
cheap (perhaps it is already, i am not sure). In the long term,
as CPUs with a fixed-frequency TSC become ubiquitous,
we would get higher resolution as the interrupt load increases.
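
A rough userland sketch of this idea, using C11 atomics. Everything
here is an assumption made for illustration: the names are hypothetical,
and the time is reduced to a single 64-bit nanosecond value so it can be
updated atomically (a real struct bintime is wider and would need
something like a generation counter).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* System-wide cached uptime in nanoseconds (hypothetical). */
static _Atomic uint64_t cached_uptime_ns;

/*
 * Would be called on interrupt entry with a fresh reading from a fast
 * timecounter.  The cached value only moves forward, so readings that
 * arrive out of order from different CPUs are ignored.
 */
static void
intr_update_uptime(uint64_t tc_now_ns)
{
	uint64_t old;

	old = atomic_load_explicit(&cached_uptime_ns, memory_order_relaxed);
	while (old < tc_now_ns &&
	    !atomic_compare_exchange_weak_explicit(&cached_uptime_ns, &old,
	    tc_now_ns, memory_order_release, memory_order_relaxed))
		;	/* 'old' was reloaded by the failed exchange; retry */
}

/* Cheap reader: a single atomic load, no hardware access. */
static uint64_t
get_cached_uptime_ns(void)
{
	return (atomic_load_explicit(&cached_uptime_ns,
	    memory_order_acquire));
}

/* Elapsed time since an earlier get_cached_uptime_ns() reading. */
static uint64_t
cached_elapsed_ns(uint64_t since)
{
	uint64_t now = get_cached_uptime_ns();

	assert(now >= since);	/* holds because the value is monotonic */
	return (now - since);
}
```

The resolution of the cached value is whatever the interrupt rate
happens to be, so it improves under load and degrades when the system
is quiet.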

> Another way that may be better than polling is to let callout(9) manage
> time events. You can specify desired time and precision, and depending
> on precision it will use either binuptime(), or getbinuptime(), or a
> fast custom way equivalent to ticks. Recent tests show that on x86 with
> LAPIC timer and TSC timecounter present, the code can handle up to 600K
> events per second from user-level on one core. From kernel I think the
> rate could be even higher.

I will keep the suggestion in mind, although this is not my current use
case; right now I need to get some coarse timestamps on incoming packets
to implement features such as CoDel (where I need to detect if a
packet has been queued for at least ~5ms, but do not care about microsecond
resolution or absolute timestamps).
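
The kind of check meant here can be sketched as below, in plain C.
It is an illustration only: struct pkt, SIM_HZ, TARGET_MS and the
function names are hypothetical, the tick rate is assumed to be 1000,
and this is not the CoDel algorithm itself, just the "queued for at
least ~5ms" test on a tick-granularity timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_HZ		1000	/* assumed tick rate: 1 tick = 1 ms */
#define TARGET_MS	5	/* sojourn threshold, ~5 ms as in the text */

/* Hypothetical packet descriptor: only the enqueue timestamp matters. */
struct pkt {
	uint32_t	enq_ticks;	/* tick counter value at enqueue */
};

/* Convert milliseconds to ticks, rounding up. */
static uint32_t
ms_to_ticks(uint32_t ms)
{
	return ((ms * SIM_HZ + 999) / 1000);
}

/*
 * True if the packet has been queued for at least TARGET_MS.  The
 * subtraction is done on unsigned values, so it is well defined and
 * gives the right answer even when the tick counter wraps around.
 */
static bool
queued_too_long(const struct pkt *p, uint32_t now_ticks)
{
	uint32_t waited;

	assert(p != NULL);
	waited = now_ticks - p->enq_ticks;
	return (waited >= ms_to_ticks(TARGET_MS));
}
```

With a 1/HZ timestamp the measured sojourn time can be off by up to one
tick, which is acceptable when the threshold is several milliseconds
and HZ is 1000 or more.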

thanks
luigi
> -- 
> Alexander Motin