From: Alexander Motin
Date: Wed, 29 Feb 2012 22:30:28 +0200
To: Luigi Rizzo
Cc: freebsd-arch@FreeBSD.org
Subject: Re: select/poll/usleep precision on FreeBSD vs Linux vs OSX
Message-ID: <4F4E8AE4.6080705@FreeBSD.org>

On 29.02.2012 21:40, Luigi Rizzo wrote:
> I have always been annoyed by the fact that FreeBSD rounds timeouts
> in select/usleep/poll in very conservative ways, so i decided to
> try how other systems behave in this respect. Attached is a simple
> program that you should be able to compile and run on various OS
> and see what happens.
>
> Here are the results (HZ=1000 on the system under test, and FreeBSD
> has the same behaviour since at least 4.11):
>
>         |            Actual timeout
>         |        select         | poll  | usleep|
> timeout | FBSD  | Linux | OSX   | FBSD  | FBSD  |
> usec    | 9.0   | Vbox  | 10.6  | 9.0   | 9.0   |
> --------+-------+-------+-------+-------+-------+
> 1         2000    99      6       0       2000
> 10        2000    109     15      0       2000
> 50        2000    149     66      0       2000
> 100       2000    196     133     0       2000
> 500       2000    597     617     0       2000
> 1000      2000    1103    1136    2000    2000
> 1001      3000    1103    1136    2000    3000 <---
> 1500      3000    1608    1631    2000    3000 <---
> 2000      3000    2096    2127    3000    3000
> 2001      4000                    3000    4000 <---
> 3001      5000                    4000    5000 <---
>
> Note how the rounding (poll has the timeout in milliseconds) affects
> the actual timeouts when you are past multiples of 1/HZ.
>
> I know that until we have some hi-res interrupt source there is no
> hope to have better than 1/HZ granularity. However we are doing
> much worse by adding up to 2 extra ticks. This makes apps less
> responsive than they could be, and gives us no way to
> "yield until the next tick".
>
> So what I would like to do is add a sysctl (disabled by
> default) that enables a better approximation of the desired delay.
>
> I see in the kernel that all three syscalls loop around a blocking
> function (tsleep or seltdwait), and do check the "actual" elapsed
> time by calling getmicrouptime() or getnanouptime() around the
> sleeping function. So the actual timeout passed to tsleep does
> not really matter (as long as it is greater than 0).
>
> The only concern is that getmicrouptime()/getnanouptime() are documented
> as "less precise, but faster to obtain". The question is how precise is
> "less precise": do we have some way to get an upper bound for the
> precision of the timers used in get*time(), so we can use that value
> in the equation instead of the extra 1/HZ that tvtohz() puts in
> after computing floor(timeout*HZ) ?

"less precise" there means they are updated on hardclock() invocation
every 1/HZ.

> For reference, below is the core of usleep and select/poll
> (from kern_time.c and sys_generic.c)
>
> usleep:
>         getnanouptime(now)
>         end = now + timeout;
>         for (;;) {
>                 getnanouptime(now);
>                 delta = end - now;
>                 if (delta <= 0)
>                         break;
>                 tsleep(..., tvtohz(delta) )
>         }
>
> select/poll:
>         itimerfix(timeout)      // force at least 1/HZ
>         getmicrouptime(now)
>         end = now + timeout;
>         for (;;) {
>                 delta = end - now;
>                 seltdwait(..., tvtohz(delta) )
>                 getmicrouptime(now);
>                 if (some_fd_is_ready() || now >= end)
>                         break;
>         }
>

-- 
Alexander Motin