From: Bruce Evans
Date: Fri, 8 Jun 2012 18:03:42 +1000 (EST)
To: Konstantin Belousov
Cc: freebsd-arch@freebsd.org
Subject: Re: Fast gettimeofday(2) and clock_gettime(2)
Message-ID: <20120608174919.S1594@besplex.bde.org>
In-Reply-To: <20120607091243.GV85127@deviant.kiev.zoral.com.ua>

On Thu, 7 Jun 2012, Konstantin Belousov wrote:

> On Thu, Jun 07, 2012 at 01:00:34PM +1000, Bruce Evans wrote:
>>
>> tc_windup()'s close in succession are bugs, since they cycle the timehands
>> faster than they were designed to be.  We already have too many of these
>> bugs (where tc_setclock() calls tc_windup().  I didn't notice this
>> particular problem with it before).  Now I will point out that version
>> 2 of your patch adds more of these calls, apparently to get changes to
>> happen sooner.  But in sysctl_kern_timecounter_hardware(), such a call
>> was intentionally left out since it is not needed.  Note that tc_tick
>> prevents calls to tc_windup() more often than about once per msec if
>> hz > 1000.
> No, I did not added more tc_windup calls. I added a recalculation
> of the shared page content on the timecounter change, which is not
> the same as tc_windup() call. This is exactly to handle a disable
> of usermode rdtsc use when kernel timecounter hardware changes.

Oops.  I saw a parameter named tc_windup and didn't look too closely
at the event handler for this.  Please use a slightly different name.

Frequent updates of the shared page may cause the same too-fast cycling
as frequent calls to tc_windup().  Are event handlers rate-limited?  If
not, then someone changing the timecounter hardware from a loop in
userland could cause similar problems to a settimeofday() loop.  Both
are privileged operations so this is not a large problem, but it is a
stress test that should pass.

>> [jhb wrote]
>>> There was apparently another issue with version 2. The bcopy() is not
>>> atomic, so potentially libc could read wrong tk_current. I redid
>>> the interface to write to the shared page to allow use of real atomics.
>>
>> Timecounter code is supposed to be lock-free except for some time-domain
>> locking.  I only see 1 problem with this: where tc_windup() writes the
>> generation count and other things without asking for these writes to
>> be ordered.  In most cases, the time-domain locking prevents problems.
> In fact, on x86 the ordering is strong enough that no barriers are needed,
> this is why the problem goes unnoticed so far.

Only the x86 write ordering is clearly strong enough (see another reply).
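
To make the ordering point concrete, here is a minimal sketch of a generation-count update with the write ordering requested explicitly.  It is not the kernel's code: struct th_sketch, its fields, and both functions are hypothetical names, it uses C11 <stdatomic.h> rather than the kernel's atomic(9) primitives, and it assumes a single writer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one timehands slot; not the kernel's struct. */
struct th_sketch {
	atomic_uint gen;		/* 0 means "update in progress" */
	_Atomic uint64_t offset;	/* illustrative payload fields */
	_Atomic uint64_t scale;
};

/* Single writer assumed (the role tc_windup() plays in the discussion). */
static void
th_sketch_write(struct th_sketch *th, uint64_t offset, uint64_t scale)
{
	unsigned ogen = atomic_load_explicit(&th->gen, memory_order_relaxed);

	/* Invalidate the slot, then order that store before the payload. */
	atomic_store_explicit(&th->gen, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&th->offset, offset, memory_order_relaxed);
	atomic_store_explicit(&th->scale, scale, memory_order_relaxed);

	/* Skip 0 on wraparound, since 0 is reserved for "in progress". */
	if (++ogen == 0)
		ogen = 1;
	assert(ogen != 0);

	/* Release store: the payload is visible before the new generation. */
	atomic_store_explicit(&th->gen, ogen, memory_order_release);
}

/* Lock-free reader: retry if the slot changed while it was being read. */
static void
th_sketch_read(struct th_sketch *th, uint64_t *offset, uint64_t *scale)
{
	unsigned gen;

	do {
		gen = atomic_load_explicit(&th->gen, memory_order_acquire);
		*offset = atomic_load_explicit(&th->offset,
		    memory_order_relaxed);
		*scale = atomic_load_explicit(&th->scale,
		    memory_order_relaxed);
		/* Order the payload loads before the generation re-check. */
		atomic_thread_fence(memory_order_acquire);
	} while (gen == 0 ||
	    gen != atomic_load_explicit(&th->gen, memory_order_relaxed));
}
```

On x86 the fences and release/acquire operations here should compile to plain loads and stores plus compiler barriers, which is consistent with the problem going unnoticed there.  The generation check only detects that a slot was rewritten during a read; it does nothing to rate-limit the writer, so the too-fast cycling concern is separate.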
Bruce