Date: Wed, 9 Sep 2009 03:16:06 +0200 From: Luigi Rizzo <rizzo@iet.unipi.it> To: current@freebsd.org, jeff@freebsd.org, re@freebsd.org Subject: Re: clock error: callouts are run one tick late Message-ID: <20090909011606.GA77031@onelab2.iet.unipi.it> In-Reply-To: <20090909010137.GA74897@onelab2.iet.unipi.it> References: <20090909010137.GA74897@onelab2.iet.unipi.it>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Sep 09, 2009 at 03:01:37AM +0200, Luigi Rizzo wrote: > RELENG_8/amd64 (can not try on i386) has the following problem: > > callout_reset(..., t, ...) > > processes the callout after t+1 ticks instead of the t ticks > that one sees on RELENG_7 and earlier. > > I found it by pure chance, noticing that dummynet on RELENG_8 > has a jitter that is two ticks instead of one tick. > Other systems with rely on frequent callouts might see > problems as well. > > An indirect way to see the problem is the following: > > kldload dummynet > > sysctl net.inet.ip.dummynet.tick_adjustment; \ > sleep 1; sysctl net.inet.ip.dummynet.tick_adjustment > > on a working system, the variable should remain mostly unchanged; > on a non-working system you see it growing at a rate HZ/2 > > I suspect the bug is introduced by the change in kern_timeout.c rev. 1.114 > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/kern_timeout.c.diff?r1=1.113;r2=1.114 > > which changes softclock() to stop before the current 'ticks' > so processes everything one tick late. > > I understand the race described in the cvs log, but this does > not seem a proper fix -- it violates POLA by changing the semantics > of callout_reset(), and does not really fix the race, but just adds > an extra uncertainty of 1 tick on when a given callout will be run > > If the problem is a race between hardclock() which updates 'ticks', > and the various hardclock_cpu() instances which might arrive early, > I would suggest two alternative options: > > 1. create a per-cpu 'ticks' (say a field cc_ticks in struct callout_cpu), > increment it at the start of hardclock_cpu, and use cc->ticks > instead of 'ticks' in the various callout functions involved > with manipulation of the callwheel > callout_tick(), softclock(), callout_reset_on() > > 2. start softclock() at cc->cc_softticks -1, i.e. > > ... > CC_LOCK(cc) > - while (cc->cc_softticks != ticks) { > + while (cc->cc_softticks-1 != ticks) { > ... #2 also need this change in callout_tick() mtx_lock_spin_flags(&cc->cc_lock, MTX_QUIET); - for (; (cc->cc_softticks - ticks) < 0; cc->cc_softticks++) { + for (; (cc->cc_softticks - ticks) <= 0; cc->cc_softticks++) { bucket = cc->cc_softticks & callwheelmask; Just tested it, it seems to fix the problem. cheers luigi
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20090909011606.GA77031>