Date: Sat, 7 Nov 2015 05:46:53 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Ian Lepore <ian@freebsd.org>
Cc: Hans Petter Selasky <hps@selasky.org>, Luigi Rizzo <rizzo@iet.unipi.it>,
    Rasool Al-Saadi <ralsaadi@swin.edu.au>,
    "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
    Alexander Motin <mav@freebsd.org>,
    FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: Timing issue with Dummynet on high kernel timer interrupt
Message-ID: <20151107050810.M3139@besplex.bde.org>
In-Reply-To: <1446829230.91534.425.camel@freebsd.org>
References: <6545444AE21C2749939E637E56594CEA3C0DCCC4@gsp-ex02.ds.swin.edu.au>
    <5638B7B5.3030802@selasky.org>
    <6545444AE21C2749939E637E56594CEA3C0DE7FF@gsp-ex02.ds.swin.edu.au>
    <563B2703.5080402@selasky.org>
    <6545444AE21C2749939E637E56594CEA3C0E0BD9@gsp-ex02.ds.swin.edu.au>
    <563C6864.2090907@selasky.org>
    <CA%2BhQ2%2Bhm2z0MkB-8w5xJM7%2Biz13r_ZjwmpZBnb30_D_48gaf-w@mail.gmail.com>
    <563C786C.1050305@selasky.org>
    <CA%2BhQ2%2Bj0WiGgzV119M1ZQiXP5Z7fq7deVxDPkOhvTc7hpTETKw@mail.gmail.com>
    <563CC186.9000807@selasky.org>
    <563CD533.2000909@selasky.org>
    <1446828229.91534.417.camel@freebsd.org>
    <563CDA8F.5010901@selasky.org>
    <1446829230.91534.425.camel@freebsd.org>
On Fri, 6 Nov 2015, Ian Lepore wrote:

> On Fri, 2015-11-06 at 17:51 +0100, Hans Petter Selasky wrote:
>> On 11/06/15 17:43, Ian Lepore wrote:
>>> On Fri, 2015-11-06 at 17:28 +0100, Hans Petter Selasky wrote:
>>>> Hi,
>>
>>>
>>> Do the test II results change with this setting?
>>>
>>>   sysctl kern.timecounter.alloweddeviation=0
>>
>> Yes, it looks much better:
>>
>> debug.total: 10013 -> 0
>> debug.total: 10013 -> 0
>> ...
> This isn't the first time that the alloweddeviation feature has led
> people (including me in the past) to think there is a timing bug.  I
> think the main purpose of the feature is to help save battery power on
> laptops by clustering nearby scheduled wakeups to all happen at the
> same time and then allow for longer sleeps between each wakeup.

I was trying to remember the flag for turning off that "feature".  It
gives the bizarre behaviour that on an old system with a timer
resolution of 10 msec, "time sleep 1" sleeps for 1 second with an
average error of < 10 msec, but with a timer resolution of 1 msec for
hardclock and finer for short timeouts, "time sleep 1" sleeps for an
average of an extra 30 msec (worst case 1.069 seconds IIRC).  Thus
high resolution timers give much lower resolution for medium-sized
timeouts.  (For "sleep 10", the average error is again 30 msec but
this is relatively smaller, and for "sleep .001" the average error
must be less than 1 msec to work at all, though it is likely to be
relatively large.)

> I've been wondering lately whether this might also be behind the
> unexplained "load average is always 0.60" problem people have noticed
> on some systems.  If load average is calculated by sampling what work
> is happening when a timer interrupt fires, and the system is working
> hard to ensure that a timer interrupt only happens when there is actual
> work to do, you'd end up with statistics reporting that there is work
> being done most of the time when it took a sample.

I use HZ = 100 and haven't seen this.
Strangely, HZ = 100 gives the same 69 msec max error for "sleep 1" as
HZ = 1000.

Schedulers should mostly use the actual thread runtimes to avoid
sampling biases.  That might even be faster.  But it doesn't work so
well for the load average, or at all for resource usages that are
averages, or for the usr/sys/intr splitting of the runtime.  It is
good enough for scheduling since the splitting is not needed for
scheduling.

Bruce
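
The "time sleep 1" comparison discussed above can be repeated with a
small script.  What follows is only a rough sketch, not anything taken
from the thread: the function names measure_overshoot and summarize are
made up for illustration, and it assumes that the interpreter's
time.sleep() is a thin wrapper over the system sleep call, so that the
overshoot it sees is the kernel's.  It requests a fixed sleep several
times and reports the average and worst extra delay in msec.

```python
import statistics
import time


def measure_overshoot(requested, runs=3):
    """Sleep `requested` seconds `runs` times; return the extra delay of each sleep in seconds."""
    overshoots = []
    for _ in range(runs):
        start = time.monotonic()
        time.sleep(requested)
        elapsed = time.monotonic() - start
        overshoots.append(elapsed - requested)
    return overshoots


def summarize(overshoots):
    """Return (mean, worst) overshoot converted from seconds to milliseconds."""
    return (statistics.mean(overshoots) * 1e3, max(overshoots) * 1e3)


if __name__ == "__main__":
    # The three durations mentioned in the message: short, medium, long.
    for requested in (0.001, 1.0, 10.0):
        mean_ms, worst_ms = summarize(measure_overshoot(requested))
        print(f"sleep {requested:g}: mean +{mean_ms:.3f} ms, worst +{worst_ms:.3f} ms")
```

Running it once with the default setting and once after
"sysctl kern.timecounter.alloweddeviation=0" should show whether the
extra delay for the 1 second case goes away.  The numbers will depend
on the machine and on interpreter overhead, so treat them as
indicative only.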