From owner-freebsd-current@FreeBSD.ORG Sun Dec 30 23:13:58 2012
Subject: Re: [RFC/RFT] calloutng
From: Ian Lepore
To: Alexander Motin
In-Reply-To: <50DB4EFE.2020600@FreeBSD.org>
References: <50CCAB99.4040308@FreeBSD.org> <50CE5B54.3050905@FreeBSD.org>
 <50D03173.9080904@FreeBSD.org> <20121225232126.GA47692@alchemy.franken.de>
 <50DB4EFE.2020600@FreeBSD.org>
Content-Type: multipart/mixed; boundary="=-0QeMdI0U6ePkF5/otAWO"
Date: Sun, 30 Dec 2012 16:13:43 -0700
Message-ID: <1356909223.54953.74.camel@revolution.hippie.lan>
Mime-Version: 1.0
X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port
Cc: Davide Italiano, freebsd-arch@freebsd.org, FreeBSD Current, Marius Strobl
List-Id: Discussions about the use of FreeBSD-current

--=-0QeMdI0U6ePkF5/otAWO
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Wed, 2012-12-26 at 21:24 +0200, Alexander Motin wrote:
> On 26.12.2012 01:21, Marius Strobl wrote:
> > On Tue, Dec 18, 2012 at 11:03:47AM +0200, Alexander Motin wrote:
> >> Experiments with dummynet shown ineffective support for very short
> >> tick-based callouts. New version fixes that, allowing to get as many
> >> tick-based callout events as hz value permits, while still be able to
> >> aggregate events and generating minimum of interrupts.
> >>
> >> Also this version modifies system load average calculation to fix some
> >> cases existing in HEAD and 9 branches, that could be fixed with new
> >> direct callout functionality.
> >>
> >> http://people.freebsd.org/~mav/calloutng_12_17.patch
> >>
> >> With several important changes made last time I am going to delay commit
> >> to HEAD for another week to do more testing. Comments and new test cases
> >> are welcome. Thanks for staying tuned and commenting.
> >
> > FYI, I gave both calloutng_12_15_1.patch and calloutng_12_17.patch a
> > try on sparc64 and it at least survives a buildworld there. However,
> > with the patched kernels, buildworld times seem to increase slightly but
> > reproducible by 1-2% (I only did four runs but typically buildworld
> > times are rather stable and don't vary more than a minute for the
> > same kernel and source here). Is this an expected trade-off (system
> > time as such doesn't seem to increase)?
>
> I don't think build process uses significant number of callouts to
> affect results directly. I think this additional time could be result of
> the deeper next event look up, done by the new code, that is practically
> useless for sparc64, which effectively has no cpu_idle() routine. It
> wouldn't affect system time and wouldn't show up in any statistics
> (except PMC or something alike) because it is executed inside timer
> hardware interrupt handler. If my guess is right, that is a part that
> probably still could be optimized. I'll look on it. Thanks.
>
> > Is there anything specific to test?
>
> Since the most of code is MI, for sparc64 I would mostly look on related
> MD parts (eventtimers and timecounters) to make sure they are working
> reliably in more stressful conditions. I still have some worries about
> possible deadlock on hardware where IPIs are used to fetch present time
> from other CPU.
>
> Here is small tool we are using for test correctness and performance of
> different user-level APIs: http://people.freebsd.org/~mav/testsleep.c
>

I grabbed testsleep.c to test an arm event timer implementation, and had
to fix a couple of nits: kqueueto was missing from the names[] array, and
I had to add a "* 1000" in a couple of places where usec was stuffed into
a timespec's tv_nsec.

I also tested the calloutng_12_17 patches, and the kqueue stuff behaved
very strangely. Then I noticed you had a 12_26 patchset, so I tested that
(after crudely fixing a couple of uninitialized-variable warnings), and
it all looks good on this arm (Raspberry Pi). I'll attach the results.

It's so sweet to be able to do precision sleeps.

-- Ian

--=-0QeMdI0U6ePkF5/otAWO
Content-Disposition: inline; filename="calloutng_test.txt"
Content-Type: text/plain; name="calloutng_test.txt"; charset="us-ascii"
Content-Transfer-Encoding: 7bit

for t in 1 300 3000 30000 300000 ; do
  for m in select poll usleep nanosleep kqueue kqueueto syscall ; do
    ./testsleep $t $m
  done
done

With calloutng_12_26.patch...

                     HZ=100           HZ=250          HZ=1000
---------- ---------------- ---------------- ----------------
select          1     55.79      1     50.96      1     61.32
poll            1   1109.46      1   1107.86      1   1114.38
usleep          1     56.33      1     72.90      1     62.78
nanosleep       1     52.66      1     55.23      1     64.23
kqueue          1   1114.23      1   1113.81      1   1121.21
kqueueto        1     65.44      1     71.00      1     75.01
syscall         1      4.70      1      4.45      1      4.55

select        300    355.79    300    357.76    300    362.35
poll          300   1107.85    300   1122.55    300   1115.62
usleep        300    355.28    300    357.28    300    360.79
nanosleep     300    354.49    300    355.82    300    360.62
kqueue        300   1112.57    300   1118.13    300   1117.16
kqueueto      300    375.98    300    378.62    300    395.61
syscall       300      4.41    300      4.45    300      4.54

select       3000   3246.75   3000   3246.74   3000   3252.72
poll         3000   3238.10   3000   3229.12   3000   3250.10
usleep       3000   3242.47   3000   3237.06   3000   3249.61
nanosleep    3000   3238.79   3000   3231.55   3000   3248.11
kqueue       3000   3240.01   3000   3236.07   3000   3247.60
kqueueto     3000   3265.36   3000   3267.22   3000   3274.96
syscall      3000      4.69   3000      4.44   3000      4.50

select      30000  31714.60  30000  31941.17  30000  32467.69
poll        30000  31522.76  30000  31983.00  30000  32497.81
usleep      30000  31459.67  30000  31980.76  30000  32458.71
nanosleep   30000  31431.02  30000  31982.22  30000  32525.20
kqueue      30000  31466.75  30000  31873.90  30000  31973.54
kqueueto    30000  31564.67  30000  32522.35  30000  32475.59
syscall     30000      4.70  30000      4.73  30000      4.89

select     300000 319133.02 300000 311562.33 300000 309918.62
poll       300000 319604.27 300000 311422.94 300000 310000.76
usleep     300000 319314.60 300000 311269.69 300000 309996.34
nanosleep  300000 319497.58 300000 311425.40 300000 309997.13
kqueue     300000 309995.55 300000 303980.27 300000 309908.82
kqueueto   300000 319505.88 300000 311424.97 300000 309996.16
syscall    300000      4.41 300000      4.45 300000      4.89

With no patches...

                     HZ=100           HZ=250          HZ=1000
---------- ---------------- ---------------- ----------------
select          1  19941.70      1   7989.10      1   1999.16
poll            1  19904.61      1   7987.32      1   1999.78
usleep          1  19904.95      1   7993.30      1   1999.96
nanosleep       1  19905.64      1   7993.71      1   1999.72
kqueue          1  10001.61      1   4004.00      1   1000.27
kqueueto        1  19904.00      1   7993.03      1   1999.54
syscall         1      4.04      1      4.05      1      4.75

select        300  19904.66    300   7998.39    300   2000.27
poll          300  19904.35    300   7993.47    300   1999.86
usleep        300  19903.96    300   7994.11    300   1999.81
nanosleep     300  19904.48    300   7993.77    300   1999.80
kqueue        300  10001.68    300   4004.18    300   1000.31
kqueueto      300  19997.86    300   7993.37    300   1999.59
syscall       300      4.01    300      4.00    300      4.32

select       3000  19904.80   3000   7998.85   3000   3998.43
poll         3000  19904.92   3000   8005.93   3000   3999.39
usleep       3000  19904.50   3000   7992.88   3000   3999.44
nanosleep    3000  19904.84   3000   7993.34   3000   3999.36
kqueue       3000  10001.58   3000   4003.97   3000   3000.72
kqueueto     3000  19903.56   3000   7993.24   3000   3999.34
syscall      3000      4.02   3000      4.37   3000      4.29

select      30000  39905.02  30000  35991.79  30000  31051.77
poll        30000  39905.49  30000  35980.35  30000  30995.64
usleep      30000  39903.78  30000  35979.48  30000  30995.23
nanosleep   30000  39904.55  30000  35981.61  30000  30995.87
kqueue      30000  30002.73  30000  32019.54  30000  30004.83
kqueueto    30000  39903.59  30000  35979.64  30000  30996.05
syscall     30000      4.44  30000      4.04  30000      4.31

select     300000 310001.23 300000 303995.86 300000 300994.30
poll       300000 309902.73 300000 303981.58 300000 300996.17
usleep     300000 309903.64 300000 303980.17 300000 300997.42
nanosleep  300000 309903.32 300000 303980.36 300000 300993.64
kqueue     300000 300002.77 300000 300019.46 300000 300006.90
kqueueto   300000 309903.31 300000 303978.10 300000 300996.84
syscall    300000      4.01 300000      4.04 300000      4.29

--=-0QeMdI0U6ePkF5/otAWO--
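
For readers following the tv_nsec fix mentioned in the message (adding a "* 1000" where a microsecond count was stored in a timespec's tv_nsec), here is a minimal sketch of that kind of conversion. It is an illustration only and is not taken from testsleep.c: the helper name usec_to_timespec is hypothetical, and it assumes a non-negative microsecond count.

```c
#include <time.h>

/*
 * Hypothetical helper (not from testsleep.c): convert a non-negative
 * microsecond count into a struct timespec.
 *
 * tv_nsec holds nanoseconds, so the sub-second remainder must be
 * multiplied by 1000. Storing microseconds there directly would make
 * the requested interval 1000 times shorter than intended.
 */
static struct timespec
usec_to_timespec(long usec)
{
	struct timespec ts;

	ts.tv_sec = usec / 1000000;           /* whole seconds */
	ts.tv_nsec = (usec % 1000000) * 1000; /* remaining usec -> nsec */
	return (ts);
}
```

A caller could then pass the address of the result to nanosleep(2), for example `struct timespec ts = usec_to_timespec(300); nanosleep(&ts, NULL);`. Splitting off whole seconds first keeps tv_nsec below 1000000000, which nanosleep requires.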