From owner-freebsd-net@freebsd.org Fri Nov 6 08:42:40 2015
Return-Path:
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5C6DFA27451
 for ; Fri, 6 Nov 2015 08:42:40 +0000 (UTC)
 (envelope-from hps@selasky.org)
Received: from mail.turbocat.net (mail.turbocat.net [IPv6:2a01:4f8:d16:4514::2])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id E12211806
 for ; Fri, 6 Nov 2015 08:42:39 +0000 (UTC)
 (envelope-from hps@selasky.org)
Received: from laptop015.home.selasky.org (unknown [62.141.129.119])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by mail.turbocat.net (Postfix) with ESMTPSA id BFDE61FE023;
 Fri, 6 Nov 2015 09:42:36 +0100 (CET)
Subject: Re: Timing issue with Dummynet on high kernel timer interrupt
To: Rasool Al-Saadi , "freebsd-net@freebsd.org"
References: <6545444AE21C2749939E637E56594CEA3C0DCCC4@gsp-ex02.ds.swin.edu.au>
 <5638B7B5.3030802@selasky.org>
 <6545444AE21C2749939E637E56594CEA3C0DE7FF@gsp-ex02.ds.swin.edu.au>
 <563B2703.5080402@selasky.org>
 <6545444AE21C2749939E637E56594CEA3C0E0BD9@gsp-ex02.ds.swin.edu.au>
From: Hans Petter Selasky
Message-ID: <563C6864.2090907@selasky.org>
Date: Fri, 6 Nov 2015 09:44:20 +0100
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <6545444AE21C2749939E637E56594CEA3C0E0BD9@gsp-ex02.ds.swin.edu.au>
Content-Type: multipart/mixed; boundary="------------010202050408090608050403"
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 06 Nov 2015 08:42:40 -0000

This is a multi-part message in MIME format.
--------------010202050408090608050403
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit

On 11/06/15 01:08, Rasool Al-Saadi wrote:
>
> On Thursday, 5 November 2015 8:53 PM, Hans Petter Selasky wrote:
>>
>> On 11/05/15 00:44, Rasool Al-Saadi wrote:
>>>
>>> On Wednesday, 4 November 2015 12:34 AM, Hans Petter Selasky wrote:
>>>> On 11/03/15 14:14, Rasool Al-Saadi wrote:
>>>>> Does anyone have thoughts on what we can test next to narrow down
>>>>> the root-cause of these unusual timing jumps?
>>>>
>>>> You might also want to test the "projects/hps_head" branch, which
>>>> uses a bit different callout implementation.
>>>
>>> Thanks Hans for your suggestion.
>>> I have tried the "projects/hps_head" branch and the result is better
>>> (the number of spikes is less than in the master branch). However, the
>>> problem still exists at the same timer interrupt frequencies (>3000 in
>>> my case). You can see in this graph
>>> https://goo.gl/photos/C2Mqx4xhMQuzxWnz6 that the RTT spikes are still
>>> there.
>>>
>>> Do you have any further suggestions?
>>>
>>
>> Hi,
>>
>> If the jitter is in the xx milliseconds range, like your graph shows,
>> my guess is that td_owepreempt is not set when we return from the
>> dummynet() callback. See attached patch.
>
> The patch doesn't solve the problem. I tried it on the master and hps
> branches.
>
>> Else you might want to try to remove the C_HARDCLOCK flag from
>> callout_reset_sbt() in ip_dummynet.c.
>
> Removing C_HARDCLOCK reduces the problem but doesn't solve it
> completely. However, removing C_DIRECT_EXEC instead solves the problem
> (but occasionally very small spikes appear at high hz values).
> I mentioned in my first email that removing these flags makes the issue
> disappear. But what are the effects of removing these flags? If they
> cause timing issues in Dummynet, why should we use them?
>

Hi,

The C_DIRECT_EXEC flag reduces task-switching overhead: the callout
function runs directly from the timer interrupt context, so no
intermediate thread has to be woken up just to wake up the dummynet
worker thread. That affects timing.

Here is one more patch you can try. See attachment. It restarts the
timeout from within the timer callback, instead of when the worker
thread is executing.

--HPS

--------------010202050408090608050403
Content-Type: text/x-patch; name="ipfw_reschedule.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="ipfw_reschedule.diff"

Index: sys/netpfil/ipfw/ip_dn_io.c
===================================================================
--- sys/netpfil/ipfw/ip_dn_io.c	(revision 290134)
+++ sys/netpfil/ipfw/ip_dn_io.c	(working copy)
@@ -712,7 +712,7 @@
 	}
 
 	DN_BH_WUNLOCK();
-	dn_reschedule();
+	//dn_reschedule();
 	if (q.head != NULL)
 		dummynet_send(q.head);
 	CURVNET_RESTORE();
Index: sys/netpfil/ipfw/ip_dummynet.c
===================================================================
--- sys/netpfil/ipfw/ip_dummynet.c	(revision 290134)
+++ sys/netpfil/ipfw/ip_dummynet.c	(working copy)
@@ -84,6 +84,7 @@
 
 	(void)arg;	/* UNUSED */
 	taskqueue_enqueue_fast(dn_tq, &dn_task);
+	dn_reschedule();
 }
 
 void

--------------010202050408090608050403--
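
For illustration only, the difference between the two re-arm points can be sketched with a toy userland model. This is not kernel code. It assumes the timeout is re-armed relative to "now", and it ignores C_HARDCLOCK alignment, locking and everything else the real callout code does. The period, the latency values and both function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: all times are in microseconds. */
enum { PERIOD_US = 250, NTICKS = 8 };

/* Made-up delays between the timer callback and the worker thread running. */
static const long wake_latency_us[NTICKS] = { 5, 40, 10, 300, 5, 20, 5, 15 };

/*
 * Re-arm from the worker thread: each new timeout starts counting only
 * once the worker has run, so every wake-up delay is added to the schedule.
 */
long
fire_time_rearm_in_worker(size_t n)
{
	long t = 0;

	assert(n <= NTICKS);
	for (size_t i = 0; i < n; i++)
		t += wake_latency_us[i] + PERIOD_US;
	return (t);
}

/*
 * Re-arm from the timer callback: the next timeout is started before the
 * worker is scheduled, so worker delays do not shift later expiries.
 */
long
fire_time_rearm_in_callback(size_t n)
{

	assert(n <= NTICKS);
	return ((long)n * PERIOD_US);
}
```

With these made-up latencies the seventh expiry lands at 2135 us when re-arming from the worker and at 1750 us when re-arming from the callback. Whether that effect explains the RTT spikes in this thread is exactly what the patch is meant to test.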