From owner-freebsd-net@freebsd.org  Fri Feb  2 10:35:44 2018
Date: Fri, 2 Feb 2018 21:35:34 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
cc: freebsd-net@freebsd.org
Subject: Re: [Bug 225535] Delays in TCP connection over Gigabit Ethernet
 connections; Regression from 6.3-RELEASE
In-Reply-To: <bug-225535-2472-aO6XhDULXP@https.bugs.freebsd.org/bugzilla/>
Message-ID: <20180202205843.S1047@besplex.bde.org>
References: <bug-225535-2472@https.bugs.freebsd.org/bugzilla/>
 <bug-225535-2472-aO6XhDULXP@https.bugs.freebsd.org/bugzilla/>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Thu, 1 Feb 2018 a bug that doesn't want replies@freebsd.org wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225535
> ...
> --- Comment #21 from Aleksander Derevianko <aeder@list.ru> ---
> OK, problem solved. 12-hours test doesn't show any delays longer then 1 ms.
>
> For the future reference, only following parameters must be set in
> /etc/sysctl.conf:
> --------------------------------------------------
> # Exact clock for nanosleep()
> kern.timecounter.alloweddeviation=0
>
> # Disable random delays in network/adapter code
> kern.eventtimer.periodic=1
> --------------------------------------------------
> All other parameters (boot.loader) can be left as system defaults.
>
> Closing defect with "Works As Intended" because it's possible to solve just in
> OS configuration.

It doesn't work as intended.

kern.timecounter.alloweddeviation=0 might be the correct setting for the
default, since nonzero gives many other surprising behaviours.  For example,
"time sleep 1" usually appeared to sleep for precisely 1 second in old
versions of FreeBSD despite a large timer granularity (10 msec, the same
as time(1)), while under -current it often sleeps for up to about 66 msec
extra despite (actually because of) a timer granularity of 1 msec (actually
much smaller, provided the caller asks for it).  High resolution timers
can be inefficient and are usually not needed, so FreeBSD allows inaccuracy
of about 5% by default, up to an interval of about 1 second ("time sleep 10"
has the same absolute inaccuracy of up to about 66 msec, but that is only
0.66% relative inaccuracy).
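
A rough userland way to observe this kind of oversleep is to time a sleep
against the monotonic clock.  This is only a sketch: it uses Python's
time.sleep() rather than sleep(1) or nanosleep() directly, the helper name
oversleep_ns is made up for illustration, and interpreter overhead is added
to whatever slop the kernel allows.

```python
import time


def oversleep_ns(request_s):
    """Sleep for request_s seconds; return the extra time taken, in ns.

    Hypothetical helper, not part of any FreeBSD tool.
    """
    request_ns = int(round(request_s * 1e9))
    start = time.monotonic_ns()
    time.sleep(request_s)
    return time.monotonic_ns() - start - request_ns


if __name__ == "__main__":
    # Longer requests (e.g. 10.0) can be added to compare absolute and
    # relative inaccuracy over a wider range.
    for request_s in (0.01, 0.1, 1.0):
        extra_ns = oversleep_ns(request_s)
        extra_ms = extra_ns / 1e6
        relative = 100.0 * extra_ns / (request_s * 1e9)
        print("sleep %5.2f s: %8.3f msec extra (%.2f%% relative)"
              % (request_s, extra_ms, relative))
```

Running it once with the default settings and once with
kern.timecounter.alloweddeviation=0 should make any difference visible,
though the exact numbers depend on the machine and its load.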

Changing kern.timecounter.alloweddeviation to 0 breaks the optimization for
all uses just to fix the 1 known problem here.

The main bug here is probably that some TCP or application timer doesn't
ask for the high resolution that it needs (I don't know how to specify the
resolution for applications, and there are security problems with allowing
it to be small).  The timeouts are apparently long, so 5% relative is a lot
in absolute terms.  I think 5% relative should only apply to short timeouts,
say up to 10 msec instead of up to 1 second.
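
To make the arithmetic concrete, here is a simplified model of a relative
allowance that stops growing past some cap.  It is not the kernel's code:
the function name and the exact 5% figure are assumptions for illustration,
and the kernel's real rounding evidently allows somewhat more (about 66 msec,
as noted above) than a plain 5% of 1 second.

```python
def allowed_slop_ms(timeout_ms, percent=5.0, cap_ms=1000.0):
    """Model: slop is `percent` of the timeout, but the timeout is
    clamped to cap_ms first, so the slop stops growing past the cap.

    Hypothetical helper; all parameters are illustrative assumptions.
    """
    return percent / 100.0 * min(timeout_ms, cap_ms)


if __name__ == "__main__":
    print("timeout_ms  slop(cap=1s)  slop(cap=10ms)")
    for timeout_ms in (1.0, 10.0, 200.0, 1000.0, 10000.0):
        current = allowed_slop_ms(timeout_ms, cap_ms=1000.0)
        proposed = allowed_slop_ms(timeout_ms, cap_ms=10.0)
        print("%10.1f  %12.3f  %14.3f" % (timeout_ms, current, proposed))
```

In this model a 200 msec timeout may be late by 10 msec with the 1 second
cap, but by only 0.5 msec with a 10 msec cap.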

kern.eventtimer.periodic=1 is never the correct setting except with buggy
hardware.  It turns off most of the new timer code, and breaks setting of
resolutions below 1/HZ.  I actually prefer simpler timeout code with only
periodic timers supported, and rarely need high resolutions, but I also
prefer large HZ, and the new timer code works better with that provided
kern.eventtimer.periodic is not 1.  Periodic timers can be more efficient,
but this doesn't show up in my benchmarks.

Since kern.eventtimer.periodic=1 seems to help, there might be further bugs,
but I suspect this is an artifact of the measurements.  With periodic timers,
everything tends to wake up at the same time and see only constant differences
in times (an integer multiple of the timeout period).  Worse, activity between
timeout interrupts tends to be invisible.
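
The suspected artifact can be illustrated with a toy model.  This is a
sketch under loud assumptions, not how the kernel schedules anything:
suppose every sleeper wakes only on the next periodic tick and takes its
timestamp at wakeup.  Then every measured interval is an integer multiple
of the tick period, however irregular the underlying events were.  The
function names and the 1000 usec period are made up for the example.

```python
import random


def next_tick(t_us, period_us):
    """Round t_us up to the next multiple of period_us (integer usec)."""
    return -(-t_us // period_us) * period_us


def observed_intervals(event_times_us, period_us):
    """Intervals between wakeups when each event is only seen at a tick."""
    wakeups = [next_tick(t, period_us) for t in event_times_us]
    return [b - a for a, b in zip(wakeups, wakeups[1:])]


if __name__ == "__main__":
    rng = random.Random(1)      # fixed seed so runs are repeatable
    period_us = 1000            # assumed tick period (HZ = 1000)
    events, t = [], 0
    for _ in range(8):
        t += rng.randint(200, 3500)   # irregular "real" event spacing
        events.append(t)
    true_gaps = [b - a for a, b in zip(events, events[1:])]
    seen_gaps = observed_intervals(events, period_us)
    print("true gaps (usec):    ", true_gaps)
    print("observed gaps (usec):", seen_gaps)
```

Events that fall within the same tick show an observed gap of 0, which is
the toy-model version of activity between timeout interrupts being
invisible.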

Bruce