Date: Fri, 6 Jan 2006 10:23:20 -0800 (PST)
From: Matthew Dillon
Message-Id: <200601061823.k06INKYn081999@apollo.backplane.com>
To: Chuck Swiger
References: <73774.1136109554@critter.freebsd.dk> <20060101035958.A86264@xorpc.icir.org> <43B7E1EC.5090301@mac.com> <200601060636.k066aNYn079015@apollo.backplane.com> <43BEA718.6090306@mac.com>
Cc: Luigi Rizzo, Poul-Henning Kamp, current@freebsd.org
Subject: Re: FreeBSD handles leapsecond correctly

:Out of curiosity, what is DragonFly doing with the network timing counters (ie,
:TCPOPT_TIMESTAMP and the stuff in ), has that been
:separated from HZ too?
:
:I'm pretty sure that setting:
:
:#define TCPTV_MSL ( 30*hz) /* max seg lifetime (hah!) */
:
:...with HZ=1000 or more is not entirely correct. :-) Not when it started with
:the TTL in hops being equated to one hop per second...
:
:--
:-Chuck

Well, you know what they say... if it ain't broke, don't fix it. In this
case the network stacks use that wonderful callwheel code that was
written years ago (in FreeBSD). SYSTIMERS aren't designed to handle
billions of timers like the callwheel code is, so it wouldn't be a
proper application.

The one change I made to the callwheel code was to make it per-cpu in
order to guarantee that e.g. a device driver that installs an interrupt
and a callout would get both on the same cpu and thus be able to use
normal critical sections to interlock between them. This is a
particularly important aspect of our lockless per-cpu tcp protocol
threads. DragonFly's crit_enter()/crit_exit() together only take 9ns
(with INVARIANTS turned on), whereas the minimum non-contended inline
mutex (lwkt_serialize_enter()/exit()) takes around 20ns.

I don't know what edge cases exist when 'hz' is set so high. Since we
don't use hz for things that would normally require it to be set to a
high frequency, we just leave hz set to 100.

--

One side note. I've found both our userland (traditional bsd4) scheduler
and our LWKT scheduler to be really finicky about being properly woken
up via AST when a reschedule is required. Preemption by non-interrupt
threads is not beneficial at all since most major kernel operations
take < 1uS to execute. 'hz' is not relevant because it only affects
processes operating in batch. But 'forgetting' to queue an AST to
reschedule a thread ASAP (without preempting) when you are supposed to
can result in terrible interactive response, because processes wind up
using their whole quantum before they realize that they should have
rescheduled.

I've managed to break this three times over the years in DragonFly...
stupid things like forgetting a crit_exit(), or clearing the reschedule
bit without actually rescheduling, or doing the wrong check in doreti(),
etc.
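To make the "clearing the reschedule bit without actually rescheduling"
failure concrete, here is a toy user-space simulation. It is only a
sketch under loud assumptions: every name in it (sim_cpu,
request_resched, boundary_check_ok, QUANTUM_TICKS, and so on) is
hypothetical and none of it is DragonFly or FreeBSD code. A wakeup sets
a flag instead of preempting, and the flag is supposed to be acted on at
the next kernel-to-user boundary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; not taken from any real kernel. */
enum { QUANTUM_TICKS = 10 };

struct sim_cpu {
    bool need_resched; /* set when another thread should run soon */
    int  ran_ticks;    /* ticks the current thread has run since last switch */
    int  switches;     /* context switches performed so far */
};

/* Wakeup path: do not preempt, just ask for a reschedule at the boundary. */
static void request_resched(struct sim_cpu *cpu)
{
    cpu->need_resched = true;
}

static void do_switch(struct sim_cpu *cpu)
{
    cpu->ran_ticks = 0;
    cpu->switches++;
}

/* Correct boundary check: consume the flag AND switch. */
static void boundary_check_ok(struct sim_cpu *cpu)
{
    if (cpu->need_resched) {
        cpu->need_resched = false;
        do_switch(cpu);
    }
}

/* Buggy boundary check: the flag is cleared but nobody switches. */
static void boundary_check_buggy(struct sim_cpu *cpu)
{
    if (cpu->need_resched)
        cpu->need_resched = false;
}

/*
 * A cpu-bound thread starts at tick 0 and another thread becomes
 * runnable at wake_tick. Returns how many ticks pass between that
 * wakeup and the first context switch.
 */
static int ticks_until_switch(void (*boundary_check)(struct sim_cpu *),
                              int wake_tick)
{
    struct sim_cpu cpu = { false, 0, 0 };

    assert(wake_tick >= 0 && wake_tick < QUANTUM_TICKS);

    for (int tick = 0; ; tick++) {
        if (tick == wake_tick)
            request_resched(&cpu);
        cpu.ran_ticks++;                     /* current thread runs one tick */
        boundary_check(&cpu);                /* kernel-to-user boundary */
        if (cpu.ran_ticks >= QUANTUM_TICKS)  /* quantum expired: forced switch */
            do_switch(&cpu);
        if (cpu.switches > 0)
            return tick - wake_tick + 1;
    }
}
```

In this model, ticks_until_switch(boundary_check_ok, 2) returns 1, while
ticks_until_switch(boundary_check_buggy, 2) returns 8: with the buggy
check the woken thread waits until the running thread has burned the
rest of its quantum, and the problem only shows up when something
cpu-bound is running.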
The bugs often went unnoticed for weeks, until someone ran a heavily
cpu-bound workload or test. It is the A#1 problem that you have to look
for if you have scheduler issues.

All non-interrupt-thread preemption accomplishes is to blow up your
caches and prevent you from being able to aggregate work between threads
(which could be especially important since your I/O is threaded in
FreeBSD).

-Matt
Matthew Dillon