From: Andrew Gallatin
Date: Wed, 8 Feb 2006 12:17:06 -0500 (EST)
To: John Baldwin
Cc: freebsd-current@freebsd.org
Subject: Re: machdep.cpu_idle_hlt and SMP perf?
Message-ID: <17386.10130.139455.567203@grasshopper.cs.duke.edu>
In-Reply-To: <200602081033.00953.jhb@freebsd.org>
References: <17379.56708.421007.613310@grasshopper.cs.duke.edu> <200602071730.53881.jhb@freebsd.org> <17385.9034.309439.331530@grasshopper.cs.duke.edu> <200602081033.00953.jhb@freebsd.org>

John Baldwin writes:
> On Tuesday 07 February 2006 17:46, Andrew Gallatin wrote:
> > John Baldwin writes:
> > > On Tuesday 07 February 2006 17:15, Andrew Gallatin wrote:
> > > > John Baldwin writes:
> > > > > On Monday 06 February 2006 17:37, Andrew Gallatin wrote:
> > > > > > John Baldwin writes:
> > > > > > > On Monday 06 February 2006 14:46, Andrew Gallatin wrote:
> > > > > > > > Andre Oppermann writes:
> > > > > > > > > Andrew Gallatin wrote:
> > > > > > > > > > Why does machdep.cpu_idle_hlt=1 drop my 10GbE network
> > > > > > > > > > rx performance by a considerable amount (7.5Gb/s ->
> > > > > > > > > > 5.5Gb/s)?
> > > > > > >
> > > > > > > You may be seeing problems because it might simply take a
> > > > > > > while for the CPU to wake up from HLT when an interrupt comes
> > > > > > > in. The 4BSD scheduler tries to do IPIs to wakeup any
> > > > > > > sleeping CPUs when it schedules a new thread, but that would
> > > > > > > add higher latency for ithreads than just preempting directly
> > > > > > > to the ithread. Oh, you have to turn that on, it's off by
> > > > > > > default (kern.sched.ipiwakeup.enabled=1).
> > > > > >
> > > > > > Hmm.. It seems to be on by default. Unfortunately, it does not
> > > > > > seem to help.
> > > > >
> > > > > I'm not sure.
> > > >
> > > > One thing which really helps is disabling preemption. If I do that,
> > > > I get 7.7Gb/sec with machdep.cpu_idle_hlt=1. This is slightly better
> > > > than machdep.cpu_idle_hlt=0 and no PREEMPTION.
> > > >
> > > > BTW, net.isr.direct=1 in all testing.
> > >
> > > Do you have very little userland activity in this test?
> >
> > Essentially none. netserver just sits in a loop, reading from the
> > socket and throwing the data away.
>
> If you disable preemption then in effect you are letting the idle CPUs pick up
> the ithread and not disturbing what is running on the non-idle CPU.
> sched_4bsd is supposed to be triggering the same behavior, except that it has
> to send an IPI to awaken the idle CPUs. When you have idle_hlt=0, there are
> no idle CPUs, so 4bsd thinks they are all busy and preempts. When you
> disable preemption, it just leaves the ithread on the runqueue until one of
> the idle CPUs notices the new thread in its idle loop and runs it. When you
> have idle_hlt=1, then 4bsd doesn't preempt but sends an IPI. It doesn't even
> try to preempt unless it thinks all CPUs are busy.

I wish we had a lightweight way to watch all this stuff. I can't wait
for dtrace.

FWIW, if I use SCHED_ULE, performance sucks regardless of idle_hlt.

> One thing disabling PREEMPTION does is that it enables some explicit
> FULL_PREEMPTION-like behavior in _mtx_unlock_sleep(). You might want to try
> #if 0'ing that code out to see if that is why having PREEMPTION off makes a
> difference. (Ironically, having PREEMPTION on means _mtx_unlock_sleep() will
> preempt less often.)

Removing that code did not seem to matter. I still get good
performance with SCHED_4BSD, PREEMPTION disabled, idle_hlt=1, and that
code removed.

Drew
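
The wakeup behavior John describes for sched_4bsd can be summarized as a small decision model. This is only a sketch of the explanation quoted above, not kernel code: the names `Action` and `ithread_wakeup_action` and the `halted_idle_cpus` parameter are hypothetical, and the model assumes that with idle_hlt=0 the scheduler sees no idle CPUs at all, as John states.

```python
from enum import Enum


class Action(Enum):
    """What happens to a newly runnable ithread (hypothetical labels)."""
    SEND_IPI = "send an IPI to wake a halted idle CPU"
    PREEMPT = "preempt the thread running on the current CPU"
    LEAVE_ON_RUNQUEUE = "leave on the run queue for an idle loop to find"


def ithread_wakeup_action(halted_idle_cpus: int, preemption: bool) -> Action:
    """Model of the sched_4bsd choice as described in the thread.

    halted_idle_cpus: CPUs the scheduler considers idle. Assumed to be 0
        when machdep.cpu_idle_hlt=0, because spinning idle CPUs are not
        counted as idle in John's description.
    preemption: whether the kernel was built with PREEMPTION.
    """
    if halted_idle_cpus > 0:
        # "4bsd doesn't preempt but sends an IPI."
        return Action.SEND_IPI
    if preemption:
        # "4bsd thinks they are all busy and preempts."
        return Action.PREEMPT
    # "it just leaves the ithread on the runqueue until one of the idle
    # CPUs notices the new thread in its idle loop and runs it."
    return Action.LEAVE_ON_RUNQUEUE


if __name__ == "__main__":
    # The four configurations discussed in the thread, assuming one other
    # CPU is idle (the test has essentially no userland activity).
    for idle_hlt in (0, 1):
        for preemption in (True, False):
            idle = 1 if idle_hlt else 0
            action = ithread_wakeup_action(idle, preemption)
            print(f"idle_hlt={idle_hlt} PREEMPTION={preemption}: {action.value}")
```

Note that this model gives the same action for idle_hlt=1 with and without PREEMPTION, so on its own it does not account for the difference measured between those two configurations.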