From owner-freebsd-current@FreeBSD.ORG Thu Feb 9 16:13:59 2006
Return-Path:
X-Original-To: freebsd-current@freebsd.org
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id DAEB116A420
	for ; Thu, 9 Feb 2006 16:13:59 +0000 (GMT)
	(envelope-from jhb@freebsd.org)
Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4EE9C43D75
	for ; Thu, 9 Feb 2006 16:13:49 +0000 (GMT)
	(envelope-from jhb@freebsd.org)
Received: from server.baldwin.cx (unverified [66.23.211.162])
	by speedfactory.net (SurgeMail 3.5b3) with ESMTP id 8094421
	for multiple; Thu, 09 Feb 2006 11:12:53 -0500
Received: from localhost (john@localhost [127.0.0.1])
	by server.baldwin.cx (8.13.4/8.13.4) with ESMTP id k19GDarD068059;
	Thu, 9 Feb 2006 11:13:38 -0500 (EST)
	(envelope-from jhb@freebsd.org)
From: John Baldwin
To: Andrew Gallatin
Date: Thu, 9 Feb 2006 11:13:30 -0500
User-Agent: KMail/1.9.1
References: <17379.56708.421007.613310@grasshopper.cs.duke.edu>
	<200602081033.00953.jhb@freebsd.org>
	<17386.10130.139455.567203@grasshopper.cs.duke.edu>
In-Reply-To: <17386.10130.139455.567203@grasshopper.cs.duke.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200602091113.31900.jhb@freebsd.org>
X-Virus-Scanned: ClamAV 0.87.1/1281/Wed Feb 8 14:59:33 2006 on
	server.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-1.4 required=4.2 tests=ALL_TRUSTED autolearn=ham
	version=3.1.0
X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on server.baldwin.cx
X-Server: High Performance Mail Server - http://surgemail.com r=1653887525
Cc: freebsd-current@freebsd.org
Subject: Re: machdep.cpu_idle_hlt and SMP perf?
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Feb 2006 16:14:00 -0000

On Wednesday 08 February 2006 12:17, Andrew Gallatin wrote:
> John Baldwin writes:
> > On Tuesday 07 February 2006 17:46, Andrew Gallatin wrote:
> > > John Baldwin writes:
> > > > On Tuesday 07 February 2006 17:15, Andrew Gallatin wrote:
> > > > > John Baldwin writes:
> > > > > > On Monday 06 February 2006 17:37, Andrew Gallatin wrote:
> > > > > > > John Baldwin writes:
> > > > > > > > On Monday 06 February 2006 14:46, Andrew Gallatin wrote:
> > > > > > > > > Andre Oppermann writes:
> > > > > > > > > > Andrew Gallatin wrote:
> > > > > > > > > > > Why does machdep.cpu_idle_hlt=1 drop my 10GbE
> > > > > > > > > > > network rx performance by a considerable amount
> > > > > > > > > > > (7.5Gbs -> 5.5Gbs)?
> > > > > > > >
> > > > > > > > You may be seeing problems because it might simply take a
> > > > > > > > while for the CPU to wake up from HLT when an interrupt
> > > > > > > > comes in. The 4BSD scheduler tries to do IPIs to wake up
> > > > > > > > any sleeping CPUs when it schedules a new thread, but
> > > > > > > > that would add higher latency for ithreads than just
> > > > > > > > preempting directly to the ithread. Oh, you have to turn
> > > > > > > > that on, it's off by default
> > > > > > > > (kern.sched.ipiwakeup.enabled=1).
> > > > > > > >
> > > > > > > Hmm.. It seems to be on by default. Unfortunately, it does
> > > > > > > not seem to help.
> > > > > >
> > > > > > I'm not sure.
> > > > >
> > > > > One thing which really helps is disabling preemption. If I do
> > > > > that, I get 7.7Gb/sec with machdep.cpu_idle_hlt=1. This is
> > > > > slightly better than machdep.cpu_idle_hlt=0 and no PREEMPTION.
> > > > >
> > > > > BTW, net.isr.direct=1 in all testing.
> > > >
> > > > Do you have very little userland activity in this test?
> > >
> > > Essentially none. netserver just sits in a loop, reading from the
> > > socket and throwing the data away.
> >
> > If you disable preemption then in effect you are letting the idle CPUs
> > pick up the ithread and not disturbing what is running on the non-idle
> > CPU. sched_4bsd is supposed to be triggering the same behavior, except
> > that it has to send an IPI to awaken the idle CPUs. When you have
> > idle_hlt=0, there are no idle CPUs, so 4bsd thinks they are all busy and
> > preempts. When you disable preemption, it just leaves the ithread on
> > the runqueue until one of the idle CPUs notices the new thread in its
> > idle loop and runs it. When you have idle_hlt=1, then 4bsd doesn't
> > preempt but sends an IPI. It doesn't even try to preempt unless it
> > thinks all CPUs are busy.
>
> I wish we had a lightweight way to watch all this stuff. I can't
> wait for dtrace.

You can try using KTR with KTR_SCHED and then using schedgraph.py to look
at what happens. I'm not sure how lightweight that might be if you just
have KTR on and no other debug stuff.

> FWIW, if I use SCHED_ULE, performance sucks regardless of idle_hlt.

Hmmmm.

> > One thing disabling PREEMPTION does is that it enables some explicit
> > FULL_PREEMPTION-like behavior in _mtx_unlock_sleep(). You might want to
> > try #if 0'ing that code out to see if that is why having PREEMPTION off
> > makes a difference. (Ironically, having PREEMPTION on means
> > _mtx_unlock_sleep() will preempt less often.)
>
> Removing that code did not seem to matter. I still get good
> performance with SCHED_4BSD, PREEMPTION disabled, idle_hlt=1, and that
> code removed.

Ok. Hmmmmm.

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
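
The wakeup paths described in this message can be written down as a small
toy model. This is only a sketch of the explanation given in the thread, not
the sched_4bsd code: the names Config and ithread_wakeup_action and the
returned strings are hypothetical, and the case of idle_hlt=1 with PREEMPTION
compiled out is an assumption, since the thread does not spell it out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical container for the two knobs discussed in the thread."""
    idle_hlt: bool    # machdep.cpu_idle_hlt
    preemption: bool  # kernel built with "options PREEMPTION"


def ithread_wakeup_action(cfg: Config) -> str:
    """Return how a newly runnable ithread reaches a CPU in this toy model.

    "ipi"      - an idle CPU is woken with an IPI and picks up the ithread
    "preempt"  - the ithread preempts whatever is running on a CPU
    "runqueue" - the ithread waits until a spinning idle loop notices it
    """
    # Per the thread: with idle_hlt=0 the idle CPUs spin instead of halting,
    # so the scheduler believes every CPU is busy.
    scheduler_sees_idle_cpu = cfg.idle_hlt

    if scheduler_sees_idle_cpu:
        # Per the thread: 4BSD does not preempt here, it sends an IPI.
        # ASSUMPTION: the same path is taken when PREEMPTION is off.
        return "ipi"

    # All CPUs look busy: preempt if the kernel allows it, otherwise leave
    # the ithread on the run queue for an idle loop to find.
    return "preempt" if cfg.preemption else "runqueue"


if __name__ == "__main__":
    for idle_hlt in (False, True):
        for preemption in (False, True):
            cfg = Config(idle_hlt=idle_hlt, preemption=preemption)
            print(cfg, "->", ithread_wakeup_action(cfg))
```

In this model the only combination that preempts the running thread is
idle_hlt=0 with PREEMPTION on. The model says nothing about why the IPI path
was slower in the measurements above; that question is still open in the
thread.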