From owner-freebsd-sparc64@FreeBSD.ORG Fri May 20 21:41:08 2011
From: Jeff Roberson
Date: Fri, 20 May 2011 10:37:31 -1000 (HST)
To: Marius Strobl
Cc: jeff@freebsd.org, freebsd-sparc64@freebsd.org
Subject: Re: SCHED_ULE on sparc64
In-Reply-To: <20110520150439.GR92688@alchemy.franken.de>
References: <20110519195245.GA3039@server.vk2pj.dyndns.org>
 <20110520103841.GA40497@alchemy.franken.de>
 <20110520124102.GA80878@server.vk2pj.dyndns.org>
 <20110520150439.GR92688@alchemy.franken.de>
List-Id: Porting FreeBSD to the Sparc

On Fri, 20 May 2011, Marius Strobl wrote:

> On Fri, May 20, 2011 at 10:41:02PM +1000, Peter Jeremy wrote:
>> On 2011-May-20 12:38:41 +0200, Marius Strobl wrote:
>>> The main problem with SCHED_ULE on sparc64 is that the MD code
>>> (ab)uses the global sched_lock of
>>> SCHED_4BSD to protect pm_context,
>>> pm_active and pc_pmap, partially of all CPUs, and SCHED_ULE doesn't
>>> use/provide such a lock. One could replace the use of sched_lock
>>> for that with a global MD spin lock but this has the issue that it
>>> would have to be acquired and released in cpu_switch(), which is next
>>> to impossible to do properly in assembler.
>>
>> Definitely messy but MIPS and PPC do it (at least the acquire - I
>> don't see how the lock is released in either case).
>
> I don't think these actually acquire a lock, all lock-related I can
> identify there are the equivalents of the following:
>     atomic_store_rel_ptr(&old->td_lock, mtx);
> and:
> #if defined(SCHED_ULE) && defined(SMP)
>     while (atomic_load_acq_ptr(&new->td_lock) == &blocked_lock)
>         cpu_spinwait();
> #endif

Yes the goal of passing the lock pointer into the switch function is so
that the outgoing thread's lock is not released until we are off of its
stack. Otherwise another cpu could start switching into it as we are on
the way out.

>
>>> The bottom line
>>> is that watching the various mailing lists so far didn't provide the
>>> necessary motivation to work on that to me though (even today you still
>>> find reports about performance problems with SCHED_ULE and suggestions
>>> to use SCHED_4BSD instead, just see 4DD55CE0.50202@m5p.com as current
>>> example).

Can you give me another reference to this? You have to realize that no
scheduling policy will be faster for everything. The goal is to be
faster for most things and eliminate worst case scenarios. I can look
at this soon if there is something to be done.

>>
>> OTOH, not using it won't get the bugs fixed.
>
> They certainly won't but typically I hit enough problems when trying to
> get code developed on x86 or actually written with only x86 in mind to
> work on sparc64 that I don't really feel the desire to go out hunting for
> generic bugs in that code.
> In any case my motivation for getting SCHED_ULE
> to work on sparc64 suddenly vanished with r171488 for some strange reason.

I really don't know what the status of the sparc64 port is. If it is
intended to be first tier it should support ULE. Features like cpusets
and topology aware scheduling are better supported on ULE. It is
generally considered the path forward for SMP. 4BSD with its global run
queue and global lock is a dead end unless someone wants to salvage its
priority computation mechanism and add the cpu load balancing features
that end up making ULE slower in some cases.

>
>> My rationale for firing
>> up the spare V890 at $work was to try and stress some of the big
>> systems code and SCHED_ULE is supposed to be better at handling lots of
>> CPUs than SCHED_4BSD.
>>
>
> I don't think 16 cores counts as a lot these days :)

The per-cpu scheduler locks showed massive improvements on some
workloads with only 4 cores. The global scheduler lock is a significant
point of contention probably for any workload at 16 cores. 16 cores is
not a big machine anymore but it's plenty to have heavy contention soak
up too many cycles.

Thanks,
Jeff

>
> Marius
>
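
Editorial sketch of the td_lock handoff discussed above, written as a
userland C11 analogue rather than kernel code. The struct layouts and the
helper names (mark_blocked, release_old, is_blocked, wait_until_switchable,
handoff_demo) are hypothetical and exist only for illustration. Only the
release store and the spin on blocked_lock come from the snippets quoted in
the thread. The first step, pointing td_lock at the sentinel before leaving
the old stack, is an assumption about how the outgoing thread ends up
"blocked".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's lock and thread structures. */
struct lock { int unused; };
struct thread {
    _Atomic(struct lock *) td_lock;   /* lock currently protecting this thread */
};

/* Sentinel meaning "still being switched out, do not run this thread yet". */
static struct lock blocked_lock;

/* Assumed first step: while still on the old thread's stack, point its
 * td_lock at the sentinel so no other CPU switches into it. */
static void mark_blocked(struct thread *old)
{
    atomic_store_explicit(&old->td_lock, &blocked_lock, memory_order_relaxed);
}

/* Once off the old stack, publish the real lock with release semantics
 * (analogue of atomic_store_rel_ptr(&old->td_lock, mtx)). */
static void release_old(struct thread *old, struct lock *mtx)
{
    atomic_store_explicit(&old->td_lock, mtx, memory_order_release);
}

/* True while the thread is still marked as being switched out. */
static bool is_blocked(struct thread *td)
{
    return atomic_load_explicit(&td->td_lock, memory_order_acquire)
        == &blocked_lock;
}

/* Incoming side: spin until the previous CPU has released the thread
 * (analogue of the atomic_load_acq_ptr() loop quoted above). */
static void wait_until_switchable(struct thread *newtd)
{
    while (is_blocked(newtd))
        ;   /* the kernel loop calls cpu_spinwait() here */
}

/* Single-threaded walk through the handoff; returns 0 on success. */
int handoff_demo(void)
{
    struct lock runq_lock = { 0 };
    struct thread td;

    atomic_init(&td.td_lock, &runq_lock);
    mark_blocked(&td);               /* outgoing CPU begins the switch */
    assert(is_blocked(&td));         /* other CPUs would have to spin now */
    release_old(&td, &runq_lock);    /* outgoing CPU is off the old stack */
    wait_until_switchable(&td);      /* returns at once, already released */
    return is_blocked(&td) ? 1 : 0;
}
```

Calling handoff_demo() from a test harness walks the sequence on one
thread. In a real SMP switch the release store and the spin loop run on
different CPUs, which is why the acquire/release pairing matters.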