From: Warner Losh
Date: Fri, 6 Jul 2018 14:53:51 -0600
Subject: Re: svn commit: r336025 - in head/sys: amd64/include i386/include
To: Tijl Coosemans
Cc: "Rodney W. Grimes", Hans Petter Selasky, src-committers, svn-src-all@freebsd.org, svn-src-head@freebsd.org
References: <201807061809.w66I9RVR053596@pdx.rh.CN85.dnsmgr.net> <20180706225030.2e689882@kalimero.tijl.coosemans.org>
In-Reply-To: <20180706225030.2e689882@kalimero.tijl.coosemans.org>
List-Id: SVN commit messages for the src tree for head/-current

On Fri, Jul 6, 2018, 3:50 PM Tijl Coosemans wrote:

> On Fri, 6 Jul 2018 11:09:27 -0700 (PDT) "Rodney W. Grimes"
> <freebsd@pdx.rh.CN85.dnsmgr.net> wrote:
> > > On Fri, Jul 6, 2018, 12:27 PM Rodney W. Grimes
> > > <freebsd@pdx.rh.cn85.dnsmgr.net> wrote:
> > >
> > > > > On Fri, Jul 6, 2018 at 9:52 AM, Rodney W. Grimes
> > > > > <freebsd@pdx.rh.cn85.dnsmgr.net> wrote:
> > > > >
> > > > > > > On Fri, Jul 6, 2018 at 9:32 AM, Rodney W.
> > > > > > > Grimes <freebsd@pdx.rh.cn85.dnsmgr.net> wrote:
> > > > > > >
> > > > > > > > > Author: hselasky
> > > > > > > > > Date: Fri Jul 6 10:13:42 2018
> > > > > > > > > New Revision: 336025
> > > > > > > > > URL: https://svnweb.freebsd.org/changeset/base/336025
> > > > > > > > >
> > > > > > > > > Log:
> > > > > > > > >   Make sure kernel modules built by default are portable between UP and
> > > > > > > > >   SMP systems by extending defined(SMP) to include defined(KLD_MODULE).
> > > > > > > > >
> > > > > > > > >   This is a regression issue after r335873.
> > > > > > > > >
> > > > > > > > >   Discussed with: mmacy@
> > > > > > > > >   Sponsored by: Mellanox Technologies
> > > > > > > >
> > > > > > > > Though this fixes the issue, it also means that now when
> > > > > > > > anyone intentionally builds a UP kernel with modules
> > > > > > > > they are getting SMP support in the modules and I am
> > > > > > > > not sure they would want that. I know I don't.
> > > > > > >
> > > > > > > On UP systems, these additional opcodes are harmless. They take a few extra
> > > > > > > cycles (since they lock an uncontested bus) and add a couple extra memory
> > > > > > > barriers (which will be NOPs). On MP systems, atomics now work by default.
> > > > > > > Had we not defaulted like this, all modules built outside of a kernel build
> > > > > > > env would have broken atomics. Given that (a) the overwhelming majority
> > > > > > > (99% or more) is SMP and (b) the MP code merely adds a few cycles to what's
> > > > > > > already a not-too-expensive operation, this was the right choice.
> > > > > > >
> > > > > > > It simply doesn't matter for systems that are relevant to the project
> > > > > > > today.
> > > > > > > While one could try to optimize this a little (for example, by
> > > > > > > having SMP defined to be 0 or 1, say, and changing all the ifdef SMP to if
> > > > > > > (defined(SMP) && SMP != 0)), it's likely not going to matter enough for
> > > > > > > anybody to make the effort. UP on x86 is simply not relevant enough to
> > > > > > > optimize for it. Even in VMs, people run SMP kernels typically even when
> > > > > > > they just allocate one CPU to the VM.
> > > > > > >
> > > > > > > So while we still support the UP config, and we'll let people build
> > > > > > > optimized kernels for x86, we've flipped the switch from pessimized for SMP
> > > > > > > modules to pessimized for UP modules, which seems like quite the reasonable
> > > > > > > trade-off.
> > > > > > >
> > > > > > > Were it practical to do so, I'd suggest de-orbiting UP on x86. However,
> > > > > > > it's a lot of work for not much benefit and we'd need to invent much crazy
> > > > > > > to get there.
> > > > > >
> > > > > > Trivial to fix this with
> > > > > > +#if defined(SMP) || !defined(_KERNEL) || defined(KLD_MODULE) || !defined(KLD_UP_MODULES)
> > > > >
> > > > > Nope. Not so trivial. Who defines KLD_UP_MODULES?
> > > >
> > > > Call it SMP_KLD_MODULES, and it gets defined the same place SMP does.
> > >
> > > Not so simple. SMP is defined in the config file, and winds up in one of
> > No problem, that is where I would be defining this anyway, or in the
> > latest case removing it and SMP for my UP kernel build.
> >
> > > the option files. It will be absent for stand alone builds,
> > I am ok with that. And it would be reasonable to default to SMP.
> >
> > > though. These changes tweak the default to be inlined and to include the
> > > sequence that works everywhere.
> > >
> > > > > And really, it's absolutely not worth it unless someone shows up with
> > > > > numbers to show the old 'function call to optimal routine' is actually
> > > > > faster than the new 'inline to slightly unoptimal code'. Since I think the
> > > > > function call overhead is larger than the pessimizations, I'm not sure what
> > > > > the fuss is about.
> > > >
> > > > I have no issues with the SMP converting from function calls to
> > > > inline locks, I just want to retain the exact same code I had
> > > > before any of these changes, and that was a UP-built system
> > > > without any SMP locking. Is it too much to ask to keep what
> > > > already worked?
> > >
> > > This doesn't enable or disable locks in the mutex sense. It just changes
> > > the atomic ops for the kernel from a function call to an inlined function.
> > > The inlining is more efficient than the call, even with the overhead added
> > > by always inlining the same stuff. It still is faster than before.
> > >
> > > And userland has done this forever...
> > >
> > > So I honestly think even UP builds are better off, even if it's not hyper
> > > optimized for UP. The lock instruction prefix is minimal overhead (a cycle
> > > I think).
> >
> > I do not believe that LOCK is a one cycle cost, and Bruce seems to
> > have evidence that it is not. And in my head I know that it can not
> > be that simple as it causes lots of very special things to happen in
> > the pipeline to ensure you are locked.
> >
> > > This is different than the mutexes we optimize for the UP cases
> > > (and which aren't affected by this change). It's really not a big deal.
> >
> > CPUs are not getting any faster, cycles are cycles, and I think we
> > should at least investigate further before we just start making
> > assumptions about the lock prefix being a 1 cycle cheap thing to
> > do.
>
> Just install opt_*.h headers already. It's not just about the SMP option.
> The nvidia-driver ports want to know if PAE is enabled on i386.

Sadly, I don't think it will be that simple...

Warner
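
For readers following the thread, the preprocessor condition from the r336025 log message can be sketched roughly as below. This is only an illustration under stated assumptions, not the contents of the FreeBSD headers: SKETCH_SMP, SKETCH_KERNEL and SKETCH_KLD_MODULE are hypothetical stand-ins for SMP, _KERNEL and KLD_MODULE, and SKETCH_LOCK_PREFIX, sketch_add_template() and sketch_is_locked() are made-up names. The defines simulate a module compiled against a UP kernel configuration.

```c
#include <string.h>

/*
 * Simulated build settings (assumptions for this sketch only):
 * kernel code, compiled as a loadable module, against a UP config.
 * SKETCH_SMP is deliberately left undefined.
 */
#define SKETCH_KERNEL 1
#define SKETCH_KLD_MODULE 1

/* Same shape as the condition described in the commit log. */
#if defined(SKETCH_SMP) || !defined(SKETCH_KERNEL) || defined(SKETCH_KLD_MODULE)
#define SKETCH_LOCK_PREFIX "lock ; "	/* hypothetical macro name */
#else
#define SKETCH_LOCK_PREFIX ""
#endif

/* Returns the instruction template an inlined atomic add would use. */
static const char *
sketch_add_template(void)
{
	return (SKETCH_LOCK_PREFIX "addl %1,%0");
}

/* Returns 1 when the selected template carries the lock prefix. */
static int
sketch_is_locked(void)
{
	return (strncmp(sketch_add_template(), "lock", 4) == 0);
}
```

With these settings sketch_is_locked() reports that the module gets the locked form even though the SMP stand-in is undefined, which is the behaviour Rodney objects to. Removing the SKETCH_KLD_MODULE define (the UP kernel proper) makes the prefix empty.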