Date:      Fri, 6 Jul 2018 22:50:30 +0200
From:      Tĳl Coosemans <tijl@FreeBSD.org>
To:        "Rodney W. Grimes" <freebsd@pdx.rh.CN85.dnsmgr.net>
Cc:        rgrimes@freebsd.org, Warner Losh <imp@bsdimp.com>, Hans Petter Selasky <hselasky@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r336025 - in head/sys: amd64/include i386/include
Message-ID:  <20180706225030.2e689882@kalimero.tijl.coosemans.org>
In-Reply-To: <201807061809.w66I9RVR053596@pdx.rh.CN85.dnsmgr.net>
References:  <CANCZdfrzJK47xroYRHO1aG6Qdos-RZFvN7H7ME4zjX9hhYx-0A@mail.gmail.com> <201807061809.w66I9RVR053596@pdx.rh.CN85.dnsmgr.net>

On Fri, 6 Jul 2018 11:09:27 -0700 (PDT) "Rodney W. Grimes" <freebsd@pdx.rh.CN85.dnsmgr.net> wrote:
> > On Fri, Jul 6, 2018, 12:27 PM Rodney W. Grimes <freebsd@pdx.rh.cn85.dnsmgr.net> wrote:
> >   
> > > > On Fri, Jul 6, 2018 at 9:52 AM, Rodney W. Grimes <freebsd@pdx.rh.cn85.dnsmgr.net> wrote:
> > > >  
> > > > > > On Fri, Jul 6, 2018 at 9:32 AM, Rodney W. Grimes <freebsd@pdx.rh.cn85.dnsmgr.net> wrote:
> > > > > >  
> > > > > > > > Author: hselasky
> > > > > > > > Date: Fri Jul  6 10:13:42 2018
> > > > > > > > New Revision: 336025
> > > > > > > > URL: https://svnweb.freebsd.org/changeset/base/336025
> > > > > > > >
> > > > > > > > Log:
> > > > > > > >   Make sure kernel modules built by default are portable between UP and
> > > > > > > >   SMP systems by extending defined(SMP) to include defined(KLD_MODULE).
> > > > > > > >
> > > > > > > >   This is a regression issue after r335873.
> > > > > > > >
> > > > > > > >   Discussed with:             mmacy@
> > > > > > > >   Sponsored by:               Mellanox Technologies  
> > > > > > >
> > > > > > > Though this fixes the issue, it also means that now when
> > > > > > > anyone intentionally builds a UP kernel with modules
> > > > > > > they are getting SMP support in the modules and I am
> > > > > > > not sure they would want that.  I know I don't.
> > > > > > >  
> > > > > >
> > > > > >
> > > > > > On UP systems, these additional opcodes are harmless. They take a few extra
> > > > > > cycles (since they lock an uncontested bus) and add a couple extra memory
> > > > > > barriers (which will be NOPs). On MP systems, atomics now work by default.
> > > > > > Had we not defaulted like this, all modules built outside of a kernel build
> > > > > > env would have broken atomics. Given that (a) the overwhelming majority
> > > > > > (99% or more) is SMP and (b) the MP code merely adds a few cycles to what's
> > > > > > already a not-too-expensive operation, this was the right choice.
> > > > > >
> > > > > > It simply doesn't matter for systems that are relevant to the project
> > > > > > today. While one could try to optimize this a little (for example, by
> > > > > > having SMP defined to be 0 or 1, say, and changing all the ifdef SMP to if
> > > > > > (defined(SMP) && SMP != 0)), it's likely not going to matter enough for
> > > > > > anybody to make the effort. UP on x86 is simply not relevant enough to
> > > > > > optimize for it. Even in VMs, people run SMP kernels typically even when
> > > > > > they just allocate one CPU to the VM.
> > > > > >
> > > > > > So while we still support the UP config, and we'll let people build
> > > > > > optimized kernels for x86, we've flipped the switch from pessimized for SMP
> > > > > > modules to pessimized for UP modules, which seems like quite the reasonable
> > > > > > trade-off.
> > > > > >
> > > > > > Were it practical to do so, I'd suggest de-orbiting UP on x86. However,
> > > > > > it's a lot of work for not much benefit and we'd need to invent much crazy
> > > > > > to get there.
> > > > >
> > > > > Trivial to fix this with
> > > > > > +#if defined(SMP) || !defined(_KERNEL) || defined(KLD_MODULE) || !defined(KLD_UP_MODULES)
> > > >
> > > >
> > > > Nope. Not so trivial. Who defines KLD_UP_MODULES?  
> > >
> > > Call it SMP_KLD_MODULES, and it gets defined the same place SMP does.
> > >  
> > 
> > Not so simple. SMP is defined in the config file, and winds up in one of  
> No problem, that is where I would be defining this anyway, or in the
> latest case removing it and SMP for my UP kernel build.
> 
> > the option files. It will be absent for stand alone builds,  
> I am ok with that.  And it would be reasonable to default to SMP.
> 
> > though. These changes tweak the default to be inlined and to include the
> > sequence that works everywhere.
> >   
> > >  
> > > > And really, it's absolutely not worth it unless someone shows up with
> > > > numbers to show the old 'function call to optimal routine' is actually
> > > > faster than the new 'inline to slightly unoptimal code'. Since I think the
> > > > function call overhead is larger than the pessimizations, I'm not sure what
> > > > the fuss is about.
> > >
> > > I have no issues with the SMP converting from function calls to
> > > inline locks, I just want to retain the exact same code I had
> > > before any of these changes, and that was A UP built system
> > > without any SMP locking.  Is it too much to ask to keep what
> > > already worked?
> > >  
> > 
> > This doesn't enable or disable locks in the mutex sense. It just changes
> > the atomic ops for the kernel from a function call to an inlined function.
> > The inlining is more efficient than the call, even with the overhead added
> > by always inlining the same stuff. It still is faster than before.
> > 
> > And userland has done this forever...
> > 
> > So I honestly think even UP builds are better off, even if it's not hyper
> > optimized for UP. The lock instruction prefix is minimal overhead (a cycle
> > I think).  
> 
> I do not believe that LOCK is a one cycle cost, and Bruce seems to have
> evidence that it is not.  And in my head I know that it can not be that
> simple as it causes lots of very special things to happen in the
> pipeline to ensure you are locked.
> 
> > This is different than the mutexes we optimize for the UP cases
> > (and which aren't affected by this change). It's really not a big deal.  
> 
> CPUs are not getting any faster, cycles are cycles, and I think we
> should at least investigate further before we just start making
> assumptions about the lock prefix being a 1 cycle cheap thing to
> do.
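
For anyone who wants numbers rather than assumptions about the lock prefix,
a rough userland sketch follows.  It is not the kernel's atomic(9) code: it
assumes a C11 compiler that turns atomic_fetch_add() into a lock-prefixed
instruction on x86, and the names plain_loop, locked_loop and seconds_for
are made up for this example.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Plain read-modify-write; volatile keeps the compiler from folding the loop. */
static uint64_t
plain_loop(uint64_t n)
{
	volatile uint64_t counter = 0;

	for (uint64_t i = 0; i < n; i++)
		counter = counter + 1;
	return (counter);
}

/* Same loop with a C11 atomic add (assumed to emit a lock prefix on x86). */
static uint64_t
locked_loop(uint64_t n)
{
	_Atomic uint64_t counter = 0;

	for (uint64_t i = 0; i < n; i++)
		atomic_fetch_add(&counter, 1);
	return (atomic_load(&counter));
}

/* CPU time in seconds spent running one of the loops for n iterations. */
static double
seconds_for(uint64_t (*loop)(uint64_t), uint64_t n)
{
	clock_t start = clock();

	(void)loop(n);
	return ((double)(clock() - start) / CLOCKS_PER_SEC);
}
```

Comparing seconds_for(plain_loop, n) with seconds_for(locked_loop, n) for a
large n gives only a crude per-operation estimate on an uncontended cache
line; it says nothing about contended cases or about in-kernel behaviour.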


Just install opt_*.h headers already.  It's not just about the SMP option.
The nvidia-driver ports want to know if PAE is enabled on i386.
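
As a rough sketch of what installed option headers would buy an out-of-tree
module: the two defines below are hypothetical stand-ins for the contents of
a generated opt_*.h file, the MPLOCKED condition is modeled on the one quoted
above, and atomic_prefix() and pae_enabled() are made-up helper names.

```c
/*
 * Hypothetical stand-in for an installed kernel option header.  In a real
 * build these would come from a generated opt_*.h file, not inline defines.
 */
#define SMP 1		/* assumption: the target kernel was built with SMP */
/* #define PAE 1 */	/* assumption: PAE is not enabled on the target */

/* Condition modeled on the one quoted in this thread. */
#if defined(SMP) || !defined(_KERNEL) || defined(KLD_MODULE)
#define MPLOCKED "lock ; "
#else
#define MPLOCKED ""
#endif

/* Instruction prefix the atomics would be assembled with. */
static const char *
atomic_prefix(void)
{
	return (MPLOCKED);
}

/* What a driver such as nvidia-driver would need to know on i386. */
static int
pae_enabled(void)
{
#ifdef PAE
	return (1);
#else
	return (0);
#endif
}
```

Built as a plain userland program, _KERNEL is undefined, so the lock prefix
is selected regardless of SMP; only a kernel build with neither SMP nor
KLD_MODULE defined would take the empty branch.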


