From owner-svn-src-head@FreeBSD.ORG Tue Oct 4 16:52:18 2011 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B0E5B1065678; Tue, 4 Oct 2011 16:52:18 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by mx1.freebsd.org (Postfix) with ESMTP id 2EA898FC20; Tue, 4 Oct 2011 16:52:17 +0000 (UTC) Received: from c122-106-165-191.carlnfd1.nsw.optusnet.com.au (c122-106-165-191.carlnfd1.nsw.optusnet.com.au [122.106.165.191]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p94GqE14022513 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 5 Oct 2011 03:52:15 +1100 Date: Wed, 5 Oct 2011 03:52:14 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Attilio Rao In-Reply-To: Message-ID: <20111005023142.G8617@besplex.bde.org> References: <201109041307.p84D72GY092462@svn.freebsd.org> <20110905023251.C832@besplex.bde.org> <20111004043232.K11186@besplex.bde.org> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-1554855478-1317747134=:8617" Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Bruce Evans Subject: Re: svn commit: r225372 - head/sys/kern X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Oct 2011 16:52:18 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-1554855478-1317747134=:8617 Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE On Tue, 4 Oct 2011, Attilio Rao wrote: > 2011/10/3 Bruce Evans : >> On Mon, 26 Sep 2011, Attilio Rao wrote: >> >>> 2011/9/4 Bruce Evans : >>>> >>>> On Sun, 4 Sep 2011, Attilio Rao wrote: >>>> >>>>> Also please notice that intr enable/disable happens in the wrong way >>>>> as it is done via the MD (x86 specific likely) interface. This is >>>>> wrong for 2 reasons: >>>> >>>> No, intr_disable() is MI. =C2=A0It is also used by witness. =C2=A0disa= ble_intr() >>>> is the corresponding x86 interface that you may be thinking of. =C2=A0= The MI >>>> interface intr_disable() was introduced to avoid the MD'ness of >>>> intr_disable(). >>> >>> I was a bit surprised to verify that you are right but >>> spinlock_enter() has the big difference besides disable_intr() of also >>> explicitly disabling preemption via critical_enter() which some >>> codepath can trigger without even noticing it. >>> This means it is more safer in presence of PREEMPTION option on and >>> thus should be preferred to the normal intr_disable(), in particular >>> for convoluted codepaths. >> >> I think this is another implementation detail which shouldn't be depende= d >> on. =C2=A0Spinlocks may or may not need either interrupts disabled or a = critical >> section to work. =C2=A0Now I'm a little surprised to remember that they = use a >> critical section. =C2=A0This is to prevent context switching. =C2=A0It i= s useful >> behaviour, but not strictly necessary. >> >> Since disabling interrupts also prevents context switching (excep by bug= gy >> trap handlers including NMI), it is safe to use hard interrupt disabling >> instead of critical_enter() to prevent context switching. =C2=A0This is = safe >> because code that has interrupts disabled cannot wander off into other >> code that doesn't understand this and does context switching! (unless it >> is broken). =C2=A0But for preventing context switching, critical_enter()= is >> better now that it doesn't hard-disable interrupts internally. > > This is not entirely correct, infact you may have preemption even with > interrupts disabled by calling code that schedule threads. This is why > spinlock_enter() disables interrupts _and_ preemption altogether. > Look for example at hardclock() and its internal magic (this is what I > meant, earlier, with "non convoluted codepaths"). That is a bug in -current. As I said, only broken code can wander off into other code that doesn't understand the caller's context. This is one of the things that prevents hardclock() being a non-broken fast interrupt handler. hardclock() wants to call scheduling code, but non- broken fast interrupt handlers can't do that. >> By un-inlining (un-macroizing) mtx_lock_spin(), but inlining >> critical_enter(), I get the same number of function calls but much small= er >> code since it is the tiny critical_enter() function and not the big >> mtx_lock_spin() one that is inlined. > > I'm not entirely sure I follow. > > In -CURRENT, right now mtx_lock_spin() just yields > _mtx_lock_spin_flags() which is not inlined. That is the debugging version. is obfuscated as follows: - mtx_lock_spin(m) is mtx_lock_spin_flags((m), 0) - if LOCK_DEBUG > 0 or defined(MUTEX_NOINLINE) mtx_lock_spin_flags((m), 0) is _mtx_lock_spin_flags((m), curthread, (0= ), =09LOCK_FILE, LOCK_LINE) This gives the version that you described. if LOCK_DEBUG > 0 LOCK_FILE is __FILE__ and LOCK_LINE is __LINE__ else LOCK_FILE is NULL and LOCK_LINE is 0 else mtx_lock_spin_flags((m), 0) is __mtx_lock_spin((m), curthread, (0), =09NULL, 0) This gives the version that I described. __mtx_lock_spin() is passed a dummy LOCK_FILE and LOCK_LINE although it doesn't use them in this case. It is convoluted so that it can be used both in this case and in the non-inlined case, where it is expanded in _mtx_lock_spin_flags(), where it is passed __FILE__ and __LINE__ iff LOCK_DEBUG > 0. But this reuse is not so good since it gives a further obfuscations: __mtx_lock_spin() takes flags args but doesn't have `flags' in its name like some other mtx functions. In the non-inlined case: mtx_lock_spin(...) is _mtx_lock_spin_flags() as described above =09_mtx_lock_spin_flags(...) invokes __mtx_lock_spin(...) as desc. above __mtx_lock_spin(...) invokes _mtx_lock_spin(...) =09The macro obfuscations end at this point -- _mtx_lock_spin() is =09always a function. _mtx_lock_spin(), like __mtx_lock_spin(), takes flags args but =09doesn't have `flags' in its name. end if > So the improvement here is just to have inlined critical_enter()? How > this can lead to smaller code? This should be obvious now. Another detail is that my mtx_lock_spin() takes only 1 arg (this is the normal API). It doesn't need to support passing (td, opt, file, line). critical_enter() takes no args at all (td =3D curthread is implicit for it, as it is for mtx_lock_spin()). So mtx_lock_spin(mp) calls expand to slightly more object code than critical_enter() calls. In -current for the inlined case, mtx_lock_spin() expands to critical_enter() plus about 40 bytes (on i386) for the atomic op on the mutex followed by the _mtx_lock_function call. >> The complications are mainly in critical_exit(): >> - in my version, when td_critnest is decremented to 0, a MD function is >> =C2=A0called to "unpend" any pending interrupts that have accumulated wh= ile >> =C2=A0in the critical region. =C2=A0Since mtx_lock_spin() doesn't hard-d= isable >> =C2=A0interrupts, they may occur when a spinlock is held. =C2=A0Fast int= errupts >> =C2=A0proceed. =C2=A0Others are blocked until critical_exit() unpends th= em. >> =C2=A0This includes software interrupts. =C2=A0The only really complicat= ed part >> =C2=A0is letting fast interrupts proceed. =C2=A0Fast interrupt handlers = cannot >> =C2=A0use or be blocked by any normal locking, since they don't respect >> =C2=A0normal spinlocks. =C2=A0So for example, hardclock() cannot be a fa= st interrupt >> =C2=A0handler. > > I don't think this is a good idea. > We hardly rely on interrupts disabling during spinlock helding in > order to get them be used by fast handlers and then avoid deadlocks > with code running in interrupt/kernel context. I think you mean "We strongly rely on...". -current certainly relies on this. IMO this is mostly a bug. There is the implementation detail that spinlocks hard-disable interrupts to avoid a deadlock problem in the UP case. Too much code has come to depend on this IMO. >>> [register_t intr_disable() interface not being quite NMI] >>> >>> I mostly agree, I think we should have an MD specified type to replace >>> register_t for this (it could alias it, if it is suitable, but this >>> interface smells a lot like x86-centric). >> >> Really vax-centric. =C2=A0spl "levels" are from vax or earlier CPUs. =C2= =A0x86 >> doesn't really have levels (the AT PIC has masks and precedences. =C2=A0= The >> precedences correspond to levels are but rarely depended on or programme= d >> specifically). =C2=A0alpha and sparc seem to have levels much closer to >> vax. >> >> With only levels, even an 8-bit interface for the level is enough (255 >> levels should be enough for anyone). =C2=A0With masks, even a 64-bit int= erface >> for the mask might not be enough. =C2=A0When masks were mapped to levels >> for FreeBSD on i386, the largish set of possible mask values was mapped >> into < 8 standard levels (tty, net, bio, etc). =C2=A0Related encoding of >> MD details as cookies would probably work well enough in general. > > For cookie you just mean a void * ptr? Either a pointer, or an integer that indexes a table of pointers, or an an integer that encodes all the info in the integer's bits, possibly with an identity encoding. Bruce --0-1554855478-1317747134=:8617--