From owner-freebsd-current  Fri Mar  8 15:42:26 2002
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP
	id D076A37B416; Fri,  8 Mar 2002 15:42:08 -0800 (PST)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.11.6/8.9.1) id g28Ng7S79983;
	Fri, 8 Mar 2002 15:42:07 -0800 (PST)
	(envelope-from dillon)
Date: Fri, 8 Mar 2002 15:42:07 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200203082342.g28Ng7S79983@apollo.backplane.com>
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: FreeBSD current users <current@FreeBSD.ORG>,
	Robert Watson <rwatson@FreeBSD.ORG>, Jake Burkholder <jake@locore.ca>
Subject: Re: Patch for critical_enter()/critical_exit() & interrupt assem
References:  <XFMail.020308142827.jhb@FreeBSD.org>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-current.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-current>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-current>
X-Loop: FreeBSD.ORG


:On 08-Mar-02 Matthew Dillon wrote:
:> 
:>:I agree that the use of cpu_critical_enter/exit could use cleaning up.
:>:Can you give an example of where critical_enter is used improperly?
:>:You mean in fork_exit?  Your changes to cpu_switch solve that problem
:>:with critical_exit almost unchanged.  The savecrit stuff should really
:>:just be made MD.  If cpu_critical_exit was changed to take the thread
:>:as its argument as I suggested before this would be easy.
:
:Hmm, I agree with getting rid of td_savecrit as an MI field and changing the
:API for cpu_critical_*.  I also agree that cpu_critical_* is abused in many
:cases.  I just fixed a few that can be fixed.  I think most others can be fixed
:with the explicit intr_disable/intr_restore API which shouldn't be a problem
:since it's basically what cpu_critical_* is now but just misnamed.  This would
:fix the remaining instance in witness for example.  ast() is harder to solve,
:and although I don't like having stuff duplicated all over the place making
:maintenance harder, moving the loop and test out to MD code in either asm or C
:as Jake suggested would work fine.
:
:>     fork_exit() is a good example.  The existing code explicitly
:>     initializes td->td_critnest to 1 and then sets td->td_savecrit
:>     to CRITICAL_FORK:
:>     
:>       td->td_critnest = 1;
:>       td->td_savecrit = CRITICAL_FORK;
:> 
:>     It then proceeds to unlock the sched_lock spin lock.
:> 
:>     This code only works because interrupts are hard-disabled in the
:>     fork trampoline code.  In otherwords, the code assumes that 
:>     cpu_critical_enter() and cpu_critical_exit() play with hard interrupt
:>     disablement.   If interrupts were enabled there would be two severe
:>     race conditions in the code:  The first upon entering fork_exit
:>     prior to ke_oncpu being set, and the second when td->td_critnest is set
:>     to 1 prior to td_savecrit being set to CRITICAL_FORK.
:
:No, the savecrit does assume that, but the critnest is still correct and will
:still work fine.  By definition, you don't switch inside a critical section
:(taht's what critical sections do) so critnest will always be 1 here.  Any
:interrupt or what not that comes in if interrutps aren't disabled might dink
:with teh count but the count will still be 1 by the time we get to releasing
:sched_lock.  Alternatively, we could set td_critnest to 1 in fork() which would
:be ok with me.

    Huh?  Are you actually advocating that it is ok to allow an interrupt
    to occur while curthread is in an inconsistent state?  I mean, we aren't
    talking about a little inconsistency here John, we are talking about
    the sched_lock being in a weird state, and we are talking about
    curthread's ke_oncpu not being set properly.

    We could set td_critnest to 1 in fork().  I considered doing that but
    decided it was easier to keep my code as close to the existing code
    as possible so I clean it up in the fork trampoline code while
    interrupts really are disabled.  With the critical section in a
    completely consistent state I can then enable interrupts prior
    to calling fork_exit().

:I always said that Peter's hack was gross.  I didn't like it when he did it
:originally either.  Actually, doing cpu_critical_exit() though is safe since
:all we need is for the critnest to be >= 1 to prevent context switches (which
:is all critical sections do if you read the manpage that documents them).  We
:only ever need to defer bottom-half code (or Primary Interrupt Context as Apple
:likes to call it) while holding a spin mutex.

    By my read of the code it is not safe to allow interrupts to be enabled
    while critnest >= 1 because while the interrupt may not context
    switch, it will still attempt to schedule a new thread which requires
    interactions with spin locks which the interrupted process may be
    holding.  The result:  the interrupted process is not protected
    by the spin lock it thought protected it because the interrupt can
    pop in and do something horrible.

:The MD code does abuse cpu_critical_* which I am quite happy to fix using
:intr_*().  Once this is done cpu_critical_* should be out of bogus land and you
:should be able to adjust your patches to change that API's implementation.
:I'm willing to do all the work to fixup the cpu_critical_* abuse right now
:before doing anything else.  It really needs to be done anyway.
:...
:They won't stay one line.  That's the whole point.  The entire reason we have
:MI versions is that I committed part of the preemption tree specifically to
:try and minimize the amount of the preemption code not currently checked in. 
:The MI versions are still a WIP part of the preemption tree.  They wouldn't
:even exist if it weren't for the preemption tree.

    I find it highly unlikely that they will ever exceed four or five lines.
    About the worst I can envision is a conditional like this:

    critical_exit()
    {
	if (--td->td_critnest == 0) {		<<<<<<< this is MI
		if (td->td_need_to_do_mi_stuff)	<<<<<<< this is MI
			mifunctioncall();	<<<<<<< this is MI
		... do MD stuff ...
	}
    }

    That's it.  That is the sole reason that you want critical_enter()
    and critical_exit() to be MI, and one of many reasons why *I* want
    critical_enter() and critical_exit() to be MD.

:I defer to Jake's opinions on the style and profiling issues.
:
:All that I ask is that you let me cleanup and fix the cpu_critical_* bogusness
:which will then allow you to implement your changes in that API while not
:breaking the kernel preemption WIP.
:
:John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/

    Look John, I don't want to wait for an arbitrary period of time for
    you to cleanup cpu_critical*() AND commit your preemption WIP.  What
    do you think about these scenarios:

    * You give me a hard time-frame that isn't months... I've already wasted
      two and a half weeks on this mess, how much more time do you need?
      Another week?

      After you put all your cleanups and preemption work in I would STILL
      like to move critical_enter() and critical_exit() from MI to MD.
      Because I believe it is the right thing to do and because I do not
      consider the two minutes required to dup the MI critical_*() functions
      into MD files to be burdensome (I don't know why you seem to think
      it would be a burden).

      Keep in mind that by waiting for you to do these cleanups you will
      be forcing me to make adjustments to my already-want-to-commit code.
      Code which I has been committable for two and a half weeks now.
      So the tradeoff here is less work for you and more for me if you do
      all this stuff, or less work for me and more for you if you allow
      me to commit my stuff first.  (and, as I said, I think the extra
      work for you would be no more then 2 minutes of work).

    * Or, you indicate that you will not be able to accomplish these goals
      in a reasonable timeframe (e.g. I believe a week is at the edge of
      what I would consider reasonable), in which case I will commit
      my stage-1 work and put the patches up for review for the stage-2
      work (stage-2 just creates <arch>/<arch>/critical.c and moves 
      critical_enter() and critical_exit() into this file on every
      architecture).  Since stage-2 does not make any programmatic changes
      I can get stage-2 into the tree probably just a few days after
      the stage-1 commit.

      After this is all done if you aren't willing to spend the two minutes
      required to integrate your preemption WIP with the new placement
      of critical_*(), I would be happy to do it for you.

    My position is that these interfaces belong in MD code sections even
    if you clean them up, and that I am willing to take over integration
    work from you for any pieces you want to add to e.g. critical_exit()
    (e.g. like the preemption bits) if you believe it would be too onerous
    for you to do so yourself.

    I do not accept that making these routines MD somehow makes it more
    difficult to extend them further, I do not believe that the duplicated
    code introduces any new bugs or any realistic chance of introducing a
    new bug, and I do not believe it is as big a hassle as you seem to 
    believe.  I mean, give me a break, how hard can adding two lines of
    code to four or five similar files (<arch>/<arch>/critical.c) be verses
    adding the two lines to a single file?  It's just no justification in
    my book to implement the routines in MI source files.
	
					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message