From owner-freebsd-current Fri Mar 8 15:42:26 2002 Delivered-To: freebsd-current@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id D076A37B416; Fri, 8 Mar 2002 15:42:08 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g28Ng7S79983; Fri, 8 Mar 2002 15:42:07 -0800 (PST) (envelope-from dillon) Date: Fri, 8 Mar 2002 15:42:07 -0800 (PST) From: Matthew Dillon Message-Id: <200203082342.g28Ng7S79983@apollo.backplane.com> To: John Baldwin Cc: FreeBSD current users , Robert Watson , Jake Burkholder Subject: Re: Patch for critical_enter()/critical_exit() & interrupt assem References: Sender: owner-freebsd-current@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :On 08-Mar-02 Matthew Dillon wrote: :> :>:I agree that the use of cpu_critical_enter/exit could use cleaning up. :>:Can you give an example of where critical_enter is used improperly? :>:You mean in fork_exit? Your changes to cpu_switch solve that problem :>:with critical_exit almost unchanged. The savecrit stuff should really :>:just be made MD. If cpu_critical_exit was changed to take the thread :>:as its argument as I suggested before this would be easy. : :Hmm, I agree with getting rid of td_savecrit as an MI field and changing the :API for cpu_critical_*. I also agree that cpu_critical_* is abused in many :cases. I just fixed a few that can be fixed. I think most others can be fixed :with the explicit intr_disable/intr_restore API which shouldn't be a problem :since it's basically what cpu_critical_* is now but just misnamed. This would :fix the remaining instance in witness for example. ast() is harder to solve, :and although I don't like having stuff duplicated all over the place making :maintenance harder, moving the loop and test out to MD code in either asm or C :as Jake suggested would work fine. : :> fork_exit() is a good example. The existing code explicitly :> initializes td->td_critnest to 1 and then sets td->td_savecrit :> to CRITICAL_FORK: :> :> td->td_critnest = 1; :> td->td_savecrit = CRITICAL_FORK; :> :> It then proceeds to unlock the sched_lock spin lock. :> :> This code only works because interrupts are hard-disabled in the :> fork trampoline code. In otherwords, the code assumes that :> cpu_critical_enter() and cpu_critical_exit() play with hard interrupt :> disablement. If interrupts were enabled there would be two severe :> race conditions in the code: The first upon entering fork_exit :> prior to ke_oncpu being set, and the second when td->td_critnest is set :> to 1 prior to td_savecrit being set to CRITICAL_FORK. : :No, the savecrit does assume that, but the critnest is still correct and will :still work fine. By definition, you don't switch inside a critical section :(taht's what critical sections do) so critnest will always be 1 here. Any :interrupt or what not that comes in if interrutps aren't disabled might dink :with teh count but the count will still be 1 by the time we get to releasing :sched_lock. Alternatively, we could set td_critnest to 1 in fork() which would :be ok with me. Huh? Are you actually advocating that it is ok to allow an interrupt to occur while curthread is in an inconsistent state? I mean, we aren't talking about a little inconsistency here John, we are talking about the sched_lock being in a weird state, and we are talking about curthread's ke_oncpu not being set properly. We could set td_critnest to 1 in fork(). I considered doing that but decided it was easier to keep my code as close to the existing code as possible so I clean it up in the fork trampoline code while interrupts really are disabled. With the critical section in a completely consistent state I can then enable interrupts prior to calling fork_exit(). :I always said that Peter's hack was gross. I didn't like it when he did it :originally either. Actually, doing cpu_critical_exit() though is safe since :all we need is for the critnest to be >= 1 to prevent context switches (which :is all critical sections do if you read the manpage that documents them). We :only ever need to defer bottom-half code (or Primary Interrupt Context as Apple :likes to call it) while holding a spin mutex. By my read of the code it is not safe to allow interrupts to be enabled while critnest >= 1 because while the interrupt may not context switch, it will still attempt to schedule a new thread which requires interactions with spin locks which the interrupted process may be holding. The result: the interrupted process is not protected by the spin lock it thought protected it because the interrupt can pop in and do something horrible. :The MD code does abuse cpu_critical_* which I am quite happy to fix using :intr_*(). Once this is done cpu_critical_* should be out of bogus land and you :should be able to adjust your patches to change that API's implementation. :I'm willing to do all the work to fixup the cpu_critical_* abuse right now :before doing anything else. It really needs to be done anyway. :... :They won't stay one line. That's the whole point. The entire reason we have :MI versions is that I committed part of the preemption tree specifically to :try and minimize the amount of the preemption code not currently checked in. :The MI versions are still a WIP part of the preemption tree. They wouldn't :even exist if it weren't for the preemption tree. I find it highly unlikely that they will ever exceed four or five lines. About the worst I can envision is a conditional like this: critical_exit() { if (--td->td_critnest == 0) { <<<<<<< this is MI if (td->td_need_to_do_mi_stuff) <<<<<<< this is MI mifunctioncall(); <<<<<<< this is MI ... do MD stuff ... } } That's it. That is the sole reason that you want critical_enter() and critical_exit() to be MI, and one of many reasons why *I* want critical_enter() and critical_exit() to be MD. :I defer to Jake's opinions on the style and profiling issues. : :All that I ask is that you let me cleanup and fix the cpu_critical_* bogusness :which will then allow you to implement your changes in that API while not :breaking the kernel preemption WIP. : :John Baldwin <>< http://www.FreeBSD.org/~jhb/ Look John, I don't want to wait for an arbitrary period of time for you to cleanup cpu_critical*() AND commit your preemption WIP. What do you think about these scenarios: * You give me a hard time-frame that isn't months... I've already wasted two and a half weeks on this mess, how much more time do you need? Another week? After you put all your cleanups and preemption work in I would STILL like to move critical_enter() and critical_exit() from MI to MD. Because I believe it is the right thing to do and because I do not consider the two minutes required to dup the MI critical_*() functions into MD files to be burdensome (I don't know why you seem to think it would be a burden). Keep in mind that by waiting for you to do these cleanups you will be forcing me to make adjustments to my already-want-to-commit code. Code which I has been committable for two and a half weeks now. So the tradeoff here is less work for you and more for me if you do all this stuff, or less work for me and more for you if you allow me to commit my stuff first. (and, as I said, I think the extra work for you would be no more then 2 minutes of work). * Or, you indicate that you will not be able to accomplish these goals in a reasonable timeframe (e.g. I believe a week is at the edge of what I would consider reasonable), in which case I will commit my stage-1 work and put the patches up for review for the stage-2 work (stage-2 just creates //critical.c and moves critical_enter() and critical_exit() into this file on every architecture). Since stage-2 does not make any programmatic changes I can get stage-2 into the tree probably just a few days after the stage-1 commit. After this is all done if you aren't willing to spend the two minutes required to integrate your preemption WIP with the new placement of critical_*(), I would be happy to do it for you. My position is that these interfaces belong in MD code sections even if you clean them up, and that I am willing to take over integration work from you for any pieces you want to add to e.g. critical_exit() (e.g. like the preemption bits) if you believe it would be too onerous for you to do so yourself. I do not accept that making these routines MD somehow makes it more difficult to extend them further, I do not believe that the duplicated code introduces any new bugs or any realistic chance of introducing a new bug, and I do not believe it is as big a hassle as you seem to believe. I mean, give me a break, how hard can adding two lines of code to four or five similar files (//critical.c) be verses adding the two lines to a single file? It's just no justification in my book to implement the routines in MI source files. -Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message