From owner-svn-src-all@FreeBSD.ORG Sun Sep 4 18:04:34 2011 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 276D01065670; Sun, 4 Sep 2011 18:04:34 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail10.syd.optusnet.com.au (mail10.syd.optusnet.com.au [211.29.132.191]) by mx1.freebsd.org (Postfix) with ESMTP id AF0FC8FC1B; Sun, 4 Sep 2011 18:04:33 +0000 (UTC) Received: from c122-106-165-191.carlnfd1.nsw.optusnet.com.au (c122-106-165-191.carlnfd1.nsw.optusnet.com.au [122.106.165.191]) by mail10.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p84I4Ps6008983 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 5 Sep 2011 04:04:30 +1000 Date: Mon, 5 Sep 2011 04:04:18 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Attilio Rao In-Reply-To: Message-ID: <20110905023251.C832@besplex.bde.org> References: <201109041307.p84D72GY092462@svn.freebsd.org> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-1840991722-1315159458=:832" Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r225372 - head/sys/kern X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Sep 2011 18:04:34 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-1840991722-1315159458=:832 Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE On Sun, 4 Sep 2011, Attilio Rao wrote: >> Log: >> =C2=A0Interrupts are disabled/enabled when entering and exiting the KDB = context. >> =C2=A0While this is generally good, it brings along a serie of problems, >> =C2=A0like clocks going off sync and in presence of SW_WATCHDOG, watchdo= gs >> =C2=A0firing without a good reason (missed hardclock wdog ticks update). Please fix your mailer. It corrupts spaces to binary data \xc2\xa0. My version resets the timecounter 3 seconds after leaving ddb (not immediately, to avoid setting it thousands or millions of times per second for trace traps and possibly making it drift a few usec every time (I get a drift of ~ 1 usec per second stopped)). In fact, I don't see how anyone can use ddb without this. When running current, I have to remember to stop ntpd before running ddb so that ntpd doesn't corrupt its database (just the driftfile and logs). > Also please notice that intr enable/disable happens in the wrong way > as it is done via the MD (x86 specific likely) interface. This is > wrong for 2 reasons: No, intr_disable() is MI. It is also used by witness. disable_intr() is the corresponding x86 interface that you may be thinking of. The MI interface intr_disable() was introduced to avoid the MD'ness of intr_disable(). > 1) There may be some codepaths leading to explicit preemption > 2) It should really use an MI interface > > The right way to do this should be via spinlock_enter(). spinlock_enter() is MI, but has wrong semantics. In my version of i386, spinlocks don't disable any h/w interrupt, as is needed for fast interrupt handlers to actually work. I believe sparc64 is similar, except its spinlock_enter() disables most h/w interrupts and this includes fast interrupt handlers. I don't understand sparc64, but it looks like its spinlock_enter() disables all interrupts visible in C code, but not all interrupts: from cpufunc.h: % static __inline register_t % intr_disable(void) % { % =09register_t s; %=20 % =09s =3D rdpr(pstate); % =09wrpr(pstate, s & ~PSTATE_IE, 0); % =09return (s); % } This seems to mask all interrupts, as required. (The interface here is slightly broken (non-MI). It returns register_= t. This assumes that the interrupt state can be represented in a single register. The type critical_t exists to avoid the same bug in an old version of critical_enter(). Now this type is just bogus. critical_enter() no longer returns it. Instead, spinlock_enter() uses a non-reentrant interface which stores what used to be the return valu= e of critical_enter() in a per-thread MD data structure (md_saved_pil in the above). Most or all arches use register_t for this. This leaves critical_t as pure garbage -- the only remaining references to it are for its definition.) From=20machep.c: % void % spinlock_enter(void) % { % =09struct thread *td; % =09register_t pil; %=20 % =09td =3D curthread; % =09if (td->td_md.md_spinlock_count =3D=3D 0) { % =09=09pil =3D rdpr(pil); % =09=09wrpr(pil, 0, PIL_TICK); % =09=09td->td_md.md_spinlock_count =3D 1; % =09=09td->td_md.md_saved_pil =3D pil; % =09} else % =09=09td->td_md.md_spinlock_count++; % =09critical_enter(); % } I believe this disables all interrupts at and below a certain level. From=20intr_machdep.h: % #define=09PIL_LOW=09=091=09/* stray interrupts */ % #define=09PIL_ITHREAD=092=09/* interrupts that use ithreads */ % #define=09PIL_RENDEZVOUS=093=09/* smp rendezvous ipi */ % #define=09PIL_AST=09=094=09/* ast ipi */ % #define=09PIL_STOP=095=09/* stop cpu ipi */ % #define=09PIL_PREEMPT=096=09/* preempt idle thread cpu ipi */ % #define=09PIL_HARDCLOCK=097=09/* hardclock broadcast */ % #define=09PIL_FILTER=0912=09/* filter interrupts */ % #define=09PIL_BRIDGE=0913=09/* bridge interrupts */ % #define=09PIL_TICK=0914=09/* tick interrupts */ And this level includes all interrupts that have a level, including tick interrupts. In my version of x86, spinlocks mask everything except the equivalent of filter interrupts (=3D non-broken fast interrupts in my version).=20 Non-broken fast-interrupt handlers have to be written very carefully to be reentrant. Since full reentrancy is too hard, they must run with _all_ h/w interrupts disabled. kdb needs all h/w interrupts disabled for similar reasons. It cannot even call things like spinlock_enter() without knowing too much about their internals. In fact, the non-reentrant state for spinlock_enter() and/or critical _enter() was a reliable cause of panics for many years even when it was accessed externally near kdb -- tracing through these functions often paniced. Below the filter level is a much higher level than sparc64. I think the interrupts not masked at PIL_TICK are mostly handled in asm, which I am further from understanding. Anyway, this is all at a low level for sparc64. Higher levels just need to use the correct interface to disable all interrupts. That is disable_intr(), not spinlock_enter(). Bruce --0-1840991722-1315159458=:832--