Date: Mon, 5 Sep 2011 04:04:18 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Attilio Rao <attilio@freebsd.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r225372 - head/sys/kern Message-ID: <20110905023251.C832@besplex.bde.org> In-Reply-To: <CAJ-FndDa=xmvrcn9CdgMvDZ_vG3pjUdFqLH=Q%2BVK%2BSWjvGFO9g@mail.gmail.com> References: <201109041307.p84D72GY092462@svn.freebsd.org> <CAJ-FndDa=xmvrcn9CdgMvDZ_vG3pjUdFqLH=Q%2BVK%2BSWjvGFO9g@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
--0-1840991722-1315159458=:832
Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
On Sun, 4 Sep 2011, Attilio Rao wrote:
>> Log:
>> =C2=A0Interrupts are disabled/enabled when entering and exiting the KDB =
context.
>> =C2=A0While this is generally good, it brings along a serie of problems,
>> =C2=A0like clocks going off sync and in presence of SW_WATCHDOG, watchdo=
gs
>> =C2=A0firing without a good reason (missed hardclock wdog ticks update).
Please fix your mailer. It corrupts spaces to binary data \xc2\xa0.
My version resets the timecounter 3 seconds after leaving ddb (not
immediately, to avoid setting it thousands or millions of times per
second for trace traps and possibly making it drift a few usec every
time (I get a drift of ~ 1 usec per second stopped)). In fact, I don't
see how anyone can use ddb without this. When running current, I
have to remember to stop ntpd before running ddb so that ntpd doesn't
corrupt its database (just the driftfile and logs).
> Also please notice that intr enable/disable happens in the wrong way
> as it is done via the MD (x86 specific likely) interface. This is
> wrong for 2 reasons:
No, intr_disable() is MI. It is also used by witness. disable_intr()
is the corresponding x86 interface that you may be thinking of. The MI
interface intr_disable() was introduced to avoid the MD'ness of
intr_disable().
> 1) There may be some codepaths leading to explicit preemption
> 2) It should really use an MI interface
>
> The right way to do this should be via spinlock_enter().
spinlock_enter() is MI, but has wrong semantics. In my version of i386,
spinlocks don't disable any h/w interrupt, as is needed for fast interrupt
handlers to actually work. I believe sparc64 is similar, except its
spinlock_enter() disables most h/w interrupts and this includes fast
interrupt handlers. I don't understand sparc64, but it looks like its
spinlock_enter() disables all interrupts visible in C code, but not
all interrupts:
from cpufunc.h:
% static __inline register_t
% intr_disable(void)
% {
% =09register_t s;
%=20
% =09s =3D rdpr(pstate);
% =09wrpr(pstate, s & ~PSTATE_IE, 0);
% =09return (s);
% }
This seems to mask all interrupts, as required.
(The interface here is slightly broken (non-MI). It returns register_=
t.
This assumes that the interrupt state can be represented in a single
register. The type critical_t exists to avoid the same bug in an
old version of critical_enter(). Now this type is just bogus.
critical_enter() no longer returns it. Instead, spinlock_enter() uses
a non-reentrant interface which stores what used to be the return valu=
e
of critical_enter() in a per-thread MD data structure (md_saved_pil
in the above). Most or all arches use register_t for this. This
leaves critical_t as pure garbage -- the only remaining references to
it are for its definition.)
From=20machep.c:
% void
% spinlock_enter(void)
% {
% =09struct thread *td;
% =09register_t pil;
%=20
% =09td =3D curthread;
% =09if (td->td_md.md_spinlock_count =3D=3D 0) {
% =09=09pil =3D rdpr(pil);
% =09=09wrpr(pil, 0, PIL_TICK);
% =09=09td->td_md.md_spinlock_count =3D 1;
% =09=09td->td_md.md_saved_pil =3D pil;
% =09} else
% =09=09td->td_md.md_spinlock_count++;
% =09critical_enter();
% }
I believe this disables all interrupts at and below a certain level.
From=20intr_machdep.h:
% #define=09PIL_LOW=09=091=09/* stray interrupts */
% #define=09PIL_ITHREAD=092=09/* interrupts that use ithreads */
% #define=09PIL_RENDEZVOUS=093=09/* smp rendezvous ipi */
% #define=09PIL_AST=09=094=09/* ast ipi */
% #define=09PIL_STOP=095=09/* stop cpu ipi */
% #define=09PIL_PREEMPT=096=09/* preempt idle thread cpu ipi */
% #define=09PIL_HARDCLOCK=097=09/* hardclock broadcast */
% #define=09PIL_FILTER=0912=09/* filter interrupts */
% #define=09PIL_BRIDGE=0913=09/* bridge interrupts */
% #define=09PIL_TICK=0914=09/* tick interrupts */
And this level includes all interrupts that have a level, including
tick interrupts.
In my version of x86, spinlocks mask everything except the equivalent
of filter interrupts (=3D non-broken fast interrupts in my version).=20
Non-broken fast-interrupt handlers have to be written very carefully
to be reentrant. Since full reentrancy is too hard, they must run
with _all_ h/w interrupts disabled. kdb needs all h/w interrupts
disabled for similar reasons. It cannot even call things like
spinlock_enter() without knowing too much about their internals.
In fact, the non-reentrant state for spinlock_enter() and/or critical
_enter() was a reliable cause of panics for many years even when it
was accessed externally near kdb -- tracing through these functions
often paniced.
Below the filter level is a much higher level than sparc64. I think
the interrupts not masked at PIL_TICK are mostly handled in asm, which
I am further from understanding. Anyway, this is all at a low level
for sparc64. Higher levels just need to use the correct interface to
disable all interrupts. That is disable_intr(), not spinlock_enter().
Bruce
--0-1840991722-1315159458=:832--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110905023251.C832>
