Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 5 Sep 2011 04:04:18 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r225372 - head/sys/kern
Message-ID:  <20110905023251.C832@besplex.bde.org>
In-Reply-To: <CAJ-FndDa=xmvrcn9CdgMvDZ_vG3pjUdFqLH=Q%2BVK%2BSWjvGFO9g@mail.gmail.com>
References:  <201109041307.p84D72GY092462@svn.freebsd.org> <CAJ-FndDa=xmvrcn9CdgMvDZ_vG3pjUdFqLH=Q%2BVK%2BSWjvGFO9g@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--0-1840991722-1315159458=:832
Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE

On Sun, 4 Sep 2011, Attilio Rao wrote:

>> Log:
>> =C2=A0Interrupts are disabled/enabled when entering and exiting the KDB =
context.
>> =C2=A0While this is generally good, it brings along a serie of problems,
>> =C2=A0like clocks going off sync and in presence of SW_WATCHDOG, watchdo=
gs
>> =C2=A0firing without a good reason (missed hardclock wdog ticks update).

Please fix your mailer.  It corrupts spaces to binary data \xc2\xa0.

My version resets the timecounter 3 seconds after leaving ddb (not
immediately, to avoid setting it thousands or millions of times per
second for trace traps and possibly making it drift a few usec every
time (I get a drift of ~ 1 usec per second stopped)).  In fact, I don't
see how anyone can use ddb without this.  When running current, I
have to remember to stop ntpd before running ddb so that ntpd doesn't
corrupt its database (just the driftfile and logs).

> Also please notice that intr enable/disable happens in the wrong way
> as it is done via the MD (x86 specific likely) interface. This is
> wrong for 2 reasons:

No, intr_disable() is MI.  It is also used by witness.  disable_intr()
is the corresponding x86 interface that you may be thinking of.  The MI
interface intr_disable() was introduced to avoid the MD'ness of
intr_disable().

> 1) There may be some codepaths leading to explicit preemption
> 2) It should  really use an MI interface
>
> The right way to do this should be via spinlock_enter().

spinlock_enter() is MI, but has wrong semantics.  In my version of i386,
spinlocks don't disable any h/w interrupt, as is needed for fast interrupt
handlers to actually work.  I believe sparc64 is similar, except its
spinlock_enter() disables most h/w interrupts and this includes fast
interrupt handlers.  I don't understand sparc64, but it looks like its
spinlock_enter() disables all interrupts visible in C code, but not
all interrupts:

from cpufunc.h:
% static __inline register_t
% intr_disable(void)
% {
% =09register_t s;
%=20
% =09s =3D rdpr(pstate);
% =09wrpr(pstate, s & ~PSTATE_IE, 0);
% =09return (s);
% }

This seems to mask all interrupts, as required.

     (The interface here is slightly broken (non-MI).  It returns register_=
t.
     This assumes that the interrupt state can be represented in a single
     register.  The type critical_t exists to avoid the same bug in an
     old version of critical_enter().  Now this type is just bogus.
     critical_enter() no longer returns it.  Instead, spinlock_enter() uses
     a non-reentrant interface which stores what used to be the return valu=
e
     of critical_enter() in a per-thread MD data structure (md_saved_pil
     in the above).  Most or all arches use register_t for this.  This
     leaves critical_t as pure garbage -- the only remaining references to
     it are for its definition.)

From=20machep.c:
% void
% spinlock_enter(void)
% {
% =09struct thread *td;
% =09register_t pil;
%=20
% =09td =3D curthread;
% =09if (td->td_md.md_spinlock_count =3D=3D 0) {
% =09=09pil =3D rdpr(pil);
% =09=09wrpr(pil, 0, PIL_TICK);
% =09=09td->td_md.md_spinlock_count =3D 1;
% =09=09td->td_md.md_saved_pil =3D pil;
% =09} else
% =09=09td->td_md.md_spinlock_count++;
% =09critical_enter();
% }

I believe this disables all interrupts at and below a certain level.

From=20intr_machdep.h:
% #define=09PIL_LOW=09=091=09/* stray interrupts */
% #define=09PIL_ITHREAD=092=09/* interrupts that use ithreads */
% #define=09PIL_RENDEZVOUS=093=09/* smp rendezvous ipi */
% #define=09PIL_AST=09=094=09/* ast ipi */
% #define=09PIL_STOP=095=09/* stop cpu ipi */
% #define=09PIL_PREEMPT=096=09/* preempt idle thread cpu ipi */
% #define=09PIL_HARDCLOCK=097=09/* hardclock broadcast */
% #define=09PIL_FILTER=0912=09/* filter interrupts */
% #define=09PIL_BRIDGE=0913=09/* bridge interrupts */
% #define=09PIL_TICK=0914=09/* tick interrupts */

And this level includes all interrupts that have a level, including
tick interrupts.

In my version of x86, spinlocks mask everything except the equivalent
of filter interrupts (=3D non-broken fast interrupts in my version).=20
Non-broken fast-interrupt handlers have to be written very carefully
to be reentrant.  Since full reentrancy is too hard, they must run
with _all_ h/w interrupts disabled.  kdb needs all h/w interrupts
disabled for similar reasons.  It cannot even call things like
spinlock_enter() without knowing too much about their internals.
In fact, the non-reentrant state for spinlock_enter() and/or critical
_enter() was a reliable cause of panics for many years even when it
was accessed externally near kdb -- tracing through these functions
often paniced.

Below the filter level is a much higher level than sparc64.  I think
the interrupts not masked at PIL_TICK are mostly handled in asm, which
I am further from understanding.  Anyway, this is all at a low level
for sparc64.  Higher levels just need to use the correct interface to
disable all interrupts.  That is disable_intr(), not spinlock_enter().

Bruce
--0-1840991722-1315159458=:832--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110905023251.C832>