From owner-cvs-src@FreeBSD.ORG  Fri Apr 16 00:25:19 2004
Return-Path: <owner-cvs-src@FreeBSD.ORG>
Delivered-To: cvs-src@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 618)
	id AD62016A4D2; Fri, 16 Apr 2004 00:25:19 -0700 (PDT)
In-Reply-To: <20040416115056.J11580@gamplex.bde.org> from Bruce Evans at "Apr
	16, 2004 11:51:16 am"
To: bde@zeta.org.au (Bruce Evans)
Date: Fri, 16 Apr 2004 00:25:19 -0700 (PDT)
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id: <20040416072519.AD62016A4D2@hub.freebsd.org>
From: wpaul@FreeBSD.ORG (Bill Paul)
cc: cvs-src@freebsd.org
cc: src-committers@freebsd.org
cc: cvs-all@freebsd.org
cc: jhb@freebsd.org
Subject: Re: cvs commit: src/sys/compat/ndis hal_var.h kern_ndis.c
	ndis_var.h ntoskrnl_var.h pe_var.h subr_hal.c subr_ndis.c
	subr_ntoskrn
X-BeenThere: cvs-src@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: CVS commit messages for the src tree <cvs-src.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-src>
List-Post: <mailto:cvs-src@freebsd.org>
List-Help: <mailto:cvs-src-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 16 Apr 2004 07:25:19 -0000


[...]

> > These two routines do the actual work of acquiring and releasing the
> > lock (they map to KefAcquireSpinLockAtDpcLevel() and
> > KefReleaseSpinLockFromDpcLevel). Are you saying the former routine
> > should use atomic_cmpset_acq_int() and the latter atomic_cmpset_rel_int()?
> 
> I think using the "acq" versions is semantically required.  However, using
> them would make no difference, since this code is very i386-dependent
> despite it being in in an MI directory, and atomic_cmpset_{acq,rel}_int
> are just aliases for atomic_cmpset_int on i386's.
> 
> Using atomic_cmpset_int() for unlocking is bogus, especially since it is
> used without a loop to check that the store succeeded.  It only works if
> the lock is already held so that the "cmp" part has no effect.  This can
> be expressed more clearly using atomic_store_rel_int() like the normal
> mutex code does in _release_lock_quick().

ntoskrnl_unlock_dpc() should only ever be called when a lock is already
held. If it's ever called without a lock held, that's a bug.

> The above code also has lots of style bugs.
> 
> Fixing these problems gives:
> 
> %%%
> __stdcall void
> ntoskrnl_lock_dpc(/* lock */ void)
> {
> 	int *lock;
> 
> 	__asm __volatile("" : "=c" (lock));
> 	while (atomic_cmpset_int(lock, 0, 1) == 0)
> 		;
> }
> 
> __stdcall void
> ntoskrnl_unlock_dpc(/* lock */ void)
> {
> 	int *lock;
> 
> 	__asm __volatile("" : "=c" (lock));
> 	atomic_store_rel(lock, 0);
> }
> %%%

Noted.

> The asm is still too fragile.  Isn't there a way to directly specify that
> "int *lock" is an arg passed in %ecx?

Yes: upgrade to gcc 3.4, which has __attribute__((__fastcall__)). Of course,
first you have to wait for gcc 3.4 to be released.

People have suggested various different hacks to duplicate _fastcall.
semantics. This is the only one which produced the correct result in all
cases that I tested, with optimization turned on. You can engage in a
certain amount of deception with the __regparm()__ attribute, but gcc has
certain ideas about what arguments go into which registers when you use
regparm which don't agree with the way _fastcall does things. (This is
why I don't use __regparm(3)__: it will use %ecx and %edx, but it will
also drag %eax into the mix.)

> I copied half of the fixed version from my end run around mutexes in
> my version of the cy driver, which was in turn half copied from the
> simple locking (simple_lock interfaces) in RELENG_4.
> 
> %%%
> static void
> CY_LOCK(void)
> {
> 	while (atomic_cmpset_acq_int(&cy_slock, 0, 1) == 0)
> 		;
> }
> static void
> CY_UNLOCK(void)
> {
> 	atomic_store_rel_int(&cy_slock, 0);
> }
> %%%
> 
> This leaves out half of the complications for locking in the cy driver.
> Simple locks are only used in the SMP case.  For UP, locking at this
> level consists of disabling all interrupts.  For SMP, locking consists
> of the above plus disabling all interrupts on the running CPU.  Disabling
> interrupts on the running CPU is needed to avoid deadlock.  I'm not sure
> how ndis works without it.

At this point, I am forced to conclude that you didn't actually read
the code, since if you had, you might have understood. I will try to
explain it again, hopefully in a way that makes it clear how Windows
does things.

First of all, note that the Windows API has the following routines:
 
KeRaiseIrql() -- raise IRQL to specified level, return previous level
KeLowerIrql() -- lower IRQL to specified level
KeAcquireSpinLockAtDpcLevel() -- do the atomic set portion of acquiring
        a spin lock -- assumes you're already at DISPATCH_LEVEL
KeReleaseSpinLockFromDpcLevel() -- do the atomic clear portion of
        acquiring a spinlock -- assumes you want to remain at DISPATCH_LEVEL
KeAcquireSpinLock() -- perform oldirql = KeRaiseIrql(DISPATCH_LEVEL) and then
        KeAcquireSpinLockAtDpcLevel()
KeReleaseSpinLock() -- peform KeReleaseSpinLockFromDpcLevel() and then
        KeLowerIrql(oldirql)
NdisAcquireSpinLock() -- wrapper for KeAcquireSpinLock()
NdisReleaseSpinLock() -- wrapper for KeReleaseSpinLock()
NdisDprAcquireSpinLock() -- wrapper for KeAcquireSpinLockAtDpcLevel()
NdisDprReleaseSpinLock() -- wrapper for KeReleaseSpinLockFromDpcLevel()

Note that you can quickly find the documentation for any of these
functions by punching their names (minus the parens) into Google;
it will take you straight to the MSDN 'manual page' for the routine.

ntoskrnl_lock_dpc() and ntoskrnl_unlock_dpc() map to
KeAcquireSpinLockAtDpcLevel() and KeReleaseSpinLockFromDpcLevel(),
respectively. The idea behind these two routines is that if you're
already running with IRQL == DISPATCH_LEVEL, then you don't need to
futz with disabling preemption before spinning on the lock because
preemption has already been disabled. The NDIS spec says that
NDIS interrupt handlers (along with a few other things) run at
DISPATCH_LEVEL, so you already know preemption is disabled. You
_could_ call the full blown KeAcquireSpinLock() routine in this
case, but it would perform a useless KeRaiseIrql(DISPATCH_LEVEL)
before spinning on the lock. Trying to raise the IRQL to DISPATCH_LEVEL
when you're already at DISPATCH_LEVEL is harmless, but wastes cycles,
so Microsoft tells you to just use KeAcquireSpinLockAtDpcLevel()
(or NdisDprAcquireSpinLock()) instead.

Sadly, things are complicated by the fact that many of the KeXXX
routines listed above are in fact just macros around KfXXX routines
which, at least on the x86 arch, live in the HAL. For example,
while you would expect KeAcquireSpinLock() to be a function that
lives in ntoskrnl.exe (and hence subr_ntoskrnl.c), it's actually
just a macro that calls KfAcquireSpinLock(), which lives in HAL.dll
(and hence subr_hal.c). To add to the confusion,
KeAcquireSpinLockAtDpcLevel() is also a macro that calls a function
named KefAcquireSpinLockAtDpcLevel(), however this function actually
_does_ live in ntoskrnl.exe (and hence in subr_ntoskrnl.c). Logically,
all the spinlock routines should go in the same module, but thanks to
Microsoft, they don't. (Of course, things may be entirely different
on the amd64 or ia64 arches: the only way to know for sure is to
slog through the Windows DDK header files, which is exactly what I
did.)

Now, had you taken the time to actually read subr_hal,c, you would
have seen the following:

__stdcall uint8_t
hal_lock(/*lock*/void)
{
        kspin_lock              *lock;
        uint8_t                 oldirql;

        __asm__ __volatile__ ("" : "=c" (lock));

        /* I am so going to hell for this. */
        if (hal_irql() > DISPATCH_LEVEL)
                panic("IRQL_NOT_LESS_THAN_OR_EQUAL");

        oldirql = FASTCALL1(hal_raise_irql, DISPATCH_LEVEL);
        FASTCALL1(ntoskrnl_lock_dpc, lock);

        return(oldirql);
}

__stdcall void
hal_unlock(/*lock, newirql*/void)
{
        kspin_lock              *lock;
        uint8_t                 newirql;

        __asm__ __volatile__ ("" : "=c" (lock), "=d" (newirql));

        FASTCALL1(ntoskrnl_unlock_dpc, lock);
        FASTCALL1(hal_lower_irql, newirql);

        return;
}

At the bottom of the file, you can see how these map to the Windows
API functions:

        { "KfAcquireSpinLock",          (FUNC)hal_lock },
        { "KfReleaseSpinLock",          (FUNC)hal_unlock },

The subr_hal.c module also contains hal_raise_irql()/hal_lower_irql()
routines.

Hopefully this makes things a little clearler. If you have any further
answers, I will be happy to provide complete and detailed questions.

-Bill

--
=============================================================================
-Bill Paul            (510) 749-2329 | Senior Engineer, Master of Unix-Fu
                 wpaul@windriver.com | Wind River Systems
=============================================================================
              <adamw> you're just BEGGING to face the moose
=============================================================================