Date:      Sat, 26 Jun 1999 23:10:01 -0600
From:      Wes Peters <wes@softweyr.com>
To:        Jesus Monroy <jesus.monroy@usa.net>
Cc:        Ville-Pertti Keinonen <will@iki.fi>, hackers@FreeBSD.ORG
Subject:   Re: [Re: coarse vs fine-grained locking in SMP systems]
Message-ID:  <3775B229.C3981D1D@softweyr.com>
References:  <19990626055629.6978.qmail@www0i.netaddress.usa.net>

Jesus Monroy wrote:
> 
> Ville-Pertti Keinonen <will@iki.fi> wrote:
> > mo@servo.ccr.org (Mike O'Dell) writes:
> > > we published the best Unix SMP paper I've ever seen in Computing
> > > Systems - from the Amdahl guys who did an SMP version of the kernel
> > > by very clever hacks on SPLx() macros to make them spin locks and
> > > a bit of other clever trickery on the source.  they could take a stock
> >
> > An approach like that can't possibly be sufficient if code has been
> > written with the assumption that only interrupt-like events or
> > blocking calls can change things from under it.  There is quite a bit
> > of code in FreeBSD that relies on this.
> >
>    Can you elaborate on this a bit more? I think I'm missing
>    some of the finer points of what you are saying.
> 
>    I work on interrupt driven device drivers and I'm trying
>    to see how this ties in.

Here's a good example.  It's from a different system that uses a 
modified BSD TCP/IP stack, but the example still holds.  This system
keeps a linked list of network interfaces.  A periodic timer callback
goes through this list to handle timeout issues for the IP stack.
The timer routine didn't bother to acquire the semaphore protecting
the list of network interfaces.  When we moved the code to an SMP
system, it would occasionally (as in once or twice a year across the
entire customer base) crash with a null pointer in the list of
network interfaces.

The timer routine was being preempted by a higher-priority user interface
task removing a network interface.  The timer would run off an invalid
pointer and crash the system.

This never happened on the single-processor system because the timer
ran in timer context, which is not interruptible by a normal process
regardless of priority.  On the SMP system, the timer routine had
failed to protect itself from a normal process running on the other
CPU.
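
For the driver writers following along, here is a minimal sketch of the
fix in user-space C.  A POSIX mutex stands in for the semaphore that
protects the interface list, and every name in it (struct netif,
netif_add, netif_remove, netif_timer_tick) is made up for illustration;
none of it comes from the actual stack.  The point is only that the
timer walk takes the same lock as the removal path, so it cannot follow
a pointer that the other CPU is in the middle of unlinking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical interface record; fields are illustrative only. */
struct netif {
    struct netif *next;
    int id;
    int idle_ticks;
};

static struct netif *iflist = NULL;
/* Stand-in for the semaphore protecting the interface list. */
static pthread_mutex_t iflist_lock = PTHREAD_MUTEX_INITIALIZER;

/* Link a new interface at the head of the list.
 * Returns 0 on success, -1 if allocation fails. */
int netif_add(int id)
{
    assert(id >= 0);            /* illustrative precondition */
    struct netif *ifp = malloc(sizeof *ifp);
    if (ifp == NULL)
        return -1;
    ifp->id = id;
    ifp->idle_ticks = 0;

    pthread_mutex_lock(&iflist_lock);
    ifp->next = iflist;
    iflist = ifp;
    pthread_mutex_unlock(&iflist_lock);
    return 0;
}

/* Unlink and free the interface with the given id.
 * Returns 1 if it was found and removed, 0 otherwise. */
int netif_remove(int id)
{
    struct netif **pp;
    struct netif *victim = NULL;

    pthread_mutex_lock(&iflist_lock);
    for (pp = &iflist; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            victim = *pp;
            *pp = victim->next;
            break;
        }
    }
    pthread_mutex_unlock(&iflist_lock);

    int found = (victim != NULL);
    free(victim);               /* free(NULL) is a no-op */
    return found;
}

/* Periodic timer callback.  Unlike the buggy routine described above,
 * it holds the list lock for the whole walk, so a concurrent
 * netif_remove() on another CPU cannot pull an entry out from under it.
 * Returns the number of interfaces visited. */
int netif_timer_tick(void)
{
    int visited = 0;

    pthread_mutex_lock(&iflist_lock);
    for (struct netif *ifp = iflist; ifp != NULL; ifp = ifp->next) {
        ifp->idle_ticks++;
        visited++;
    }
    pthread_mutex_unlock(&iflist_lock);
    return visited;
}
```

In a real kernel the timer context usually cannot sleep, so the lock
would more likely be a spin lock than a sleeping semaphore; the sketch
ignores that distinction.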

-- 
       "Where am I, and what am I doing in this handbasket?"

Wes Peters                                                 Softweyr LLC
http://www.softweyr.com/~softweyr                      wes@softweyr.com

