Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 4 Aug 2005 14:15:55 -0400
From:      John Baldwin <jhb@FreeBSD.org>
To:        hselasky@c2i.net
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: How to do proper locking
Message-ID:  <200508041415.56140.jhb@FreeBSD.org>
In-Reply-To: <200508041850.14253.hselasky@c2i.net>
References:  <200508030023.04748.hselasky@c2i.net> <200508040953.59316.jhb@FreeBSD.org> <200508041850.14253.hselasky@c2i.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thursday 04 August 2005 12:50 pm, Hans Petter Selasky wrote:
> On Thursday 04 August 2005 15:53, John Baldwin wrote:
> > On Thursday 04 August 2005 07:40 am, Hans Petter Selasky wrote:
> > > On Wednesday 03 August 2005 19:21, John Baldwin wrote:
> > > > On Tuesday 02 August 2005 06:23 pm, Hans Petter Selasky wrote:
> > > > > Hi,
> > > > >
> > > > > I am looking for a safe way to access structures that can be freed.
> > > > > The solution I am looking for must not:
> > > > >
> > > > > - hinder scaleability
> > > > > - lead to use of a single lock
> > > > > - lead to lock order reversal
> > >
> > > There is one more thing. When the structure is freed it should not
> > > block waiting for any reference counts.
> >
> > crhold() is always called when the caller knows that it has a reference
> > to cr that can't go away.  Thus, in the case of p_ucred, one holds the
> > associated PROC_LOCK() while doing crhold().  After the crhold() you can
> > drop the proc lock, so that lock is only held for a very short period of
> > time.  You then have your own reference to the cred structure that is
> > valid until you call crfree().  In the case of curthread->td_ucred, the
> > reference is valid until curthread releases it on a cred update at the
> > start of a syscall, so your reference is valid without needing locks for
> > the life of a syscall.  The key here is that you solve this problem at a
> > higher level, not within the structure itself.
>
> Yes, this works with mbufs, crhold() and alike, where one doesn't actually
> free the object until the refcount reaches zero. But I am speaking about
> destroying an object while the refcount is non-zero. Do you see the
> difference?
>
> There are two ways to handle this:
>
> 1) blocking: wait until the refcount reaches zero.
>
> 2) nonblocking: increment some other refcount that the
>    callback checks before accessing any data.
>
>
> Do you agree that method 2 is going to:
>
> - avoid deadlock
> - avoid use of sx-locks, locks that allow calls to msleep(),
>   to keep synchronization while destroying objects
>   with refcounts.

Not always. :)

> When calling functions like "callout_stop()", it is likely that one is
> holding a lock to prevent other code from calling "callout_start()" on the
> same callback before "callout_stop()" has been called. Now, if
> "callout_stop()" sleeps, the lock that is held while calling this function,
> must allow calls msleep(). So this must be an sx-lock (see "man sx"). Now
> if one routine sleeps, then any callers of this routine must sleep too. So
> this affects the synchronization of the whole system.
>
> Isn't nonblocking operation preferred over blocking operation?

You can call callout_stop() while holding the lock, then do a callout_drain() 
later when you are prepared to block.  This is what I'm doing with ethernet 
drivers right now that typically call callout_stop in their foo_stop() 
routine so detach looks like:

	FOO_LOCK(sc);
	foo_stop(sc);	// calls callout_stop()
	FOO_UNLOCK(sc);
	callout_drain(&sc->foo_stat_ch);

> The callback can be called after that "callout_stop()" has been called!
>
> You can use _callout_stop_safe(), but the problem is that it will sleep,
> waiting for "the other thread" to finish. The solution I have described,
> makes this process non-blocking, which means that one can hold a lock while
> calling callout_stop() which must be an advantage.

See above.

> > > - setting up devices. Make sure that read/write/ioctl/poll callbacks
> > > are never called after that the device has been freed.
> >
> > This is managed with refcounts on the dev_t object by devfs.  Once you
> > call destroy_dev() devfs manages the rest.
>
> No, the callback functions read/write/ioctl/poll can be called after that
> "destroy_dev()" has been called.

Nope.  Evert call to read/write/ioctl bumps a reference count on the 
underlying dev_t and destroy_dev() blocks until the reference count drops to 
zero.

> > > - setting up interrupts. Make sure that interrupt routine is never
> > > called after that it is been teared down.
> >
> > bus_teardown_intr() already does this.
>
> I'm sure it is the same here. If the interrupt handler doesn't have its own
> lock then, I think the interrupt callback can be called after that
> "bus_teardown_intr()" has been called.

Nope.  The interrupt code checks if the ithread is running and if so it flags 
the handler for removal and blocks to let the ithread remove the handler and 
then wake it up.

> > > And more. There are lots of situations in the kernel that run into the
> > > same scheme, without a proper solution.
> >
> > Are there any other places you can think of that don't handle this
> > already?
>
> I am thinking about all the device drivers that create devices in /dev/
> that can be detache, and that really never checks anything before accessing
> the softc:

As above, since they use destroy_dev() they are already ok.

> > The approach FreeBSD's kernel has taken is to solve this problem by
> > duplicating a lot of checks all over the place for each data structure,
> > but to solve it in the primitives themselves instead (hence
> > callout_drain(), bus_teardown_intr(), destroy_dev(), etc.)
>
> Yes, but why are the implementations based on blocking operation, hence
> this is not required?
>
>
>
> What I want is that the kernel provides some routine that can do locking
> based on a structure like below. Also the kernel must provide some routines
> to allocate, free and read a refcount.
>
> struct callback_args {
>   void *sc;
>   struct mtx *p_mtx;
>   u_int32_t refcount_copy;
>   u_int32_t *p_refcount;
> };
>
> Then I want routines like "callout_reset()", "make_dev()",
> "bus_setup_intr()" and so on, to pass these four parameters to the callback
> function, either directly, or like a pointer to this structure.

You are just going to move the blocking behavior into contention on your 
locks, you aren't going to actually remove it anywhere, and since all users 
of the structures have to do more checking at runtime, I think you will end 
up increasing overhead.

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200508041415.56140.jhb>