From owner-cvs-all  Sun Jul  8  3:57:18 2001
Delivered-To: cvs-all@freebsd.org
Received: from sneakerz.org (sneakerz.org [216.33.66.254])
	by hub.freebsd.org (Postfix) with ESMTP
	id 0E7E137B405; Sun,  8 Jul 2001 03:56:59 -0700 (PDT)
	(envelope-from bright@sneakerz.org)
Received: by sneakerz.org (Postfix, from userid 1092)
	id 5EB7B5D01F; Sun,  8 Jul 2001 05:56:58 -0500 (CDT)
Date: Sun, 8 Jul 2001 05:56:58 -0500
From: Alfred Perlstein <bright@sneakerz.org>
To: Greg Lehey <grog@FreeBSD.org>
Cc: Matt Dillon <dillon@earth.backplane.com>,
	cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/sys systm.h condvar.h src/sys/kern kern_
Message-ID: <20010708055658.M88962@sneakerz.org>
References: <XFMail.010705123747.jhb@FreeBSD.org> <200107052228.f65MSeU64741@aslan.scsiguy.com> <20010705174135.A79818@sneakerz.org> <200107060214.f662ElT61708@earth.backplane.com> <20010708110449.E75626@wantadilla.lemis.com> <20010707211344.I88962@sneakerz.org> <200107080406.f68467f82907@earth.backplane.com> <20010708143124.Y80862@wantadilla.lemis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <20010708143124.Y80862@wantadilla.lemis.com>; from grog@FreeBSD.org on Sun, Jul 08, 2001 at 02:31:24PM +0930
Sender: owner-cvs-all@FreeBSD.ORG
Precedence: bulk
List-ID: <cvs-all.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo?subject=subscribe%20cvs-all>
List-Unsubscribe: <mailto:majordomo?subject=unsubscribe%20cvs-all>
X-Loop: FreeBSD.ORG

* Greg Lehey <grog@FreeBSD.org> [010708 00:01] wrote:
> On Saturday,  7 July 2001 at 21:06:07 -0700, Matt Dillon wrote:
> >>
> >> Problems:
> >> 1) it doesn't wakeup the highest priority process, this can be
> >>   easily fixed.
> >> 2) any processes it comes across that are swapped out are woken
> >>   up.  this is to avoid letting processes die, however it makes
> >>   for a rude surprise especially when you have dozens of apache
> >>   processes swapped out waiting on their listening socket, it
> >>   effectively causes much pain as thrashing starts and the
> >>   machine goes down in a fiery death.
> >>   the solution is to implement a max on the number of swapped
> >>   out processes that wakeup_one will swap in, and keep it somewhat
> >>   low.
> >>
> >     The solution is to not mess with unregulated wakeup routines
> >     like wakeup-one hoping they'd ever do the right thing.  I will
> >     personally bop anyone who tries to 'fix' wakeup-one with a
> >     clue-bat.
> 
> Agreed.  There's no way to fix wakeup_one with a clue-bat.

There seems to be some confusion about what wakeup_one is for, and
about exactly where and how problems arise from using it
incorrectly.

The point and proper use of wakeup_one is to provide a mechanism
for the case where you have multiple waiters for a _single_
condition, and that condition is tied to exactly one address.

Basically, you may not have two (or more) places or two (or more)
reasons for sleeping on an address if you plan on using
wakeup_one.
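
To make the rule concrete, here is a minimal userland sketch of what
goes wrong when two sleepers share an address for different reasons.
It is a toy model, not kernel code: struct sleeper, toy_wakeup_one and
io_waiter_woken are names invented for this illustration, and the
"first sleeper found" policy is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a sleep queue; NOT the kernel's implementation. */
enum reason { WANT_LOCK, WANT_IO_DONE };

struct sleeper {
	const void *chan;	/* address this sleeper sleeps on */
	enum reason why;	/* what it is really waiting for */
	int asleep;		/* 1 while still sleeping */
};

/* Wake the first sleeper found on chan; return it, or NULL if none. */
struct sleeper *
toy_wakeup_one(struct sleeper *q, size_t n, const void *chan)
{
	assert(q != NULL);
	for (size_t i = 0; i < n; i++) {
		if (q[i].asleep && q[i].chan == chan) {
			q[i].asleep = 0;
			return &q[i];
		}
	}
	return NULL;
}

/*
 * Two sleepers share one address but wait for different reasons.
 * The "I/O done" event is signalled once.  Returns 1 if the sleeper
 * that wanted that event was woken, 0 if the wakeup went elsewhere.
 */
int
io_waiter_woken(void)
{
	int object;	/* stands in for the shared object */
	struct sleeper q[] = {
		{ &object, WANT_LOCK, 1 },
		{ &object, WANT_IO_DONE, 1 },
	};
	struct sleeper *w = toy_wakeup_one(q, 2, &object);

	return w != NULL && w->why == WANT_IO_DONE;
}
```

In this model io_waiter_woken() returns 0: the single wakeup is
consumed by the sleeper that wanted the lock, and the one waiting for
the I/O never hears about it.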

If you want to see a place where wakeup_one is effective, see the
code for waiting for an incoming connection on a listen() socket
(kern/uipc_socket2.c and kern/uipc_syscalls.c).  This is the prime
example of today's usage as well as textbook usage.

Basically, the only things sleeping on &head->so_timeo are threads
waiting for a new connection to come in, so it's safe to use
wakeup_one in this situation.  It's been safe forever.

Note carefully though that if the woken up thread can't actually
use the new connection (runs out of fds or some other error), it
passes the buck by calling wakeup_one (kern/uipc_syscalls.c) again
and then returning an error.

Remember, you can _not_ just re-sleep on the address because then
you may have an infinite loop. :)
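
The pass-the-buck rule can be sketched in userland like this.  Again
this is only a toy model under stated assumptions, not the code in
kern/uipc_syscalls.c: struct acceptor, toy_wake_acceptor and
deliver_connection are hypothetical names, and "free_fds" is a crude
stand-in for whatever might make accept fail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread blocked waiting for a connection. */
struct acceptor {
	int asleep;	/* 1 while sleeping on the listen socket's address */
	int free_fds;	/* descriptors this thread can still allocate */
};

/* Wake the first sleeping acceptor; return its index, or -1 if none. */
int
toy_wake_acceptor(struct acceptor *q, size_t n)
{
	assert(q != NULL);
	for (size_t i = 0; i < n; i++) {
		if (q[i].asleep) {
			q[i].asleep = 0;
			return (int)i;
		}
	}
	return -1;
}

/*
 * Deliver one new connection.  Returns the index of the acceptor that
 * took it, or -1 if the connection was left stranded in the queue.
 */
int
deliver_connection(struct acceptor *q, size_t n, int pass_the_buck)
{
	int who = toy_wake_acceptor(q, n);

	while (who >= 0) {
		if (q[who].free_fds > 0) {
			q[who].free_fds--;
			return who;	/* connection accepted */
		}
		/* This thread is going to fail with an error. */
		if (!pass_the_buck)
			return -1;	/* nobody else is told: stranded */
		/* Wake one more waiter before returning the error. */
		who = toy_wake_acceptor(q, n);
	}
	return -1;
}
```

With two sleepers where the first has no descriptors left,
deliver_connection(q, 2, 1) hands the connection to the second one,
while deliver_connection(q, 2, 0) leaves it stranded even though a
capable thread is still asleep.  The loop terminates because a failing
thread leaves the queue instead of going back to sleep on it.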

My assumption is that in vinum, you may have two threads
sleeping on the address of a vinum object, but they aren't
waiting for the same thing.  Hence, one gets woken up and sees
the object still locked (or something), but the one waiting for
some other event doesn't see the wakeup, and there you have your
lockup.  The event was signalled, but the signal went to someone
who was actually waiting for something else.

Let's figure out why wakeup_one() failed for vinum:

Process A has a rangelock
Process B has a rangelock (lower in the plex->lock[] array than A's)
Process C wants the rangelock that A has and is waiting on it
Process D wants the rangelock that A has and is waiting on it

Now things execute in this order:

  Process B runs, completes its work and releases its lock.
  This opens a hole earlier in the lock table.

  Process A runs and releases its lock, waking up _only_ C (*)

  C restarts its search through the plex->lock[] array for a free slot
  and happens upon B's slot.

  C uses that slot instead of the slot it was previously waiting for.

  C later releases the lock and issues a wakeup on the slot it used,
  but D is still waiting on the old address.

  D is now hosed.

  (*) it could have woken up 'D' instead, but the result would have
  been the same.
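
The sequence above can be replayed in a small userland sketch.  This
is a toy model, not vinum's locking code: struct slot, struct waiter,
toy_wakeup, first_free and d_is_stranded are invented for this
illustration.  The wake_all branch, where D rescans and goes back to
sleep on the slot C is using, is an assumption about how a plain
wakeup would behave here.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 2

struct slot { int busy; };			/* toy plex->lock[] entry */
struct waiter { const struct slot *chan; };	/* NULL when not asleep */

/* Wake one sleeper on chan, or all of them; return how many woke. */
int
toy_wakeup(struct waiter *w, size_t n, const struct slot *chan, int wake_all)
{
	int woken = 0;

	for (size_t i = 0; i < n; i++) {
		if (chan != NULL && w[i].chan == chan) {
			w[i].chan = NULL;
			woken++;
			if (!wake_all)
				break;
		}
	}
	return woken;
}

/* Index of the first free slot, or -1 if the table is full. */
int
first_free(const struct slot *l, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!l[i].busy)
			return (int)i;
	return -1;
}

/* Replay the scenario; return 1 if D is still asleep at the end. */
int
d_is_stranded(int wake_all)
{
	struct slot lock[NSLOTS] = { { 1 }, { 1 } };	/* [0]: B, [1]: A */
	struct waiter w[2] = { { &lock[1] }, { &lock[1] } };	/* C, D */
	int c_slot;

	lock[0].busy = 0;				/* B releases */
	toy_wakeup(w, 2, &lock[0], wake_all);		/* nobody sleeps here */

	lock[1].busy = 0;				/* A releases */
	toy_wakeup(w, 2, &lock[1], wake_all);		/* C, and D if wake_all */

	c_slot = first_free(lock, NSLOTS);		/* C rescans from the top */
	assert(c_slot == 0);				/* lands on B's old slot */
	lock[c_slot].busy = 1;

	if (w[1].chan == NULL)				/* D woke: sleeps on C's slot */
		w[1].chan = &lock[c_slot];

	lock[c_slot].busy = 0;				/* C releases */
	toy_wakeup(w, 2, &lock[c_slot], wake_all);

	return w[1].chan != NULL;
}
```

In this model d_is_stranded(0) returns 1, matching the sequence
above, and d_is_stranded(1) returns 0 because D gets a chance to move
to the address that C will eventually signal.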

My head hurts from figuring that out.

The use of wakeup_one on &plex->usedlocks should have been fine.

Now I'd like to know why Matt would hit me with a cluebat if I were
to take a shot at making wakeup_one a bit smarter about priorities
and about waking fewer swapped out processes.

-- 
-Alfred Perlstein [alfred@freebsd.org]
Ok, who wrote this damn function called '??'?
And why do my programs keep crashing in it?
