Date:      Thu, 28 Feb 2002 01:50:37 -0800
From:      Alfred Perlstein <bright@mu.org>
To:        smp@freebsd.org
Cc:        dillon@freebsd.org, jhb@freebsd.org, smp@freebsd.org, Chad David <davidc@acns.ab.ca>
Subject:   select/poll locking wrong?
Message-ID:  <20020228095037.GJ80761@elvis.mu.org>
In-Reply-To: <20020228011644.A28982@colnta.acns.ab.ca>
References:  <20020227111605.GH80761@elvis.mu.org> <20020227122606.A15980@colnta.acns.ab.ca> <20020227193059.GP80761@elvis.mu.org> <20020227124514.A27497@colnta.acns.ab.ca> <20020227202205.GT80761@elvis.mu.org> <20020227235736.A28776@colnta.acns.ab.ca> <20020228073743.GD80761@elvis.mu.org> <20020228011644.A28982@colnta.acns.ab.ca>

* Chad David <davidc@acns.ab.ca> [020228 00:16] wrote:
> 
> A question:
> 
> condvar(9) says that the same mtx must be used with a given cv, but doesn't
> select() call cv_*_sig() with selwait and many proc mtx's; as well,
> selwakeup() broadcasts on selwait without owning the locks?  Am I missing
> something here, or are the rules not quite as strict as documented?

A condvar should record the first mutex it is used with.  If it is
used with another mutex before some sort of special "reset" function
is called, it should trigger an assertion.
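
To make that concrete, here is a rough userland sketch of the check
using pthreads.  It is not the condvar(9) implementation; the chk_cv
names are made up for illustration, and it assumes the caller already
holds the mutex it passes in and that reset is only called when nobody
is sleeping on the condvar.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: a condvar that remembers its interlock mutex. */
struct chk_cv {
	pthread_cond_t	 cv;
	pthread_mutex_t	*owner;		/* first mutex used, or NULL */
};

static void
chk_cv_init(struct chk_cv *c)
{
	pthread_cond_init(&c->cv, NULL);
	c->owner = NULL;
}

/*
 * Record mtx on first use.  Returns 1 if mtx matches the recorded
 * mutex, 0 if the condvar is already bound to a different one.
 */
static int
chk_cv_bind(struct chk_cv *c, pthread_mutex_t *mtx)
{
	if (c->owner == NULL)
		c->owner = mtx;
	return (c->owner == mtx);
}

/* The special "reset": forget the recorded mutex. */
static void
chk_cv_reset(struct chk_cv *c)
{
	c->owner = NULL;
}

/* Wait on the condvar; asserts if a second mutex is used. */
static void
chk_cv_wait(struct chk_cv *c, pthread_mutex_t *mtx)
{
	int ok = chk_cv_bind(c, mtx);

	assert(ok && "condvar used with a second mutex");
	(void)ok;
	pthread_cond_wait(&c->cv, mtx);
}
```

With a check like this, select() sleeping on selwait with many
different proc mutexes would trip the assertion on the second process.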

Ok, based on that, using the proc lock for select/poll looks wrong.

The proper way to do this seems to require a global mutex lock
that is held for select ops.

One of the problems with doing that is that selinfo uses a pid
to record which processes/threads are watching it.  This makes
selwakeup/selrecord call pfind which in turn needs the allproc
lock.  It's hard to express the badness associated with that
so I'll just move on to a proposed solution.

Solution:

  One global lock for select/poll operations.  (BSD/OS style)

  cv_wait* interlocks with the global lock.

  struct selinfo is modified (and bloated :() to contain a pointer
  to the thread as well as a TAILQ so all the selinfo's registered
  to a particular thread are on a list hung off of that thread
  struct.

  all the lists are protected by the select mutex.

  selrecord is modified to link the selinfo off the thread and set
  the back pointer to the thread.  (under select lock)

  selwakeup is modified to grab the select lock and then do its checks.

  On return from select/poll you walk the list, set each selinfo's
  thread pointer to null and remove it from your list.
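
Here is a rough userland sketch of the scheme above, using pthreads
and <sys/queue.h> in place of the kernel primitives.  This is not a
patch: every sk_ name, the td_woken flag and the td_selinfo list head
are made up for illustration, and collision handling (a second thread
selecting on the same object) is left out entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

struct sk_thread;

struct sk_selinfo {
	TAILQ_ENTRY(sk_selinfo)	 si_link;	/* on si_thread's list */
	struct sk_thread	*si_thread;	/* watcher, or NULL */
};

TAILQ_HEAD(sk_silist, sk_selinfo);

struct sk_thread {
	struct sk_silist td_selinfo;	/* selinfos registered to us */
	int		 td_woken;	/* set by sk_selwakeup() */
};

/* The one global select lock and the condvar that interlocks with it. */
static pthread_mutex_t sellock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t selwait = PTHREAD_COND_INITIALIZER;

static void
sk_thread_init(struct sk_thread *td)
{
	TAILQ_INIT(&td->td_selinfo);
	td->td_woken = 0;
}

/* Replaces "sip->si_pid = 0".  Caller holds sellock. */
static void
sk_unlink(struct sk_selinfo *sip)
{
	assert(sip->si_thread != NULL);
	TAILQ_REMOVE(&sip->si_thread->td_selinfo, sip, si_link);
	sip->si_thread = NULL;
}

/* Replaces "sip->si_pid = mypid": link the selinfo off the thread. */
static void
sk_selrecord(struct sk_thread *td, struct sk_selinfo *sip)
{
	pthread_mutex_lock(&sellock);
	if (sip->si_thread == NULL) {
		sip->si_thread = td;
		TAILQ_INSERT_TAIL(&td->td_selinfo, sip, si_link);
	}
	pthread_mutex_unlock(&sellock);
}

/* No pfind(): the back pointer already names the watcher. */
static void
sk_selwakeup(struct sk_selinfo *sip)
{
	struct sk_thread *td;

	pthread_mutex_lock(&sellock);
	if ((td = sip->si_thread) != NULL) {
		sk_unlink(sip);
		td->td_woken = 1;
		pthread_cond_broadcast(&selwait);
	}
	pthread_mutex_unlock(&sellock);
}

/* Sleep until woken; the wait interlocks with the global lock. */
static void
sk_selsleep(struct sk_thread *td)
{
	pthread_mutex_lock(&sellock);
	while (!td->td_woken)
		pthread_cond_wait(&selwait, &sellock);
	td->td_woken = 0;
	pthread_mutex_unlock(&sellock);
}

/* On return from select/poll: drop every remaining registration. */
static void
sk_selclear(struct sk_thread *td)
{
	struct sk_selinfo *sip;

	pthread_mutex_lock(&sellock);
	while ((sip = TAILQ_FIRST(&td->td_selinfo)) != NULL)
		sk_unlink(sip);
	pthread_mutex_unlock(&sellock);
}
```

The point of the back pointer is that the wakeup side never has to
look anything up by pid, so the allproc lock stays out of the picture.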

basically, every time you see:
  sip->si_pid = 0;
you want to remove that selinfo from the list hung off the proc.

each time you see
  sip->si_pid = mypid;
you want to link that selinfo into yourself.

Because selwakeup is only called when someone is interested in
the socket/pipe/whatever it's not going to be a global contention
point other than with other threads heavily engaged in select/poll
activity.

I'd really appreciate some feedback.  I'd like it even more if someone
did this work; failing that, I'll take a shot at it.

Or is this not going to work? :)

thanks,
-- 
-Alfred Perlstein [alfred@freebsd.org]
'Instead of asking why a piece of software is using "1970s technology,"
 start asking why software is ignoring 30 years of accumulated wisdom.'
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message