Date: Thu, 28 Feb 2002 01:50:37 -0800
From: Alfred Perlstein
To: smp@freebsd.org
Cc: dillon@freebsd.org, jhb@freebsd.org, smp@freebsd.org, Chad David
Subject: select/poll locking wrong?
Message-ID: <20020228095037.GJ80761@elvis.mu.org>
In-Reply-To: <20020228011644.A28982@colnta.acns.ab.ca>

* Chad David [020228 00:16] wrote:
>
> A question:
>
> condvar(9) says that the same mtx must be used with a given cv, but doesn't
> select() call cv_*_sig() with selwait and many proc mtx's; as well,
> selwakeup() broadcasts on selwait without owning the locks? Am I missing
> something here, or are the rules not quite as strict as documented?

A condvar should record the first mutex it is used with. If it is used
with another mutex before some sort of special "reset" function has been
called, it should trigger an assertion.

OK, based on that, using the proc lock for select/poll looks wrong.

The proper way to do this seems to require a global mutex that is held
for select ops.
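To make the "record the first mutex" rule concrete, here is a minimal user-space sketch using pthreads. It is not the kernel condvar(9) code; struct checked_cv and the checked_cv_* functions are hypothetical names made up for illustration, and the binding is only updated by a caller that already holds the mutex it passes in.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: a condvar that remembers the mutex it is used with. */
struct checked_cv {
	pthread_cond_t   cv;
	pthread_mutex_t *bound;	/* NULL until first use, or after a reset */
};

void
checked_cv_init(struct checked_cv *c)
{
	pthread_cond_init(&c->cv, NULL);
	c->bound = NULL;
}

/*
 * Record mtx on first use.  Returns 0 if mtx matches the recorded
 * mutex, -1 if the condvar is already bound to a different one.
 */
int
checked_cv_bind(struct checked_cv *c, pthread_mutex_t *mtx)
{
	if (c->bound == NULL)
		c->bound = mtx;
	return (c->bound == mtx ? 0 : -1);
}

/* The special "reset": forget the recorded mutex. */
void
checked_cv_reset(struct checked_cv *c)
{
	c->bound = NULL;
}

/* Caller must hold mtx.  Asserts if the condvar is bound elsewhere. */
int
checked_cv_wait(struct checked_cv *c, pthread_mutex_t *mtx)
{
	int ok = checked_cv_bind(c, mtx);

	assert(ok == 0);
	(void)ok;	/* keep NDEBUG builds warning-free */
	return (pthread_cond_wait(&c->cv, mtx));
}
```

With a check like this, waiting on a single selwait condvar while holding many different proc mutexes would trip the assertion on the second distinct mutex.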
One of the problems with doing that is that selinfo uses a pid to record
which processes/threads are watching it. This makes selwakeup/selrecord
call pfind, which in turn needs the allproc lock. It's hard to express
the badness associated with that, so I'll just move on to a proposed
solution.

Solution:

One global lock for select/poll operations. (BSD/OS style)

cv_wait* interlocks with the global lock.

struct selinfo is modified (and bloated :() to contain a pointer to the
thread as well as a TAILQ, so that all the selinfos registered to a
particular thread are on a list hung off of that thread struct.

All the lists are protected by the select mutex.

selrecord is modified to link the selinfo off the thread and set the
back pointer to the thread (under the select lock).

selwakeup is modified to grab the select lock and do its checks.

On return from select/poll you walk the list, point all the selinfos
to null, and remove them from yourself.

Basically, every time you see:

    sip->si_pid = 0;

you want to remove that selinfo from the list hung off the proc.

Each time you see:

    sip->si_pid = mypid;

you want to link that selinfo into yourself.

Because selwakeup is only called when someone is interested in the
socket/pipe/whatever, it's not going to be a global contention point
other than with other threads heavily engaged in select/poll activity.

I'd really appreciate some feedback. I'd like it even more if someone
did this work; failing that, I'll take a shot at it.

Or is this not going to work? :)

thanks,
-- 
-Alfred Perlstein [alfred@freebsd.org]
'Instead of asking why a piece of software is using "1970s technology,"
start asking why software is ignoring 30 years of accumulated wisdom.'
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/
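For reference, the proposal in this message (one global select lock, a back pointer in each selinfo, and a per-thread TAILQ of selinfos) can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions, not kernel code: struct sel_info, struct sel_thread, sel_lock, sel_wait, the st_woken flag and all sel_* functions are hypothetical stand-ins, pthreads replaces the kernel mutex/condvar primitives, and collision handling (two threads recording the same selinfo) is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

struct sel_thread;

/* Hypothetical stand-in for struct selinfo. */
struct sel_info {
	struct sel_thread    *si_thread;	/* back pointer, NULL if unrecorded */
	TAILQ_ENTRY(sel_info) si_link;		/* linkage on the thread's list */
};

/* Hypothetical stand-in for the thread struct. */
struct sel_thread {
	TAILQ_HEAD(sel_list, sel_info) st_selinfos;	/* selinfos we registered */
	int st_woken;					/* set by sel_wakeup() */
};

/* The one global select lock and the condvar that interlocks with it. */
static pthread_mutex_t sel_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  sel_wait = PTHREAD_COND_INITIALIZER;

void
sel_thread_init(struct sel_thread *td)
{
	TAILQ_INIT(&td->st_selinfos);
	td->st_woken = 0;
}

/* Takes the place of "sip->si_pid = mypid". First recorder wins here. */
void
sel_record(struct sel_thread *td, struct sel_info *sip)
{
	pthread_mutex_lock(&sel_lock);
	if (sip->si_thread == NULL) {
		sip->si_thread = td;
		TAILQ_INSERT_TAIL(&td->st_selinfos, sip, si_link);
	}
	pthread_mutex_unlock(&sel_lock);
}

/* No pid lookup: follow the back pointer under the select lock. */
void
sel_wakeup(struct sel_info *sip)
{
	struct sel_thread *td;

	pthread_mutex_lock(&sel_lock);
	if ((td = sip->si_thread) != NULL) {
		/* Takes the place of "sip->si_pid = 0". */
		TAILQ_REMOVE(&td->st_selinfos, sip, si_link);
		sip->si_thread = NULL;
		td->st_woken = 1;
		pthread_cond_broadcast(&sel_wait);
	}
	pthread_mutex_unlock(&sel_lock);
}

/* Sleep until some recorded selinfo has been woken. */
void
sel_sleep(struct sel_thread *td)
{
	pthread_mutex_lock(&sel_lock);
	while (!td->st_woken)
		pthread_cond_wait(&sel_wait, &sel_lock);
	td->st_woken = 0;
	pthread_mutex_unlock(&sel_lock);
}

/* On return from select/poll: unlink every selinfo still on our list. */
void
sel_clear(struct sel_thread *td)
{
	struct sel_info *sip;

	pthread_mutex_lock(&sel_lock);
	while ((sip = TAILQ_FIRST(&td->st_selinfos)) != NULL) {
		assert(sip->si_thread == td);
		TAILQ_REMOVE(&td->st_selinfos, sip, si_link);
		sip->si_thread = NULL;
	}
	pthread_mutex_unlock(&sel_lock);
}
```

The condvar is only ever used with sel_lock, so the "same mutex" rule from condvar(9) holds, and sel_wakeup never has to translate a pid back into a process.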