From: Brian Fundakowski Feldman
To: arch@FreeBSD.org
Date: Fri, 16 Apr 2004 22:12:42 -0400
Subject: kqueue giant-locking (&kq_Giant, locking)
Message-Id: <200404170212.i3H2Cg8n031749@green.homeunix.org>
List-Id: Discussion related to FreeBSD architecture

I believe I have come up with a good solution to the kqueue woes in 5.X,
and I'd like to get some feedback on work that so far is letting me (on
uniprocessor, at least) run make -j8 buildworld, with USE_KQUEUE in make(1),
with no ill effect :)  The locking thus far is one global kqueue lock, and
I firmly believe we should use MUTEX_PROFILING to determine if we should
lock it down any further at this point.

There are several major differences so far (of course, fixing that
stack-paged-out-kernel-crash-bug is one of them) and several major things
still to be fixed.

1. The recursion has been removed from kqueue.  This means kqueues cannot
   be added to other kqueues for EVFILT_READ -- yes, that ability has been
   around since r1.1 of kern_event.c, but it is utterly pointless and, if
   you take a look at my previous patch, severely complicates many things.
   Of course, I'm sure someone will notice and complain, but there isn't
   any documentation that suggests you should kevent() another kqueue().
2. Because of this, KNOTE() can't end up calling another KNOTE() unless the
   consumer does something stupid (call KNOTE() from filter::event()).
3. Kqueue does the locking for you when it comes to the non-object lists.
   All of the filter::attach() and filter::detach() routines need to lock
   their object lists, but they don't touch kqueue or knote other than
   setting their own knote's fields.  Both of those routines are called
   without any locks held on kqueue's part.
4. The filter::event() routines are called with internal kqueue locking
   held.  You can lock anything else you need to, but you may not sleep;
   it is essentially like an interrupt handler.  You must not call into
   KNOTE() with locks held, but you should reference your object.  I've
   fixed what appears to be the most egregious offender, sys_pipe.c.
5. If KNOTE() as an interrupt does not work for you, you may call KNOTE()
   with any locks you like except the ones it uses internally (mainly
   filedesc and file), but the only information you can give your
   filter::event() is the hint argument.

Examples of #4 are bpf and pipe; they do not need to pass any information
in the filter::event() hint, and as every handler that works on the object
instead of on hints needs to do, they verify for certain whether or not the
KNOTE() should have actually fired and ignore false positives.

The biggest example of #5 is process events.  There are many different
process-type locks that may be held when KNOTE() is called, but the
implementation of filter::event() is mostly correct in locking nothing.
In kern_fork.c, KNOTE() is called outside of the proc lock (p1->p_klist
not locked as it should be) because it has to be special-cased somehow.

This is the most disgusting thing EVAR.
(NB: See http://green.homeunix.org/~green/kqueue-locking.1.patch for that.)
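
To make the difference between #4 and #5 concrete, here is a rough
userland model.  It is not kernel code and not taken from the patch: every
mock_* name is made up for illustration, the "mutex" is a single-threaded
flag that only checks lock discipline, and the direct call at the end of
mock_pipe_write() merely stands in for KNOTE().  It assumes a #4-style
filter looks only at its object and a #5-style filter looks only at the
hint.

```c
#include <assert.h>
#include <stddef.h>

/* Single-threaded stand-in for a non-recursive mutex (hypothetical). */
struct mock_mtx { int held; };

static void mock_lock(struct mock_mtx *m)   { assert(!m->held); m->held = 1; }
static void mock_unlock(struct mock_mtx *m) { assert(m->held);  m->held = 0; }

/* Hypothetical object, loosely modelled on a pipe. */
struct mock_pipe {
	struct mock_mtx lock;	/* object lock */
	int bytes_ready;	/* readable bytes */
};

/* Heavily reduced knote: the object it watches plus the reported data. */
struct mock_knote {
	struct mock_pipe *obj;	/* NULL for hint-only filters */
	long data;
};

/*
 * Rule 4 style: ignore the hint, lock the object, and decide from its
 * real state whether the event fired.  A spurious call returns 0.
 */
static int
mock_filt_read(struct mock_knote *kn, long hint)
{
	int active;

	(void)hint;
	mock_lock(&kn->obj->lock);
	kn->data = kn->obj->bytes_ready;
	active = (kn->data > 0);
	mock_unlock(&kn->obj->lock);
	return (active);
}

/*
 * Rule 5 style: touch no object state; the hint is the only input.
 */
static int
mock_filt_hint(struct mock_knote *kn, long hint)
{
	if (hint == 0)
		return (0);
	kn->data = hint;
	return (1);
}

/*
 * Producer side for rule 4: update the object under its lock, then drop
 * the lock before notifying, since the event routine takes it again.
 */
static int
mock_pipe_write(struct mock_pipe *p, struct mock_knote *kn, int n)
{
	mock_lock(&p->lock);
	p->bytes_ready += n;
	mock_unlock(&p->lock);
	return (mock_filt_read(kn, 0));	/* stands in for KNOTE() */
}
```

In this model, calling mock_filt_read() on an empty mock_pipe reports
nothing no matter what hint is passed, which is the "ignore false
positives" behaviour described for bpf and pipe.  If mock_pipe_write()
notified while still holding the object lock, the assert in mock_lock()
would trip, which is the model's version of "do not call into KNOTE() with
locks held".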

Current patch at:
http://green.homeunix.org/~green/kqueue-giant-locking.0.patch

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\