From: Garance A Drosihn
Date: Sat, 28 Aug 2004 20:05:05 -0400
To: re@freebsd.org, current@freebsd.org
Subject: Re: 5.3-RELEASE TODO - make/kqueue
In-Reply-To: <200408271337.i7RDbXgu052801@pooker.samsco.org>

At 7:37 AM -0600 8/27/04, Scott Long wrote:
>
>Testing focuses for 5.3-RELEASE

An update on Issue:

> |---------------------------------+
> | make -DUSE_KQUEUE causes lockup |
> | with buildworld -jBIGNUM        |
> |---------------------------------+

The description says:

> |-------------------+---------------+--------------+------------|
> | Attempts to use make(1) with KQueues appears to result in a   |
> | kernel hang under "heavy load".  It would be desirable to fix |
> | this both from the perspective of building FreeBSD quickly    |
> | as a developer, but also because it's an instability that     |
> | could show up under other high load and heavy use of          |
> | KQueues.  See PR kern/57945 for a proposed patch and details. |
> | This appear to be the product of a locking problem, and must  |
> | be fixed for 5.3.                                             |
> |-------------------+---------------+--------------+------------|

I have done many buildworlds using the WITH_KQUEUE make over the
past week.  I have done at least 50 buildworlds on my dual-proc
Athlon machine, with -j ranging from 3 to 15.  I have not seen any
lockups since the fix for IPI deadlocks went in.

I do still get the "*** Signal 6"s, even though I am now running
with v1.76 of src/sys/kern/kern_lock.c.  Actually I had updated
that one source file, expecting to get revision 1.75 (and thus
back out revision 1.74), as recently mentioned by Doug White.  I
just now realized that I ended up with 1.76...  I guess I should
try it one more time with 1.75 instead of 1.76.

One observation which is perhaps interesting:  I also modified
sys/kern/kern_sig.c so that it prints out a message to the console
whenever kill() or killpg1() is called with a SIGABRT.  I tested
that change, and it seems to work correctly with programs calling
kill(SIGABRT), abort(), or raise(SIGABRT).  However, when my
buildworld dies with `make' claiming it saw a Signal 6, these
printf's in kern_sig.c are never triggered.

This failure is "eventually repeatable" for me, in that I can
trigger it within 10 buildworlds.  And it *seems* that it only
happens if I am also running a "folding-at-home" client at the
same time.  That client program is a Linux ELF binary, so maybe
that is significant.  Or maybe it's a red herring.

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu
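
For reference, the user-space side of the test described above (a program that ends itself with SIGABRT via kill(), abort(), or raise(), and a parent that sees "Signal 6") can be sketched roughly as follows. This is only an illustration in Python, not the programs actually used for the test, and it does not exercise the kern_sig.c printf's at all. The names `terminating_signal` and `_die_by`, and the method strings "kill", "abort" and "raise", are made up for this sketch. It assumes a POSIX system and Python 3.8 or later (for `signal.raise_signal`).

```python
import os
import resource
import signal


def _die_by(method):
    """Runs in the child: end the process with SIGABRT in one of three ways.

    The method names are hypothetical labels chosen for this sketch.
    """
    # Assumption: we do not want core files left behind by the test.
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))
    if method == "kill":
        os.kill(os.getpid(), signal.SIGABRT)   # kill(getpid(), SIGABRT)
    elif method == "abort":
        os.abort()                             # abort()
    elif method == "raise":
        signal.raise_signal(signal.SIGABRT)    # raise(SIGABRT)
    # Falling through means the method name was not recognised.


def terminating_signal(method):
    """Fork a child, let it die by `method`, and return the signal number
    the parent observes, or None if the child exited normally."""
    pid = os.fork()
    if pid == 0:
        try:
            _die_by(method)
        finally:
            # Never let the child return into the caller's code.
            os._exit(1)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None


if __name__ == "__main__":
    for how in ("kill", "abort", "raise"):
        print(how, "->", terminating_signal(how))
```

The parent only learns the signal number from the wait status, which is the same information make(1) has when it reports a Signal 6. It cannot tell from that status alone which of the three paths produced the signal.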