From: Julian Elischer
Date: Sat, 14 Aug 2004 19:30:16 -0700
Message-ID: <411ECAB8.9000107@elischer.org>
To: Robert Watson
Cc: Martin Blapp
Cc: freebsd-current@freebsd.org
Subject: Re: Deadlocks with recent SMP current

Robert Watson wrote:
> On Sat, 14 Aug 2004, Jon Noack wrote:
>
>> Here's a data point: My dual Pentium3 system has been up for 20+ hours
>> with this patch. Previously, it wouldn't survive for more than an hour
>> or so (regardless of load).
>
> Unfortunately, I'm running a box with the same patch and did get a hang.
> The patch appears to correct some known stability issues associated with
> threaded processes, but the build I was using to trigger the hang doesn't
> use threads, so...

Note: this is understandable. The patch NARROWS a window; it does not close it.
The more other processes there are on the system, the more likely it is that
the hang will still occur.

The problem is that the critical section holds off the preemption until the
thread has "PROBABLY" got the KSE back, but if it doesn't get it back, then
the held-off preemption still causes the problem. We need to somehow alter
the patch so that the critical section is held across the cpu_switch.

One possible fix is to make the preemption do nothing if
(td->td_kse->ke_thread != td) (where td == curthread).

> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert@fledge.watson.org      Principal Research Scientist, McAfee Research
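
The guard suggested in the reply, skipping preemption when (td->td_kse->ke_thread != td), might look roughly like the sketch below. This is only a user-space illustration, not kernel code: the two structs are hypothetical stand-ins that model just the td_kse and ke_thread fields named in the message, and the function name preemption_allowed() is invented for the example. Treating a thread with no KSE as "not owning it" is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, stripped-down stand-ins for the kernel structures.
 * Only the fields named in the message are modeled.
 */
struct thread;

struct kse {
    struct thread *ke_thread;   /* thread currently bound to this KSE */
};

struct thread {
    struct kse *td_kse;         /* KSE this thread expects to run on */
};

/*
 * Return 1 if preemption may proceed for td (standing in for curthread),
 * or 0 if it should do nothing because td does not currently own its KSE.
 * The name preemption_allowed() is an assumption made for this sketch.
 */
int
preemption_allowed(const struct thread *td)
{
    assert(td != NULL);

    /* Assumption: a thread with no KSE at all is treated as not owning one. */
    if (td->td_kse == NULL)
        return 0;

    /* The condition from the message: the KSE is bound to another thread. */
    if (td->td_kse->ke_thread != td)
        return 0;

    return 1;
}
```

In this model a preemption path would call preemption_allowed(curthread) first and simply return when it yields 0, so a thread that has not yet got its KSE back is never switched away by the held-off preemption.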