Date: Sat, 10 Jul 2004 15:30:35 +0800
From: Ariff Abdullah
To: noackjr@alumni.rice.edu
Cc: freebsd-current@freebsd.org, rwatson@freebsd.org
Subject: Re: Native preemption is the culprit [was Re: today's CURRENT lockups]
Message-Id: <20040710153035.1a525507.skywizard@MyBSD.org.my>
In-Reply-To: <40EF96E7.3090608@alumni.rice.edu>
References: <20040710150620.7595b207.skywizard@MyBSD.org.my> <40EF96E7.3090608@alumni.rice.edu>
Organization: MyBSD
List-Id: Discussions about the use of FreeBSD-current

On Sat, 10 Jul 2004 02:12:39 -0500
Jon Noack wrote:

> On 07/10/04 02:06, Ariff Abdullah wrote:
> > On Sat, 10 Jul 2004 01:18:06 -0400 (EDT)
> > Robert Watson wrote:
> >> FYI, UP+SCHED_ULE with PREEMPTION hung within three seconds of
> >> starting the benchmark. Without PREEMPTION it seems to run fine.
> >>
> >> So it looks like either PREEMPTION has a problem, or it's
> >> triggering an existing problem elsewhere. If it's only one
> >> problem, it seems not to depend on either SMP/UP or the scheduler
> >> choice. If it's multiple problems, who knows :-). As the MySQL
> >> test relies on threading, we could be looking at an edge case
> >> involving threading and scheduling/preemption-- the other reports
> >> I've seen mention X11/KDE, which would also involve threading. On
> >> the other hand, it could just be load. Tomorrow I'll load up a
> >> box with non-threaded apps and see what happens.
> >
> > I'm suspecting bad combination between threaded apps and current
> > native preemption, either the preemption itself, or threads.
> > Running current kernel without any threaded apps turns up nothing
> > suspicious. Once the threaded apps started, it's like sending the
> > entire system to the death row.
> >
> > I'm reverting following files to pre-July 2 to achieve solid
> > stability:
> >
> > sys/sys/interrupt.h - v1.27
> > sys/kern/kern_intr.c - v1.110
> > sys/i386/i386/intr_machdep.c - v1.6
> > sys/kern/sched_ule.c - v1.109
>
> Note that I haven't run across any issues after just reverting
> sys/kern/sched_ule.c to rev. 1.113. The same workload
> (X11/KDE/etc.) that crashes native preemption quite quickly has been
> running solidly for over 14 hours now.
>

rev. 1.113 causes an annoying latency issue (as stated in the commit
log). Buildworld + xmms just ain't fun anymore. I think the main
culprit is within ithread_schedule() itself, as it is common to both
ULE/4BSD.

-- 
Ariff Abdullah
MyBSD

http://www.MyBSD.org.my (IPv6/IPv4)
http://staff.MyBSD.org.my (IPv6/IPv4)
http://tomoyo.MyBSD.org.my (IPv6/IPv4)