From: Ben Kelly
To: Ivan Voras
Cc: freebsd-current@freebsd.org
Date: Mon, 28 Jul 2008 09:49:06 -0400
Subject: Re: Panic of 8-CURRENT in VMWare
Message-Id: <46022669-C9A3-4699-9BBA-E1C583BF3AC4@vadev.org>
References: <20080318124019.O910@desktop>
List-Id: Discussions about the use of FreeBSD-current

On Mar 18, 2008, at 7:07 PM, Ivan Voras wrote:

> Jeff Roberson wrote:
>> On Tue, 18 Mar 2008, Ivan Voras wrote:
>>> Ivan Voras wrote:
>>>> Hi,
>>>> I cannot boot a very recent build (minutes ago) of 8-CURRENT on
>>>> VMWare Server. Panic ("integer divide fault" - is this division
>>>> by zero?) is in sched_rr_interval().
>>>>
>>>> More info here:
>>>> http://ivoras.sharanet.org/stuff/panic/
>>>>
>>>> It might be because I'm trying to run without WITNESS+INVARIANTS.
>>>
>>> No, building a GENERIC kernel doesn't change anything. It's also
>>> not a cvsup glitch - todays sources panic in exactly the same way.
>>>
>>>
>> Can you tell me what the values of:
>> sysctl kern.sched.slice
>> and
>> sysctl kern.clockrate
>> are?
>
> The machine doesn't finish booting the kernel (i.e. init isn't
> executed) and fetching sysctls apparently isn't supported by the
> kernel debugger (though it would be nice if it did work, at least
> for simple variables).
>
> The only old kernel I have is 7.0RC1, and in it I can only access
> kern.clockrate, which is { hz=50, tick=20000, profhz=33, stathz=6 }.
>
> Since you brought up the issue of clocks, I removed the tuning of
> kern.hz (it was present there practically forever) and the panic's
> gone. I use low values for kern.hz in VMWare to (noticably) reduce
> problems with clock drift and context switches, so it would be nice
> to not have the kernel panic with it :)
>
> Apparently lowering kern.hz works upto about 75 - anything lower
> triggers the integer divide fault.

I ran into this problem recently. It appears that sched_slice is set
to zero when realstathz drops below 10 in sched_ule.c:

    sched_slice = (realstathz/10); /* ~100ms */

I was able to work around the problem with the following patch. The
image no longer panics, but I have not done any stress or performance
testing.

Is there a better solution to this problem?

For reference:

 > uname -a
FreeBSD vm7.vadev.org 7.0-STABLE FreeBSD 7.0-STABLE #3 r50:55M: Mon Jul 28 09:27:04 EDT 2008 root@vm7.vadev.org:/usr/obj/usr/src/sys/VMWARE i386

 > sysctl -a | grep kern.clock
kern.clockrate: { hz = 50, tick = 20000, profhz = 33, stathz = 6 }

The patch is against 7-STABLE from 7/24/2008.

Thanks.
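
The truncation described above can be reproduced outside the kernel. What follows is only a minimal standalone sketch in C, not kernel code. The helper names slice_unclamped, slice_clamped and intervals_per_period are hypothetical, and the division in intervals_per_period is an assumed stand-in for whatever sched_rr_interval() does with sched_slice, since that expression is not shown in this message.

```c
#include <assert.h>

/* Same expression as in sched_ule.c: integer division truncates toward zero. */
static int
slice_unclamped(int realstathz)
{
	return (realstathz / 10);
}

/* Same expression followed by the clamp from the patch: never below 1. */
static int
slice_clamped(int realstathz)
{
	int slice = realstathz / 10;

	return (slice ? slice : 1);
}

/*
 * Hypothetical consumer that divides by the slice (an assumption, see
 * the note above).  A zero slice would be an integer divide fault
 * here, so it is rejected up front.
 */
static int
intervals_per_period(int realstathz, int slice)
{
	assert(slice > 0);
	return (realstathz / slice);
}
```

With stathz = 6 as in the kern.clockrate output above, slice_unclamped(6) is 0, so any later division by the slice faults, while slice_clamped(6) is 1 and intervals_per_period(6, 1) is well defined.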

- Ben

Index: src/sys/kern/sched_ule.c
===================================================================
--- src/sys/kern/sched_ule.c	(revision 53)
+++ src/sys/kern/sched_ule.c	(working copy)
@@ -1325,6 +1325,7 @@
 	 */
 	realstathz = hz;
 	sched_slice = (realstathz/10);	/* ~100ms */
+	sched_slice = sched_slice ? sched_slice : 1;
 	tickincr = 1 << SCHED_TICK_SHIFT;
 
 	/* Add thread0's load since it's running. */
@@ -1345,6 +1346,7 @@
 
 	realstathz = stathz ? stathz : hz;
 	sched_slice = (realstathz/10);	/* ~100ms */
+	sched_slice = sched_slice ? sched_slice : 1;
 
 	/*
 	 * tickincr is shifted out by 10 to avoid rounding errors due to