From: Don Lewis
To: avg@icyb.net.ua
Cc: stable@FreeBSD.org, sterling@camdensoftware.com, freebsd@jdc.parodius.com
Date: Thu, 30 Sep 2010 01:49:53 -0700 (PDT)
Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime
Message-Id: <201009300849.o8U8nr8r081019@gw.catspoiler.org>
In-Reply-To: <4CA42A0A.6090003@icyb.net.ua>

On 30 Sep, Andriy Gapon wrote:
> on 30/09/2010 02:27 Don Lewis said the following:
> vmstat -i ?

I didn't see anything odd in the vmstat -i output that I posted to the
list earlier.  It looked more or less normal as the ntp offset suddenly
went insane.

>> I did manage to catch the problem with lock profiling enabled:
>>
>> I'm currently testing SMP some more to verify if it really avoids this
>> problem.
>
> OK.

I wasn't able to cause SMP on stable to break.

The silent reboots that I was seeing with WITNESS go away if I add
WITNESS_SKIPSPIN.  Witness doesn't complain about anything.

I tested -CURRENT and !SMP seems to work ok.  One difference in terms of
hardware between the two tests is that I'm using a SATA drive when
testing -STABLE and a SCSI drive when testing -CURRENT.

At this point, I think the biggest clues are going to be in the lock
profile results.