From owner-cvs-src@FreeBSD.ORG  Wed May 23 18:40:56 2007
Return-Path: <owner-cvs-src@FreeBSD.ORG>
X-Original-To: cvs-src@freebsd.org
Delivered-To: cvs-src@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 6F10816A41F;
	Wed, 23 May 2007 18:40:56 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail31.syd.optusnet.com.au (mail31.syd.optusnet.com.au
	[211.29.132.102])
	by mx1.freebsd.org (Postfix) with ESMTP id 07EBD13C46A;
	Wed, 23 May 2007 18:40:55 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from besplex.bde.org (c211-30-216-190.carlnfd3.nsw.optusnet.com.au
	[211.30.216.190])
	by mail31.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	l4NIeoab002997
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Thu, 24 May 2007 04:40:53 +1000
Date: Thu, 24 May 2007 04:40:51 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Robert Watson <rwatson@freebsd.org>
In-Reply-To: <20070523174102.L12345@fledge.watson.org>
Message-ID: <20070524040432.N33950@besplex.bde.org>
References: <200705202211.l4KMBo8a028637@repoman.freebsd.org>
	<20070523174102.L12345@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: cvs-src@freebsd.org, Jeff Roberson <jeff@freebsd.org>,
	src-committers@freebsd.org, cvs-all@freebsd.org
Subject: Re: cvs commit: src/sys/kern kern_clock.c subr_prof.c subr_witness.c
 src/sys/sys systm.h
X-BeenThere: cvs-src@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: CVS commit messages for the src tree <cvs-src.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-src>
List-Post: <mailto:cvs-src@freebsd.org>
List-Help: <mailto:cvs-src-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 23 May 2007 18:40:56 -0000

On Wed, 23 May 2007, Robert Watson wrote:

> On Sun, 20 May 2007, Jeff Roberson wrote:
>
>>  Modified files:
>>    sys/kern             kern_clock.c subr_prof.c subr_witness.c
>>    sys/sys              systm.h
>>  Log:
>>   - Move clock synchronization into a seperate clock lock so the global
>>     scheduler lock is not involved.  sched_lock still protects the 
>> sched_clock
>>     call.  Another patch will remedy this.
>> 
>>  Contributed by: Attilio Rao <attilio@FreeBSD.org>
>>  Tested by:      kris, jeff
>
> I believe that this commit may be causing the 'vga0' hangs being reported on 
> current@.  A brief hand transcription from parallels:
>
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> panic: corrupt spinlock
> KDB enter: panic
> [thread pid 0 tid 0 ]
> Stopped at	kdb_enter+0x32: leave
> db> bt
> Tracing pid 0 tid 0 td 0xc0b36540
> kdb_enter(...) at kdb_enter+0x32
> panic(...) at panic+0xc5
> _mtx_lock_spin_flags(...) at _mtx_lock_spin_flags+0xd4
> statclock(...) at statclock+0x108
> hardclock(...) at hardclock+0x37
> clkintr(...) at clkintr+0xd2
> intr_execute_handlers(...) at intr_execute_handlers+0xe5
> atpic_handle_intr(...) at atpic_handle_intr+0xba
> Xatpic_intr0() at Xatpic_intr0+0x21
> --- interrupt, eip = 0xc0993f2b, esp=0xc1020cb8, ebp = 0xc1020cbc ---
> spinlock_exit(...) at spinlock_exit+0x2b
> _mtx_unlock_spin_flags(...) at _mtx_unlock_spin_flags+0xdc
> atpic_enable_source(...) at atpic_enable_source+0x86
> intr_add_handler(...) at intr_add_handler+0xb3
> cpu_initclockS(...) at cpu_initclocks+0x50
> initclocks(...) at initclocks+0xb
> mi_startup() at mi_startup+0x96
> begin() at begin+0x2c
>
> The intr_execuite_handlers() call takes the same first argument as 
> atpic_enable_source(), so it seems that the first clock tick is happening 
> while the clock device code is still kicking off -- could we be seeing an 
> unitialized spinlock being used, or is the memory getting overwritten?  I 
> think this is mostly turning up for Parallel users due to the way Parallels 
> is doing clock tick timing, but that is speculation.

time_lock is not mtx_init()ed until after cpu_initclocks() returns, but
the above shows it being used before cpu_initclocks() returns.

The above also shows hardclock() calling statclock().  Normally
statclock() is on a separate interrupt, and it is unexpected for it
to be called before that interrupt is set up.  Probably not a problem,
but cpu_initclocks() might need to be aware of it.  The following seems
to be the extent of this problem on i386's:
- hardclock() also calls profclock() (before calling statclock()).  This
   is harmless because profiling can't be enabled this early, so the code
   that uses time_lock in profclock() is not reached this early.
- stathz, profhz and psratio are not initialized (except statically to 0)
   when hardclock() calls statclock().  This doesn't matter because
   statclock() doesn't use them.  The only uninitialized use is where
   hardclock() uses (stathz == 0) as the condition for calling profclock()
   and hardclock().

Initialization in at least the i386 cpu_init_clocks() is delicate but
seems to be safe.  Initialization for clkintr is not complete when the
interrupt fires as above but potential problems from this are handled
by a null pointer check (and probably also by using a dummy timecounter).
Initialization for rtcintr is completed before the interrupt can fire,
except for actually enabling the interrupt.  We disable rtc interrupts
while initializing them, but I think one can fire anyway due to latching.
Then after attaching the interrupt, we enable it.

Bruce