From owner-svn-src-head@FreeBSD.ORG Mon Jun 15 22:16:08 2009 Return-Path: Delivered-To: svn-src-head@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C4EC2106566B; Mon, 15 Jun 2009 22:16:08 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by mx1.freebsd.org (Postfix) with ESMTP id 598B98FC17; Mon, 15 Jun 2009 22:16:07 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c122-106-159-184.carlnfd1.nsw.optusnet.com.au (c122-106-159-184.carlnfd1.nsw.optusnet.com.au [122.106.159.184]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n5FMG3XV022515 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 16 Jun 2009 08:16:05 +1000 Date: Tue, 16 Jun 2009 08:16:03 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Bruce Evans In-Reply-To: <20090615153040.R1080@besplex.bde.org> Message-ID: <20090616070732.Q25544@delplex.bde.org> References: <200906141801.n5EI1Zti056239@svn.freebsd.org> <4A356A0F.3050800@freebsd.org> <20090615075134.K24645@delplex.bde.org> <4A359AA6.7010101@freebsd.org> <20090615114142.B775@besplex.bde.org> <20090615153040.R1080@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: src-committers@FreeBSD.org, jhb@FreeBSD.org, svn-src-all@FreeBSD.org, Ed Schouten , Sam Leffler , svn-src-head@FreeBSD.org Subject: Re: svn commit: r194204 - in head/sys: amd64/conf i386/conf X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Jun 2009 22:16:09 -0000 On Mon, 15 Jun 2009, Bruce Evans wrote: > ... > This version for RELENG_7 seems to work as intended. jhb should look at > its atomic ops. No reply yet. I thought of some problems with this version. Mainly with the delaying code: - Some versions of DELAY() need to lock the hardware, so calling DELAY() can deadlock. E.g., on amd64 before the TSC is initialized, and on i386 with no TSC and/or before the TSC is initialized, and when kdb is not active on both, DELAY() calls getit(), and getit() locks the clock hardware unconditionally using a non-recursive spin mutex. Contrary to what I said in previous mail, detection of erroneous recursion isn't broken in the usual case. The usual case is probably INVARIANTS, and then recursion is detected. The handling of erroneous version then causes endless recursion on printf(): it is a failing KASSERT() which will call panic(), which will call printf(), which will reach the failing KASSERT() again. The non-recursive spinlock in cnputs() has the same bug (deadlock --> recursive deadlock). This problem in DELAY() is well known, so it is worked around when kdb is active by not calling getit() then. Nearby bugs in DELAY(): if DELAY() is first called after the TSC is initialized, then its debugging code is never reached. Its debugging code is a non-NOTEd non-option and could have been removed after the getit() version of DELAY() was verified to give reasonably accurate timing, but it is more useful for the TSC version since the TSC version has not been verified to give reasonably accurate timing. The TSC version must fail any reasonable verifiications except probably for P-state invariant TSCs since the TSC varies and DELAY() makes no attempt to compensate for its variance). If DELAY() is first called before the TSC is initialized, then the debugging code still works for the i8254 but its placement is confusing, and when the counter is changed to the TSC there is no code to debug the change. - timecounters are no better than DELAY() for implementing the delaying, since apart from them possibly not working on cold and/or deadlocked systems, although the upper-level timecounter code is lock-free, the timecounter hardware code might need to use a lock. Again, the i8254 timecounter hardware code uses the clock spinlock. - KTR uses plain printf(), and KTR can produce a lot of output, so the delay should be as short as possible, as for mcount_trylock(), and 1ms is too long. Recursion is a relatively unimportant part of the problem here. Too-long delays are possible in normal operation, when one CPU is in a normal printf() and other CPUs want to do KTR printfs. Serialization of the printf()s is especially important for voluminous concurrent KTR output, but so is printing such output fast. jhb should look at this too. I use KTR less than once a year. > % Index: subr_prf.c > % =================================================================== > % RCS file: /home/ncvs/src/sys/kern/subr_prf.c,v > % retrieving revision 1.130.2.1 > % diff -u -2 -r1.130.2.1 subr_prf.c > % --- subr_prf.c 21 Jan 2009 00:26:45 -0000 1.130.2.1 > % +++ subr_prf.c 15 Jun 2009 05:32:03 -0000 > % @@ -112,4 +112,27 @@ > % &always_console_output, 0, "Always output to console despite > TIOCCONS."); > % % +static int printf_lockcount; > % + > % +static void > % +printf_lock(void) > % +{ > % + int timeout; > % + > % + timeout = 1000; > % + do { > % + if (atomic_cmpset_acq_int(&printf_lockcount, 0, 1)) > % + return; > % + DELAY(1000); > % + } while (--timeout != 0); > % + atomic_add_acq_int(&printf_lockcount, 1); > % +} If the DELAY() is removed, the initial value of `timeout' would need to be (possibly dynamically) calibrated. The timeouts for the panics for spinlocks and threadlocks in kern_mutex.c have similar problems and could benefit from calibration. First they do up to 10 million cpu_spinwait()s. 10 million might be too small or too large. Then they do up to 60 million DELAY(1)s. DELAY() can deadlock as above. 60 million is supposed to give a delay of 60 seconds, but short delays can be very inaccurate (the minimum DELAY() is about 5 us with i8254 hardware on fast CPUs and about 30 us with i8254 hardware on 1990's CPUs), so the timeouts can be more like 5 minutes than 1 minute. A non-dynamically calibrated loop using uncached memory or device accesses has a better chance of working accurately than the non-dynamically calibrated TSC loop in the amd64 and i8254 DELAY()s, since throttling of the TSC is more common than throttling of memory systems. Bruce