From: Bakul Shah
To: Peter Holm
Cc: Kostik Belousov, Dag-Erling Smørgrav, freebsd-arch@freebsd.org
Subject: Re: stress2 is now in projects
Date: Sun, 18 Jan 2009 12:12:02 -0800
Message-Id: <20090118201202.674665B61@mail.bitblocks.com>
In-reply-to: Your message of "Sun, 18 Jan 2009 15:09:24 +0100."
             <20090118140924.GA27264@x2.osted.lan>
References: <20090118082145.GA18067@x2.osted.lan>
            <86iqocstjm.fsf@ds4.des.no>
            <20090118131028.GA26179@x2.osted.lan>
            <20090118132819.GS48057@deviant.kiev.zoral.com.ua>
            <20090118140924.GA27264@x2.osted.lan>
List-Id: Discussion related to FreeBSD architecture

On Sun, 18 Jan 2009 15:09:24 +0100 Peter Holm wrote:
> On Sun, Jan 18, 2009 at 03:28:19PM +0200, Kostik Belousov wrote:
> > On Sun, Jan 18, 2009 at 02:10:28PM +0100, Peter Holm wrote:
> > > On Sun, Jan 18, 2009 at 01:11:25PM +0100, Dag-Erling Smørgrav wrote:
> > > > Peter Holm writes:
> > > > > The key functionality of this test suite is that it runs a random
> > > > > number of test programs for a random period, in random incarnations
> > > > > and in random sequence.
> > > >
> > > > In other words, it's non-deterministic and non-reproducible.
> > > >
> > >
> > > Yes, by design.
> > >
> > > > You should at the very least allow the user to specify the random seed.
> > > >
> > >
> > > Yes, it would be interesting to see if this is enough to reproduce a
> > > problem in a deterministic way. I'll look into this.
> >
> > I shall state from my experience using it (or, rather, inspecting bug
> > reports generated by stress2), that in fact it is quite repeatable.
> > I.e., when looking into one area, you almost always get _that_ problem,
> > together with 2-3 related issues.
> >
> > Due to the nature of the tests and kernel nondeterministic operations,
> > I think that use of the same random seed gains nothing in regard with
> > repeatability of the tests.
>
> It is an old issue that has come up many times: It would be so great
> if it was possible to somehow record the exact sequence that led up
> to a panic and play it back.
>
> But on the other hand, as you say, it *is* repeatable. The only
> issue is that it may take 5 minutes or 5 hours.
>
> But I'm still game to see if it is possible at all (in single user
> mode with no network activity etc.)

Allowing a user to specify the random seed (and *always* reporting
the random seed of every test) can't hurt, and it may actually gain
you repeatability in some cases. Most bugs are of the garden variety,
not dependent on complex interactions between parallel programs (or
worse, on processor heisenbugs). You can always try repeating a
failing test on a more deterministic setup such as qemu.

One trick I have used in the past is to record "significant" events
in one or more ring buffers using some cheap encoding. You then have
access to the past N events during any post-crash analysis of the
kernel. This has far less overhead than debug printfs, and you can
even leave it enabled in production use.
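
To make the ring buffer idea concrete, here is a minimal userland
sketch in C. It is an illustration only, not code from stress2 or the
FreeBSD kernel: the names ev_ring, ev_rec, ev_record and ev_snapshot,
the 8-byte record layout, and the tiny buffer size are all assumptions
chosen for the example. It is also not safe for concurrent writers; a
real version would probably want one ring per CPU or an atomic index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size; must be a power of two so the index can be masked. */
#define EV_RING_SIZE 8u

/* One event packed into 8 bytes (an assumed "cheap encoding"). */
struct ev_rec {
    uint16_t code;  /* hypothetical event id */
    uint16_t src;   /* hypothetical source, e.g. CPU or test number */
    uint32_t arg;   /* event-specific payload */
};

struct ev_ring {
    uint64_t next;                      /* total events recorded so far */
    struct ev_rec slot[EV_RING_SIZE];   /* the last EV_RING_SIZE events */
};

/* Record one event, overwriting the oldest slot once the ring is full. */
static void
ev_record(struct ev_ring *r, uint16_t code, uint16_t src, uint32_t arg)
{
    struct ev_rec *e = &r->slot[r->next & (EV_RING_SIZE - 1)];

    e->code = code;
    e->src = src;
    e->arg = arg;
    r->next++;
}

/*
 * Copy the surviving events into out[], oldest first, as a post-crash
 * tool might do when reading the ring out of a dump.
 * Returns the number of records copied (at most max).
 */
static size_t
ev_snapshot(const struct ev_ring *r, struct ev_rec *out, size_t max)
{
    assert(r != NULL && out != NULL);

    uint64_t n = r->next < EV_RING_SIZE ? r->next : EV_RING_SIZE;
    uint64_t first = r->next - n;   /* sequence number of oldest survivor */
    size_t i;

    for (i = 0; i < n && i < max; i++)
        out[i] = r->slot[(first + i) & (EV_RING_SIZE - 1)];
    return i;
}
```

Recording costs a few stores and an increment, which is why something
like this can plausibly stay enabled where debug printfs could not.
After a failure, ev_snapshot() (or a debugger script doing the same
walk over the dump) yields the last EV_RING_SIZE events in order.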