From: "Ellis H. Wilson III"
Date: Tue, 18 Nov 2014 20:35:49 -0500
To: Benjamin Kaduk
Cc: freebsd-current@freebsd.org
Subject: Re: WITNESS observes 2 LORs on Boot of Release 10.1
Message-ID: <546BF3F5.8030109@panasas.com>
References: <546BA9D3.6070007@panasas.com>

On 11/18/2014 05:39 PM, Benjamin Kaduk wrote:
> On Tue, 18 Nov 2014, Ellis H. Wilson III wrote:
>> I'm observing the following two WITNESS LORs being thrown upon boot-up of 10.0
>> and I was tracking current, hoping they would go away by 10.1, but it seems
>> they persist as shown below. I suspect this is because current is being built
>> with WITNESS on but also with SKIPSPIN on. So these issues are unlikely to
>> show up for any devs but those who specifically enable WITNESS and disable
>> SKIPSPIN like myself. At my work we would greatly like to do our debugging
>> with checking of spin-locking order included and panicing upon LOR detection.
>> That's not possible with these in existence.
>
> However, I was under the impression that a kernel built with WITNESS and
> without WITNESS_SKIPSPIN would panic on boot on the cnputs_mutx (see,
> e.g.,
> https://lists.freebsd.org/pipermail/freebsd-stable/2014-January/076864.html).
> So, (1) I'm surprised you can boot it, and (2) that would explain why no
> one else has been using it.

That's a very interesting thread. I've seen another where a fellow
developer suggested just turning on the WITNESS_SKIPSPIN flag to "solve"
the issue. I can't say that I agree with the approach, but I understand
it. I'd be willing to do a bit of work on WITNESS so that it can be told
about known false positives, if that's desirable.

Why I'm able to boot, however, is simple: I haven't enabled the full
suite of debugging options. KDB and DDB are the key ones that cause a
panic on a failure like the one I've seen. We were originally seeing the
panic, so WITNESS was shut off entirely. I was asked to try to get it
back on track, so to start I at least wanted to see how many LORs we
were dealing with on boot. Two is apparently the magic number, and maybe
three if the cnputs issue you refer to can still be hit in 10.1.

If nobody has seen these before, I'll try to put together fixes for them.
Somebody please speak up if you have seen them or have useful
information for me to go on in my patches.

Thanks,

ellis
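
P.S. For anyone who wants to reproduce this, below is a rough sketch of
the kind of kernel configuration described above: WITNESS on, spin
mutexes not skipped, and the debugger options left out. The option names
are the stock ones from sys/conf/NOTES, but the exact combination is an
assumption for illustration only, not a copy of our actual config.

```
# Sketch only: assumed fragment, adjust for your own kernel config.
options         WITNESS             # enable lock order verification
nooptions       WITNESS_SKIPSPIN    # only needed if an included config
                                    # turns it on; spin mutexes are then
                                    # checked as well
#options        KDB                 # left out here, as described above
#options        DDB                 # left out here, as described above
#options        WITNESS_KDB         # would enter the debugger on a
                                    # WITNESS violation
```

With KDB and DDB added back in, I'd expect the boot-time LORs to turn
into the panic we were originally seeing.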