Date: Tue, 18 Mar 2014 03:39:04 -0700 (PDT)
From: Anton Shterenlikht
Reply-To: mexas@bris.ac.uk
Message-Id: <201403181039.s2IAd20h030639@mech-cluster241.men.bris.ac.uk>
In-Reply-To: <1394134260.15679.91454577.172E5E29@webmail.messagingengine.com>
To: freebsd-current@freebsd.org, freebsd-sparc64@freebsd.org
Subject: Re: reproducible panic every day at 03:02, probably triggered by daily periodic scipts - help

I've spent a lot of time on this. At some point I started suspecting
disk failures, based on dd errors and smartmontools reports. So I
replaced the disks, and then replaced the whole box with another
nominally identical SunBlade 1500. The panics persisted. I now think
that the multiple cold reboots might have damaged the disks, not the
other way round.

In the end I had to conclude that this is not a hardware problem but
an OS issue, perhaps triggered by heavy disk I/O.

Various frequent panics occur at least from r260689 to r263096:

r260689: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/186760
r260914: http://www.freebsd.org/cgi/query-pr.cgi?pr=187219
r261798: http://www.freebsd.org/cgi/query-pr.cgi?pr=187080
r263096: http://www.freebsd.org/cgi/query-pr.cgi?pr=187527

I have now reverted as far back as r258000, and the system seems
stable, but I probably need a few more days to be sure.

If the system is indeed stable at r258000, I'll try to narrow down
the problem revision when I have the time. But I'd appreciate any
hint that might save time.

Anton
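
Narrowing down the problem revision between r258000 (apparently stable) and r260689 (earliest revision with a reported panic) is naturally a bisection. The following is only a minimal sketch of that search, not a tested procedure: `first_bad_revision` and `is_stable` are hypothetical names, and `is_stable` stands for the manual step of building that revision and running it long enough (for example through a few 03:02 periodic runs) to decide whether it panics. It assumes the bug, once introduced, is present in every later revision.

```python
def first_bad_revision(good, bad, is_stable):
    """Bisect the revision range (good, bad] for the first panicking revision.

    Assumptions (hypothetical, for illustration only):
      - `good` is known to be stable and `bad` is known to panic;
      - every revision after the culprit also panics;
      - `is_stable(rev)` reports whether the system built at `rev` is stable.

    Returns a tuple (first_bad, steps), where `steps` is the number of
    revisions that had to be built and tested.
    """
    if good >= bad:
        raise ValueError("good revision must be lower than bad revision")
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2      # revision halfway between the two bounds
        if is_stable(mid):
            good = mid               # culprit is after mid
        else:
            bad = mid                # culprit is mid or earlier
        steps += 1
    return bad, steps


if __name__ == "__main__":
    # Stand-in predicate with a made-up culprit, only to exercise the search.
    # In practice each call means a rebuild followed by days of observation.
    hypothetical_culprit = 259500
    rev, steps = first_bad_revision(
        258000, 260689, lambda r: r < hypothetical_culprit
    )
    print(f"first bad revision: r{rev}, found in {steps} test builds")
```

With 2689 revisions between r258000 and r260689, this needs at most 12 test builds. If each one takes a few days of observation to confirm, the full search is a matter of weeks, which is why a hint about likely culprits would save real time.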