From: Shane Ambler
Date: Thu, 27 Nov 2014 03:08:12 +1030
To: Natacha Porté, FreeBSD stable
Subject: Re: Need help with unexpected reboot in 10.1-RELEASE
Message-ID: <547601F4.2040005@ShaneWare.Biz>
In-Reply-To: <20141126082923.GA87180@nat.rebma.instinctive.eu>
List-Id: Production branch of FreeBSD source code

On 26/11/2014 18:59, Natacha Porté wrote:
> Hello,
>
> last week, I updated my main personal computer from 9.2-RELEASE to
> 10.1-RELEASE. Since then, I experienced four sudden and unexpected
> reboots (is that what is called "crashes"?). They were immediate, so
> it's not a kernel panic (which keeps the system unusable for 15s before
> rebooting). Nothing appears in the logs, but who knows what could be in
> the uncommitted buffers?

I haven't had reboots, but my machine has hung, forcing me to reset it
nearly every day. When it doesn't hang, the USB system fails to create
new devices, forcing me to restart to access a disk.

> I run with a ZFS root, and the zpool is directly on the unsliced disk.
> I have a nVidia graphics card, with the proprietary driver, on two
> screens with two displays (":0" and ":0.1") and two window managers.
> It's an amd64 platform.

How much RAM? Is there only one disk in the zpool?

> I doubt this is a purely hardware issue, since I generally choose my
> hardware for its reliability, and I regularly reached three-digit days
> of uptime with 9.2-RELEASE.

I used to install updates and restart monthly on 9.2.

> I did take a snapshot of my 9.2-RELEASE, so I'm one zfs rollback away
> from checking whether it still happens with 9.2-RELEASE. However, if as
> is likely it does work around the problem, I will probably have a hard
> time motivating myself to come back to the problem, rather than just
> waiting for the next release to see whether it has been magically solved
> without me.

CAUTION - If you performed a zpool upgrade after upgrading to 10.1, then
you can't read the zpool in 9.x, so the rollback will fail. The way back
will involve creating a new pool and transferring the data.

> So before rolling back to 9.2-RELEASE, I would like to try diagnosing
> the problem as much as possible, as one of the few ways a humble user
> like me can contribute.
>
> Is there anything I can do to gather more information about the problem?
> Would you need extra information about my setup first (lspci, dmidecode,
> ports options, etc)?

I can't help with a solution. One thing I noticed on my machine is the
high allocation of wired memory: in RC3 I got 7G wired out of 8G
installed at the time of having to reset. 10.1-RELEASE appears to be
more resilient in this regard, but I wonder if filling RAM is a factor.
I just had top running in a terminal window to watch the wired value.
Two simultaneous writes to disk appear to accelerate this for me.

I reported my issue at
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194654
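
If it helps to keep a record of the wired value instead of watching top,
something like the following could be used. This is only a rough sketch,
not something taken from the bug report: it assumes Python 3.7 or later
is installed, and that the wired page count and page size are available
from the vm.stats.vm.v_wire_count and hw.pagesize sysctls. The function
names and the 10 second interval are made up for illustration.

```python
import subprocess
import time


def parse_sysctl_value(output: str) -> int:
    """Turn the single-integer output of `sysctl -n <name>` into an int."""
    return int(output.strip())


def wired_bytes(wire_pages: int, page_size: int) -> int:
    """Wired memory in bytes: number of wired pages times the page size."""
    return wire_pages * page_size


def format_gib(num_bytes: int) -> str:
    """Render a byte count as GiB with two decimals."""
    return f"{num_bytes / 2**30:.2f} GiB"


def read_sysctl(name: str) -> int:
    """Read one integer sysctl by calling the sysctl(8) utility."""
    out = subprocess.check_output(["sysctl", "-n", name], text=True)
    return parse_sysctl_value(out)


def sample_wired() -> int:
    """Take one sample of wired memory, in bytes (assumed sysctl names)."""
    pages = read_sysctl("vm.stats.vm.v_wire_count")
    page_size = read_sysctl("hw.pagesize")
    return wired_bytes(pages, page_size)


def main(interval_seconds: int = 10) -> None:
    """Print a timestamped wired-memory sample until interrupted."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        # flush so each sample leaves the process right away
        print(f"{stamp} wired={format_gib(sample_wired())}", flush=True)
        time.sleep(interval_seconds)


if __name__ == "__main__":
    main()
```

Running it from an ssh session on another machine may leave the last few
samples visible after a hang or reboot, since output written to the
local disk might not be committed in time.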