From: Jeremy Chadwick
To: David Naylor
Cc: freebsd-stable@freebsd.org
Date: Thu, 8 Nov 2007 13:29:21 -0800
Subject: Re: Harddisk failure causes system crash, please help
Message-ID: <20071108212921.GA34721@eos.sc1.parodius.com>

On Thu, Nov 08, 2007 at 10:40:49PM +0200, David Naylor wrote:
> I have been using this laptop for a few months now with FreeBSD without any
> problems with the hard disk however today as I installed editors/vim the
> system crashed (without a core dump or any message).
>
> Whenever the system boots (and proceeds to do a fsck on ad0e (/usr)) it
> also crashes without any message.
> I have tried the following commands:
>
> # dd if=/dev/ad0 of=/dev/null bs=1M ( System crashes)
>
> # smartctl -C -t short ( Succeeds )
> # smartctl -C -t long ( Fails with a message: ad0: FAILED - SMART timed out)

Sounds like something mechanical inside of the disk is failing, or
possibly the drive firmware is somewhat buggy when it comes to handling
bad blocks.

What brand/model of hard disk is this?  atacontrol output would suffice.
I'm just curious (personal interest).

> I have no idea what is wrong (if the disk has corrupted should the kernel
> not display error messages?).  Can you please help/advise?

Not necessarily, although I would expect to see a bus timeout of some
kind, but it doesn't surprise me that you don't see one.  If a long
SMART test results in the drive timing out and falling off the bus,
there's a much bigger problem at hand.

There is a possibility that the system is simply going bad in some way
(RAM issues or a mainboard that's broken somehow), but all your problems
seem to indicate issues with the disk.

If I were in your shoes, I would try to get all the data off that disk,
purchase a replacement, install FreeBSD on it, and restore your data.
I'd then take the old/possibly-bad disk and download one of the drive
fitness test utilities from the manufacturer's website.  Run that and
see if anything comes up / if anything bad happens.

Laptop hard disks are sometimes a pain to deal with (some laptop
manufacturers have BIOS tweakery where they refuse to recognise any hard
disk other than ones of a specific brand/model; I haven't seen this in
recent years, but it's something I've seen in the past), so I wish you
luck.  Laptops -- such a pain.

-- 
| Jeremy Chadwick                                  jdc at parodius.com |
| Parodius Networking                         http://www.parodius.com/ |
| UNIX Systems Administrator                    Mountain View, CA, USA |
| Making life hard for others since 1977.                PGP: 4BD6C0CB |
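
For the "get all the data off that disk" step, here is a minimal sketch
of one way to image a suspect disk with dd while continuing past
unreadable blocks.  It is not from the thread: the function name
rescue_image, the 64k default block size, and the paths in the usage
comment are assumptions chosen for illustration.  Since a plain dd read
of ad0 reportedly crashed the machine, this may not get far on that
particular disk; copying the most important files first is probably the
safer order.

```shell
#!/bin/sh
# Sketch only: copy a possibly failing disk to an image file without
# stopping at the first read error.

# rescue_image SRC DST [BLOCKSIZE]
#   conv=noerror : keep going after a read error instead of aborting
#   conv=sync    : pad short or failed reads with zeros up to BLOCKSIZE,
#                  so offsets in the image line up with offsets on disk
rescue_image() {
    src=$1
    dst=$2
    bs=${3:-64k}    # assumed default; smaller blocks lose less per error
    dd if="$src" of="$dst" bs="$bs" conv=noerror,sync
}

# Example (hypothetical target path on a different, healthy disk):
#   rescue_image /dev/ad0 /mnt/backup/ad0.img
```

Because of conv=sync, the image can end with zero padding up to a whole
block, so it may be slightly larger than the source.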