From owner-freebsd-fs@FreeBSD.ORG Tue Oct 31 14:03:32 2006
Date: Tue, 31 Oct 2006 16:03:21 +0200
From: "Vlad Galu"
To: freebsd-stable@freebsd.org, freebsd-fs@freebsd.org
In-Reply-To: <200610010015.k910F6Ba001594@cwsys.cwsent.com>
References: <200610010015.k910F6Ba001594@cwsys.cwsent.com>
Subject: Re: Frequent VFS crashes with RELENG_6
List-Id: Filesystems

On 10/1/06, Cy Schubert wrote:
> In message ,
> "Vlad GALU" writes:
> > On 9/30/06, Martin Blapp wrote:
> > >
> > > Hi,
> > >
> > > 1.) Bad ram ? Have you run some memory tester ?
> >
> > Yes, memtest86 didn't show anything weird.
> >
> > > 2.) Have you background fsck running on this disk ? If
> > > so try to boot into single user and do a full fsck on this
> > > disk.
> > >
> >
> > I have background_fsck="NO" in rc.conf and I checked the whole disk
> > several times.
> > Something I forgot to mention earlier: the crash is easier to
> > reproduce when running rtorrent. The machine did crash without running
> > it as well, but far more seldom.
>
> I've been experiencing the same problem as well. I discovered that the
> disk on which the filesystem was had some bad sectors causing dump -0Lauf
> to fail while taking snapshot causing the system to panic. Running
> smartctl on the device indicated that there were bad sectors 40% within
> the surface scan being performed by SMART. The drive, an 80 GB Maxtor, was
> replaced with a 250 GB Western Digital (for a very good price, so good a
> price I purchased two of them). It was 906 days old, having only been
> powered off maybe a dozen times over the last three years.

During the last 2 weeks I ran the same system with WITNESS turned on.
Since this machine's job is not I/O dependent, I was able to run bonnie++
and iozone every second day for the whole 24 hours. At the same time I ran
several instances of rtorrent. This morning I rebooted to a non-WITNESS
kernel (the same sources from 2 weeks ago) and the exact same crash
occurred within a few hours of bootup. In all this time, smartd didn't
report anything suspicious. WITNESS only reported a LOR related to kqueue
that is already known.

Any ideas for further stress testing would be welcome. I am familiar with
a few parts of the kernel, but VFS is a total stranger to me.

--
If it's there, and you can see it, it's real.
If it's not there, and you can see it, it's virtual.
If it's there, and you can't see it, it's transparent.
If it's not there, and you can't see it, you erased it.