Date:      Tue, 31 Oct 2006 08:30:02 -0600
From:      Eric Anderson <anderson@centtech.com>
To:        Vlad Galu <dudu@dudu.ro>
Cc:        freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: Frequent VFS crashes with RELENG_6
Message-ID:  <45475DEA.2030506@centtech.com>
In-Reply-To: <ad79ad6b0610310603t7f00cd0ejdb3e4082466cd8a3@mail.gmail.com>
References:  <dudu@dudu.ro>	<ad79ad6b0609300901q4215c809ye28fd861007494da@mail.gmail.com>	<200610010015.k910F6Ba001594@cwsys.cwsent.com> <ad79ad6b0610310603t7f00cd0ejdb3e4082466cd8a3@mail.gmail.com>

On 10/31/06 08:03, Vlad Galu wrote:
> On 10/1/06, Cy Schubert <Cy.Schubert@spqr.komquats.com> wrote:
>> In message <ad79ad6b0609300901q4215c809ye28fd861007494da@mail.gmail.com>,
>> "Vlad GALU" writes:
>>> On 9/30/06, Martin Blapp <mb@imp.ch> wrote:
>>>> Hi,
>>>>
>>>> 1.) Bad RAM? Have you run a memory tester?
>>>    Yes, memtest86 didn't show anything weird.
>>>
>>>> 2.) Is background fsck running on this disk? If so,
>>>> try booting into single-user mode and running a full
>>>> fsck on this disk.
>>>>
>>>    I have background_fsck="NO" in rc.conf and I checked the whole disk
>>> several times.
>>>    Something I forgot to mention earlier: the crash is easier to
>>> reproduce when running rtorrent. The machine did crash without running
>>> it as well, but far less often.
>> I've been experiencing the same problem as well. I discovered that the
>> disk holding the filesystem had some bad sectors, which caused
>> dump -0Lauf to fail while taking a snapshot, which in turn caused the
>> system to panic. Running smartctl on the device showed bad sectors about
>> 40% of the way into the SMART surface scan. The drive, an 80 GB Maxtor,
>> was replaced with a 250 GB Western Digital (at such a good price that I
>> purchased two of them). It was 906 days old and had been powered off
>> maybe a dozen times over the last three years.
> 
>      During the last 2 weeks I ran the same system with WITNESS turned
> on. Because this machine's purpose is not I/O dependent, I was able to
> run bonnie++ and iozone every second day for a full 24 hours. At the
> same time I ran several instances of rtorrent. This morning I rebooted
> to a non-WITNESS kernel (the same sources from 2 weeks ago) and the
> exact same crash occurred within a few hours of bootup. In all this
> time, smartd didn't report anything suspicious. WITNESS only reported
> a LOR related to kqueue that is already known.
>      Any ideas for further stress testing would be welcome. I am
> familiar with a few parts of the kernel, but VFS is a total stranger
> to me.
> 
> 


Did you get a crash dump?  If not, you might want to start by adding
all the debugger options to the kernel.
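
As a rough sketch of what that could look like (assuming a custom kernel
config on RELENG_6; the option names below are the ones listed in
sys/conf/NOTES, so check the copy in your own tree before relying on
them), something along these lines:

    # Debugger framework and backends
    options     KDB                 # kernel debugger framework
    options     DDB                 # interactive debugger on the console
    options     GDB                 # remote gdb backend
    makeoptions DEBUG=-g            # build the kernel with debug symbols

    # Consistency and lock checking (expect a noticeable slowdown)
    options     INVARIANTS
    options     INVARIANT_SUPPORT   # required by INVARIANTS
    options     WITNESS
    options     WITNESS_SKIPSPIN
    options     DEBUG_LOCKS
    options     DEBUG_VFS_LOCKS     # vnode lock assertions

To actually capture a dump at the next panic, /etc/rc.conf also needs a
dump device. The values here are assumptions for illustration; the dump
device has to be a swap partition large enough to hold the dump:

    dumpdev="AUTO"
    dumpdir="/var/crash"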


Eric



-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?45475DEA.2030506>