Date:      Tue, 30 Jan 2007 01:02:45 -0800 (PST)
From:      "R. B. Riddick" <arne_woerner@yahoo.com>
To:        Oliver Fromme <olli@lurza.secnetix.de>, freebsd-geom@FreeBSD.ORG
Subject:   Re: gmirror or ata problem
Message-ID:  <20070130090246.42397.qmail@web30315.mail.mud.yahoo.com>
In-Reply-To: <200701300851.l0U8pEkO005250@lurza.secnetix.de>

Hi!

--- Oliver Fromme <olli@lurza.secnetix.de> wrote:
> This is strange.  gmirror just detached one of its disks
> for no apparent reason.  I've built a mirror consisting of
> the components ad0 and ad1 (both SATA drives).  It has
> been running fine.  This is RELENG_6 from 2006-12-20.
> 
> Yesterday evening ad1 was detached.  There is no other
> error message logged on console or in the logs (i.e. no
> I/O error such as a bad sector or anything).  There was
> no particularly high load at that time.  In fact, the
> machine had been under much higher load before, without
> anything bad happening.
> 
> This is from the logs:
> 
> Jan 29 19:10:13 pluto -- MARK --
> Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> Jan 29 19:20:26 pluto kernel: subdisk1: detached
> Jan 29 19:20:26 pluto kernel: ad1: detached
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> Jan 29 19:50:13 pluto -- MARK --
>
My theory is:
1. Your ad1 disk went to bed (spun down) like others in your time zone...
2. Then gmirror tried to write metadata, which woke up the disk.
3. BUT: The disk was too slow to wake up, so ata_disk.c decided to detach
the disk without another try.
4. Then gmirror complained about its inability to write metadata.
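
To make the theory above concrete, here is a toy model of it in Python.
It is only a sketch and not the real ata(4) or gmirror code: the class
SimDisk, the function metadata_update and the 8 second spin-up time are
made up for illustration. The only facts taken from the logs are that the
disk got detached and that error=6 is ENXIO.

```python
import errno


class SimDisk:
    """Toy model of a disk that may be spun down (not real ata(4) code)."""

    def __init__(self, spinup_s):
        self.spinup_s = spinup_s  # hypothetical spin-up time in seconds
        self.asleep = True        # step 1: the disk went to bed
        self.attached = True

    def write(self, timeout_s):
        """Return 0 on success or an errno value on failure."""
        if not self.attached:
            return errno.ENXIO    # device is gone
        if self.asleep:
            if self.spinup_s > timeout_s:
                # step 3: too slow, detached without another try
                self.attached = False
                return errno.ENXIO
            self.asleep = False   # woke up in time
        return 0


def metadata_update(disk, timeout_s):
    """Steps 2 and 4: a metadata write and the resulting complaint."""
    err = disk.write(timeout_s)
    if err:
        return "Cannot update metadata on disk (error=%d)." % err
    return "metadata written"


if __name__ == "__main__":
    # Same hypothetical disk, once with a short and once with a long timeout.
    print(metadata_update(SimDisk(spinup_s=8), timeout_s=5))
    print(metadata_update(SimDisk(spinup_s=8), timeout_s=15))
```

With the short timeout the simulated disk is detached and every later
write fails, too. With the longer timeout it simply wakes up and the
write goes through.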

Remember: gmirror writes metadata from time to time, because it likes to
mark the mirror clean/dirty depending on the write requests...
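
A rough sketch of that clean/dirty bookkeeping, again in Python and again
only a model: SimMirror, its methods and the idle threshold are
hypothetical names and numbers, not gmirror's real implementation. It
just shows why a mostly idle mirror still produces metadata writes that
can hit a sleeping disk.

```python
class SimMirror:
    """Toy model of clean/dirty marking (not the real gmirror code)."""

    def __init__(self, idle_s):
        self.idle_s = idle_s       # hypothetical idle threshold in seconds
        self.dirty = False
        self.last_write = None
        self.metadata_writes = 0   # how often metadata went to the disks

    def write(self, now):
        """A normal write request arriving at time `now` (seconds)."""
        if not self.dirty:
            self.dirty = True
            self.metadata_writes += 1  # mark the mirror dirty
        self.last_write = now

    def tick(self, now):
        """Periodic check: mark the mirror clean after an idle period."""
        if self.dirty and now - self.last_write >= self.idle_s:
            self.dirty = False
            self.metadata_writes += 1  # mark the mirror clean again


if __name__ == "__main__":
    m = SimMirror(idle_s=5)
    m.write(0)     # first write: mirror becomes dirty
    m.write(1)     # already dirty: no extra metadata write
    m.tick(3)      # not idle long enough yet
    m.tick(10)     # idle: mirror becomes clean
    m.write(100)   # a single late write makes it dirty again
    print(m.metadata_writes, m.dirty)
```

So even a single write after a long quiet period causes a metadata
update on every component, including one that is spun down.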

Remark: I think etc@fluffles.net reported that some weeks ago...

> This almost looks like typical Windows problems:  Something
> reports a "failure", but no reason or any other useful
> information.  :-(
>
Ooch... That was mean... :-)

> "atacontrol list" reports for ad1::
> 
>     Master:      no device present
>
This looks like the bug that etc@fluffles.net reported...

On her box it helped to increase some timeout from 5 sec to 15 sec...
Maybe this is a mission for sos@ ?

> After an atacontrol detach/attach cycle, the device is back
> again:
> 
>     Master:  ad1 <SAMSUNG HD160JJ/WU100-41> Serial ATA II
>
Lucky you! :)

> I inserted it back into the gmirror, and right now it's
> synchronizing happily.
>
:-)

-Arne


 


