Date:      Thu, 19 May 2005 08:21:13 +0200
From:      Søren Schmidt <sos@DeepCore.dk>
To:        Joe Rhett <jrhett@meer.net>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: drive failure during rebuild causes page fault
Message-ID:  <F1F4AB07-A2C3-4EC9-8D4E-BDE0AF4BA409@DeepCore.dk>
In-Reply-To: <20050519002015.GA25329@meer.net>
References:  <20041213052628.GB78120@meer.net> <20041213054159.GC78120@meer.net> <20041212215841.X83257@carver.gumbysoft.com> <20041213060549.GE78120@meer.net> <20041213102333.V92964@carver.gumbysoft.com> <20041213192119.GB4781@meer.net> <20041213183336.T97507@carver.gumbysoft.com> <41BE8F2D.8000407@DeepCore.dk> <20041215005359.GK27283@meer.net> <20050519002015.GA25329@meer.net>

On 19/05/2005, at 2.20, Joe Rhett wrote:

> Soren, I've just retested all of this with 5.4-REL and most of the problems
> listed here are solved.  The only problems appear to be related to these
> ghost arrays that appear when it finds a drive that was taken offline
> earlier.  For example, pull a drive and then reboot the system.

This depends heavily on the metadata format used; some formats simply
don't have the information needed to avoid this, and some just ignore the
problem.

> 1. If you reboot the system you can delete the array cleanly, but it returns
> next time.  I can't figure out how to make this information go away, and
> I've tried low-level formatting the disks :-(

You need to overwrite the metadata (see above), which is located in
different places, again depending on the metadata format.
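
A minimal sketch of what such an overwrite could look like, assuming purely
for illustration that the format in question keeps its metadata in the last
sectors of the disk. The function name, the 512-byte sector size and the
sector count are hypothetical and not taken from any ata-raid format; find
out where your format really stores its metadata before touching a drive.
The demo runs against a scratch file standing in for a disk. On a real
device node the size may have to be obtained some other way.

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumed sector size; verify against the real drive


def zero_tail_sectors(path, sector_count, sector_size=SECTOR_SIZE):
    """Overwrite the last `sector_count` whole sectors of `path` with zeros.

    Returns the byte offset at which zeroing started.
    """
    with open(path, "r+b") as dev:
        # Seeking to the end yields the size for a regular file.
        size = dev.seek(0, os.SEEK_END)
        total_sectors = size // sector_size
        if sector_count <= 0 or sector_count > total_sectors:
            raise ValueError("sector_count out of range")
        start = (total_sectors - sector_count) * sector_size
        dev.seek(start)
        dev.write(b"\x00" * (sector_count * sector_size))
        dev.flush()
        os.fsync(dev.fileno())
    return start


if __name__ == "__main__":
    # Build an 8-sector scratch "disk" filled with 0xff, then wipe its tail.
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(b"\xff" * (SECTOR_SIZE * 8))
        scratch_path = scratch.name
    offset = zero_tail_sectors(scratch_path, 2)
    print("zeroed from byte offset", offset)
    os.remove(scratch_path)
```

Anything like this destroys whatever lives in the overwritten range, so it
only makes sense on a disk whose contents you no longer need.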

> 2. Removing the array using "atacontrol delete" after an "atacontrol reinit
> channel" will always produce a page fault.  For example, if you have only a
> single array in a system and you lose a drive, and then it returns later..
>
>     # atacontrol status 1
>     atacontrol: ioctl(ATARAIDSTATUS): Device not configured
>     # atacontrol reinit 5
>         ...finds disk
>     # atacontrol status 1
>     ar1: ATA RAID1 subdisks: DOWN DOWN status: DEGRADED
>     # atacontrol delete 1
>         *Page Fault*
>
> We can't run -current, so I'm hoping to find options to work with this as
> is.  If you know for a fact that this has changed in the mkIII patches then
> I'd be willing to investigate, but I will need to be certain.

ATA mkIII is exactly about getting ata-raid rewritten from the old
cruft that was originally written before even ATA-ng was done, so yes,
I'd expect it to behave better, but not necessarily solve all your
problems, as some of them might be "features" of the metadata.

> I know that you have no desire to work on this older code, but could you at
> least clue me in on how to get atacontrol to drop these ghost arrays?

see above.

- Søren



