Date:      Wed, 6 Aug 2008 22:35:54 -0700
From:      "Vye Wilson" <vyeperman@gmail.com>
To:        "Jeremy Chadwick" <koitsu@freebsd.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: zpool degraded - 'UNAVAIL cannot open' functioning drive
Message-ID:  <6c3c36d00808062235v5cbb4470v990b76d569f85614@mail.gmail.com>
In-Reply-To: <6c3c36d00808062212y4e9a1464i48e146e84725a36e@mail.gmail.com>
References:  <6c3c36d00808062109y6ae176a0ha055129392b00542@mail.gmail.com> <20080807044759.GA7505@eos.sc1.parodius.com> <6c3c36d00808062212y4e9a1464i48e146e84725a36e@mail.gmail.com>

I tested it again, and it seems I was mistaken in my last email. If the drive
is physically removed, it will _not_ show up in atacontrol afterwards.

atacontrol attach ata3
atacontrol: ioctl(IOCATAATTACH): File exists
[root@Touzyoh /home/vye]# atacontrol list
ATA channel 3:
    Master:      no device present
    Slave:       no device present
[root@Touzyoh /home/vye]# atacontrol detach ata6
[root@Touzyoh /home/vye]# atacontrol attach ata6
Master:      no device present
Slave:       no device present

I'm not sure what happened the first few times, but if I reboot the server,
the device shows up in atacontrol and the zpool resilvers. This doesn't
seem to be a ZFS problem, as I originally thought.
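
For anyone who wants to spot this state from a script rather than by eye,
here is a minimal Python sketch that scans `zpool status` text for config rows
whose state is anything other than ONLINE. The function name
`non_online_rows` and the `SAMPLE_STATUS` string are illustrative assumptions,
not part of ZFS; the sample is copied from the status output quoted further
down in this thread. The sketch only parses text and does not run zpool itself.

```python
# Vdev state names that zpool status prints in the STATE column.
VDEV_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}

# Sample copied from the zpool status output quoted in this thread.
SAMPLE_STATUS = """\
  pool: ztemp
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    ztemp       DEGRADED     0     0     0
      raidz1    DEGRADED     0     0     0
        ad10    ONLINE       0     0     0
        ad14    ONLINE       0     0     0
        ad18    UNAVAIL      0     0     0  cannot open

errors: No known data errors
"""


def non_online_rows(status_text):
    """Return (name, state) for each config row whose state is not ONLINE."""
    rows = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_config:
            # The config table begins right after the NAME/STATE header row.
            if fields[:2] == ["NAME", "STATE"]:
                in_config = True
            continue
        if not fields:
            continue  # blank line around the table
        if fields[0] == "errors:":
            break  # end of the config section
        if len(fields) >= 2 and fields[1] in VDEV_STATES and fields[1] != "ONLINE":
            rows.append((fields[0], fields[1]))
    return rows


if __name__ == "__main__":
    for name, state in non_online_rows(SAMPLE_STATUS):
        print(name, state)
```

In practice the text would come from something like `zpool status ztemp`
instead of the hard-coded sample. Note that the pool and raidz1 rows are
reported along with ad18, since they are DEGRADED as well.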

Thanks.

On Wed, Aug 6, 2008 at 10:12 PM, Vye Wilson <vyeperman@gmail.com> wrote:

> When I physically disconnected the disk it showed:
>
> subdisk18: detached
> ad18: detached
>
> There was nothing in dmesg after plugging the disk back in but atacontrol
> showed it on channel 9:
>
> ATA channel 9:
>     Master: ad18 <ST3300620AS/3.AAE> Serial ATA v1.0
>     Slave:       no device present
>
> After detaching and reattaching the device with atacontrol this is what
> dmesg had to say:
> subdisk18: detached
> ad18: detached
> ata9: [ITHREAD]
> ad18: 286168MB <Seagate ST3300620AS 3.AAE> at ata9-master SATA150
>
> zpool still says it cannot read from ad18 after detaching and
> reattaching with atacontrol. However, I recreated the zpool and, instead of
> physically removing the drive, just used atacontrol detach/attach, and
> it was able to resilver without any issues. That helps, but what if I do have
> a drive failure? I shouldn't need to halt the system to swap out the
> drive.
>
> Thanks.
>
>
> On Wed, Aug 6, 2008 at 9:47 PM, Jeremy Chadwick <koitsu@freebsd.org> wrote:
>
>> On Wed, Aug 06, 2008 at 09:09:02PM -0700, Vye Wilson wrote:
>> > Hello,
>> >
>> > I set up a raidz1 zpool to test ZFS with a device failure and to see how
>> > quickly the zpool could be resilvered. The system I'm using has a backplane
>> > that all the drives are connected to, so everything is hot-swappable. I
>> > created the raidz1 zpool and then removed one of the drives. zpool status
>> > showed that the pool was degraded but online. OK, great, so let's bring the
>> > now-functioning drive back online.
>> >
>> > [root@Touzyoh /home/vye]# zpool online ztemp ad18
>> > Bringing device ad18 online
>> >
>> > Everything looks good... let's check the zpool status:
>> >
>> >   pool: ztemp
>> >  state: DEGRADED
>> > status: One or more devices could not be opened.  Sufficient replicas exist for
>> >     the pool to continue functioning in a degraded state.
>> > action: Attach the missing device and online it using 'zpool online'.
>> >    see: http://www.sun.com/msg/ZFS-8000-D3
>> >  scrub: resilver completed with 0 errors on Wed Aug  6 20:59:54 2008
>> > config:
>> >
>> >     NAME        STATE     READ WRITE CKSUM
>> >     ztemp       DEGRADED     0     0     0
>> >       raidz1    DEGRADED     0     0     0
>> >         ad10    ONLINE       0     0     0
>> >         ad14    ONLINE       0     0     0
>> >         ad18    UNAVAIL      0     0     0  cannot open
>> >
>> > errors: No known data errors
>> >
>> > Doh! Still degraded. It shows 'UNAVAIL cannot open'. I've tried rebooting, but
>> > it will not open that drive at all. According to dmesg the drive is
>> > functional, and if I destroy the pool and recreate it the drive works fine.
>> > I wasn't able to find any similar issues on this mailing list or on Google.
>> > Does anyone have any ideas? I've attached my dmesg output.
>>
>> What was in your dmesg when you yanked the disk?  What was in your dmesg
>> when you re-inserted the disk?
>>
>> Did you try detaching it administratively using "atacontrol detach"
>> first, then reattaching it using "atacontrol attach"?
>>
>> --
>> | Jeremy Chadwick                                jdc at parodius.com |
>> | Parodius Networking                       http://www.parodius.com/ |
>> | UNIX Systems Administrator                  Mountain View, CA, USA |
>> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>>
>>
>
>
> --
> --Vye
>



-- 
--Vye


