Date: Wed, 2 Sep 2015 10:23:40 -0700
From: Sean Bruno <sbruno@freebsd.org>
To: freebsd-scsi@freebsd.org
Subject: Re: da2:ciss1:0:0:0): Periph destroyed
Message-ID: <55E7309C.8010406@freebsd.org>
In-Reply-To: <55E72440.8070507@intersonic.se>
References: <55E72440.8070507@intersonic.se>

On 09/02/15 09:30, Per olof Ljungmark wrote:
> Hi,
>
> Recent 10-STABLE, HP D2600 with 12 SATA drives in RAID10 via a P812
> controller, 7TB capacity as one volume, ZFS.
>
> If I pull a drive from the array, the following occurs and I am not sure
> about the logic here because the array is still intact and no data loss
> occurs.
>
> Despite that the volume is gone.
>
> # zpool clear imap
> cannot clear errors for imap: I/O error
>
> # zpool online imap da2
> cannot online da2: pool I/O is currently suspended
>
> Only a reboot helped and then the pool came up just fine, no errors, but
> that is not exactly what you want on a production box.
>
> Did I miss something?
>
> Would
> geli_autodetach="NO"
> help?
>
> syslog output:
>
> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** Hot-plug drive
> removed, Port=1E Box=1 Bay=2 SN= Z4Z2S9SD
> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** Physical drive
> failure, Port=1E Box=1 Bay=2
> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** State change, logical
> drive 0, new state=REGENING
> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: logical drive 0 (da2)
> changed status OK->interim recovery, spare status 0x21<configured>
> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** State change, logical
> drive 0, new state=NEEDS_REBUILD
> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: logical drive 0 (da2)
> changed status interim recovery->ready for recovery, spare status
> 0x11<configured,available>
> Sep 2 17:55:19 <kern.crit> str kernel: da2 at ciss1 bus 0 scbus2 target
> 0 lun 0
> Sep 2 17:55:19 <kern.crit> str kernel: da2: <HP RAID 1(1+0) read> s/n
> PAGXQ0BRH1W0WA detached
> Sep 2 17:55:19 <kern.crit> str kernel: (da2:ciss1:0:0:0): Periph destroyed
> Sep 2 17:55:19 <user.notice> str devd: Executing 'logger -p kern.notice
> -t ZFS 'vdev is removed, pool_guid=13539160044045520113
> vdev_guid=1325849881310347579''
> Sep 2 17:55:19 <user.notice> str ZFS: vdev is removed,
> pool_guid=13539160044045520113 vdev_guid=1325849881310347579
> Sep 2 17:55:19 <kern.crit> str kernel: (da2:ciss1:0:0:0): fatal error,
> could not acquire reference count
> Sep 2 17:55:23 <kern.crit> str kernel: ciss1: *** State change, logical
> drive 0, new state=REBUILDING
> Sep 2 17:55:23 <kern.crit> str kernel: ciss1: logical drive 0 (da2)
> changed status ready for recovery->recovering, spare status
> 0x13<configured,rebuilding,available>
> Sep 2 17:55:23 <kern.crit> str kernel: cam_periph_alloc: attempt to
> re-allocate valid device da2 rejected flags 0x18 refcount 1
> Sep 2 17:55:23 <kern.crit> str kernel: daasync: Unable to attach to new
> device due to status 0x6
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
>

This looks like a bug I introduced at r249170. Now that I stare deeply
into the abyss of ciss(4), I think the entire change is wrong.

Do you want to try and revert that change from your kernel and rebuild
for a test? I don't have access to ciss(4) hardware any longer and cannot
verify.

sean

ref
https://svnweb.freebsd.org/base/head/sys/dev/ciss/ciss.c?r1=249170&r2=249169&pathrev=249170
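
The logical-drive state changes in the quoted syslog output can be easier to follow as a timeline next to the detach and re-attach messages. Below is a minimal sketch for pulling them out, assuming the log lines have first been unwrapped onto single lines with the "> " quote markers removed. The names `LOG`, `STATE_RE`, and `state_changes` are illustrative only and not part of ciss(4) or any FreeBSD tool.

```python
import re

# Illustrative sample: a few of the quoted kernel messages, unwrapped by hand.
LOG = """\
Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=REGENING
Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=NEEDS_REBUILD
Sep 2 17:55:19 <kern.crit> str kernel: (da2:ciss1:0:0:0): Periph destroyed
Sep 2 17:55:23 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=REBUILDING
Sep 2 17:55:23 <kern.crit> str kernel: daasync: Unable to attach to new device due to status 0x6
"""

# Capture the time of day, the logical drive number, and the new state name.
STATE_RE = re.compile(r"(\d{2}:\d{2}:\d{2}) .*logical drive (\d+), new state=(\w+)")


def state_changes(text):
    """Return (time, logical drive number, new state) for each state-change line."""
    return [(m.group(1), int(m.group(2)), m.group(3)) for m in STATE_RE.finditer(text)]


for when, drive, state in state_changes(LOG):
    print(when, "logical drive", drive, "->", state)
```

Lines that are not state changes (the "Periph destroyed" and "daasync" messages here) are skipped, so the result shows only how the controller reported the logical drive while da2 was being torn down.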