Date: Wed, 02 Sep 2015 21:02:04 +0200
From: Per olof Ljungmark <peo@intersonic.se>
To: freebsd-scsi@freebsd.org
Subject: Re: da2:ciss1:0:0:0): Periph destroyed
Message-ID: <55E747AC.6020302@intersonic.se>
In-Reply-To: <55E742B9.1060002@freebsd.org>
References: <55E72440.8070507@intersonic.se> <55E7309C.8010406@freebsd.org> <55E73900.5080302@intersonic.se> <55E742B9.1060002@freebsd.org>
On 2015-09-02 20:40, Sean Bruno wrote:
>
> On 09/02/15 10:59, Per olof Ljungmark wrote:
>> On 2015-09-02 19:23, Sean Bruno wrote:
>>>
>>> On 09/02/15 09:30, Per olof Ljungmark wrote:
>>>> Hi,
>>>>
>>>> Recent 10-STABLE, HP D2600 with 12 SATA drives in RAID10 via a
>>>> P812 controller, 7TB capacity as one volume, ZFS.
>>>>
>>>> If I pull a drive from the array, the following occurs, and I am
>>>> not sure about the logic here because the array is still intact
>>>> and no data loss occurs.
>>>>
>>>> Despite that, the volume is gone.
>>>>
>>>> # zpool clear imap
>>>> cannot clear errors for imap: I/O error
>>>>
>>>> # zpool online imap da2
>>>> cannot online da2: pool I/O is currently suspended
>>>>
>>>> Only a reboot helped, and then the pool came up just fine, no
>>>> errors, but that is not exactly what you want on a production
>>>> box.
>>>>
>>>> Did I miss something?
>>>>
>>>> Would geli_autodetach="NO" help?
>>>>
>>>> syslog output:
>>>>
>>>> Sep  2 17:55:19 <kern.crit> str kernel: ciss1: *** Hot-plug drive removed, Port=1E Box=1 Bay=2 SN= Z4Z2S9SD
>>>> Sep  2 17:55:19 <kern.crit> str kernel: ciss1: *** Physical drive failure, Port=1E Box=1 Bay=2
>>>> Sep  2 17:55:19 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=REGENING
>>>> Sep  2 17:55:19 <kern.crit> str kernel: ciss1: logical drive 0 (da2) changed status OK->interim recovery, spare status 0x21<configured>
>>>> Sep  2 17:55:19 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=NEEDS_REBUILD
>>>> Sep  2 17:55:19 <kern.crit> str kernel: ciss1: logical drive 0 (da2) changed status interim recovery->ready for recovery, spare status 0x11<configured,available>
>>>> Sep  2 17:55:19 <kern.crit> str kernel: da2 at ciss1 bus 0 scbus2 target 0 lun 0
>>>> Sep  2 17:55:19 <kern.crit> str kernel: da2: <HP RAID 1(1+0) read> s/n PAGXQ0BRH1W0WA detached
>>>> Sep  2 17:55:19 <kern.crit> str kernel: (da2:ciss1:0:0:0): Periph destroyed
>>>> Sep  2 17:55:19 <user.notice> str devd: Executing 'logger -p kern.notice -t ZFS 'vdev is removed, pool_guid=13539160044045520113 vdev_guid=1325849881310347579''
>>>> Sep  2 17:55:19 <user.notice> str ZFS: vdev is removed, pool_guid=13539160044045520113 vdev_guid=1325849881310347579
>>>> Sep  2 17:55:19 <kern.crit> str kernel: (da2:ciss1:0:0:0): fatal error, could not acquire reference count
>>>> Sep  2 17:55:23 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=REBUILDING
>>>> Sep  2 17:55:23 <kern.crit> str kernel: ciss1: logical drive 0 (da2) changed status ready for recovery->recovering, spare status 0x13<configured,rebuilding,available>
>>>> Sep  2 17:55:23 <kern.crit> str kernel: cam_periph_alloc: attempt to re-allocate valid device da2 rejected flags 0x18 refcount 1
>>>> Sep  2 17:55:23 <kern.crit> str kernel: daasync: Unable to attach to new device due to status 0x6
>>>> _______________________________________________
>>>> freebsd-scsi@freebsd.org mailing list
>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>>>> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
>>>>
>>>
>>> This looks like a bug I introduced at r249170. Now that I stare
>>> deeply into the abyss of ciss(4), I think the entire change is
>>> wrong.
>>>
>>> Do you want to try to revert that change from your kernel and
>>> rebuild for a test? I don't have access to ciss(4) hardware any
>>> longer and cannot verify.
>>>
>
>> Yes, I can try. The installed rev is 281826, but I assume the change
>> can apply here too?
>> _______________________________________________
>> freebsd-scsi@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
>
>
> yeah, I think a "svn merge -c -249170" from /usr/src should do it if
> you are managing your system from svn
>

Sep  2 20:54:05 <kern.crit> str kernel: ciss1: *** Hot-plug drive removed, Port=1E Box=1 Bay=3 SN= W4Z1G4BD
Sep  2 20:54:05 <kern.crit> str kernel: ciss1: *** Physical drive failure, Port=1E Box=1 Bay=3
Sep  2 20:54:50 <kern.crit> str kernel: ciss1: *** Hot-plug drive inserted, Port=1E Box=1 Bay=3 SN= WD-WMC1P0F66XVC
Sep  2 20:54:50 <kern.crit> str kernel: ciss1: *** HP Array Controller Firmware Ver = 6.64, Build Num = 0

Right, this time it survived: the volume did not detach after reverting.
If this change does not cause any other problems, do you think it can go
into -STABLE?

Thanks!

//per
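For reference, the revert Sean suggests boils down to something like the
following. This is only a rough sketch, assuming an svn-managed /usr/src
and the stock GENERIC kernel configuration; substitute your own KERNCONF
and build options as needed:

  cd /usr/src
  svn merge -c -249170 .                 # back out r249170 in the working copy
  make buildkernel KERNCONF=GENERIC      # rebuild the kernel with the revert applied
  make installkernel KERNCONF=GENERIC
  shutdown -r now                        # boot the patched kernel before repeating the drive pull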