Date: Sun, 31 Oct 2010 10:42:09 +0100 From: Fabian Keil <freebsd-listen@fabiankeil.de> To: freebsd-fs@freebsd.org Subject: Re: ZFS inresponsive. Message-ID: <20101031104209.771cc47e@r500.local> In-Reply-To: <BE2EF896-4990-4A75-9DD4-B0534DFE0B71@pean.org> References: <BE2EF896-4990-4A75-9DD4-B0534DFE0B71@pean.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--Sig_//nUatHWA_bdzcjmJ_IYD4LJ Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Peter Ankerst=E5l <peter@pean.org> wrote: > I accidentally aborted a zfs send mid-tranfer. >=20 > Now the receiveing side cant do anything with the affected filesystem. bo= th zfs list and zfs destroy hangs. > How can I fix this? I can list other filesystems without a problem but a = general list or any command concerning > the one the got broken in transer hangs the zfs command. I occasionally see this, too, when sending snapshots to zpools on USB disks using geli. If the device gets lost in transfer, general zfs and zpool commands hang: Oct 18 17:09:28 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/geli attach -k /home/fk/geli-keys/poppstar.key /d= ev/label/poppstar Oct 18 17:09:34 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zpool import poppstar Oct 18 17:09:34 r500 kernel: GEOM_ELI: Device label/poppstar.eli created. Oct 18 17:09:34 r500 kernel: GEOM_ELI: Encryption: AES-CBC 128 Oct 18 17:09:34 r500 kernel: GEOM_ELI: Crypto: software Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense data Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. CDB: 25 0 = 0 0 0 0 0 0 0 0=20 Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status E= rror Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Condit= ion Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY asc= :3a,1 (Medium not present - tray closed) Oct 18 17:09:35 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable error Oct 18 17:09:58 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zfs create poppstar/dvds/sliders Oct 18 17:10:32 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zfs send wde2/dvds/sliders/season-1-2@2009-12-05 Oct 18 17:10:32 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zfs receive -v poppstar/dvds/sliders/season-1-2 Oct 18 17:38:06 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zfs receive -v poppstar/dvds/sliders/season-3 Oct 18 17:38:06 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zfs send wde2/dvds/sliders/season-3@2009-03-28 Oct 18 17:39:47 r500 kernel: acpi_lid0: Lid closed Oct 18 18:13:19 r500 kernel: acpi_lid0: Lid opened Oct 18 18:22:56 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zfs send wde2/dvds/sliders/season-4@2009-12-05 Oct 18 18:22:56 r500 sudo: fk : TTY=3Dpts/9 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zfs receive -v poppstar/dvds/sliders/season-4 Oct 18 18:33:05 r500 kernel: ugen3.3: <Generic> at usbus3 (disconnected) Oct 18 18:33:05 r500 kernel: umass2: at uhub3, port 1, addr 3 (disconnected) Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): SCSI status error Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Requesting SCSI sense = data Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): AutoSense failed Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Error 5, Unretryable e= rror Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362645020672, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362645151744, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362645282816, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362645413888, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362645544960, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362645676032, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362645807104, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362645938176, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362646069248, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D5). label/poppstar.eli[WRITE(offset=3D362644889600, length=3D131072)] Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Selection timeout Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Retrying command Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Selection timeout Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Retrying command Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Selection timeout Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Retrying command Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Selection timeout Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Retrying command Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): lost device Oct 18 18:33:05 r500 kernel: (pass4:umass-sim2:2:0:0): lost device Oct 18 18:33:05 r500 kernel: (pass4:umass-sim2:2:0:0): removing device entry Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Error 6, Unretryable e= rror Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0):G EOM_ELIInvalidating p= ack:=20 Oct 18 18:33:05 r500 kernel: g_eli_read_done() failed label/poppstar.eli[RE= AD(offset=3D2000398327808, length=3D8192)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: g_eli_read_done() failed label/popps= tar.eli[READ(offset=3D270336, length=3D8192)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: g_eli_read_done() failed label/popps= tar.eli[READ(offset=3D2000398589952, length=3D8192)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362646200320, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362646331392, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362646593536, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362646724608, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362646462464, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362646855680, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362646986752, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362647117824, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362647248896, length=3D131072)] Oct 18 18:33:05 r500 kernel: GEOM_ELI: Crypto WRITE request failed (error= =3D6). label/poppstar.eli[WRITE(offset=3D362647379968, length=3D131072)] Oct 18 18:33:05 r500 root: ZFS: vdev I/O failure, zpool=3Dpoppstar path=3D/= dev/label/poppstar.eli offset=3D2000398327808 size=3D8192 error=3D6 Oct 18 18:33:05 r500 root: ZFS: vdev I/O failure, zpool=3Dpoppstar path=3D/= dev/label/poppstar.eli offset=3D270336 size=3D8192 error=3D6 Oct 18 18:33:05 r500 root: ZFS: vdev I/O failure, zpool=3Dpoppstar path=3D/= dev/label/poppstar.eli offset=3D2000398589952 size=3D8192 error=3D6 Oct 18 18:33:05 r500 root: ZFS: zpool I/O failure, zpool=3Dpoppstar error= =3D6 Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): Synchronize cache fail= ed, status =3D=3D 0xa, scsi status =3D=3D 0x0 Oct 18 18:33:05 r500 kernel: (da2:umass-sim2:2:0:0): removing device entry Oct 18 18:33:05 r500 root: ZFS: zpool I/O failure, zpool=3Dpoppstar error= =3D6 Oct 18 18:33:05 r500 last message repeated 151 times Oct 18 18:33:05 r500 root: ZFS: zpool I/O failure, zpool=3Dpoppstar error= =3D28 Oct 18 18:33:06 r500 last message repeated 161 times Oct 18 18:33:06 r500 root: ZFS: vdev I/O failure, zpool=3Dpoppstar path=3D = offset=3D size=3D error=3D Oct 18 18:33:06 r500 root: Unknown USB device: vendor 0x13fd product 0x1240= bus uhub3 Oct 18 18:33:06 r500 kernel: ugen3.3: <Generic> at usbus3 Oct 18 18:33:06 r500 kernel: umass2: <Generic External, class 0/0, rev 2.00= /2.10, addr 3> on usbus3 Oct 18 18:33:06 r500 kernel: umass2: SCSI over Bulk-Only; quirks =3D 0x0000 Oct 18 18:33:08 r500 kernel: umass2:4:2:-1: Attached to scbus4 Oct 18 18:33:08 r500 kernel: pass4 at umass-sim2 bus 2 scbus4 target 0 lun 0 Oct 18 18:33:08 r500 kernel: pass4: <Generic External 2.10> Fixed Direct Ac= cess SCSI-4 device=20 Oct 18 18:33:08 r500 kernel: pass4: Serial Number 395857304652454620202020 Oct 18 18:33:08 r500 kernel: pass4: 40.000MB/s transfers Oct 18 18:33:08 r500 kernel: GEOM: new disk da2 Oct 18 18:33:08 r500 kernel: da2 at umass-sim2 bus 2 scbus4 target 0 lun 0 Oct 18 18:33:08 r500 kernel: da2: <Generic External 2.10> Fixed Direct Acce= ss SCSI-4 device=20 Oct 18 18:33:08 r500 kernel: da2: Serial Number 395857304652454620202020 Oct 18 18:33:08 r500 kernel: da2: 40.000MB/s transfers Oct 18 18:33:08 r500 kernel: da2: 1907729MB (3907029168 512 byte sectors: 2= 55H 63S/T 243201C) Oct 18 18:34:02 r500 sudo: fk : TTY=3Dpts/14 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zpool list Oct 18 18:34:02 r500 root: ZFS: zpool I/O failure, zpool=3Dpoppstar error= =3D6 Oct 18 18:34:02 r500 root: ZFS: vdev I/O failure, zpool=3Dpoppstar path=3D = offset=3D size=3D error=3D Oct 18 18:36:34 r500 kernel: ugen3.3: <Generic> at usbus3 (disconnected) Oct 18 18:36:34 r500 kernel: umass2: at uhub3, port 1, addr 3 (disconnected) Oct 18 18:36:34 r500 kernel: (da2:umass-sim2:2:0:0): lost device Oct 18 18:36:34 r500 kernel: (da2:umass-sim2:2:0:0): removing device entry Oct 18 18:36:34 r500 kernel: (pass4:umass-sim2:2:0:0): lost device Oct 18 18:36:34 r500 kernel: (pass4:umass-sim2:2:0:0): removing device entry Oct 18 18:36:44 r500 root: Unknown USB device: vendor 0x13fd product 0x1240= bus uhub3 Oct 18 18:36:44 r500 kernel: ugen3.3: <Generic> at usbus3 Oct 18 18:36:44 r500 kernel: umass2: <Generic External, class 0/0, rev 2.00= /2.10, addr 3> on usbus3 Oct 18 18:36:44 r500 kernel: umass2: SCSI over Bulk-Only; quirks =3D 0x0000 Oct 18 18:36:46 r500 kernel: umass2:4:2:-1: Attached to scbus4 Oct 18 18:36:46 r500 kernel: pass4 at umass-sim2 bus 2 scbus4 target 0 lun 0 Oct 18 18:36:46 r500 kernel: pass4: <Generic External 2.10> Fixed Direct Ac= cess SCSI-4 device=20 Oct 18 18:36:46 r500 kernel: pass4: Serial Number 395857304652454620202020 Oct 18 18:36:46 r500 kernel: pass4: 40.000MB/s transfers Oct 18 18:36:46 r500 kernel: GEOM: new disk da2 Oct 18 18:36:46 r500 kernel: da2 at umass-sim2 bus 2 scbus4 target 0 lun 0 Oct 18 18:36:46 r500 kernel: da2: <Generic External 2.10> Fixed Direct Acce= ss SCSI-4 device=20 Oct 18 18:36:46 r500 kernel: da2: Serial Number 395857304652454620202020 Oct 18 18:36:46 r500 kernel: da2: 40.000MB/s transfers Oct 18 18:36:46 r500 kernel: da2: 1907729MB (3907029168 512 byte sectors: 2= 55H 63S/T 243201C) Oct 18 18:36:46 r500 kernel: ugen3.3: <Generic> at usbus3 (disconnected) Oct 18 18:36:46 r500 kernel: umass2: at uhub3, port 1, addr 3 (disconnected) Oct 18 18:36:46 r500 kernel: ugen7.3: <Western Digital> at usbus7 (disconne= cted) Oct 18 18:36:46 r500 kernel: umass1: at uhub7, port 1, addr 3 (disconnected) Oct 18 18:36:46 r500 kernel: (da2:umass-sim2:2:0:0): lost device Oct 18 18:36:46 r500 kernel: (da2:umass-sim2:2:0:0): removing device entry Oct 18 18:36:46 r500 kernel: (pass4:umass-sim2:2:0:0): lost device Oct 18 18:36:46 r500 kernel: (pass4:umass-sim2:2:0:0): removing device entry Oct 18 18:36:46 r500 kernel: (da1:umass-sim1:1:0:0): lost device Oct 18 18:36:46 r500 kernel: (pass3:umass-sim1:1:0:0): lost device Oct 18 18:36:46 r500 kernel: (pass3:umass-sim1:1:0:0): removing device entry Oct 18 18:36:46 r500 kernel: (da1:umass-sim1:1:0:0): Synchronize cache fail= ed, status =3D=3D 0xa, scsi status =3D=3D 0x0 Oct 18 18:36:46 r500 kernel: (da1:umass-sim1:1:0:0): removing device entry Oct 18 18:36:46 r500 kernel: ugen7.2: <Western Digital> at usbus7 (disconne= cted) Oct 18 18:36:46 r500 kernel: umass0: at uhub7, port 2, addr 2 (disconnected) Oct 18 18:36:46 r500 kernel: (da0:umass-sim0:0:0:0): lost device Oct 18 18:36:46 r500 kernel: (pass2:umass-sim0:0:0:0): lost device Oct 18 18:36:46 r500 kernel: (pass2:umass-sim0:0:0:0): removing device entry Oct 18 18:36:46 r500 kernel: (da0:umass-sim0:0:0:0): Synchronize cache fail= ed, status =3D=3D 0xa, scsi status =3D=3D 0x0 Oct 18 18:36:46 r500 kernel: (da0:umass-sim0:0:0:0): removing device entry Oct 18 18:36:54 r500 root: ZFS: zpool I/O failure, zpool=3Dwde2 error=3D28 Oct 18 18:36:54 r500 last message repeated 3 times Oct 18 18:36:54 r500 root: ZFS: vdev I/O failure, zpool=3Dwde2 path=3D offs= et=3D size=3D error=3D Oct 18 18:36:54 r500 root: ZFS: zpool I/O failure, zpool=3Dwde3 error=3D28 Oct 18 18:36:54 r500 last message repeated 3 times Oct 18 18:36:54 r500 root: ZFS: vdev I/O failure, zpool=3Dwde3 path=3D offs= et=3D size=3D error=3D fk@r500 ~ $ps waux | grep \ [z] root 5501 0.0 0.1 13364 1288 9 I+ 6:22PM 0:00.01 sudo zfs= receive -v poppstar/dvds/sliders/season-4 root 5502 0.0 0.1 18944 1312 9 D+ 6:22PM 0:21.73 zfs rece= ive -v poppstar/dvds/sliders/season-4 fk 4711 0.0 0.1 19980 1364 11 D+ 5:10PM 0:01.03 zpool io= stat poppstar 1 root 5947 0.0 0.1 13364 1728 14 I+ 6:34PM 0:00.01 sudo zpo= ol list root 5948 0.0 0.1 19980 1760 14 D+ 6:34PM 0:00.01 zpool li= st fk 5990 0.0 0.1 19980 1744 16 D+ 6:36PM 0:00.01 zpool li= st -H -o name fk@r500 ~ $sudo procstat -kk `pgrep zfs` PID TID COMM TDNAME KSTACK = =20 5502 100519 zfs - mi_switch+0x176 sleepq_wait+= 0x42 _cv_wait+0x129 txg_wait_synced+0x7c dmu_tx_assign+0x16c dmu_recv_strea= m+0x125f zfs_ioc_recv+0x282 zfsdev_ioctl+0x8f devfs_ioctl_f+0x7b kern_ioctl= +0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xe2=20 fk@r500 ~ $sudo procstat -kk `pgrep zpool` PID TID COMM TDNAME KSTACK = =20 5990 100507 zpool - mi_switch+0x176 sleepq_wait+= 0x42 _sx_xlock_hard+0x3c2 _sx_xlock+0x6e spa_all_configs+0xbf zfs_ioc_pool_= configs+0x29 zfsdev_ioctl+0x8f devfs_ioctl_f+0x7b kern_ioctl+0x102 ioctl+0x= fd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xe2=20 5948 100292 zpool - mi_switch+0x176 sleepq_wait+= 0x42 _cv_wait+0x129 zio_wait+0x61 dbuf_read+0x39a dnode_hold_impl+0xf0 dmu_= buf_hold+0x34 zap_lockdir+0x52 zap_cursor_retrieve+0x194 spa_prop_get+0x26f= zfs_ioc_pool_get_props+0xb2 zfsdev_ioctl+0x8f devfs_ioctl_f+0x7b kern_ioct= l+0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xe2=20 4711 100269 zpool - mi_switch+0x176 sleepq_wait+= 0x42 _sx_xlock_hard+0x3c2 _sx_xlock+0x6e spa_open_common+0x76 spa_get_stats= +0x42 zfs_ioc_pool_stats+0x2c zfsdev_ioctl+0x8f devfs_ioctl_f+0x7b kern_ioc= tl+0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xe2=20 At the time I've been using: FreeBSD r500.local 9.0-CURRENT FreeBSD 9.0-CURRENT #193 r+3be3a96: Sun Oct = 17 12:51:52 CEST 2010 fk@r500.local:/usr/obj/usr/src/sys/ZOEY amd64 with ZFS pool version 15, but I saw the problem with earlier FreeBSD and ZFS versions, too. Fabian --Sig_//nUatHWA_bdzcjmJ_IYD4LJ Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iEYEARECAAYFAkzNOfgACgkQBYqIVf93VJ0KfwCcDApuTT8QkxyMXt6HLEMOeslR kVAAoINh2rLwsOyXgcP+5uH1vjvZv3pN =ue8o -----END PGP SIGNATURE----- --Sig_//nUatHWA_bdzcjmJ_IYD4LJ--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20101031104209.771cc47e>