Date: Mon, 7 Mar 2011 20:25:31 +0100 From: Fabian Keil <freebsd-listen@fabiankeil.de> To: freebsd-fs@freebsd.org Subject: Re: ZFSv28: log_sysevent: type 19 is not implemented Message-ID: <20110307202531.2c90ff5a@r500.local> In-Reply-To: <20110307200634.3c0f92df@r500.local> References: <20110227202957.GD1992@garage.freebsd.pl> <20110228192129.119cac0c@r500.local> <20110307200634.3c0f92df@r500.local>
next in thread | previous in thread | raw e-mail | index | archive | help
--Sig_/CgFsehXtW9zopU6fiNqtLkR Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Fabian Keil <freebsd-listen@fabiankeil.de> wrote: > Fabian Keil <freebsd-listen@fabiankeil.de> wrote: >=20 > > Pawel Jakub Dawidek <pjd@FreeBSD.org> wrote: > >=20 > > > I just committed ZFSv28 to HEAD. >=20 > > Anyway, the things I tested so far (zfs/zpool upgrade, > > delegation, send, receive, snapshot) worked fine. >=20 > After unintentionally unplugging an USB disk with this > zpool on it: >=20 > | fk@r500 ~ $zpool status toshiba > | pool: toshiba > | state: ONLINE > | scan: none requested > | config: > | > | NAME STATE READ WRITE CKSUM > | toshiba ONLINE 0 0 0 > | label/toshiba.eli ONLINE 0 0 0 > | > | errors: No known data errors >=20 > the system became sluggish and /var/log/messages got spammed > with error messages: >=20 > Mar 6 21:33:10 r500 kernel: ugen7.2: <TOSHIBA> at usbus7 (disconnected) > Mar 6 21:33:10 r500 kernel: umass1: at uhub7, port 1, addr 2 (disconnect= ed) > Mar 6 21:33:10 r500 kernel: (pass3:umass-sim1:1:0:0): lost device > Mar 6 21:33:10 r500 kernel: (pass3:umass-sim1:1:0:0): removing device en= try > Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): lost device > Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): Synchronize cache fa= iled, status =3D=3D 0xa, scsi status =3D=3D 0x0 > Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): removing device entry > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 last message repeated 50 times > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is not im > Mar 6 21:33:10 r500 kernel: plemented > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 last message repeated 238 times > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is > Mar 6 21:33:10 r500 kernel: not implemented > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is not impleme > Mar 6 21:33:10 r500 kernel: nted > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 last message repeated 87 times > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 kernel:=20 > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 last message repeated 47 times > Mar 6 21:33:11 r500 kernel: type 19 is not implemented > Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:11 r500 last message repeated 1527 times > Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is not implemented > Mar 6 21:33:11 r500 kernel:=20 > Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implemented >=20 > fk@r500 ~ $zcat /var/log/messages.0.bz2 | grep -c "type 19 is not impleme= nted" > 34101 >=20 > At the time of the unplugging, a video was read from the pool. >=20 > When trying to export the pool, zpool export hung. >=20 > After rebooting the system, the message was shown two > more times between the creation of the provider for the > main zpool and the swap device: >=20 > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Device ada0s1d.eli created. > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Encryption: AES-CBC 128 > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Crypto: software > Mar 6 21:43:20 r500 kernel: Trying to mount root from ufs:/dev/ada0s1a [= rw]... > Mar 6 21:43:20 r500 kernel: WARNING: / was not properly dismounted > Mar 6 21:43:20 r500 kernel: start_init: trying /sbin/init > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense d= ata > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. CDB: 25 = 0 0 0 0 0 0 0 0 0=20 > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status= Error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Cond= ition > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY a= sc:3a,1 (Medium not present - tray closed) > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable er= ror > Mar 6 21:43:20 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:43:20 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Device ada0s1b.eli created. > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Encryption: AES-XTS 256 > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Crypto: software > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense d= ata > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. CDB: 25 = 0 0 0 0 0 0 0 0 0=20 > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status= Error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Cond= ition > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY a= sc:3a,1 (Medium not present - tray closed) > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable er= ror > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense d= ata > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. CDB: 25 = 0 0 0 0 0 0 0 0 0=20 > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status= Error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Cond= ition > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY a= sc:3a,1 (Medium not present - tray closed) > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable er= ror > Mar 6 21:43:20 r500 kernel: lo1: bpf attached > Mar 6 21:43:20 r500 kernel: wlan0: bpf attached > Mar 6 21:43:20 r500 kernel: wlan0: bpf attached > Mar 6 21:43:20 r500 kernel: wlan0: Ethernet address: 00:[...] > Mar 6 21:43:20 r500 kernel: firmware: 'iwn5000fw' version 0: 353240 byte= s loaded at 0xffffffff814120b0 > Mar 6 21:43:20 r500 kernel: firmware: 'iwn5000fw' version 0: 353240 byte= s loaded at 0xffffffff814120b0 > Mar 6 21:43:20 r500 kernel: bge0: Disabling fastboot > Mar 6 21:43:20 r500 kernel: bge0: Disabling fastboot > Mar 6 21:43:20 r500 savecore: /dev/ada0s1b: Operation not permitted > Mar 6 21:43:21 r500 named[2219]: starting BIND 9.6.3 -t /var/named -u bi= nd > Mar 6 21:43:21 r500 named[2219]: built with '--prefix=3D/usr' '--infodir= =3D/usr/share/info' '--mandir=3D/usr/share/man' '--enable-threads' '--enabl= e-getifaddrs' '--disable-linux-caps' '--with-openssl=3D/usr' '--with-random= dev=3D/dev/random' '--without-idn' '--without-libxml2' > Mar 6 21:43:21 r500 named[2219]: command channel listening on 127.0.0.1#= 953 > Mar 6 21:43:21 r500 named[2219]: command channel listening on ::1#953 > Mar 6 21:43:21 r500 named[2219]: the working directory is not writable > Mar 6 21:43:21 r500 named[2219]: running > Mar 6 21:43:53 r500 wpa_supplicant[512]: WPA: Group rekeying completed w= ith 00:[...] [GTK=3DTKIP] > Mar 6 21:44:14 r500 syslogd: exiting on signal 15 >=20 > As the boot process got stuck with no additional messages > printed, I rebooted into single-user mode, exported the > faulted pool and finished the boot process. The system > came back normally and the pool could be imported without > issues >=20 > fk@r500 ~ $grep zfs /boot/loader.conf | grep -v "^ *#" > zfs_load=3D"YES" >=20 > I used the attached patch to stop the log spam, but the > main issue seems to be reproducible. The top output after > detaching the pool: >=20 > last pid: 4985; load averages: 10.47, 3.96, 2.02 up 0+02:01:49 19:= 20:49 > 552 processes: 12 running, 518 sleeping, 22 waiting > CPU: 1.2% user, 0.0% nice, 98.8% system, 0.0% interrupt, 0.0% idle > Mem: 267M Active, 95M Inact, 859M Wired, 2032K Cache, 7872K Buf, 692M Free > Swap: 2048M Total, 2048M Free >=20 > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMM= AND = =20 > 2 root 1 -8 - 0K 16K CPU1 1 1:26 96.97% g_ev= ent > 0 root 387 -8 0 0K 6192K - 0 2:11 80.66% kern= el > 1702 root 1 76 0 6276K 796K RUN 0 0:15 11.96% devd > 11 root 2 155 ki31 0K 32K RUN 0 26:46 0.00% idle > 3395 fk 1 22 0 465M 329M select 0 7:27 0.00% Xorg > 3398 fk 1 20 0 96120K 9704K RUN 0 0:49 0.00% e16 > 12 root 22 -84 - 0K 352K WAIT 1 0:37 0.00% intr > 26 root 1 20 - 0K 16K geli:w 0 0:31 0.00% g_el= i[0] ada0s1d > 27 root 1 22 - 0K 16K geli:w 1 0:29 0.00% g_el= i[1] ada0s1d >=20 > The stuck zpool's stack: >=20 > fk@r500 ~ $sudo procstat -kk $(pgrep zpool) > PID TID COMM TDNAME KSTACK = =20 > 5087 100490 zpool initial thread mi_switch+0x174 sleepq_wai= t+0x42 __lockmgr_args+0x7a3 vop_stdlock+0x39 VOP_LOCK1_APV+0x52 _vn_lock+0x= 47 vflush+0x125 zfs_umount+0x9f dounmount+0x31e unmount+0x38b syscallenter+= 0x331 syscall+0x4b Xfast_syscall+0xdd=20 >=20 > When I re-attached the disk, g_event kept eating cpu. >=20 > I'm using a script to attach and import pools on USB devices and as > it currently doesn't handle faulted pools, it tried to import the > already faulted pool, which resulted in a zpool core dump. >=20 > Mar 7 19:27:25 r500 sudo: fk : TTY=3Dttyv0 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/geli attach -j - -k /home/fk/geli-keys/toshiba.ke= y /dev/label/toshiba > Mar 7 19:27:33 r500 sudo: fk : TTY=3Dttyv0 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zpool import toshiba > Mar 7 19:27:33 r500 kernel: GEOM_ELI: Device label/toshiba.eli created. > Mar 7 19:27:33 r500 kernel: GEOM_ELI: Encryption: AES-CBC 128 > Mar 7 19:27:33 r500 kernel: GEOM_ELI: Crypto: software > Mar 7 19:27:33 r500 kernel: pid 5206 (zpool), uid 0: exited on signal 11= (core dumped) >=20 > fk@r500 ~ $gdb /sbin/zpool zpool.core=20 > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > [...] > Loaded symbols for /libexec/ld-elf.so.1 > #0 0x000000080085ec7d in avl_insert () from /lib/libavl.so.2 > [New Thread 802807400 (LWP 100524/initial thread)] > (gdb) where > #0 0x000000080085ec7d in avl_insert () from /lib/libavl.so.2 > #1 0x000000080085ed5e in avl_add () from /lib/libavl.so.2 > #2 0x0000000801ae9105 in zpool_find_import_cached () from /lib/libzfs.so= .2 > #3 0x0000000000409deb in zpool_do_import () > #4 0x0000000000406fa9 in main () >=20 > This time rebooting into single-user mode to export the faulted > pool wasn't necessary, but I doubt that the patch had anything > to do with it. >=20 > The second time I was using today's CURRENT instead of yesterday's, > but I don't think there were any ZFS-related commits. I just tried it a third time, this time with the pool imported but not in active use. Again I couldn't export the pool afterwards: fk@r500 ~ $sudo zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT tank 228G 166G 62.1G 72% 1.00x ONLINE - toshiba 1.36T 1.31T 47.9G 96% 1.00x UNAVAIL - fk@r500 ~ $sudo zpool export toshiba load: 1.05 cmd: zpool 4117 [tx->tx_sync_done_cv)] 249.93r 0.00u 0.06s 0% 2= 528k load: 1.05 cmd: zpool 4117 [tx->tx_sync_done_cv)] 250.11r 0.00u 0.06s 0% 2= 528k load: 1.05 cmd: zpool 4117 [tx->tx_sync_done_cv)] 250.25r 0.00u 0.06s 0% 2= 528k fk@r500 ~ $sudo procstat -kk $(pgrep zpool) PID TID COMM TDNAME KSTACK = =20 4117 102251 zpool initial thread mi_switch+0x174 sleepq_wait+= 0x42 _cv_wait+0x129 txg_wait_synced+0x85 dmu_tx_assign+0x170 spa_history_lo= g+0x43 zfs_log_history+0x82 zfs_ioc_pool_export+0x2a zfsdev_ioctl+0xe6 devf= s_ioctl_f+0x7b kern_ioctl+0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b = Xfast_syscall+0xdd=20 And importing the pool again wasn't possible either: fk@r500 ~ $zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT tank 228G 166G 62.1G 72% 1.00x ONLINE - toshiba 1.36T 1.31T 47.9G 96% 1.00x UNAVAIL - fk@r500 ~ $sudo zpool import toshiba cannot import 'toshiba': a pool with that name is already created/imported, and no additional pools with that name were found The system stayed responsive, though, and other pools could still be imported and exported. It also didn't result in a "log_sysevent: type 19 is not implemented" message. Fabian --Sig_/CgFsehXtW9zopU6fiNqtLkR Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iEYEARECAAYFAk11MS4ACgkQBYqIVf93VJ2AagCgj3v2JqFg5PX/mIIwqiPwRWbV LUcAoJExtK7u/bRsdssIvV/F9PzFFhNq =A/dc -----END PGP SIGNATURE----- --Sig_/CgFsehXtW9zopU6fiNqtLkR--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110307202531.2c90ff5a>