Date:       Wed, 28 Dec 2016 15:30:02 +0000
From:       bugzilla-noreply@freebsd.org
To:         freebsd-bugs@FreeBSD.org
Subject:    [Bug 215634] zfs receive trips up and live-locks for non-incremental fs
Message-ID: <bug-215634-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215634

            Bug ID: 215634
           Summary: zfs receive trips up and live-locks for non-incremental fs
           Product: Base System
           Version: 10.3-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: johannes@jo-t.de
                CC: freebsd-amd64@FreeBSD.org

Hi,

when I'm trying to zfs-send a filesystem from one machine to another,
the receiving end gets stuck with zfskern spinning one CPU core.
No observable problems sending incremental streams.

Here's what top shows (truncated) on the receiver:

> last pid:  2848;  load averages:  1.08,  1.08,  1.04
> 243 processes: 5 running, 220 sleeping, 18 waiting
> CPU:  0.0% user,  0.0% nice, 50.6% system,  0.0% interrupt, 49.4% idle
> Mem: 8904K Active, 77M Inact, 551M Wired, 5310M Free
> ARC: 284M Total, 104M MFU, 46M MRU, 2448K Anon, 1979K Header, 131M Other
> Swap: 8192M Total, 8192M Free
>
>   PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>     5 root      -8    -     0K   128K CPU0    0 252:00 100.00% zfskern{solthread 0xffff}
>    11 root     155 ki31     0K    32K RUN     1 225:09  91.89% idle{idle: cpu1}
>    11 root     155 ki31     0K    32K RUN     0  51:02   7.86% idle{idle: cpu0}
>     0 root     -16    -     0K  2496K swapin  1   0:18   0.00% kernel{swapper}
>    16 root      16    -     0K    16K syncer  1   0:16   0.00% syncer
>    12 root     -92    -     0K   288K WAIT    1   0:09   0.00% intr{irq257: virtio_p}
>    12 root     -60    -     0K   288K WAIT    1   0:07   0.00% intr{swi4: clock}
>    15 root     -16    -     0K    16K vlruwt  1   0:02   0.00% vnlru
>     6 root     -16    -     0K    32K psleep  1   0:02   0.00% pagedaemon{pagedaemon}
>    14 root     -16    -     0K    16K RUN     1   0:02   0.00% rand_harvestq
>     5 root      -8    -     0K   128K tx->tx  1   0:02   0.00% zfskern{txg_thread_enter}
>  1806 root      40    0 44420K  3692K rwa.cv  1   0:01   0.00% zfs

And here's procstat for zfskern on the receiver:

> #procstat -kk 5
>   PID    TID COMM             TDNAME           KSTACK
>     5 100044 zfskern          arc_reclaim_thre mi_switch+0xe1 sleepq_timedwait+0x3a _cv_timedwait_sbt+0x19e arc_reclaim_thread+0x2be fork_exit+0x9a fork_trampoline+0xe
>     5 100045 zfskern          arc_user_evicts_ mi_switch+0xe1 sleepq_timedwait+0x3a _cv_timedwait_sbt+0x19e arc_user_evicts_thread+0x17d fork_exit+0x9a fork_trampoline+0xe
>     5 100046 zfskern          l2arc_feed_threa mi_switch+0xe1 sleepq_timedwait+0x3a _cv_timedwait_sbt+0x19e l2arc_feed_thread+0xc73 fork_exit+0x9a fork_trampoline+0xe
>     5 100322 zfskern          trim seppel      mi_switch+0xe1 sleepq_timedwait+0x3a _cv_timedwait_sbt+0x19e trim_thread+0x126 fork_exit+0x9a fork_trampoline+0xe
>     5 100334 zfskern          txg_thread_enter mi_switch+0xe1 sleepq_wait+0x3a _cv_wait+0x17d txg_quiesce_thread+0x16b fork_exit+0x9a fork_trampoline+0xe
>     5 100335 zfskern          txg_thread_enter mi_switch+0xe1 sleepq_timedwait+0x3a _cv_timedwait_sbt+0x19e txg_sync_thread+0x160 fork_exit+0x9a fork_trampoline+0xe
>     5 100425 zfskern          solthread 0xffff <running>

The sender is running (custom trimmed-down GENERIC kernel):

> FreeBSD XXX 10.3-STABLE FreeBSD 10.3-STABLE #1 r308740: Sat Nov 19 21:15:27 GMT 2016 root@XXX:/usr/obj/usr/src/sys/XXX amd64

And the receiver is running (a differently trimmed GENERIC kernel):

> FreeBSD YYY 10.3-RELEASE-p15 FreeBSD 10.3-RELEASE-p15 #9 r310507: Sat Dec 24 21:22:15 UTC 2016 root@XXX:/path/usr/src/sys/YYY amd64

The file systems that zfs-send just fine are clones of snapshots.
These are sent as incremental streams. The problematic one is a
fresh zfs-create'd file system, with only a few small files in it.

The command used to send/receive, initiated from the sender-side, is:

> /sbin/zfs send -v -R senderpool/myrootfs | /usr/bin/gzip | /usr/bin/ssh root@${HOST} "/usr/bin/gunzip | /sbin/zfs recv -v -ud receiverpool"

And what I thus get in the console (from zfs recv -v) is (trimmed):

> found clone origin receiverpool/base@20.46-r310507
> receiving incremental stream of senderpool/myrootfs@clean-install into receiverpool/myrootfs@clean-install
> received 886KB stream in 3 seconds (295KB/sec)
> receiving full stream of senderpool/myrootfs/freshfs@clean-install into receiverpool/myrootfs/freshfs@clean-install
> [...stuck here...]

The receiving end seems to be running fine with zfskern spinning.
But it will never finish the filesystem in question.

Any ideas what might be going on, or what to do about it?

Thanks,
Johannes

-- 
You are receiving this mail because:
You are the assignee for the bug.