Date: Sat, 02 Feb 2019 06:37:17 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 235419] zpool scrub progress does not change for hours, heavy disk activity still present
Message-ID: <bug-235419-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235419

            Bug ID: 235419
           Summary: zpool scrub progress does not change for hours, heavy
                    disk activity still present
           Product: Base System
           Version: 11.2-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: bobf@mrp3.com

Frequently, on one of my computers running 11-STABLE, a 'zpool scrub' will
continue for hours while progress does not increase.  The scrub is still
'active' and there is a LOT of disk activity, causing stuttering of
application response as you would expect.  This does not always happen, but
it happens more often than not.  The previous scrub completed without any
such 'hangs' two weeks ago, with no changes to the configuration since.

This system uses a 'zfs everywhere' configuration, i.e. all partitions are
zfs.  A second computer that has UFS+J partitions for userland and kernel
does not appear to exhibit this particular problem.

uname output:

FreeBSD hack.SFT.local 11.2-STABLE FreeBSD 11.2-STABLE #1 r339273: Tue Oct 9
21:10:39 PDT 2018  root@hack.SFT.local:/usr/obj/usr/src/sys/GENERIC  amd64

This system had been running for 80+ days.

At first, I discovered that the scrub had 'hung' at around 74% complete.
After pausing the scrub for a while, and also terminating firefox and
thunderbird, the scrub re-started and continued.  I re-started firefox and
thunderbird, and allowed everything to continue.  The scrub then 'hung'
again at about 84%, and terminating applications (including Xorg) did not
seem to help.  With the scrub paused I performed a reboot, and the scrub
restarted on boot [causing the boot process to be excruciatingly slow].  I
have restarted most of the applications that were running before, while the
scrub was continuing to run.  Now the zpool status shows that the scrub has
completed with no errors.
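The "check periodically whether the reported percentage has advanced" step described above can be sketched as a small script that polls `zpool status` and warns when the scrub's percent-done figure stops moving.  This is only an illustrative sketch, not part of the report: the pool name `zroot`, the poll interval, and the `NN.NN% done` line format are assumptions based on typical FreeBSD 11.x `zpool status` output.

```python
#!/usr/bin/env python3
"""Sketch: flag a 'zpool scrub' whose reported progress has stalled.

Assumptions (not from the bug report itself): pool is named 'zroot',
and 'zpool status' prints a line containing e.g. '74.06% done' while
a scrub is in progress.
"""
import re
import subprocess
import time


def scrub_percent(status_text):
    """Extract the 'NN.NN% done' figure from zpool status output.

    Returns the percentage as a float, or None if no such line is
    present (e.g. the scrub has finished or was never started).
    """
    m = re.search(r"(\d+(?:\.\d+)?)%\s+done", status_text)
    return float(m.group(1)) if m else None


def watch(pool="zroot", interval=600):
    """Poll the pool and warn whenever the percentage fails to advance."""
    last = None
    while True:
        out = subprocess.run(["zpool", "status", pool],
                             capture_output=True, text=True).stdout
        pct = scrub_percent(out)
        if pct is None:
            print("no scrub in progress")
            break
        if last is not None and pct <= last:
            print("scrub appears stalled at %.2f%%" % pct)
        last = pct
        time.sleep(interval)
```

A stalled scrub flagged this way can then be paused and resumed by hand with `zpool scrub -p zroot` followed by `zpool scrub zroot`, which is essentially what was done above.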
here are some additional pieces of information that might help:

> mount
zroot/ROOT/default on / (zfs, NFS exported, local, noatime, nfsv4acls)
devfs on /dev (devfs, local, multilabel)
zroot/d-drive on /d-drive (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/e-drive on /e-drive (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/home on /usr/home (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, NFS exported, local, noatime, nosuid, nfsv4acls)
zroot/usr/src on /usr/src (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/var/audit on /var/audit (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/mail on /var/mail (zfs, local, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot on /zroot (zfs, local, noatime, nfsv4acls)

> kldstat
Id Refs Address            Size     Name
 1   44 0xffffffff80200000 206b5d0  kernel
 2    1 0xffffffff8226d000 393200   zfs.ko
 3    2 0xffffffff82601000 a380     opensolaris.ko
 4    1 0xffffffff82821000 4090     cuse.ko
 5    1 0xffffffff82826000 6e40     uftdi.ko
 6    1 0xffffffff8282d000 3c58     ucom.ko
 7    3 0xffffffff82831000 50c70    vboxdrv.ko
 8    2 0xffffffff82882000 2ad0     vboxnetflt.ko
 9    2 0xffffffff82885000 9a20     netgraph.ko
10    1 0xffffffff8288f000 14b8     ng_ether.ko
11    1 0xffffffff82891000 3f70     vboxnetadp.ko
12    2 0xffffffff82895000 37528    linux.ko
13    2 0xffffffff828cd000 2d28     linux_common.ko
14    1 0xffffffff828d0000 31e80    linux64.ko
15    1 0xffffffff82902000 c60      coretemp.ko
16    1 0xffffffff82903000 965128   nvidia.ko

there were no messages regarding zpool scrub that I could find.
port versions for things with kernel modules:

nvidia-driver-340-340.106
virtualbox-ose-5.1.18
virtualbox-ose-kmod-5.1.22
linux-c7-7.3.1611_1

This problem has happened since mid last year, around the time when the
-STABLE source went to 11.2 and I updated kernel+world on this computer.
The zpool has also been upgraded.  It is worth noting that this computer
ran 11.0 for a long time without incident.  The problem may have been
present in 11.1.

Related: there is an apparent (random crash) bug in the NVidia module that
I have been trying to track down.  It causes occasional page fault crashes.
Sometimes I will see swap space in use when there does not seem to be any
reason for it, and I believe this NVidia bug is a part of that (the crash
happening from randomly accessing 'after free' or random memory addresses,
and swap space is allocated as a consequence?).  Whether this NVidia driver
bug is responsible for the zfs problem, I do not know, but this driver is
only on this particular computer, and so it's worth mentioning, as only
this computer seems to exhibit the problem.

--
You are receiving this mail because:
You are the assignee for the bug.