From owner-freebsd-bugs@freebsd.org Sat Feb  2 06:37:20 2019
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 235419] zpool scrub progress does not change for hours, heavy
 disk activity still present
Date: Sat, 02 Feb 2019 06:37:17 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235419

            Bug ID: 235419
           Summary: zpool scrub progress does not change for hours, heavy
                    disk activity still present
           Product: Base System
           Version: 11.2-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: bobf@mrp3.com

Frequently, on one of my computers running 11-STABLE, a 'zpool scrub' will
continue for hours while progress does not increase.  The scrub is still
'active' and there is a LOT of disk activity, causing stuttering of
application response as you would expect.  This does not always happen, but
it happens more often than not.  The previous scrub completed without any
such 'hangs' 2 weeks ago, with no changes to the configuration since.

This system uses a 'zfs everywhere' configuration, i.e. all partitions are
zfs.  A second computer that has UFS+J partitions for userland and kernel
does not appear to exhibit this particular problem.

uname output:
FreeBSD hack.SFT.local 11.2-STABLE FreeBSD 11.2-STABLE #1 r339273: Tue Oct  9
21:10:39 PDT 2018     root@hack.SFT.local:/usr/obj/usr/src/sys/GENERIC  amd64

This system had been running for 80+ days.

At first, I discovered that the scrub had 'hung' at around 74% complete.
After pausing the scrub for a while, and also terminating firefox and
thunderbird, the scrub re-started and continued.  I re-started firefox and
thunderbird, and allowed everything to continue.  The scrub then 'hung' again
at about 84%, and terminating applications (including Xorg) did not seem to
help.

With the scrub paused I performed a reboot, and the scrub restarted on boot
[causing the boot process to be excruciatingly slow].  I have restarted most
of the applications that were running before, while the scrub continued to
run.  Now 'zpool status' shows that the scrub completed with no errors.
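For reference, pausing and resuming a scrub is done with the standard zpool
commands; a rough sketch (the pool here is 'zroot', per the mount output
below):

> zpool status zroot        # the 'scan:' line shows percent done and rate
> zpool scrub -p zroot      # pause the in-progress scrub
> zpool scrub zroot         # resume the paused scrub (or start a new one)

To document the stall, the 'scan:' line can be sampled periodically while a
scrub is running, e.g.:

> while true; do date; zpool status zroot | grep -A 2 'scan:'; sleep 600; done

If the percent-done figure stays the same across samples while gstat still
shows heavy disk activity, that matches the behavior described above.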
Here are some additional pieces of information that might help:

> mount
zroot/ROOT/default on / (zfs, NFS exported, local, noatime, nfsv4acls)
devfs on /dev (devfs, local, multilabel)
zroot/d-drive on /d-drive (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/e-drive on /e-drive (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/home on /usr/home (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, NFS exported, local, noatime, nosuid,
nfsv4acls)
zroot/usr/src on /usr/src (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/var/audit on /var/audit (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/mail on /var/mail (zfs, local, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot on /zroot (zfs, local, noatime, nfsv4acls)

> kldstat
Id Refs Address            Size     Name
 1   44 0xffffffff80200000 206b5d0  kernel
 2    1 0xffffffff8226d000 393200   zfs.ko
 3    2 0xffffffff82601000 a380     opensolaris.ko
 4    1 0xffffffff82821000 4090     cuse.ko
 5    1 0xffffffff82826000 6e40     uftdi.ko
 6    1 0xffffffff8282d000 3c58     ucom.ko
 7    3 0xffffffff82831000 50c70    vboxdrv.ko
 8    2 0xffffffff82882000 2ad0     vboxnetflt.ko
 9    2 0xffffffff82885000 9a20     netgraph.ko
10    1 0xffffffff8288f000 14b8     ng_ether.ko
11    1 0xffffffff82891000 3f70     vboxnetadp.ko
12    2 0xffffffff82895000 37528    linux.ko
13    2 0xffffffff828cd000 2d28     linux_common.ko
14    1 0xffffffff828d0000 31e80    linux64.ko
15    1 0xffffffff82902000 c60      coretemp.ko
16    1 0xffffffff82903000 965128   nvidia.ko

There were no messages regarding the zpool scrub that I could find.

Port versions for things with kernel modules:
nvidia-driver-340-340.106
virtualbox-ose-5.1.18
virtualbox-ose-kmod-5.1.22
linux-c7-7.3.1611_1

This problem has been happening since the middle of last year, around the
time the -STABLE source went to 11.2 and I updated kernel+world on this
computer.  The zpool has also been upgraded.  It is worth noting that this
computer ran 11.0 for a long time without incident.  The problem may have
been present in 11.1.

Related: there is an apparent (random crash) bug in the NVidia module that I
have been trying to track down.  It causes occasional page fault crashes.
Sometimes I will see swap space in use when there does not seem to be any
reason for it, and I believe this NVidia bug is part of that (the crash
happening from accessing freed or random memory addresses, with swap space
allocated as a consequence?).  Whether this NVidia driver bug is responsible
for the zfs problem, I do not know, but this driver is only on this
particular computer, and only this computer seems to exhibit the problem, so
it is worth mentioning.

-- 
You are receiving this mail because:
You are the assignee for the bug.