From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 240047] more and more processes get stuck waiting for ufs and zfs until system is rendered inaccessible
Date: Thu, 22 Aug 2019 21:41:27 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240047

            Bug ID: 240047
           Summary: more and more processes get stuck waiting for ufs and zfs
                    until system is rendered inaccessible
           Product: Base System
           Version: 12.0-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: fuz@fuz.su

I'm at a conference running an open FTP server. Files are served by FTP via
ftpd(8), NFS via nfsd(8), and HTTP via Apache 2.4.
The server has its root on UFS and the remaining files spread over three ZFS
pools, one of which is currently replacing a (working) disk:

$ zpool list -v
NAME           SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG   CAP  DEDUP  HEALTH  ALTROOT
disk12        18.1T  14.0T  4.16T        -         -    3%   77%  1.00x  ONLINE  -
  da3         9.06T  6.98T  2.08T        -         -    3%   77%
  diskid/DISK-7JG9E40C%20%20%20%20%20%20%20%20%20%20%20%20  9.06T  6.98T  2.08T        -         -    3%   77%
cache             -      -      -        -         -     -
  ada0p2       170G  3.98G   166G        -         -    0%    2%
disk34        18.1T  14.8T  3.33T        -         -    4%   81%  1.00x  ONLINE  -
  da2         9.06T  7.39T  1.67T        -         -    4%   81%
  da1         9.06T  7.41T  1.66T        -         -    4%   81%
cache             -      -      -        -         -     -
  ada0p5       170G  5.14G   165G        -         -    0%    3%
disk56        18.1T  14.0T  4.15T        -         -    1%   77%  1.00x  ONLINE  -
  replacing   9.06T  6.97T  2.10T        -         -    1%   76%
    da0           -      -      -        -         -     -     -
    da4           -      -      -        -         -     -     -
  diskid/DISK-7PGVBGZC%20%20%20%20%20%20%20%20%20%20%20%20  9.06T  7.01T  2.06T        -         -    1%   77%
cache             -      -      -        -         -     -
  ada0p6       170G  6.03G   164G        -         -    0%    3%

$ df -h
Filesystem    Size   Used  Avail  Capacity  Mounted on
/dev/ada0p4   375G    68G   278G       20%  /
devfs         1.0K   1.0K     0B      100%  /dev
tmpfs          33G    76K    33G        0%  /var/run
tmpfs          33G   4.0K    33G        0%  /tmp
tmpfs          33G   156K    33G        0%  /var/log
fdescfs       1.0K   1.0K     0B      100%  /dev/fd
procfs        4.0K   4.0K     0B      100%  /proc
disk12         18T    14T   3.6T       80%  /disk12
disk34         17T    14T   2.8T       83%  /disk34
disk56         18T    14T   3.6T       80%  /disk56
disk34/zeug   3.6T   864G   2.8T       23%  /usr/home/fuz/zeug
:/disk12       18T    14T   3.6T       80%  /export
:/disk34       35T    32T   2.8T       92%  /export
:/disk56       52T    49T   3.6T       93%  /export

Files are served over a 10 GbE connection with an average bandwidth of around
200 MB/s; the limit seems to be the number of IOPS:

$ zpool iostat
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
disk12      14.0T  4.16T    254      0  34.8M  6.16K
disk34      14.8T  3.33T    261     29  35.0M  1.20M
disk56      14.0T  4.15T    882     29   118M   191K
----------  -----  -----  -----  -----  -----  -----

RAM is about half used and nothing seems to indicate any resource exhaustion.

$ vmstat
procs memory    page                        disks   faults            cpu
r b w  avm  fre flt   re  pi po    fr    sr ad0 da0    in    sy    cs us sy id
0 0 0 1.0T 666M 451 1197 436  0 64834 14532   0   0 28631 18084 93822  0 17 83

The only sysctl set is kern.racct.enable=1

After a while, more and more httpd and ftpd processes get stuck in a ufs or
zfs wait state. They cannot be killed. I have since rebooted the server a
bunch of times and the problem keeps appearing.
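
To make the stuck-process symptom easier to track between reboots, the listing
could be filtered by wait channel. The following is only a minimal sketch and
was not part of the original report: it assumes the stuck processes show up
with a wait channel of exactly "ufs" or "zfs" in the output of
`ps -axo pid,state,wchan,comm`, and the helper names `stuck_processes` and
`run_ps` are hypothetical.

```python
import subprocess


def stuck_processes(ps_output, channels=("ufs", "zfs")):
    """Return (pid, state, wchan, command) tuples whose wait channel is in `channels`.

    `ps_output` is expected to be the text printed by
    `ps -axo pid,state,wchan,comm`: one header line, then one line per process.
    """
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)  # the command is last and may contain spaces
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        pid, state, wchan, command = fields
        if wchan in channels:
            matches.append((int(pid), state, wchan, command))
    return matches


def run_ps():
    """Run ps with the column set assumed above and return its output as text."""
    result = subprocess.run(
        ["ps", "-axo", "pid,state,wchan,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `stuck_processes(run_ps())` periodically would show whether the set of
httpd and ftpd processes waiting on ufs or zfs keeps growing. The wait channel
names are an assumption taken from the description above, so the `channels`
argument may need adjusting to whatever ps actually reports.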