From owner-freebsd-bugs@freebsd.org Thu Feb 7 09:40:18 2019
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 231457] Out of swap space on ZFS
Date: Thu, 07 Feb 2019 09:40:09 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.2-RELEASE
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: mail@rubenvos.com
X-Bugzilla-Status: New
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
X-Bugzilla-Changed-Fields: cc
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231457

mail@rubenvos.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mail@rubenvos.com

--- Comment #15 from mail@rubenvos.com ---
Hi,

We are seeing similar behaviour on one of our zfs-nfs servers as well.

Jan 31 10:41:13 volume1 kernel: pid 17505 (collectd), uid 0, was killed: out of swap space
Jan 31 10:41:13 volume1 kernel: pid 51659 (ntpd), uid 0, was killed: out of swap space
Jan 31 10:42:54 volume1 kernel: pid 73673 (devd), uid 0, was killed: out of swap space
Jan 31 10:43:11 volume1 kernel: pid 31167 (mountd), uid 0, was killed: out of swap space
Jan 31 10:44:12 volume1 kernel: pid 50359 (nfsd), uid 0, was killed: out of swap space
Jan 31 10:44:36 volume1 kernel: pid 81152 (zsh), uid 0, was killed: out of swap space
Jan 31 10:44:54 volume1 kernel: pid 49005 (zsh), uid 4002, was killed: out of swap space
Jan 31 10:46:13 volume1 kernel: pid 95263 (nrpe3), uid 181, was killed: out of swap space
Jan 31 10:46:36 volume1 kernel: pid 48518 (sshd), uid 4002, was killed: out of swap space
Jan 31 10:46:55 volume1 kernel: pid 92367 (rpcbind), uid 0, was killed: out of swap space
Jan 31 10:47:11 volume1 kernel: pid 56206 (nfsd), uid 0, was killed: out of swap space
Jan 31 10:47:23 volume1 kernel: pid 68827 (dhclient), uid 65, was killed: out of swap space
Jan 31 10:47:38 volume1 kernel: pid 87548 (getty), uid 0, was killed: out of swap space
Jan 31 10:47:50 volume1 kernel: pid 24945 (getty), uid 0, was killed: out of swap space
Jan 31 10:49:14 volume1 kernel: pid 29466 (getty), uid 0, was killed: out of swap space
Jan 31 10:49:37 volume1 kernel: pid 77339 (getty), uid 0, was killed: out of swap space
Jan 31 10:49:51 volume1 kernel: pid 78317 (getty), uid 0, was killed: out of swap space
Jan 31 10:50:13 volume1 kernel: pid 81831 (getty), uid 0, was killed: out of swap space
Jan 31 10:50:37 volume1 kernel: pid 89762 (getty), uid 0, was killed: out of swap space
Jan 31 10:50:51 volume1 kernel: pid 92067 (getty), uid 0, was killed: out of swap space
Jan 31 10:51:49 volume1 kernel: pid 97499 (getty), uid 0, was killed: out of swap space
Jan 31 10:52:14 volume1 kernel: pid 96091 (getty), uid 0, was killed: out of swap space
Jan 31 10:52:37 volume1 kernel: pid 98907 (getty), uid 0, was killed: out of swap space
Jan 31 10:52:51 volume1 kernel: pid 99595 (getty), uid 0, was killed: out of swap space
Jan 31 10:55:47 volume1 kernel: pid 60068 (zsh), uid 0, was killed: out of swap space

Feb 7 09:57:40 volume1 collectd[25157]: plugin_read_thread: read-function of the `swap' plugin took 19.765 seconds, which is above its read interval (10.000 seconds). You might want to adjust the `Interval' or `ReadThreads' settings.
Feb 7 09:59:48 volume1 kernel: pid 25157 (collectd), uid 0, was killed: out of swap space
Feb 7 09:59:48 volume1 kernel: pid 94240 (atop), uid 0, was killed: out of swap space
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 327109, size: 16384
Feb 7 09:59:48 volume1 kernel: pid 51515 (ntpd), uid 0, was killed: out of swap space
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 326787, size: 4096
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 102263, size: 4096
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 327152, size: 4096
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 100915, size: 8192
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 326754, size: 8192
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 8471, size: 4096
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 106028, size: 12288
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 8229, size: 8192
Feb 7 09:59:48 volume1 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 103890, size: 8192
Feb 7 10:03:11 volume1 kernel: swap_pager_getswapspace(32): failed
Feb 7 10:06:00 volume1 kernel: swap_pager_getswapspace(32): failed

root@volume1:~ # grep arc /boot/loader.conf
vfs.zfs.arc_min="10024M"
vfs.zfs.arc_max="13084M"
root@volume1:~ # sysctl -a | grep phys
kern.ipc.shm_use_phys: 0
vm.phys_segs:
vm.phys_free:
vm.phys_pager_cluster: 1024
hw.physmem: 17139478528
root@volume1:~ # sysctl vm.pageout_oom_seq
vm.pageout_oom_seq: 120
root@volume1:~ #
root@volume1:~ # swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/gpt/swap     8388608    26080  8362528     0%
root@volume1:~ # freebsd-version -uk
11.2-RELEASE-p8
11.2-RELEASE-p8
root@volume1:~ #

We do have reason to assume that the VM's storage backend might be
periodically affected by an extremely slow storage provider (it's running
as a VM on OpenStack), as indicated by the "swap_pager: indefinite wait
buffer: bufobj" messages. It's rather worrying that important processes
(nfsd, for instance) are shot down by the OOM killer with the default value
of vm.pageout_oom_seq (if the default setting of that sysctl turns out to
be what triggers the OOM killer).

We've just changed vm.pageout_oom_seq from its default of 12 to 120 and are
monitoring the impact of that change.

Ruben

(In reply to Billg from comment #13)

-- 
You are receiving this mail because:
You are the assignee for the bug.
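
To summarise which processes the OOM killer hit in a log excerpt like the
one in this comment, the "was killed: out of swap space" messages can be
tallied per process name. What follows is only a minimal sketch in Python.
The function name tally_oom_kills and the regular expression are
illustrative assumptions based on the message format shown in the log
above; they are not part of the bug report or of any FreeBSD tooling.

```python
import re
from collections import Counter

# Assumed message format, taken from the log lines in this comment:
#   "... kernel: pid 50359 (nfsd), uid 0, was killed: out of swap space"
KILL_RE = re.compile(
    r"kernel: pid (?P<pid>\d+) \((?P<name>[^)]+)\), uid (?P<uid>\d+), "
    r"was killed: out of swap space"
)


def tally_oom_kills(lines):
    """Count 'out of swap space' kills per process name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the log excerpt; other messages are ignored.
    sample = [
        "Jan 31 10:44:12 volume1 kernel: pid 50359 (nfsd), uid 0, was killed: out of swap space",
        "Jan 31 10:47:11 volume1 kernel: pid 56206 (nfsd), uid 0, was killed: out of swap space",
        "Jan 31 10:47:38 volume1 kernel: pid 87548 (getty), uid 0, was killed: out of swap space",
        "Feb 7 10:03:11 volume1 kernel: swap_pager_getswapspace(32): failed",
    ]
    for name, count in tally_oom_kills(sample).most_common():
        print(f"{count:3d} {name}")
```

Feeding the script the lines of /var/log/messages instead of the sample
list would give the same tally for a whole log file, assuming the kernel
messages there use the format shown above.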