Date: Tue, 07 Aug 2018 14:58:16 +0200
From: Mark Martinec
To: stable@freebsd.org
Cc: Mark Johnston
Subject: Re: All the memory eaten away by ZFS 'solaris' malloc - on 11.1-R amd64
Organization: Jozef Stefan Institute
In-Reply-To: <20180804194757.GD12146@raichu>

> On Sat, Aug 04, 2018 at 08:38:04PM +0200, Mark Martinec wrote:
>> 2018-08-04 19:01, Mark Johnston wrote:
>> > I think running "zpool list" is adding a lot of noise to the output.
>> > Could you retry without doing that?
>> No, like I said previously, the "zpool list" (with one defunct
>> zfs pool) *is* the sole culprit of the zfs memory leak.
>> With each invocation of "zpool list" the "solaris" malloc
>> jumps up by the same amount, and never ever drops. Without
>> running it (like repeatedly under 'telegraf' monitoring
>> of zfs), the machine runs normally and never runs out of
>> memory, the "solaris" malloc count no longer grows steadily.

2018-08-04 21:47, Mark Johnston wrote:
> Sorry, I missed that message. Given that information, it would be
> useful to see the output of the following script instead:
>
> # dtrace -c "zpool list -Hp" -x temporal=off -n '
>     dtmalloc::solaris:malloc
>       /pid == $target/{@allocs[stack(), args[3]] = count()}
>     dtmalloc::solaris:free
>       /pid == $target/{@frees[stack(), args[3]] = count();}'
>
> This will record all allocations and frees from a single instance of
> "zpool list".

Collected, here it is:

  https://www.ijs.si/usr/mark/tmp/dtrace-cmd.out.bz2

Kevin P. Neal wrote:
> Was there a mention of a defunct pool?

Indeed. Haven't tried yet to destroy it, so it is only my hypothesis
that a defunct pool plays a role in this leak.

> I've got a machine with 8GB RAM running 11.1-RELEASE-p4 with a single ZFS
> pool. It runs zfs list in a script multiple times a minute, and it has
> been doing so for 181 days with no reboot. I have not seen any memory
> issues.

I have jumped from 10.3 directly to 11.1-RELEASE-p11, so I'm not sure
with exactly which version / patch level the problem was introduced.

Tried to reproduce the problem on another host running 11.2R,
using memory disk (md), created GPT partition on it and a ZFS pool
on top, then destroyed the disk, so the pool was left as UNAVAILABLE.
Unfortunately this did not reproduce the problem, the "zpool list"
on that host does not cause ZFS to leak memory. Must be something
specific to that failed disk or pool, which is causing the leak.

  Mark
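For anyone wanting to check the same symptom on their own host, the per-invocation growth of the "solaris" malloc type described above could be measured roughly as follows. This is only a sketch and not something taken from the thread: the function names solaris_memuse and measure_leak are made up for illustration, and it assumes that "vmstat -m" prints the columns Type, InUse, MemUse, ... with MemUse carrying a trailing "K".

```shell
#!/bin/sh
# Hypothetical helpers, not from the original thread.

# Print the MemUse value (in KiB, without the trailing "K") of the
# "solaris" malloc type. Reads "vmstat -m" style output on stdin.
# Assumes the columns are: Type InUse MemUse ...
solaris_memuse() {
    awk '$1 == "solaris" { sub(/K$/, "", $3); print $3 }'
}

# Run "zpool list" the given number of times (default 10) and report
# how much the "solaris" malloc type grew across those runs.
measure_leak() {
    runs=${1:-10}
    before=$(vmstat -m | solaris_memuse)
    i=0
    while [ "$i" -lt "$runs" ]; do
        zpool list -Hp > /dev/null
        i=$((i + 1))
    done
    after=$(vmstat -m | solaris_memuse)
    echo "solaris MemUse: ${before}K -> ${after}K over ${runs} runs"
}
```

Calling "measure_leak 20" as root on the affected machine should show MemUse rising by a roughly constant amount per run if the leak is present, and staying about flat on a healthy host.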