Date: Wed, 09 Feb 2011 21:10:00 +0100
From: Martin Matuska <mm@FreeBSD.org>
To: Kostik Belousov <kostikbel@gmail.com>
Cc: freebsd-fs@freebsd.org, pjd@freebsd.org
Subject: Re: Memory leak in ZFS?
Message-ID: <4D52F498.2090000@FreeBSD.org>
In-Reply-To: <20110208203653.GC78089@deviant.kiev.zoral.com.ua>
References: <AANLkTi=8fFwiaQ4%2Bm_cWFkXwpa4_W0_DDV2aW8vyNU4E@mail.gmail.com>
 <op.vqjyb21daevz08@ghost-pc.home.lan>
 <4D510BBB.1060708@kkip.pl>
 <20110208102727.GA8555@icarus.home.lan>
 <4D511F65.2050503@kkip.pl>
 <4D519F97.2000805@kkip.pl>
 <20110208203653.GC78089@deviant.kiev.zoral.com.ua>

I think this should go to HEAD, and I have already prepared ported
patches for v28 (head and stable).

On 08.02.2011 21:36, Kostik Belousov wrote:
> On Tue, Feb 08, 2011 at 08:55:03PM +0100, Bartosz Stec wrote:
>> On 2011-02-08 11:48, Bartosz Stec wrote:
>>> On 2011-02-08 11:27, Jeremy Chadwick wrote:
>>>> On Tue, Feb 08, 2011 at 10:24:11AM +0100, Bartosz Stec wrote:
>>>>> On 2011-02-07 22:37, Emil Muratov wrote:
>>>>>>> For the past few weeks, I noticed that the amount of memory
>>>>>>> reported in top (sum of active, inact, wired, cache, buf and
>>>>>>> free) keeps decreasing as the uptime increases. I can't
>>>>>>> pinpoint when I first noticed this, as I have updated the
>>>>>>> system a few times just in case this has been fixed.
>>>>>> Yes, I have the same issue on my home file storage. My system is
>>>>>> 8.1 amd64, 2G ram, zfs on root raidz with 4x1.5T drives.
>>>>>> After updating to stable a couple of days ago I noticed that the
>>>>>> system leaks memory very fast. Checking here and there I found
>>>>>> that the issue concerns sendfile (yep, again!).
>>>>>>
>>>>>> How to reproduce:
>>>>>> Configure samba with aio and sendfile (mine is version 3.5.6)
>>>>>>
>>>>>> smb.conf
>>>>>> [global]
>>>>>> use sendfile=true
>>>>>> aio read size = 16384
>>>>>>
>>>>>> Download a couple of large samba shared files (8-10 gigs).
>>>>>>
>>>>>>
>>>>>> While downloading files I can see that memory drains away
>>>>>> very, very fast, several MBs per second! First it drains free mem,
>>>>>> then active and inactive, then comes wired, until the whole system
>>>>>> commits suicide, suffocating itself to death.
>>>>>> The only way to free memory is to reboot the system. I can't
>>>>>> unload the zfs module like PJD suggested, 'cause my root is on
>>>>>> zfs :(
>>>>>> I'll try to make a bootable flash and move root to the flash to
>>>>>> try to unload the module and see what happens.
>>>>>>
>>>>>> Everything was OK in stable before the new year; sendfile used to
>>>>>> pump free and wired memory to inactive, then slowly reclaim it.
>>>>>> But it seems something was changed after the New Year holidays?
>>>>> I'm glad someone else finally picked up that problem, so there's
>>>>> apparently no memory-eating ghost in my machine ;)
>>>>> Here's my thread on the stable list about this issue:
>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-January/061247.html
>>>>>
>>>>>
>>>>> And in fact, the PC reported in the thread above is also a SAMBA
>>>>> server with aio/sendfile enabled and ZFS.
>>>>>
>>>>> I would be happy to test some patches if necessary, because until
>>>>> now I have had to monitor memory and reboot this server before it dies.
>>>> The source and build date of your kernel will matter greatly here.
>>>>
>>>> I can't speak about the memory utilisation aspect, but I tend to disable
>>>> sendfile everywhere possible when ZFS is in use on a system. The reason
>>>> is based on something I and another user experienced back in October
>>>> 2010 pertaining to sendfile() on ZFS locking up processes (making them
>>>> unkillable). See here[1] for details; this problem has since been
>>>> fixed[2] (look for commits around October). You'll also find some
>>>> commits that went through in November pertaining to ZFS and sendfile.
>>>> This is why I said the date of your kernel/sources matters. :-)
>>>
>>> I have tried a rebuild since the original thread, hoping that the
>>> problem was fixed already, so now it's very fresh: 8.2-PRERELEASE #18:
>>> Sun Feb 6 03:04:47 CET 2011.
>>> The problem is still here:
>>>
>>> Mem: 37M Active, 78M Inact, 1154M Wired, 64M Cache, 199M Buf, 40M Free
>>> About 1373MB instead of 2GB, and it's not even 2 days of uptime.
>>>
>>>> The issue I referenced in [1] is not related to memory utilisation, but
>>>> does indicate use of sendfile with ZFS may be a bad idea (by this I
>>>> mean, there may be aspects of its implementation when mixed with ZFS
>>>> that have been overlooked).
>>>>
>>>> Simple test: if you disable use of sendfile (but not AIO) in Samba, does
>>>> the problem go away?
>>> I've just disabled sendfile in smb.conf and I'll report in about 2
>>> days, after a reboot which I will perform tonight.
>>> I hope it won't hit samba performance too much ;)
>>>
>> We didn't need to wait 2 days :)
>> Now I can confirm that sendfile under SAMBA + ZFS is responsible for
>> the issue. Here's sample output from my monitoring script[1] (updated
>> every 2 seconds):
>>
>> PHYSMEM: 2027.41 MB
>> ACTIVE: 61.14 MB
>> INACTIVE: 40.01 MB
>> WIRED: 1303.86 MB
>> CACHED: .50 MB
>> FREE: 552.30 MB
>> SUM: 1957.82 MB
>> ------------------------
>> MISSING: 69.58 MB
>>
>> PHYSMEM: 2027.41 MB
>> ACTIVE: 61.14 MB
>> INACTIVE: 40.07 MB
>> WIRED: 1303.86 MB
>> CACHED: .50 MB
>> FREE: 551.80 MB
>> SUM: 1957.38 MB
>> ------------------------
>> MISSING: 70.02 MB
>>
>> PHYSMEM: 2027.41 MB
>> ACTIVE: 61.14 MB
>> INACTIVE: 40.13 MB
>> WIRED: 1303.86 MB
>> CACHED: .50 MB
>> FREE: 551.30 MB
>> SUM: 1956.94 MB
>> ------------------------
>> MISSING: 70.46 MB
>>
>> PHYSMEM: 2027.41 MB
>> ACTIVE: 61.14 MB
>> INACTIVE: 40.19 MB
>> WIRED: 1303.86 MB
>> CACHED: .50 MB
>> FREE: 550.80 MB
>> SUM: 1956.51 MB
>> ------------------------
>> MISSING: 70.89 MB
>>
>> PHYSMEM: 2027.41 MB
>> ACTIVE: 61.14 MB
>> INACTIVE: 40.24 MB
>> WIRED: 1303.86 MB
>> CACHED: .50 MB
>> FREE: 550.42 MB
>> SUM: 1956.18 MB
>> ------------------------
>> MISSING: 71.22 MB
>>
>> PHYSMEM: 2027.41 MB
>> ACTIVE: 61.14 MB
>> INACTIVE: 40.30 MB
>> WIRED: 1303.86 MB
>> CACHED: .50 MB
>> FREE: 549.92 MB
>> SUM: 1955.74 MB
>> ------------------------
>> MISSING: 71.66 MB
>>
>> PHYSMEM: 2027.41 MB
>> ACTIVE: 61.14 MB
>> INACTIVE: 40.38 MB
>> WIRED: 1303.86 MB
>> CACHED: .50 MB
>> FREE: 549.30 MB
>> SUM: 1955.19 MB
>> ------------------------
>> MISSING: 72.21 MB
>>
>> PHYSMEM: 2027.41 MB
>> ACTIVE: 61.14 MB
>> INACTIVE: 40.44 MB
>> WIRED: 1303.86 MB
>> CACHED: .50 MB
>> FREE: 548.80 MB
>> SUM: 1954.76 MB
>> ------------------------
>> MISSING: 72.64 MB
>>
>> This behaviour was seen while copying a 600MB file from a SAMBA share
>> with sendfile enabled.
>> It doesn't happen when writing to the samba share, and it doesn't happen
>> with sendfile disabled, in either direction.
>> To me it looks like the memory which leaks should be added to the wired
>> pool and belongs to ARC, but apparently this doesn't work well and WIRED
>> stays at 1303.86 MB all the time.
>>
>> [1] http://pastebin.com/sQUyQbmm
>
> Try this. I think a similar fix is needed for tmpfs, but there are some
> more issues and a pending rewrite, so I decided not to touch it.
>
> commit 8e5885bce1afecd419e40240a2d7ab90deb0392a
> Author: Konstantin Belousov <kostik@pooma.home>
> Date:   Tue Feb 8 22:35:29 2011 +0200
>
>     Do not forget to activate the page
>
> diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> index e8191b3..7343c72 100644
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> @@ -353,6 +353,9 @@ page_unlock(vm_page_t pp)
>  {
>  
>  	vm_page_wakeup(pp);
> +	vm_page_lock(pp);
> +	vm_page_activate(pp);
> +	vm_page_unlock(pp);
>  }
>  
>  static caddr_t
> @@ -480,7 +483,7 @@ again:
>  			if (error == 0)
>  				uiomove_fromphys(&m, off, bytes, uio);
>  			VM_OBJECT_LOCK(obj);
> -			vm_page_wakeup(m);
> +			page_unlock(m);
>  		} else if (uio->uio_segflg == UIO_NOCOPY) {
>  			/*
>  			 * The code below is here to make sendfile(2) work
> @@ -527,9 +530,15 @@ again:
>  				zfs_unmap_page(sf);
>  			}
>  			VM_OBJECT_LOCK(obj);
> -			if (error == 0)
> -				m->valid = VM_PAGE_BITS_ALL;
>  			vm_page_io_finish(m);
> +			vm_page_lock(m);
> +			if (error == 0) {
> +				m->valid = VM_PAGE_BITS_ALL;
> +				vm_page_activate(m);
> +			} else
> +				vm_page_free(m);
> +			vm_page_unlock(m);
> +
>  			if (error == 0) {
>  				uio->uio_resid -= bytes;
>  				uio->uio_offset += bytes;
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4D52F498.2090000>