Date: Tue, 18 Mar 2014 21:53:04 -0700
Subject: Re: Tracking down what has inact pages locked up
From: Adrian Chadd
To: Karl Denninger
Cc: Alan Cox, "freebsd-hackers@freebsd.org"

Can you dump out vmstat -z during the leak and once you've unloaded it?

-a

On 18 March 2014 20:36, Karl Denninger wrote:
>
> On 3/18/2014 4:30 PM, John Baldwin wrote:
>>
>> On Tuesday, March 18, 2014 3:36:04 pm Karl Denninger wrote:
>>>
>>> On 3/18/2014 2:05 PM, John Baldwin wrote:
>>>>
>>>> On Sunday, March 16, 2014 4:36:06 pm Karl Denninger wrote:
>>>>>
>>>>> Is there a reasonable way to determine who or what has that memory
>>>>> locked up -- and thus why the vm system is not demoting that space into
>>>>> the cache bucket so it can be freed (which, if my understanding is
>>>>> correct, should be happening long before now!)
>>>>
>>>> I have a hackish thing (for 8.x, might work on 10.x) to let you figure out
>>>> what is using up RAM. This should perhaps go into the base system at some
>>>> point.
>>>>
>>>> Grab the bits at http://people.freebsd.org/~jhb/vm_objects/
>>>>
>>>> You will want to build the kld first and use 'make load' to load it. It adds
>>>> a new sysctl that dumps info about all the VM objects in the system. You can
>>>> then build the 'vm_objects' tool and run it. It can take a while to run if
>>>> you have NFS mounts, so I typically save its output to a file first and then
>>>> use sort on the results. sort -n will show you the largest consumer of RAM,
>>>> sort -n -k 3 will show you the largest consumer of inactive pages. Note
>>>> that 'df' and 'ph' objects are anonymous, and that filename paths aren't
>>>> always reliable, but this can still be useful.
>>>>
>>> Thanks.
>>>
>>> I suspect the cause of the huge inact consumption is a RAM leak in the
>>> NAT code in IPFW. It was not occurring in 9.2-STABLE, but is on
>>> 10.0-STABLE, and reverting to natd in userland stops it -- which
>>> pretty-well isolates where it's coming from.
>>
>> Memory for in-kernel NAT should be wired pages, not inactive.
>
> Yeah, should be. :-)
>
> But..... it managed to lock up 19GB of the 24GB the system has in inact
> pages over 12 hours, and dropping the system to single user and unloading
> the modules did not release the RAM...... which is why the question (on how
> to track down what the hell is going on.)
>
> Changing the config back to natd as opposed to in-kernel NAT, however, made
> the problem disappear.
>
> --
> -- Karl
> karl@denninger.net
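
The vmstat -z comparison requested at the top of this message (one dump during the leak, one after unloading the modules) could be scripted roughly as follows. This is only a sketch and not something posted in the thread: it assumes the usual `NAME: SIZE, LIMIT, USED, ...` row layout of vmstat -z output, and the function names and the two snapshot file names are hypothetical. It reports which kernel zones hold more items in the second snapshot than in the first; it says nothing by itself about inactive pages.

```python
import sys


def parse_vmstat_z(text):
    """Parse `vmstat -z` text into {zone_name: (item_size, used_count)}.

    Assumes rows of the form "NAME: SIZE, LIMIT, USED, FREE, ...".
    Lines that do not match (header, blanks) are skipped.
    """
    zones = {}
    for line in text.splitlines():
        # Zone names may contain spaces, so split on the last colon.
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) < 3 or not (fields[0].isdigit() and fields[2].isdigit()):
            continue
        zones[name.strip()] = (int(fields[0]), int(fields[2]))
    return zones


def zone_growth(before_text, after_text):
    """Return [(bytes_grown, zone_name, items_grown)], largest first."""
    before = parse_vmstat_z(before_text)
    after = parse_vmstat_z(after_text)
    growth = []
    for name, (size, used) in after.items():
        old_used = before.get(name, (size, 0))[1]
        delta = used - old_used
        if delta > 0:
            growth.append((delta * size, name, delta))
    growth.sort(reverse=True)
    return growth


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python zone_growth.py first.txt second.txt
    with open(sys.argv[1]) as first, open(sys.argv[2]) as second:
        for nbytes, name, items in zone_growth(first.read(), second.read()):
            print(f"{nbytes:>14} bytes  {items:>10} items  {name}")
```

To compare "during the leak" against "after unload", pass the leak-time dump as the second file and look for zones that are large there; swapping the arguments shows what was released by the unload.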