From owner-freebsd-stable@FreeBSD.ORG Fri Mar 27 09:16:44 2015 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4D4BA43C; Fri, 27 Mar 2015 09:16:44 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E3EB9EA6; Fri, 27 Mar 2015 09:16:43 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t2R9Gc7e052295 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Fri, 27 Mar 2015 11:16:38 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t2R9Gc7e052295 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t2R9Gbbu052294; Fri, 27 Mar 2015 11:16:37 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 27 Mar 2015 11:16:37 +0200 From: Konstantin Belousov To: J David Subject: Re: Significant memory leak in 9.3p10? Message-ID: <20150327091637.GH2379@kib.kiev.ua> References: <20150316232404.GM2379@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-stable , "freebsd-questions@freebsd.org" X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2015 09:16:44 -0000 On Thu, Mar 26, 2015 at 03:46:05PM -0400, J David wrote: > On Mon, Mar 16, 2015 at 7:52 PM, J David wrote: > > On Mon, Mar 16, 2015 at 7:24 PM, Konstantin Belousov > > wrote: > >> There are a lot of possibilities to create persistent anonymous shared > >> memory objects. Not complete list is tmpfs mounts, swap-backed md disks, > >> sysv shared memory, possibly posix shared memory (I do not remember which > >> implementation is used in stable/9). > > > > If that's the explanation, how could it be > > detected/measured/investigated/resolved/prevented? > > > > Under ordinary circumstances, machines will go run like this for days/weeks: > > > > Mem: 549M Active, 3623M Inact, 567M Wired, 3484K Cache, 827M Buf, 3156M Free > > Swap: 1024M Total, 1024M Free > > > > Then, when this happens, it rapidly degrades from that to so bad that > > processes start getting killed for being out of swap space. > > These FreeBSD machines running out of swap space and dying continues > to be a daily problem causing outages and unscheduled reboots. Is > there really no way to even research what might be causing the > problem? > > (Widening the cross-posting in the hopes of eliciting more help, so > the brief summary of the problem orginally posted to freebsd-stable is > that an unknown actor consumes all the user-space memory in the > system, including swap space, to the point where processes are killed > for being out of swap space, but if every process on the machine is > stopped, very little of the user-space memory in use is freed. > Original message with more details is here: > https://lists.freebsd.org/pipermail/freebsd-stable/2015-March/081986.html > .) > > There are no tmpfs mounts or md disks, so it would have to be one of > the other causes. How can FreeBSD's use of persistent, anonymous > shared memory objects be investigated, measured, or controlled so we > can get a handle on this issue? Start by providing useful information about your system, not a description of the information. E.g., a consistent snapshot of the following: ps auxww swapinfo mount -v mdconfig -lv vmstat -z vmstat -m vmstat -s sysctl -a ipcs -a Collect this data both during the normal run, run while the problem appear but userspace is not killed, and after you killed the processes. Just in case, show kldstat.