From: Robert Watson
Date: Thu, 6 Nov 2003 11:23:30 -0500 (EST)
To: Kevin Oberman
cc: current@freebsd.org
Subject: Re: Kernel memory leak in ATAPI/CAM or ATAng?
In-Reply-To: <20031106160831.4D10F5D07@ptavv.es.net>

On Thu, 6 Nov 2003, Kevin Oberman wrote:

> I have learned a bit more about the problems I have been having with
> the DVD drive on my T30 laptop. When I have run the drive for an
> extended time (like 2 or 3 hours), I invariably have my system lock up
> because it can't malloc kernel memory for the ATAPI/CAM or ATA
> device. (Usually it's both.)
>
> The only recovery seems to be to reboot the system.

Is it possible to drop to DDB and generate a coredump at that point? If
so, you can run vmstat on the core to look at memory use statistics in a
post-mortem way.
As to what to look for: "big numbers" is about the limit of what I can
suggest, I'm afraid :-). Usually the activity of choice is to compare
vmstat statistics (with -m and -z) during normal operation and when the
leak has occurred, and look for any marked differences.

It's worth observing that there are two failure modes here that appear
almost identical: (1) a memory leak resulting in address space
exhaustion for the kernel, and (2) a tunable maximum allocation being
too high for the available address space. Note that (2) isn't a leak,
simply a poorly tuned value. We've noticed that a number of tuned memory
limits were set when memory sizes on systems were much lower, and so
we've had to readjust the tuning parameters for large memory systems.
Likewise, a number of problems were observed when PAE was introduced, as
some of the tuning parameters scaled with the amount of physical memory,
not with the addressable space for the kernel. So we probably want to be
on the lookout for both of these possibilities.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories
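
The "compare and look for big numbers" step above can be roughly
automated. What follows is only a sketch, not anything from the thread:
it assumes each data row of saved `vmstat -m` output has the shape
"type name, InUse count, MemUse ending in K", which may not hold on
every FreeBSD version, and it does not handle `vmstat -z` output. The
function names and the sample figures are made up for illustration.

```python
import re


def parse_vmstat_m(text):
    """Return {malloc type: MemUse in KiB} from `vmstat -m`-style text.

    Assumed row shape (not guaranteed on every release):
        <type name, may contain spaces> <InUse> <MemUse>K ...
    Lines that do not fit, such as the header, are skipped.
    """
    usage = {}
    for line in text.splitlines():
        m = re.match(r"\s*(.+?)\s+(\d+)\s+(\d+)K\b", line)
        if m:
            usage[m.group(1)] = int(m.group(3))
    return usage


def growth(before, after, top=5):
    """List (type, KiB gained) for the types that grew the most."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    ranked = sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, d) for name, d in ranked if d > 0][:top]


# Invented sample snapshots, for illustration only (not real measurements).
BEFORE = """
         Type  InUse MemUse HighUse Requests  Size(s)
  ATA generic      4     1K       -      120  16,512
       devbuf    900  2000K       -     1500  16,32,64
"""
AFTER = """
         Type  InUse MemUse HighUse Requests  Size(s)
  ATA generic  52000 26000K       -   910000  16,512
       devbuf    905  2010K       -     1620  16,32,64
"""

if __name__ == "__main__":
    for name, gained in growth(parse_vmstat_m(BEFORE), parse_vmstat_m(AFTER)):
        print(f"{name}: +{gained}K")
```

A type whose MemUse keeps climbing between snapshots is a leak
candidate for case (1). A ranking like this says nothing about case (2),
which needs the tunable limits checked against the kernel address space
instead.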