Date: Wed, 23 Jul 2003 18:20:56 -0400
From: "Kevin A. Pieckiel"
To: Mike Silbersack
Cc: freebsd-hackers@freebsd.org
Subject: Re: mbuf cluster shortage caused kernel panic

Mike,

On Wed, Jul 23, 2003 at 04:57:38PM -0500, Mike Silbersack wrote:
>
> Your panic seems to indicate that the mbuf cluster chain became corrupted,
> which could have happened in one of a few ways.  I'll address your
> question in two parts:
>
> 1.  How do I prevent the system from using all mbuf clusters.
>
> This depends on the application you're running; next time you're in a
> similar situation, you may wish to run netstat -n | more and look at the

You are exactly right.  Right before I read this E-Mail, I noticed I
started running out again.  I was fortunate enough to catch the right
information in time.
A program one of my colleagues wrote was running ping every couple of
seconds.  The problem was that the -c flag was not used, so ping never
exited, and I ended up with hundreds of ping commands running.  I was
not able to catch this before the panic.  (It panicked twice more,
BTW, before I was able to catch this.)  This time, I was fortunate
enough to notice a high load average on this machine.  That led me to
check the process list, which showed gazillions of ping commands
running.  'killall -9 ping' was my best friend today.

> 2.  How do I prevent the system from panicing when all mbuf clusters are
> used up?
>
> This question has a more useful answer. :)
>
> You could cvsup to 4.8-STABLE; at least two bugs which would result in
> panics during mbuf exhaustion have been fixed, and an additional potential
> panic causing situation has been patched.  One of those bugs may be the
> same as the one that affected you, but it would be very time consuming to
> figure it out.

This is a good thought.  In fact, I did use the third crash as an
opportunity to upgrade, in hopes of solving the panic problem, even if
it didn't solve the real issue.  Not panicking would give me more time
to see what was really wrong, even if I had no network.  Fortunately,
I didn't have to test this theory, but I did get the upgrade. :)

> If this problem is infrequent, I think your best course of action is to
> build a 4.7 kernel with INVARIANTS for now, and plan on a 4.8-stable
> upgrade at some point in the future.

Mike, I am truly thankful for your response.  I appreciate your help.
Even though I found the problem before I read your answer, I believe
your answer would have given me the insight and time I needed to find
the real problem had I not noticed my high load average.

Thank you.

Sincerely,
Kevin
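
P.S.  For the archives, here is a minimal sketch of how a polling
program like the one described above could keep ping processes from
piling up.  This is not my colleague's actual program; the function
names, the 192.0.2.1 placeholder address, and the timeout values are
all assumptions made for illustration.  The idea is simply to always
pass -c so ping exits on its own, and to put a hard timeout on the
child process as a backstop.

```python
import subprocess
import sys


def build_ping_command(host, count=1):
    """Return a ping argv that sends `count` probes and then exits (-c).

    Hypothetical helper: without -c, ping keeps running until killed.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return ["ping", "-c", str(count), host]


def run_bounded(argv, timeout_seconds):
    """Run argv to completion and return its exit status.

    Returns None if the child had to be killed for running longer
    than timeout_seconds.
    """
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills and reaps the child before raising,
        # so no stray process is left behind.
        return None
    return result.returncode


if __name__ == "__main__":
    # 192.0.2.1 is a documentation-only address used as a placeholder.
    host = sys.argv[1] if len(sys.argv) > 1 else "192.0.2.1"
    status = run_bounded(build_ping_command(host, count=3), timeout_seconds=10)
    print("ping status:", status)
```

Either safeguard alone would probably have avoided the pile-up; using
both means a hung child still gets cleaned up even if the count is
never reached.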