From: Antony Mawer
Date: Fri, 14 Jul 2006 20:07:44 -1000
To: User Freebsd
Cc: Kostik Belousov, Robert Watson, freebsd-stable@freebsd.org,
    Michel Talon, Francisco Reyes
Subject: Re: vm_map.c lock up (Was: Re: NFS Locking Issue)
Message-ID: <44B88630.7040506@mawer.org>
In-Reply-To: <20060715010607.L1799@ganymede.hub.org>
References: <20060705100403.Y80381@fledge.watson.org>
    <20060705234514.I70011@fledge.watson.org>
    <20060715000351.U1799@ganymede.hub.org>
    <20060715035308.GJ32624@deviant.kiev.zoral.com.ua>
    <20060715010607.L1799@ganymede.hub.org>

On 14/07/2006 6:08 PM, User Freebsd wrote:
>> Just in case, do you use mlocked mappings ? Also, why so huge number
>> of crons exist in the system ? The are all forking now.
>> It may be (can not say definitely without further investigation) just
>> a fork bomb.
>
> re: crons ... this, I'm not sure of, but my suspicion was that the crons
> weren't able to complete, since the file system was locked up, but the
> next one was being attempted to run ... *shrug*

This seems consistent with behaviour I've seen on several 6.0-RELEASE
machines. From the limited information I've been able to get from those
machines, multiple tasks from cron appear to have piled up on top of one
another. In particular, the daily periodic tasks that run the various
'find' commands were one of the things I noticed (although we run
numerous tasks out of cron).

If something is blocking the filesystem and causing find (and possibly
other processes) to become stuck, these would just keep mounting up
until it all falls over (with numerous maxproc exceeded etc. errors).

These are machines without NFS, but the symptoms are very similar. NWFS
and SMBFS are commonly used on a number of the machines I've seen the
problem on, which may be relevant -- perhaps it affects more than just
NFS?

I may experiment with building a test server locally and trying to
reproduce similar loads to see if I can trigger the problem in-house. At
least that way I can hook up a serial console and get some more detailed
information.

Regards
Antony