Date: Fri, 30 Apr 2004 10:52:46 -0400 (EDT)
From: Gerald
To: Gerald
cc: freebsd-stable@freebsd.org
Subject: Re: Compaq 1850R freezing, controller issues?
Message-ID: <20040430104156.G15623@kod.inch.com>
In-Reply-To: <20040427120729.W56999@kod.inch.com>

On Tue, 27 Apr 2004, Gerald wrote:

> Saturday when it froze last I setup 2 displays running commands since it
> appeared to keep running to the monitor after it would die to all else.
> One was running top -ores (since mrtg was pointing around memory) and the
> other was running systat -vm 5. When it froze today, there were about 15
> processes in State: inode running at priority -14. They weren't all
> sendmail either. snmpd, radiator (just for accounting), and sendmail were
> running at -14. The top process and other processes were still running but
> all services had died again and I couldn't pull it out of the coma
> without hitting the power button...again.

This machine died again Wed night at 9 PM.
I had already installed a kernel without SMP but had not booted it yet, so I'll see how long that lasts. If we make it past tonight, I'll know for certain after this weekend whether that resolved the issue.

The biggest clue so far has been what I saw when I left top running on the console. When the machine freezes, the processes are still alive but sleeping in "STATE: inode". This was not one process in that state but 80% of the processes still on the screen. Most of those processes had a priority of -14.

I don't know UFS or the kernel source code well enough to know what problem a ton of processes waiting for "inode" would indicate, but I know what an inode is. I have plenty of inodes free, and a lack of response from the disk or RAID array would leave the kernel waiting for some type of inode information.

Can someone shed some light on what this alone might indicate? Could it be the RAID controller or a hard drive flaking out? Would the kernel missing an interrupt from a disk or the RAID card cause this? How can I debug this and find out why the processes were waiting on inodes? Can someone even clarify what a process in STATE: inode literally means? (Reading, writing, both, taking inventory, requesting inode information?)

Gerald