From: Gerald
Date: Tue, 27 Apr 2004 12:19:22 -0400 (EDT)
To: freebsd-stable@freebsd.org
Subject: Compaq 1850R freezing, controller issues?
Message-ID: <20040427120729.W56999@kod.inch.com>

I'll try to be as specific as possible without overkill. I have a Compaq 1850R with dual P3 450s and a gig of RAM running FreeBSD 4.8-RELEASE-p16. The 3 internal 36G SCSI 10k disks are set up as RAID 5 on the Smart 2 SL RAID controller. SMP is enabled. FreeBSD uses the ida driver to interact with the RAID controller.

This machine carries the heaviest load I've put on these 1850s so far, and it started freezing shortly after I was forced to put it in production. There is a lot of disk I/O since this is a mail server (POP & SMTP), and the disk is being accessed over NFS as well. Time between freezes ranges from 15 to 72 hours.

I've set up a lot of debugging to try to find out what is going on with the machine, and a little more light was shed this morning.

Let me define freeze:
- no network response at all
- display was still going to the monitor
- Alt-Fn keys would switch displays, but...
- type in a username and hit enter, and it just acknowledges the enter with line feeds
- mrtg was also registering a huge release of memory right before the crash. Average is 10-100 MB of free memory, and it would register all of the memory being freed up.

Saturday, when it froze last, I set up 2 displays running commands, since it appeared to keep running to the monitor after it would die to all else. One was running top -ores (since mrtg was pointing at memory) and the other was running systat -vm 5.

When it froze today, there were about 15 processes in State: inode running at priority -14. They weren't all sendmail either: snmpd, radiator (just for accounting), and sendmail were running at -14. The top process and other processes were still running, but all services had died again and I couldn't pull it out of the coma without hitting the power button...again.

None of the logs record anything out of the ordinary. The machine goes from normal operation to freeze too fast to record the problem. Also, if it is a disk access problem, then that would explain why my logs don't have anything.

If I had to put this as questions...:
- What do I do to keep the freezes from happening?
- What can I do to record more information to find out what specifically is causing the freeze? (Or is this enough information and I just don't know the answer?)
- Has anyone else put the Smart 2 SL or the 1850Rs through some heavy lifting on 4.8?

I'm going to do some research on the Smart 2 SL and see if there are any updates that the SmartStart CD might have put an SEP field around, and I'm going to try to find prices for a drop-in replacement controller. The disks are brand new from newegg, so I don't suspect them yet.

Thanks in advance for any help, suggestions, pointers, or assistance,

Gerald

P.S. First post to the FreeBSD lists. Go easy on me.