Date: Fri, 21 Sep 2007 18:26:33 -0400
From: "Benjie Chen"
To: freebsd-hardware@freebsd.org, freebsd-hackers@freebsd.org
Subject: Kernel panic on PowerEdge 1950 under certain stress load

Hi FreeBSD hackers and engineers,

I am experiencing a kernel panic that occurs when my new PowerEdge 1950 FreeBSD 6.2 setup is under a certain stress load. I've emailed a few people on the list who have given me useful comments, some of which I am still following up on. But I wanted to send a general cry for help to see if there is more knowledge out there about this problem.

The setup is FreeBSD 6.2 on a PowerEdge 1950, RAID1 with the mfi driver (PERC5i), and 4GB RAM. I am currently running i386, not amd64, for various reasons.

I've run exhaustive memory, disk, and network tests and cannot reproduce the kernel panic with them. I worked with Dell support to run the memory test one DIMM at a time and could not find any problem. With one DIMM at a time, I could still get the kernel panic under my workload.

My workload consists of heavily hitting a web site running on the machine, which requires the web service to make MySQL requests. On the side, I am running a bunch of scripts that mostly read from the MySQL database but also write to it occasionally. It is not memory intensive -- there is usually still about 1GB of free memory, but it is fairly disk intensive. I don't get disk errors.

Anywhere from 10 minutes to 4 or 5 hours into the test, I get the kernel panic. Again, still no disk errors. I turned off soft updates and it still happens.

The kernel panic is at 0xC066C731, which from nm looks to be in or near mtx_lock_spin:

c066c7b4 T _mtx_lock_spin
c066c85c T _mtx_unlock_sleep

So this could mean that independent stress tests will not result in a panic if there isn't enough concurrency to cause the problem.

There are a few other complaints on the web about kernel panics at the same IP (google 0xc066c731)... I was wondering if anyone has dealt with this before and if there are any workarounds?

Thanks,
Benjie
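
A minimal sketch of how the panic address can be matched against nm output, assuming the symbols are sorted by address and each function runs from its start address up to the next symbol. The helper names (parse_nm, containing_symbol) are hypothetical, and only the two nm lines quoted above are used. With just those two lines the lookup finds no match, because 0xC066C731 is 0x83 bytes below the start of _mtx_lock_spin; the fuller nm listing would be needed to name the symbol that actually contains the address.

```python
import bisect

# The two nm lines quoted in the post: address, type, name.
NM_OUTPUT = """\
c066c7b4 T _mtx_lock_spin
c066c85c T _mtx_unlock_sleep
"""


def parse_nm(text):
    """Return a list of (address, name) tuples sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    return sorted(symbols)


def containing_symbol(symbols, addr):
    """Return the last symbol starting at or below addr, or None if there is none."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    return symbols[i] if i >= 0 else None


symbols = parse_nm(NM_OUTPUT)
panic_addr = 0xC066C731

# None here: the address lies before the first symbol in this short listing.
hit = containing_symbol(symbols, panic_addr)

# Distance from the panic address up to the start of _mtx_lock_spin.
offset_to_next = symbols[0][0] - panic_addr
```

Feeding parse_nm the complete address-sorted listing (for example the output of nm -n on the kernel image) instead of the two-line excerpt would let containing_symbol return the enclosing function.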