From: "Benjie Chen"
To: freebsd-hardware@freebsd.org, freebsd-hackers@freebsd.org
Date: Fri, 21 Sep 2007 18:26:33 -0400
Subject: Kernel panic on PowerEdge 1950 under certain stress load

Hi FreeBSD hackers and engineers,

I am experiencing a kernel panic that occurs when my new PowerEdge 1950 running FreeBSD 6.2 is under a certain stress load. I've emailed a few people on the list who have given me useful comments, some of which I am still following up on. But I wanted to send a general cry for help to see if there is more knowledge out there about this problem.

FreeBSD 6.2 on a PowerEdge 1950, RAID1 setup with the mfi driver (PERC5i), 4GB RAM.
I am currently running i386, not amd64, for various reasons.

I have run exhaustive memory, disk, and network tests and cannot reproduce the kernel panic with them. I worked with Dell support to run the memory test one DIMM at a time and could not find any problem. Even with one DIMM at a time, I could still get the kernel panic under my workload.

My workload is heavy traffic against a web site running on the machine, which requires the web service to make MySQL requests. On the side, I am running a bunch of scripts that mostly read from the MySQL database but also write to it occasionally. It is not memory intensive (there is usually about 1GB of free memory) but it is fairly disk intensive. I don't get disk errors.

Anywhere from 10 minutes to 4 or 5 hours into the test, I get the kernel panic. Again, still no disk errors. I turned off soft updates and it still happens.

The kernel panic is at 0xC066C731, which nm places in the kernel mutex code, just below the start of _mtx_lock_spin:

    c066c7b4 T _mtx_lock_spin
    c066c85c T _mtx_unlock_sleep

So this could mean that the independent stress tests do not result in a panic because they don't generate enough concurrency to cause the problem.

There are a few other complaints on the web about kernel panics at the same instruction pointer (google 0xc066c731)... I was wondering if anyone has dealt with this before and whether there are any workarounds?

Thanks,
Benjie
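
For anyone repeating the nm lookup, here is a minimal Python sketch of mapping a faulting address to the symbol that contains it. The helper names parse_nm and containing_symbol are hypothetical, and the input is only the two nm lines quoted in this message, not the full kernel symbol table. With just those two lines, 0xc066c731 sits below both symbols, so the containing function would be whichever symbol precedes _mtx_lock_spin in the full sorted listing.

```python
import bisect

def parse_nm(text):
    """Parse nm output lines of the form '<hex addr> <type> <name>'."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def containing_symbol(symbols, addr):
    """Return (name, offset) for the last symbol starting at or below addr.

    Returns None when addr lies below every known symbol.
    """
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return name, addr - start

# Only the two lines quoted in the message; a real lookup would feed in
# the complete output of nm on the kernel image.
NM_EXCERPT = """\
c066c7b4 T _mtx_lock_spin
c066c85c T _mtx_unlock_sleep
"""

symbols = parse_nm(NM_EXCERPT)
print(containing_symbol(symbols, 0xC066C731))  # below both quoted symbols
print(containing_symbol(symbols, 0xC066C7C0))  # inside _mtx_lock_spin
```

Running the same lookup against the complete symbol table would show which function actually covers the panic address.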