From: Palle Girgensohn
Date: Wed, 25 May 2005 22:40:12 +0200
To: amd64@freebsd.org
Cc: Jon Kuster
Subject: Dual Xeon EM64T crashes reliably w/ 5.x amd64
Message-Id: <75f1b24e6dc7e145f7d36a874b825ab1@pingpong.net>

Hi!

When running our Dell 2850 (dual Xeon CPUs) in SMP mode with 5.4 (same w/ 5.3-stable), it will reliably crash at least a couple of times per day.
When crashing, the kernel panics (Fatal trap 12: page fault while in kernel mode) and the system will not reboot, nor will it save a core dump. I need to manually hit the big button to reboot.

This machine is very loaded, mostly due to some rather sloppy PHP scripts that could be optimized a lot. Average load >1 most of the time, I'd say. Still, it's not a reason to panic, IMHO :)

I've built a uniprocessor kernel, and now the machine is quite stable, but that's not a solution, of course.

I'm cc:ing Jon Kuster, since he describes exactly the same problem, with identical hardware. His machine is not as loaded, so in his case moving from four CPUs (two "real" + HTT) to two real (shutting down HTT) was enough to stop the crashes. For me, I must run UP.

So, I don't get any core dumps, the machine does not reboot automatically, and customers are really unhappy. I'm clueless and need help. What do I do? Don't say "linux"... :(

/Palle