From: Andrea Venturoli
Date: Thu, 23 Nov 2006 12:08:30 +0100
To: Kris Kennaway, freebsd-questions@freebsd.org
Subject: Re: 6.x hangs on AMD64 again
Message-ID: <4565812E.6060604@netfence.it>
In-Reply-To: <20061111195632.GA8718@xor.obsecurity.org>

Kris Kennaway wrote:
> On Sat, Nov 11, 2006 at 11:15:54AM -0800, Chris wrote:
>>> If your system is hanging then you need to configure additional
>>> debugging to figure out the cause. Read the chapter on kernel
>>> debugging the developers handbook; without this information no
>>> developer can help you.
>>>
>>> Kris
>>>
>>> P.S. In my testing SMP amd64 is quite stable even under exceptionally
>>> heavy loads, so it's either something related to your hardware or your
>>> particular workload.
>> Hadn't considered that a user level debugging solution. I'll give it
>> a try.
>> ...
> That is indeed almost always failing hardware.

Hello.

I think I'm having the same problem. I'm running 6.1 (latest patch set)/amd64 on a dual-core Opteron Acer server with SCSI disks, and it hangs completely and without warning.

Checking the hardware was the first thing I did, but it really seems OK (unless it's the second core on the processor). Among other things, I checked the HDs with the vendor's tools, the RAM with MemTest86+ and the CPU with different stress tools. If anyone can suggest other diagnostics, I'd be happy to run them.

I compiled the kernel with debug info, but that's totally useless, since it won't dump anything; it just hangs there. I don't think even DDB would help, since even the keyboard is not working at that point. If I'm missing something, I'd be glad to be pointed in the right direction.

The box features an on-board em NIC, but since it shows a lot of problems, I removed that driver from the kernel (it's not possible to turn it off in the BIOS, though) and put in a different add-on card.

I had some shared IRQs, but managed to solve that issue (even if I think it should not matter).

Next, I'll try to disable SMP as soon as I can and see if it helps.

Of course upgrading to 6.2 should be attempted, but since this is a production server and 6.2 is still at RC1...

bye & Thanks
av.