From: Scott Long
To: Joerg Pulz
Cc: stable@freebsd.org
Date: Wed, 16 Nov 2005 09:10:46 -0700
Subject: Re: FreeBSD-6 amr and ahd trouble
Message-ID: <437B5A06.6060804@samsco.org>
In-Reply-To: <20051115161253.F7025@hades.admin.frm2>
References: <20051115161253.F7025@hades.admin.frm2>
List-Id: Production branch of FreeBSD source code

Joerg Pulz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> Hi guys,
>
> I'm running an Fujitsu-Siemens Primergy RX300 dual-XEON hyperthreading
> enabled server with an onboard LSI MegaRAID controller and an Adaptec
> 39320A Ultra320 dual channel SCSI adapter.
> The LSI MegaRAID controller
> is configured to RAID1 with two disk and one hotspare. On this array
> FreeBSD is installed.
> Up to now, the system was running fine with FreeBSD-5.3 first and
> FreeBSD-5.4 now.
> I tried to upgrade this beast to FreeBSD-6.0-RELEASE without success.
> The kernel is booting and detects all devices correctly but when it
> comes to read from the amr(4) the last thing i see is "GEOM: new disk
> amrd0" after that the system "hangs" and its nearly impossible to scroll
> the kernel messages up or down (Scroll lock pressed). then after a while
> there are a lot of SCSI error messages about SCB timeouts coming from
> the ahd(4).
> I decided to boot the old RELENG_5_4 kernel and cvsup'ed the sources to
> RELENG_6 but i got the same results. booting from a FreeBSD-6.0-RELEASE
> bootonly CDRom got again the same results.
> I searched google about this, and found something about a tuneable
> sysctl/loader setting called hw.pci.do_powerstate and tried it, but the
> same result. later i saw, that in RELENG_6 this tuneable is renamed and
> set to 0 anyway.
> the next step was removing the Adaptec card to make sure this one is not
> interrupting the amr(4) but the only thing that happened was the SCSI
> error messages going away so this was not the problem.
> I decided to give CURRENT from today a try, and it was working without
> any problems. I have tested CURRENT some steps back until i hit 700003
> dated to "Sun Sep 18 05:12:39 2005 UTC" which is exactly the same time
> the RELENG_6 branch was marked for 6.0-BETA5 and CURRENT was working
> with every point i checked out from cvs. Unfortunately 6.0-BETA5 is NOT
> working.
> I checked out the sources for 6.0-BETA4 and it is working again. So
> somewhere between 6.0-BETA4 and 6.0-BETA5 the whole thing is broken, at
> least for me and my hardware.
> I've seen some differences in sys/cam/cam_xpt.c, maybe these cause the
> trouble i have, but I'm not so deep in the FreeBSD kernel code to make
> this sure.
>
> It would be nice if someone can take a look at this to get this fixed in
> RELENG_6.
> Any patches to test are welcome.
>
> regards
> Joerg
>

This is almost certainly an interrupt routing bug.  Can you try booting
with ACPI disabled?  Can you try building a 6.0 kernel without SMP and
the 'apic' devices?  From 5.4, can you send your system information?

Scott
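
For reference, the two tests above might look roughly like the sketch
below.  It assumes an i386 machine whose kernel config contains both
'options SMP' and 'device apic'.  Whether those lines are present
depends on the architecture and on the config file you start from, so
check your own copy before editing.  The config name NOAPIC is made up
for illustration.

```
# --- ACPI disabled, one boot only (typed at the loader prompt) ---
set hint.acpi.0.disabled=1
boot

# --- ACPI disabled on every boot (line in /boot/loader.conf) ---
hint.acpi.0.disabled="1"

# --- Kernel without SMP and the I/O APIC ---
# In a copy of your kernel config (hypothetical name: NOAPIC),
# comment out these two lines if they are present, then rebuild:
#options 	SMP
#device		apic
```

Trying the one-boot loader setting first is the cheaper test, since it
needs no rebuild and leaves nothing behind if the machine hangs again.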