From: Graham Allan
Date: Thu, 15 May 2008 08:01:58 -0500
To: freebsd-scsi@freebsd.org
Subject: Re: Hang on boot in isp with QLA2342 after upgrading to 6.3
Message-ID: <482C3446.8010203@physics.umn.edu>
In-Reply-To: <20080514014307.GV25577@physics.umn.edu>

Graham Allan wrote:
> On Mon, May 12, 2008 at 12:14:04PM -0500, Graham Allan wrote:
>> It has been pointed out to me that this kind of weird interaction isn't
>> exactly unknown in the SAN world, and setting up zoning on the switch
>> would probably make it go away. So I will also try that (it's probably
>> a giveaway of a SAN novice that I hadn't already done so - it certainly
>> does sound like it would help). But if the hang does point to a problem
>> in the driver, I'm also happy to keep trying different things in the
>> hope of revealing where the problem actually lies.
>
> Replying to my own message here.
>
> The good news for me is that setting up zoning in the switch does fix
> (or at least hide) the problem on this server for me.
>
> The bad news is, I believe I'm seeing a similar kind of behaviour on a
> completely different 6.3 setup. Haven't had time to fully characterise
> it yet, but in short... Dell 1950 with QLA2342, connected directly to
> an EMC CX300 array. Very often (let's say unpredictably 50% of time)
> hangs during boot at exactly the same point as the first system, right
> around the time it would be probing for drives.

So I guess one thing I could do is build a kernel with debugging support
(and possibly the "deadlock recipe" from the FreeBSD handbook), and force
it into the debugger when it hangs. Then I could at least get some
tracebacks and other information - though as it never actually panics I'm
not sure how useful the information will be. I guess it's likely stuck in
a loop somehow, so it should give some clue.

Does that sound like a reasonable idea? Does the kernel version matter
(e.g. standard 6.3 vs RELENG_6)? Is this list the most appropriate place
for me to talk about the issue?

(I also think I should double-check 6.2 again, as its release notes
indicate it was where isp was synced from CURRENT - I'd think it should
have the same issue.)

Thanks for everyone's interest,

Graham
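
For reference, the debugging kernel described above could be sketched as
the following kernel configuration additions. This is only a minimal
sketch, assuming the option set from the "Debugging Deadlocks" section of
the FreeBSD Developers' Handbook plus the standard KDB/DDB options.
Whether every option is available on a given 6.x branch is an assumption
to check against sys/conf/NOTES.

```
# Hypothetical additions to a copy of GENERIC for a 6.x debugging kernel
makeoptions     DEBUG=-g                # build the kernel with debugging symbols
options         KDB                     # kernel debugger framework
options         DDB                     # interactive in-kernel debugger
options         BREAK_TO_DEBUGGER       # a BREAK on a serial console enters the debugger
options         INVARIANTS              # run-time consistency checks
options         INVARIANT_SUPPORT       # support code required by INVARIANTS
options         WITNESS                 # lock order verification
options         WITNESS_SKIPSPIN        # skip spin mutexes to reduce WITNESS overhead
options         DEBUG_LOCKS             # record extra lock debugging information
options         DEBUG_VFS_LOCKS         # vnode lock assertions
options         DIAGNOSTIC              # additional diagnostic checks
```

Once the hung system has been dropped into DDB, the same handbook section
suggests collecting the output of trace, ps, show pcpu, show allpcpu,
show locks, show alllocks, show lockedvnods and alltrace. Not all of
these commands are necessarily present on 6.3.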