From: Charles Owens
Date: Mon, 15 Mar 2010 16:03:23 -0400
To: John Baldwin
Cc: freebsd-hardware@freebsd.org
Subject: Re: mptutil(8) segfault on IBM xSeries 3550
Message-ID: <4B9E928B.2070409@greatbaysoftware.com>
In-Reply-To: <201002191315.13796.jhb@freebsd.org>
References: <4B75AB2D.2090306@greatbaysoftware.com> <201002181023.08131.jhb@freebsd.org> <4B7ED202.2030901@greatbaysoftware.com> <201002191315.13796.jhb@freebsd.org>
List-Id: General discussion of FreeBSD hardware

John Baldwin wrote:
> On Friday 19 February 2010 1:01:38 pm Charles Owens wrote:
>
>> John Baldwin wrote:
>>
>>> On Monday 15 February 2010 5:25:15 pm Charles Owens wrote:
>>>
>>>> Charles Owens wrote:
>>>>
>>>>> Howdy,
>>>>>
>>>>> We're working with IBM hardware (xSeries 3550) that has an
>>>>> mpt-based RAID controller...
>>>>> after initial success with testing the
>>>>> mptutil utility, now operations other than "show adapter" and "show
>>>>> volume" are resulting in segfaults.
>>>>>
>>>>> While it was working properly we created and removed volumes several
>>>>> times, force-failed drives, and just generally put it through its
>>>>> paces... and all seemed fine.  Then, after a reboot, it suddenly started
>>>>> failing with segfault as described, and nothing we do has helped to get
>>>>> it out of this state (including trying to use the LSI in-BIOS manager to
>>>>> create/delete volumes -- which in and of itself works fine).
>>>>>
>>>>> We found recent thread
>>>>> http://docs.freebsd.org/cgi/mid.cgi?4B56CD4C.80503 and hoped that it
>>>>> might somehow relate... and even tried the patch that John Baldwin
>>>>> posted, but to no avail.
>>>>>
>>>>> Has anyone seen this behavior and/or have a suggested fix or workaround?
>>>>>
>>>>>
>>>>> Here's the output of "mptutil show adapter":
>>>>>
>>>>> mpt0 Adapter:
>>>>>     Board Name: SR-BR10i
>>>>>     Board Assembly: L3-25116-01H
>>>>>     Chip Name: C1068E
>>>>>     Chip Revision: UNUSED
>>>>>     RAID Levels: RAID0, RAID1, RAID1E
>>>>>     RAID0 Stripes: 64K
>>>>>     RAID1E Stripes: 64K
>>>>>     RAID0 Drives/Vol: 1-10
>>>>>     RAID1 Drives/Vol: 2
>>>>>     RAID1E Drives/Vol: 3-10
>>>>>
>>>>>
>>>>> This work is being done using FreeBSD 8.0-RELEASE-p2 + PAE.
>>>>>
>>>>>
>>>> I should add that the RAID controller in question is the IBM
>>>> ServeRAID-BR10i SAS/SATA Controller which is based on the LSI 1068E
>>>> processor, as described here:
>>>> http://www-01.ibm.com/common/ssi/rep_ca/4/872/ENUSAG09-0104/index.html
>>>>
>>> Try this updated patch.  It should fix the problems with 'mptutil show drives'
>>> displaying all daX devices in the system rather than just the ones for the
>>> mptX bus.  I had incorrectly interpreted the XPT matches as being an AND
>>> rather than an OR.
>>> This changes the code to first do a lookup for the logical
>>> "path" (SCSI bus) for mptX devices and then do a second lookup to fetch any
>>> daX devices on that path.  I tested it on a machine with an mpt controller and
>>> a USB disk.  Unfortunately I wasn't able to test any of the RAID stuff, just
>>> 'show drives'.  This mpt(4) controller doesn't support RAID either, so I was
>>> also able to verify the fix you had already tested for cleaning up 'show
>>> adapter' output in that case.
>>>
>>> [patch omitted]
>>>
>> John,
>>
>> The patch appears to have resolved the problem.  We're still banging on
>> it, but so far it looks very good!
>>
>> Thanks very much!
>>
>
> Excellent, thanks!  I've committed it to HEAD and will MFC it in a week or
> so.  It is probably too late to make 7.3 however.
>

Again, thanks for the patch... overall it is working well... we're now
able to successfully do what we need to do with the RAID system.  We are,
though, seeing some sort of error messages:

# mptutil show volumes
mpt0 Volumes:
  Id Size Level Stripe State Write-Cache Name
mptutil: mpt_query_disk got 4 matches, expected 2
  0 ( 279G) RAID-1 OPTIMAL Disabled

# mptutil show config
mpt0 Configuration: 1 volumes, 2 drives
mptutil: mpt_query_disk got 4 matches, expected 2
    volume 0 (279G) RAID-1 OPTIMAL spans:
        drive 1 (279G) ONLINE SATA
        drive 0 (279G) ONLINE SATA
    spare pools: 0

We can certainly live with this, but I wanted to let you know in case
you thought it was worth digging into.  Let me know if you need any
additional debug info beyond this:

# camcontrol devlist
at scbus0 target 0 lun 0 (pass0,da0)
at scbus1 target 1 lun 0 (pass1)
at scbus2 target 0 lun 0 (pass2,cd0)
at scbus3 target 0 lun 0 (da1,pass3)
at scbus3 target 0 lun 1 (da2,pass4)

Thanks,

Charles
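
For anyone following the thread, the AND-versus-OR point in John's quoted
explanation can be pictured with a small toy model.  This is only a sketch:
the dicts, the xpt_match() helper, and the device table are all made up for
illustration and are not the real CAM XPT_DEV_MATCH interface, the mptutil
source, or the hardware discussed here.  It assumes that the fields inside
one pattern must all match while separate patterns are ORed together.

```python
# Toy model only: plain dicts stand in for CAM match results. This is not
# the real XPT_DEV_MATCH interface and not the mptutil source.

# Hypothetical device table (not the system from this thread).
RECORDS = [
    {"kind": "bus", "sim": "mpt", "unit": 0, "path_id": 0},
    {"kind": "bus", "sim": "umass-sim", "unit": 0, "path_id": 1},
    {"kind": "periph", "name": "da", "unit": 0, "path_id": 0},
    {"kind": "periph", "name": "pass", "unit": 0, "path_id": 0},
    {"kind": "periph", "name": "da", "unit": 1, "path_id": 1},  # USB disk
]


def xpt_match(records, patterns):
    """Return records that satisfy ANY pattern (patterns are ORed).

    Within a single pattern, every listed field must match.
    """
    return [
        r for r in records
        if any(all(r.get(k) == v for k, v in p.items()) for p in patterns)
    ]


def drives_single_query(records, mpt_unit):
    """One query with two patterns: returns every da device, because the
    'da' pattern matches on its own regardless of the bus pattern."""
    hits = xpt_match(records, [
        {"kind": "bus", "sim": "mpt", "unit": mpt_unit},
        {"kind": "periph", "name": "da"},
    ])
    return [r for r in hits if r["kind"] == "periph"]


def drives_two_step(records, mpt_unit):
    """Look up the path for mptX first, then the da devices on that path."""
    buses = xpt_match(records, [{"kind": "bus", "sim": "mpt", "unit": mpt_unit}])
    if not buses:
        return []
    path_id = buses[0]["path_id"]
    return xpt_match(
        records, [{"kind": "periph", "name": "da", "path_id": path_id}]
    )


if __name__ == "__main__":
    print([f"da{r['unit']}" for r in drives_single_query(RECORDS, 0)])
    print([f"da{r['unit']}" for r in drives_two_step(RECORDS, 0)])
```

In this made-up table the single query returns both da0 and da1, while the
two-step lookup returns only da0, the disk that sits on the mpt0 path.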