Date: Mon, 15 Mar 2010 16:03:23 -0400
From: Charles Owens <cowens@greatbaysoftware.com>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-hardware@freebsd.org
Subject: Re: mptutil(8) segfault on IBM xSeries 3550
Message-ID: <4B9E928B.2070409@greatbaysoftware.com>
In-Reply-To: <201002191315.13796.jhb@freebsd.org>
References: <4B75AB2D.2090306@greatbaysoftware.com> <201002181023.08131.jhb@freebsd.org> <4B7ED202.2030901@greatbaysoftware.com> <201002191315.13796.jhb@freebsd.org>

John Baldwin wrote:
> On Friday 19 February 2010 1:01:38 pm Charles Owens wrote:
>
>> John Baldwin wrote:
>>
>>> On Monday 15 February 2010 5:25:15 pm Charles Owens wrote:
>>>
>>>> Charles Owens wrote:
>>>>
>>>>> Howdy,
>>>>>
>>>>> We're working with IBM hardware (xSeries 3550) that has an
>>>>> mpt-based RAID controller... after initial success with testing the
>>>>> mptutil utility, now operations other than "show adapter" and "show
>>>>> volume" are resulting in segfaults.
>>>>>
>>>>> While it was working properly we created and removed volumes several
>>>>> times, force-failed drives, and just generally put it through its
>>>>> paces... and all seemed fine. Then, after a reboot, it suddenly started
>>>>> failing with segfault as described, and nothing we do has helped to get
>>>>> it out of this state (including trying to use the LSI in-BIOS manager to
>>>>> create/delete volumes -- which in and of itself works fine).
>>>>>
>>>>> We found recent thread
>>>>> http://docs.freebsd.org/cgi/mid.cgi?4B56CD4C.80503 and hoped that it
>>>>> might somehow relate... and even tried the patch that John Baldwin
>>>>> posted, but to no avail.
>>>>>
>>>>> Has anyone seen this behavior and/or have a suggested fix or workaround?
>>>>>
>>>>>
>>>>> Here's the output of "mptutil show adapter":
>>>>>
>>>>>     mpt0 Adapter:
>>>>>            Board Name: SR-BR10i
>>>>>        Board Assembly: L3-25116-01H
>>>>>             Chip Name: C1068E
>>>>>         Chip Revision: UNUSED
>>>>>           RAID Levels: RAID0, RAID1, RAID1E
>>>>>         RAID0 Stripes: 64K
>>>>>        RAID1E Stripes: 64K
>>>>>      RAID0 Drives/Vol: 1-10
>>>>>      RAID1 Drives/Vol: 2
>>>>>     RAID1E Drives/Vol: 3-10
>>>>>
>>>>>
>>>>> This work is being done using FreeBSD 8.0-RELEASE-p2 + PAE.
>>>>>
>>>>>
>>>> I should add that the RAID controller in question is the IBM
>>>> ServeRAID-BR10i SAS/SATA Controller which is based on the LSI 1068E
>>>> processor, as described here:
>>>> http://www-01.ibm.com/common/ssi/rep_ca/4/872/ENUSAG09-0104/index.html
>>>>
>>> Try this updated patch. It should fix the problems with 'mptutil show drives'
>>> displaying all daX devices in the system rather than just the ones for the
>>> mptX bus. I had incorrectly interpreted the XPT matches as being an AND
>>> rather than an OR. This changes the code to first do a lookup for the logical
>>> "path" (SCSI bus) for mptX devices and then do a second lookup to fetch any
>>> daX devices on that path. I tested it on a machine with an mpt controller and
>>> a USB disk. Unfortunately I wasn't able to test any of the RAID stuff, just
>>> 'show drives'. This mpt(4) controller doesn't support RAID either, so I was
>>> also able to verify the fix you had already tested for cleaning up 'show
>>> adapter' output in that case.
>>>
>>> [patch omitted]
>>>
>> John,
>>
>> The patch appears to have resolved the problem. We're still banging on
>> it, but so far it looks very good!
>>
>> Thanks very much!
>>
>
> Excellent, thanks! I've committed it to HEAD and will MFC it in a week or
> so. It is probably too late to make 7.3 however.
>

Again, thanks for the patch... overall it is working well... we're now
able to successfully do what we need to do with the RAID system. We are,
though, seeing some sort of error message:

    # mptutil show volumes
    mpt0 Volumes:
      Id     Size    Level   Stripe  State  Write-Cache  Name
    mptutil: mpt_query_disk got 4 matches, expected 2
         0 (  279G) RAID-1          OPTIMAL   Disabled

    # mptutil show config
    mpt0 Configuration: 1 volumes, 2 drives
    mptutil: mpt_query_disk got 4 matches, expected 2
        volume 0 (279G) RAID-1 OPTIMAL spans:
            drive 1 (279G) ONLINE <WD3000BLFS-23YBU 4V04> SATA
            drive 0 (279G) ONLINE <WD3000BLFS-23YBU 4V04> SATA
        spare pools: 0

We can certainly live with this, but I wanted to let you know in case
you thought it was worth digging into. Let me know if you need any
additional debug info beyond this:

    # camcontrol devlist
    <LSILOGIC Logical Volume 3000>     at scbus0 target 0 lun 0 (pass0,da0)
    <ATA WD3000BLFS-23YBU 4V04>        at scbus1 target 1 lun 0 (pass1)
    <Linux Virtual CD/DVD 0316>        at scbus2 target 0 lun 0 (pass2,cd0)
    <Linux Virtual Floppy 0316>        at scbus3 target 0 lun 0 (da1,pass3)
    <Linux Virtual Floppy 0316>        at scbus3 target 0 lun 1 (da2,pass4)

Thanks,

Charles
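
For readers following the quoted explanation that separate XPT match patterns are ORed rather than ANDed, here is a small Python model of that behaviour. It is only a sketch and not mptutil or CAM code: the Periph class, the dict-style patterns, and the matches/query helpers are hypothetical names introduced for illustration. The device list is transcribed from the camcontrol devlist output above, and treating scbus0 as the bus resolved by the first lookup is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Periph:
    """Simplified stand-in for one CAM peripheral (hypothetical model)."""
    driver: str   # peripheral driver name, e.g. "da" or "pass"
    unit: int     # unit number, e.g. 0 for da0
    path_id: int  # SCSI bus number (the N in scbusN)

    @property
    def name(self) -> str:
        return f"{self.driver}{self.unit}"


# Transcribed from the "camcontrol devlist" output in the message.
DEVLIST = [
    Periph("pass", 0, 0), Periph("da", 0, 0),
    Periph("pass", 1, 1),
    Periph("pass", 2, 2), Periph("cd", 0, 2),
    Periph("da", 1, 3), Periph("pass", 3, 3),
    Periph("da", 2, 3), Periph("pass", 4, 3),
]


def matches(dev: Periph, pattern: dict) -> bool:
    """A single pattern matches only if every field it names agrees (AND)."""
    return all(getattr(dev, field) == want for field, want in pattern.items())


def query(devices, patterns):
    """Return devices that satisfy at least one pattern (OR across patterns)."""
    return [d for d in devices if any(matches(d, p) for p in patterns)]


# Assumption for illustration: scbus0 is the bus resolved by the first lookup.
MPT_PATH = 0

# Two separate patterns: everything on the bus OR every da device anywhere.
too_broad = query(DEVLIST, [{"path_id": MPT_PATH}, {"driver": "da"}])

# One pattern carrying both constraints: only da devices on that bus.
narrowed = query(DEVLIST, [{"path_id": MPT_PATH, "driver": "da"}])

print([d.name for d in too_broad])
print([d.name for d in narrowed])
```

In this model the two-pattern query also returns the unrelated da1 and da2 devices, while the single combined pattern returns only da0. It illustrates the OR-versus-AND distinction only and is not meant as a diagnosis of the mpt_query_disk message.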