Date: Wed, 17 Mar 2010 11:14:11 -0400
From: John Baldwin <jhb@freebsd.org>
To: Charles Owens <cowens@greatbaysoftware.com>
Cc: freebsd-hardware@freebsd.org
Subject: Re: mptutil(8) segfault on IBM xSeries 3550
Message-ID: <201003171114.11601.jhb@freebsd.org>
In-Reply-To: <4B9E928B.2070409@greatbaysoftware.com>
References: <4B75AB2D.2090306@greatbaysoftware.com> <201002191315.13796.jhb@freebsd.org> <4B9E928B.2070409@greatbaysoftware.com>
On Monday 15 March 2010 4:03:23 pm Charles Owens wrote:
> John Baldwin wrote:
> > On Friday 19 February 2010 1:01:38 pm Charles Owens wrote:
> >
> >> John Baldwin wrote:
> >>
> >>> On Monday 15 February 2010 5:25:15 pm Charles Owens wrote:
> >>>
> >>>> Charles Owens wrote:
> >>>>
> >>>>> Howdy,
> >>>>>
> >>>>> We're working with IBM hardware (xSeries 3550) that has an
> >>>>> mpt-based RAID controller... after initial success with testing the
> >>>>> mptutil utility, now operations other than "show adapter" and "show
> >>>>> volume" are resulting in segfaults.
> >>>>>
> >>>>> While it was working properly we created and removed volumes several
> >>>>> times, force-failed drives, and just generally put it through its
> >>>>> paces... and all seemed fine. Then, after a reboot, it suddenly started
> >>>>> failing with segfault as described, and nothing we do has helped to get
> >>>>> it out of this state (including trying to use the LSI in-BIOS manager to
> >>>>> create/delete volumes -- which in and of itself works fine).
> >>>>>
> >>>>> We found recent thread
> >>>>> http://docs.freebsd.org/cgi/mid.cgi?4B56CD4C.80503 and hoped that it
> >>>>> might somehow relate... and even tried the patch that John Baldwin
> >>>>> posted, but to no avail.
> >>>>>
> >>>>> Has anyone seen this behavior and/or have a suggested fix or workaround?
> >>>>>
> >>>>>
> >>>>> Here's the output of "mptutil show adapter":
> >>>>>
> >>>>> mpt0 Adapter:
> >>>>>     Board Name: SR-BR10i
> >>>>>     Board Assembly: L3-25116-01H
> >>>>>     Chip Name: C1068E
> >>>>>     Chip Revision: UNUSED
> >>>>>     RAID Levels: RAID0, RAID1, RAID1E
> >>>>>     RAID0 Stripes: 64K
> >>>>>     RAID1E Stripes: 64K
> >>>>>     RAID0 Drives/Vol: 1-10
> >>>>>     RAID1 Drives/Vol: 2
> >>>>>     RAID1E Drives/Vol: 3-10
> >>>>>
> >>>>>
> >>>>> This work is being done using FreeBSD 8.0-RELEASE-p2 + PAE.
> >>>>>
> >>>>>
> >>>> I should add that the RAID controller in question is the IBM
> >>>> ServeRAID-BR10i SAS/SATA Controller which is based on the LSI 1068E
> >>>> processor, as described here:
> >>>> http://www-01.ibm.com/common/ssi/rep_ca/4/872/ENUSAG09-0104/index.html
> >>>>
> >>> Try this updated patch. It should fix the problems with 'mptutil show drives'
> >>> displaying all daX devices in the system rather than just the ones for the
> >>> mptX bus. I had incorrectly interpreted the XPT matches as being an AND
> >>> rather than an OR. This changes the code to first do a lookup for the logical
> >>> "path" (SCSI bus) for mptX devices and then do a second lookup to fetch any
> >>> daX devices on that path. I tested it on a machine with an mpt controller and
> >>> a USB disk. Unfortunately I wasn't able to test any of the RAID stuff, just
> >>> 'show drives'. This mpt(4) controller doesn't support RAID either, so I was
> >>> also able to verify the fix you had already tested for cleaning up 'show
> >>> adapter' output in that case.
> >>>
> >>> [patch omitted]
> >>>
> >> John,
> >>
> >> The patch appears to have resolved the problem. We're still banging on
> >> it, but so far it looks very good!
> >>
> >> Thanks very much!
> >>
> >
> > Excellent, thanks! I've committed it to HEAD and will MFC it in a week or
> > so. It is probably too late to make 7.3 however.
> >
>
> Again, thanks for the patch... overall it is working well... we're now
> able to successfully do what we need to do with RAID system.
> We are, though, seeing some sort of error messages:
>
> # mptutil show volumes
> mpt0 Volumes:
>   Id Size Level Stripe State Write-Cache Name
> mptutil: mpt_query_disk got 4 matches, expected 2
>   0 ( 279G) RAID-1 OPTIMAL Disabled
>
> # mptutil show config
> mpt0 Configuration: 1 volumes, 2 drives
> mptutil: mpt_query_disk got 4 matches, expected 2
>     volume 0 (279G) RAID-1 OPTIMAL spans:
>         drive 1 (279G) ONLINE <WD3000BLFS-23YBU 4V04> SATA
>         drive 0 (279G) ONLINE <WD3000BLFS-23YBU 4V04> SATA
>     spare pools: 0

Are you sure this is a fixed binary? The new binary doesn't print out that
message anymore; it only says 'got %d matches, expected 1'. Also, the 4
instead of 2 is consistent with the old bug, in that the two Linux virtual
floppies (da1 and da2) would be reported as extra for 'mptutil show drives'
in this case, I think.

> We can certainly live with this, but I wanted to let you know in case
> you thought it was worth digging into. Let me know if you need any
> additional debug info beyond this:
>
> # camcontrol devlist
> <LSILOGIC Logical Volume 3000> at scbus0 target 0 lun 0 (pass0,da0)
> <ATA WD3000BLFS-23YBU 4V04> at scbus1 target 1 lun 0 (pass1)
> <Linux Virtual CD/DVD 0316> at scbus2 target 0 lun 0 (pass2,cd0)
> <Linux Virtual Floppy 0316> at scbus3 target 0 lun 0 (da1,pass3)
> <Linux Virtual Floppy 0316> at scbus3 target 0 lun 1 (da2,pass4)

-- 
John Baldwin
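
The AND-versus-OR mix-up described in the quoted patch notes can be illustrated with a toy model. The sketch below is not the mptutil code and does not use the real CAM XPT interface; the `Entry` class, the `DEVICES` table, and the functions `match_any`, `single_query`, and `two_stage_query` are all hypothetical names made up for illustration. It only assumes what the message states: match patterns are ORed together, so one query for "mpt0" plus "da" returns every daX device, whereas looking up the bus first and then asking for daX devices on that bus returns only the intended ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str     # driver name, e.g. "mpt" or "da" (toy model)
    unit: int     # unit number, e.g. 0 for mpt0
    path_id: int  # SCSI bus ("path") the entry lives on


# Hypothetical device table loosely modelled on the devlist in this thread:
# one disk behind mpt0 and two unrelated virtual floppies on another bus.
DEVICES = [
    Entry("mpt", 0, 0),
    Entry("da", 0, 0),
    Entry("da", 1, 3),
    Entry("da", 2, 3),
]


def match_any(devices, patterns):
    """Return entries matching at least one pattern (patterns are ORed)."""
    def matches(dev, pattern):
        return all(getattr(dev, key) == value for key, value in pattern.items())
    return [d for d in devices if any(matches(d, p) for p in patterns)]


def single_query(devices, unit):
    """Old approach: one query, wrongly assuming the patterns are ANDed."""
    hits = match_any(devices, [{"name": "mpt", "unit": unit}, {"name": "da"}])
    return [d for d in hits if d.name == "da"]


def two_stage_query(devices, unit):
    """New approach: find the controller's bus, then the disks on that bus."""
    buses = {d.path_id for d in match_any(devices, [{"name": "mpt", "unit": unit}])}
    return match_any(devices, [{"name": "da", "path_id": b} for b in buses])


if __name__ == "__main__":
    print("single query:", [f"da{d.unit}" for d in single_query(DEVICES, 0)])
    print("two-stage:   ", [f"da{d.unit}" for d in two_stage_query(DEVICES, 0)])
```

In this toy table the single query reports da0, da1 and da2, while the two-stage lookup reports only da0, which mirrors the extra matches contributed by the virtual floppies.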