Date:      Wed, 17 Mar 2010 16:43:11 -0400
From:      Charles Owens <cowens@greatbaysoftware.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-hardware@freebsd.org
Subject:   Re: mptutil(8) segfault on IBM xSeries 3550
Message-ID:  <4BA13EDF.8040909@greatbaysoftware.com>
In-Reply-To: <201003171114.11601.jhb@freebsd.org>
References:  <4B75AB2D.2090306@greatbaysoftware.com> <201002191315.13796.jhb@freebsd.org> <4B9E928B.2070409@greatbaysoftware.com> <201003171114.11601.jhb@freebsd.org>

On 3/17/2010 11:14 AM, John Baldwin wrote:
> On Monday 15 March 2010 4:03:23 pm Charles Owens wrote:
>   
>> John Baldwin wrote:
>>     
>>> On Friday 19 February 2010 1:01:38 pm Charles Owens wrote:
>>>   
>>>       
>>>> John Baldwin wrote:
>>>>     
>>>>         
>>>>> On Monday 15 February 2010 5:25:15 pm Charles Owens wrote:
>>>>>       
>>>>>           
>>>>>> Charles Owens wrote:
>>>>>>         
>>>>>>             
>>>>>>> Howdy,
>>>>>>>
>>>>>>> We're working with IBM hardware (xSeries 3550) that has an
>>>>>>> mpt-based RAID controller... after initial success with testing the
>>>>>>> mptutil utility, now operations other than "show adapter" and "show
>>>>>>> volume" are resulting in segfaults.
>>>>>>>
>>>>>>> While it was working properly we created and removed volumes several
>>>>>>> times, force-failed drives, and just generally put it through its
>>>>>>> paces... and all seemed fine.  Then, after a reboot, it suddenly started
>>>>>>> failing with segfault as described, and nothing we do has helped to get
>>>>>>> it out of this state (including trying to use the LSI in-BIOS manager to
>>>>>>> create/delete volumes -- which in and of itself works fine).
>>>>>>>
>>>>>>> We found a recent thread
>>>>>>> http://docs.freebsd.org/cgi/mid.cgi?4B56CD4C.80503 and hoped that it
>>>>>>> might somehow relate... and even tried the patch that John Baldwin
>>>>>>> posted, but to no avail.
>>>>>>>
>>>>>>> Has anyone seen this behavior and/or have a suggested fix or workaround?
>>>>>>>
>>>>>>> Here's the output of "mptutil show adapter":
>>>>>>>
>>>>>>> mpt0 Adapter:
>>>>>>>        Board Name: SR-BR10i
>>>>>>>    Board Assembly: L3-25116-01H
>>>>>>>         Chip Name: C1068E
>>>>>>>     Chip Revision: UNUSED
>>>>>>>       RAID Levels: RAID0, RAID1, RAID1E
>>>>>>>     RAID0 Stripes: 64K
>>>>>>>    RAID1E Stripes: 64K
>>>>>>>  RAID0 Drives/Vol: 1-10
>>>>>>>  RAID1 Drives/Vol: 2
>>>>>>> RAID1E Drives/Vol: 3-10
>>>>>>>
>>>>>>>
>>>>>>> This work is being done using FreeBSD 8.0-RELEASE-p2 + PAE.
>>>>>>>   
>>>>>>>           
>>>>>>>               
>>>>>> I should add that the RAID controller in question is the IBM
>>>>>> ServeRAID-BR10i SAS/SATA Controller which is based on the LSI 1068E
>>>>>> processor, as described here:
>>>>>> http://www-01.ibm.com/common/ssi/rep_ca/4/872/ENUSAG09-0104/index.html
>>>>>>         
>>>>>>             
>>>>> Try this updated patch.  It should fix the problems with 'mptutil show
>>>>> drives' displaying all daX devices in the system rather than just the ones
>>>>> for the mptX bus.  I had incorrectly interpreted the XPT matches as being an
>>>>> AND rather than an OR.  This changes the code to first do a lookup for the
>>>>> logical "path" (SCSI bus) for mptX devices and then do a second lookup to
>>>>> fetch any daX devices on that path.  I tested it on a machine with an mpt
>>>>> controller and a USB disk.  Unfortunately I wasn't able to test any of the
>>>>> RAID stuff, just 'show drives'.  This mpt(4) controller doesn't support RAID
>>>>> either, so I was also able to verify the fix you had already tested for
>>>>> cleaning up 'show adapter' output in that case.
>>>>>
>>>>> [patch omitted]
>>>>>       
>>>>>           
>>>> John,
>>>>
>>>> The patch appears to have resolved the problem.   We're still banging on
>>>> it, but so far it looks very good!
>>>>
>>>> Thanks very much!
>>>>     
>>>>         
>>> Excellent, thanks!  I've committed it to HEAD and will MFC it in a week or
>>> so.  It is probably too late to make 7.3 however.
>>>   
>>>       
>> Again, thanks for the patch... overall it is working well... we're now
>> able to successfully do what we need to do with the RAID system.  We are,
>> though, seeing some sort of error messages:
>>
>> # mptutil show volumes
>> mpt0 Volumes:
>>   Id     Size    Level   Stripe  State  Write-Cache  Name
>> mptutil: mpt_query_disk got 4 matches, expected 2
>>      0 (  279G) RAID-1          OPTIMAL   Disabled  
>>
>> # mptutil show config 
>> mpt0 Configuration: 1 volumes, 2 drives
>> mptutil: mpt_query_disk got 4 matches, expected 2
>>     volume 0 (279G) RAID-1 OPTIMAL spans:
>>         drive 1 (279G) ONLINE <WD3000BLFS-23YBU 4V04> SATA
>>         drive 0 (279G) ONLINE <WD3000BLFS-23YBU 4V04> SATA
>>         spare pools: 0
>>     
> Are you sure this is a fixed binary?  The new binary doesn't print out that 
> message anymore, it only says 'got %d matches, expected 1'.  Also, the 4 
> instead of 2 is consistent with the old bug in that the two Linux virtual 
> floppies (da1 and da2) would be reported as extra for 'mptutil show drives' in 
> this case I think.

You're right!  It appears on one of my two devel systems I misapplied
the patch somehow.  Much better now... thanks!
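
For anyone reading this thread later: the AND-versus-OR point John describes
in the quoted text can be illustrated with a small toy model.  This is only a
sketch under stated assumptions, not the mptutil code and not the CAM API.
The Periph class, the DEVICES table, and the functions match_any,
one_shot_lookup and two_step_lookup are all hypothetical names made up for
illustration, and the table simply assumes one disk on the mpt0 bus plus two
unrelated disks (like the virtual floppies mentioned above).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Periph:
    """One entry in a toy device table (hypothetical, not a CAM structure)."""
    driver: str   # e.g. "mpt" or "da"
    unit: int     # unit number, so ("da", 1) stands for da1
    path_id: int  # logical "path" (SCSI bus) the entry belongs to


# Assumed layout for illustration: da0 sits on mpt0's bus, da1 and da2 do not.
DEVICES = [
    Periph("mpt", 0, 0),
    Periph("da", 0, 0),
    Periph("da", 1, 1),
    Periph("da", 2, 2),
]


def match_any(devices, patterns):
    """Return every device matching at least one pattern (patterns are ORed)."""
    return [d for d in devices if any(p(d) for p in patterns)]


def one_shot_lookup(devices, unit):
    """Single query with two patterns, written as if they were ANDed.

    Because the patterns are actually ORed, every "da" device in the
    table comes back, not just the ones behind the requested mpt unit.
    """
    patterns = [
        lambda d: d.driver == "mpt" and d.unit == unit,
        lambda d: d.driver == "da",
    ]
    return match_any(devices, patterns)


def two_step_lookup(devices, unit):
    """First find the path for mptN, then fetch only "da" devices on that path."""
    buses = match_any(devices, [lambda d: d.driver == "mpt" and d.unit == unit])
    if len(buses) != 1:
        raise LookupError(f"got {len(buses)} matches, expected 1")
    path = buses[0].path_id
    return match_any(
        devices, [lambda d: d.driver == "da" and d.path_id == path]
    )


if __name__ == "__main__":
    print(len(one_shot_lookup(DEVICES, 0)))   # entries from the single ORed query
    print(two_step_lookup(DEVICES, 0))        # only disks on mpt0's path
```

With this assumed table the single ORed query returns four entries (mpt0 plus
all three disks), while the two-step version returns only da0.  That mirrors
the shape of the "got 4 matches, expected 2" symptom discussed above, though
the real device layout on the affected machine may of course differ.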





