From: Brad Waite
Date: Tue, 13 Apr 2010 10:49:08 -0600
To: Gary Palmer
Cc: freebsd-scsi@freebsd.org
Subject: Re: QLogic 2360 FC HBAs not playing well with others
Message-ID: <4BC4A084.7050906@wcubed.net>
In-Reply-To: <20100412034937.GA24680@in-addr.com>
References: <4BC280EE.5090202@wcubed.net> <20100412034937.GA24680@in-addr.com>

Gary Palmer wrote:
> On Sun, Apr 11, 2010 at 08:09:50PM -0600, Brad Waite wrote:
>>>> Matthew Jacob wrote:
>>>>> On 04/09/2010 11:29 AM, Brad Waite wrote:
>>>>> I beseech you, oh great masters of SCSI and fibre channel, hear my pleas
>>>>> for help!
>>>>>
>>>>> My 2 QLE2360s don't appear to be waking up properly in a Dell R710
>>>>> running 7.2 AMD64. At the very least, they're not recognizing any of
>>>>> the volumes on the Sun 2540 array in the fabric.
>>>>> Everything works just fine under VMware ESXi 4.1, though.
>>>>>
>>>> Get newer firmware either by upgrading with RELENG_7 or snagging
>>>> asm_2300.h from RELENG_7 and rebuilding.
>>>>
>>>> You don't have to load all of ispfw
>>>>
>>>> isp2300_LOAD=YES
>>>>
>>>> should get you just that one module
>>>>
>>>> the latest in the FreeBSD tree is 3.03.26
>> Woot. That helped.
>>
>> Built & installed RELENG_7, but I've got some more weirdness.
>>
>> First off I've got da0 - da15 showing similar to this:
>>
>> da0 at isp0 bus 0 target 0 lun 0
>> da0: Fixed Direct Access SCSI-5 device
>> da0: 200.000MB/s transfers
>> da0: Command Queueing Enabled
>> da0: 138989MB (284650656 512 byte sectors: 255H 63S/T 17718C)
>>
>> We've got a Sun Storagetek 2540 12-drive array with 4 volumes mapped to
>> this host. It would appear that it's showing the 4 volumes AND each of
>> the 12 drives. Is that normal?
>>
>> Next, I have about 20 of the following errors for each of da1, da2, da3,
>> da4, da9, da10, da11 & da12.
>>
>> (da1:isp0:0:0:1): READ(6)/WRITE(6) not supported, increasing
>> minimum_cmd_size to 10.
>> (da1:isp0:0:0:1): READ(10). CDB: 28 0 0 0 0 0 0 0 1 0
>> (da1:isp0:0:0:1): CAM Status: SCSI Status Error
>> (da1:isp0:0:0:1): SCSI Status: Check Condition
>> (da1:isp0:0:0:1): ILLEGAL REQUEST asc:94,1
>> (da1:isp0:0:0:1): Vendor Specific ASC
>> (da1:isp0:0:0:1): Unretryable error
>>
>> What's going on here?
>>
>> Is there any config I need to do for volume mapping and/or
>> multipathing? I'm a complete newb when it comes to FC on FreeBSD, so
>> forgive my ignorance.
>>
>> Thanks for the help, guys!
>
> I suspect the reason you have 16 disk devices showing up is that you
> are running multipath.
> You will get one da device showing up for each
> different path, and if you're running a full multipath environment
> that's likely 4 paths per device, which would lead to the 16 disks
> (unless they're not the sizes you expect, but I would tend to suspect
> it's a multipath artifact)

Thanks for pointing out what should have been obvious. The Sun 2540 has
2 ports on 2 controllers and camcontrol shows exactly that:

# camcontrol devlist
at scbus0 target 0 lun 0 (da0,pass0)
at scbus0 target 0 lun 1 (da1,pass1)
at scbus0 target 0 lun 2 (da2,pass2)
at scbus0 target 0 lun 3 (da3,pass3)
at scbus0 target 1 lun 0 (da4,pass4)
at scbus0 target 1 lun 1 (da5,pass5)
at scbus0 target 1 lun 2 (da6,pass6)
at scbus0 target 1 lun 3 (da7,pass7)
at scbus1 target 0 lun 0 (da8,pass8)
at scbus1 target 0 lun 1 (da9,pass9)
at scbus1 target 0 lun 2 (da10,pass10)
at scbus1 target 0 lun 3 (da11,pass11)
at scbus1 target 1 lun 0 (da12,pass12)
at scbus1 target 1 lun 1 (da13,pass13)
at scbus1 target 1 lun 2 (da14,pass14)
at scbus1 target 1 lun 3 (da15,pass15)

> To handle multipath you probably want to look at gmultipath(8).
>
> I'm not sure about READ/WRITE errors. You say they show up for 8
> devices? Is it possible that the array is not true active/active
> on the controllers? It's possible that half the paths are going to
> a controller that is rejecting the I/O until the LUN fails over,
> but that's just a guess based on the error message. If you can
> look at the controller/bus/target/lun information from dmesg and
> see if you can spot a pattern about the path to the LUNs giving
> the error that may give a better idea about what's going on.

I think you've nailed it.
da4-7 & da12-15 have the following respective lines in dmesg:

da[4-7]: 200.000MB/s transfers WWNN 0x200400a0b8388efd WWPN 0x203500a0b8388efd PortID 0x10100
da[12-15]: 200.000MB/s transfers WWNN 0x200400a0b8388efd WWPN 0x202500a0b8388efd PortID 0x10100

The two WWPNs correspond to the 2540's controllers and the write errors
are on da0-3 & da8-11.

I can't find anything yet in the docs on making the other ports active,
but I successfully labeled da7 & da15 with gmultipath, although I
couldn't add da3 & da11 due to write errors.

No real surprise, but since I can't add the label, what happens if one
of the active ports on a controller fails? I know I'd have the other
path to the active port on the other controller, but would I have to
manually add the label to the volumes from the newly-active port?
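
To see the multipath layout at a glance, here is a minimal Python sketch
that groups the camcontrol devlist entries from this thread by LUN. It
assumes that the same LUN number seen through each bus/target pair is the
same volume, which the thread suggests but does not prove. The function
name group_paths_by_lun is only illustrative, and the vendor/product
strings are left out because they are not shown in the listing.

```python
import re
from collections import defaultdict

# Device lines as listed by `camcontrol devlist` in this thread
# (vendor/product strings are not available, so they are omitted).
DEVLIST = """\
at scbus0 target 0 lun 0 (da0,pass0)
at scbus0 target 0 lun 1 (da1,pass1)
at scbus0 target 0 lun 2 (da2,pass2)
at scbus0 target 0 lun 3 (da3,pass3)
at scbus0 target 1 lun 0 (da4,pass4)
at scbus0 target 1 lun 1 (da5,pass5)
at scbus0 target 1 lun 2 (da6,pass6)
at scbus0 target 1 lun 3 (da7,pass7)
at scbus1 target 0 lun 0 (da8,pass8)
at scbus1 target 0 lun 1 (da9,pass9)
at scbus1 target 0 lun 2 (da10,pass10)
at scbus1 target 0 lun 3 (da11,pass11)
at scbus1 target 1 lun 0 (da12,pass12)
at scbus1 target 1 lun 1 (da13,pass13)
at scbus1 target 1 lun 2 (da14,pass14)
at scbus1 target 1 lun 3 (da15,pass15)
"""

LINE_RE = re.compile(
    r"at scbus(\d+) target (\d+) lun (\d+) \((da\d+),pass\d+\)"
)


def group_paths_by_lun(devlist):
    """Map each LUN number to the (bus, target, device) paths that reach it.

    Assumption: a given LUN number is the same volume on every path.
    """
    paths = defaultdict(list)
    for m in LINE_RE.finditer(devlist):
        bus, target, lun, dev = int(m[1]), int(m[2]), int(m[3]), m[4]
        paths[lun].append((bus, target, dev))
    return dict(paths)


if __name__ == "__main__":
    for lun, entries in sorted(group_paths_by_lun(DEVLIST).items()):
        devs = ", ".join(dev for _, _, dev in entries)
        print(f"LUN {lun}: {len(entries)} paths -> {devs}")
```

For LUN 3 this gives da3, da7, da11 and da15, which are the four devices
involved in the gmultipath labeling attempt described above.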