From: Fuujin Networks LLC
Date: Mon, 01 Sep 2008 03:59:50 -0600
To: Alexander Sack
Cc: freebsd-scsi@freebsd.org
Subject: Re: Qlogic FC scsi_target ISP2310
Message-ID: <48BBBD16.80402@fuujinnetworks.com>
In-Reply-To: <3c0b01820808311012n7e83a948t732e6544ddb0d703@mail.gmail.com>

Alex:

So here's something interesting. The target decided to panic on me just now.
Here's the last message from the target before it rebooted:

scsi_target: main loop beginning
scsi_target: read ready
scsi_target: event -1 done
scsi_target: Working on ATIO 0x800b7fe20
scsi_target: tcmd_handle atio 0x800b7fe20 ctio 0x800b85040 atioflags 0x8000
scsi_target: INQUIRY from 0: 12 0 0 0 24 0

The initiator just sits there and says the rescan was successful with no
events or errors.....

Here's the panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff8022f128
stack pointer           = 0x10:0xffffffffae3c1650
frame pointer           = 0x10:0xffffffffae3c16e0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 777 (scsi_target)
[thread pid 777 tid 100075 ]
Stopped at      isp_pci_dmasetup+0x1d8: movq    0x8(%rax),%rsi

Here's the bt:

Tracing pid 777 tid 100075 td 0xffffff00016e100
isp_pci_dmasetup() at isp_pci_dmasetup+0x1d8
isp_action() at isp_action+0x1089
xpt_run_dev_sendq() at xpt_run_dev_sendq+0x1c4
xpt_action() at xpt_action+0x796
targsendccb() at targsendccb+0x9e
targstart() at targstart+0x130
xpt_run_dev_allocq() at xpt_run_dev_allocq+0xd4
targwrite() at targwrite+0x184
giant_write() at giant_write+0x60
devfs_write_f() at devfs_write_f+0x75
dofilewrite() at dofilewrite+0x85
kern_writev() at kern_writev+0x4c
write() at write+0x54
syscall() at syscall+0x254
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (4, FreeBSD ELF64, write), rip = 0x800929d3c, rsp = 0x7fffffff4908, rbp = 0x800b83440 ---

Please note the above trace was copied by hand because I couldn't get
console redirection to stay up when this died. Does anyone know how to get
this data into a file or out via serial (vt100 perhaps??) or is this pretty
much a manual process??

Erich M. Jenkins
Fuujin Networks, LLC
PO Box 792
Brainerd, MN 56401
(p) 218-824-5038
(f) 218-824-7516

"You should never, never doubt what no one is sure about."
-- Gene Wilder

Alexander Sack wrote:
> On Sun, Aug 31, 2008 at 8:00 AM, Fuujin Networks LLC wrote:
>> I apologize for not being more specific in my questions. I understand that
>> we're loading the firmware via the kernel, but my question was why not load
>> it from the card? If I have an HP SmartArray 5300 card and the firmware is
>> out of date, I'm expected to update it, not load a kernel module to do it
>> for me. This makes sense for many reasons, not the least of which is
>> compatibility. I'm in no position to suggest what is proper from the
>> standpoint of this particular problem, but I'm trying to understand the
>> reason for choosing a kernel module rather than a sys admin as with nearly
>> all other devices.
>
> We do both! QLogic ships each card with some version of the firmware
> on it that boots up at runtime. One of the nice features of the ISP
> is that its RISC-based firmware can be updated at runtime, ensuring you
> are always running the latest. The ispfw driver is strictly used to
> register firmwares with the generic firmware driver (the real action
> happens in isp during isp_reset()).
>
> I think the driver should really check to see if the ispfw version is
> less than the resident driver and do the right thing. I think it used
> to do that but it was taken out, I don't know why - I'm actually thinking
> that maybe it should be added back.
>
> In any event, if you want to disable loading of the firmware you can
> set in your hints file:
>
> hint.isp.0.fwload_disable=1
>
> That should prevent the driver from loading the ispfw version (please
> check during bootup what version your resident firmware is at to
> determine which is newer).
> If you do this then you should see:
>
> isp0: Board Type 2300, Chip Revision 0x1, resident F/W Revision
>
> instead of
>
> isp0: Board Type 2300, Chip Revision 0x1, loaded F/W Revision
>
> Having a separate utility (typically DOS or Windows based) is not that
> great in my eyes but to each his own. Bottom line is you should run
> the latest ISP firmware (whether it's the one that was flashed from
> QLogic or the one in the ispfw driver). I'm thinking that perhaps an
> audit should be done and we should ship the latest firmware off the
> QLogic website. What version is shipped with your card? Looks like
> 3.3.25 is the latest for 23xx cards. Hmmm....
>
>> I misunderstood the purpose of your patch as well. I thought the problem
>> was a firmware loading issue, but as you mentioned, this does not appear to
>> be the case.
>
> Right, it seems to be something else.
>
>> I did see your message with the patch and it was correctly applied and the
>> kernel was correctly compiled. I did, however, reinstall the OS because of
>> all the fiddling I did to this point. Funny thing is that I can't get it to
>> crash anymore. I tried it clean, and the system tanked, but after I applied
>> your patch, I can't get it to panic anymore. The loop looks like it comes
>> up, but when I rescan with the initiator, the target stays up without
>> incident, but nothing shows up in camcontrol as an emulated disk:
>>
>> amd_svr0-01# camcontrol devlist -v
>> scbus0 on isp0 bus 0:
>> < > at scbus0 target -1 lun -1 ()
>> scbus-1 on xpt0 bus 0:
>> < > at scbus-1 target -1 lun -1 (xpt0)
>>
>> I do get this on the initiator though:
>>
>> [snip]
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.5 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.5 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.5 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.5 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36,
>> status not marked)
>> Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.7 (count 36, resid 36,
>> status not marked)
>> [snip]
>>
>> After a clean install, this is what I see from dmesg on the target:
>>
>> [snip]
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> isp0: port 0x3000-0x30ff mem
>> 0xfe020000-0xfe020fff irq 25 at device 1.0 on pci2
>> isp0: [ITHREAD]
>> isp0: Board Type 2300, Chip Revision 0x1, loaded F/W Revision 3.3.19
>> isp0: target notify code 0x1007
>> isp0: target notify code 0x1007
>> isp0: target notify code 0x1007
>> isp0: target notify code 0x1008
>> isp0: target notify code 0x1006
>> [snip]
>
> Is this with or without the isp patch I sent regarding the firmware?
> I noticed it's not trying to get isp_2300_it like before (I'm hoping
> that's due to the patch I sent, otherwise I'm confused this holiday
> weekend).
>
>> Here's the complete kernel, also after a fresh install and the removal of
>> unnecessary options/devices (stuff not in the server):
>
>> # SCSI Controllers
>> device          isp             # Qlogic family
>> device          ispfw           # Firmware for QLogic HBAs- normally a module
>> options         ISP_TARGET_MODE # for ISP cards to operate in target mode
>> device          targ            # SCSI Target device
>> device          targbh          # SCSI Target Black Hole
>> options         CAMDEBUG
>> options         VFS_AIO
>
> Thanks for this, I just wanted to verify your build options look good.
>
>> Not sure what to make of this.... Would you recommend a different FC card?
>> Emulex?
>
> I have no direct experience with Emulex with FreeBSD so I'm not the
> right person to ask. I was under the impression that the 23xx target
> mode was working. Did you enable target mode in the BIOS by chance
> (or disable it, I think on my 24xx BIOS I have that option but I'm not
> in front of it yet)? Just verify your BIOS version number and options
> before completely giving up! :D
>
> -aps