From: Fuujin Networks LLC
Date: Sun, 31 Aug 2008 06:00:06 -0600
To: Alexander Sack
Cc: freebsd-scsi@freebsd.org
Subject: Re: Qlogic FC scsi_target ISP2310

Alex:

I apologize for not being more specific in my questions. I understand that we're loading the firmware via the kernel, but my question was why not load it from the card?
If I have an HP SmartArray 5300 card and the firmware is out of date, I'm expected to update it, not load a kernel module to do it for me. This makes sense for many reasons, not the least of which is compatibility. I'm in no position to suggest what is proper from the standpoint of this particular problem, but I'm trying to understand the reason for choosing a kernel module rather than leaving it to a sysadmin, as with nearly all other devices.

I misunderstood the purpose of your patch as well. I thought the problem was a firmware loading issue, but as you mentioned, this does not appear to be the case. I did see your message with the patch, and it was correctly applied and the kernel was correctly compiled. I did, however, reinstall the OS because of all the fiddling I did to this point.

Funny thing is that I can't get it to crash anymore. I tried it clean, and the system tanked, but after I applied your patch, I can't get it to panic anymore. The loop looks like it comes up, and when I rescan with the initiator the target stays up without incident, but nothing shows up in camcontrol as an emulated disk:

amd_svr0-01# camcontrol devlist -v
scbus0 on isp0 bus 0:
< > at scbus0 target -1 lun -1 ()
scbus-1 on xpt0 bus 0:
< > at scbus-1 target -1 lun -1 (xpt0)

I do get this on the initiator though:

[snip]
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.5 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.5 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.5 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.5 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.6 (count 36, resid 36, status not marked)
Aug 31 05:44:34 test kernel: isp0: bad underrun for 0.7 (count 36, resid 36, status not marked)
[snip]

After a clean install, this is what I see from dmesg on the target:

[snip]
registered firmware set
registered firmware set
registered firmware set
registered firmware set
registered firmware set
registered firmware set
registered firmware set
registered firmware set
registered firmware set
registered firmware set
registered firmware set
isp0: port 0x3000-0x30ff mem 0xfe020000-0xfe020fff irq 25 at device 1.0 on pci2
isp0: [ITHREAD]
isp0: Board Type 2300, Chip Revision 0x1, loaded F/W Revision 3.3.19
isp0: target notify code 0x1007
isp0: target notify code 0x1007
isp0: target notify code 0x1007
isp0: target notify code 0x1008
isp0: target notify code 0x1006
[snip]

Here's the complete kernel configuration, also after a fresh install and with unnecessary options/devices (stuff not in the server) removed:

Kernel

cpu	HAMMER
ident	GENERIC

# To statically compile in device wiring instead of /boot/device.hints
#hints	"GENERIC.hints"	# Default places to look for devices.

makeoptions	DEBUG=-g	# Build kernel with gdb(1) debug symbols

options	SCHED_4BSD	# 4BSD scheduler
options	PREEMPTION	# Enable kernel thread preemption
options	INET	# InterNETworking
options	INET6	# IPv6 communications protocols
options	SCTP	# Stream Control Transmission Protocol
options	FFS	# Berkeley Fast Filesystem
options	SOFTUPDATES	# Enable FFS soft updates support
options	UFS_ACL	# Support for access control lists
options	UFS_DIRHASH	# Improve performance on big directories
options	UFS_GJOURNAL	# Enable gjournal-based UFS journaling
options	MD_ROOT	# MD is a potential root device
options	NFSCLIENT	# Network Filesystem Client
options	NFSSERVER	# Network Filesystem Server
options	NFS_ROOT	# NFS usable as /, requires NFSCLIENT
options	NTFS	# NT File System
options	MSDOSFS	# MSDOS Filesystem
options	CD9660	# ISO 9660 Filesystem
options	PROCFS	# Process filesystem (requires PSEUDOFS)
options	PSEUDOFS	# Pseudo-filesystem framework
options	GEOM_PART_GPT	# GUID Partition Tables.
options	GEOM_LABEL	# Provides labelization
options	COMPAT_43TTY	# BSD 4.3 TTY compat [KEEP THIS!]
options	COMPAT_IA32	# Compatible with i386 binaries
options	COMPAT_FREEBSD4	# Compatible with FreeBSD4
options	COMPAT_FREEBSD5	# Compatible with FreeBSD5
options	COMPAT_FREEBSD6	# Compatible with FreeBSD6
options	SCSI_DELAY=5000	# Delay (in ms) before probing SCSI
options	KTRACE	# ktrace(1) support
options	SYSVSHM	# SYSV-style shared memory
options	SYSVMSG	# SYSV-style message queues
options	SYSVSEM	# SYSV-style semaphores
options	_KPOSIX_PRIORITY_SCHEDULING	# POSIX P1003_1B real-time extensions
options	KBD_INSTALL_CDEV	# install a CDEV entry in /dev
options	ADAPTIVE_GIANT	# Giant mutex is adaptive.
options	STOP_NMI	# Stop CPUS using NMI instead of IPI
options	AUDIT	# Security event auditing

# Make an SMP-capable kernel by default
options	SMP	# Symmetric MultiProcessor Kernel

# kernel debugging framework
options	KDB
options	DDB

# CPU frequency control
device	cpufreq

# Bus support.
device	acpi
device	pci

# ATA and ATAPI devices
device	ata
device	atadisk	# ATA disk drives
device	ataraid	# ATA RAID drives
device	atapicd	# ATAPI CDROM drives
options	ATA_STATIC_ID	# Static device numbering

# SCSI Controllers
device	isp	# Qlogic family
device	ispfw	# Firmware for QLogic HBAs- normally a module
options	ISP_TARGET_MODE	# for ISP cards to operate in target mode
device	targ	# SCSI Target device
device	targbh	# SCSI Target Black Hole
options	CAMDEBUG
options	VFS_AIO

# SCSI peripherals
device	scbus	# SCSI bus (required for SCSI)
device	da	# Direct Access (disks)
device	sa	# Sequential Access (tape etc)
device	pass	# Passthrough device (direct SCSI access)
device	ses	# SCSI Environmental Services (and SAF-TE)

# atkbdc0 controls both the keyboard and the PS/2 mouse
device	atkbdc	# AT keyboard controller
device	atkbd	# AT keyboard
device	psm	# PS/2 mouse

device	kbdmux	# keyboard multiplexer

device	vga	# VGA video card driver

device	splash	# Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device	sc

device	agp	# support several AGP chipsets

# Serial (COM) ports
device	sio	# 8250, 16[45]50 based serial ports
device	uart	# Generic UART driver

device	em	# Intel PRO/1000 adapter Gigabit Ethernet Card

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device	miibus	# MII bus support
device	bce	# Broadcom BCM5706/BCM5708 Gigabit Ethernet
device	bfe	# Broadcom BCM440x 10/100 Ethernet
device	bge	# Broadcom BCM570xx Gigabit Ethernet
device	fxp	# Intel EtherExpress PRO/100B (82557, 82558)
device	re	# RealTek 8139C+/8169/8169S/8110S
device	rl	# RealTek 8129/8139
device	sis	# Silicon Integrated Systems SiS 900/SiS 7016

# Pseudo devices.
device	loop	# Network loopback
device	random	# Entropy device
device	ether	# Ethernet support
device	sl	# Kernel SLIP
device	ppp	# Kernel PPP
device	tun	# Packet tunnel.
device	pty	# Pseudo-ttys (telnet etc)
device	md	# Memory "disks"
device	gif	# IPv6 and IPv4 tunneling
device	faith	# IPv6-to-IPv4 relaying (translation)
device	firmware	# firmware assist module

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device	bpf	# Berkeley packet filter

Not sure what to make of this.... Would you recommend a different FC card? Emulex?

Erich M. Jenkins
Fuujin Networks, LLC
PO Box 792
Brainerd, MN 56401
(p) 218-824-5038
(f) 218-824-7516

"You should never, never doubt what no one is sure about."
-- Gene Wilder

Alexander Sack wrote:
> On Sat, Aug 30, 2008 at 2:28 AM, Fuujin Networks LLC
> wrote:
>> Alex:
>>
>> Thanks very much for the patch. Unfortunately, I ended up with a similar
>> result as seen below. Just for grins, I tried the patch on a 64-bit system
>> (AMD64) to see if there was a difference based on which architecture is used
>> for the target. No difference there either; still dumps core and reboots.
>> The upside I would think is that both branches seem to be in sync. I do have
>> a sparc64 box here if you'd like to see what happens in that world (haven't
>> tested it yet).
>
> No, no, no....the patch was not to FIX the target mode issue. There
> is only one firmware and it does get loaded (even with the error
> message below). I was *JUST* trying to avoid firmware_get() to
> attempt to register a firmware that does not exist. I reversed the
> patch, are you sure you applied the right one?
> Take a look, but isp_pci.c should have an added IS_SCSI(isp) at
> line 1039. Here it is again:
>
> --- isp_pci.c.0 2008-08-29 08:03:24.000000000 -0400
> +++ isp_pci.c 2008-08-29 07:58:08.000000000 -0400
> @@ -1039,7 +1039,7 @@
>         }
>
>         isp->isp_osinfo.fw = NULL;
> -       if (isp->isp_role & ISP_ROLE_TARGET) {
> +       if (isp->isp_role & ISP_ROLE_TARGET && IS_SCSI(isp)) {
>                 snprintf(fwname, sizeof (fwname), "isp_%04x_it", did);
>                 isp->isp_osinfo.fw = firmware_get(fwname);
>         }
>
> This should eliminate the firmware_get() message. But AGAIN this will
> not fix your target mode issues.
>
>> [snip]
>> (targ0:isp0:0:2:0): targdone 0xffffff0001ddda00
>> (targ0:isp0:0:2:0): targread
>> (targ0:isp0:0:2:0): targread ccb 0xffffff0001ddda00 (0x800b7fe20)
>> (targ0:isp0:0:2:0): targreturnccb 0xffffff0001ddda00
>> cam_debug: targfreeccb descr 0xffffff0001dda1c0 and
>> cam_debug: freeing ccb 0xffffff0001ddda00
>> (targ0:isp0:0:2:0): write - uio_resid 8
>> (targ0:isp0:0:2:0): Sending queued ccb 0x933 (0x800b85040)
>> (targ0:isp0:0:2:0): targstart 0xffffff0001369000
>> (targ0:isp0:0:2:0): sendccb 0xffffff0001369000
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address = 0x8
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x8:0xffffffff8025d2e8
>> stack pointer = 0x10:0xffffffffae3d06f0
>> frame pointer = 0x10:0xffffffff80a42000
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 783 (scsi_target)
>> trap number = 12
>> panic: page fault
>> cpuid = 0
>> Uptime: 7m21s
>> Physical memory: 4021 MB
>> Dumping 364 MB:: write - uio_resid 8
>> (targ0:isp0:0:2:0): getccb 0xffffff0001db7c00
>> (targ0:isp0:0:2:0): Sent ATIO/INOT (0x800b61a10)
>> (targ0:isp0:0:2:0): write - uio_resid 8
>> (targ0:isp0:0:2:0): getccb 0xffffff0001db7b00
>> [snip]
>>
>>
>> Seems to be nearly the same result in loading the firmware:
>>
>> [snip]
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> registered firmware set
>> [snip]
>> isp0: port 0x3000-0x30ff mem
>> 0xfe020000-0xfe020fff irq 25 at device 1.0 on pci2
>> firmware_get: failed to load firmware image isp_2300_it
>> isp0: [ITHREAD]
>
> Hold on, do you see this message with the patch? Are you sure you
> rebuilt and rebooted correctly? You should not see this message
> anymore. I will verify the patch again but see above.
>
>> It doesn't appear that the firmware "isp_2300_it" either exists or possibly
>> isn't named properly on the target machine (or the initiator for that
>> matter). However, there do not seem to be any problems loading the firmware
>> for the card when it's not in target mode.
>
> No there is no problem. The error message above is harmless.
>
>> From everything I've read, it looks like the firmware needs to be loaded via
>> the kernel device option "ispfw". If for nothing other than my
>> understanding, is there some reason we're not loading the firmware resident
>> on the card?
>
> You ARE loading the firmware (isp_2300 one from ispfw); it's just that
> the code also tries to get an IT version of it and for fibre channel
> cards there is no such thing. The patch I gave should remove this
> nuisance BUT NOT FIX target mode.
>
> You need to rebuild the kernel with:
>
> options KDB
> options DDB
>
> and when you panic, do a "bt" and copy the stack trace so we know
> WHERE exactly it's panicking on your 23xxx setup. Can you post your
> kernel configuration file as well?
>
> Thanks!
>
> -aps
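
For anyone following the patch quoted above, here is a minimal standalone sketch in C of the condition it introduces: the extra "_it" firmware name is only built when the card is in target role and is a parallel SCSI part. Everything prefixed with sketch_ or SKETCH_ is a hypothetical stand-in invented for illustration. The real role flags, the IS_SCSI() macro, and the firmware_get() call live in the isp(4) driver sources and are not reproduced here. Only the "isp_%04x_it" format string is taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical role bits; the driver's real ISP_ROLE_* values may differ. */
#define SKETCH_ROLE_INITIATOR 0x1
#define SKETCH_ROLE_TARGET    0x2

/* Hypothetical stand-in for the bits of driver state the patch looks at. */
struct sketch_hba {
	int      role;              /* bitmask of SKETCH_ROLE_* */
	bool     is_parallel_scsi;  /* stands in for IS_SCSI(isp) */
	unsigned did;               /* PCI device ID, e.g. 0x2300 */
};

/*
 * Mirrors the patched condition: build the "_it" image name only for a
 * target-role, parallel SCSI card. Returns true and fills buf when the
 * name should be requested; otherwise leaves buf empty and returns false.
 * Note that '&' binds tighter than '&&' in C, so the explicit parentheses
 * below are for readability only.
 */
static bool
sketch_it_fwname(const struct sketch_hba *hba, char *buf, size_t len)
{
	assert(hba != NULL && buf != NULL && len > 0);
	buf[0] = '\0';
	if ((hba->role & SKETCH_ROLE_TARGET) != 0 && hba->is_parallel_scsi) {
		snprintf(buf, len, "isp_%04x_it", hba->did);
		return true;
	}
	return false;
}
```

With a fibre channel card in target role (is_parallel_scsi false, did 0x2300), the function returns false and no "isp_2300_it" name is produced, which corresponds to the firmware_get() message going away. As stated in the thread, this only silences the lookup and does not address the target mode panic.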