From owner-freebsd-stable@FreeBSD.ORG Thu Jan 26 22:50:44 2006 Return-Path: X-Original-To: freebsd-stable@freebsd.org Delivered-To: freebsd-stable@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2A39216A422 for ; Thu, 26 Jan 2006 22:50:44 +0000 (GMT) (envelope-from ghelmer@palisadesys.com) Received: from magellan.palisadesys.com (magellan.palisadesys.com [192.188.162.211]) by mx1.FreeBSD.org (Postfix) with ESMTP id B4A7B43D46 for ; Thu, 26 Jan 2006 22:50:43 +0000 (GMT) (envelope-from ghelmer@palisadesys.com) Received: from [172.16.1.108] (cetus.palisadesys.com [192.188.162.7]) (authenticated bits=0) by magellan.palisadesys.com (8.13.4/8.13.4) with ESMTP id k0QMofa1028405; Thu, 26 Jan 2006 16:50:41 -0600 (CST) (envelope-from ghelmer@palisadesys.com) Message-ID: <43D95241.6010703@palisadesys.com> Date: Thu, 26 Jan 2006 16:50:41 -0600 From: Guy Helmer User-Agent: Thunderbird 1.5 (Windows/20051201) MIME-Version: 1.0 To: Joshua Coombs References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Palisade-MailScanner-Information: Please contact the ISP for more information X-Palisade-MailScanner: Found to be clean X-Palisade-MailScanner-From: ghelmer@palisadesys.com Cc: freebsd-stable@freebsd.org Subject: Re: Adaptec SCSI going nuts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Jan 2006 22:50:44 -0000 Joshua Coombs wrote: > Just got a dual Opteron system to play with amd64 builds of FreeBSD. > Poking around to see how it behaived, I initially tossed FreeBSD 4.11 > on it, and was surprised to see a dump of the scsi controller state at > the end of the dmesg. Tried 6.0-Rel, both x86 and amd64, same > behavior. Bumped up to 6-stable from yesterday, same behavior. After > the dump, the system appears solid, no other errors, so I'm guessing > it's just a problem with how the card is initialized, FreeBSD corrects > it and moves on. > > Should I be more concerned, or consider it a quirk of the machine? Oh, you have your Adaptec controller hooked up to one of those Cheetah 10K.7 drives like the ones that are driving us a bit batty at work (except I'm using ST373207LC [73GB] drives). IIRC, we've been seeing this SCSI state dump at startup on some of our 5.4-based kernels but not always. Until recently, I thought the drives were working OK but now I'm not so sure. We're trying to track down an infrequent lockup of the SCSI system with Adaptec AIC7902 Ultra320 controller built-into the motherboard and a single Seagate ST373207LC drive while running FreeBSD 5.4 on a dual-Xeon system. We've been running bonnie++ to stress the drive, and we're having a hard time pinning the failure down to a particular component (controller or disk) because it's taken bonnie++ running continuously as many as 6 days straight for the failure to occur. We've tried updating the firmware on the Seagate drive to ST373207LC from version 3 to 4 and turning off Packetized transfers in the Adaptec BIOS but we still seem to be able to make it fail after a while. The next thing I'm tempted to try changing is the SCSI cable if we still don't have a stable system. So I'd suggest that you run something like bonnie++ that can stress the disk & controller for a while, and see whether or not you encounter any problems... Good luck! Guy Helmer > > Joshua Coombs > > Copyright (c) 1992-2005 The FreeBSD Project. > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 > The Regents of the University of California. All rights reserved. > FreeBSD 6.0-RELEASE-p4 #0: Thu Jan 26 15:55:58 EST 2006 > root@testbed.gwi.net:/usr/obj/usr/src/sys/SMP > Timecounter "i8254" frequency 1193182 Hz quality 0 > CPU: AMD Opteron(tm) Processor 246 (1994.67-MHz K8-class CPU) > Origin = "AuthenticAMD" Id = 0xf5a Stepping = 10 > Features=0x78bfbff > > AMD Features=0xe0500800 > real memory = 1073676288 (1023 MB) > avail memory = 1024929792 (977 MB) > ACPI APIC Table: > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs > cpu0 (BSP): APIC ID: 0 > cpu1 (AP): APIC ID: 1 > MADT: Forcing active-low polarity and level trigger for SCI > ioapic0 irqs 0-23 on motherboard > ioapic1 irqs 24-27 on motherboard > ioapic2 irqs 28-31 on motherboard > acpi0: on motherboard > acpi0: Power Button (fixed) > pci_link0: irq 9 on acpi0 > pci_link1: irq 10 on acpi0 > pci_link2: irq 11 on acpi0 > pci_link3: irq 15 on acpi0 > Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 > cpu0: on acpi0 > acpi_throttle0: on cpu0 > cpu1: on acpi0 > pcib0: port 0xcf8-0xcff on acpi0 > pci0: on pcib0 > pcib1: at device 6.0 on pci0 > pci3: on pcib1 > pci3: at device 6.0 (no driver attached) > fxp0: port 0xbc00-0xbc3f mem > 0xfeafb000-0xfeafbfff,0xfeaa0000-0xfeabffff irq 18 at device 8.0 on pci3 > miibus0: on fxp0 > inphy0: on miibus0 > inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto > fxp0: Ethernet address: 00:e0:81:34:64:da > isab0: at device 7.0 on pci0 > isa0: on isab0 > atapci0: port > 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0 > ata0: on atapci0 > ata1: on atapci0 > pci0: at device 7.2 (no driver attached) > pci0: at device 7.3 (no driver attached) > pcib2: at device 10.0 on pci0 > pci2: on pcib2 > ahd0: port > 0x8000-0x80ff,0x7800-0x78ff mem 0xfc8fc000-0xfc8fdfff irq 24 at device > 6.0 on pci2 > ahd0: [GIANT-LOCKED] > aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs > ahd1: port > 0x8800-0x88ff,0x8400-0x84ff mem 0xfc8fe000-0xfc8fffff irq 25 at device > 6.1 on pci2 > ahd1: [GIANT-LOCKED] > aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs > bge0: mem > 0xfc8b0000-0xfc8bffff,0xfc8a0000-0xfc8affff irq 24 at device 9.0 on pci2 > miibus1: on bge0 > brgphy0: on miibus1 > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, > 1000baseTX-FDX, auto > bge0: Ethernet address: 00:e0:81:34:65:28 > bge1: mem > 0xfc8e0000-0xfc8effff,0xfc8d0000-0xfc8dffff irq 25 at device 9.1 on pci2 > miibus2: on bge1 > brgphy1: on miibus2 > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, > 1000baseTX-FDX, auto > bge1: Ethernet address: 00:e0:81:34:65:29 > pci0: at device 10.1 (no > driver attached) > pcib3: at device 11.0 on pci0 > pci1: on pcib3 > pci0: at device 11.1 (no > driver attached) > acpi_button0: on acpi0 > atkbdc0: port 0x60,0x64 irq 1 on acpi0 > atkbd0: flags 0x1 irq 1 on atkbdc0 > kbd0 at atkbd0 > atkbd0: [GIANT-LOCKED] > sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 > on acpi0 > sio0: type 16550A > fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq > 2 on acpi0 > fdc0: [FAST] > fd0: <1440-KB 3.5" drive> on fdc0 drive 0 > orm0: at iomem 0xc0000-0xc7fff on isa0 > ppc0: cannot reserve I/O port range > sc0: at flags 0x100 on isa0 > sc0: VGA <16 virtual consoles, flags=0x300> > sio1: configured irq 3 not in bitmap of probed irqs 0 > sio1: port may not be enabled > vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 > Timecounters tick every 1.000 msec > Waiting 5 seconds for SCSI devices to settle > acd0: CDROM at ata0-slave PIO4 > ahd0: Invalid Sequencer interrupt occurred. >>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< > ahd0: Dumping Card State at program address 0x23c Mode 0x0 > Card was paused > INTSTAT[0x0] SELOID[0x1] SELID[0x0] HS_MAILBOX[0x0] > INTCTL[0x80]:(SWTMINTMASK) SEQINTSTAT[0x0] SAVED_MODE[0x11] > DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|FIFO1FREE) > SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0] > LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0] > SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x0] > SEQINTCTL[0x6]:(INTMASK1|INTMASK2) > SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] QFREEZE_COUNT[0x2] > KERNEL_QFREEZE_COUNT[0x2] MK_MESSAGE_SCB[0xff00] MK_MESSAGE_SCSIID[0xff] > SSTAT0[0x0] SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0] > SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0] > LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x0] > LQOSTAT2[0x0] > > SCB Count = 16 CMDS_PENDING = 0 LASTSCB 0xffff CURRSCB 0xe NEXTSCB 0xff40 > qinstart = 28 qinfifonext = 28 > QINFIFO: > WAITING_TID_QUEUES: > Pending list: > Total 0 > Kernel Free SCB list: 15 14 1 2 3 4 5 6 7 8 9 10 11 12 13 0 > Sequencer Complete DMA-inprog list: > Sequencer Complete list: > Sequencer DMA-Up and Complete list: > Sequencer On QFreeze and Complete list: > > > ahd0: FIFO0 Free, LONGJMP == 0x8000, SCB 0xf > SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) > > SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) > SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] > SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 > HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) > > ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0xe > SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) > > SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) > SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] > SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 > HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) > LQIN: 0x8 0x0 0x0 0xf 0x0 0x1 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 > 0x0 0x0 0x0 0x0 > ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 > ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x1 > ahd0: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0 > > SIMODE0[0xc]:(ENOVERRUN|ENIOERR) > CCSCBCTL[0x4]:(CCSCBDIR) > ahd0: REG0 == 0x8f60, SINDEX = 0x10e, DINDEX = 0x104 > ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff40, SCB_NEXT2 == 0xe > CDB 12 20 0 80 88 e6 > STACK: 0x237 0x2 0x0 0x0 0x0 0x0 0x0 0x0 > <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> > Copied 18 bytes of sense data offset 12: 0x70 0x0 0x6 0x0 0x0 0x0 0x0 > 0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0 > Copied 18 bytes of sense data offset 12: 0x70 0x0 0x6 0x0 0x0 0x0 0x0 > 0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0 > SMP: AP CPU #1 Launched! > da0 at ahd0 bus 0 target 0 lun 0 > da0: Fixed Direct Access SCSI-3 device > da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged > Queueing Enabled > da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C) > da1 at ahd0 bus 0 target 1 lun 0 > da1: Fixed Direct Access SCSI-3 device > da1: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged > Queueing Enabled > da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C) > Trying to mount root from ufs:/dev/da0s1a > > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" -- Guy Helmer, Ph.D. Principal System Architect Palisade Systems, Inc.