From owner-freebsd-current@FreeBSD.ORG Fri Apr 2 13:46:31 2004
From: Sven Willenberger
To: freebsd-current@freebsd.org
In-Reply-To: <1079649669.26805.77.camel@lanshark.dmv.com>
References: <1079649669.26805.77.camel@lanshark.dmv.com>
Message-Id: <1080942386.23534.26.camel@lanshark.dmv.com>
Date: Fri, 02 Apr 2004 16:46:27 -0500
Subject: Re: odd dmesg scsi error with aic7902 controller

On Thu, 2004-03-18 at 17:41, Sven Willenberger wrote:
> Have just moved to supermicro 1U boxes using 80-pin SCA drives (Seagate
> ST336607LC) and U320 adaptec on-board controllers (aic7902) and the
> error shown in the dmesg below crops up during bootup.
>
> The odd thing is that it only occurs if 2 or more drives are connected;
> if only 1 drive is physically attached, the boot up sequence is smooth.
> These boxes are not yet in production and I hesitate to do so if the
> error message is an indication of future problems. This same message
> does occur in FreeBSD 4.9-Release also.
> The following dmesg snippet
> comes from a 5.2.1-Release system:
>
> .
> Waiting 15 seconds for SCSI devices to settle
> ahd0: Invalid Sequencer interrupt occurred.
> >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
> ahd0: Dumping Card State at program address 0x216 Mode 0x0
> Card was paused
> HS_MAILBOX[0x0] INTCTL[0x80]:(SWTMINTMASK) SEQINTSTAT[0x0]
> SAVED_MODE[0x11] DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|FIFO1FREE)
> SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0]
> LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0]
> SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x0]
> SEQINTCTL[0x6]:(INTMASK1|INTMASK2)
> SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x0] SSTAT1[0x0]
> SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
> SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO)
> LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0]
> LQOSTAT1[0x0] LQOSTAT2[0x0]
>
> SCB Count = 16 CMDS_PENDING = 0 LASTSCB 0xffff CURRSCB 0x9 NEXTSCB
> 0xff80
> qinstart = 38 qinfifonext = 41
> QINFIFO: 0xe 0x9 0xf
> WAITING_TID_QUEUES:
> Pending list:
> 15 FIFO_USE[0x0] SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x17]
> 9 FIFO_USE[0x0] SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7]
> 14 FIFO_USE[0x0] SCB_CONTROL[0x48]:(STATUS_RCVD|DISCENB)
> SCB_SCSIID[0x7]
> Total 3
> Kernel Free SCB list: 1 2 3 4 5 6 7 8 10 11 12 13 0
> Sequencer Complete DMA-inprog list:
> Sequencer Complete list:
> Sequencer DMA-Up and Complete list:
>
> ahd0: FIFO0 Free, LONGJMP == 0x8000, SCB 0xf
> SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS)
> SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
> SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
> SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
> HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
> ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0x9
> SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS)
> SEQINTSRC[0x0] DFCNTRL[0x0]
> DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
> SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
> SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
> HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
> LQIN: 0x8 0x0 0x0 0xf 0x0 0x1 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
> 0x0 0x0 0x0 0x0
> ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x52
> ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x1
>
> SIMODE0[0xc]:(ENOVERRUN|ENIOERR)
> CCSCBCTL[0x4]:(CCSCBDIR)
> ahd0: REG0 == 0xb660, SINDEX = 0x10e, DINDEX = 0x104
> ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff80, SCB_NEXT2 == 0xffc8
> CDB 12 20 0 80 88 a6
> STACK: 0x211 0x2 0x0 0x0 0x0 0x0 0x0 0x0
> <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
> ses0 at ahd0 bus 0 target 6 lun 0
> ses0: Fixed Processor SCSI-2 device
> ses0: 3.300MB/s transfers
> ses0: SAF-TE Compliant Device
> GEOM: create disk da0 dp=0xc86c8850
> GEOM: create disk da1 dp=0xc86cd050
> Copied 18 bytes of sense data offset 12: 0x70 0x0 0x6 0x0 0x0 0x0 0x0
> 0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #3 Launched!
> Copied 18 bytes of sense data offset 12: 0x70 0x0 0x6 0x0 0x0 0x0 0x0
> 0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0
> da0 at ahd0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-3 device
> da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged
> Queueing Enabled
> da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
> da1 at ahd0 bus 0 target 1 lun 0
> da1: Fixed Direct Access SCSI-3 device
> da1: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged
> Queueing Enabled
> da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
>

As a follow-up, I have been in contact with Seagate about this issue.
Their recommendations were to check for IRQ conflicts, check for bad
cabling, disable SMP (not!), and so on. Installing IBM/Hitachi U320
drives does not produce this boot-up message, so I do not believe that
any of the suggested causes is the culprit.
That leaves either a timing(?) issue in the SCSI probe code or some
incompatibility between the Seagate drives and the Supermicro hardware.
All firmware and BIOS versions are as up to date as possible. I have not
had the wherewithal to see whether the Seagate drives actually suffer
during normal use on these systems, as I am reluctant to put any real
workload (read "production") on them. Barring further investigation, my
recommended fix is to use drives other than Seagate with this particular
hardware setup; I have had a few off-list responses from others with
similar problems.

Sven
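
For anyone reading the quoted dmesg, the two "Copied 18 bytes of sense data" lines can be decoded by hand. The sketch below assumes the bytes are fixed-format sense data as laid out in the SCSI Primary Commands spec (response code in byte 0, sense key in the low nibble of byte 2, ASC and ASCQ in bytes 12 and 13). The function name `decode_fixed_sense` and the lookup tables are hypothetical and deliberately partial; they are not part of the ahd driver or of CAM.

```python
# Minimal sketch: decode fixed-format SCSI sense data from the dmesg above.
# Assumption: byte offsets follow the SPC fixed-format sense layout.
# The tables are illustrative and list only a few well-known codes.

SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}

ASC_ASCQ = {
    (0x29, 0x00): "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED",
    (0x29, 0x01): "POWER ON OCCURRED",
    (0x29, 0x02): "SCSI BUS RESET OCCURRED",
}


def decode_fixed_sense(data):
    """Return the sense key, ASC and ASCQ from fixed-format sense bytes."""
    if len(data) < 14:
        raise ValueError("need at least 14 bytes to read ASC/ASCQ")
    response_code = data[0] & 0x7F  # top bit is the VALID flag
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = data[2] & 0x0F
    asc, ascq = data[12], data[13]
    return {
        "sense_key": key,
        "sense_key_name": SENSE_KEYS.get(key, "unknown"),
        "asc": asc,
        "ascq": ascq,
        "description": ASC_ASCQ.get((asc, ascq), "unknown"),
    }


# The 18 bytes exactly as printed in the dmesg.
DMESG_SENSE = (
    "0x70 0x0 0x6 0x0 0x0 0x0 0x0 "
    "0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0"
)
sense_bytes = [int(tok, 16) for tok in DMESG_SENSE.split()]
decoded = decode_fixed_sense(sense_bytes)
print(decoded)
```

Read this way, both sense reports are a UNIT ATTENTION with ASC/ASCQ 0x29/0x02, i.e. the drive reporting that a SCSI bus reset occurred. That is consistent with the controller resetting the bus after the sequencer interrupt, but it does not by itself say what triggered the interrupt.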