From owner-freebsd-scsi@FreeBSD.ORG Thu Jan 20 20:01:09 2005 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3B8A116A4CF for ; Thu, 20 Jan 2005 20:01:09 +0000 (GMT) Received: from www16.your-server.de (www16.your-server.de [213.133.104.16]) by mx1.FreeBSD.org (Postfix) with SMTP id 171D143D2F for ; Thu, 20 Jan 2005 20:01:08 +0000 (GMT) (envelope-from galtmann@las-cad.com) Received: (qmail 14462 invoked by uid 1825); 20 Jan 2005 20:01:06 -0000 Received: from galtmann@las-cad.com by www16.your-server.de by uid 502 with qmail-scanner-1.22 (clamdscan: 0.71. Clear:RC:1(84.154.35.141):. Processed in 0.188696 secs); 20 Jan 2005 20:01:06 -0000 X-Qmail-Scanner-Mail-From: galtmann@las-cad.com via www16.your-server.de X-Qmail-Scanner: 1.22 (Clear:RC:1(84.154.35.141):. Processed in 0.188696 secs) Received: from p549a238d.dip0.t-ipconnect.de (HELO asterix.las-cad.local) (84.154.35.141) by www16.your-server.de with SMTP; 20 Jan 2005 20:01:05 -0000 Received: from [192.168.98.23] (mnch-d9ba47c4.pool.mediaWays.net [217.186.71.196]) (authenticated bits=0)j0KK0bia013638 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 20 Jan 2005 21:00:38 +0100 (CET) (envelope-from galtmann@las-cad.com) Date: Thu, 20 Jan 2005 21:00:35 +0100 From: Georg Altmann To: freebsd-scsi@freebsd.org Message-ID: X-Mailer: Mulberry/3.1.6 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Scanned-By: MIMEDefang 2.35 Subject: card state dump during tape backup X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Jan 2005 20:01:09 -0000 Hi, I am having troubles with our tape backup. The system has been running stable for about 3 month now after a tape drive failure/replacement and maybe for one year before that. The system (hardware/kernel/software) has not been touched during this period. A week ago the backup software (Netvault 7.1) showed the tape drive being offline and I found a scsi card dump in the logs (see below). The problem recurred almost daily the past few days during the differential backups (1-3 GB maybe). Still I am able to perform larger test backups (10 GB) without the problem occurring. After restarting netvault and doing a camcontrol reset all, the backup is running fine again for 12h to 48h. We are using an ADIC Fastor 2 tape library with a LTO1 tape drive attached to an Adaptec 29160 ultra160 adapter. FreeBSD is 4.10-RELEASE-p4. Machine details see below. Although I don't suspect this to be the cause: termination is set to automatic for LVD in the adapter BIOS (68pin external connector) and an active terminator is plugged on the library side. I checked cables again 2 days ago. My first guess was to disable wide negotiation for all devices and set the tape drive to 40 MB/s in the adapter BIOS, but I doesn't seem to help. What is this? Is the hardware starting to fail? And if so is likely to be something on the SCSI bus e.g. the drive or is it the motherboard or the SCSI card failing? I mean it is very unlikely for it to be some kind of kernel or software failure, right? Any help would be greatly appreciated. Best regards, Georg # pciconf -lv ahc0@pci0:12:0: class=0x010000 card=0xe2a09005 chip=0x00809005 rev=0x02 hdr=0x00 vendor = 'Adaptec' device = '29160/N/LP Ultra160 (AIC-7892A) SCSI Host Adapter' class = mass storage subclass = SCSI # camcontrol devlist at scbus0 target 0 lun 0 (pass0) at scbus0 target 5 lun 0 (pass1,sa0) relevant (?) kernel options: options SCSI_DELAY=8000 options CAMDEBUG options AHC_ALLOW_MEMIO /var/run/dmesg.boot: CPU: AMD Athlon(tm) Processor (1102.51-MHz 686-class CPU) Origin = "AuthenticAMD" Id = 0x642 Stepping = 2 Features=0x183f9ff AMD Features=0xc0440000 real memory = 805240832 (786368K bytes) avail memory = 780029952 (761748K bytes) pci0: at 7.2 irq 9 pci0: at 7.3 irq 9 ahc0: port 0xbc00-0xbcff mem 0xdfffb000-0xdfffbfff irq 5 at device 12.0 on pci0 aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs sa0 at ahc0 bus 0 target 5 lun 0 sa0: Removable Sequential Access SCSI-3 device sa0: 40.000MB/s transfers (40.000MHz, offset 15) pass0 at ahc0 bus 0 target 0 lun 0 pass0: Removable Changer SCSI-2 device pass0: 3.300MB/s transfers Here comes the dump: Jan 19 22:46:01 asterix /kernel: ahc0: Recovery Initiated Jan 19 22:46:01 asterix /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< Jan 19 22:46:01 asterix /kernel: ahc0: Dumping Card State while idle, at SEQADDR 0x9 Jan 19 22:46:01 asterix /kernel: Card was paused Jan 19 22:46:01 asterix /kernel: ACCUM = 0x4, SINDEX = 0x7, DINDEX = 0x21, ARG_2 = 0x3 Jan 19 22:46:01 asterix /kernel: HCNT = 0x0 SCBPTR = 0x1 Jan 19 22:46:01 asterix /kernel: SCSIPHASE[0x0] SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0] Jan 19 22:46:01 asterix /kernel: LASTPHASE[0x1]:(P_BUSFREE) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI) Jan 19 22:46:01 asterix /kernel: SBLKCTL[0xa]:(SELWIDE|SELBUSB) SCSIRATE[0x0] SEQCTL[0x10]:(FASTMODE) Jan 19 22:46:01 asterix /kernel: SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIED) SSTAT0[0x0] Jan 19 22:46:01 asterix /kernel: SSTAT1[0x8]:(BUSFREE) SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8]:(ENSWRAP) Jan 19 22:46:01 asterix /kernel: SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) SXFRCTL0[0x80]:(DFON) Jan 19 22:46:01 asterix /kernel: DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) Jan 19 22:46:01 asterix /kernel: STACK: 0x0 0x164 0x179 0x3 Jan 19 22:46:01 asterix /kernel: SCB count = 20 Jan 19 22:46:01 asterix /kernel: Kernel NEXTQSCB = 4 Jan 19 22:46:01 asterix /kernel: Card NEXTQSCB = 4 Jan 19 22:46:01 asterix /kernel: QINFIFO entries: Jan 19 22:46:01 asterix /kernel: Waiting Queue entries: Jan 19 22:46:01 asterix /kernel: Disconnected Queue entries: 1:14 Jan 19 22:46:01 asterix /kernel: QOUTFIFO entries: Jan 19 22:46:01 asterix /kernel: Sequencer Free SCB List: 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 Jan 19 22:46:01 asterix /kernel: Sequencer SCB Info: Jan 19 22:46:01 asterix /kernel: 0 SCB_CONTROL[0xc0]:(DISCENB|TARGET_SCB) SCB_SCSIID[0x7] Jan 19 22:46:01 asterix /kernel: SCB_LUN[0x0] SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 1 SCB_CONTROL[0x44]:(DISCONNECTED|DISCENB) SCB_SCSIID[0x7] Jan 19 22:46:01 asterix /kernel: SCB_LUN[0x0] SCB_TAG[0xe] Jan 19 22:46:01 asterix /kernel: 2 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 3 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 4 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 5 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 6 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 7 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 12 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 16 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 17 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 18 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 19 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 20 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 21 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 22 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 23 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 24 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 25 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 26 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 27 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 28 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 29 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 30 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: 31 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Jan 19 22:46:01 asterix /kernel: Pending list: Jan 19 22:46:01 asterix /kernel: 14 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] SCB_LUN[0x0] Jan 19 22:46:01 asterix /kernel: Kernel Free SCB list: 9 15 16 17 18 19 0 1 2 3 5 6 7 8 13 12 11 10 Jan 19 22:46:01 asterix /kernel: Untagged Q(0): 14 Jan 19 22:46:01 asterix /kernel: Jan 19 22:46:01 asterix /kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): SCB 0xe - timed out Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): Queuing a BDR SCB Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): Bus Device Reset Message Sent Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): no longer in timeout, status = 34b Jan 19 22:46:01 asterix /kernel: ahc0: Bus Device Reset on A:0. 1 SCBs aborted Jan 19 22:46:01 asterix /kernel: ahc0: Timedout SCBs already complete. Interrupts may not be functioning.