From: Paul English
To: freebsd-scsi@freebsd.org
Date: Thu, 10 Jul 2003 16:39:29 -0700 (PDT)
Subject: SCSI errors and device hang
Message-ID: <20030710163318.G14925-100000@dynamic.hydro.washington.edu>

All of a sudden some SCSI errors have started cropping up on a system that has been stable 24/7 under light and heavy usage for more than a year. It probably hadn't been rebooted in at least 90 days. I'm mainly concerned about whether these errors are FreeBSD driver/OS related, or whether there is something wrong with the device (an Arena EX3 6 Bay IDE Desktop unit from raidweb.com). If the problem can definitely be pinned on the unit, then I can talk to raidweb tech support and hopefully get it resolved.

I suspect that it is the device, because during the ensuing (purposeful) reboots it occasionally hangs in the SCSI controller BIOS partway through detecting the drive.

Following are the SCSI errors I'm seeing. These result in the filesystem (mounted as /raid) hanging, but otherwise the system keeps ticking:

(da0:ahc0:0:11:0): SCB 0xc - timed out
ahc0: Dumping Card State while idle, at SEQADDR 0x7
ACCUM = 0x95, SINDEX = 0xd, DINDEX = 0x8c, ARG_2 = 0x0
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0x2
DFCNTRL = 0x0, DFSTATUS = 0x29
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x5, SSTAT1 = 0xa
STACK == 0x3, 0xec, 0x147, 0x0
SCB count = 20
Kernel NEXTQSCB = 18
Card NEXTQSCB = 18
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries: 2:12
QOUTFIFO entries:
Sequencer Free SCB List: 0 1 3 4 5 6 7 8 9 10 11 12 13 14 15
Pending list: 12
Kernel Free SCB list: 13 1 16 8 2 17 3 4 14 9 6 5 7 19 0 15 11 10
sg[0] - Addr 0xa85b000 : Length 4096
sg[1] - Addr 0x7f3c000 : Length 4096
sg[2] - Addr 0x69fd000 : Length 4096
sg[3] - Addr 0x5a1e000 : Length 4096
sg[4] - Addr 0xadff000 : Length 4096
sg[5] - Addr 0xa700000 : Length 4096
sg[6] - Addr 0x168a1000 : Length 4096
sg[7] - Addr 0xaf42000 : Length 4096
sg[8] - Addr 0xaca3000 : Length 4096
sg[9] - Addr 0x9e04000 : Length 4096
sg[10] - Addr 0x8985000 : Length 4096
sg[11] - Addr 0x97c6000 : Length 4096
sg[12] - Addr 0x8447000 : Length 4096
sg[13] - Addr 0xaa48000 : Length 4096
sg[14] - Addr 0xa349000 : Length 4096
sg[15] - Addr 0xa4aa000 : Length 4096
(da0:ahc0:0:11:0): Queuing a BDR SCB
(da0:ahc0:0:11:0): Bus Device Reset Message Sent
(da0:ahc0:0:11:0): no longer in timeout, status = 34b
ahc0: Bus Device Reset on A:11. 1 SCBs aborted

(da0:ahc0:0:11:0): SCB 0x12 - timed out
ahc0: Dumping Card State while idle, at SEQADDR 0x7
ACCUM = 0x97, SINDEX = 0x64, DINDEX = 0x65, ARG_2 = 0x4
HCNT = 0x0
SCSISEQ = 0x12, SBLKCTL = 0x2
DFCNTRL = 0x0, DFSTATUS = 0x2d
LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
SSTAT0 = 0x5, SSTAT1 = 0xa
STACK == 0x3, 0x188, 0x0, 0xcb
SCB count = 20
Kernel NEXTQSCB = 12
Card NEXTQSCB = 12
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries: 2:18
QOUTFIFO entries:
Sequencer Free SCB List: 0 1 3 4 5 6 7 8 9 10 11 12 13 14 15
Pending list: 18
Kernel Free SCB list: 13 1 16 8 2 17 3 4 14 9 6 5 7 19 0 15 11 10
sg[0] - Addr 0xa85b000 : Length 4096
sg[1] - Addr 0x7f3c000 : Length 4096
sg[2] - Addr 0x69fd000 : Length 4096
sg[3] - Addr 0x5a1e000 : Length 4096
sg[4] - Addr 0xadff000 : Length 4096
sg[5] - Addr 0xa700000 : Length 4096
sg[6] - Addr 0x168a1000 : Length 4096
sg[7] - Addr 0xaf42000 : Length 4096
sg[8] - Addr 0xaca3000 : Length 4096
sg[9] - Addr 0x9e04000 : Length 4096
sg[10] - Addr 0x8985000 : Length 4096
sg[11] - Addr 0x97c6000 : Length 4096
sg[12] - Addr 0x8447000 : Length 4096
sg[13] - Addr 0xaa48000 : Length 4096
sg[14] - Addr 0xa349000 : Length 4096
sg[15] - Addr 0xa4aa000 : Length 4096
(da0:ahc0:0:11:0): Queuing a BDR SCB
(da0:ahc0:0:11:0): Bus Device Reset Message Sent
(da0:ahc0:0:11:0): no longer in timeout, status = 34b
ahc0: Bus Device Reset on A:11. 1 SCBs aborted
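
For anyone who wants to tally these events across a longer log, a minimal Python sketch along the following lines might help. It assumes the message shapes are exactly those quoted in this post. The function name summarize and the regular expressions are illustrative only and are not part of any FreeBSD tool. In the two dumps here the SCB number printed in hex on the "timed out" line (0xc, 0x12) appears to match the decimal entry on the "Pending list" line (12, 18), so the sketch converts the hex value to make that comparison easier.

```python
import re

# Patterns for the message shapes quoted in this post (an assumption:
# other driver versions may word these lines differently).
TIMEOUT_RE = re.compile(
    r"\((\w+):(\w+):(\d+):(\d+):(\d+)\): SCB (0x[0-9a-fA-F]+) - timed out"
)
PENDING_RE = re.compile(r"Pending list:((?: \d+)*)")
RESET_RE = re.compile(
    r"(\w+): Bus Device Reset on ([A-Z]):(\d+)\.\s+(\d+) SCBs aborted"
)


def summarize(log_text):
    """Collect timed-out SCBs, pending lists and bus device resets."""
    timeouts = [
        {
            "periph": m.group(1),
            "controller": m.group(2),
            "target": int(m.group(4)),
            "scb": int(m.group(6), 16),  # hex in the log, e.g. 0xc -> 12
        }
        for m in TIMEOUT_RE.finditer(log_text)
    ]
    # Each "Pending list:" line becomes a list of decimal SCB numbers.
    pending = [
        [int(n) for n in m.group(1).split()]
        for m in PENDING_RE.finditer(log_text)
    ]
    resets = [
        {
            "controller": m.group(1),
            "channel": m.group(2),
            "target": int(m.group(3)),
            "aborted": int(m.group(4)),
        }
        for m in RESET_RE.finditer(log_text)
    ]
    return {"timeouts": timeouts, "pending": pending, "resets": resets}


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < saved-kernel-messages.txt
    report = summarize(sys.stdin.read())
    for t in report["timeouts"]:
        print("timeout: %(periph)s on %(controller)s target %(target)d, SCB %(scb)d" % t)
    for r in report["resets"]:
        print("reset: %(controller)s channel %(channel)s target %(target)d, %(aborted)d aborted" % r)
```

If every timeout and reset names the same target, as it does in the two dumps above, that at least narrows the problem to the path between the controller and that one device. It does not by itself say whether the driver or the unit is at fault.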