Date: Mon, 17 Dec 2012 22:45:00 +0100
From: Willem Jan Withagen
To: Jim Harris
Cc: FreeBSD Stable Users, Steven Hartland
Subject: Re: Strange CAM errors

On 17-12-2012 20:16, Jim Harris wrote:
> On Mon, Dec 17, 2012 at 9:26 AM, Willem Jan Withagen
> wrote:
>
> On 2012-12-17 15:38, Steven Hartland wrote:
> > Check the smart results of each disk in the array you may have a
> > failing disk.
> > ----- Original Message ----- From: "Willem Jan Withagen"
> > To: "FreeBSD Stable Users"
> > Sent: Monday, December 17, 2012 10:58 AM
> > Subject: Strange CAM errors
> >
> >> Hi,
> >>
> >> I have not noticed this before, but my system rebooted this morning and
> >> in the following security report I found a lot of messages in the
> >> dmesg-part like:
> >>
> >> +(probe0:arcmsr0:0:16:1): INQUIRY. CDB: 12 20 0 0 24 0
> >> +(probe0:arcmsr0:0:16:1): CAM status: Command timeout
> >> +(probe0:arcmsr0:0:16:1): Retrying command
> >> +(probe0:arcmsr0:0:16:1): INQUIRY. CDB: 12 20 0 0 24 0
> >> +(probe0:arcmsr0:0:16:1): CAM status: Command timeout
> >> +(probe0:arcmsr0:0:16:1): Retrying command
> >>
> >> And it seems that bus 16 is:
> >> +pass6 at arcmsr0 bus 0 scbus0 target 16 lun 0
> >> +pass6: Fixed Processor SCSI-0 device
> >>
> >> The system has been running
> >> FreeBSD zfs.digiware.nl 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #3: Wed
> >> Nov 14 13:25:55 CET 2012
> >> root@zfs.digiware.nl:/usr/obj/usr/srcs/src9/src/sys/ZFS amd64
> >> for already a while.
> >>
> >> Anybody suggestions as to why I have these messages?
> >>
> >> They are during the boot sequence, so no smartd talking to the disks at
> >> that moment.
> >>
> >> --WjW
> >>
> >> ps: dmesg, config, etc.... at:
> >> http://www.tegenbosch28.nl/FreeBSD/Systems/ZFS
> >> ps2: upgrading to the most recent 9.1
>
> 'mmm,
>
> Smartd seems to think otherwise...
>
> 'camcontrol rescan all' actually delivers the same pack of errors.
>
> --WjW
>
>
> The timeouts are occurring on inquiry commands to non-zero LUNs.
> arcmsr(4) is returning CAM_SEL_TIMEOUT instead of CAM_DEV_NOT_THERE for
> inquiry commands to this device and LUN > 0.
> CAM_DEV_NOT_THERE is preferred to remove these types of warnings, and
> similar patches have gone in for other SCSI drivers recently.
>
> Can you try this patch?
>
> Index: sys/dev/arcmsr/arcmsr.c
> ===================================================================
> --- sys/dev/arcmsr/arcmsr.c (revision 244190)
> +++ sys/dev/arcmsr/arcmsr.c (working copy)
> @@ -2439,7 +2439,7 @@
>      char *buffer=pccb->csio.data_ptr;
>
>      if (pccb->ccb_h.target_lun) {
> -        pccb->ccb_h.status |= CAM_SEL_TIMEOUT;
> +        pccb->ccb_h.status |= CAM_DEV_NOT_THERE;
>          xpt_done(pccb);
>          return;
>      }
>

Hi Jim,

The noise has gone down by a factor of 5, now I get:

(probe6:arcmsr0:0:16:1): INQUIRY. CDB: 12 20 0 0 24 0
(probe6:arcmsr0:0:16:1): CAM status: Unable to terminate I/O CCB request
(probe6:arcmsr0:0:16:1): Error 5, Unretryable error
(probe6:arcmsr0:0:16:2): INQUIRY. CDB: 12 40 0 0 24 0

Which is defined in sys/cam/cam.c .... as CAM_UA_TERMIO, but that error
is nowhere set in the arcmsr code....

So I clearly do not yet know enough to help in this.

--WjW

For all of the ports on the adapter.
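
One possible reading of the two status strings seen in this thread, offered
as a sketch and not as a confirmed diagnosis: the patched line uses "|=" and
not "=", so whatever is already in ccb_h.status gets OR-ed with the new code.
The numeric values below are an assumption, written from memory of the
cam_status enum in sys/cam/cam.h, and should be checked against the tree in
use. That the status field already holds CAM_REQ_CMP (0x01) at this point is
also only an assumption; nothing in this thread verifies it. The SK_ names
and both helper functions are hypothetical and exist only for illustration.

```c
/*
 * Sketch only: models how "status |= code" can turn into a different
 * CAM status when a low bit is already set. Values are ASSUMED to
 * mirror the cam_status enum in sys/cam/cam.h; verify before relying
 * on them. All SK_ names and helpers are hypothetical.
 */
enum sk_cam_status {
    SK_CAM_REQ_INPROG    = 0x00, /* request in progress */
    SK_CAM_REQ_CMP       = 0x01, /* completed without error */
    SK_CAM_DEV_NOT_THERE = 0x08, /* device not installed/there */
    SK_CAM_UA_TERMIO     = 0x09, /* unable to terminate I/O CCB request */
    SK_CAM_SEL_TIMEOUT   = 0x0a, /* target selection timeout */
    SK_CAM_CMD_TIMEOUT   = 0x0b  /* command timeout */
};

/* What the driver line does: OR the new code into the old status. */
static unsigned sk_or_status(unsigned current, unsigned code)
{
    return current | code;
}

/* Alternative for comparison: overwrite the old status entirely. */
static unsigned sk_set_status(unsigned current, unsigned code)
{
    (void)current; /* old value deliberately discarded */
    return code;
}
```

Under those assumptions, OR-ing SK_CAM_SEL_TIMEOUT into a status of 0x01
would read back as "Command timeout", and OR-ing SK_CAM_DEV_NOT_THERE into
the same status would read back as "Unable to terminate I/O CCB request",
which would match both sets of messages above. Plain assignment, as in
sk_set_status(), would not have that side effect.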