From owner-freebsd-scsi  Sat Apr 29  0:12:55 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from panzer.kdm.org (panzer.kdm.org [216.160.178.169])
	by hub.freebsd.org (Postfix) with ESMTP id 008D537B6C1
	for <scsi@FreeBSD.ORG>; Sat, 29 Apr 2000 00:12:52 -0700 (PDT)
	(envelope-from ken@panzer.kdm.org)
Received: (from ken@localhost)
	by panzer.kdm.org (8.9.3/8.9.1) id BAA51121;
	Sat, 29 Apr 2000 01:12:41 -0600 (MDT)
	(envelope-from ken)
Date: Sat, 29 Apr 2000 01:12:40 -0600
From: "Kenneth D. Merry" <ken@kdm.org>
To: John Reynolds <jjreynold@home.com>
Cc: scsi@FreeBSD.ORG
Subject: Re: hardware meltdown or cosmic ray?
Message-ID: <20000429011240.A50701@panzer.kdm.org>
References: <14602.26788.441429.641364@whale.home-net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <14602.26788.441429.641364@whale.home-net>; from jjreynold@home.com on Fri, Apr 28, 2000 at 09:44:20PM -0700
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Fri, Apr 28, 2000 at 21:44:20 -0700, John Reynolds wrote:
> 
> Ok. I've got a question I know that one of you SCSI gurus can help with. I was
> copying some CDs (data) tonight. Rather than chance buffer underrun, I was
> copying the entire disc onto an empty chunk o' disk, then burning those ISOs.
> The command I submitted to copy the cd was:
> 
>  dd if=/dev/cd0c of=disc.iso
> 
> It chugged along with about 200Mb or so snarfed off the disc, then I got these
> messages appearing on my console and the whole SCSI subsystem went "dead"
> (during the timeouts)

One thing you didn't mention was what version of FreeBSD you are using.

I suspect you're using a version that has block devices.  The above command
won't work at all with the raw device, since dd's default block size (512
bytes) isn't a multiple of the CD sector size (2048 bytes).
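
To make the arithmetic concrete, here is a small sh sketch.  The helper
name is_sector_multiple is made up for illustration; it is not part of dd
or FreeBSD.  It only checks whether a given block size is a whole multiple
of a given sector size:

```sh
#!/bin/sh
# Hypothetical helper, not part of dd or FreeBSD: succeeds when the
# block size ($1) is a whole multiple of the sector size ($2).
is_sector_multiple() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# Compare dd's default block size (512) and 64k (65536) against the
# 2048-byte CD sector size.
for bs in 512 65536; do
    if is_sector_multiple "$bs" 2048; then
        echo "bs=$bs: multiple of 2048"
    else
        echo "bs=$bs: not a multiple of 2048"
    fi
done
```

512 fails the check because it is smaller than one sector, while 64k
passes because it is 32 sectors.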

> Apr 28 18:44:17 whale /kernel: dscheck: b_bcount 512 is not on a sector boundary (ssize 2048)

For some reason, the block device code issued a read that wasn't a multiple
of the sector size, thus the above error.

> Apr 28 18:47:04 whale /kernel: (cd0:ahc0:0:3:0): SCB 0x2 - timed out in dataout phase, SEQADDR == 0x5d
> Apr 28 18:47:40 whale /kernel: (cd0:ahc0:0:3:0): Other SCB Timeout
> Apr 28 18:47:40 whale /kernel: (da1:ahc0:0:6:0): SCB 0x3b - timed out in dataout phase, SEQADDR == 0x5d
> Apr 28 18:47:40 whale /kernel: (da1:ahc0:0:6:0): BDR message in message buffer
> Apr 28 18:47:40 whale /kernel: (da1:ahc0:0:6:0): SCB 0x2d - timed out in dataout phase, SEQADDR == 0x5d
> Apr 28 18:47:40 whale /kernel: (da1:ahc0:0:6:0): no longer in timeout, status = 34b
> Apr 28 18:47:40 whale /kernel: ahc0: Issued Channel A Bus Reset. 7 SCBs aborted

It looks like the bus is wedged there.  Note that it happened about three
minutes after the error message you got, so they may not be related.  In
fact, they probably aren't, because the dscheck error above means that the
misaligned read was rejected and never issued to the drive.

[ same thing a couple of hours later ]

> At that point the machine was just hung solid and required the reset button. I
> then powered it down completely and brought it back up. A subsequent dd command
> produced no weird behavior:
> 
> root@whale [/data/isos]<21># dd if=/dev/cd0c of=disc.iso
> 985332+0 records in
> 985332+0 records out
> 504489984 bytes transferred in 224.169648 secs (2250483 bytes/sec)
> 
> (not a full disc).
> 
> Any idea by the looks of these error messages if I'm having some sort of
> H/W meltdown or is your best guess that this was a random occurrence (cosmic
> rays I like to blame)???

In general, "timed out in dataout/datain phase" messages mean that you have
some sort of cabling or termination problem.  That may or may not be the
case here, I dunno.

It may be a cabling or connector problem that is triggered by using the
cdrom drive.

In any case, you should probably use the raw device, with a block size that
is a multiple of the sector size, to get an image of the disc, like this:

dd if=/dev/rcd0c of=disc.iso bs=64k

I doubt that will fix the problem, but it would be interesting to see
whether you can reproduce it with the above command.  If you can, I would
start looking for cabling or termination problems.
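
If you want to repeat the test a few times, something like the following
sh sketch might help.  image_disc is a hypothetical helper of my own, and
the sanity check it does (the image size being a whole number of 2048-byte
sectors) is only an assumption about what a clean read should look like;
it won't detect bus problems by itself, so keep watching the console for
timeouts.

```sh
#!/bin/sh
# Sketch: copy a disc with a sector-aligned block size, then check that
# the resulting image is a whole number of 2048-byte sectors.
# image_disc is a hypothetical helper: $1 is the source device or file,
# $2 is the output image.
image_disc() {
    src=$1
    out=$2
    # 64k is 32 * 2048, so every read is a multiple of the CD sector size.
    dd if="$src" of="$out" bs=64k 2>/dev/null || return 1
    size=$(wc -c < "$out")
    # A truncated or misaligned copy would leave a partial sector behind.
    [ $(( size % 2048 )) -eq 0 ]
}
```

You would call it as "image_disc /dev/rcd0c disc.iso" and look at the exit
status; adjust the device name for your system.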

Ken
-- 
Kenneth Merry
ken@kdm.org

