From owner-freebsd-scsi  Mon Aug 30  2:15:19 1999
Message-Id: <199908300912.TAA00547@nymph.detir.qld.gov.au>
To: freebsd-current@freebsd.org, freebsd-scsi@freebsd.org
Cc: syssgm@detir.qld.gov.au
Subject: SCSI surprise! (was: Softupdates reliability?)
Date: Mon, 30 Aug 1999 19:12:15 +1000
From: Stephen McKay <syssgm@detir.qld.gov.au>
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

[I'm trying my first crosspost experiment here.  Please follow up to -scsi.]

A week ago I posted my strange crash and subsequent doubts about the proper
functioning of softupdates.  This is more of the story.

I examined the lost+found directory more closely.  The few files that I
traced were all temporary files or newly created directories (ports,
actually) created by the CTM update process.  So, maybe I didn't really
lose anything.  Maybe fsck just doesn't recognise one of the safe-but-crashed
modes you get when using softupdates.  But unfortunately, I needed a CVS tree
urgently and restored a backup.  To make up for this, I promise to do serious
destruction testing of softupdates soon.

But, I had another crash almost as soon as I started using the machine again.
Again, the Exabyte was in use (though only rewinding at the time), but the
obvious trigger this time was intense disk activity (from "rm").  The active
file system was not using softupdates, and had a number of fsck -p correctable
errors on reboot.  Conclusions:

1) The Exabyte was not to blame for the crash
2) The crash wasn't a "scribble junk" crash (first one probably wasn't either)
3) Regular mounts are still safer than softupdates

I took the lid off anyway hoping to find anything at all weird and noticed
something I had forgotten.  I was using a Seagate ST51080N 1GB disk earlier
for some experimenting and had disconnected the POWER, but not the SCSI CABLE.
(It's a really noisy drive!) When I also unplugged the SCSI cable, all crashes
stopped.  I've now used the machine intensively for several days (copying over
20GB of small and big files, and reading and writing several tapes) without
incident.  Conclusions:

4) My stepping of K6-2/300 is just fine
5) My Exabyte really is ok :-)
6) It is NOT safe to have a powered down SCSI device attached to a SCSI chain
7) The world really is a wonderful place ;-)

So, apart from being happy at having stable hardware again, I am intensely
curious about this.  Why is a powered down SCSI device so nasty?  For example,
the first crash locked up my SCSI card so that reset didn't fix it, and the
second crash hung one of my disks so that it had to be powered down to even
be recognised!  Is there a standard for this stuff?

Stephen.

