From owner-freebsd-scsi  Thu Oct 22 20:40:26 1998
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id UAA15281
          for freebsd-scsi-outgoing; Thu, 22 Oct 1998 20:40:26 -0700 (PDT)
          (envelope-from owner-freebsd-scsi@FreeBSD.ORG)
Received: from pluto.plutotech.com (mail.plutotech.com [206.168.67.137])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id UAA15276;
          Thu, 22 Oct 1998 20:40:25 -0700 (PDT)
          (envelope-from gibbs@plutotech.com)
Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130])
	by pluto.plutotech.com (8.8.7/8.8.5) with ESMTP id VAA25994;
	Thu, 22 Oct 1998 21:39:53 -0600 (MDT)
Message-Id: <199810230339.VAA25994@pluto.plutotech.com>
X-Mailer: exmh version 2.0.2 2/24/98
To: Don Lewis <Don.Lewis@tsc.tdk.com>
cc: freebsd-fs@FreeBSD.ORG, freebsd-scsi@FreeBSD.ORG
Subject: Re: filesystem safety and SCSI disk write caching 
In-reply-to: Your message of "Thu, 22 Oct 1998 17:13:09 PDT."
             <199810230013.RAA19305@salsa.gv.tsc.tdk.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 22 Oct 1998 21:33:03 -0600
From: "Justin T. Gibbs" <gibbs@plutotech.com>
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>} I've already given my opinion on this.  I believe the Hawk is seeing
>} a power glitch or temporary power loss when the reset switch is hit and
>} so the contents of the cache are lost.  I have never said that the
>} behavior that Don Lewis is seeing is 'not possible', only that, for
>} the drive in question, the reset causing cache corruption is not likely.
>
>If this problem is caused by a load-related power glitch, then it is
>possible to get silent filesystem corruption in normal operation if
>write caching is enabled, since cached writes could get lost and the
>driver might never notice.

The driver will notice, because the drive will notice and issue a Unit
Attention response the next time you touch it.  Or is your point that you
won't necessarily access the device again and so never see that the device
saw a loss in power?
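
For illustration, here is a minimal sketch of how a driver could recognize
that condition in the sense data returned with a CHECK CONDITION.  It
assumes fixed-format sense data (response code 70h or 71h), where the sense
key is the low nibble of byte 2 and the additional sense code (ASC) is byte
12; sense key 6h is UNIT ATTENTION and ASC 29h is "power on, reset, or bus
device reset occurred".  The function and macro names are hypothetical and
are not taken from the CAM sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SENSE_KEY_UNIT_ATTENTION 0x06 /* sense key 6h */
#define ASC_POWER_ON_OR_RESET    0x29 /* ASC 29h */

/*
 * True if the buffer holds fixed-format sense data that is long enough
 * to contain the ASC byte.  Requiring 14 bytes is a simplification made
 * for this sketch.
 */
bool
sense_is_fixed_format(const uint8_t *s, size_t len)
{
	assert(s != NULL);
	if (len < 14)
		return false;
	/* Mask off the VALID bit before comparing the response code. */
	return (s[0] & 0x7f) == 0x70 || (s[0] & 0x7f) == 0x71;
}

/* True if the device reported a Unit Attention. */
bool
sense_is_unit_attention(const uint8_t *s, size_t len)
{
	return sense_is_fixed_format(s, len) &&
	    (s[2] & 0x0f) == SENSE_KEY_UNIT_ATTENTION;
}

/* True if the Unit Attention says the device was powered on or reset. */
bool
sense_reports_power_on_or_reset(const uint8_t *s, size_t len)
{
	return sense_is_unit_attention(s, len) &&
	    s[12] == ASC_POWER_ON_OR_RESET;
}
```

A driver would only get to run a check like this on the next command it
sends, which is exactly the window the question above is about.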

There has been quite a bit of debate on how UAs should be handled.  The
original CAM driver was *very* conservative: it returned all pending I/O
with EIO, marked the pack invalid, and refused to take any I/O until the
device cycled through final close.  This ensures that if the device or
pack is replaced, you don't clobber different media and, even if the
media is the same, you don't make the problem worse by attempting to
continue after some transactions were irrevocably lost.  This was
considered too disruptive, and since a permanent solution could not be
developed for 3.0R, the UA code was disabled.  The correct solution likely
requires better communication to the FS or user layer so that pack
validation of some sort can occur.
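
To make the conservative policy concrete, here is a rough user-space
sketch of it.  This is not the CAM code; the struct, the function names,
and the fixed-size queue are all invented for illustration.  It also
assumes that opens still succeed while the pack is invalid and that only
the final close clears the flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8
#define IO_IN_FLIGHT (-1)

/* Hypothetical per-disk state; not the real driver softc. */
struct disk {
	bool pack_invalid;         /* set once a Unit Attention is seen */
	int  open_count;           /* number of current opens */
	int  pending[MAX_PENDING]; /* IO_IN_FLIGHT or an errno result */
	int  npending;             /* number of slots used in pending[] */
};

void
disk_init(struct disk *d)
{
	d->pack_invalid = false;
	d->open_count = 0;
	d->npending = 0;
}

/* Assumption: opening is always allowed; only I/O is refused. */
void
disk_open(struct disk *d)
{
	d->open_count++;
}

/* Queue one I/O.  Returns 0, EIO (pack invalid) or EBUSY (queue full). */
int
disk_submit(struct disk *d)
{
	if (d->pack_invalid)
		return EIO;
	if (d->npending == MAX_PENDING)
		return EBUSY;
	d->pending[d->npending++] = IO_IN_FLIGHT;
	return 0;
}

/*
 * Unit Attention seen: fail everything still in flight with EIO and
 * mark the pack invalid.  Returns the number of I/Os that were failed.
 */
int
disk_unit_attention(struct disk *d)
{
	int failed = 0;

	for (int i = 0; i < d->npending; i++) {
		if (d->pending[i] == IO_IN_FLIGHT) {
			d->pending[i] = EIO;
			failed++;
		}
	}
	d->pack_invalid = true;
	return failed;
}

/* Only the final close clears the invalid flag and the queue. */
void
disk_close(struct disk *d)
{
	assert(d->open_count > 0);
	if (--d->open_count == 0) {
		d->pack_invalid = false;
		d->npending = 0;
	}
}
```

With two openers, the device stays unusable until both have closed it,
which gives a feel for why this was considered too disruptive.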

--
Justin

