From owner-freebsd-scsi@FreeBSD.ORG  Tue Jun  3 10:35:04 2003
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id ED0E237B401
	for <freebsd-scsi@freebsd.org>; Tue,  3 Jun 2003 10:35:03 -0700 (PDT)
Received: from matou.sibbald.com (matou.sibbald.com [195.202.201.48])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 32B0D43F75
	for <freebsd-scsi@freebsd.org>; Tue,  3 Jun 2003 10:35:02 -0700 (PDT)
	(envelope-from kern@sibbald.com)
Received: from [192.168.68.112] (rufus [192.168.68.112])
	by matou.sibbald.com (8.11.6/8.11.6) with ESMTP id h53HYSv10479;
	Tue, 3 Jun 2003 19:34:28 +0200
From: Kern Sibbald <kern@sibbald.com>
To: mjacob@feral.com
In-Reply-To: <20030603084701.U24586@wonky.in0.lcl>
References: <3EDB31AB.16420.C8964B7D@localhost>
	<3EDB59A4.27599.C93270FB@localhost> <20030602110836.H71034@beppo>
	<20030602131225.F71034@beppo>	 <1054645616.13630.161.camel@rufus>
	<3490610000.1054651919@aslan.scsiguy.com>
	<20030603084701.U24586@wonky.in0.lcl>
Content-Type: text/plain
Organization: 
Message-Id: <1054661668.13606.292.camel@rufus>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.4 
Date: 03 Jun 2003 19:34:28 +0200
Content-Transfer-Encoding: 7bit
cc: freebsd-scsi@freebsd.org
Subject: Re: SCSI tape data loss
X-List-Received-Date: Tue, 03 Jun 2003 17:35:04 -0000

A lot more things now make sense, because other FreeBSD
users have reported "unbelievable" output from a simple
test program of mine. I'll respond below, but with the
latest test results, the problem seems to be triggered
by this simple sequence:

  write()
  ...
  write()
  ioctl(MTEOF)
  ioctl(MTEOF)
  ioctl(MTREW)

Is there any reason why writing two end-of-file marks
followed by a rewind after a series of writes should
cause data loss?
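
For anyone who wants to reproduce this, here is a minimal sketch of the
sequence above. It assumes that the MTEOF shorthand corresponds to the
MTWEOF operation issued through MTIOCTOP from <sys/mtio.h>. The function
name, the status codes, and the fixed block size are hypothetical and
exist only for this illustration; this is not the test program itself.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes, used only by this sketch. */
enum seq_status {
    SEQ_OK = 0,
    SEQ_WRITE_FAILED,   /* write(2) returned -1, 0, or a short count */
    SEQ_WEOF_FAILED,    /* one of the two MTWEOF operations failed */
    SEQ_REWIND_FAILED   /* MTREW failed */
};

/* Issue a single magnetic tape operation through MTIOCTOP. */
static int tape_op(int fd, short op, int count)
{
    struct mtop mt;

    mt.mt_op = op;
    mt.mt_count = count;
    return ioctl(fd, MTIOCTOP, &mt);
}

/*
 * Write nblocks copies of buf, then two filemarks, then rewind.
 * Stops at the first step that does not behave as requested.
 */
enum seq_status write_then_mark_and_rewind(int fd, const char *buf,
                                           size_t blksz, int nblocks)
{
    assert(buf != NULL && blksz > 0);

    for (int i = 0; i < nblocks; i++) {
        ssize_t n = write(fd, buf, blksz);
        if (n < 0 || (size_t)n != blksz)
            return SEQ_WRITE_FAILED;
    }
    if (tape_op(fd, MTWEOF, 1) < 0)     /* first end-of-file mark */
        return SEQ_WEOF_FAILED;
    if (tape_op(fd, MTWEOF, 1) < 0)     /* second end-of-file mark */
        return SEQ_WEOF_FAILED;
    if (tape_op(fd, MTREW, 1) < 0)      /* rewind */
        return SEQ_REWIND_FAILED;
    return SEQ_OK;
}
```

The caller is expected to open the tape device itself and pass in the
descriptor. On a descriptor that is not a tape, the writes may succeed
but the first MTWEOF fails.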

On Tue, 2003-06-03 at 18:03, Matthew Jacob wrote:
> >
> > This is exactly what it does. *Every* time the requested write
> > size does not agree with the returned value, Bacula gives
> > up on the tape.  My last email has the code that does that.
> >
> > My email above was not very clear because I was telling you what
> > happened in the particular case of loss of data (the -1 and errno=0
> > or errno=ENOSPC I don't know which). As noted here, Bacula *will*
> > stop writing if the driver returns a short block (assuming my
> > code isn't broken), but I have never seen that case on FreeBSD.
> 
> That's really weird. I have to look at this closer. I've had some
> drives not report LEOT at all, but since tape_pattern_tester didn't
> complain on the same drive you were using, I know tape_pattern_tester is
> in fact stopping at LEOT.

I'm now sure this is not related to LEOT, so please don't waste your
time on that.

> 
> write(2) isn't necessarily returning -1. It may be returning 0, which
> means that no data moved.

In this case I am sure a -1 was returned because I print a different
error message for each case.
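
To make the distinction concrete, here is a minimal sketch of how the
four possible outcomes of write(2) can be told apart: an error (-1), no
data moved (0), a short block, and a full block. The enum and the
function name are hypothetical and are not taken from the Bacula source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical classification of a single write(2) call. */
enum write_result { WR_FULL, WR_SHORT, WR_ZERO, WR_ERROR };

/*
 * Write one block and classify the return value.
 * On WR_ERROR, *saved_errno holds errno from the failed call;
 * in every other case it is set to 0.
 */
enum write_result write_block(int fd, const void *buf, size_t len,
                              int *saved_errno)
{
    ssize_t n;

    assert(saved_errno != NULL);

    errno = 0;
    n = write(fd, buf, len);
    if (n < 0) {
        *saved_errno = errno;   /* capture before any other call */
        return WR_ERROR;
    }
    *saved_errno = 0;
    if (n == 0)
        return WR_ZERO;         /* no data moved */
    if ((size_t)n < len)
        return WR_SHORT;        /* fewer bytes than requested */
    return WR_FULL;
}
```

Capturing errno immediately after the failing call matters, because a
later library call (for example, one made while printing the error
message) may overwrite it.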

> 
> I think the ENOSPC as you report is a red herring because you're setting
> this value, unless you actually *did* see -1 returned from write(2) and
> ENOSPC set in errno.

I don't think this is the case because of what I say above.

> 
> In any case, even if you hit PEOT instead of LEOT, you shouldn't *lose*
> data. If you hit PEOT, we have to return -1/ENOSPC. Because this is Unix
> or Linux or Solaris instead of a reasonable and modern OS, like RSX, VMS
> or NT, which allow you to give realistic details to failures in I/O
> requests, this means you have no way of telling the user application how
> much was *actually* written when you hit *PEOT* (not LEOT, note!). As
> far as the user application is concerned, *no* data was written at all
> for this last write.

I've been screaming and tearing my hair out for the last 3 years for
exactly the reasons you write, so I am pleased that someone else feels
the same about it.

> 
> But there may in fact be data on the tape media. 

If it is there, it is hiding because we read the tape back and
printed all the block numbers. That way, we "verified" that the 
blocks were really missing.
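
The check described here can be sketched roughly as follows. It assumes
the block numbers were collected into an array in the order they were
read back, and that blocks are numbered consecutively in ascending
order; both are assumptions made for illustration, and the function
name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the block numbers missing from a listing read back from tape.
 * If any are missing and first_missing is not NULL, the lowest missing
 * number is stored there. Repeated numbers are not counted as gaps.
 */
size_t count_missing_blocks(const uint32_t *seen, size_t n,
                            uint32_t *first_missing)
{
    size_t missing = 0;

    assert(seen != NULL || n == 0);

    for (size_t i = 1; i < n; i++) {
        if (seen[i] > seen[i - 1] + 1) {
            if (missing == 0 && first_missing != NULL)
                *first_missing = seen[i - 1] + 1;
            missing += seen[i] - seen[i - 1] - 1;
        }
    }
    return missing;
}
```

This only finds gaps inside the listing. Blocks lost at the very end of
a tape show up as a gap only once the listing of the next tape is
appended.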

> What is particularly
> annoying in the PEOT case is that your application probably asked for
> the next tape and rewrote all the blocks from the failed write. 

I don't think so because we would have clearly seen this in the 
listing we did.  The dump of the two tapes was done separately
reading only one at a time -- no possibility of getting confused.

> This is
> fine, but you have to make damned sure then on rereading the data later
> that you can handle duplicate blocks because you may read blocks NOPQR
> on tapeA and then switch to tapeB and read blocks OPQR again on tapeB.

I can handle duplicate blocks because I put the block number in each
block -- however, I have not programmed it because I have so many
things to do and I have never run into any duplicate blocks -- one
reason is that Bacula always stops writing if the write count is
not correct.

> 
> I don't think this is your problem here, but I thought I'd have a
> pre-coffee diatribe about it. Grump.
> 
> 
> >
> > > Ignoring the short write and waiting until you hit ENOSPC guarantees
> > > you will hit PEOM, since the LEOM is only reported once.  The tape
> > > driver expects that you know what you are doing if you go on writing.
> >
> > The only additional writing Bacula does (unless I am missing something)
> > is the two EOF marks.
> 
> This is one of the things that's bothering me. You shouldn't be writing
> extra marks if you actually close the device. I'd like to look over all
> the current Bacula source, but sourceforge is offline at the moment.
> 
> 
> -matt