From owner-freebsd-scsi@FreeBSD.ORG Tue Jul 1 23:11:26 2003
Date: Tue, 1 Jul 2003 23:11:23 -0700 (PDT)
From: Matthew Jacob
To: Dan Langille
cc: freebsd-scsi@freebsd.org
Subject: Re: Differences between Solaris/Linux and FreeBSD
Message-ID: <20030701230949.G48388@beppo>
In-Reply-To: <3F01EA0C.420.5FD9390A@localhost>
References: <1054550725.1582.1859.camel@rufus> <3F01EA0C.420.5FD9390A@localhost>
Reply-To: mjacob@feral.com
List-Id: SCSI subsystem

the last I heard this was in the bacula court- I've been away, and will
be away again shortly for the fourth of july. there's a compat issue for
fixing stuff mentioned below.

what do you think is broken at this point?

On Tue, 1 Jul 2003, Dan Langille wrote:

> As a bacula fan I'd like to see it working on FreeBSD but I don't
> know what I can do in order to achieve that objective. Any ideas?
>
> On 3 Jun 2003 at 11:39, Matthew Jacob wrote:
>
> > > As promised, in this email, I will try my best to describe
> > > the differences I found between Solaris/Linux and FreeBSD
> > > concerning tape handling.
> > > There were five separate areas
> > > where I noticed differences:
> > >
> > > 1. On Solaris/Linux, the default behavior for ioctl(MTEOM)
> > >    is to run in what they call slow mode. In this mode, the
> > >    tape is positioned to the end of the data, and the driver
> > >    returns the correct file number in the MTIOCGET packet.
> > >    It is possible to enable fast-EOM, but no one uses it to
> > >    my knowledge.
> > >
> > >    On FreeBSD, you apparently always use the fast-EOM so that
> > >    the tape position is unknown after the ioctl().
> >
> > You *could* read block position. Particularly for h/w blocks this works
> > very fast when you need to locate.
> >
> > NB: SCSI-3 changed the layout for h/w block position stuff and I haven't
> > updated the FreeBSD driver to handle this yet.
> >
> > >    Bacula always knows how many files are on a tape, and when
> > >    appending to a tape that is already written and newly opened,
> > >    it MUST know where it is on the tape. As a consequence, on
> > >    FreeBSD, I must explicitly use MTFSF with read()s in between
> > >    to position to the end of the tape -- a fairly slow affair.
> >
> > Uh, this is how 'slow' EOM works. It's not really faster to do it in the
> > kernel as opposed to in the driver.
> >
> > I must point out that you cannot, and should not, depend absolutely on
> > reported position. For tape you can ensure BOT or end of recorded media,
> > but otherwise you really must use self-referential data on the tape if
> > tape location is important.
> >
> > > 2. Your handling of EOM differs from Solaris/Linux. On both of
> > >    those systems, when Bacula reads the first EOF, the driver
> > >    returns 0 bytes read. On reading the second EOF, the driver
> > >    returns 0 bytes read, but before returning backspaces over
> > >    the EOF, leaving you positioned correctly for appending to the
> > >    tape and having told you you are at the end of the tape by
> > >    giving two consecutive 0 byte reads.
> > >    Any further read() requests return an I/O error.
> > >
> > >    On FreeBSD, reading the first EOF returns 0 bytes, reading
> > >    the second EOF also returns 0 bytes (sometimes, I apparently
> > >    get "Illegal operation"). However, the tape is left positioned
> > >    after the second EOF, so appending from that point effectively
> > >    "loses" the data.
> > >
> > >    To handle this correctly the FreeBSD user must add a configuration
> > >    statement to Bacula telling him to backspace file at EOM.
> >
> > Yes. This is a problem.
> >
> > But part of the problem here is that dual-filemark at EOM is only one
> > tape convention- and a poorly thought out one at best- it exists
> > *solely* because a *few* (ancient) tape drives would unwind off the feed
> > reel if you kept advancing them. For QIC drives, you *cannot* write dual
> > filemarks (really).
> >
> > Note that there is a setting that can change the model to single EOM. If
> > I could have gotten away with it, I would have made this the default.
> >
> > I think, though, I'd accept that the FreeBSD behaviour is a bug that
> > should be fixed. If we have a dual fmk EOT model and are advancing along
> > and hit two in a row, we *probably* should say we're at logical EOT and
> > backspace over one of them. After all, this is what we do when we're
> > *writing* to tape and close the no-rewind device.
> >
> > I also would agree that this situation is exacerbated by the 'space to
> > end of recorded data' model for the MTEOM command. This now leaves us
> > with a legacy of tapes with spurious dual filemarks in the middle.
> >
> > Oops. This means that I really can't fix things the way you'd like :-(.
> >
> > >
> > > 3. I have previously described this but will do so again for
> > >    completeness here. On Solaris/Linux when Bacula does:
> > >
> > >       write();
> > >       ioctl(MTEOF);
> > >       ioctl(MTEOF);
> > >       ioctl(MTBSF);
> > >       ioctl(MTBSF);
> > >       ioctl(MTBSR);
> > >       read();
> > >
> > >    the read() re-reads the last write.
> > >    On FreeBSD, the read returns
> > >    0 bytes (there is also a problem of freezing the tape wrapped into
> > >    this example if I am not mistaken). Apparently the 0 bytes read is
> > >    because FreeBSD adds an additional EOF mark (not necessary) and
> > >    leaves the drive positioned *after* the mark thus re-reading the
> > >    last record fails when it logically should not.
> >
> > I don't believe that FreeBSD adds an additional filemark here, but I
> > should add this as a test case. I have another tester program that I use
> > for testing block locate, but I haven't really validated it or finished
> > it yet.
> >
> > Why, btw, are you issuing two MTEOFs? The mtop has a count field y'know
> > :-).
> >
> > >
> > > 4. Tape freezing: On Solaris/Linux, the tape never "freezes". On
> > >    FreeBSD it does freeze. As best I can determine, you freeze the
> > >    drive when you lose track of where you are. Typically, this
> > >    occurs when I do a MTBSR to re-read the last record. On Solaris/Linux
> > >    the tape is never frozen, but when they don't know the position,
> > >    they simply return -s in the MTIOCGET packet, which is fine with
> > >    me because Bacula only uses that info when initially reading a
> > >    tape to append to it.
> > >
> > >    Freezing the tape causes all sorts of problems because it generates
> > >    a flood of unexpected errors. Within a large complicated program like
> > >    Bacula, when a low level routine re-reads a record during writing and
> > >    the tape freezes, it cannot simply rewind the drive as this could
> > >    cause chaos and possible overwriting of the beginning of the drive.
> > >
> > >    I've attempted to overcome tape freezing by providing the user a
> > >    means to turn off MTBSR (but they don't always do so), and by issuing
> > >    ioctl(MTIOCERRSTAT) after every return of -1 from any I/O request.
> > >
> > >    I recommend that you do away with freezing the drive -- it seems to
> > >    me that it only causes more problems.
> > >    In saying that I have to say
> > >    that I really do not understand tape freezing or why you do it since
> > >    I found no documentation on it, and everything I write above I have
> > >    deduced from what Dan has reported back to me.
> >
> > Freezing the drive is precisely what Solaris and Linux *should* do. If
> > you've lost position, you have to take some action to bring the tape to
> > a known position. The unaware application should not be allowed to
> > overwrite in random spots on the tape. If your low level read/write
> > routines get any kind of error, you have to move to a "what do I have in
> > my tape drive now?" state anyway.
> >
> > You know, I was pretty sure I'd documented the freeze option, but I
> > cannot find it in the man page (sa(4)) now at all.
> >
> >
> > >
> > > 5. I am quite fuzzy on this point because I forget exactly what happened
> > >    and what I did about it.
> > >
> > >    It seems to me that on Linux, if I read a block but specify a number
> > >    of bytes less than the number actually in the block on the tape, the
> > >    driver returns the data anyway. I then check if the block is
> > >    internally complete and if not, increase my record size to the size
> > >    indicated in the data received, backspace one record, and re-read it.
> > >
> > >    If I am not mistaken, on FreeBSD, the first read returns an error,
> > >    and Bacula just immediately gives up. Your documentation specifies
> > >    that one can never read a partial record from a tape, but it does not
> > >    specify what error code is generated. As a consequence, rather than
> > >    recovering and re-reading the record, Bacula has to assume it was
> > >    a fatal error.
> >
> > The reason linux 'succeeds' here is because linux internally reads all
> > tape data to an oversized buffer in kernel memory anyway. This means
> > that it doesn't suffer an 'overrun' condition which is what you are
> > doing if you attempt to read *less* than a tape record size.
> > Solaris will fail the same way, btw, as FreeBSD.
> >
> > What you should always do is start out by reading the largest possible
> > record size (a pathetic 64KB for FreeBSD) and adjust *downward* (if
> > desired and you are just autosizing to find a tape record size).
> >
> >
> > Thanks for doing the critique. There's definitely food for thought here
> > and some changes that *should* be made.
>
> --
> Dan Langille : http://www.langille.org/
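
The advice above about starting with the largest possible read and
adjusting downward could be sketched in C roughly as follows. This is
only an illustration, not code from the sa(4) driver or from Bacula:
probe_record_size and MAX_RECORD_SIZE are hypothetical names, and the
64KB figure is simply the FreeBSD limit quoted in the message. It
assumes a variable-block tape device, where a single read() returns
exactly one record, so the byte count returned is that record's size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Largest single read to attempt; 64KB is the FreeBSD limit quoted above. */
#define MAX_RECORD_SIZE (64 * 1024)

/*
 * Hypothetical helper: issue one read() with the caller's (largest) buffer
 * and report how many bytes came back. On a variable-block tape device one
 * read() is assumed to return one record, so the count is the record size.
 * Returns the size, 0 at a filemark (EOF), or -1 with errno set.
 */
static ssize_t
probe_record_size(int fd, void *buf, size_t buflen)
{
	ssize_t n;

	assert(buf != NULL && buflen > 0);
	do {
		n = read(fd, buf, buflen);
	} while (n < 0 && errno == EINTR);	/* retry only if interrupted */
	return n;
}
```

A caller would pass a buffer of MAX_RECORD_SIZE bytes, treat a return
of 0 as a filemark, and could then size later reads downward to the
value reported, never below it, since asking for less than a record is
the overrun case described above.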