Date:      Tue, 28 Nov 95 23:19 WET
From:      uhclem%nemesis@fw.ast.com
To:        FreeBSD-gnats-submit@freebsd.org
Subject:   bin/850: dump treats write-protect as an EOT & spoils set FDIV042
Message-ID:  <m0tKevg-000CU4C@nemesis.lonestar.org>
Resent-Message-ID: <199511290540.VAA12934@freefall.freebsd.org>

>Number:         850
>Category:       bin
>Synopsis:       dump treats write-protect as an EOT & spoils set FDIV042
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 28 21:40:02 PST 1995
>Last-Modified:
>Originator:     Frank Durda IV
>Organization:
None
>Release:        FreeBSD 2.1-STABLE i386
>Environment:

FreeBSD 2.1.0-RELEASE (or STABLE?)
SCSI Viper 150 Tape Drive

>Description:

When doing a multi-volume level zero dump, the second tape was
inserted with the write-protect pin set (aka SAFE).  When
dump was told the next volume was mounted and ready, the
driver reported the WRITE PROTECT error to the root console, but dump
simply reported
"Dump: Volume 2 begins with blocks from inode 21341
 Dump: End of Tape detected
 Dump: Change Volumes: Mount Volume #3"

The operator didn't realize what happened, and inserted a third tape.
(I suspect that he walked away immediately after responding to the second
 volume and came back later and found the system wanting the next tape.)
The backup should have only required two tapes but three were "used".
The dump appeared successful.

The hard drive was then reformatted and the level 0 dump was used to restore
it, as in "restore -rvf".  After reading tape Volume 1, restore asks for
Volume 2, which is inserted, and restore complains that this is not
the correct tape.  Volume 3 is inserted and we are told that it is not
Volume 2.  We are now trapped.


Back in the old days on the VAX-11/780, we used to forget to put
the write rings in all the time on the 6250 tapes, and BSD 4.3 dump NEVER
pulled this stunt.  It always said that something was wrong (like the open
failed) and "Do you want to retry the open?" and gave you a chance to pull
the tape, write-enable it and keep going, even if it was the first or
nth tape of the set.

Clearly something has changed in dump, or the write-protect error is now
reported back to dump differently, and that "want to retry" code
isn't being run.
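
For illustration, the retry behavior described above could look roughly
like the following.  This is only a minimal sketch, not the actual 4.3BSD
or FreeBSD dump source: the names open_tape_with_retry and ask_fn are made
up for the example, and it assumes the driver fails the open() on a
write-protected tape (some drivers may only report it on the first write).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if the operator wants to retry. */
typedef int (*ask_fn)(const char *question);

/*
 * Try to open the tape device for writing.  On failure, report the reason
 * and let the operator decide whether to fix the tape and try again,
 * instead of silently moving on to the next volume.
 * Returns a file descriptor, or -1 if the operator gives up.
 */
int open_tape_with_retry(const char *path, ask_fn ask)
{
    for (;;) {
        int fd = open(path, O_WRONLY);
        if (fd >= 0)
            return fd;

        fprintf(stderr, "cannot open %s for writing: %s\n",
                path, strerror(errno));

        /* No callback, or the operator said no: stop retrying. */
        if (ask == NULL || !ask("Do you want to retry the open?"))
            return -1;
    }
}
```

The point of the sketch is only that the failure is shown to the operator
at the moment it happens, on whichever tape of the set it happens.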


>How-To-Repeat:

Use the above procedure after making a good dump set first and hiding
it away somewhere.


>Fix:
	
Absolutely! (he says with bloodshot eyes as he tries to bail the site
out using older tapes and they keep saying this never happened with 
the Brand S system)

- - - - - - - - - - - - - -

Because the way this happened reminded me, I'll make my pitch again for
adding two new system-wide error returns specifically to deal with issues
related to removable media.  (The last time I mentioned it in "arch"
nobody even bothered to flame the idea, wave the NIH banner, etc.)

One new error number should be used for reporting write protect errors,
such as tapes or removable discs that are protected (or media that is
read-only).  We can't sit back and simply reject requests in the write
call in the drivers anymore with new hardware appearing, such as a SCSI
drive that supports both MO media and CD-ROMs.  Or CD-burners that can be
written to but only in certain areas.

Right now, most drivers appear to return something akin to an I/O Error,
leaving the user wondering if the media is flawed, the wrong type or not
formatted, or the wrong filesystem type, or perhaps write-protected?


A second new error number should be used to report that the media is
offline or that the drive is not ready.  In other words, this error
code would allow the user/application to tell that "yes, the driver is
installed, and the device being accessed is configured and all that stuff,
but the problem is that the tape or disc isn't loaded or up to
speed at this second."


In the first new error number case, we currently spray a write protect
message on the console, but only root gets to see that, and we want people
not to run as root, so the mortal user gets little or no information on
what went wrong on a write protect error.  Just a flimsy I/O error or
some other code even less descriptive.

In the second new error number case, dividing non-existent/not-configured
type errors from not-ready errors will really improve the
"user-friendliness" of the applications.  The "not configured" case is not
recoverable; being offline or not ready is a recoverable condition and
the user should be given the chance to recover.
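
To make the idea concrete, here is a rough sketch of how an application
might act on the two proposed codes.  Everything named HYP_* or media_*
is hypothetical and exists only for this example: no such error numbers
are defined by the system, and the numeric values are arbitrary
placeholders.  ENXIO and EIO are the existing errno values.

```c
#include <errno.h>

/* Hypothetical error numbers, for illustration only. */
#define HYP_EWRPROTECT 10001   /* media is write-protected or read-only */
#define HYP_ENOTREADY  10002   /* media offline or drive not ready */

enum media_action {
    MEDIA_FATAL,   /* nothing the user can do at the drive */
    MEDIA_RETRY    /* user can fix the condition and try again */
};

/*
 * Map an error number to what the application should do about it,
 * and hand back a message a non-root user can actually see.
 */
enum media_action media_classify(int err, const char **advice)
{
    switch (err) {
    case HYP_EWRPROTECT:
        *advice = "media is write-protected; write-enable it and retry";
        return MEDIA_RETRY;
    case HYP_ENOTREADY:
        *advice = "drive not ready; load the media and retry";
        return MEDIA_RETRY;
    case ENXIO:
        *advice = "device not configured";
        return MEDIA_FATAL;
    default:
        /* Today's catch-all, e.g. EIO: the user cannot tell what failed. */
        *advice = "I/O error";
        return MEDIA_FATAL;
    }
}
```

With something like this, a program such as dump could offer a retry for
the first two cases and give up cleanly on the others, rather than lumping
them all together.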

Windows and even MS-DOS make these distinctions in their error reporting,
and we should too.  (Geez, even TRSDOS did.)

Yes, the block I/O drivers would have to be updated (I'll be happy to do
them all!) to take advantage of the new error returns, but I still think we
should seriously think about having them.

*END*

>Audit-Trail:
>Unformatted:


