Date: Wed, 15 Feb 1995 12:51:46 -0500 (EST)
From: Mark Hittinger <bugs@warlock.win.net>
To: freebsd-hackers@FreeBSD.org
Subject: long DAT tape rewind bit sprays disk
Message-ID: <199502151751.MAA10044@warlock.win.net>
Hi -

I have been having a problem with the close/rewind function of my SCSI DAT
tape drive. It appears that the rewind is not being given enough time to
complete before the driver decides an abort condition is in order. This is
one of those DDS-2 16 gig jobber tapes, so the rewinds take a while. This is
happening with all SNAPS up through the latest 2/10 snap. I have the BT946C
controller.

If I get two "abort timeouts" in a row while attempting to rewind/unload the
tape, my active disks get bit sprayed. :-)

While chasing this down I've noticed a couple of things in the bt742a.c
driver that I wanted to ask about. In routine bt_poll we call bt_timeout and
then call untimeout. I note that inside bt_timeout we have already called
untimeout. This dual call to untimeout looks suspicious to me.

---------------------------------------------------------------------
bt_poll
....
        if (count == 0) {
                /*
                 * We timed out, so call the timeout handler manually,
                 * accounting for the fact that the clock is not running yet
                 * by taking out the clock queue entry it makes.
                 */
                bt_timeout(ccb);

                /*
                 * because we are polling, take out the timeout entry
                 * bt_timeout made
                 */
                untimeout(bt_timeout, (caddr_t)ccb);
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ actually call #2
----------------------------------------------------------------------
bt_timeout()
....
        /*
         * A timeout routine in kernel DONOT unlink
         * Entry chains when time outed....So infinity Loop..
         *                              94/04/20 amurai@spec.co.jp
         */
        untimeout(bt_timeout, (caddr_t)ccb);
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ actually call #1
---------------------------------------------------------------------

Finally, further down in bt_timeout, this code looks interesting:

....
                /* abort the operation that has timed out */
                printf("bt%d: Try to abort\n", unit);
                bt_send_mbo(unit, ~SCSI_NOMASK, BT_MBO_ABORT, ccb);
                /* 2 secs for the abort */
                ccb->flags = CCB_ABORTED;
                timeout(bt_timeout, (caddr_t)ccb, 2 * hz);
                                                  ^^^^^^ (200 not 2000?)
        }

What I am doing now is mass NFS mounting every disk in the place on my
FreeBSD box, then tar'ing everything to SCSI DDS-2 using device /dev/nrst0.
No rewind/unload attempt is made when things are complete. I can then shut
down into single user mode, sync, and halt. If I attempt to rewind or unload
the tape, about 50% of the time the system disk gets bit sprayed, so I no
longer try :-).

I am playing with longer timeouts etc., but it does appear that two abort
timeouts in a row cause some corruption of the ccbs.

Having fun!

Mark Hittinger
bugs@win.net
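For anyone following the untimeout question above, here is a minimal user-space sketch in C. It is not the kernel's code. It assumes that untimeout() removes at most one matching callout entry and does nothing when none is found, that the callout table is empty when the polled command times out, and that hz is 100 ticks per second. All of these are assumptions, and every model_* name is made up for illustration. Under those assumptions, call #1 inside bt_timeout finds nothing to remove, bt_timeout then re-arms itself with 2 * hz ticks (200 ticks, i.e. 2 seconds), and call #2 in bt_poll takes out that freshly made entry, which is what the comment in bt_poll describes.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_HZ     100   /* assumed clock rate in ticks per second */
#define MAX_CALLOUTS 8     /* arbitrary size for this toy table */

typedef void (*callout_fn)(void *);

struct callout {
    callout_fn fn;
    void *arg;
    int ticks;
    int in_use;
};

struct callout callouts[MAX_CALLOUTS];

/* Convert whole seconds to clock ticks: 2 seconds at hz = 100 is 200 ticks. */
int seconds_to_ticks(int seconds, int hz)
{
    assert(seconds >= 0 && hz > 0);
    return seconds * hz;
}

/* Toy timeout(): record fn(arg) as due after `ticks`. Returns 0, or -1 if full. */
int model_timeout(callout_fn fn, void *arg, int ticks)
{
    for (size_t i = 0; i < MAX_CALLOUTS; i++) {
        if (!callouts[i].in_use) {
            callouts[i].fn = fn;
            callouts[i].arg = arg;
            callouts[i].ticks = ticks;
            callouts[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Toy untimeout(): drop the first matching entry. Returns 1 if one was removed,
 * 0 if nothing matched (assumed to be a harmless no-op). */
int model_untimeout(callout_fn fn, void *arg)
{
    for (size_t i = 0; i < MAX_CALLOUTS; i++) {
        if (callouts[i].in_use && callouts[i].fn == fn && callouts[i].arg == arg) {
            callouts[i].in_use = 0;
            return 1;
        }
    }
    return 0;
}

/* Number of entries still scheduled. */
int pending_callouts(void)
{
    int n = 0;
    for (size_t i = 0; i < MAX_CALLOUTS; i++)
        n += callouts[i].in_use;
    return n;
}

/* Stand-in for bt_timeout: unlink our own entry, then re-arm for the abort. */
void model_bt_timeout(void *arg)
{
    model_untimeout(model_bt_timeout, arg);                 /* call #1 */
    model_timeout(model_bt_timeout, arg,
                  seconds_to_ticks(2, MODEL_HZ));           /* 2 * hz */
}

/* Stand-in for the timed-out path of bt_poll. Returns what call #2 removed. */
int model_poll_timed_out(void *arg)
{
    model_bt_timeout(arg);
    return model_untimeout(model_bt_timeout, arg);          /* call #2 */
}
```

In this model, calling model_poll_timed_out(&ccb) on an empty table returns 1 and leaves pending_callouts() at 0, so the second untimeout is the one doing real work. Whether the real driver's corruption comes from this path or from somewhere else is not something the sketch can show.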