From owner-freebsd-hackers  Thu Feb 16 08:49:27 1995
Return-Path: hackers-owner
Received: (from root@localhost) by freefall.cdrom.com (8.6.9/8.6.6) id IAA00733 for hackers-outgoing; Thu, 16 Feb 1995 08:49:27 -0800
Received: from warlock.win.net (warlock.win.net [198.30.130.3]) by freefall.cdrom.com (8.6.9/8.6.6) with ESMTP id IAA00727 for <freebsd-hackers@FreeBSD.ORG>; Thu, 16 Feb 1995 08:49:24 -0800
Received: (from bugs@localhost) by warlock.win.net (8.6.9/8.6.9) id LAA00531 for freebsd-hackers@FreeBSD.ORG; Thu, 16 Feb 1995 11:50:00 -0500
From: Mark Hittinger <bugs@warlock.win.net>
Message-Id: <199502161650.LAA00531@warlock.win.net>
Subject: re: long DAT tape rewinds bit spray your disks
To: freebsd-hackers@FreeBSD.org
Date: Thu, 16 Feb 1995 11:50:00 -0500 (EST)
X-Mailer: ELM [version 2.4 PL23]
Content-Type: text
Content-Length: 1997      
Sender: hackers-owner@FreeBSD.org
Precedence: bulk


Jordan, I had no problems with DAT tape rewind this morning.  Below are diffs of
what I changed in my kernel.  Everything behaved as it should have, and no
disks got bit sprayed!

These adjustments are against the 2-10 snapshot sources.  I have just increased
some of the timeouts and touched up a few possibly cosmetic things.  

I tried to think of why a short rewind timeout would be useful in st.c but
couldn't come up with an important reason.  Can anybody tell me why this
5 second thing was there?

I still have a reboot problem where the scsi_test_unit_ready call will hang
about half the time if there is no media in the DAT drive.  I get the printf
with the density code (x13) and then it stops.  I am sure the timeout here is
really long, so describing it as a hang is probably misleading.  I think I
will play with some combination of a shorter timeout there with a reset and
re-sense mode attempt.
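
The "shorter timeout plus reset" idea could be sketched roughly as below.
This is only an illustration under stated assumptions: struct fake_drive,
fake_probe, fake_reset and probe_with_reset are hypothetical stand-ins made
up for this sketch. They are not functions from st.c or the SCSI layer, and
the real probe would be the scsi_test_unit_ready call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a drive: the probe only answers once the
 * drive has been reset `resets_needed` times. */
struct fake_drive {
    int resets_needed;
    int resets_done;
    int probes;
};

/* Stand-in for a "test unit ready" style probe with a short timeout.
 * The timeout is unused here because nothing really blocks. */
static bool fake_probe(struct fake_drive *d, unsigned timeout_ms)
{
    (void)timeout_ms;
    d->probes++;
    return d->resets_done >= d->resets_needed;
}

/* Stand-in for a device reset. */
static void fake_reset(struct fake_drive *d)
{
    d->resets_done++;
}

/* Probe with a short timeout; on failure reset the drive and try
 * again, up to max_tries probes in total. */
static bool probe_with_reset(struct fake_drive *d, unsigned timeout_ms,
                             int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (fake_probe(d, timeout_ms))
            return true;
        if (i + 1 < max_tries)      /* no reset after the last try */
            fake_reset(d);
    }
    return false;
}
```

With this shape the worst case is bounded by max_tries short timeouts
instead of one very long one.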

Remember it is the second "abort timeout" within BT742A that bit sprays
your disk.  I always survived the first one.  If the rewind completed before
the second "abort timeout" then everything was ok.  I did not get any
"abort timeout" messages this morning.  Previously they happened very quickly,
and I suspected the "int count" might be a short.
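
To show why a 16-bit counter would be a worry with these values, here is a
small sketch.  truncate_to_16 is a hypothetical helper written for this
illustration and is not from bt742a.c.  It uses an unsigned 16-bit type so
the result is well defined.  Where int is 32 bits wide, 300000 fits without
any loss, so this only illustrates the suspicion and does not confirm it.

```c
#include <stdint.h>

/* Hypothetical helper: the value a 16-bit unsigned counter would hold
 * after being assigned a millisecond timeout.  Conversion to an
 * unsigned type is defined as reduction modulo 65536. */
static uint16_t truncate_to_16(uint32_t timeout_ms)
{
    return (uint16_t)timeout_ms;
}
```

For the 5 minute value, truncate_to_16(300000) gives 37856 (300000 minus
4 * 65536), so such a counter would run out after about 38 seconds instead
of 300.  The old 5000 value fits in 16 bits unchanged.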

sys/scsi/st.c
-------------
1831c1831
< 		immed ? 5000 : 300000,	/* 5 sec or 5 min */
---
> 		300000,      /* msh 2/15/95 always use 5 min */

sys/i386/isa/bt742a.c
---------------------
1580c1580
< 	int	count = xs->timeout;
---
> 	u_int32	count = xs->timeout;   /* msh use uns-long 2/15/95 */
1611,1612c1611,1613
< 		untimeout(bt_timeout, (caddr_t)ccb);
< 		count = 2000;
---
> /*		untimeout(bt_timeout, (caddr_t)ccb); msh 2/15/95 done in
>                                                      bt_timeout */
> 		count = 5000;       /* msh 2/15/95 give it more time */
1690c1691
< 		timeout(bt_timeout, (caddr_t)ccb, 2 * hz);
---
> 		timeout(bt_timeout, (caddr_t)ccb, 300 * hz); /* msh more time */
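
The diffs mix two units: xs->timeout in st.c is in milliseconds (5000 or
300000), while timeout() takes clock ticks, with hz ticks per second.  The
sketch below shows how they line up.  ms_to_ticks is a hypothetical helper
for illustration, not a kernel function, and hz = 100 in the note that
follows is an assumed value.

```c
/* Hypothetical helper: convert milliseconds to clock ticks, rounding
 * up so that a short non-zero delay never becomes zero ticks. */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999) / 1000;
}
```

Assuming hz = 100, ms_to_ticks(300000, 100) is 30000 ticks, the same as
300 * hz, so the new bt_timeout interval matches the 5 minute value now used
in st.c.  The old 2 * hz was 200 ticks, i.e. 2 seconds.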


Fun,

Mark Hittinger
bugs@win.net