Date: Wed, 12 Mar 1997 12:50:01 -0800 (PST)
From: Stefan Esser <se@freebsd.org>
To: freebsd-bugs
Subject: Re: kern/2965: st0 hang/fail on reading 4mm DAT tape for larger files
Message-ID: <199703122050.MAA10436@freefall.freebsd.org>
The following reply was made to PR kern/2965; it has been noted by GNATS.

From: Stefan Esser <se@freebsd.org>
To: jin@iss-p1.lbl.gov
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/2965: st0 hang/fail on reading 4mm DAT tape for larger files
Date: Wed, 12 Mar 1997 21:42:40 +0100

On Mar 12, "Jin Guojun[ITG]" <jin@iss-p1.lbl.gov> wrote:

> >Synopsis: st0 hang/fail on reading 4mm DAT tape for larger files
> 2.2-SNAP(s) with HP C1536A SCSI 4mm DAT tape drive:
> (ncr0:4:0): "HP HP35480A T503" type 1 removable SCSI 2
> st0(ncr0:4:0): Sequential-Access
> st0(ncr0:4:0): 5.0 MB/s (200 ns, offset 8)
>
> >Description:
>
> tar -c some.files # writing OK
> tar -t
> or
> tar -xv
> will hang when looking/reading a file larger than about 6090 Bytes;
>
> tar: read error on /dev/nrst0 : Input/output error
>
> tar process is hanging at here and tape drive stopped.
>
> system errors:
> ncr0:4: ERROR (0:48) (1-21-1e) (88/13) @ (c2c:19000200).

The error code (0x48) signals a GROSS ERROR, which in most cases is
an over- or underflow on the SCSI bus while doing a synchronous
transfer. This means that a byte has been acknowledged either zero
or two times. This is a reported hardware error.

> script cmd = 89030000
> reg: da 10 80 13 47 88 04 1f 01 01 84 21 80 01 99 00.
> ncr0: have to clear fifos.
> ncr0: restart (fatal error).
> st0(ncr0:4:0): COMMAND FAILED (9 ff) @f2136c00.
> ncr0: timeout ccb=f2136c00 (skip)
>
> The tar process cannot be killed. The only solution is power
> cycle the tape drive.
>
> The same hardware worked with 2.1.{6-7} without any problem.
> So, it looks like software problem in the kernel somewhere.

Well, it looks like the error recovery fails, and there is actually
a difference between 2.1 and 2.2, but with 2.2 surviving a number of
scenarios where 2.1 hung. I know that the error recovery code needs
quite some work, but the primary cause of your problem is more likely
a SCSI bus problem, which only becomes visible through the failed
recovery procedure.

I'm using a HP1533A DDS-2 DAT drive for my backups and to exchange
data, and never had a single occurrence of a hang under any -current
(i.e. post-2.1) kernel.

Please make sure that there is no problem with your controller or
SCSI bus cable/terminators/terminator power ...

Regards, STefan
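
As an illustration of the (0x48) reading given in the reply above, here is a minimal sketch that decodes the first pair of values in the quoted "ERROR (0:48)" line. It is not taken from the driver source. It assumes the second value of the pair is the chip's SCSI interrupt status byte, and the bit names follow the NCR 53C8xx SIST0 register layout as commonly documented; the table and both function names are hypothetical and should be checked against the chip manual.

```python
import re

# ASSUMPTION: bit layout of the NCR 53C8xx SIST0 register as commonly
# documented; this table does not come from the message itself.
SIST0_BITS = {
    0x80: "M/A (phase mismatch)",
    0x40: "CMP (function complete)",
    0x20: "SEL (selected)",
    0x10: "RSL (reselected)",
    0x08: "SGE (SCSI gross error)",
    0x04: "UDC (unexpected disconnect)",
    0x02: "RST (SCSI reset received)",
    0x01: "PAR (parity error)",
}


def parse_error_pair(line):
    """Return the first "(x:y)" hex pair after "ERROR" as two integers.

    Hypothetical helper: it only matches the shape of the log line
    quoted in the message and returns None if that shape is absent.
    """
    match = re.search(r"ERROR \(([0-9a-fA-F]+):([0-9a-fA-F]+)\)", line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def decode_sist0(value):
    """List the names of all bits set in an assumed SIST0 status byte."""
    return [name for bit, name in SIST0_BITS.items() if value & bit]


if __name__ == "__main__":
    log_line = "ncr0:4: ERROR (0:48) (1-21-1e) (88/13) @ (c2c:19000200)."
    pair = parse_error_pair(log_line)
    if pair is not None:
        first, second = pair
        print("second value 0x%02x:" % second, decode_sist0(second))
```

Under these assumptions 0x48 has the gross-error bit (0x08) set, which agrees with the GROSS ERROR interpretation in the reply; the other set bit (0x40) would be a completion flag.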