From owner-freebsd-bugs Wed Mar 12 13:50:06 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id NAA13544 for bugs-outgoing; Wed, 12 Mar 1997 13:50:06 -0800 (PST)
Received: (from gnats@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id NAA13535; Wed, 12 Mar 1997 13:50:03 -0800 (PST)
Date: Wed, 12 Mar 1997 13:50:03 -0800 (PST)
Message-Id: <199703122150.NAA13535@freefall.freebsd.org>
To: freebsd-bugs
Cc:
From: "Jin Guojun[ITG]"
Subject: Re: kern/2965: st0 hang/fail on reading 4mm DAT tape for larger files
Reply-To: "Jin Guojun[ITG]"
Sender: owner-bugs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

The following reply was made to PR kern/2965; it has been noted by GNATS.

From: "Jin Guojun[ITG]"
To: jin@iss-p1.lbl.gov, se@freebsd.org
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/2965: st0 hang/fail on reading 4mm DAT tape for larger files
Date: Wed, 12 Mar 1997 13:43:55 -0800

} On Mar 12, "Jin Guojun[ITG]" wrote:
} > >Synopsis: st0 hang/fail on reading 4mm DAT tape for larger files
}
} > 2.2-SNAP(s) with HP C1536A SCSI 4mm DAT tape drive:
} > (ncr0:4:0): "HP HP35480A T503" type 1 removable SCSI 2
} > st0(ncr0:4:0): Sequential-Access
} > st0(ncr0:4:0): 5.0 MB/s (200 ns, offset 8)
} >
} > >Description:
} >
} > tar -c some.files # writing OK
} > tar -t
} > or
} > tar -xv
} > will hang when looking/reading a file larger than about 6090 Bytes;
} >
} > tar: read error on /dev/nrst0 : Input/output error
} >
} > tar process is hanging at here and tape drive stopped.
} >
} > system errors:
} > ncr0:4: ERROR (0:48) (1-21-1e) (88/13) @ (c2c:19000200).
}
} The error code (0x48) signals a GROSS ERROR, which in most
} cases is an over- or underflow on the SCSI bus while doing
} a synchronous transfer. This means, that either one byte
} has been acknowledged null or two times.
}
} This is a reported hardware error.
}
} > script cmd = 89030000
} > reg: da 10 80 13 47 88 04 1f 01 01 84 21 80 01 99 00.
} > ncr0: have to clear fifos.
} > ncr0: restart (fatal error).
} > st0(ncr0:4:0): COMMAND FAILED (9 ff) @f2136c00.
} > ncr0: timeout ccb=f2136c00 (skip)
} >
} > The tar process cannot be killed. The only solution is power
} > cycle the tape drive.
} >
} > The same hardware worked with 2.1.{6-7} without any problem.
} > So, it looks like software problem in the kernel somewhere.
}
} Well, it looks like the error recovery fails, and there
} is actually a difference between 2.1 and 2.2, but with
} 2.2 surviving a number of scenarios where 2.1 hung.
}
} I know that the error recovery code needs quite some work,
} but the primary cause of your problem is more likely
} related to a SCSI bus problem, which is only visible by
} the failed recovery procedure.
}
} I'm using a HP1533A DDS-2 DAT drive for my backups and
} to exchange data, and never had a single occurrence of a
} hang under any -current (i.e. post 2.1 kernel).
}
} Please make sure that there is no problem with your
} controller or SCSI bus cable/terminators/terminator
} power ...
}
} Regards, STefan

This is a single machine that runs both 2.1.x and 2.2-SNAP, so there is
no hardware problem at all. If I write a tape with "tar -cv files" under
2.2-SNAP and reboot into 2.1.x right away, 2.1.x reads the tape perfectly.
However, a tape written by either 2.1.x or 2.2-SNAP cannot be read by
2.2-SNAP. That is why I think this is a 2.2 problem and nothing else.

-Jin