Date: Sat, 15 Apr 1995 09:12:25 -0500 (CDT)
From: Mike Pritchard <pritc003@maroon.tc.umn.edu>
To: rlenk@xmission.com (Ron Lenk)
Cc: questions@FreeBSD.org
Subject: Re: SCSI timeout...
Message-ID: <199504151412.JAA06761@mpp.com>
In-Reply-To: <199504150141.TAA04738@xmission.xmission.com> from "Ron Lenk" at Apr 14, 95 07:41:30 pm
> I had run the 2.0-920210-SNAP from the for approx. 3 weeks, at which
> point, about the first week in April, I compiled from the "current"
> sources. Everything ran fine until the first of this week when I added
> the SyQuest drive to the SCSI bus. I was able to fdisk, disklabel, and
> create a new filesystem on the drive, and things worked great under
> _light_ use. However, when I attempted to copy the contents of
> /usr/src ( 110 Mb, as we all know ) to the SyQuest mounted on /mnt, I
> got about half way through the copy when I began getting a kernel
> message that sd0 ( the 1.05 Micropolis disk ) had timed out. I got the
> message about 8 times, and then the system appeared to hang. After
> doing a hard reset, I found the root filesystem (sd0a) was damaged
> beyond repair ( by me, anyway ), and I was forced to attempt to
> reinstall everything

I ran into the same problem a few days ago with an Adaptec 2842VL
SCSI controller and two SCSI disks, running -current. I can
reproduce the problem at will by doing something like:

	find /disk1 -print > /dev/null &
	[run about 6 of the above commands in the background]
	find /disk2 -print > /dev/null &
	[run a couple of the above commands in the background]

After about 30 seconds or so you will start to see timeouts.
It looks like the data corruption you saw was due to the
fact that after a while the I/O that is timing out will
be completed with garbage. E.g. the finds start printing
stuff like

	/usr/src/sys/AAD,kjhet2@#$098 not found

Only there are lots of really strange characters in the filename.

I suspect that this might be related to writes on the disk, since
I've also seen this happen if I do something like:

	cd /disk1
	touch a b c d e
	cd /disk2
	[start doing some I/O on disk2]
	after a bit you might see timeouts

If I do a sync before doing the cd /disk2, then there isn't a
problem. I suspect that the finds die out when sync runs
to flush the i-nodes back to disk to update the directory access
times.
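
For anyone who wants to try this, the find-based reproduction above
could be scripted roughly as follows. This is only a sketch: the
start_finds helper and the DISK1/DISK2 variables are names made up
for illustration, the mount points default to the /disk1 and /disk2
used above, and the counts (6 and 2) are taken from "about 6" and
"a couple". The script only generates the load; the timeouts, if
they occur, show up as kernel messages on the console.

```shell
#!/bin/sh
# Sketch of the reproduction described in the message.
# DISK1/DISK2 are placeholder mount points (assumptions, not fixed names).

# start_finds DIR COUNT: start COUNT background finds over DIR,
# discarding both the file list and any error output.
start_finds() {
    dir=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        find "$dir" -print > /dev/null 2>&1 &
        i=$((i + 1))
    done
}

DISK1=${DISK1:-/disk1}
DISK2=${DISK2:-/disk2}

start_finds "$DISK1" 6    # "about 6" finds on the first disk
start_finds "$DISK2" 2    # "a couple" on the second disk
wait                      # block until every background find exits
echo "all finds finished"
```

Run it on a machine you can afford to reinstall, and be ready to hit
reset as soon as the timeout messages appear.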
> I have looked into the obvious problems, i.e. problems with either of
> the disks, improper termination of the SCSI bus, excessive SCSI bus
> length, etc. But I'm not sure what is causing the problem. I do know
> that everything works fine under Windows NT, with native support for
> the 2842. ( although I'm not sure that this is a fair comparison )
>
> Any advice, help, or suggestions would be appreciated.

Well, to prevent disk damage, hit the reset button when you start
to see the timeout messages.
-- 
Mike Pritchard
pritc003@maroon.tc.umn.edu
"Go that way. Really fast. If something gets in your way, turn"