Date: Fri, 26 Feb 1999 12:22:25 -0800
From: Mike Smith <mike@smith.net.au>
To: Andrew Heybey <ath@niksun.com>
Cc: Mike Smith <mike@smith.net.au>, freebsd-hackers@freebsd.org
Subject: Re: Advice wanted on tracking down bug (or hw problem?) in 3.1R
Message-ID: <199902262022.MAA09175@dingo.cdrom.com>
In-Reply-To: Your message of "Fri, 26 Feb 1999 13:24:12 EST." <199902261824.NAA12066@stiegl.niksun.com>

> >>On Fri, 26 Feb 1999 09:52:33 -0800, Mike Smith <mike@smith.net.au> said:
>
> >> I have just submitted PR kern/10243, but I thought I would ask
> >> for some advice on hackers as well.
> >>
> >> The bug is that under certain loads, read(2) can return corrupted
> >> data (i.e. data that are not in the file on disk). The instances I
> >> have seen are relatively small amounts (8-64 bytes) of corrupt
> >> data at the end of a 4k page. The corrupt data is from a file
> >> previously read or another position in the current file. I have
> >> also seen this problem in 3.0-RELEASE but not in 2.2.8-RELEASE.
>
> mike> Can you look at the corrupt data and see if you can identify
> mike> it? In particular, look for objects that look like IP
> mike> addresses, MAC addresses, pointers into kernel space, ascii
> mike> text, etc. This is usually the best way to work out where the
> mike> data is coming from.
>
> The data is always (in every instance that I have examined) from some
> other part of the file currently being read or some other file in my
> set of test files. How my test setup works is that I have 30 50MB
> files. The files are filled with sequential integers (counting over
> the entire 1.5GB). My test program reads from the files (in order,
> starting over at file #0 when it reaches file #29) and compares what
> read(2) returns to what should be there (based on file number and file
> offset).
>
> One other possible clue: This morning I hooked my disks up to the
> regular Ultra SCSI (40MB/s) port of the 7890 controller rather than
> the Ultra/2 (80MB/s) port and I haven't seen the bug yet. I am not
> 100% positive since I have only run it for a few hours so far, but
> before I could almost always make the bug happen within 10-15
> minutes.

Could you try bzero'ing your buffers before every read? This sniffs
very much like short transfers rather than sniping...

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
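
The suggestion above (zero the buffer before every read) could be sketched roughly as follows. This is only an illustration, not the test program from the thread: the names read_zeroed, check_block and enum verdict are hypothetical, and it assumes the test files hold consecutive 32-bit integers in host byte order, which the original message does not state. The idea is that once the buffer is zeroed, a bad region that comes back as zeros was never written by the read, while a bad region holding other non-zero data was put there by the read itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of comparing one buffer against the expected integer sequence. */
enum verdict { DATA_OK, DATA_UNWRITTEN, DATA_FOREIGN };

/*
 * Zero the buffer, then read into it. Any byte the read does not
 * overwrite is then known to be 0 rather than stale data left over
 * from an earlier read. memset() here has the same effect as bzero().
 */
ssize_t
read_zeroed(int fd, void *buf, size_t len)
{
	assert(buf != NULL);
	memset(buf, 0, len);
	return read(fd, buf, len);
}

/*
 * Hypothetical checker. Assumes buf[i] should equal first + i.
 * Returns DATA_FOREIGN as soon as a wrong, non-zero word is seen,
 * DATA_UNWRITTEN if every wrong word is zero, DATA_OK otherwise.
 */
enum verdict
check_block(const uint32_t *buf, size_t nwords, uint32_t first)
{
	enum verdict v = DATA_OK;

	for (size_t i = 0; i < nwords; i++) {
		if (buf[i] == first + (uint32_t)i)
			continue;		/* word is as expected */
		if (buf[i] != 0)
			return DATA_FOREIGN;	/* wrong data from elsewhere */
		v = DATA_UNWRITTEN;		/* still the zero fill */
	}
	return v;
}
```

A caller would replace each read(2) in the test loop with read_zeroed(), compare the returned byte count with the requested length, and pass the buffer to check_block() together with the integer expected at that file offset.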