Date:      Fri, 26 Feb 1999 12:22:25 -0800
From:      Mike Smith <mike@smith.net.au>
To:        Andrew Heybey <ath@niksun.com>
Cc:        Mike Smith <mike@smith.net.au>, freebsd-hackers@freebsd.org
Subject:   Re: Advice wanted on tracking down bug (or hw problem?) in 3.1R 
Message-ID:  <199902262022.MAA09175@dingo.cdrom.com>
In-Reply-To: Your message of "Fri, 26 Feb 1999 13:24:12 EST." <199902261824.NAA12066@stiegl.niksun.com> 

> >>On Fri, 26 Feb 1999 09:52:33 -0800, Mike Smith <mike@smith.net.au> said:
>   >> I have just submitted PR kern/10243, but I thought I would ask
>   >> for some advice on hackers as well.
>   >> 
>   >> The bug is that under certain loads, read(2) can return corrupted
>   >> data (i.e. data that are not in the file on disk).  The instances I
>   >> have seen are relatively small amounts (8-64 bytes) of corrupt
>   >> data at the end of a 4k page.  The corrupt data is from a file
>   >> previously read or another position in the current file.  I have
>   >> also seen this problem in 3.0-RELEASE but not in 2.2.8-RELEASE.
> 
>   mike> Can you look at the corrupt data and see if you can identify
>   mike> it?  In particular, look for objects that look like IP
>   mike> addresses, MAC addresses, pointers into kernel space, ascii
>   mike> text, etc.  This is usually the best way to work out where the
>   mike> data is coming from.
> 
> The data is always (in every instance that I have examined) from some
> other part of the file currently being read or some other file in my
> set of test files.  How my test setup works is that I have 30 50MB
> files.  The files are filled with sequential integers (counting over
> the entire 1.5GB).  My test program reads from the files (in order,
> starting over at file #0 when it reaches file #29) and compares what
> read(2) returns to what should be there (based on file number and file
> offset).
> 
> One other possible clue: This morning I hooked my disks up to the
> regular Ultra SCSI (40MB/s) port of the 7890 controller rather than
> the Ultra/2 (80MB/s) port and I haven't seen the bug yet.  I am not
> 100% positive since I have only run it for a few hours so far, but
> before I could almost always make the bug happen within 10-15
> minutes.

Could you try bzero'ing your buffers before every read?  This sniffs 
very much like short transfers rather than sniping...
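
A minimal sketch of that check, under stated assumptions: the names
poisoned_read, classify and POISON are hypothetical and not from the
test program discussed above, and a non-zero fill byte stands in for
bzero() so that untouched bytes are easier to tell apart from real
zeros.  The idea is to fill the buffer with a known pattern before each
read(2), then look at every mismatching byte.  If it still holds the
pattern, the transfer probably never wrote it (a short transfer); if it
holds anything else, something wrote foreign data there.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define POISON 0xA5   /* hypothetical fill byte; bzero() would use 0 */

enum verdict { DATA_OK, SHORT_TRANSFER, FOREIGN_DATA };

/* Fill the buffer with POISON, then read.  Any byte the transfer never
 * wrote still holds POISON afterwards instead of leftovers from an
 * earlier read. */
static ssize_t poisoned_read(int fd, unsigned char *buf, size_t len)
{
    memset(buf, POISON, len);
    return read(fd, buf, len);
}

/* Compare the n bytes read(2) claims to have delivered against the
 * bytes that should be there.  A mismatching byte that still equals
 * POISON was most likely never written; any other mismatch means
 * different data landed in the buffer.  (Foreign data that happens to
 * equal POISON is misclassified; this is a heuristic.) */
static enum verdict classify(const unsigned char *buf,
                             const unsigned char *expected, size_t n)
{
    enum verdict v = DATA_OK;
    size_t i;

    for (i = 0; i < n; i++) {
        if (buf[i] == expected[i])
            continue;
        if (buf[i] != POISON)
            return FOREIGN_DATA;
        v = SHORT_TRANSFER;
    }
    return v;
}
```

In a test like the one described above, each read(2) would become a
poisoned_read(), followed by classify() against the integers expected
for that file number and offset.  A SHORT_TRANSFER result at the end of
a 4k page would point at the transfer coming up short rather than at
stray writes.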

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com







