Message-Id: <199902261824.NAA12066@stiegl.niksun.com>
From: Andrew Heybey
To: Mike Smith
Cc: freebsd-hackers@freebsd.org
Subject: Re: Advice wanted on tracking down bug (or hw problem?) in 3.1R
In-reply-to: Your message of Fri, 26 Feb 1999 09:52:33 -0800. <199902261752.JAA08432@dingo.cdrom.com>
Date: Fri, 26 Feb 1999 13:24:12 -0500

>>On Fri, 26 Feb 1999 09:52:33 -0800, Mike Smith said:

>> I have just submitted PR kern/10243, but I thought I would ask
>> for some advice on hackers as well.
>>
>> The bug is that under certain loads, read(2) can return corrupted
>> data (ie data that are not in the file on disk).  The instances I
>> have seen are relatively small amounts (8-64 bytes) of corrupt
>> data at the end of a 4k page.  The corrupt data is from a file
>> previously read or another position in the current file.  I have
>> also seen this problem in 3.0-RELEASE but not in 2.2.8-RELEASE.

mike> Can you look at the corrupt data and see if you can identify
mike> it?  In particular, look for objects that look like IP
mike> addresses, MAC addresses, pointers into kernel space, ascii
mike> text, etc.
mike> This is usually the best way to work out where the
mike> data is coming from.

The data is always (in every instance that I have examined) from some
other part of the file currently being read or from some other file
in my set of test files.

My test setup works as follows.  I have 30 files of 50MB each.  The
files are filled with sequential integers (counting over the entire
1.5GB).  My test program reads from the files (in order, starting
over at file #0 when it reaches file #29) and compares what read(2)
returns to what should be there (based on file number and file
offset).

One other possible clue: This morning I hooked my disks up to the
regular Ultra SCSI (40MB/s) port of the 7890 controller rather than
the Ultra/2 (80MB/s) port and I haven't seen the bug yet.  I am not
100% positive since I have only run it for a few hours so far, but
before the change I could almost always make the bug happen within
10-15 minutes.

andrew
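
For readers who want to try something similar, the test setup described
above (files of sequential integers, checked against what read(2)
returns) could be sketched roughly as follows.  This is not the actual
test program from the post, only an illustration.  The 32-bit
little-endian integer format, the reading of "50MB" as 50 * 2**20
bytes, the 64k read size, the file names, and all function names are
assumptions made for the sketch.

```python
import os
import struct

WORD = 4                      # assumption: 32-bit integers
FILE_SIZE = 50 * 1024 * 1024  # assumption: "50MB" means 50 * 2**20 bytes
NUM_FILES = 30                # from the post: files #0 through #29
CHUNK = 64 * 1024             # assumption: bytes requested per read


def expected_bytes(file_no, offset, length, file_size=FILE_SIZE):
    """Bytes that should sit at [offset, offset + length) of file file_no."""
    start = file_no * file_size + offset        # byte position in the whole set
    first = start // WORD                       # first integer touched
    last = (start + length + WORD - 1) // WORD  # one past the last integer touched
    words = struct.pack("<%dI" % (last - first), *range(first, last))
    skip = start - first * WORD                 # handle unaligned offsets
    return words[skip:skip + length]


def write_file(path, file_no, file_size=FILE_SIZE, chunk=CHUNK):
    """Create one test file filled with its share of the integer sequence."""
    with open(path, "wb") as f:
        for offset in range(0, file_size, chunk):
            length = min(chunk, file_size - offset)
            f.write(expected_bytes(file_no, offset, length, file_size))


def check_file(path, file_no, file_size=FILE_SIZE, chunk=CHUNK):
    """Read the file in chunks and return, for each chunk that differs,
    the file offset of its first mismatching byte."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            data = os.read(fd, chunk)  # thin wrapper around read(2)
            if not data:
                break                  # end of file
            want = expected_bytes(file_no, offset, len(data), file_size)
            if data != want:
                i = next(i for i in range(len(data)) if data[i] != want[i])
                bad.append(offset + i)
            offset += len(data)
    finally:
        os.close(fd)
    return bad


def run(directory, passes=1):
    """Check files in order, starting over at file 0 after the last one."""
    for n in range(passes * NUM_FILES):
        file_no = n % NUM_FILES
        # Hypothetical file naming scheme: test00.dat ... test29.dat
        path = os.path.join(directory, "test%02d.dat" % file_no)
        for off in check_file(path, file_no):
            print("file %d: bad data starting at offset %d" % (file_no, off))
```

To use it, generate each file once with write_file(path, file_no) and
then call run(directory, passes=N) with a large N.  Because the
expected contents are computed from the file number and offset alone,
no reference copy of the data has to be kept in memory or on another
disk.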