Date: Sat, 28 Aug 1999 11:03:45 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: "Justin T. Gibbs" <gibbs@plutotech.com>, hackers@FreeBSD.ORG
Subject: Re: Should cam_imask be part of bio_imask ?
Message-ID: <Pine.BSF.4.05.9908281102180.8884-100000@semuta.feral.com>
In-Reply-To: <199908281800.LAA05485@apollo.backplane.com>
> :I strongly doubt that this is a CAM isr problem- the error pattern isn't
> :entirely clear from what you said, but it looks more like a FIFO or CACHE
> :LINE sized type of problem- it looks to be < 16 bytes, but not a short
> :count. Because this isn't one of the wacky systems I spent most of my
> :career on at Sun where the first and usual suspect was a system memory
> :cache line because IO wasn't cache coherent on Suns between the Sun
> :3/{50,60,75,150} and the advent SuperSparc Viking Chipset, I'd guess a
> :FIFO somewhere in the I/O movement path.
> :
> :Justin- any changes lately where flushing a FIFO in the Adaptec at the end
> :of transfer might have been spoodged?
> :
> :-matt
>
> The problem is definitely aligned in some way. Here's a diff of
> a hexdump of one error. Sometimes I lose a whole page, sometimes two
> pages, sometimes 16 bytes, but the error is always page aligned.
>
> 1536c1536
> < 0005ff0 3333 2033 3434 3434 7c20 207c 3030 3030
> ---
> > 0005ff0 7365 3d20 3120 093b 2309 6720 6f6c 6162
>
> A cache-line problem would fit the symptoms. I know it isn't the
> hardware... this 1xCPU PPro/200 system has been with me for several
> years and this test didn't fail like this a month ago. When I updated
> the machine last (unfortunately w/ about a month's worth of changes),
> my buildworlds started failing with odd errors.
>
> I then switched away from the failing buildworlds (which take an hour)
> and started doing cp -r's and then diff -r's (takes only 20 min), and as
> you can see I'm still seeing the problem.
>
> Maybe this is DMA related. Perhaps the cache is not getting cleared?
> Maybe an MMU optimization someone threw in recently?

That's possible too- I'll admit I'm a bit hazy on i386 specifics- it's
always been a "just works wrt I/O" so for all I know there's a required
i/o flush command when you switch mappings.

Gawd I hate these kinds of problems.
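
For anyone wanting to check alignment the way the hexdump diff above does,
here is a minimal sketch (not anything from the thread itself) that finds
the byte ranges where two copies of a file differ and reports whether each
range starts or ends on a page boundary. The names diff_runs, describe and
PAGE_SIZE are hypothetical, and the 4096-byte page size is an assumption
for i386.

```python
PAGE_SIZE = 4096  # assumption: i386 page size in bytes


def diff_runs(a, b):
    """Return half-open (start, end) byte ranges where a and b differ.

    Bytes past the end of the shorter input count as differing.
    """
    runs = []
    start = None
    common = min(len(a), len(b))
    longest = max(len(a), len(b))
    for i in range(longest):
        differs = i >= common or a[i] != b[i]
        if differs and start is None:
            start = i                 # a new differing run begins here
        elif not differs and start is not None:
            runs.append((start, i))   # the run ended just before i
            start = None
    if start is not None:
        runs.append((start, longest))
    return runs


def describe(run, page_size=PAGE_SIZE):
    """Summarise one differing run and its alignment to page boundaries."""
    start, end = run
    return {
        "start": start,
        "end": end,
        "length": end - start,
        "start_page_aligned": start % page_size == 0,
        "end_page_aligned": end % page_size == 0,
    }


if __name__ == "__main__":
    # Synthetic data only: corrupt the last 16 bytes of one page, which is
    # the same offset (0x5ff0) that appears in the hexdump diff above.
    good = bytes(0x7000)
    bad = bytearray(good)
    bad[0x5FF0:0x6000] = b"\xff" * 16
    for run in diff_runs(good, bytes(bad)):
        print(describe(run))
```

In practice one would read the source file and its cp -r copy in binary
mode and pass both buffers to diff_runs. A run that is a whole number of
pages long, or that ends exactly on a page boundary like the 16-byte case
quoted above, would fit the "always page aligned" observation.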