Date: Tue, 7 Dec 1999 20:01:51 +0000 (GMT)
From: Doug Rabson
To: Peter Wemm
Cc: Ed Hall, Matthew Dillon, "Jonathan M. Bresler", kris@hub.freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: PCI DMA lockups in 3.2 (3.3 maybe?)
In-Reply-To: <19991207120139.869F01CC6@overcee.netplex.com.au>

On Tue, 7 Dec 1999, Peter Wemm wrote:

> Ed Hall wrote:
> > : you wrote:
> > : : I wrote:
> > : :4) Using a different SCSI driver (Peter managed to get a driver from 4.0
> > : :   hooked up under 3.3, and it survived two days of torture that would
> > : :   have toasted things within an hour using the stock driver; you'll have
> > : :   to ask him for details).
> > :
> > : Ed, this is great stuff!
> > :
> > : Are you sure about #4?  Is that the same ncr.c driver or something
> > : else?  There are only a few differences between the 3.x and 4.x
> > : /usr/src/sys/pci/ncr.c drivers.  Which Peter, Peter Wemm?
> >
> > It was Peter Wemm.  I may be misunderstanding just what he did--trying
> > the 4.0 driver was just one of several experiments he proposed and
> > performed.  And saying that it "worked" is provisional; two days of
> > testing strongly suggests that it reduced the problem with 3.3 to
> > acceptable levels for my application.  Is it truly a "fix?"  I don't
> > know.
> >
> > 		-Ed
>
> I might add that others have found that using sym + fxp on the N440BX
> motherboards didn't solve their problems, or moved the problem elsewhere,
> e.g. to the sbdrop() etc. routines.  One other interesting variable: an ahc
> + pn driver combination on a 440BX motherboard under -current in late May
> 99 had the exact same problems we saw a number of times with ncr + fxp (i.e.
> sbdrop, sbflush, m_copym etc. panics).  The same motherboard with ahc + de or
> fxp did not have the problems.
>
> In all cases the panics were extremely "strange".  The original fxp+ncr
> combination changed its crash pattern when we put extra debugging in it to
> sanity check and check conditions.  The results varied from registers getting
> clobbered (as though an interrupt happened and the trapframe on the stack got
> changed by the interrupt handler and then returned with garbage contents in
> some registers.. this is what seems to be happening in the fxp_add_rfabuf()
> panics - %esi was getting loaded earlier on and when it got to do the
> vtophys() it was zero.  People have printed the contents of "rfa" on the stack
> and seen garbage - in fact it's a register variable under normal circumstances.
> Adding debugging caused it to be stored in the local variable rather than
> being left in %esi, and then the panics moved elsewhere (!).)
>
> It had the markings of "something trashing something somewhere and then
> crashing quite a bit later". :-(

Has anyone tried fiddling with the latency timer on either fxp or ncr (or
both)?

--
Doug Rabson				Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.			Phone: +44 181 442 9037