From owner-freebsd-hackers Thu Oct 7 17:24:50 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from dingo.cdrom.com (dingo.cdrom.com [204.216.28.145])
        by hub.freebsd.org (Postfix) with ESMTP id D2C9E14BB7;
        Thu, 7 Oct 1999 17:24:46 -0700 (PDT)
        (envelope-from mike@dingo.cdrom.com)
Received: from dingo.cdrom.com (localhost.cdrom.com [127.0.0.1])
        by dingo.cdrom.com (8.9.3/8.8.8) with ESMTP id RAA02524;
        Thu, 7 Oct 1999 17:16:53 -0700 (PDT)
        (envelope-from mike@dingo.cdrom.com)
Message-Id: <199910080016.RAA02524@dingo.cdrom.com>
X-Mailer: exmh version 2.0.2 2/24/98
To: sa-list@avantgo.com
Cc: FreeBSD-hackers@freebsd.org, FreeBSD-stable@freebsd.org
Subject: Re: SMP + fxp0 wierdness
In-reply-to: Your message of "Thu, 07 Oct 1999 16:58:09 PDT."
        <37FD3391.1F84611A@avantgo.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 07 Oct 1999 17:16:53 -0700
From: Mike Smith
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> Greetings,
>
> We're running 3.3-REL on dual processor PII-450's, with a N440BX
> motherboard, using the onboard EtherExpress Pro (fxp) NIC and 512MB RAM.
>
> These machines are running custom software that exercises the disk, CPU
> and network quite heavily. The SMP machines seem to have both "fxp0:
> device timeout" problems, and spontaneous reboots. We were unable to get
> a working savecore until now, and have traced the reboots back to the
> fxp driver as well. Here are the debug outputs, and any custom changes
> to our kernel config.
>
> Could this be a problem with SMP + fxp combination? Any other thoughts
> or ideas?

This is a known problem, insofar as it's been seen on a wide range of
systems. It's not SMP-specific, but it does appear to be very sensitive
to your hardware configuration (e.g. we were seeing it repeatedly in
combination with an ncr SCSI card, and when we switched to an Adaptec
it went away).
Others have reported it in conjunction with Adaptec cards, however, as
well as in IDE-only systems.

We haven't been successful in stirring David Greenman's interest in
this so far, which is really crucial since the only common factor in
these problems so far has been the fxp card/driver combination.

> #4 0xc01cd95a in trap ()
> #5 0xc018979e in fxp_add_rfabuf ()

For anyone else wondering whether they're seeing this problem, the
above two lines are the signature; there is a point inside
fxp_add_rfabuf where the variables on the stack seem right, but the
register shadows of the variables are corrupt, causing the trap.

I spent several days looking at this and came away with a sore head.

-- 
\\  Give a man a fish, and you feed him for a day. \\  Mike Smith
\\  Tell him he should learn how to fish himself,  \\  msmith@freebsd.org
\\  and he'll hate you for a lifetime.             \\  msmith@cdrom.com