From owner-freebsd-stable@FreeBSD.ORG Tue Jan 31 11:25:02 2006
Return-Path:
X-Original-To: freebsd-stable@FreeBSD.org
Delivered-To: freebsd-stable@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A1BD416A420
	for ; Tue, 31 Jan 2006 11:25:02 +0000 (GMT)
	(envelope-from roam@straylight.ringlet.net)
Received: from straylight.ringlet.net (nat84.cnsys.bg [85.95.80.84])
	by mx1.FreeBSD.org (Postfix) with SMTP id 92AC743D76
	for ; Tue, 31 Jan 2006 11:24:52 +0000 (GMT)
	(envelope-from roam@straylight.ringlet.net)
Received: (qmail 1450 invoked by uid 1000); 31 Jan 2006 11:24:47 -0000
Date: Tue, 31 Jan 2006 13:24:47 +0200
From: Peter Pentchev
To: Warner Losh
Message-ID: <20060131112447.GA1173@straylight.m.ringlet.net>
References: <20060131091027.CC43516A424@hub.freebsd.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="9amGYk9869ThD9tj"
Content-Disposition: inline
In-Reply-To: <20060131083002.GC93773@FreeBSD.org>
User-Agent: Mutt/1.5.11
Cc: Anish Mistry, Gleb Smirnoff, freebsd-stable@FreeBSD.org
Subject: Re: dc0: watchdog timeout and nve0: device timeout
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 31 Jan 2006 11:25:02 -0000

--9amGYk9869ThD9tj
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, 31 Jan 2006 11:30:02 +0300, Gleb Smirnoff wrote:
> On Tue, Jan 31, 2006 at 03:08:03AM -0500, Anish Mistry wrote:
> A> After updating to STABLE today I'm getting the following message with
> A> my dc and nve NICs every few seconds.  UP, AMD64.  A kernel from last
> A> Thursday was fine.
> A>
> A> dc0: watchdog timeout
> A> nve0: device timeout (4)
>
> Can you try to backout the code in sys/dev/pci to Thursday? If this
> doesn't help, you probably need to do a binary search in this small
> timeframe.

I think I found the problem - the merge was not quite correct, and the
PCI interrupt rerouting was disabled for some reason.  Warner, is there
a reason for hiding the "Try to re-route interrupts" code behind what is
effectively an "#if 0"?  Well, okay, most probably there is a reason,
since you've done it, but... it breaks my re0 card and it also seems to
break Anish's hardware :)

BTW, the commit message was not quite correct - rev. 1.302 was not
really merged; it's included in my patch here.  Also, rev. 1.305 of
pci.c seems to do more than just add the PCI_FIND_EXTCAP method -
there are a couple of offset fixes that I also included in the patch
while trying to come as close to the -CURRENT code as possible; could
you check if they actually apply to -STABLE?

Anyway, here's a patch that fixes it for me, although most probably
the __PCI_REROUTE_INTERRUPT chunk alone should be sufficient.  Warner,
if you want more details, I could help with debugging this - on my
system, the re0 card definitely needs this rerouting.  I've posted some
verbose boot output with explanations at
http://people.FreeBSD.org/~roam/pcirouting/
The patch itself is also there in case it gets munged by the mail
servers along the way.
Index: src/sys/dev/pci/pci.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/pci/pci.c,v
retrieving revision 1.292.2.6
diff -u -r1.292.2.6 pci.c
--- src/sys/dev/pci/pci.c	30 Jan 2006 18:42:10 -0000	1.292.2.6
+++ src/sys/dev/pci/pci.c	31 Jan 2006 10:57:32 -0000
@@ -428,7 +428,7 @@
 		ptrptr = PCIR_CAP_PTR;
 		break;
 	case 2:
-		ptrptr = 0x14;
+		ptrptr = PCIR_CAP_PTR_2;
 		break;
 	default:
 		return;		/* no extended capabilities support */
@@ -447,10 +447,10 @@
 		}
 		/* Find the next entry */
 		ptr = nextptr;
-		nextptr = REG(ptr + 1, 1);
+		nextptr = REG(ptr + PCICAP_NEXTPTR, 1);
 
 		/* Process this entry */
-		switch (REG(ptr, 1)) {
+		switch (REG(ptr + PCICAP_ID, 1)) {
 		case PCIY_PMG:		/* PCI power management */
 			if (cfg->pp.pp_cap == 0) {
 				cfg->pp.pp_cap = REG(ptr + PCIR_POWER_CAP, 2);
@@ -1040,7 +1040,8 @@
 	}
 
 	if (cfg->intpin > 0 && PCI_INTERRUPT_VALID(cfg->intline)) {
-#ifdef __PCI_REROUTE_INTERRUPT
+#if defined(__ia64__) || defined(__i386__) || defined(__amd64__) || \
+	defined(__arm__) || defined(__alpha__)
 		/*
 		 * Try to re-route interrupts.  Sometimes the BIOS or
 		 * firmware may leave bogus values in these registers.

Hope this helps!

G'luck,
Peter

-- 
Peter Pentchev	roam@ringlet.net	roam@cnsys.bg	roam@FreeBSD.org
PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint	FDBA FD79 C26F 3C51 C95E DF9E ED18 B68D 1619 4553
"yields falsehood, when appended to its quotation." yields falsehood, when appended to its quotation.

--9amGYk9869ThD9tj
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (FreeBSD)

iD8DBQFD30j/7Ri2jRYZRVMRAgeyAKCt1Yhreat2iP+DWpimyQf5tc3+TwCgp0cA
W0uyXdh2E60EeVGH8sQCtio=
=XJ9a
-----END PGP SIGNATURE-----

--9amGYk9869ThD9tj--
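
Note on the offset hunks: the first two hunks of the patch replace literal
offsets (0x14, ptr + 1, ptr) with named constants used while walking the PCI
capability list. Below is a minimal userland sketch of such a walk over an
in-memory copy of configuration space, not the kernel code itself.
Assumptions: cfg_read8(), find_capability() and make_sample_config() are
hypothetical names invented for this illustration; the values 0x14, 1 and 0
are the literals the patch replaces, while 0x34, the status bit and the
header-type mask are the usual PCI values and should be checked against
pcireg.h. The masking of the low pointer bits and the loop bound are
defensive choices of this sketch and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Register offsets; names follow the patch, values are assumptions
 * derived from the replaced literals and the usual PCI layout. */
#define PCIR_STATUS            0x06   /* 16-bit status register */
#define PCIM_STATUS_CAPPRESENT 0x0010 /* capability list present */
#define PCIR_HDRTYPE           0x0e
#define PCIM_HDRTYPE           0x7f
#define PCIR_CAP_PTR           0x34   /* header types 0 and 1 */
#define PCIR_CAP_PTR_2         0x14   /* header type 2 (was literal 0x14) */
#define PCICAP_ID              0x0    /* was "ptr" */
#define PCICAP_NEXTPTR         0x1    /* was "ptr + 1" */

#define CFG_SIZE 256

/* Hypothetical stand-in for the kernel's REG(off, 1): one-byte read. */
static uint8_t
cfg_read8(const uint8_t *cfg, unsigned off)
{
	return (cfg[off & (CFG_SIZE - 1)]);
}

/* Return the offset of capability cap_id, or 0 if it is not present. */
unsigned
find_capability(const uint8_t *cfg, uint8_t cap_id)
{
	unsigned status, ptrptr, ptr;
	int guard;

	assert(cfg != NULL);
	status = cfg_read8(cfg, PCIR_STATUS) |
	    (cfg_read8(cfg, PCIR_STATUS + 1) << 8);
	if ((status & PCIM_STATUS_CAPPRESENT) == 0)
		return (0);

	switch (cfg_read8(cfg, PCIR_HDRTYPE) & PCIM_HDRTYPE) {
	case 0:
	case 1:
		ptrptr = PCIR_CAP_PTR;
		break;
	case 2:
		ptrptr = PCIR_CAP_PTR_2;
		break;
	default:
		return (0);	/* no capability list for this header type */
	}

	/* Bound the walk so a malformed (looping) list cannot hang us. */
	ptr = cfg_read8(cfg, ptrptr) & 0xfc;
	for (guard = 0; ptr != 0 && guard < 48; guard++) {
		if (cfg_read8(cfg, ptr + PCICAP_ID) == cap_id)
			return (ptr);
		ptr = cfg_read8(cfg, ptr + PCICAP_NEXTPTR) & 0xfc;
	}
	return (0);
}

/* Build a made-up type 0 device with two capabilities, for testing only. */
void
make_sample_config(uint8_t *cfg)
{
	memset(cfg, 0, CFG_SIZE);
	cfg[PCIR_STATUS] = PCIM_STATUS_CAPPRESENT;
	cfg[PCIR_HDRTYPE] = 0x00;
	cfg[PCIR_CAP_PTR] = 0x40;
	cfg[0x40 + PCICAP_ID] = 0x01;		/* power management */
	cfg[0x40 + PCICAP_NEXTPTR] = 0x50;
	cfg[0x50 + PCICAP_ID] = 0x05;		/* MSI */
	cfg[0x50 + PCICAP_NEXTPTR] = 0x00;	/* end of list */
}
```

To try it, fill a 256-byte buffer with make_sample_config() and call
find_capability() with a capability ID; the sample layout places power
management at 0x40 and MSI at 0x50. Since PCICAP_ID is 0 and PCICAP_NEXTPTR
is 1, the named offsets read the same bytes as the old literals, which
suggests those two hunks are readability changes rather than behaviour
changes.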