From: John Baldwin
To: freebsd-current@FreeBSD.org
Cc: Vadim Mikhailov, current@FreeBSD.org
Date: Tue, 29 Jun 2004 11:58:19 -0400
Subject: Re: [kern/68351] bge0 watchdog timeout on 5.2.1 and -current, 5.1 is ok
Message-Id: <200406291158.19613.jhb@FreeBSD.org>
In-Reply-To: <678213ABF77E5D4F9E6CF1DA61A4E2D518413E@usmilm005.palm1.palmone.com>

On Monday 28 June 2004 01:32 pm, Vadim Mikhailov wrote:
> Hi,
>
> I have a Dell PowerEdge 1750 server with 2 Xeon 3.0 GHz CPUs, 4 GB RAM and 2
> onboard gigabit ethernet ports:
>
> bge0: mem 0xfcd20000-0xfcd2ffff,0xfcd30000-0xfcd3ffff irq 17 at device 0.0 on pci2
> bge1: mem 0xfcd00000-0xfcd0ffff,0xfcd10000-0xfcd1ffff irq 18 at device 0.1 on pci2
>
> Only bge0 is used, with jumbo frames (my gigabit switch PowerConnect 5224
> supports them):
>
> bge0: flags=8843 mtu 9000
>         options=1b
>         inet 172.xx.xx.xx netmask 0xfffff800 broadcast 172.xx.xx.255
>         ether 00:06:5b:ef:63:e6
>         media: Ethernet autoselect (1000baseTX )
>         status: active
>
> This box has two dualport SCSI adapters:
>
> mpt0: port 0xbc00-0xbcff mem 0xfcb20000-0xfcb2ffff,0xfcb30000-0xfcb3ffff irq 13 at device 5.0 on pci4
> mpt1: port 0xb800-0xb8ff mem 0xfcb00000-0xfcb0ffff,0xfcb10000-0xfcb1ffff irq 16 at device 5.1 on pci4
> ahc0: port 0xdc00-0xdcff mem 0xfcf01000-0xfcf01fff irq 19 at device 4.0 on pci1
> ahc1: port 0xd800-0xd8ff mem 0xfcf00000-0xfcf00fff irq 20 at device 4.1 on pci1
>
> Each adapter has disks attached to it. Firmware on the motherboard and all
> peripheral devices is upgraded to the very latest versions from Dell.
> This setup works more or less ok under FreeBSD 5.1-RELEASE-p8 (GENERIC
> kernel with SMP enabled), but once every month or two the machine reboots
> under load, so I want to upgrade it to 5.2.1-RELEASE.
> But when I boot a 5.2.1-RELEASE or later kernel (-current) on this box,
> the network adapter locks up.
> I see these messages on the console and in the logs:
>
> Jun 25 15:25:22 vortex kernel: bge0: watchdog timeout -- resetting
>
> If I do "ifconfig bge0 down up", the network becomes available for a few
> seconds and then the machine is not pingable again. I ran "systat -v" and
> noticed that ping stops working exactly when I see any interrupt coming to
> mpt or ahc (i.e. on any disk activity).
>
> One visible difference between 5.1 (where it works) and 5.2.1/current
> (where it doesn't) is that interrupts to PCI devices are getting assigned
> differently:
>
> IRQ map under 5.1: mpt0 13, mpt1 16, bge0 17, bge1 18, ahc0 19, ahc1 20,
> and under 5.2.1: mpt0 18, mpt1 19, bge0 16, bge1 17, ahc0 20, ahc1 21.

The numbers mean different things under 5.1 and 5.2.1. Can you try booting a
kernel from a recent snapshot of current to see if current works better? There
have been various APIC and ACPI fixes since 5.2.1.

--
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org