Date: Wed, 23 Apr 2008 17:22:40 +0900
From: Pyun YongHyeon
To: Luigi Rizzo
Cc: current@FreeBSD.org, bug-followup@FreeBSD.org, yongari@FreeBSD.org
Subject: Re: amd64/115126: [nfe] nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
Message-ID: <20080423082240.GF54715@cdnetworks.co.kr>
In-Reply-To: <20080422072839.GA85728@onelab2.iet.unipi.it>
Reply-To: pyunyh@gmail.com
List-Id: Discussions about the use of FreeBSD-current

On Tue, Apr 22, 2008 at 09:28:39AM +0200, Luigi Rizzo wrote:
> related to this bug, i am seeing similar problems with RELENG_7 and amd64,
> with an ASUS M2N-VM DVI motherboard
> http://www.asus.com/products.aspx?modelmenu=1&model=1841&l1=3&l2=101&l3=567&l4=0
> and an Athlon64-BE2400 dual core CPU .
>
> Under heavy load, e.g. scp-ing a large file over the local network,
> and at the same time doing a buildkernel or building a port,
> and with X11 active (using the 'vesa' xorg driver)
> the network card stalls and doesn't recover - i waited over 10 minutes
> hoping for the watchdog or some timeout to kick in, the only way
> to bring the link back up was
>
>     ifconfig nfe0 down ; ifconfig nfe0 up
>     dhclient nfe0
>
> doing only ifconfig down/up or only dhclient did not help, i needed both.
>
> vmstat -i says the network card has irq256 (???)
> and it is not shared with
> other devices. Ehci, sound, ohci, ata, and others have low irq numbers
> (6, 14, 20, 21, 22), some shared, some not.
>
> Changing the bios setting for PnP OS from 'yes' to 'no' or viceversa
> does not change the situation.
>

Your BIOS may have an ASF-related option for the onboard NIC. Try
toggling that option and see how it goes.

> The stall seems related to the presence of other activity - if i
> let the bulk scp transfer alone, i get an happy 10-10.5Mbytes/s
> (over a 100meg link).
>
> When the stall occurs, i see no interrupts (vmstat -i counts
> for irq256 says the same),
> Packets are still transmitted and received on the other side, it's
> the rx side of the card that becomes deaf. I don't see any
> watchdog timeout or other error messages in /var/log/messages.
>
> Also, enabling polling does not help getting traffic in
> (with a kernel built with DEVICE_POLLING,
> doing sysctl kern.polling.enable=1 and "ifconfig nfe0 polling").
>
> So i suspect that for some reason the rx ring becomes confused
> and does not recover.
>

Just a vague guess: how about disabling MSI/MSI-X in loader.conf?
(hw.nfe.msi_disable="1", hw.nfe.msix_disable="1")
If you are using jumbo frames, try disabling them too.

> Hope this helps...
>

It would be even better if you could post the verbose boot messages
related to nfe(4) and the PHY driver.

> cheers
> luigi

-- 
Regards,
Pyun YongHyeon