From: Rainer Hurling
Date: Tue, 03 Apr 2007 18:49:42 +0200
To: pyunyh@gmail.com
Cc: darren780@yahoo.com, freebsd-current@freebsd.org, shigeaki@se.hiroshima-u.ac.jp
Subject: Re: yongari nfe problems
Message-ID: <461285A6.5010805@gwdg.de>
In-Reply-To: <20070403035845.GB7223@cdnetworks.co.kr>

Pyun YongHyeon wrote:
> On Mon, Apr 02, 2007 at 07:27:14PM +0200, Rainer Hurling wrote:
> > Pyun YongHyeon wrote:
> > >On Sat, Mar 31, 2007 at 05:01:18PM +0200, Rainer Hurling wrote:
> > > > Thank you Pyun YongHyeon for the newest patch. I am running it with
> > > > if_nfe.c and if_nfereg.h from 03/21/2007 and if_nfevar.h from
> > > > 03/19/2007 on FreeBSD 7.0-CURRENT (i386) from today.
> > > >
> > > > boot -v gives me:
> > > > nfe0: port 0xb000-0xb007 mem
> > > > 0xfbef3000-0xfbef3fff,0xfbefa800-0xfbefa8ff,0xfbefa400-0xfbefa40f
> > > > irq 22 at device 8.0 on pci0
> > > > nfe0: Reserved 0x1000 bytes for rid 0x10 type 3 at 0xfbef3000
> > > > miibus0: on nfe0
> > > > ciphy0: PHY 1 on miibus0
> > > > ciphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> > > > 1000baseT-FDX, auto
> > > > nfe0: bpf attached
> > > > nfe0: Ethernet address: 00:16:17:95:d9:7c
> > > > nfe0: [MPSAFE]
> > > > nfe0: [FILTER]
> > > >
> > > > Now there are no more warnings from miibus0 :-)
> > > >
> > >
> > >Thanks for testing.
> > >
> > > > Unfortunately at bigger network transfers I still observe the
> > > > previously described watchdog timeouts:
> > > >
> > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
> > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
> > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
> > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
> > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
> > > > nfe0: watchdog timeout (missed Tx interrupts) -- recovering
> > > > ...
> > > >
> > > > During these timeouts I am not able to use my network ;-(
> > > >
> > > > I would be happy if I could help solving this problem. Let me know if I
> > > > can test anything.
> > > >
> > >
> > >Does nfe(4) use shared interrupt with other devices?
> > >(Check 'vmstat -i' output.)
> > >
> >
> > #vmstat -i
> > interrupt                          total       rate
> > irq1: atkbd0                       10848          1
> > irq12: psm0                        79500          7
> > irq14: ata0                       102455         10
> > irq16: sym0                           14          0
> > irq17: nvidia0                    632579         61
> > irq21: pcm0 ohci0                  30994          3
> > irq22: nfe0 ehci0                  36673          3
>          ^^^^^^^^^^
>
> You use shared interrupt. :-(

Yes, that's it. Both units are on the mainboard. In "man ehci(4)" I found:

-------
BUGS
     The driver is not finished and is quite buggy.

     There is currently no support for isochronous transfers.
-------

Possibly this could cause the observed "dropouts" of nfe0, lasting from a
few seconds to several minutes?

> > irq23: atapci1                    143425         14
> > cpu0: timer                     20480047       1999
> > cpu1: timer                     20466044       1998
> > Total                           41982579       4099
> >
> >
> > >Since the watchdog timeout error indicates you've had missing Tx
> > >completion interrupts I guess you've lost Tx completion interrupts
> > >under high system loads. One of major changes in new nfe(4) was
> > >switching to so-called adaptive polling and it is known to give better
> > >performance. However it can lose interrupts under high system loads
> > >(e.g. buildworld) and I guess there are two ways to fix the issue.
> > >
> > >1. Add MSI/MSI-X support.
> > >   I think this is the cleanest solution to the issue. But old
> > >   hardwares which has no MSI/MSI-X support and buggy PCI bridges may
> > >   have issues dealing with MSI/MSI-X. In addition, there is no public
> > >   documentation available for NVIDIA NICs and lack of MSI/MSI-X capable
> > >   hardwares make me hard to add MSI/MSI-X support. AFAIK, Shigeaki
> > >   Tagashira is working on supporting MSI/MSI-X.(CCed)
> >
> > dmesg shows on my MCP55 system:
> > pcib0: port 0xcf8-0xcff on acpi0
> > pci0: on pcib0
> > pcib0: HT Bridge at 0:5:0 has non-default MSI window 0xc02000a
> > pcib0: HT Bridge at 0:5:1 has non-default MSI window 0x602000a
> > pcib0: HT Bridge at 0:6:1 has non-default MSI window 0x0
> > pcib0: HT Bridge at 0:8:0 has non-default MSI window 0x75011
>
> I'm not sure what non-default MSI window have influence on MSI
> support code. Maybe jhb has better idea.(CCed)

It was just a guess.

> > pci0: at device 0.0 (no driver attached)
> >
> > A more comprehensive info of 'boot -v' you can find as attachment. I
> > snipped a few lines because they are not necessary in this context (cpu,
> > pcm0, ad, acd, ...).
> >
> > >2. polling(4)
> > >   Because polling(4) does not rely on timed-delivery of Tx interrupts
> > >   it would help in your case.
> >
> > Is polling in classical sense the right way for this new driver with
> > 'adaptive polling'?
> >
> > I think you could be right when assuming inadequate MSI/MSI-X support
> > for the MCP55 chipset.
> >
>
> Personally I don't like polling(4) due to latency issues but it
> seems that there is no easy way to work-around until nfe(4) get
> working MSI/MSI-X support.
> Alternatively, if you don't use USB at all you can completely
> disable USBs and can avoid the use of shared interrupt with USB
> devices.

Is there a knob or option in driver nfe(4) I can use to try classical
polling or any 'lower' mode of operation?

Rainer Hurling
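
For anyone who wants to repeat the shared-interrupt check on their own machine, here is a minimal sketch in Python that scans 'vmstat -i' output for irq lines listing more than one device. The sample text is the output quoted above, joined into one block; the function name shared_irqs and the regular expression are only illustrative assumptions, and the parsing relies on the column layout shown in this thread.

```python
import re

# 'vmstat -i' output as quoted in this thread (joined into one block).
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                       10848          1
irq12: psm0                        79500          7
irq14: ata0                       102455         10
irq16: sym0                           14          0
irq17: nvidia0                    632579         61
irq21: pcm0 ohci0                  30994          3
irq22: nfe0 ehci0                  36673          3
irq23: atapci1                    143425         14
cpu0: timer                     20480047       1999
cpu1: timer                     20466044       1998
Total                           41982579       4099
"""

# Hypothetical helper: matches "irqN: dev [dev ...] total rate".
IRQ_LINE = re.compile(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(text):
    """Return {irq: [devices]} for each irq line that lists several devices."""
    shared = {}
    for line in text.splitlines():
        match = IRQ_LINE.match(line)
        if match is None:
            continue  # header, cpu timer and Total lines are skipped
        devices = match.group(2).split()
        if len(devices) > 1:
            shared[match.group(1)] = devices
    return shared


if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print(irq, "is shared by", ", ".join(devices))
```

On the sample above this reports irq21 (pcm0, ohci0) and irq22 (nfe0, ehci0). On a live system the output of 'vmstat -i' could be passed to the function instead of SAMPLE.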