From: Michal Mertl
To: Scott Long
Cc: freebsd-current@freebsd.org
Subject: Re: em(4) stops forwarding
Date: Fri, 24 Feb 2006 17:51:55 +0100
Message-Id: <1140799915.867.17.camel@genius.i.cz>
In-Reply-To: <43E2184B.3040606@samsco.org>

Scott Long wrote:
> Michal Mertl wrote:
>
> > Scott Long wrote:
> >
> >>Michal Mertl wrote:
> >>
> >>>Scott Long wrote:
> >>>
> >>>
> >>>>Michal Mertl wrote:
> >>>>
> >>>>
> >>>>>Hello,
> >>>>>
> >>>>>I've been running CURRENT for a long time and never experienced a
> >>>>>problem with the built-in em(4) card before.
> >>>>>Recently (I first noticed it on Jan 24) the card has stopped working
> >>>>>several times. Nothing gets into the log file. Carrier is still
> >>>>>detected properly but no data is exchanged. Ifconfig up/down doesn't
> >>>>>help but kldunload/load does. When I run tcpdump I don't see any
> >>>>>packets coming in but I see some going out.
> >>>>>
> >>>>>Can someone suggest what to look at when it happens the next time? I
> >>>>>have DDB compiled in. I will try to sniff the wire using another machine
> >>>>>next time to see if the card sends out anything.
> >>>>>
> >>>>>The command 'pciconf -lv' says this about the card:
> >>>>>em0@pci2:1:0: class=0x020000 card=0x05491014 chip=0x101e8086 rev=0x03
> >>>>>hdr=0x00
> >>>>> vendor = 'Intel Corporation'
> >>>>> device = '82540EP Gigabit Ethernet Controller (Mobile)'
> >>>>> class = network
> >>>>> subclass = ethernet
> >>>>>
> >>>>>The dmesg:
> >>>>>em0: port
> >>>>>0x8000-0x803f mem 0xc0220000-0xc023ffff,0xc0200000-0xc020ffff irq 11 at
> >>>>>device 1.0 on pci2
> >>>>>em0: Ethernet address: 00:0d:60:cd:ae:e2
> >>>>>em0: [FAST]
> >>>>>
> >>>>>The interrupt is shared since the machine is a notebook. I don't know if
> >>>>>it was just a coincidence but I think that it happened at the same time
> >>>>>as my USB mouse stopped working - the USB controller is on the same irq.
> >>>>>
> >>>>>Michal
> >>>>>
> >>>>
> >>>>What is sharing the interrupt?
> >>>
> >>>
> >>>vgapci0, ipw0, ehci0, uhci0-2. I don't think vgapci0 and ipw0 are really
> >>>using the interrupt when I use em0.
> >>>
> >>>
> >>
> >>Ouch. For now, edit /sys/dev/em/if_em.c and add the following line to
> >>the top of the file:
> >>
> >>#define NO_EM_FASTINTR
> >
> >
> > Do you know the reason for the problem? Wouldn't it be better if I used
> > the stock driver and got some information for you when it doesn't work? I
> > use the machine as my workstation so it isn't such a big problem when it
> > loses the network.
> >
>
> The problem is that the drivers that are sharing the interrupt,
> particularly the USB ones, can spend a very very long time waiting on
> locks to service the interrupt. During that time, the interrupt pin is
> masked and interrupts from all of the shared devices don't get
> delivered. So even though the if_em driver has a very fast interrupt
> handler, it still has to wait on the USB drivers. During that wait, a
> burst of network traffic might come into the card, filling its buffers
> and triggering an overflow. This would be especially likely to happen
> while the kernel is flushing out filesystem i/o. In theory the
> interrupt service latency shouldn't be any different whether the if_em
> driver is fast or not, but there might be coincidental timing issues
> that I don't understand. That's why I'd like you to set the #ifdef in
> the driver to revert it back to its classic behaviour and see if the
> problem persists. If it doesn't, then I'll have to rethink some of the
> changes that I made to it.
>

I thought I should let you know whether I still experience the em lockup.
Unfortunately, the answer is that it has not happened again, either with
NO_EM_FASTINTR defined or without it.

> Scott
>
> >
> >>Also, does your kernel config include the apic device?
> >
> >
> > Yes, it does. But I believe that the chipset doesn't have it and the
> > CPU doesn't support it either.
> >
> > Michal
> >
> >
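
The overflow scenario Scott describes (the shared interrupt line stays
masked, nobody drains the card's receive buffers, and a burst of traffic
fills them) can be pictured with a toy model. This is only a sketch and
is not taken from if_em.c: the function name dropped_while_masked and
its parameters ring_slots, masked_ticks and packets_per_tick are
hypothetical, and a real card's buffering is certainly more involved.

```c
#include <assert.h>

/*
 * Toy model, not driver code: count the packets lost while a shared
 * interrupt line stays masked and the receive ring cannot be drained.
 * ring_slots, masked_ticks and packets_per_tick are hypothetical
 * parameters chosen for illustration only.
 */
int
dropped_while_masked(int ring_slots, int masked_ticks, int packets_per_tick)
{
	int queued = 0;
	int dropped = 0;

	assert(ring_slots > 0 && masked_ticks >= 0 && packets_per_tick >= 0);

	for (int t = 0; t < masked_ticks; t++) {
		for (int p = 0; p < packets_per_tick; p++) {
			if (queued < ring_slots)
				queued++;	/* a free slot takes the packet */
			else
				dropped++;	/* ring full: the packet is lost */
		}
	}
	return (dropped);
}
```

With an 8-slot ring, a mask lasting 4 ticks at one packet per tick loses
nothing, while a mask lasting 20 ticks at two packets per tick loses 32
of the 40 packets. In this model only the length of the masked period
and the arrival rate matter, not how fast the em handler itself is.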