From owner-freebsd-current@FreeBSD.ORG Fri Jul 21 16:27:32 2006
Return-Path:
X-Original-To: freebsd-current@freebsd.org
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 864FC16A4FD
    for ; Fri, 21 Jul 2006 16:27:32 +0000 (UTC)
    (envelope-from jfvogel@gmail.com)
Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.177])
    by mx1.FreeBSD.org (Postfix) with ESMTP id E6FED43D45
    for ; Fri, 21 Jul 2006 16:27:31 +0000 (GMT)
    (envelope-from jfvogel@gmail.com)
Received: by py-out-1112.google.com with SMTP id b36so1265849pyb
    for ; Fri, 21 Jul 2006 09:27:31 -0700 (PDT)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com;
    h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references;
    b=mLaY+Egi4WajvQ4Xj7dMo7pE14Ms+U2y19SYKcq2QUqgAN6sxrc3LS6fu0P6LyliQlcJSUWWHQqlGsAM/jdBxlL1OklPva9KcyuN8dGbHA0I1y42R9J9kbX3uRPcOBk0FHqRDYvOj/t+sdWt15YOVzlcUp3QCZp7F9EAsdKKhuk=
Received: by 10.35.70.2 with SMTP id x2mr1494663pyk;
    Fri, 21 Jul 2006 09:27:31 -0700 (PDT)
Received: by 10.35.119.14 with HTTP; Fri, 21 Jul 2006 09:27:31 -0700 (PDT)
Message-ID: <2a41acea0607210927s108d1326qdad02b7d29376a09@mail.gmail.com>
Date: Fri, 21 Jul 2006 09:27:31 -0700
From: "Jack Vogel"
To: "Jeremie Le Hen"
In-Reply-To: <20060721123448.GV6253@obiwan.tataz.chchile.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <20060721123448.GV6253@obiwan.tataz.chchile.org>
Cc: freebsd-current@freebsd.org
Subject: Re: em(4) watchdog timeout
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 21 Jul 2006 16:27:32 -0000

On 7/21/06,
Jeremie Le Hen wrote:
> Hi,
>
> I am running a two-month-old current (dated from May 24), and I am
> experiencing watchdog timeouts with my em(4) adapter when running
> some CPU bound workload involving a computational perl script.
> Unfortunately this bug occurs very infrequently; I can't trigger
> it each time I run this job.
>
> FWIW, the command line is something like this:
> % gzip -dc data.gz | perlscript > chewed_data
>
> I recompiled em(4) with DEBUG_INIT, DEBUG_IOCTL and DEBUG_HW
> all set to 1, but it doesn't seem to provide valuable information:
>
> % Jul 21 11:17:14 neuneuf kernel: em0: watchdog timeout -- resetting
> % Jul 21 11:17:14 neuneuf kernel: em_init: begin
> % Jul 21 11:17:14 neuneuf kernel: em_stop: begin
> % Jul 21 11:17:14 neuneuf kernel: free_transmit_structures: begin
> % Jul 21 11:17:14 neuneuf kernel: free_receive_structures: begin
> % Jul 21 11:17:14 neuneuf kernel: em_init: pba=48K
> % Jul 21 11:17:14 neuneuf kernel: em_hardware_init: begin
> % Jul 21 11:17:14 neuneuf kernel: em_initialize_transmit_unit: begin
> % Jul 21 11:17:14 neuneuf kernel: Base = 1ebf9000, Length = 1000
> % Jul 21 11:17:14 neuneuf kernel:
> % Jul 21 11:17:14 neuneuf kernel: em_set_multi: begin
> % Jul 21 11:17:14 neuneuf kernel: em_initialize_receive_unit: begin
> % Jul 21 11:17:14 neuneuf kernel: em0: link state changed to DOWN
> % Jul 21 11:17:16 neuneuf kernel: em0: link state changed to UP
> % Jul 21 11:17:16 neuneuf kernel: ioctl rcv'd: SIOCxIFMEDIA (Get/Set Interface Media)
> % Jul 21 11:17:16 neuneuf kernel: em_media_status: begin
>
> The chip is:
> % em0@pci3:11:0: class=0x020000 card=0x02871014 chip=0x10138086 rev=0x00 hdr=0x00
> %     vendor   = 'Intel Corporation'
> %     device   = '82541EI Gigabit Ethernet Controller (Copper)'
> %     class    = network
> %     subclass = ethernet
>
> The interrupt is shared with uhci0:
> % neuneuf:/sys:112# vmstat -i
> % interrupt                   total       rate
> % irq1: atkbd0                39216          0
> % irq14: ata0               4801030          3
> % irq16: em0 uhci0++      919491852        688
> % irq19: uhci1                35141          0
> % irq23: ehci0                    1          0
> % cpu0: timer            2670435076       1999
> % Total                  3594802316       2692
>
> I can't try DEVICE_POLLING right now since IIRC I should recompile the whole
> kernel (right now I am using the if_em module so that I can tune the driver
> without rebooting).

Hitting the watchdog means you have a hang of some sort. Try
'sysctl dev.em.0.debug_info=1' and see if that gives any clues.

Jack
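
Since the problem shows up only occasionally, it may help to pull the
watchdog events out of the system log and see how long the link stays down
each time. What follows is only a rough sketch and not part of the em(4)
driver or any FreeBSD tool: the function name watchdog_outages is
hypothetical, the line format is assumed to match the syslog lines quoted
above, and the year has to be supplied because those timestamps do not
carry one.

```python
import re
from datetime import datetime

# Sample lines copied from the log quoted in this thread.
LOG = """\
Jul 21 11:17:14 neuneuf kernel: em0: watchdog timeout -- resetting
Jul 21 11:17:14 neuneuf kernel: em_init: begin
Jul 21 11:17:14 neuneuf kernel: em0: link state changed to DOWN
Jul 21 11:17:16 neuneuf kernel: em0: link state changed to UP
"""

# Assumed layout: "<Mon DD HH:MM:SS> <host> kernel: <name>: <message>"
LINE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: (\w+): (.*)$")


def watchdog_outages(text, year=2006):
    """Return (device, watchdog time, seconds until link UP) tuples.

    Hypothetical helper: pairs each "watchdog timeout" line with the
    next "link state changed to UP" line for the same device.
    """
    pending = {}   # device -> time of the watchdog timeout still open
    outages = []
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        stamp, dev, msg = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if msg.startswith("watchdog timeout"):
            pending[dev] = when
        elif msg == "link state changed to UP" and dev in pending:
            start = pending.pop(dev)
            outages.append((dev, start, (when - start).total_seconds()))
    return outages


if __name__ == "__main__":
    for dev, start, secs in watchdog_outages(LOG):
        print(f"{dev}: watchdog at {start:%H:%M:%S}, link up after {secs:.0f}s")
```

Feeding it a longer stretch of /var/log/messages instead of the sample
would show how often the timeouts happen, which could then be lined up
against the times the perl job was running.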