Date:      Fri, 21 Jul 2006 09:27:31 -0700
From:      "Jack Vogel" <jfvogel@gmail.com>
To:        "Jeremie Le Hen" <jeremie@le-hen.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: em(4) watchdog timeout
Message-ID:  <2a41acea0607210927s108d1326qdad02b7d29376a09@mail.gmail.com>
In-Reply-To: <20060721123448.GV6253@obiwan.tataz.chchile.org>
References:  <20060721123448.GV6253@obiwan.tataz.chchile.org>

On 7/21/06, Jeremie Le Hen <jeremie@le-hen.org> wrote:
> Hi,
>
> I am running a two-month-old -CURRENT (dated May 24), and I am
> experiencing watchdog timeouts with my em(4) adapter when running
> a CPU-bound workload involving a computational Perl script.
> Unfortunately this bug occurs very infrequently; I can't trigger
> it each time I run this job.
>
> FWIW, the command line is something like this:
> %   gzip -dc data.gz | perlscript > chewed_data
>
> I recompiled em(4) with DEBUG_INIT, DEBUG_IOCTL and DEBUG_HW
> all set to 1, but it doesn't seem to provide any valuable information:
>
> % Jul 21 11:17:14 neuneuf kernel: em0: watchdog timeout -- resetting
> % Jul 21 11:17:14 neuneuf kernel: em_init: begin
> % Jul 21 11:17:14 neuneuf kernel: em_stop: begin
> % Jul 21 11:17:14 neuneuf kernel: free_transmit_structures: begin
> % Jul 21 11:17:14 neuneuf kernel: free_receive_structures: begin
> % Jul 21 11:17:14 neuneuf kernel: em_init: pba=48K
> % Jul 21 11:17:14 neuneuf kernel: em_hardware_init: begin
> % Jul 21 11:17:14 neuneuf kernel: em_initialize_transmit_unit: begin
> % Jul 21 11:17:14 neuneuf kernel: Base = 1ebf9000, Length = 1000
> % Jul 21 11:17:14 neuneuf kernel:
> % Jul 21 11:17:14 neuneuf kernel: em_set_multi: begin
> % Jul 21 11:17:14 neuneuf kernel: em_initialize_receive_unit: begin
> % Jul 21 11:17:14 neuneuf kernel: em0: link state changed to DOWN
> % Jul 21 11:17:16 neuneuf kernel: em0: link state changed to UP
> % Jul 21 11:17:16 neuneuf kernel: ioctl rcv'd: SIOCxIFMEDIA (Get/Set Interface Media)
> % Jul 21 11:17:16 neuneuf kernel: em_media_status: begin
>
> The chip is:
> % em0@pci3:11:0:  class=0x020000 card=0x02871014 chip=0x10138086 rev=0x00 hdr=0x00
> %     vendor   = 'Intel Corporation'
> %     device   = '82541EI Gigabit Ethernet Controller (Copper)'
> %     class    = network
> %     subclass = ethernet
>
> The interrupt is shared with uhci0:
> % neuneuf:/sys:112# vmstat -i
> % interrupt                          total       rate
> % irq1: atkbd0                       39216          0
> % irq14: ata0                      4801030          3
> % irq16: em0 uhci0++             919491852        688
> % irq19: uhci1                       35141          0
> % irq23: ehci0                           1          0
> % cpu0: timer                   2670435076       1999
> % Total                         3594802316       2692
>
> I can't try DEVICE_POLLING right now since, IIRC, I would have to recompile
> the whole kernel (right now I am using the if_em module so that I can tune
> the driver without rebooting).

Hitting the watchdog means you have a hang of some sort.
Try 'sysctl dev.em.0.debug_info=1' and see if that gives any clues.
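
A minimal sketch of that step as a small sh helper, for anyone who wants to repeat it right after a timeout. The function names (em_debug_oid, em_debug_dump) and the 40-line tail are made up for illustration. It also assumes the driver prints its debug dump to the kernel message buffer, so that dmesg shows it.

```shell
#!/bin/sh
# Hypothetical helpers, not part of em(4) or the base system.

# Build the sysctl OID name for a given em(4) unit number (default: 0).
em_debug_oid() {
    printf 'dev.em.%s.debug_info\n' "${1:-0}"
}

# Ask the driver to dump its debug state, then show the most recent
# kernel messages. Assumes the dump ends up in the message buffer.
# Needs root, since it writes a sysctl.
em_debug_dump() {
    oid=$(em_debug_oid "$1")
    sysctl "${oid}=1" || return 1
    dmesg | tail -n 40
}
```

Running `em_debug_dump 0` as root just after an "em0: watchdog timeout" message should keep the dump close to the event in the log.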

Jack


