Date:      Tue, 23 Feb 2010 15:26:43 -0700
From:      "Kirk Davis" <kirk.davis@epsb.ca>
To:        "Jack Vogel" <jfvogel@gmail.com>
Cc:        freebsd-net@freebsd.org, voovoos-fnet@killfile.pl, Mike Tancsa <mike@sentex.net>
Subject:   RE: Intel em0: watchdog timeout
Message-ID:  <529374128DC1B04D9D037911B8E8F05301C17A5F@Exchange26.EDU.epsb.ca>
In-Reply-To: <2a41acea1002221629vbe7548am7b5f1ba94d7efa9f@mail.gmail.com>
References:  <529374128DC1B04D9D037911B8E8F05301C17A51@Exchange26.EDU.epsb.ca> <43416_1266864062_4B82CFBE_43416_81_1_2a41acea1002221043k1b8742c9m8fb484a8e8a4fdda@mail.gmail.com> <529374128DC1B04D9D037911B8E8F05301C17A54@Exchange26.EDU.epsb.ca> <43669_1266865888_4B82D6E0_43669_263_1_2a41acea1002221113v26804200q4f3971c3359dffab@mail.gmail.com> <529374128DC1B04D9D037911B8E8F05301C17A55@Exchange26.EDU.epsb.ca> <201002222107.o1ML7v3Z059734@lava.sentex.ca> <529374128DC1B04D9D037911B8E8F05301C17A56@Exchange26.EDU.epsb.ca> <2a41acea1002221444o6e449602m1830761b21837c41@mail.gmail.com> <529374128DC1B04D9D037911B8E8F05301C17A57@Exchange26.EDU.epsb.ca> <2a41acea1002221629vbe7548am7b5f1ba94d7efa9f@mail.gmail.com>

Hi,
    Looks like I may have tracked down this problem.

    I noticed that fastforwarding (net.inet.ip.fastforwarding=1) was
turned on.  I turned it off to see if that was causing the problem.
Sure enough, 5 hours later there have been no watchdog timeouts.  This is
still running on FreeBSD 7.1 (I'm still planning to move to 7.2 soon).  I
read up on the net.inet.ip.fastforwarding sysctl and it doesn't look like it
should cause any problems with the Intel NIC driver.  This may need to
be looked at and tested by someone more knowledgeable about the
networking code than I am.
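
For anyone wanting to repeat the test, here is a minimal sketch of keeping fastforwarding off across reboots. It assumes a stock FreeBSD 7.x layout where boot-time sysctls are read from /etc/sysctl.conf. The helper name set_sysctl_conf is made up for this example and is not a system tool.

```shell
#!/bin/sh
# set_sysctl_conf FILE KEY VALUE  (hypothetical helper, not part of FreeBSD)
# Rewrite FILE so that it contains exactly one "KEY=VALUE" line:
# existing assignments of KEY are dropped, every other line is kept.
set_sysctl_conf() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp "${TMPDIR:-/tmp}/sysctlconf.XXXXXX") || return 1
    if [ -f "$file" ]; then
        # Compare the text left of "=" literally, so the dots in the
        # sysctl name are not treated as regex wildcards.
        awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name) } name != k' \
            "$file" > "$tmp"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

At runtime the change itself is `sysctl net.inet.ip.fastforwarding=0` as root. Running `set_sysctl_conf /etc/sysctl.conf net.inet.ip.fastforwarding 0` afterwards should keep it off after the next reboot.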

    Thanks to Jack and Mike for your help.

---- Kirk
Kirk Davis
Senior Network Analyst, ITS
Edmonton Public Schools
One Kingsway Ave.
Edmonton, Alberta, Canada
T5H 4G9


________________________________

	From: Jack Vogel [mailto:jfvogel@gmail.com]
	Sent: Monday, February 22, 2010 5:30 PM
	To: Kirk Davis
	Cc: Mike Tancsa; freebsd-net@freebsd.org
	Subject: Re: Intel em0: watchdog timeout

	Is your driver static, i.e. built into the kernel, or do you
	load/unload it as a module?
	I ask because perhaps we could try a later driver, and being a
	module makes that easier.

	Jack

	On Mon, Feb 22, 2010 at 3:37 PM, Kirk Davis <kirk.davis@epsb.ca> wrote:


		OK.  I have the following in /boot/loader.conf (and rebooted):
		hw.em.rxd=1024
		hw.em.txd=1024

		Should this be hw.em2.rxd?  Is it set per interface or
		across all interfaces?

		nmbcluster=262144

		# sysctl dev.em.2.stats=1
		Feb 22 16:29:57 inet-gw kernel: em2: Defer count = 20
		Feb 22 16:29:57 inet-gw kernel: em2: Missed Packets = 119947
		Feb 22 16:29:57 inet-gw kernel: em2: Receive No Buffers = 276762
		Feb 22 16:29:57 inet-gw kernel: em2: Receive Length Errors = 0
		Feb 22 16:29:57 inet-gw kernel: em2: Receive errors = 0
		Feb 22 16:29:57 inet-gw kernel: em2: Crc errors = 0
		Feb 22 16:29:57 inet-gw kernel: em2: Alignment errors = 0
		Feb 22 16:29:57 inet-gw kernel: em2: Collision/Carrier extension errors = 0
		Feb 22 16:29:57 inet-gw kernel: em2: RX overruns = 21
		Feb 22 16:29:57 inet-gw kernel: em2: watchdog timeouts = 47
		Feb 22 16:29:57 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
		Feb 22 16:29:57 inet-gw kernel: em2: XON Rcvd = 22
		Feb 22 16:29:57 inet-gw kernel: em2: XON Xmtd = 8349
		Feb 22 16:29:57 inet-gw kernel: em2: XOFF Rcvd = 31
		Feb 22 16:29:57 inet-gw kernel: em2: XOFF Xmtd = 15779
		Feb 22 16:29:57 inet-gw kernel: em2: Good Packets Rcvd = 966101852
		Feb 22 16:29:57 inet-gw kernel: em2: Good Packets Xmtd = 755993237
		Feb 22 16:29:57 inet-gw kernel: em2: TSO Contexts Xmtd = 0
		Feb 22 16:29:57 inet-gw kernel: em2: TSO Contexts Failed = 0

		Still seeing the watchdog timeout and link up/down messages.

		Should I try going higher than 1024 on hw.em.rxd?
		I'm not sure when I can next schedule a reboot on this
		production server.
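
To see which em2 counters are still climbing, this dump can be compared with the 13:39:12 one quoted further down. Below is a rough sketch of that arithmetic, assuming the stats lines sit unwrapped in a log file such as /var/log/messages and that the counters were not reset (driver reload or similar) between the two dumps. The function names em_counter and em_delta are invented for this example.

```shell
#!/bin/sh
# em_counter FILE IFACE NAME  (hypothetical helper)
# Print every logged value of counter NAME for interface IFACE,
# one per line, in the order the lines appear in FILE.
em_counter() {
    awk -v tag="kernel: $2: $3 = " '
        { i = index($0, tag) }
        i > 0 {
            v = substr($0, i + length(tag))
            sub(/[^0-9].*$/, "", v)   # keep only the leading digits
            print v
        }
    ' "$1"
}

# em_delta FILE IFACE NAME  (hypothetical helper)
# Print the last logged value minus the first logged value.
em_delta() {
    em_counter "$1" "$2" "$3" |
        awk 'NR == 1 { first = $1 } { last = $1 } END { if (NR) print last - first }'
}
```

With the numbers in this thread, `em_delta /var/log/messages em2 "watchdog timeouts"` gives 47 - 38 = 9 and `em_delta /var/log/messages em2 "Receive No Buffers"` gives 276762 - 275612 = 1150, so both counters kept rising after the ring size change.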

		---- Kirk

		Kirk Davis
		Senior Network Analyst, ITS
		Edmonton Public Schools
		One Kingsway Ave.
		Edmonton, Alberta, Canada
		T5H 4G9
		phone: 1-780-429-8308



________________________________

			From: Jack Vogel [mailto:jfvogel@gmail.com]
			Sent: Monday, February 22, 2010 3:45 PM
			To: Kirk Davis
			Cc: Mike Tancsa; freebsd-net@freebsd.org
			Subject: Re: Intel em0: watchdog timeout

			OK, so you are still failing to get mbufs on the RX side.
			Increase the nmbcluster value. Also, what size is your RX ring
			(number of RX descriptors)?

			If you haven't already done so, change that to 1024.

			I am developing a change in the RX code right now that will help
			this situation, but I am doing so in the 10G driver. Once it's
			solid there I will backport it into the 1G drivers; it will make
			discards almost unnecessary.

			Jack

			On Mon, Feb 22, 2010 at 1:43 PM, Kirk Davis <kirk.davis@epsb.ca> wrote:



				> -----Original Message-----
				> From: Mike Tancsa [mailto:mike@sentex.net]
				> Subject: Re: Intel em0: watchdog timeout
				>
				> At 03:46 PM 2/22/2010, Kirk Davis wrote:
				> >Does this need to be done in loader.conf?  It doesn't seem
				> >to take from the command line.
				> ># sysctl dev.em.2.stats=1
				> >dev.em.2.stats: -1 -> -1
				> >
				> ># sysctl dev.em.2.stats
				> >dev.em.2.stats: -1
				>
				> Hi,
				>          After you issue those commands, the driver will spit out a
				> lot of useful stats to syslog. It will report something like the
				> following in /var/log/messages
				>
				> Feb 22 16:06:31 offsite kernel: em0: Excessive collisions = 0
				> Feb 22 16:06:31 offsite kernel: em0: Sequence errors = 0
				> Feb 22 16:06:31 offsite kernel: em0: Defer count = 0
				> Feb 22 16:06:31 offsite kernel: em0: Missed Packets = 0
				> Feb 22 16:06:31 offsite kernel: em0: Receive No Buffers = 0
				> Feb 22 16:06:31 offsite kernel: em0: Receive Length Errors = 0
				> Feb 22 16:06:31 offsite kernel: em0: Receive errors = 0
				> Feb 22 16:06:31 offsite kernel: em0: Crc errors = 0
				> Feb 22 16:06:31 offsite kernel: em0: Alignment errors = 0
				> Feb 22 16:06:31 offsite kernel: em0: Collision/Carrier extension errors = 0
				> Feb 22 16:06:31 offsite kernel: em0: RX overruns = 0
				> Feb 22 16:06:31 offsite kernel: em0: watchdog timeouts = 0
				> Feb 22 16:06:31 offsite kernel: em0: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
				> Feb 22 16:06:31 offsite kernel: em0: XON Rcvd = 0
				> Feb 22 16:06:31 offsite kernel: em0: XON Xmtd = 0
				> Feb 22 16:06:31 offsite kernel: em0: XOFF Rcvd = 0
				> Feb 22 16:06:31 offsite kernel: em0: XOFF Xmtd = 0
				> Feb 22 16:06:31 offsite kernel: em0: Good Packets Rcvd = 2559032551
				> Feb 22 16:06:31 offsite kernel: em0: Good Packets Xmtd = 1568751141
				> Feb 22 16:06:31 offsite kernel: em0: TSO Contexts Xmtd = 0
				> Feb 22 16:06:31 offsite kernel: em0: TSO Contexts Failed = 0

				Thanks Mike and Jack.  I don't know why I didn't notice the output in
				/var/log/messages.

				Here is the output for the two interfaces that are causing this issue.

				Feb 22 13:33:52 inet-gw kernel: em0: Excessive collisions = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Sequence errors = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Defer count = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Missed Packets = 24296
				Feb 22 13:33:52 inet-gw kernel: em0: Receive No Buffers = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Receive Length Errors = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Receive errors = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Crc errors = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Alignment errors = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Collision/Carrier extension errors = 0
				Feb 22 13:33:52 inet-gw kernel: em0: RX overruns = 0
				Feb 22 13:33:52 inet-gw kernel: em0: watchdog timeouts = 6
				Feb 22 13:33:52 inet-gw kernel: em0: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
				Feb 22 13:33:52 inet-gw kernel: em0: XON Rcvd = 0
				Feb 22 13:33:52 inet-gw kernel: em0: XON Xmtd = 0
				Feb 22 13:33:52 inet-gw kernel: em0: XOFF Rcvd = 0
				Feb 22 13:33:52 inet-gw kernel: em0: XOFF Xmtd = 0
				Feb 22 13:33:52 inet-gw kernel: em0: Good Packets Rcvd = 424303810
				Feb 22 13:33:52 inet-gw kernel: em0: Good Packets Xmtd = 576529136
				Feb 22 13:33:52 inet-gw kernel: em0: TSO Contexts Xmtd = 0
				Feb 22 13:33:52 inet-gw kernel: em0: TSO Contexts Failed = 0
				Feb 22 13:34:12 inet-gw kernel: em2: Excessive collisions = 0
				Feb 22 13:34:12 inet-gw kernel: em2: Sequence errors = 0
				Feb 22 13:34:12 inet-gw kernel: em2: Defer count = 20
				Feb 22 13:34:12 inet-gw kernel: em2: Missed Packets = 68059
				Feb 22 13:34:12 inet-gw kernel: em2: Receive No Buffers = 275612
				Feb 22 13:34:12 inet-gw kernel: em2: Receive Length Errors = 0
				Feb 22 13:34:12 inet-gw kernel: em2: Receive errors = 0
				Feb 22 13:34:12 inet-gw kernel: em2: Crc errors = 0
				Feb 22 13:34:12 inet-gw kernel: em2: Alignment errors = 0
				Feb 22 13:34:12 inet-gw kernel: em2: Collision/Carrier extension errors = 0
				Feb 22 13:34:12 inet-gw kernel: em2: RX overruns = 17
				Feb 22 13:34:12 inet-gw kernel: em2: watchdog timeouts = 38
				Feb 22 13:34:12 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
				Feb 22 13:34:12 inet-gw kernel: em2: XON Rcvd = 21
				Feb 22 13:34:12 inet-gw kernel: em2: XON Xmtd = 8344
				Feb 22 13:34:12 inet-gw kernel: em2: XOFF Rcvd = 30
				Feb 22 13:34:12 inet-gw kernel: em2: XOFF Xmtd = 9159
				Feb 22 13:34:12 inet-gw kernel: em2: Good Packets Rcvd = 713607509
				Feb 22 13:34:12 inet-gw kernel: em2: Good Packets Xmtd = 569694020
				Feb 22 13:34:12 inet-gw kernel: em2: TSO Contexts Xmtd = 0
				Feb 22 13:34:12 inet-gw kernel: em2: TSO Contexts Failed = 0
				Feb 22 13:35:10 inet-gw kernel: em2: Excessive collisions = 0
				Feb 22 13:35:10 inet-gw kernel: em2: Sequence errors = 0
				Feb 22 13:35:10 inet-gw kernel: em2: Defer count = 20
				Feb 22 13:35:10 inet-gw kernel: em2: Missed Packets = 68059
				Feb 22 13:35:10 inet-gw kernel: em2: Receive No Buffers = 275612
				Feb 22 13:35:10 inet-gw kernel: em2: Receive Length Errors = 0
				Feb 22 13:35:10 inet-gw kernel: em2: Receive errors = 0
				Feb 22 13:35:10 inet-gw kernel: em2: Crc errors = 0
				Feb 22 13:35:10 inet-gw kernel: em2: Alignment errors = 0
				Feb 22 13:35:10 inet-gw kernel: em2: Collision/Carrier extension errors = 0
				Feb 22 13:35:10 inet-gw kernel: em2: RX overruns = 17
				Feb 22 13:35:10 inet-gw kernel: em2: watchdog timeouts = 38
				Feb 22 13:35:10 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
				Feb 22 13:35:10 inet-gw kernel: em2: XON Rcvd = 21
				Feb 22 13:35:10 inet-gw kernel: em2: XON Xmtd = 8344
				Feb 22 13:35:10 inet-gw kernel: em2: XOFF Rcvd = 30
				Feb 22 13:35:10 inet-gw kernel: em2: XOFF Xmtd = 9159
				Feb 22 13:35:10 inet-gw kernel: em2: Good Packets Rcvd = 715555016
				Feb 22 13:35:10 inet-gw kernel: em2: Good Packets Xmtd = 571157561
				Feb 22 13:35:10 inet-gw kernel: em2: TSO Contexts Xmtd = 0
				Feb 22 13:35:10 inet-gw kernel: em2: TSO Contexts Failed = 0
				Feb 22 13:39:12 inet-gw kernel: em2: Excessive collisions = 0
				Feb 22 13:39:12 inet-gw kernel: em2: Sequence errors = 0
				Feb 22 13:39:12 inet-gw kernel: em2: Defer count = 20
				Feb 22 13:39:12 inet-gw kernel: em2: Missed Packets = 68059
				Feb 22 13:39:12 inet-gw kernel: em2: Receive No Buffers = 275612
				Feb 22 13:39:12 inet-gw kernel: em2: Receive Length Errors = 0
				Feb 22 13:39:12 inet-gw kernel: em2: Receive errors = 0
				Feb 22 13:39:12 inet-gw kernel: em2: Crc errors = 0
				Feb 22 13:39:12 inet-gw kernel: em2: Alignment errors = 0
				Feb 22 13:39:12 inet-gw kernel: em2: Collision/Carrier extension errors = 0
				Feb 22 13:39:12 inet-gw kernel: em2: RX overruns = 17
				Feb 22 13:39:12 inet-gw kernel: em2: watchdog timeouts = 38
				Feb 22 13:39:12 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
				Feb 22 13:39:12 inet-gw kernel: em2: XON Rcvd = 21
				Feb 22 13:39:12 inet-gw kernel: em2: XON Xmtd = 8344
				Feb 22 13:39:12 inet-gw kernel: em2: XOFF Rcvd = 30
				Feb 22 13:39:12 inet-gw kernel: em2: XOFF Xmtd = 9159
				Feb 22 13:39:12 inet-gw kernel: em2: Good Packets Rcvd = 723521981
				Feb 22 13:39:12 inet-gw kernel: em2: Good Packets Xmtd = 577211431
				Feb 22 13:39:12 inet-gw kernel: em2: TSO Contexts Xmtd = 0
				Feb 22 13:39:12 inet-gw kernel: em2: TSO Contexts Failed = 0

				Can this be the problem? "Receive No Buffers = 275612"

				---- Kirk
				Kirk Davis
				Senior Network Analyst, ITS
				Edmonton Public Schools
				One Kingsway Ave.
				Edmonton, Alberta, Canada
				T5H 4G9
				phone: 1-780-429-8308





