Date:      Wed, 28 Sep 2005 18:52:48 +0100
From:      Chris Howells <howells@kde.org>
To:        freebsd-net@freebsd.org
Subject:   Re: em(4) receive part wedging randomly at moderate load
Message-ID:  <200509281852.53335.howells@kde.org>
In-Reply-To: <20050926142907.GI91328@cell.sick.ru>
References:  <20050926142907.GI91328@cell.sick.ru>

On Monday 26 September 2005 15:29, Gleb Smirnoff wrote:
>   during last month we are experiencing a nasty problem with em(4)
> driver. Several times a day the receive path of the driver wedges
> for a minute or two. During wedge the transmit part works with
> no problems. The latter fact makes this problem very nasty, because
> the problematic router can't be backed up with help of CARP.

This sounds very much like the problem I've been having. It affects two
machines: one runs 5.4-STABLE and the other runs 4.11-STABLE. Both are
Duron 1800s on the Asus A7V8X motherboard.

The card in the 4.11 machine is:

em0: <Intel(R) PRO/1000 Network Connection, Version - 2.1.7> port 0xb000-0xb03f mem 0xf4800000-0xf481ffff,0xf5000000-0xf501ffff irq 11 at device 13.0 on pci0

em0@pci0:13:0:  class=0x020000 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82540EM Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet


The card in the 5.4 machine is:

em0: <Intel(R) PRO/1000 Network Connection, Version - 2.1.7> port 0x6400-0x643f mem 0xf0000000-0xf001ffff irq 3 at device 19.0 on pci0

em0@pci0:19:0:  class=0x020000 card=0x10028086 chip=0x10268086 rev=0x04 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82545GM Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet

> The box is serving 8 - 15 kpps, 70 - 100 MBps. It runs stateful pf(4)
> firewall, with 50k - 80k states. The IP fastforwarding is enabled. The
> average state insert/removal ratio is 300 states per second, however
> sometimes several thousands of states can be removed in one pass. The
> state removal locks the network code for quite a long time, so I guess
> that wedge happens exactly when a lot of states are removed. The NIC
> interrupts aren't serviced for some time and it wedges.

It happens for me with no pf, serving a single client with Samba at a much
lower load -- only a few tens of KB a second.

> The NIC is plugged in Cisco Catalyst 6509 gigabit ethernet port. No
> errors are counted on switch port.

Mine is a simple unmanaged SMC 5 port GigE switch.

> To workaround the problem, I have made the following patch:

Interesting, I'll give that a go....

<snip>

> I am asking developers, who work in Intel, to pay attention to this
> problem.

Have you tried the em driver directly from Intel? It can be found on the Intel
web site. A few people on freebsd-stable are claiming that it works
perfectly.

I have noticed that having something like this in sysctl.conf reduces how
often it happens:

kern.ipc.somaxconn=1024
net.inet.udp.recvspace=65536
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536

Though sadly it still does happen...
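
For anyone who wants to try the same values without rebooting, here is a rough sketch that turns sysctl.conf-style lines into sysctl(8) command lines. It only prints the commands (a dry run); the function name sysctl_commands and the SETTINGS constant are made up for illustration and are not part of any FreeBSD tool. The values are just the four listed above, not a recommendation.

```python
# Dry-run helper: turn sysctl.conf-style lines into sysctl(8) command lines.
# sysctl_commands and SETTINGS are illustrative names, not part of FreeBSD.

SETTINGS = """\
kern.ipc.somaxconn=1024
net.inet.udp.recvspace=65536
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536
"""


def sysctl_commands(conf_text):
    """Return one ['sysctl', 'name=value'] argv list per setting line."""
    commands = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank and comment-only lines
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a name=value line: {raw!r}")
        commands.append(["sysctl", f"{name.strip()}={value.strip()}"])
    return commands


if __name__ == "__main__":
    for argv in sysctl_commands(SETTINGS):
        print(" ".join(argv))  # print only; run the output as root to apply
```

Running the printed lines as root should apply the values to the running system; keeping them in sysctl.conf is still what makes them persist across reboots.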

-- 
Cheers, Chris Howells -- chris@chrishowells.co.uk, howells@kde.org
Web: http://www.chrishowells.co.uk, PGP ID: 0x33795A2C
KDE/Qt/C++/PHP Developer: http://www.kde.org

