Date: Wed, 26 Apr 2017 16:50:04 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 218894] Network dropouts on em(4) due to jumbo cluster failures
Message-ID: <bug-218894-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218894

Bug ID: 218894
Summary: Network dropouts on em(4) due to jumbo cluster failures
Product: Base System
Version: 11.0-RELEASE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Many People
Priority: ---
Component: kern
Assignee: freebsd-bugs@FreeBSD.org
Reporter: mandrews@bit0.com

This is a hard one to reproduce on demand, unfortunately.

After about two weeks of uptime, several of our systems of varying vintage will lose most/all network connectivity for several minutes, then go back to working as normal without anything having been done.

When these drops happen, "netstat -i" shows a jump in Ierrs, and "netstat -m" shows a jump in "requests for jumbo clusters denied" for 9K. So something is causing jumbo allocations to fail, which in turn causes Ierrs, which in turn causes temporary loss of connectivity. When it starts happening, it usually happens once or twice a day, and gets worse over time until a reboot clears it up for a while.

Some Googling on this indicates that this might be a memory fragmentation issue, and that after some uptime, it might be hard for the kernel to find a contiguous block of memory larger than 4K (1 page), and that there might be a defragmentation process that isn't happening. I'm now unable to find the specific page that led me to that wild theory though; that was a few weeks ago. Oopsie.

We only run jumbo frames on one VLAN, and we use an MTU of 5000 instead of 9000 because we have some Supermicro PDSMi+-based systems we can't yet get rid of (grumble) that use 82573L NICs that have hardware bugs and once choked on anything bigger (and 82573E NICs that don't do jumbos at all). Whether that's contributing to the problem, in trying to allocate 4K+1K vs 4K+4K+1K, I'm not sure.

I'm also running with "-tso" because leaving TSO on causes problems with NFS stalls for us -- similar problems, but probably unrelated to this issue.
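
To make the 4K+1K vs 4K+4K+1K question concrete, here is a rough Python sketch of the arithmetic. It assumes (this is not confirmed from the em(4) source here) that the driver picks the smallest standard mbuf cluster size that holds a whole received frame, and that a cluster larger than one page needs that many physically contiguous 4K pages. The function names are made up for illustration; the cluster sizes are the standard FreeBSD ones.

```python
import math

PAGE_SIZE = 4096  # base page size on amd64

# Standard FreeBSD mbuf cluster sizes in bytes
# (MCLBYTES, MJUMPAGESIZE on amd64, MJUM9BYTES, MJUM16BYTES).
CLUSTER_SIZES = [("2k", 2048), ("4k", 4096), ("9k", 9216), ("16k", 16384)]

ETHER_HDR_LEN = 14   # Ethernet header
ETHER_CRC_LEN = 4    # frame check sequence
VLAN_TAG_LEN = 4     # 802.1Q tag


def max_frame_size(mtu, vlan=True):
    """Largest on-wire frame for a given MTU (hypothetical helper)."""
    return mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (VLAN_TAG_LEN if vlan else 0)


def smallest_cluster(frame_bytes):
    """ASSUMPTION: the driver uses the smallest cluster that fits the frame."""
    for name, size in CLUSTER_SIZES:
        if frame_bytes <= size:
            return name, size
    raise ValueError("frame larger than any cluster size")


def pages_needed(cluster_bytes):
    """Number of 4K pages one cluster of this size spans."""
    return math.ceil(cluster_bytes / PAGE_SIZE)


for mtu in (1500, 5000, 9000):
    name, size = smallest_cluster(max_frame_size(mtu))
    print(f"MTU {mtu}: {name} cluster, {pages_needed(size)} page(s)")
```

Under those assumptions an MTU of 5000 lands in the same 9K cluster class as an MTU of 9000, so it would not need any less contiguous memory, which would fit with the denied requests showing up in the 9K column.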

The affected systems have 82574L NICs and 82579LM NICs -- Supermicro X9SCM-F, X8STi-F, X8SIE-F, X8DT6-F.

The one igb-based (I350) system we have, a Supermicro X9DRD-7LN4F, doesn't seem to be affected by this issue at all.

This is 11.0-RELEASE, which has em driver 7.6.1 in it. I have tried the net/intel-em-kmod-7.6.2 port and it doesn't help.

Short of buying a crapload of igb or ixgbe cards and/or turning off jumbo frames, any ideas on how to troubleshoot and fix this before 11.1-RELEASE? Anything I can pull out of netstat, sysctl, etc?
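
One thing that could be collected between drops is a timestamped record of the "requests for jumbo clusters denied" counters. Below is a minimal Python sketch of a parser for that line, assuming the "netstat -m" output has the form "N/N/N requests for jumbo clusters denied (4k/9k/16k)". The function name and the sample text are made up for illustration; the numbers are not taken from an affected system.

```python
import re

# ASSUMED line format from "netstat -m":
#   <4k>/<9k>/<16k> requests for jumbo clusters denied (4k/9k/16k)
DENIED_RE = re.compile(
    r"^\s*(\d+)/(\d+)/(\d+) requests for jumbo clusters denied \(4k/9k/16k\)",
    re.MULTILINE,
)


def parse_jumbo_denied(netstat_m_output):
    """Return the denied-request counters per cluster size, or None if absent."""
    match = DENIED_RE.search(netstat_m_output)
    if match is None:
        return None
    counts = [int(group) for group in match.groups()]
    return dict(zip(("4k", "9k", "16k"), counts))


# Illustrative sample only (made-up numbers, not real output from the report).
SAMPLE = """\
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/37/0 requests for jumbo clusters denied (4k/9k/16k)
"""

print(parse_jumbo_denied(SAMPLE))
```

Feeding this the output of "netstat -m" from cron every minute and logging the result with a timestamp would show whether the 9K counter steps up exactly when the Ierrs and the dropouts occur.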