Date:      Thu, 20 Mar 2014 09:26:38 -0400
From:      Garrett Wollman <wollman@bimajority.org>
To:        freebsd-net@freebsd.org, freebsd-stable@freebsd.org
Cc:        jackv@freebsd.org
Subject:   Network stack returning EFBIG?
Message-ID:  <21290.60558.750106.630804@hergotha.csail.mit.edu>

I recently put a new server running 9.2 (with local patches for NFS)
into production, and it's immediately started to fail in an odd way.
Since I pounded this server pretty heavily and never saw the error in
testing, I'm more than a little bit taken aback.  We have identical
hardware in production with 9.1, and I have the same kernel running
just peachy on a machine with Chelsio T4 NICs.  The problem machine has
ixgbe(4):

ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15> port 0x9c00-0x9c1f mem 0xdef80000-0xdeffffff,0xdef7c000-0xdef7ffff irq 24 at device 0.0 on pci2
ix0: Using MSIX interrupts with 7 vectors
ix0: Ethernet address: 04:7d:7b:a5:87:32
ix0: PCI Express Bus: Speed 5.0GT/s Width x4
ix1: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15> port 0x9880-0x989f mem 0xdee80000-0xdeefffff,0xdee7c000-0xdee7ffff irq 34 at device 0.1 on pci2
ix1: Using MSIX interrupts with 7 vectors
ix1: Ethernet address: 04:7d:7b:a5:87:33
ix1: PCI Express Bus: Speed 5.0GT/s Width x4

(pciconf tells me these are "82599EB 10-Gigabit SFI/SFP+ Network
Connection".  It's a bug that the driver doesn't tell me that.)

These are glued together in a lagg(4) using LACP.

Since we put this server into production, random network system calls
have started failing with [EFBIG] or maybe sometimes [EIO].  I've
observed this with a simple ping, but various daemons also log the
errors:

Mar 20 09:22:04 nfs-prod-4 sshd[42487]: fatal: Write failed: File too large [preauth]
Mar 20 09:23:44 nfs-prod-4 nrpe[42492]: Error: Could not complete SSL handshake. 5

The machine eventually becomes unreachable and has to be rebooted from
the console.

So, can anyone tell me how this is possible, and what changed between
9.1 and 9.2 to cause it?

-GAWollman


