Date:      Thu, 20 Mar 2014 15:32:02 +0200
From:      Daniel Braniss <danny@cs.huji.ac.il>
To:        Garrett Wollman <wollman@bimajority.org>
Cc:        freebsd-net@freebsd.org, freebsd-stable@freebsd.org, jackv@freebsd.org
Subject:   Re: Network stack returning EFBIG?
Message-ID:  <868FFD0A-106E-4C5E-A61C-10C3895C3281@cs.huji.ac.il>
In-Reply-To: <21290.60558.750106.630804@hergotha.csail.mit.edu>
References:  <21290.60558.750106.630804@hergotha.csail.mit.edu>

turn off TSO

The problems sound similar to the one I reported a while back. Turning
off TSO fixed it.
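
For reference, a minimal sketch of how TSO is usually turned off at run time on FreeBSD. It assumes the interfaces are ix0 and ix1, as in the dmesg output quoted below; the tso_off helper and the APPLY variable are made up for illustration, and by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: prints the commands unless APPLY=1 is set
# (applying them needs root on the FreeBSD host).
# tso_off and APPLY are hypothetical names, not part of any tool.
tso_off() {
    for ifn in "$@"; do
        if [ "${APPLY:-0}" = 1 ]; then
            # "-tso" disables TSO for both IPv4 and IPv6 on the interface
            ifconfig "$ifn" -tso
        else
            echo "ifconfig $ifn -tso"
        fi
    done
}

# ix0 and ix1 are the ixgbe(4) ports named in the quoted dmesg output
tso_off ix0 ix1
```

The change does not survive a reboot unless "-tso" is also added to the interfaces' ifconfig settings in /etc/rc.conf. Setting the net.inet.tcp.tso sysctl to 0 is another option if TSO should be off for every interface.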

danny

On Mar 20, 2014, at 3:26 PM, Garrett Wollman <wollman@bimajority.org> wrote:

> I recently put a new server running 9.2 (with local patches for NFS)
> into production, and it's immediately started to fail in an odd way.
> Since I pounded this server pretty heavily and never saw the error in
> testing, I'm more than a little bit taken aback.  We have identical
> hardware in production with 9.1, and I have the same kernel running
> just peachy on a machine with Chelsio T4 NICs.  The problem machine has
> ixgbe(4):
>
> ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15> port 0x9c00-0x9c1f mem 0xdef80000-0xdeffffff,0xdef7c000-0xdef7ffff irq 24 at device 0.0 on pci2
> ix0: Using MSIX interrupts with 7 vectors
> ix0: Ethernet address: 04:7d:7b:a5:87:32
> ix0: PCI Express Bus: Speed 5.0GT/s Width x4
> ix1: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15> port 0x9880-0x989f mem 0xdee80000-0xdeefffff,0xdee7c000-0xdee7ffff irq 34 at device 0.1 on pci2
> ix1: Using MSIX interrupts with 7 vectors
> ix1: Ethernet address: 04:7d:7b:a5:87:33
> ix1: PCI Express Bus: Speed 5.0GT/s Width x4
>
> (pciconf tells me these are "82599EB 10-Gigabit SFI/SFP+ Network
> Connection".  It's a bug that the driver doesn't tell me that.)
>
> These are glued together in a lagg(4) using LACP.
>
> Since we put this server into production, random network system calls
> have started failing with [EFBIG] or maybe sometimes [EIO].  I've
> observed this with a simple ping, but various daemons also log the
> errors:
> Mar 20 09:22:04 nfs-prod-4 sshd[42487]: fatal: Write failed: File too large [preauth]
> Mar 20 09:23:44 nfs-prod-4 nrpe[42492]: Error: Could not complete SSL handshake. 5
>
> The machine eventually becomes unreachable and has to be rebooted from
> the console.
>
> So, can anyone tell me how this is possible, and what changed between
> 9.1 and 9.2 to cause it?
>
> -GAWollman
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"



