From: wollman@bimajority.org
Date: Thu, 20 Mar 2014 09:51:42 -0400 (EDT)
To: freebsd-net@freebsd.org, freebsd-stable@freebsd.org
Cc: jfv@freebsd.org
Subject: Re: Network stack returning EFBIG?
Message-Id: <201403201351.s2KDpghe080116@hergotha.csail.mit.edu>
In-Reply-To: <21290.60558.750106.630804@hergotha.csail.mit.edu>

In article <21290.60558.750106.630804@hergotha.csail.mit.edu>, I wrote:

>Since we put this server into production, random network system calls
>have started failing with [EFBIG] or maybe sometimes [EIO]. I've
>observed this with a simple ping, but various daemons also log the
>errors:
>Mar 20 09:22:04 nfs-prod-4 sshd[42487]: fatal: Write failed: File too
>large [preauth]
>Mar 20 09:23:44 nfs-prod-4 nrpe[42492]: Error: Could not complete SSL
>handshake.

I found at least one call stack where this happens and it does get
returned all the way to userspace:

 17  15547 _bus_dmamap_load_buffer:return
              kernel`_bus_dmamap_load_mbuf_sg+0x5f
              kernel`bus_dmamap_load_mbuf_sg+0x38
              kernel`ixgbe_xmit+0xcf
              kernel`ixgbe_mq_start_locked+0x94
              kernel`ixgbe_mq_start+0x12a
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17   8863 _bus_dmamap_load_mbuf_sg:return
              kernel`bus_dmamap_load_mbuf_sg+0x38
              kernel`ixgbe_xmit+0xcf
              kernel`ixgbe_mq_start_locked+0x94
              kernel`ixgbe_mq_start+0x12a
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  25315 bus_dmamap_load_mbuf_sg:return
              kernel`ixgbe_xmit+0xcf
              kernel`ixgbe_mq_start_locked+0x94
              kernel`ixgbe_mq_start+0x12a
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  15547 _bus_dmamap_load_buffer:return
              kernel`_bus_dmamap_load_mbuf_sg+0x5f
              kernel`bus_dmamap_load_mbuf_sg+0x38
              kernel`ixgbe_xmit+0xcf
              kernel`ixgbe_mq_start_locked+0x94
              kernel`ixgbe_mq_start+0x12a
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17   8863 _bus_dmamap_load_mbuf_sg:return
              kernel`bus_dmamap_load_mbuf_sg+0x38
              kernel`ixgbe_xmit+0xcf
              kernel`ixgbe_mq_start_locked+0x94
              kernel`ixgbe_mq_start+0x12a
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  25315 bus_dmamap_load_mbuf_sg:return
              kernel`ixgbe_xmit+0xcf
              kernel`ixgbe_mq_start_locked+0x94
              kernel`ixgbe_mq_start+0x12a
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17   4206 ixgbe_xmit:return
              kernel`ixgbe_mq_start_locked+0x94
              kernel`ixgbe_mq_start+0x12a
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17   4208 ixgbe_mq_start_locked:return
              kernel`ixgbe_mq_start+0x12a
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17   4212 ixgbe_mq_start:return
              if_lagg.ko`lagg_transmit+0xc4
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  36017 lagg_transmit:return
              kernel`ether_output_frame+0x33
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  23948 ether_output_frame:return
              kernel`ether_output+0x4fe
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  18849 ether_output:return
              kernel`ip_output+0xd74
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  30895 ip_output:return
              kernel`tcp_output+0xfea
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  20356 tcp_output:return
              kernel`tcp_usr_send+0x325
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  10923 tcp_usr_send:return
              kernel`sosend_generic+0x3f6
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  19509 sosend_generic:return
              kernel`soo_write+0x5e
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  26794 soo_write:return
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17   9141 dofilewrite:return
              kernel`kern_writev+0x6c
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  25665 kern_writev:return
              kernel`sys_write+0x64
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

 17  24390 sys_write:return
              kernel`amd64_syscall+0x5ea
              kernel`0xffffffff808443c7

The MTU here is 9120, and the ixgbe driver has one local modification,
to prevent it from using large contiguous mbufs in its receive queue:

Index: ixgbe.c
===================================================================
--- ixgbe.c	(revision 261091)
+++ ixgbe.c	(working copy)
@@ -1117,12 +1117,8 @@
 	*/
 	if (adapter->max_frame_size <= 2048)
 		adapter->rx_mbuf_sz = MCLBYTES;
-	else if (adapter->max_frame_size <= 4096)
+	else
 		adapter->rx_mbuf_sz = MJUMPAGESIZE;
-	else if (adapter->max_frame_size <= 9216)
-		adapter->rx_mbuf_sz = MJUM9BYTES;
-	else
-		adapter->rx_mbuf_sz = MJUM16BYTES;
 
 	/* Prepare receive descriptors and buffers */
 	if (ixgbe_setup_receive_structures(adapter)) {

-GAWollman
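
For anyone who wants to see what the local change does to the receive
buffer size at this MTU, here is a rough userland sketch of the two
selection rules from the diff above. It is not driver code. The SK_*
constants are my stand-ins for the kernel macros and assume 4 KB pages
(amd64), and frame_size_for_mtu() is a hypothetical helper that assumes
max_frame_size is the MTU plus a 14-byte Ethernet header and a 4-byte
CRC.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cluster sizes, mirroring FreeBSD/amd64 with 4 KB pages. */
enum {
	SK_MCLBYTES     = 2048,       /* standard mbuf cluster */
	SK_MJUMPAGESIZE = 4096,       /* page-sized jumbo cluster */
	SK_MJUM9BYTES   = 9 * 1024,   /* 9 KB jumbo cluster */
	SK_MJUM16BYTES  = 16 * 1024   /* 16 KB jumbo cluster */
};

/* Hypothetical helper: MTU + Ethernet header (14) + CRC (4). */
size_t
frame_size_for_mtu(size_t mtu)
{
	return (mtu + 14 + 4);
}

/* Selection before the change (the '-' side of the diff). */
size_t
rx_mbuf_sz_stock(size_t max_frame_size)
{
	if (max_frame_size <= 2048)
		return (SK_MCLBYTES);
	else if (max_frame_size <= 4096)
		return (SK_MJUMPAGESIZE);
	else if (max_frame_size <= 9216)
		return (SK_MJUM9BYTES);
	else
		return (SK_MJUM16BYTES);
}

/* Selection after the change: never larger than one page. */
size_t
rx_mbuf_sz_patched(size_t max_frame_size)
{
	if (max_frame_size <= 2048)
		return (SK_MCLBYTES);
	else
		return (SK_MJUMPAGESIZE);
}
```

Under those assumptions, an MTU of 9120 gives a frame size of 9138, so
the unmodified rule would pick the 9 KB cluster and the modified rule
picks the page-sized one. Standard-size frames get MCLBYTES either way.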