From owner-freebsd-bugs@FreeBSD.ORG Mon Oct 28 05:10:02 2013
Return-Path:
Delivered-To: freebsd-bugs@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id DE61FFD1
 for ; Mon, 28 Oct 2013 05:10:01 +0000 (UTC)
 (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id C183321C9
 for ; Mon, 28 Oct 2013 05:10:01 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r9S5A1Pl003389
 for ; Mon, 28 Oct 2013 05:10:01 GMT
 (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r9S5A1TI003387;
 Mon, 28 Oct 2013 05:10:01 GMT
 (envelope-from gnats)
Resent-Date: Mon, 28 Oct 2013 05:10:01 GMT
Resent-Message-Id: <201310280510.r9S5A1TI003387@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, David Gilbert
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id 2B728DA0
 for ; Mon, 28 Oct 2013 05:01:43 +0000 (UTC)
 (envelope-from root@virtual.accountingreality.com)
Received: from virtual.accountingreality.com (virtual.accountingreality.ca [66.96.20.52])
 by mx1.freebsd.org (Postfix) with ESMTP id 08D3A2189
 for ; Mon, 28 Oct 2013 05:01:42 +0000 (UTC)
Received: by virtual.accountingreality.com (Postfix, from userid 0)
 id 2221D5C036; Mon, 28 Oct 2013 00:54:29 -0400 (EDT)
Message-Id:
 <20131028045430.2221D5C036@virtual.accountingreality.com>
Date: Mon, 28 Oct 2013 00:54:29 -0400 (EDT)
From: David Gilbert
To: FreeBSD-gnats-submit@freebsd.org
X-Send-Pr-Version: 3.114
Subject: kern/183381: Use of 9k buffers in if_em.c hangs with resource starvation
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: David Gilbert
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 28 Oct 2013 05:10:02 -0000

>Number:         183381
>Category:       kern
>Synopsis:       Use of 9k buffers in if_em.c hangs with resource starvation
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Oct 28 05:10:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     David Gilbert
>Release:        FreeBSD 9.2-STABLE amd64
>Organization:
DaveG.ca
>Environment:
System: FreeBSD virtual.accountingreality.com 9.2-STABLE FreeBSD 9.2-STABLE #14 r256870M: Sat Oct 26 02:11:37 EDT 2013 root@virtual.accountingreality.com:/usr/obj/usr/src/sys/VRA amd64

As above, but I think this affects all current versions of the driver.
>Description:
I have a ZFS server here that also runs a number of other services and
has 9k packets turned on. It hangs every day or two with its current
load. I had a number of theories on the problem ... but it seems that
9k buffer allocation is a problem when other resources are stressed.

In the email list discussion, which I will pin to this PR after I submit
it, GAWollman noted that the 9k buffer allocation may not be required by
the code in if_em.c, and, indeed, when I removed it, it wasn't. Not only
was it not required, but removing it fixed the hangs.

I suppose the argument, then, is that the more efficient path is to use
page-sized buffers and scatter-gather, which apparently everything
supports (he is my only reference for this statement).
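To make the trade-off above concrete, here is a minimal userland sketch
of the receive buffer size choice with and without the 9k case. It is an
illustration only, not code from the driver. The function names and the
SKETCH_* constants are hypothetical; the constants stand in for MCLBYTES,
MJUMPAGESIZE and MJUM9BYTES, and their values (2048, 4096 and 9216) are
what I believe applies on amd64. The 9018-byte frame mentioned afterwards
assumes an MTU of 9000 plus Ethernet header and CRC.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel cluster sizes (assumed amd64 values). */
#define SKETCH_MCLBYTES     2048u        /* standard mbuf cluster */
#define SKETCH_MJUMPAGESIZE 4096u        /* page-sized jumbo cluster */
#define SKETCH_MJUM9BYTES   (9u * 1024u) /* 9k jumbo cluster */

/* Buffer size choice with the 9k case still present. */
static unsigned
rx_mbuf_sz_old(unsigned max_frame_size)
{
	if (max_frame_size <= 2048)
		return (SKETCH_MCLBYTES);
	else if (max_frame_size <= 4096)
		return (SKETCH_MJUMPAGESIZE);
	return (SKETCH_MJUM9BYTES);
}

/* Buffer size choice without the 9k case: never larger than one page. */
static unsigned
rx_mbuf_sz_new(unsigned max_frame_size)
{
	if (max_frame_size <= 2048)
		return (SKETCH_MCLBYTES);
	return (SKETCH_MJUMPAGESIZE);
}

/* Number of receive buffers one frame is scattered across (rounds up). */
static unsigned
rx_bufs_per_frame(unsigned frame_len, unsigned buf_sz)
{
	assert(buf_sz > 0);
	return ((frame_len + buf_sz - 1) / buf_sz);
}
```

Under these assumptions, a 9018-byte frame fits in a single 9k buffer in
the first variant, while the second variant spreads it across three
page-sized buffers, so the hardware needs scatter-gather but the
allocator never has to find a buffer larger than a page.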
>How-To-Repeat:
This might be challenging. The server has 8 GB of RAM, 17 disks in ZFS
and another 2 disks in UFS service. ZFS serves SMB, NFS (v3), and iSCSI
to a GigE LAN with 9k packets enabled. The system also runs (significant)
PostgreSQL, rtorrent and Apache loads. Of all of these, the rtorrent and
iSCSI loads seem to be most involved in replicating this problem.
>Fix:
Index: if_em.c
===================================================================
--- if_em.c	(revision 256870)
+++ if_em.c	(working copy)
@@ -1343,10 +1343,8 @@
 	*/
 	if (adapter->hw.mac.max_frame_size <= 2048)
 		adapter->rx_mbuf_sz = MCLBYTES;
-	else if (adapter->hw.mac.max_frame_size <= 4096)
+	else
 		adapter->rx_mbuf_sz = MJUMPAGESIZE;
-	else
-		adapter->rx_mbuf_sz = MJUM9BYTES;
 
 	/* Prepare receive descriptors and buffers */
 	if (em_setup_receive_structures(adapter)) {
>Release-Note:
>Audit-Trail:
>Unformatted: