From owner-freebsd-net@freebsd.org Sat Jan 19 20:39:58 2019
Subject: Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior
To: Bruce Evans, Eugene Grosbein
Cc: net@freebsd.org
From: Martin Birgmeier
Date: Sat, 19 Jan 2019 21:39:31 +0100
Message-ID: <16ce1832-13da-d7bb-cce2-6682e058b5a6@aon.at>
In-Reply-To: <20190120061258.X3312@besplex.bde.org>
References: <20190119204156.D929@besplex.bde.org>
 <3e407ee7-54e3-a6ac-5535-d11aceca9558@grosbein.net>
 <20190120061258.X3312@besplex.bde.org>
List-Id: Networking and TCP/IP with FreeBSD

I just tried the patch by Bruce (from the mail sent 10 hours ago), but it
makes no difference. Also, it does not seem like bad frames or too high an
interrupt rate are the problem (the machine should easily handle what is
coming from its NFS client, which only has a 100 Mbps interface).

I believe that the simplifications introduced to sys/dev/e1000 between 11.2
and 12.0 have broken something.

-- Martin

On 19.01.19 21:06, Bruce Evans wrote:
> On Sun, 20 Jan 2019, Eugene Grosbein wrote:
>
>> 19.01.2019 17:21, Bruce Evans wrote:
>>
>>> Your problem looks more like lost interrupts.
>>> All em NICs should interrupt
>>> at the default interrupt moderation rate of 8 kHz under load.  Once there
>>> are that many interrupts, there is not much else that can go wrong (nfs
>>> would have to be working to generate that many interrupts).
>>
>> I have a patch (in production since 8.x) that makes em(4) support
>> hw.em.max_interrupt_rate
>> just like igb(4) supports hw.igb.max_interrupt_rate:
>>
>> http://www.grosbein.net/freebsd/patches/em_sysctl-11.0.diff.gz
>>
>> It also brings in sysctls dev.em.X.max_interrupt_rate, and
>> hw.em.max_interrupt_rate sets defaults for them.
>
> This is inverted and spelled dev.em.X.itr for em.
>
> Hmm, em already has this, but it is only a read-only tunable.
>
> igb seems to have gone away.  In FreeBSD-11, its dev.em.X.max_interrupt_rate
> is also only a tunable.
>
> I use variants of the following fix for itr in FreeBSD-[7-13]:
>
> XX Index: if_em.c
> XX ===================================================================
> XX --- if_em.c    (revision 332488)
> XX +++ if_em.c    (working copy)
> XX @@ -908,10 +910,10 @@
> XX          E1000_REGISTER(hw, E1000_TADV),
> XX          em_tx_abs_int_delay_dflt);
> XX      em_add_int_delay_sysctl(adapter, "itr",
> XX -        "interrupt delay limit in usecs/4",
> XX +        "interrupt delay limit in usecs",
> XX          &adapter->tx_itr,
> XX          E1000_REGISTER(hw, E1000_ITR),
> XX -        DEFAULT_ITR);
> XX +        1000000 / MAX_INTS_PER_SEC);
> XX
> XX      hw->mac.autoneg = DO_AUTO_NEG;
> XX      hw->phy.autoneg_wait_to_complete = FALSE;
>
> This fixes the description and the initial value for the sysctl to match
> the code.  The description almost matches the buggy initial value.  The
> hardware has power of 2 units, but the code scales to microseconds.  Except
> the initial value was in hardware units scaled by another power of 2
> which made them nearly microseconds/4.
> The code sets the initial value to
> a representation of 125 usec (8 kHz), but the sysctl says that the initial
> value is 488 and the description says that this is a representation of
> 488/4 = 122 usec.  However, writing back this value using sysctl gives
> 488 usec (~2 kHz).  The magic number 122 is 125 mis-scaled by 1000/1024.
>
> FreeBSD[7-10] have lem in a separate file with the bug duplicated, so
> need the patch duplicated.  FreeBSD[7-8] don't have a sysctl for this.
> They default to 125 usec and there is no way to see or change the value.
> I usually want the smaller value of 0, and hard-code this when there is
> no sysctl.
>
> DEFAULT_ITR is used mainly to obfuscate this.  IGB_DEFAULT_ITR and
> IGB_LINK_ITR are also defined, but are not used even in versions of FreeBSD
> that have igb.
>
>> I use hw.em.max_interrupt_rate=32000 for 1GB link passing average
>> sized packets (about 600 bytes per packet on average) but driver's
>> default 8000 should be nearly fine for full size packets (1500 or
>> above) and this 8000 limit cannot be the reason for such low throughput.
>
> 0 for itr maxes out at about 100 kHz here.  This is good for low latency
> with small packets.
>
> My version of bge dynamically modifies the rate to match the rx load (no
> moderation for light loads).  tx is handled specially and only needs 1
> interrupt every few seconds for freeing resources.
>
> Bruce
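
The itr arithmetic in the quoted message (125 usec, 488, 122, ~2 kHz) can be
reproduced with a few integer divisions. The sketch below is only an
illustration, not driver code: the function names are made up for this
example, MAX_INTS_PER_SEC is taken as the 8 kHz default mentioned in the
thread, and the 256 ns hardware unit is an assumption chosen because it is a
power of 2 close to usecs/4.

```c
#include <stdio.h>

/* Assumed values: 8 kHz is the default rate quoted in the thread;
 * 256 ns as the hardware ITR unit is an assumption for this sketch. */
#define MAX_INTS_PER_SEC 8000u
#define ITR_UNIT_NS      256u

/* Interrupt interval in microseconds for a given rate (hypothetical helper). */
unsigned itr_usecs_from_rate(unsigned ints_per_sec)
{
	return 1000000u / ints_per_sec;
}

/* The same interval expressed in assumed 256 ns hardware units. */
unsigned itr_hw_units_from_rate(unsigned ints_per_sec)
{
	return 1000000000u / (ints_per_sec * ITR_UNIT_NS);
}

/* Interrupt rate that results if a value is interpreted as microseconds. */
unsigned rate_from_usecs(unsigned usecs)
{
	return 1000000u / usecs;
}

/* Print the numbers discussed in the thread. */
void itr_report(void)
{
	unsigned usecs = itr_usecs_from_rate(MAX_INTS_PER_SEC);
	unsigned hw = itr_hw_units_from_rate(MAX_INTS_PER_SEC);

	printf("intended interval:          %u usec\n", usecs);
	printf("initial value shown:        %u\n", hw);
	printf("read as usecs/4:            %u usec\n", hw / 4);
	printf("125 scaled by 1000/1024:    %u\n", usecs * 1000u / 1024u);
	printf("rate if written back as-is: %u Hz\n", rate_from_usecs(hw));
}
```

Calling itr_report() from a small main() prints the intended interval next
to the mis-scaled figures, which makes it easier to see why writing the
displayed value back through sysctl would drop the rate to roughly 2 kHz.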