Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 20 Jan 2019 23:56:28 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Martin Birgmeier <d8zNeCFG@aon.at>
Cc:        Bruce Evans <brde@optusnet.com.au>, Eugene Grosbein <eugen@grosbein.net>,  net@freebsd.org
Subject:   Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior
Message-ID:  <20190120231915.M2326@besplex.bde.org>
In-Reply-To: <fd67eca6-7c1d-687d-91ae-e09138732ed1@aon.at>
References:  <bug-235031-7501@https.bugs.freebsd.org/bugzilla/> <bug-235031-7501-goXNmp3zVl@https.bugs.freebsd.org/bugzilla/> <20190119204156.D929@besplex.bde.org> <3e407ee7-54e3-a6ac-5535-d11aceca9558@grosbein.net> <20190120061258.X3312@besplex.bde.org> <16ce1832-13da-d7bb-cce2-6682e058b5a6@aon.at> <20190120145627.X1077@besplex.bde.org> <fd67eca6-7c1d-687d-91ae-e09138732ed1@aon.at>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 20 Jan 2019, Martin Birgmeier wrote:

> Regarding duplex, ifconfig shows the following:
>
> [0]# ifconfig em0
> em0: flags=3D8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 15=
00
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0
> options=3D81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WO=
L_MAGIC,VLAN_HWFILTER>
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ether f0:de:f1:98:86:a9
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 inet 192.168.1.19 netmask 0xff=
ffff00 broadcast 192.168.1.255
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 inet6 fe80::f2de:f1ff:fe98:86a=
9%em0 prefixlen 64 scopeid 0x1
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 inet6 fec0:0:0:4d42::13 prefix=
len 64
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 inet6 fec0::4d42:f2de:f1ff:fe9=
8:86a9 prefixlen 64 autoconf
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 inet6 2002:bc17:f381:4d42:f2de=
:f1ff:fe98:86a9 prefixlen 64 autoconf
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 media: Ethernet autoselect (10=
00baseT <full-duplex>)
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 status: active
> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 nd6 options=3D23<PERFORMNUD,AC=
CEPT_RTADV,AUTO_LINKLOCAL>
> [0]#
>
> This seems to be o.k.

The media setting can't be trusted to have reached the hardware -- see my
previous reply.

But I thought that you said that you were using 100 Mbps (presumably with
autoselect).  The above shos autoselect giving 1 Gbps.

I checked that iflib_media_change() is not called for autoselect to 1 Gbps
here.  Also that it fails to stop the NIC if called.  Also that it breaks
the NIC's state after a few calls in the loop:

 =09while :; do
 =09=09./ifconfig em0 media 1000baseT mediaopt full-duplex
 =09=09./ifconfig em0 media autoselect
 =09done

provided ./ifconfig is on nfs.  This gives null changes disguised as
non-null changes so that iflib_media_change() is called.

Console output for this:

XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX em0: TX(0) desc avail =3D 21, pidx =3D 34

Sometimes the queue indexes are corrupted and this messages is printed.
Sometimes, but never in this output, this message is repeated many times
before the interface comes back up.  Actually, this doesn't always
occur between down and up, and when it is repeaded the queue state is
avail =3D 1024, pidx =3D 0, and this state seems to be sticky unless ifconf=
ig
somehow runs to generate another reinitialization.

XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX em0: TX(0) desc avail =3D 1, pidx =3D 30
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX em0: TX(0) desc avail =3D 14, pidx =3D 33
XX Link state changed to up

ipv4 ping is broken most of the time while this loop is running.  Of course
ping should stop responding while the interface is down.  It rarely starts
when the interface comes back up.  Sometimes it starts with low latency,
but usually it starts with DUPs.  For about 50 iterations, the only ping
output was:

XX 64 bytes from 192.168.2.8: icmp_seq=3D619 ttl=3D64 time=3D0.158 ms
XX 64 bytes from 192.168.2.8: icmp_seq=3D619 ttl=3D64 time=3D3523.305 ms (D=
UP!)
XX 64 bytes from 192.168.2.8: icmp_seq=3D619 ttl=3D64 time=3D6696.247 ms (D=
UP!)
XX 64 bytes from 192.168.2.8: icmp_seq=3D619 ttl=3D64 time=3D9857.912 ms (D=
UP!)
XX 64 bytes from 192.168.2.8: icmp_seq=3D728 ttl=3D64 time=3D0.094 ms
XX 64 bytes from 192.168.2.8: icmp_seq=3D728 ttl=3D64 time=3D4154.124 ms (D=
UP!)
XX 64 bytes from 192.168.2.8: icmp_seq=3D728 ttl=3D64 time=3D7253.986 ms (D=
UP!)
XX 64 bytes from 192.168.2.8: icmp_seq=3D728 ttl=3D64 time=3D10367.938 ms (=
DUP!)
XX 64 bytes from 192.168.2.8: icmp_seq=3D728 ttl=3D64 time=3D13540.805 ms (=
DUP!)

Bruce
From owner-freebsd-net@freebsd.org  Sun Jan 20 14:20:08 2019
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 459C51491616
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Sun, 20 Jan 2019 14:20:08 +0000 (UTC) (envelope-from d8zNeCFG@aon.at)
Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org
 [IPv6:2001:1900:2254:206a::50:5])
 by mx1.freebsd.org (Postfix) with ESMTP id B357968278
 for <freebsd-net@freebsd.org>; Sun, 20 Jan 2019 14:20:07 +0000 (UTC)
 (envelope-from d8zNeCFG@aon.at)
Received: by mailman.ysv.freebsd.org (Postfix)
 id 76D551491615; Sun, 20 Jan 2019 14:20:07 +0000 (UTC)
Delivered-To: net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 544FC1491613
 for <net@mailman.ysv.freebsd.org>; Sun, 20 Jan 2019 14:20:07 +0000 (UTC)
 (envelope-from d8zNeCFG@aon.at)
Received: from smtpout-fallback.aon.at (smtpout-fallback.aon.at [195.3.96.120])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id AA20668273
 for <net@freebsd.org>; Sun, 20 Jan 2019 14:20:06 +0000 (UTC)
 (envelope-from d8zNeCFG@aon.at)
Received: (qmail 24851 invoked from network); 20 Jan 2019 14:20:04 -0000
Received: from unknown (HELO smtpout.aon.at) ([172.18.1.196])
 (envelope-sender <d8zNeCFG@aon.at>)
 by fallback44.highway.telekom.at (qmail-ldap-1.03) with SMTP
 for <net@freebsd.org>; 20 Jan 2019 14:20:04 -0000
X-A1Mail-Track-Id: 1547994004:24850:fallback44:172.18.1.196:1
Received: (qmail 3648 invoked from network); 20 Jan 2019 14:19:57 -0000
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 WARSBL504.highway.telekom.at
X-Spam-Level: 
Received: from 188-23-243-129.adsl.highway.telekom.at (HELO gandalf.xyzzy)
 ([188.23.243.129]) (envelope-sender <d8zNeCFG@aon.at>)
 by smarthub76.res.a1.net (qmail-ldap-1.03) with DHE-RSA-AES256-GCM-SHA384
 encrypted SMTP; 20 Jan 2019 14:19:56 -0000
X-A1Mail-Track-Id: 1547993996:3635:smarthub76:188.23.243.129:1
Received: from mizar.xyzzy (mizar.xyzzy [192.168.1.19])
 by gandalf.xyzzy (8.15.2/8.15.2) with ESMTP id x0KEJsuo014479;
 Sun, 20 Jan 2019 15:19:54 +0100 (CET) (envelope-from d8zNeCFG@aon.at)
Subject: Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior
To: Bruce Evans <brde@optusnet.com.au>
References: <bug-235031-7501@https.bugs.freebsd.org/bugzilla/>
 <bug-235031-7501-goXNmp3zVl@https.bugs.freebsd.org/bugzilla/>
 <20190119204156.D929@besplex.bde.org>
 <3e407ee7-54e3-a6ac-5535-d11aceca9558@grosbein.net>
 <20190120061258.X3312@besplex.bde.org>
 <16ce1832-13da-d7bb-cce2-6682e058b5a6@aon.at>
 <20190120145627.X1077@besplex.bde.org>
 <fd67eca6-7c1d-687d-91ae-e09138732ed1@aon.at>
 <20190120231915.M2326@besplex.bde.org>
Cc: Eugene Grosbein <eugen@grosbein.net>, net@freebsd.org
From: Martin Birgmeier <d8zNeCFG@aon.at>
Message-ID: <50a63079-4c2d-fc5c-47c5-1070b8fcd20c@aon.at>
Date: Sun, 20 Jan 2019 15:19:54 +0100
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101
 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <20190120231915.M2326@besplex.bde.org>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Content-Language: en-US
X-Rspamd-Queue-Id: AA20668273
X-Spamd-Bar: ------
Authentication-Results: mx1.freebsd.org
X-Spamd-Result: default: False [-6.96 / 15.00];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 NEURAL_HAM_SHORT(-0.96)[-0.961,0]; REPLY(-4.00)[];
 NEURAL_HAM_LONG(-1.00)[-1.000,0]
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>;
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 20 Jan 2019 14:20:08 -0000

Hi Bruce,

Thank you for your support.

The machine A with the em0 issue is running at 1 Gbps and acts as NFS
server. The NFS client B has a 100 Mbps interface. B gets a throughput
of only 1 Mbyte/s when talking to A but the full 10 Mbyte/s when talking
to another third machine C. In addition, while B is talking to A, if at
the same time A runs an iperf to C, the situation for B improves (up to
5..7 Mbyte/s).

All machines are connected by a DGS-1210-24 1 Gbps switch.

In the mailing list and FreeBSD bugs I have seen that there are a
multitude of issues with the em driver in FreeBSD 12. It seems that the
switch to iflib has introduced them.

I have also discovered that there is net/intel-em-kmod. What is the
relationship between the driver in the base sources and this one? How
advisable is it to use the driver from ports?

-- Martin

On 20.01.19 13:56, Bruce Evans wrote:
> On Sun, 20 Jan 2019, Martin Birgmeier wrote:
>
>> Regarding duplex, ifconfig shows the following:
>>
>> [0]# ifconfig em0
>> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu
>> 1500
>>        
>> options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
>>
>>         ether f0:de:f1:98:86:a9
>>         inet 192.168.1.19 netmask 0xffffff00 broadcast 192.168.1.255
>>         inet6 fe80::f2de:f1ff:fe98:86a9%em0 prefixlen 64 scopeid 0x1
>>         inet6 fec0:0:0:4d42::13 prefixlen 64
>>         inet6 fec0::4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf
>>         inet6 2002:bc17:f381:4d42:f2de:f1ff:fe98:86a9 prefixlen 64
>> autoconf
>>         media: Ethernet autoselect (1000baseT <full-duplex>)
>>         status: active
>>         nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
>> [0]#
>>
>> This seems to be o.k.
>
> The media setting can't be trusted to have reached the hardware -- see my
> previous reply.
>
> But I thought that you said that you were using 100 Mbps (presumably with
> autoselect).  The above shos autoselect giving 1 Gbps.
>
> I checked that iflib_media_change() is not called for autoselect to 1
> Gbps
> here.  Also that it fails to stop the NIC if called.  Also that it breaks
> the NIC's state after a few calls in the loop:
>
>     while :; do
>         ./ifconfig em0 media 1000baseT mediaopt full-duplex
>         ./ifconfig em0 media autoselect
>     done
>
> provided ./ifconfig is on nfs.  This gives null changes disguised as
> non-null changes so that iflib_media_change() is called.
>
> Console output for this:
>
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX em0: TX(0) desc avail = 21, pidx = 34
>
> Sometimes the queue indexes are corrupted and this messages is printed.
> Sometimes, but never in this output, this message is repeated many times
> before the interface comes back up.  Actually, this doesn't always
> occur between down and up, and when it is repeaded the queue state is
> avail = 1024, pidx = 0, and this state seems to be sticky unless ifconfig
> somehow runs to generate another reinitialization.
>
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX em0: TX(0) desc avail = 1, pidx = 30
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX em0: TX(0) desc avail = 14, pidx = 33
> XX Link state changed to up
>
> ipv4 ping is broken most of the time while this loop is running.  Of
> course
> ping should stop responding while the interface is down.  It rarely
> starts
> when the interface comes back up.  Sometimes it starts with low latency,
> but usually it starts with DUPs.  For about 50 iterations, the only ping
> output was:
>
> XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=0.158 ms
> XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=3523.305 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=6696.247 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=9857.912 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=0.094 ms
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=4154.124 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=7253.986 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=10367.938 ms
> (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=13540.805 ms
> (DUP!)
>
> Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190120231915.M2326>