Date: Mon, 21 Jan 2019 04:03:57 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Martin Birgmeier <d8zNeCFG@aon.at>
Cc: Eugene Grosbein <eugen@grosbein.net>, net@freebsd.org
Subject: Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior
Message-ID: <20190121014017.W945@besplex.bde.org>
In-Reply-To: <50a63079-4c2d-fc5c-47c5-1070b8fcd20c@aon.at>
References: <bug-235031-7501@https.bugs.freebsd.org/bugzilla/>
 <bug-235031-7501-goXNmp3zVl@https.bugs.freebsd.org/bugzilla/>
 <20190119204156.D929@besplex.bde.org>
 <3e407ee7-54e3-a6ac-5535-d11aceca9558@grosbein.net>
 <20190120061258.X3312@besplex.bde.org>
 <16ce1832-13da-d7bb-cce2-6682e058b5a6@aon.at>
 <20190120145627.X1077@besplex.bde.org>
 <fd67eca6-7c1d-687d-91ae-e09138732ed1@aon.at>
 <20190120231915.M2326@besplex.bde.org>
 <50a63079-4c2d-fc5c-47c5-1070b8fcd20c@aon.at>

On Sun, 20 Jan 2019, Martin Birgmeier wrote:

> The machine A with the em0 issue is running at 1 Gbps and acts as NFS
> server. The NFS client B has a 100 Mbps interface. B gets a throughput
> of only 1 Mbyte/s when talking to A but the full 10 Mbyte/s when talking
> to another third machine C. In addition, while B is talking to A, if at
> the same time A runs an iperf to C, the situation for B improves (up to
> 5..7 Mbyte/s).
>
> All machines are connected by a DGS-1210-24 1 Gbps switch.

I see.  I get worse misbehaviour (nfs write speed 24 KB/s for 512-blocks
instead of 1 MB/s) after changing the media of the bge NIC on my server
to 1000base full-duplex (where the switch is a cheap TP-Link 1 Gbps).
ping remains fast.  Concurrent ping doesn't improve nfs.

For the em0 NIC on my client, even the null change from autoselect to
1000baseT full-duplex often corrupts the NIC state so that even ping
doesn't work.  I got tired of that and fixed the missing stopping:

XX Index: iflib.c
XX ===================================================================
XX --- iflib.c  (revision 332488)
XX +++ iflib.c  (working copy)
XX @@ -2232,7 +2234,7 @@
XX
XX      CTX_LOCK(ctx);
XX      if ((err = IFDI_MEDIA_CHANGE(ctx)) == 0)
XX -            iflib_init_locked(ctx);
XX +            iflib_if_init_locked(ctx);
XX      CTX_UNLOCK(ctx);
XX      return (err);
XX  }

The fix works perfectly.  Now it is safe to change the media on the
client.  The null change from autoselect to 1000baseT full-duplex on the
client now doesn't corrupt the state or change the nfs or ping speeds.

Changing the media to 100baseTX full-duplex on the client gives much the
same misbehaviour as changing the media on the server similarly (not
quite so bad).  But changing the media to 100baseTX full-duplex on both
gives much worse behaviour.  Sometimes it causes the frame error reported
by my previous patch.

Clearly there is a protocol mismatch.  This problem occurs often.  I
don't know how it can occur when there is a switch.
The switch should translate to 1000 Mbps for the em0 side.

I don't really understand this, but have a lot of code in
mii/e1000phy.c related to it, and once tested this with all combinations
of speeds and duplexes.  e1000phy.c has nothing to do with Intel e1000,
but is for an old Marvell phy.  I have one on an sk NIC, and it stopped
working at 1 Gbps on cold days.  The simplest fix was to set the speed
manually, but this gave problems like the above, and gives an
unnecessarily low speed on warm days.  At least my version of e1000phy.c
or sk has some link flags which give more control over this.

Half-duplex on both sides works!  The old version of bge on the server
doesn't support mediaopt half-duplex, but seems to default to that and
ifconfig prints nothing for the duplex.  -current em0 supports it.
Working means that the nfs write speed is about 9 MB/s.  Half-duplex is
of course slightly slower than full-duplex.  Similarly for 10baseT/UTP.

I found my old tables of working combinations of duplexes and
autoselects for bge <-> switch <-> sk and bge <-> sk.  The switch
affects the working combinations.  The tables are cryptic, but seem to
be as follows:

switch case:

    bge  sk   success
    ---  --   -------
    A    A    n/a (handling of the sk bug gives a fuzzy auto speed)
    1    A    n/a
    1F   A    n/a
    1    1    OK (as above)
    1F   1    fail
    1    1F   OK! (1F -> 1)
    1F   1F   fail! (as above)
    A    1    OK (A -> 1)
    A    1F   OK (A -> 1F)

direct case:

    bge  sk   success
    ---  --   -------
    A    A    n/a (handling of the sk bug gives a fuzzy auto speed)
    1    A    OK (A -> 1)
    1F   A    partial success (giving half-duplex!?)
    1    1    OK (as above)
    1F   1    fail
    1    1F   fail (as expected, but different from switch case!)
    1F   1F   fail! (as above)
    A    1    OK (A -> 1)
    A    1F   OK (A -> 1F)

Here 1 means a speed of 1000 Mbps or possibly 100 Mbps, A means
autoselect, F means full duplex, and the absence of F means half-duplex
or nothing.

A for both should work and is normally used, and the only really weird
case is 1F for both not working.

> ...
> I have also discovered that there is net/intel-em-kmod.
> What is the relationship between the driver in the base sources and
> this one? How advisable is it to use the driver from ports?

I don't know about that.  I guess Intel still does some development,
especially for newer chipsets.

Bruce
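
The two tables above can be restated as data, which makes the odd cases easy to query. What follows is only a minimal sketch in C: the enum, array and function names are hypothetical (they come from no driver), and the entries simply transcribe the switch-case and direct-case rows as listed, with "1" standing for a fixed speed without the full-duplex option.

```c
#include <assert.h>

/* Media settings as abbreviated in the tables (hypothetical names):
 * A = autoselect, 1 = fixed speed without F, 1F = fixed speed, full-duplex. */
enum media  { M_AUTO, M_1, M_1F, M_COUNT };

/* Whether the two NICs are joined through the switch or directly. */
enum topo   { T_SWITCH, T_DIRECT, T_COUNT };

/* Outcomes as recorded in the tables. */
enum result { R_NA, R_OK, R_PARTIAL, R_FAIL };

/* results[topology][bge setting][sk setting], transcribed row by row. */
static const enum result results[T_COUNT][M_COUNT][M_COUNT] = {
    [T_SWITCH] = {
        [M_AUTO] = { [M_AUTO] = R_NA, [M_1] = R_OK,   [M_1F] = R_OK   },
        [M_1]    = { [M_AUTO] = R_NA, [M_1] = R_OK,   [M_1F] = R_OK   },
        [M_1F]   = { [M_AUTO] = R_NA, [M_1] = R_FAIL, [M_1F] = R_FAIL },
    },
    [T_DIRECT] = {
        [M_AUTO] = { [M_AUTO] = R_NA,      [M_1] = R_OK,   [M_1F] = R_OK   },
        [M_1]    = { [M_AUTO] = R_OK,      [M_1] = R_OK,   [M_1F] = R_FAIL },
        [M_1F]   = { [M_AUTO] = R_PARTIAL, [M_1] = R_FAIL, [M_1F] = R_FAIL },
    },
};

/* Look up the recorded outcome for one combination. */
enum result lookup(enum topo t, enum media bge, enum media sk)
{
    assert(t < T_COUNT && bge < M_COUNT && sk < M_COUNT);
    return results[t][bge][sk];
}

/* Count the (bge, sk) pairs whose outcome differs between the
 * switch case and the direct case. */
int count_topology_differences(void)
{
    int n = 0;

    for (int bge = 0; bge < M_COUNT; bge++)
        for (int sk = 0; sk < M_COUNT; sk++)
            if (results[T_SWITCH][bge][sk] != results[T_DIRECT][bge][sk])
                n++;
    return n;
}
```

For instance, lookup(T_SWITCH, M_1, M_1F) gives R_OK while lookup(T_DIRECT, M_1, M_1F) gives R_FAIL, which is the row flagged as different from the switch case; 1F on both sides is R_FAIL in either topology.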