From: Julien Cigar
Date: Tue, 3 Jan 2017 17:21:21 +0100
To: Meny Yossefi
Cc: "freebsd-net@freebsd.org", Ben RUBSON, Yuval Bason, Hans Petter Selasky
Subject: Re: FW: iSCSI failing, MLX rx_ring errors ?
Message-ID: <20170103162120.GX15696@mordor.lan>

Isn't this the same issue as PR 211990? Can you try turning off jumbo frames?

On Tue, Jan 03, 2017 at 06:27:15AM +0000, Meny Yossefi wrote:
>
> ________________________________________
> From: owner-freebsd-net@freebsd.org On Behalf Of Ben RUBSON
> Sent: Monday, January 2, 2017 11:09:15 AM (UTC+00:00) Monrovia, Reykjavik
> To: freebsd-net@freebsd.org
> Cc: Meny Yossefi; Yuval Bason; Hans Petter Selasky
> Subject: Re: iSCSI failing, MLX rx_ring errors ?
>
> Hi Meny,
>
> Thank you very much for your feedback.
>
> I think you are right, this could be a mbufs issue.
> Here are some more numbers :
>
> # vmstat -z | grep -v "0, 0$"
> ITEM              SIZE   LIMIT    USED    FREE         REQ      FAIL  SLEEP
> 4 Bucket:           32,      0,   2673,  28327,   88449799,    17317,     0
> 8 Bucket:           64,      0,    449,  15609,   13926386,     4871,     0
> 12 Bucket:          96,      0,    335,   5323,   10293892,   142872,     0
> 16 Bucket:         128,      0,    533,   6070,    7618615,   472647,     0
> 32 Bucket:         256,      0,   8317,  22133,   36020376,   563479,     0
> 64 Bucket:         512,      0,   1238,   3298,   20138111, 11430742,     0
> 128 Bucket:       1024,      0,   1865,   2963,   21162182,   158752,     0
> 256 Bucket:       2048,      0,   1626,    450,   80253784,  4890164,     0
> mbuf_jumbo_9k:    9216, 603712,  16400,   8744, 4128521064,     2661,     0
>
> # netstat -m
> 32801/18814/51615 mbufs in use (current/cache/total)
> 16400/9810/26210/4075058 mbuf clusters in use (current/cache/total/max)
> 16400/9659 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/8647/8647/2037529 4k (page size) jumbo clusters in use (current/cache/total/max)
> 16400/8744/25144/603712 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/339588 16k jumbo clusters in use (current/cache/total/max)
> 188600K/137607K/326207K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 0/2661/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 sendfile syscalls
> 0 sendfile syscalls completed without I/O request
> 0 requests for I/O initiated by sendfile
> 0 pages read by sendfile as part of a request
> 0 pages were valid at time of a sendfile request
> 0 pages were requested for read ahead by applications
> 0 pages were read ahead by sendfile
> 0 times sendfile encountered an already busy page
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
>
> I did not perform any mbufs tuning, numbers above are from FreeBSD itself.
>
> This server has 64GB of memory.
> It has a ZFS pool for which I limit ARC memory impact with :
> vfs.zfs.arc_max=64424509440 #60G
>
> The only thing I did is some TCP tuning to improve throughput over high-latency long-distance private links :
> kern.ipc.maxsockbuf=7372800
> net.inet.tcp.sendbuf_max=6553600
> net.inet.tcp.recvbuf_max=6553600
> net.inet.tcp.sendspace=65536
> net.inet.tcp.recvspace=65536
> net.inet.tcp.sendbuf_inc=65536
> net.inet.tcp.recvbuf_inc=65536
> net.inet.tcp.cc.algorithm=htcp
>
> Here are some graphs of memory & ARC usage when issue occurs.
> Crosshair (vertical red line) is at the timestamp where I get iSCSI disconnections.
> https://postimg.org/gallery/1kkekrc4e/
> What is strange is that each time issue occurs there is around 1GB of free memory.
> So FreeBSD should still be able to allocate some more mbufs ?
> Unfortunately I do not have graphs about mbufs.
>
> What should I ideally do ?
>
> >> Have you tried increasing the mbufs limit?
> (sysctl) kern.ipc.nmbufs (Maximum number of mbufs allowed)
>
>
> Thank you again,
>
> Best regards,
>
> Ben
>
>
>
> > On 01 Jan 2017, at 09:16, Meny Yossefi wrote:
> >
> > Hi Ben,
> >
> > Those are not HW errors, note that:
> >
> > hw.mlxen1.stat.rx_dropped: 0
> > hw.mlxen1.stat.rx_errors: 0
> >
> > It seems to be triggered when you are failing to allocate a replacement buffer.
> > Any chance you ran out of mbufs in the system?
> >
> > en_rx.c:
> >
> > mlx4_en_process_rx_cq():
> >
> >         mb = mlx4_en_rx_mb(priv, rx_desc, mb_list, length);
> >         if (!mb) {
> >                 ring->errors++;
> >                 goto next;
> >         }
> >
> > mlx4_en_rx_mb() -> mlx4_en_complete_rx_desc():
> >
> >         /* Allocate a replacement page */
> >         if (mlx4_en_alloc_buf(priv, rx_desc, mb_list, nr))
> >                 goto fail;
> >
> > -Meny
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.
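
The counters that point to an mbuf shortage in this thread are the FAIL column of `vmstat -z` and the "requests for jumbo clusters denied" line of `netstat -m`. Ben mentions having no graphs of mbuf usage, so one option is to capture that output periodically and extract the counters. The following is only a minimal sketch in Python, assuming the output layout is exactly the one quoted in this thread (seven comma-separated columns after the zone name); the function names `jumbo_denied` and `zones_with_failures` are hypothetical and not part of FreeBSD.

```python
import re


def jumbo_denied(netstat_m_text):
    """Return denied jumbo-cluster requests keyed by cluster size.

    Looks for a line of the form
    'A/B/C requests for jumbo clusters denied (4k/9k/16k)'.
    Returns None when no such line is present.
    """
    pattern = re.compile(r"(\d+)/(\d+)/(\d+) requests for jumbo clusters denied")
    match = pattern.search(netstat_m_text)
    if match is None:
        return None
    return dict(zip(("4k", "9k", "16k"), map(int, match.groups())))


def zones_with_failures(vmstat_z_text):
    """Return {zone name: FAIL count} for zones whose FAIL column is nonzero.

    Assumes the layout 'NAME: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP'
    as quoted in this thread; other layouts are skipped, not guessed at.
    """
    failures = {}
    for line in vmstat_z_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header line or blank line, no zone name
        fields = [field.strip() for field in rest.split(",")]
        if len(fields) != 7 or not all(field.isdigit() for field in fields):
            continue  # not the assumed seven-column layout
        fail_count = int(fields[5])
        if fail_count > 0:
            failures[name.strip()] = fail_count
    return failures
```

Fed the figures quoted above, the 9k entry from `netstat -m` and the FAIL count of the `mbuf_jumbo_9k` zone are both 2661, which is consistent with the allocation failures landing on 9k jumbo clusters rather than on plain mbufs.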