Date:      Tue, 3 Jan 2017 17:21:21 +0100
From:      Julien Cigar <julien@perdition.city>
To:        Meny Yossefi <menyy@mellanox.com>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Ben RUBSON <ben.rubson@gmail.com>, Yuval Bason <yuvalba@mellanox.com>, Hans Petter Selasky <hanss@mellanox.com>
Subject:   Re: FW: iSCSI failing, MLX rx_ring errors ?
Message-ID:  <20170103162120.GX15696@mordor.lan>
In-Reply-To: <DB3PR05MB089011A41EF87A40C7AC741C36E0@DB3PR05MB089.eurprd05.prod.outlook.com>
References:  <486A6DA0-54C8-40DF-8437-F6E382DA01A8@gmail.com> <6a31ef00-5f7a-d36e-d5e6-0414e8b813c7@selasky.org> <DB3PR05MB089A5789A0A619FA8B7CA36C36C0@DB3PR05MB089.eurprd05.prod.outlook.com> <613AFD8E-72B2-4E3F-9C70-1D1E43109B8A@gmail.com> <2c9a9c2652a74d8eb4b34f5a32c7ad5c@AM5PR0502MB2916.eurprd05.prod.outlook.com> <DB3PR05MB089011A41EF87A40C7AC741C36E0@DB3PR05MB089.eurprd05.prod.outlook.com>

Isn't this the same issue as PR 211990? Can you try turning off jumbo
frames?
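
A rough way to check this, as a sketch only: the helper name denied_9k is
made up for illustration, the interface name mlxen1 is taken from the
sysctls quoted below, and 1500 is simply the standard Ethernet MTU. The
helper pulls the 9k field out of the "requests for jumbo clusters denied"
line of netstat -m, so the counter can be compared before and after
dropping the MTU.

```shell
#!/bin/sh
# Print the number of denied 9k jumbo cluster requests.
# Reads `netstat -m` output on stdin; the first field of the matching
# line is a 4k/9k/16k triple such as "0/2661/0".
denied_9k() {
    awk '/requests for jumbo clusters denied/ { split($1, f, "/"); print f[2] }'
}

# Intended use on the affected host (interface name and MTU are assumptions):
#   netstat -m | denied_9k      # note the current value
#   ifconfig mlxen1 mtu 1500    # temporarily disable jumbo frames
#   netstat -m | denied_9k      # re-check after the usual iSCSI load
```

If the counter stops growing at the standard MTU and the iSCSI sessions
stay up, the failed 9k jumbo cluster allocations are the likely trigger.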

On Tue, Jan 03, 2017 at 06:27:15AM +0000, Meny Yossefi wrote:
>
> ________________________________________
> From: owner-freebsd-net@freebsd.org On Behalf Of Ben RUBSON
> Sent: Monday, January 2, 2017 11:09:15 AM (UTC+00:00) Monrovia, Reykjavik
> To: freebsd-net@freebsd.org
> Cc: Meny Yossefi; Yuval Bason; Hans Petter Selasky
> Subject: Re: iSCSI failing, MLX rx_ring errors ?
>
> Hi Meny,
>
> Thank you very much for your feedback.
>
> I think you are right, this could be an mbuf issue.
> Here are some more numbers:
>
> # vmstat -z | grep -v "0,   0$"
> ITEM                   SIZE   LIMIT     USED     FREE         REQ      FAIL SLEEP
> 4 Bucket:                32,      0,    2673,   28327,   88449799,    17317, 0
> 8 Bucket:                64,      0,     449,   15609,   13926386,     4871, 0
> 12 Bucket:               96,      0,     335,    5323,   10293892,   142872, 0
> 16 Bucket:              128,      0,     533,    6070,    7618615,   472647, 0
> 32 Bucket:              256,      0,    8317,   22133,   36020376,   563479, 0
> 64 Bucket:              512,      0,    1238,    3298,   20138111, 11430742, 0
> 128 Bucket:            1024,      0,    1865,    2963,   21162182,   158752, 0
> 256 Bucket:            2048,      0,    1626,     450,   80253784,  4890164, 0
> mbuf_jumbo_9k:         9216, 603712,   16400,    8744, 4128521064,     2661, 0
>
> # netstat -m
> 32801/18814/51615 mbufs in use (current/cache/total)
> 16400/9810/26210/4075058 mbuf clusters in use (current/cache/total/max)
> 16400/9659 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/8647/8647/2037529 4k (page size) jumbo clusters in use (current/cache/total/max)
> 16400/8744/25144/603712 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/339588 16k jumbo clusters in use (current/cache/total/max)
> 188600K/137607K/326207K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 0/2661/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 sendfile syscalls
> 0 sendfile syscalls completed without I/O request
> 0 requests for I/O initiated by sendfile
> 0 pages read by sendfile as part of a request
> 0 pages were valid at time of a sendfile request
> 0 pages were requested for read ahead by applications
> 0 pages were read ahead by sendfile
> 0 times sendfile encountered an already busy page
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
>
> I did not perform any mbufs tuning, numbers above are from FreeBSD itself.
>
> This server has 64GB of memory.
> It has a ZFS pool for which I limit ARC memory impact with :
> vfs.zfs.arc_max=64424509440 #60G
>
> The only thing I did is some TCP tuning to improve throughput over high-latency long-distance private links :
> kern.ipc.maxsockbuf=7372800
> net.inet.tcp.sendbuf_max=6553600
> net.inet.tcp.recvbuf_max=6553600
> net.inet.tcp.sendspace=65536
> net.inet.tcp.recvspace=65536
> net.inet.tcp.sendbuf_inc=65536
> net.inet.tcp.recvbuf_inc=65536
> net.inet.tcp.cc.algorithm=htcp
>
> Here are some graphs of memory & ARC usage when issue occurs.
> Crosshair (vertical red line) is at the timestamp where I get iSCSI disconnections.
> https://postimg.org/gallery/1kkekrc4e/
> What is strange is that each time issue occurs there is around 1GB of free memory.
> So FreeBSD should still be able to allocate some more mbufs ?
> Unfortunately I do not have graphs about mbufs.
>
> What should I ideally do ?
>
> >> Have you tried increasing the mbufs limit?
> (sysctl) kern.ipc.nmbufs (Maximum number of mbufs allowed)
>
>
> Thank you again,
>
> Best regards,
>
> Ben
>
>
>
> > On 01 Jan 2017, at 09:16, Meny Yossefi <menyy@mellanox.com> wrote:
> >
> > Hi Ben,
> >
> > Those are not HW errors, note that:
> >
> > hw.mlxen1.stat.rx_dropped: 0
> > hw.mlxen1.stat.rx_errors: 0
> >
> > It seems to be triggered when you are failing to allocate a replacement buffer.
> > Any chance you ran out of mbufs in the system?
> >
> > en_rx.c:
> >
> > mlx4_en_process_rx_cq():
> >
> >                 mb = mlx4_en_rx_mb(priv, rx_desc, mb_list, length);
> >                 if (!mb) {
> >                         ring->errors++;
> >                         goto next;
> >                 }
> >
> > mlx4_en_rx_mb() -> mlx4_en_complete_rx_desc():
> >
> >                 /* Allocate a replacement page */
> >                 if (mlx4_en_alloc_buf(priv, rx_desc, mb_list, nr))
> >                         goto fail;
> >
> > -Meny
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.



