From: "Eggert, Lars"
To: John Baldwin
Cc: "freebsd-current@freebsd.org", John Nielsen
Subject: Re: using ConnectX card as Ethernet (mlxen)
Date: Tue, 21 Jan 2014 09:04:46 +0000
Message-ID: <0C5748ED-5142-46A4-93FA-A6BA2FF77E52@netapp.com>
In-Reply-To: <1536242.pn1tTPOXXc@pippin.baldwin.cx>

Hi,

On 2014-1-20, at 21:59, John Baldwin wrote:
> I believe this should work, yes. Getting a crashdump or the panic messages
> would be really helpful in figuring out why it isn't.

Thanks. I rebuilt the kernel and no longer see any crashes, so that's
good.

But there are a number of other issues that maybe someone has some ideas
about:

(1) Late attach

The ConnectX-3 attaches very late during the boot process, after the
system is already in single-user mode. See the attached dmesg; pci17 and
pci18 (there are two identical cards in this system) first show as "no
driver attached" during the PCI bus enumeration. Only after the system
is in single-user mode does mlx4_core attach to the cards.

That means that, e.g., setting sysctls for these cards in
/etc/sysctl.conf or configuring their IP addresses via rc.conf is not
possible. At the moment, I work around this by sleeping in rc.local and
then doing the assignments there, but that's a hack.

Any clues why these cards attach so late?
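
For illustration, the rc.local workaround described in (1) could poll for the interface instead of sleeping for a fixed time. This is only a sketch under assumptions: the helper name wait_for_if and the IF_PROBE and IF_POLL_DELAY variables are hypothetical, and it assumes the ib0 interface appears once mlx4_core has attached. It does not explain or fix the late attach itself.

```shell
#!/bin/sh
# Sketch only: poll until an interface exists instead of a fixed sleep.
# wait_for_if, IF_PROBE and IF_POLL_DELAY are hypothetical names used
# for illustration; they are not part of FreeBSD.

# wait_for_if NAME [TRIES]
# Returns 0 as soon as "$IF_PROBE NAME" succeeds (default probe: ifconfig),
# or 1 if it still fails after TRIES attempts (default: 30).
wait_for_if() {
    name=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if ${IF_PROBE:-ifconfig} "$name" >/dev/null 2>&1; then
            return 0
        fi
        sleep "${IF_POLL_DELAY:-1}"
        tries=$((tries - 1))
    done
    return 1
}

# Possible use in rc.local (left commented out; needs root and the
# actual hardware). The sysctl name is the one quoted in this message.
#
# if wait_for_if ib0 60; then
#     sysctl sys.device.mlx4_core0.mlx4_port1=eth
# fi
```

This is still a hack in the same sense as the sleep: it only shortens the wait and makes a missing device detectable through the return code.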

(2) Device numbers change

After booting, these cards show up in InfiniBand mode:

ib0: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.21
	nd6 options=21
ib1: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.22
	nd6 options=21
ib2: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
	nd6 options=21
ib3: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
	nd6 options=21

Then I force one into Ethernet mode:

# sysctl sys.device.mlx4_core0.mlx4_port1=eth
sys.device.mlx4_core0.mlx4_port1: auto (ib) -> eth

and the device numbers on the ib devices change: ib1 is now ib4, and I
have a new mlxen0 device.

ib2: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
	nd6 options=21
ib3: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
	nd6 options=21
mlxen0: flags=8843 metric 0 mtu 1500
	options=d05bb
	ether f4:52:14:10:d1:21
	inet6 fe80::f652:14ff:fe10:d121%mlxen0 prefixlen 64 scopeid 0xe
	nd6 options=21
	media: Ethernet autoselect
	status: no carrier
ib4: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.4a.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d1.22
	nd6 options=21

When I change another port into Ethernet mode

# sysctl sys.device.mlx4_core0.mlx4_port2=eth
sys.device.mlx4_core0.mlx4_port2: auto (ib) -> eth

device numbers change again.
Now mlxen0 disappears and becomes mlxen1, and I have a new mlxen2
device:

ib2: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.48.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d1
	nd6 options=21
ib3: flags=8002 metric 0 mtu 65520
	options=80018
	lladdr 80.0.0.49.fe.80.0.0.0.0.0.0.f4.52.14.3.0.10.d0.d2
	nd6 options=21
mlxen1: flags=8843 metric 0 mtu 1500
	options=d05bb
	ether f4:52:14:10:d1:21
	inet6 fe80::f652:14ff:fe10:d121%mlxen1 prefixlen 64 scopeid 0xe
	nd6 options=21
	media: Ethernet autoselect
	status: no carrier
mlxen2: flags=8843 metric 0 mtu 1500
	options=d05bb
	ether f4:52:14:10:d1:22
	inet6 fe80::f652:14ff:fe10:d122%mlxen2 prefixlen 64 scopeid 0xf
	nd6 options=21
	media: Ethernet autoselect
	status: no carrier

Changing the other two ports (on the second card) to Ethernet mode

# sysctl sys.device.mlx4_core1.mlx4_port1=eth
sys.device.mlx4_core1.mlx4_port1: auto (ib) -> eth
# sysctl sys.device.mlx4_core1.mlx4_port2=eth
sys.device.mlx4_core1.mlx4_port2: auto (ib) -> eth

leaves me with mlxen1, mlxen2, mlxen4 and mlxen5:

mlxen1: flags=8843 metric 0 mtu 1500
	options=d05bb
	ether f4:52:14:10:d1:21
	inet6 fe80::f652:14ff:fe10:d121%mlxen1 prefixlen 64 scopeid 0xe
	nd6 options=21
	media: Ethernet autoselect
	status: no carrier
mlxen2: flags=8843 metric 0 mtu 1500
	options=d05bb
	ether f4:52:14:10:d1:22
	inet6 fe80::f652:14ff:fe10:d122%mlxen2 prefixlen 64 scopeid 0xf
	inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
	nd6 options=21
	media: Ethernet autoselect (40Gbase-CR4 )
	status: active
mlxen4: flags=8843 metric 0 mtu 1500
	options=d05bb
	ether f4:52:14:10:d0:d1
	inet6 fe80::f652:14ff:fe10:d0d1%mlxen4 prefixlen 64 scopeid 0x10
	nd6 options=21
	media: Ethernet autoselect
	status: no carrier
mlxen5: flags=8843 metric 0 mtu 1500
	options=d05bb
	ether f4:52:14:10:d0:d2
	inet6 fe80::f652:14ff:fe10:d0d2%mlxen5 prefixlen 64 scopeid 0x11
	inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
	nd6 options=21
	media: Ethernet autoselect (10Gbase-CX4 )
	status: active

Needless to say, having devices change numbers is problematic.

(3) 40G TCP performance

I barely get over 10G with netperf over the 40G interfaces:

root@one:~ # netperf -H two-mlxen2 -- -s512k -S512K
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to two-mlxen2.muclab () port 0 AF_INET : histogram : interval : dirty data : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

524288 512000  512000   10.07    10268.01

Any clues as to what could be limiting performance here?

Thanks,

Lars
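
Regarding the renumbering in (2): in the listings above the MAC address of each port stays the same while the unit number changes, so a script could look an interface up by its "ether" address rather than by name. The following is only a sketch; the helper name if_by_mac is hypothetical, and it assumes ifconfig output laid out as shown in this message (an "ether" line following each "name: flags=" line).

```shell
#!/bin/sh
# Sketch only: map a MAC address to the current interface name by
# parsing ifconfig-style output on stdin. if_by_mac is a hypothetical
# helper name, not an existing FreeBSD command.

# if_by_mac MAC
# Prints the name of the first interface whose "ether" line matches MAC
# (case-insensitive). Prints nothing if there is no match.
if_by_mac() {
    awk -v mac="$1" '
        # Header line, e.g. "mlxen1: flags=8843 ...": remember the name.
        $1 ~ /:$/ && $2 ~ /^flags=/ { name = substr($1, 1, length($1) - 1) }
        # Matching "ether" line: report the remembered name and stop.
        $1 == "ether" && tolower($2) == tolower(mac) { print name; exit }
    '
}

# Possible use (left commented out; needs the actual hardware):
#
# ifn=$(ifconfig | if_by_mac f4:52:14:10:d1:21)
# [ -n "$ifn" ] && ifconfig "$ifn" up
```

This only sidesteps the changing unit numbers in scripts; it does not address why the numbers change.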