Date: Thu, 26 Jan 2017 13:08:06 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 216493] [Hyper-V] Mellanox ConnectX-3 VF driver can't work when FreeBSD runs on Hyper-V 2016
Message-ID: <bug-216493-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216493

            Bug ID: 216493
           Summary: [Hyper-V] Mellanox ConnectX-3 VF driver can't work
                    when FreeBSD runs on Hyper-V 2016
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: decui@microsoft.com

Windows Server 2016 (Hyper-V 2016) can support PCIe pass-through and NIC
SR-IOV for non-Windows virtual machines (VMs), such as Linux and FreeBSD VMs.

A few months ago, we enabled PCIe pass-through for a FreeBSD VM running on
Hyper-V and successfully assigned a Mellanox ConnectX-3 PF device to the VM,
where the device worked fine. Now we have added code to support NIC SR-IOV
(which is based on PCIe pass-through) in the Hyper-V hv_netvsc driver, but it
turned out that the VF driver failed to load, so I ported two patches from
Linux:

https://reviews.freebsd.org/D8867
https://reviews.freebsd.org/D8868

(Note: I have only tested the PF/VF drivers in a FreeBSD VM running on
Hyper-V. I haven't tested the patches on a bare-metal FreeBSD machine (it's
not easy to set up such a machine in our lab right now), so it would be
really helpful and important if people could review the patches and help
test them on bare metal.)

With the two patches, the VF driver worked in my limited testing.

BTW, this link (https://community.mellanox.com/docs/DOC-2242) shows how to
enable a Mellanox ConnectX-3 VF for a Windows VM running on Hyper-V 2012 R2.
What I did for the FreeBSD VM on Hyper-V 2016 is quite similar.

Next, I did more testing and identified four issues we need to address:

1. When the VF is hot removed, I see the error below. It looks non-fatal,
because when the VF is later hot added, it still works:

mlx4_core0: Failed to free mtt range at:20769 order:0
mlx4_core0: detached

2. The VF works fine when the VM has <= 12 virtual CPUs, but if the VM has
>= 13 vCPUs, the VF driver fails to load (a back-of-the-envelope model of
why the vCPU count could matter follows the log):

mlx4_core0: <mlx4_core> at device 2.0 on pci1
mlx4_core: Initializing mlx4_core: Mellanox ConnectX VPI driver v2.1.6
vmbus0: allocated type 3 (0xfe0800000-0xfe0ffffff) for rid 18 of mlx4_core0
mlx4_core0: Lazy allocation of 0x800000 bytes rid 0x18 type 3 at 0xfe0800000
mlx4_core0: Detected virtual function - running in slave mode
mlx4_core0: Sending reset
mlx4_core0: Sending vhcr0
mlx4_core0: HCA minimum page size:512
mlx4_core0: Timestamping is not supported in slave mode.
mlx4_core0: attempting to allocate 20 MSI-X vectors (52 supported)
mlx4_core0: using IRQs 256-275 for MSI-X
mlx4_core0: Failed to allocate mtts for 1024 pages(order 10)
mlx4_core0: Failed to initialize event queue table (err=-12), aborting.
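For what it's worth, here is a back-of-the-envelope model of why this
failure could be sensitive to the vCPU count: the driver creates roughly one
completion event queue (EQ) per vCPU plus a few service EQs, every EQ ring
needs pages mapped through MTT entries, and the VF only gets a fixed MTT
quota from the PF (the real allocator also rounds requests up to a power of
two, hence "order 10" = 1024 pages in the log above). This is an
illustrative sketch, not driver code; every constant is a made-up
assumption:

#include <stdio.h>

#define PAGES_PER_EQ  64    /* assumed pages per EQ ring          */
#define SERVICE_EQS   2     /* assumed async/command service EQs  */
#define VF_MTT_QUOTA  950   /* assumed MTT entries granted by PF  */

int
main(void)
{
    for (int ncpu = 11; ncpu <= 14; ncpu++) {
        /* Roughly one completion EQ per vCPU plus a few service EQs. */
        int neq = ncpu + SERVICE_EQS;
        int mtts = neq * PAGES_PER_EQ;

        printf("%2d vCPUs -> %2d EQs -> %4d MTT entries: %s\n",
            ncpu, neq, mtts,
            mtts > VF_MTT_QUOTA ? "over VF quota, load fails" : "ok");
    }
    return (0);
}

With these made-up numbers the threshold happens to land between 12 and 13
vCPUs; the real quota and ring sizes are different, but the shape of the
failure (a fixed VF quota versus per-vCPU EQ demand) is the point.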
3. The VF can't ping another VM's VF on the same host, and can't ping the PF
on the same host either. On the same host, Windows VM <-> Windows VM and
Windows VM <-> Linux VM both work; only FreeBSD VM <-> Windows/Linux VM
traffic fails. I suspect something is wrong or missing in the mlx4 VF driver
in FreeBSD.

4. I got the log below when Live Migration failed. It seems the VF's detach
method couldn't finish successfully:

Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: Failed to free mtt range at:5937 order:0
Jan 11 19:16:54 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:16:54 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode CLOSE_PORT (0xa)
Jan 11 19:18:04 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:18:04 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000002
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000003
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000004
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000005
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000006
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000007
Jan 11 19:26:16 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:26:16 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode SET_MCAST_FLTR (0x48)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: Failed to free icm of qp:2279
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: Failed to release qp range base:2279 cnt:1
Jan 11 19:29:46 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:29:46 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode 2RST_QP (0x21)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode HW2SW_CQ (0x17)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: HW2SW_CQ failed (-35) for CQN 0000b5
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: Failed freeing cq:181

More info about issue 4: in the case of Live Migration, it looks like the
host just rescinds the VF by force, without sending the PCI_EJECT message to
the VM. It looks like the current Mellanox VF driver in FreeBSD can't handle
this case (i.e. the VF device disappearing suddenly) and always hangs on
command timeouts, because at that point the host denies the VM's access to
the VF; note the roughly 70-second gap between consecutive failures in the
log above, one timeout per command. (A small model of the fail-fast behavior
the driver needs is sketched below.)
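The log pattern suggests the detach path keeps issuing VHCR commands one
after another, and each one has to wait out its full timeout against a
device the host has already revoked. Here is a minimal userspace sketch, not
actual driver code, of the fail-fast guard the driver would need; the "gone"
flag stands in for whatever rescind notification the Hyper-V PCI front-end
could provide, and none of these names are existing driver symbols:

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct vf_model {
    bool gone;  /* set when the host rescinds the VF */
};

/* Model of posting a VHCR command.  In the real driver, once the host
 * has revoked the VF, every command blocks until a long timeout; with
 * a "device gone" check it can fail immediately instead. */
static int
vhcr_post(struct vf_model *vf, const char *op)
{
    if (vf->gone) {
        /* The host denies all access to the VF now, so polling the
         * comm-channel toggle can never succeed; fail fast. */
        printf("%s: device gone, returning EIO at once\n", op);
        return (-EIO);
    }
    printf("%s: ok\n", op);
    return (0);
}

int
main(void)
{
    struct vf_model vf = { .gone = false };

    vhcr_post(&vf, "CLOSE_PORT");
    vf.gone = true;              /* forced rescind in mid-detach */
    vhcr_post(&vf, "FREE_RES");  /* returns immediately */
    vhcr_post(&vf, "2RST_QP");   /* detach can finish quickly */
    return (0);
}

With such a guard, the first timeout (or the rescind event itself) marks the
device gone and the rest of the detach path completes right away; in the log
above it instead takes more than fifteen minutes (19:16:43 to 19:32:06) to
crawl through the timeouts one command at a time.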
BTW, the VF driver in a Linux VM doesn't hang, and Live Migration appears to
work there, but the driver also prints out these scary messages:

Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Internal error detected on the communication channel
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: device is going to be reset
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: VF reset is not needed
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: device was reset successfully
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_en 99bb:00:02.0: Internal error detected, restarting device
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: command 0x5 failed: fw status = 0x1
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: VF down: enP39355p0s2
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: Data path switched from VF: enP39355p0s2
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: VF unregistering: enP39355p0s2
Jan 26 02:40:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Failed to close slave function
Jan 26 02:40:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Detected virtual function - running in slave mode
Jan 26 02:40:37 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: recovering from previously mis-behaved VM
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Communication channel is offline.
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: PF is not responsive, skipping initialization
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Failed to initialize slave
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_restart_one: ERROR: mlx4_load_one failed, pci_name=99bb:00:02.0, err=-5
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_restart_one was ended, ret=-5
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_remove_one: interface is down

I think we at least need to port this patch, "net/mlx4_core: Enable device
recovery flow with SRIOV"
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=55ad359225b2232b9b8f04a0dfa169bd3a7d86d2),
from Linux to FreeBSD. (A rough outline of the recovery sequence it enables
is sketched below.)
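For reference, here is a rough outline of the recovery sequence that commit
enables, written as illustrative stubs. This is my reading of the commit
message and of the Linux log above, not existing FreeBSD code; every
function name below is an assumption made up for illustration:

#include <stdio.h>

/* All functions below are illustrative stubs, not FreeBSD symbols. */

static void
mark_internal_error(void)
{
    /* New and in-flight commands now fail fast instead of timing out. */
    printf("state <- INTERNAL_ERROR\n");
}

static void
close_slave_locally(void)
{
    /* Free the VF's EQs/CQs/QPs/MTTs without VHCR round-trips. */
    printf("tear down local resources\n");
}

static int
reinit_slave(void)
{
    /* Re-run the slave init handshake with the PF; this can
     * legitimately fail while the PF is still busy ("PF is not
     * responsive, skipping initialization" in the Linux log above). */
    printf("re-probe the VF against the PF\n");
    return (0);
}

int
main(void)
{
    mark_internal_error();
    close_slave_locally();
    if (reinit_slave() != 0)
        printf("PF not responsive: leave the device down\n");
    return (0);
}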
--
You are receiving this mail because:
You are the assignee for the bug.
