From: soralx@cydem.org
Date: Tue, 10 Jan 2017 00:33:32 -0800
Subject: Re: Issues with GTX960 on CentOS7 using bhyve PCI passthru (FreeBSD 11-RC2)
Message-ID: <20170110003332.7cf8ba15@mscad14>
List-Id: "Discussion of various virtualization techniques FreeBSD supports."

Howdy, virtualization zealots!

This is in reply to maillist thread [0].

It so happens that I have to get GPU-accelerated OpenCL working on my machine, so I had a play with bhyve & PCI-e passthrough for VGA. I was using nVidia Quadro 600 (GF108) for testing (planning to use AMD/ATI for OpenCL, of course).

I tried a Linux guest with the proprietary nVidia driver, and the result was that the driver couldn't init the VGA during boot:

[ 1.394726] nvidia: module license 'NVIDIA' taints kernel.
[ 1.395140] Disabling lock debugging due to kernel taint
[ 1.412132] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 1.419359] nvidia 0000:00:04.0: can't derive routing for PCI INT A
[ 1.419807] nvidia 0000:00:04.0: PCI INT A: no GSI
[ 1.420157] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[ 1.420157] NVRM: BAR1 is 0M @ 0x0 (PCI:0000:00:04.0)
[ 1.421023] NVRM: The system BIOS may have misconfigured your GPU.
[ 1.421476] nvidia: probe of 0000:00:04.0 failed with error -1
[ 1.437301] nvidia-nvlink: Nvlink Core is being initialized, major device number 247
[ 1.440094] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 1.440530] NVRM: None of the NVIDIA graphics adapters were initialized!

After adding the "pci=nocrs" Linux boot option (which, from what I understand, magically helps to [partially] work around bhyve assigning addresses beyond the host CPU's physically addressable space for PCIe memory-mapped registers), the guest couldn't finish booting, because bhyve would segfault. Turns out that which peripherals are used, and their order on the command line, are important.

Edit: actually, looks like it's the number of CPUs (the '-c' flag's argument) that makes the difference; the machine has a CPU with 4 cores, no multithreading.

This didn't work (segfault):

`bhyve -A -H -P -s 0:0,hostbridge -s 1:0,lpc -s 2:0,virtio-net,tap0 \
 -s 3:0,virtio-blk,./bhyve_lunix.img \
 -s 4:0,ahci-cd,./ubuntu-16.04.1-server-amd64.iso \
 -s 5:0,passthru,1/0/0 -l com1,stdio -c 4 -m 1024M -S lunixguest`

[...]
[ OK ] Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
[ OK ] Reached target Swap.
Assertion failed: (pi->pi_bar[baridx].type == PCIBAR_IO), function passthru_write, file /usr/src/usr.sbin/bhyve/pci_passthru.c, line 850.
Abort (core dumped)

But this worked, finally:

`bhyve -c 1 -m 1024M -S -A -H -P -s 0:0,hostbridge -s 1:0,lpc \
 -s 2:0,virtio-net,tap0 -s 3:0,virtio-blk,./bhyve_lunix.img \
 -s 4:0,passthru,1/0/0 -l com1,stdio lunixguest`

So, the guest booted, and didn't complain about non-addressable-by-CPU BARs anymore. However, the same fate befell me as Dom had in this thread -- the driver loaded:

[ 1.691216] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 1.696641] nvidia 0000:00:04.0: can't derive routing for PCI INT A
[ 1.698093] nvidia 0000:00:04.0: PCI INT A: no GSI
[ 1.699277] vgaarb: device changed decodes: PCI:0000:00:04.0,olddecodes=io+mem,decodes=none:owns=io+mem
[ 1.701461] nvidia-nvlink: Nvlink Core is being initialized, major device number 247
[ 1.702649] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 375.26 Thu Dec 8 18:36:43 PST 2016 (using threaded interrupts)
[ 1.705481] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 375.26 Thu Dec 8 18:04:14 PST 2016
[ 1.708941] [drm] [nvidia-drm] [GPU ID 0x00000004] Loading driver

but couldn't talk to the card: [lost the log, but it was the same as Dom's: "NVRM: rm_init_adapter failed"].

So I decided to test in a FreeBSD 10.3-STABLE guest. With an older driver, or just loading 'nvidia' without modesetting, I got guest kernel panics [1]. When I loaded 'nvidia-modeset', there was more success:

Linux ELF exec handler installed
Linux x86-64 ELF exec handler installed
nvidia0: on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 269 to local APIC 2 vector 51
vgapci0: using IRQ 269 for MSI
vgapci0: child nvidia0 requested pci_enable_io
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 367.44 Wed Aug 17 22:05:09 PDT 2016

But:

# nvidia-smi
NVRM: Xid (PCI:0000:00:04): 62, !2369(0000)
NVRM: RmInitAdapter failed!
(0x26:0x65:1072)
nvidia0: NVRM: rm_init_adapter() failed!
No devices were found

It also panicked after starting Xorg.

After stumbling upon some Xen forums, I found the solution: nVidia crippled the driver so that it detects a virtualization environment, and refuses to attach to anything but high-end pro cards! Those bastards [if the speculation is true]!

GTX960 didn't work. Quadro 600 didn't work. So I tried with a Quadro 2000:

root@fbsd12tst:~ # sync
root@fbsd12tst:~ # kldload nvidia-modeset
Linux ELF exec handler installed
nvidia0: on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 269 to local APIC 3 vector 51
vgapci0: using IRQ 269 for MSI
vgapci0: child nvidia0 requested pci_enable_io
random: harvesting attach, 8 bytes (4 bits) from nvidia0

[a bit more] Success! However:

root@fbsd12tst:~ # nvidia-smi
acquiring duplicate lock of same type: "os.lock_sx"
 1st os.lock_sx @ nvidia_os.c:599
 2nd os.lock_sx @ nvidia_os.c:599
stack backtrace:
#0 0xffffffff80aa6780 at witness_debugger+0x70
#1 0xffffffff80aa6683 at witness_checkorder+0xde3
#2 0xffffffff80a4fac2 at _sx_xlock+0x72
#3 0xffffffff82a515c2 at os_acquire_mutex+0x32
#4 0xffffffff82a21068 at _nv016673rm+0x18

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xfffffe004f601088
fault code = supervisor write data, reserved bits in PTE
instruction pointer = 0x20:0xffffffff82a512e3
stack pointer = 0x28:0xfffffe0000221138
frame pointer = 0x28:0xfffffe0001a76ba8
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 634 (nvidia-smi)
[ thread pid 634 tid 100100 ]
Stopped at os_mem_copy+0xf3: movb %dl,(%rax)
db> bt
Tracing pid 634 tid 100100 td 0xfffff8000b866500
os_mem_copy() at os_mem_copy+0xf3/frame 0xfffffe0001a76ba8
??() at 0xfffff8000b8beb00
db>

(I upgraded to FreeBSD 12.0-CURRENT
(GENERIC) #0 r311659, but initially did the test with Quadro 2000 on the same 10.3-STABLE as before, with the same results).

Linux succeeds loading the driver with Quadro 2000, too:

[ 1.374925] nvidia: module license 'NVIDIA' taints kernel.
[ 1.375348] Disabling lock debugging due to kernel taint
[ 1.400506] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 1.413539] nvidia 0000:00:04.0: can't derive routing for PCI INT A
[ 1.414003] nvidia 0000:00:04.0: PCI INT A: no GSI
[ 1.414417] vgaarb: device changed decodes: PCI:0000:00:04.0,olddecodes=io+mem,decodes=none:owns=io+mem
[ 1.421807] nvidia-nvlink: Nvlink Core is being initialized, major device number 247
[ 1.422369] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 375.26 Thu Dec 8 18:36:43 PST 2016 (using threaded interrupts)
[ 1.424568] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 375.26 Thu Dec 8 18:04:14 PST 2016
[ 1.426837] [drm] [nvidia-drm] [GPU ID 0x00000004] Loading driver

But I get the same assertion and segfault from bhyve if I try to run `nvidia-smi` after the OS has finished booting [at least it seemed to before, but I can't get it to finish booting now, it just hangs].

And now to the point: how would one go about fixing bhyve's tendency to segfault because of the assert (is it saying that something is still very wrong?), and get Linux working with the GPU? And what to do about FreeBSD's guest kernel panics?

P.S.: please CC, as I'm not subscribed.
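
For reference, the "BAR beyond the CPU's physically addressable space" condition mentioned earlier in connection with "pci=nocrs" boils down to a range check. Below is a minimal Python sketch of that check. The function name, the 36-bit address width and the BAR values are all hypothetical, chosen only for illustration; they are not read from this machine or taken from bhyve's code. On a Linux host the real width appears in the "address sizes" line of /proc/cpuinfo.

```python
def bar_is_addressable(base: int, size: int, phys_addr_bits: int) -> bool:
    """Return True if the BAR window [base, base + size) lies below 2**phys_addr_bits."""
    if size <= 0:
        # A zero-sized BAR (compare "BAR1 is 0M @ 0x0" in the log) is treated
        # here as unassigned, hence not usable. This is an assumption of the sketch.
        return False
    limit = 1 << phys_addr_bits
    return base + size <= limit


# Hypothetical values, for illustration only.
PHYS_BITS = 36                        # assumed CPU physical address width
bar_low = (0xC0000000, 128 << 20)     # 128 MiB window placed below 4 GiB
bar_high = (1 << 40, 128 << 20)       # same-sized window placed at 1 TiB

for name, (base, size) in (("low", bar_low), ("high", bar_high)):
    ok = bar_is_addressable(base, size, PHYS_BITS)
    print(f"{name}: base={base:#x} size={size:#x} addressable={ok}")
```

With these made-up numbers the low window passes and the high one does not, which is the kind of situation the guest was complaining about before "pci=nocrs" was added.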
[0] https://lists.freebsd.org/pipermail/freebsd-virtualization/2016-September/004704.html

[1] Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address = 0xfffffe003d7c508c
fault code = supervisor write data, reserved bits in PTE
instruction pointer = 0x20:0xffffffff820bb5d5
stack pointer = 0x28:0xfffffe003d69d380
frame pointer = 0x28:0xfffffe000154ce68
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 5084 (nvidia-smi)
trap number = 12
panic: page fault

--
[SorAlx] ridin' VN2000 Classic LT