Date: Sun, 17 Mar 2019 16:22:29 +0000
From: Robert Crowston <crowston@protonmail.com>
To: "freebsd-virtualization@freebsd.org" <freebsd-virtualization@freebsd.org>
Subject: GPU passthrough: mixed success on Linux, not yet on Windows
Message-ID: <H0Gbov17YtZC1-Ao1YkjZ-nuOqPv4LPggc_mni3cS8WWOjlSLBAfOGGPf4aZEpOBiC5PAUGg6fkgeutcLrdbmXNO5QfaxFtK_ANn-Nrklws=@protonmail.com>

Hi folks, this is my first post to the group. Apologies for length.

I've been experimenting with GPU passthrough on bhyve. For background, the host system is FreeBSD 12.0-RELEASE on an AMD Ryzen 1700 CPU @ 3.8 GHz, 32 GB of ECC RAM, with two nVidia GPUs. I'm working with a Linux Debian 9 guest and a Windows Server 2019 (desktop experience installed) guest. I also have a USB controller passed through for Bluetooth and keyboard.

With some unpleasant hacks I have succeeded in starting X on the Linux guest, passing through an nVidia GT 710 under the nouveau driver. I can run the "mate" desktop and glxgears, both of which are smooth at 4K. The Unigine Heaven benchmark runs at an embarrassing 0.1 fps, and 2160p x264 video in VLC runs at about 5 fps. Neither appears to be CPU-bound in the host or the guest.

The hack I had to make: I found that many instructions to access memory-mapped PCI BARs are not being executed on the CPU in guest mode but are being passed back for emulation in the hypervisor. This causes an assertion to fail inside passthru_write() in pci_passthru.c ["pi->pi_bar[baridx].type == PCIBAR_IO"] because it does not expect to perform memory-mapped IO for the guest. Examining the to-be-emulated instructions in vmexit_inst_emul() {e.g., movl (%rdi), %eax}, they look benign to me, and I have no explanation for why the CPU refused to execute them in guest mode.

As an amateur work-around, I removed the assertion and instead I obtain the desired offset into the guest's BAR, calculate what that guest address translates to in the host's address space, open(2) /dev/mem, mmap(2) over to that address, and perform the write directly. I do a similar trick in passthru_read(). Ugly, slow, but functional.

This code path is accessed continuously whether or not X is running, with an increase in activity when running anything GPU-heavy. The accesses are always to bar 1, and mostly around the same offsets. I added some logging of this event.
It runs at about 100 lines per second while playing video. An excerpt is:

...
Unexpected out-of-vm passthrough write #492036 to bar 1 at offset 41100.
Unexpected out-of-vm passthrough write #492037 to bar 1 at offset 41100.
Unexpected out-of-vm passthrough read #276162 to bar 1 at offset 561280.
Unexpected out-of-vm passthrough write #492038 to bar 1 at offset 38028.
Unexpected out-of-vm passthrough write #492039 to bar 1 at offset 38028.
Unexpected out-of-vm passthrough read #276163 to bar 1 at offset 561184.
Unexpected out-of-vm passthrough read #276164 to bar 1 at offset 561184.
Unexpected out-of-vm passthrough read #276165 to bar 1 at offset 561184.
Unexpected out-of-vm passthrough read #276166 to bar 1 at offset 561184.
...

So my question here is,

1. How do I diagnose why the instructions are not being executed in guest mode?

Some other problems:

2. Once the virtual machine is shut down, the passed-through GPU doesn't get turned off. Whatever message was on the screen in the final throes of Linux's shutdown stays there. Maybe there is a specific detach command which bhyve or nouveau hasn't yet implemented? Alternatively, maybe I could exploit some power management feature to reset the card when bhyve exits.

3. It is not possible to reboot the guest and then start X again without an intervening host reboot. The text console works fine. Xorg.0.log has a message like

(EE) [drm] Failed to open DRM device for pci:0000:00:06.0: -19
(EE) open /dev/dri/card0: No such file or directory

dmesg is not very helpful either.[0] I suspect that this is related to problem (2).

4.
There is a known bug in the version of the Xorg server that ships with Debian 9, where the switch from an animated mouse cursor back to a static cursor causes the X server to sit in a busy loop of gradually increasing stack depth, if the GPU takes too long to communicate with the driver.[1] For me, this consistently happens after I type my password into the Debian login dialog box and eventually (~ 120 minutes) locks up the host by eating all the swap. A work-around is to replace the guest's animated cursors with static cursors. The bug is fixed in newer versions of X, but I haven't tested whether their fix works for me yet.

5. The GPU doesn't come to life until the nouveau driver kicks in. What is special about the driver? Why doesn't the UEFI open the GPU and send it output before the boot? Any idea if the problem is on the UEFI side or the hypervisor side?

6. On Windows, the way Windows probes multi-BAR devices seems to be inconsistent with bhyve's model for storing io memory mappings. Specifically, I believe Windows assigns the 0xffffffff sentinel to all BARs on a device in one shot, then reads them back and assigns the true addresses afterwards. However, bhyve sees the multiple 0xffffffff assignments to different BARs as a clash and errors out on the second BAR probe. I removed most of the mmio_rb_tree error handling in mem.c and this is sufficient for Windows to boot, and detect and correctly identify the GPU. (A better solution might be to handle the initial 0xffffffff write as a special case.) I can then install the official nVidia drivers without problem over Remote Desktop. However, the GPU never springs into life: I am stuck with a "Windows has stopped this device because it has reported problems. (Code 43)" error in the device manager, a blank screen, and not much else to go on.
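
The special-casing suggested in item 6 could look roughly like the sketch below. It is a toy model and not bhyve's code: struct bar_model, bar_model_write() and bar_size_from_probe() are hypothetical names made up for illustration, and a 32-bit memory BAR is assumed. The only idea shown is that an all-ones write is treated as a sizing probe, which latches the size mask and registers no MMIO range, so several BARs can be probed back to back without colliding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low four bits of a 32-bit memory BAR are read-only type/prefetch flags. */
#define BAR_MEM_FLAG_MASK 0xfu

/* Hypothetical per-BAR state for illustration; NOT bhyve's real structure. */
struct bar_model {
	uint32_t size;   /* BAR size in bytes, a power of two, at least 16 */
	uint32_t reg;    /* value the guest reads back from the BAR register */
	uint32_t base;   /* base address of the registered range, if mapped */
	bool     mapped; /* true while an MMIO range is registered */
};

/* Model a guest write to the BAR register. */
static void
bar_model_write(struct bar_model *b, uint32_t val)
{
	uint32_t flags = b->reg & BAR_MEM_FLAG_MASK;
	uint32_t addr_mask;

	assert(b->size >= 16 && (b->size & (b->size - 1u)) == 0);
	addr_mask = ~(b->size - 1u);

	if (val == 0xffffffffu) {
		/*
		 * Sizing probe: expose the size mask on read-back and drop
		 * any existing mapping, but do not register a new range at
		 * the bogus all-ones address.
		 */
		b->mapped = false;
		b->reg = addr_mask | flags;
		return;
	}

	/* A real address: this is the only point where a range is registered. */
	b->base = val & addr_mask;
	b->reg = b->base | flags;
	b->mapped = true;
}

/* Recover the BAR size from the value read back after a sizing probe. */
static uint32_t
bar_size_from_probe(uint32_t probe)
{
	return (~(probe & ~BAR_MEM_FLAG_MASK) + 1u);
}
```

With two 16 MiB BARs, both can be written with 0xffffffff in turn and neither ends up mapped; a range is registered only once a real address arrives. Whether this matches what Windows actually does during enumeration is the assumption stated in item 6, not something the sketch verifies.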

Is it worth me continuing to hack away at these problems (of course I'm happy to share anything I come up with), or is there an official solution to GPU support in the pipeline about to make my efforts redundant :)?

Thanks,
Robert Crowston.

---
Footnotes

[0] Diff'ing dmesg after successful GPU initialization (+) and after failure (-), and cutting out some lines that aren't relevant:

 nouveau 0000:00:06.0: bios: version 80.28.a6.00.10
+nouveau 0000:00:06.0: priv: HUB0: 085014 ffffffff (1f70820b)
 nouveau 0000:00:06.0: fb: 1024 MiB DDR3
@@ -466,24 +467,17 @@
 nouveau 0000:00:06.0: DRM: DCB conn 00: 00001031
 nouveau 0000:00:06.0: DRM: DCB conn 01: 00002161
 nouveau 0000:00:06.0: DRM: DCB conn 02: 00000200
-nouveau 0000:00:06.0: disp: chid 0 mthd 0000 data 00000400 00001000 00000002
-nouveau 0000:00:06.0: timeout at /build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:88/gf119_disp_dmac_init()!
-nouveau 0000:00:06.0: disp: ch 1 init: c207009b
-nouveau: DRM:00000000:0000927c: init failed with -16
-nouveau 0000:00:06.0: timeout at /build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:54/gf119_disp_dmac_fini()!
-nouveau 0000:00:06.0: disp: ch 1 fini: c2071088
-nouveau 0000:00:06.0: timeout at /build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:54/gf119_disp_dmac_fini()!
-nouveau 0000:00:06.0: disp: ch 1 fini: c2071088
+[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
+[drm] Driver supports precise vblank timestamp query.
+nouveau 0000:00:06.0: DRM: MM: using COPY for buffer copies
+nouveau 0000:00:06.0: DRM: allocated 1920x1080 fb: 0x60000, bo ffff96fdb39a1800
+fbcon: nouveaufb (fb0) is primary device
-nouveau 0000:00:06.0: timeout at /build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/coregf119.c:187/gf119_disp_core_fini()
-nouveau 0000:00:06.0: disp: core fini: 8d0f0088
-[TTM] Finalizing pool allocator
-[TTM] Finalizing DMA pool allocator
-[TTM] Zone kernel: Used memory at exit: 0 kiB
-[TTM] Zone dma32: Used memory at exit: 0 kiB
-nouveau: probe of 0000:00:06.0 failed with error -16
+Console: switching to colour frame buffer device 240x67
+nouveau 0000:00:06.0: fb0: nouveaufb frame buffer device
+[drm] Initialized nouveau 1.3.1 20120801 for 0000:00:06.0 on minor 0

[1] https://devtalk.nvidia.com/default/topic/1028172/linux/titan-v-ubuntu-16-04lts-and-387-34-driver-crashes-badly/post/5230898/#5230898
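
For reference, the /dev/mem work-around described in the body (open(2), mmap(2) at the translated address, write directly) can be sketched as below. This is a minimal illustration and not the patch itself: mapped_write32() is a hypothetical helper name, the host physical address is assumed to have been computed already from the guest BAR offset, and the device path is a parameter so that the same code can be exercised against an ordinary file. Pointing it at /dev/mem would normally need root.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one 32-bit value at byte offset `phys` of the
 * file or device at `path` (e.g. "/dev/mem") through a temporary mapping.
 * Returns 0 on success and -1 on failure.
 */
static int
mapped_write32(const char *path, off_t phys, uint32_t value)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	off_t page_base;
	size_t in_page;
	void *map;
	int fd;

	/* Reject bad page size, negative offsets and unaligned accesses. */
	if (pagesz <= 0 || phys < 0 || (phys & 3) != 0)
		return (-1);
	assert((pagesz & (pagesz - 1)) == 0);

	/* mmap(2) offsets must be page-aligned, so map the enclosing page. */
	page_base = phys & ~((off_t)pagesz - 1);
	in_page = (size_t)(phys - page_base);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return (-1);
	map = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE, MAP_SHARED,
	    fd, page_base);
	close(fd); /* the mapping remains valid after the descriptor is closed */
	if (map == MAP_FAILED)
		return (-1);

	/* volatile keeps the compiler from eliding or merging the store. */
	*(volatile uint32_t *)((char *)map + in_page) = value;

	munmap(map, (size_t)pagesz);
	return (0);
}
```

Opening, mapping and unmapping on every access is what makes this approach slow; keeping one long-lived mapping of the whole BAR would avoid most of that cost, at the price of more state to manage.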