Date:      Sun, 27 Aug 2023 15:48:50 +0000
From:      bugzilla-noreply@freebsd.org
To:        virtualization@FreeBSD.org
Subject:   [Bug 270966] PCI passthru stops working after ~30 guest reboots (ivhd, ILLEGAL CMD, IO_PAGE_FAULT)
Message-ID:  <bug-270966-27103-hsVF8G1VQO@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-270966-27103@https.bugs.freebsd.org/bugzilla/>
References:  <bug-270966-27103@https.bugs.freebsd.org/bugzilla/>


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270966

Santiago Martinez <sm@codenetworks.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sm@codenetworks.net

--- Comment #16 from Santiago Martinez <sm@codenetworks.net> ---
Hi Raul, 

I'm seeing the same issue on an AMD EPYC processor. Judging by kernel.org, Linux
seems to have had issues with AMD-Vi as well. In the Linux world, many people
use iommu=pt to work around this. It is also listed as a known bug in the Red
Hat KB.

I'm running a script similar to yours and the server behaves quite erratically.

My script does the following:

- Starts and stops a VM with a PCI passthru device 200 times (in this case an
  SR-IOV VF, but the same happens without SR-IOV, or with any other device,
  non-network related).
- After those 200 cycles, it reboots the server.
- When the server starts, it runs the script again.
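
For anyone who wants to reproduce this, the loop above can be sketched roughly
as follows. This is not my actual script, only a minimal Python illustration;
the function name, its parameters and the commands passed in are all
placeholders, and the real start/stop commands depend on how the guest is
launched.

```python
import subprocess
from typing import Callable, Sequence


def stress_passthru(
    start_cmd: Sequence[str],
    stop_cmd: Sequence[str],
    reboot_cmd: Sequence[str],
    cycles: int = 200,
    run: Callable = subprocess.run,
) -> int:
    """Start and stop the guest `cycles` times, then reboot the host.

    All three commands are supplied by the caller (placeholders here).
    Returns the number of start/stop commands that exited non-zero.
    """
    failures = 0
    for cycle in range(1, cycles + 1):
        for cmd in (start_cmd, stop_cmd):
            result = run(list(cmd))
            if result.returncode != 0:
                failures += 1
                print(f"cycle {cycle}: {cmd[0]} exited with {result.returncode}")
    # After the last cycle the host is rebooted; running the script again
    # at boot (for example from an rc script) is left to the host setup.
    run(list(reboot_cmd))
    return failures
```

It would be called with whatever wrappers start and stop the guest, e.g.
stress_passthru(["./start-vm.sh"], ["./stop-vm.sh"], ["shutdown", "-r", "now"]),
where the two .sh names are made up for the example.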

Sometimes the script can start and stop the VM 200 times, even though I see
IVHD errors (command not completed or cmd error). Other times it can only start
and stop the VM once, and the server reboots after a few IO_PAGE_FAULTs
(something gets corrupted, the NVMe stops responding, and the machine reboots
after the command retry timeout).

The server showing the issue is a Supermicro H12SSW-NT:
- AMD EPYC 7552 48-Core Processor

I have updated the BIOS to the latest release, as issues with SP3 were
mentioned on the Linux forum.

Michael Dexter and I also tried to replicate it on other AMD processors,
without any success:
- AMD EPYC 7702P 64-Core Processor
- AMD Ryzen 7 3700X 8-Core Processor 
- Ryzen 6800H

-- 
You are receiving this mail because:
You are the assignee for the bug.

