Date: Wed, 11 Oct 2017 21:37:13 +0200
From: Harry Schmalzbauer <freebsd@omnilan.de>
To: freebsd-stable@freebsd.org, FreeBSD virtualization <freebsd-virtualization@freebsd.org>
Subject: bhyve ppt usage can cause severe RAM corruption [Was: Re: panic: Memory modified after free in zio_create, passthru in use]
Message-ID: <59DE72E9.1050006@omnilan.de>
In-Reply-To: <593D1D5C.907@omnilan.de>
References: <59369A15.2010901@omnilan.de> <593D1D5C.907@omnilan.de>

Regarding Harry Schmalzbauer's message of 11.06.2017 12:37 (localtime):
> Regarding Harry Schmalzbauer's message of 06.06.2017 14:03 (localtime):
>> Hello,
>>
>> suddenly, I'm getting this error:
>> /lib/libc.so.7: Undefined symbol "xdr_accepted_reply"
>>
>> Very mysterious: It showed up on a running system, which worked
>> flawlessly for some hours. And that host has root-fs (/) mounted
>> readonly from a memorydisk. So to my understanding, it's completely
>> impossible that /lib/libc.so.7 is corrupted since last boot.
>>
>> I'm completely out of ideas what could cause this strange error during
>> "normal" operation.
>>
>> Normal operation in this case is serving as a bhyve test machine.
>> I first noticed that error after one guest - with passthru device
>> attached - was shut down.
>>
>> My suspicion is some undiscovered passthru interference... Since I
>> noticed one other _very_ strange passthru-effect:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215740
> Hello,
>
> this time I caught a panic with a debugging kernel under 11.1-BETA1,
> which again occurred after shutting down a VM which had ppt in use:
> …
> Please, can any of the experts add a comment?

It turned out that the problem affects PCIe cards which don't support
FLR (Function Level Reset), as well as cards which are not PCIe, even
if they have FLR capability.
jhb@ helped me to diagnose this.

Unfortunately, I once forgot to manually bring down the passthrough NICs
in question, which resulted in a completely destroyed ZFS pool.
That hurt (I had to recreate the complete (system) pool), so I won't
rely on manual intervention before shutting down.

Unfortunately my skills don't allow me to help fix the root cause, so
I created a little rc(8) script, which should protect reliably.
Please see also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222937
Since it's quite small overhead, I'll also attach it here (to be copied
to /etc/rc.d).

-harry

--- attachment: pciptdetach (text/plain) ---

#!/bin/sh
#
# PROVIDE: pciptdetach
# REQUIRE: swap
# BEFORE: devd
# KEYWORD: shutdown

. /etc/rc.subr

name=pciptdetach
rcvar=pciptdetach_enable

load_rc_config ${name}

: ${pciptdetach_enable:="YES"}

start_cmd="true"
stop_cmd="${name}"

pciptdetach()
{
    sysctl -n hw.hv_vendor | grep -q bhyve || return 0
    echo "Disabling passthrough adapters:"
    pptcandidate=`pciconf -l | grep -v -E \
        "^([[:blank:]]|hostb|virtio|isab)[^@]+" | sed -n -E \
        's/^[[:blank:]]*(^[[:alnum:]]+)@([^[:blank:]]+)(:[[:blank:]]).*$/\2/p'`
    for pcidev in ${pptcandidate}; do
        drv_class=`pciconf -lv | grep -A 3 "@${pcidev}" | sed -n -E -e \
            's/^[[:blank:]]*class[[:blank:]]+=[[:blank:]]+([^[:blank:]].*)$/\1/p' \
            -e 's/^([[:alnum:]]+)@.*$/\1/p' | tr '\n' ' '`
        # Don't disable mass storage devices, might be busy for shutdown
        [ X"${drv_class}" = X"${drv_class%mass storage*}" ] || continue
        # Make sure network adapters don't have active vlan(4) clones.
        if [ -z "${netstoped}" ] && [ X"${drv_class}" != X"${drv_class%network*}" ]
        then
            /etc/rc.d/netif stop >/dev/null 2>&1 && netstoped=y
        fi
        # Non-PCIe devices and PCIe devices without FLR support are
        # known to cause RAM corruption.
        if ! pciconf -lc ${pcidev} | grep -A 20 PCI-Express | grep -q "[[:blank:]]FLR"
        then
            devctl disable ${pcidev} >/dev/null 2>&1 || echo " ${drv_class%% *}:FAILED"
        fi
    done
}

run_rc_command "$1"
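
To see in advance which devices the FLR test in the script above would flag, the same grep pipeline can be pulled out into a small helper and run as a dry run. This is only a sketch: the function name has_pcie_flr is made up for illustration and is not part of the attached script. It assumes, as the script does, that `pciconf -lc` prints " FLR" within 20 lines after the PCI-Express capability line when the capability is present. Nothing is disabled here.

```shell
#!/bin/sh
# Hypothetical helper (not part of the attached script): reads a
# capability listing, as printed by `pciconf -lc <selector>`, on stdin
# and succeeds only if the "PCI-Express" line or one of the 20 lines
# after it contains " FLR". This mirrors the test in pciptdetach.
has_pcie_flr()
{
    grep -A 20 "PCI-Express" | grep -q "[[:blank:]]FLR"
}

# Dry run, only where pciconf(8) exists (i.e. on FreeBSD): print each
# PCI selector with the result of the test. Unlike the rc script, this
# loop does not skip hostb/virtio/isab or mass storage devices.
if command -v pciconf >/dev/null 2>&1; then
    pciconf -l |
    sed -n -E 's/^[[:alnum:]]+@([^[:blank:]]+):[[:blank:]].*$/\1/p' |
    while read -r sel; do
        if pciconf -lc "$sel" | has_pcie_flr; then
            echo "$sel: PCIe FLR reported"
        else
            echo "$sel: no PCIe FLR reported"
        fi
    done
fi
```

Because the helper reads stdin, it can also be tried on saved output, e.g. `pciconf -lc pci0:0:4:0 | has_pcie_flr && echo ok` (the selector is a placeholder).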