Date:      Wed, 24 Jan 2024 00:23:06 +0000
From:      bugzilla-noreply@freebsd.org
To:        virtualization@FreeBSD.org
Subject:   [Bug 276575] Host can cause a crash in bhyve nvme emulation
Message-ID:  <bug-276575-27103@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276575

            Bug ID: 276575
           Summary: Host can cause a crash in bhyve nvme emulation
           Product: Base System
           Version: 14.0-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: bhyve
          Assignee: virtualization@FreeBSD.org
          Reporter: dpy@pobox.com

Created attachment 247908
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=247908&action=edit
output of windows minidump analysis

Hello,

OS: 14.0-RELEASE-p4

I have Windows 10 VMs which will BSOD under heavy disk load on the guest or
host.

There are two different cases. Both VMs run three disks (all NVMe-emulated):
one boot disk (C:) and two data disks (3 TB and 1 TB), backed by image files
on ZFS. sync is set to always and a small Optane ZIL device is used on the
pool. I am using vm-bhyve to manage the VMs.
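
A minimal sketch of roughly what such a vm-bhyve guest configuration looks
like (disk names here are illustrative, not my exact files):

loader="uefi"
cpu=4
memory=8G
network0_type="virtio-net"
network0_switch="public"
# all three disks use bhyve's NVMe emulation, backed by image files on ZFS
disk0_type="nvme"
disk0_name="disk0.img"     # boot disk (C:)
disk1_type="nvme"
disk1_name="data0.img"     # 3 TB data disk
disk2_type="nvme"
disk2_name="data1.img"     # 1 TB data disk

As I understand it, vm-bhyve turns each disk*_type="nvme" entry into a bhyve
slot argument of the form "-s <slot>,nvme,<path-to-image>", so the guest sees
each image file as its own emulated NVMe controller.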

Pool setup:
# zpool list -v data_pool
NAME                          SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
data_pool                    10.9T  3.95T  6.96T        -         -     1%    36%  1.00x    ONLINE  -
  mirror-0                   10.9T  3.95T  6.96T        -         -     1%  36.2%      -    ONLINE
    gpt/data_pool_00         10.9T      -      -        -         -      -      -      -    ONLINE
    gpt/data_pool_01         10.9T      -      -        -         -      -      -      -    ONLINE
logs                             -      -      -        -         -      -      -      -         -
  gpt/data_pool_zil            32G  3.17M  31.5G        -         -     0%  0.00%      -    ONLINE
cache                            -      -      -        -         -      -      -      -         -
  gpt/data_pool_cache_0_ssd   932G  55.9G   876G        -         -     0%  6.00%      -    ONLINE
  gpt/data_pool_cache_1_ssd   932G  53.9G   878G        -         -     0%  5.78%      -    ONLINE

The cache drives were replaced recently, so they haven't filled up yet.

In the first case, a full backup of the 3 TB data drive (a Windows backup)
could not complete. The failure would occur after the machine had been running
the backup for a while (2-3 minutes or more). These backups run at 200+ MB/s
(over a 10 Gb network) to the backup machine. The guest BSODed and the
minidumps it produced indicate NVMe issues. I can supply details if required.

The second case was a little worse: the VM itself was fairly quiet, but I was
testing an NFS connection (sending over a 10 Gb network), running at 400+ MB/s.
In this case the 2nd drive on the Windows VM just stopped working. Upon reboot
the disk image had been corrupted to the point of needing to be reformatted
before it would function. (I rolled back to a previous snapshot and reapplied
the day's transactions.)
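
For completeness, the recovery was a plain ZFS snapshot rollback of the
dataset holding the image, roughly as follows (dataset and snapshot names are
illustrative):

# zfs list -t snapshot -o name,creation data_pool/vms/win10
# zfs rollback -r data_pool/vms/win10@daily-latest

The day's transactions were then re-applied by hand inside the guest.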

The dataset has sync=always (Optane ZIL) and the VM dataset is on a raidz1
(3+1) vdev.
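
The sync behaviour mentioned above is set per dataset; for reference, checking
and setting it looks like this (dataset name is illustrative):

# zfs get sync data_pool/vms
# zfs set sync=always data_pool/vms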

--
You are receiving this mail because:
You are the assignee for the bug.


