From owner-freebsd-virtualization@FreeBSD.ORG Tue Jun  2 16:46:08 2015
From: Andriy Gapon <avg@FreeBSD.org>
To: freebsd-virtualization@FreeBSD.org
Subject: Re: bhyve: corrupting zfs pools?
Date: Tue, 02 Jun 2015 19:45:29 +0300
Message-ID: <556DDDA9.6090005@FreeBSD.org>
In-Reply-To: <556D9005.4020802@FreeBSD.org>

On 02/06/2015 14:14, Andriy Gapon wrote:
> 
> I am doing a simple experiment.
> 
> I get a FreeBSD image from here:
> ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/11.0/FreeBSD-11.0-CURRENT-amd64-r283577-20150526-memstick.img.xz
> 
> Then I run it in bhyve with two additional "disks" created with
> truncate -s 4g:
> $ bhyveload -m 1G -d \
>     ~/tmp/FreeBSD-11.0-CURRENT-amd64-r283577-20150526-memstick.img test
> $ bhyve -A -HP -s 0:0,hostbridge -s 1,lpc -s 2:0,virtio-net,tap0 \
>     -s 3:0,virtio-blk,/home/avg/tmp/FreeBSD-11.0-CURRENT-amd64-r283577-20150526-memstick.img \
>     -s 3:1,virtio-blk,/tmp/l2arc-test/hdd1,sectorsize=512/4096 \
>     -s 3:2,virtio-blk,/tmp/l2arc-test/hdd2,sectorsize=512/4096 \
>     -l com1,stdio -l com2,/dev/nmdm0A -c 2 -m 1g test
> 
> Note the sectorsize=512/4096 options.  I am not sure whether these
> options are what causes the trouble.
> 
> Then, in the VM:
> $ zpool create l2arc-test mirror /dev/vtbd1 /dev/vtbd2
> $ zfs create -p l2arc-test/ROOT/initial
> $ tar -c --one-file-system -f - / | tar -x -C /l2arc-test/ROOT/initial -f -
> 
> Afterwards, zpool status -v reports no problems.
> But then I run zpool scrub and get the following in the end:
> $ zpool status -v
>   pool: l2arc-test
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: scrub repaired 356K in 0h0m with 9 errors on Tue Jun  2 13:58:17 2015
> config:
> 
>         NAME            STATE     READ WRITE CKSUM
>         l2arc-test      ONLINE       0     0     9
>           mirror-0      ONLINE       0     0    18
>             vtbd1       ONLINE       0     0    25
>             vtbd2       ONLINE       0     0    23
> 
> errors: Permanent errors have been detected in the following files:
> 
>         /l2arc-test/ROOT/initial/usr/bin/svnlitesync
>         /l2arc-test/ROOT/initial/usr/freebsd-dist/kernel.txz
>         /l2arc-test/ROOT/initial/usr/freebsd-dist/src.txz
>         /l2arc-test/ROOT/initial/usr/lib/clang/3.6.1/lib/freebsd/libclang_rt.asan-x86_64.a
> 
> The same issue is reproducible with ahci-hd.
> 
> My host system is a recent amd64 CURRENT as well.  The hardware
> platform is AMD.

I used the following monstrous command line to reproduce the test in qemu:
$ qemu-system-x86_64 -smp 2 -m 1024 \
    -drive file=/tmp/livecd2/R2.img,format=raw,if=none,id=bootd \
    -device virtio-blk-pci,drive=bootd \
    -drive file=/tmp/l2arc-test/hdd1,if=none,id=hdd1,format=raw \
    -device virtio-blk-pci,drive=hdd1,logical_block_size=4096 \
    -drive file=/tmp/l2arc-test/hdd2,id=hdd2,if=none,format=raw \
    -device virtio-blk-pci,drive=hdd2,logical_block_size=4096 \
    -drive file=/tmp/l2arc-test/ssd,id=ssd,if=none,format=raw \
    -device virtio-blk-pci,drive=ssd,logical_block_size=4096 ...

I also tried several other variations of logical_block_size and
physical_block_size.  The tests are very slow, but there are no
checksum errors.
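Since the same backing files come through qemu without checksum errors,
one more way to localize this might be to take bhyve out of the I/O path
entirely and scrub the pool on the host.  A rough sketch (untested; zpool
import -d should find the file-backed vdevs in that directory):

$ zpool export l2arc-test                       # in the guest, before shutdown
$ zpool import -d /tmp/l2arc-test l2arc-test    # then on the host
$ zpool scrub l2arc-test
$ zpool status -v l2arc-test

If the host-side scrub reports the same checksum errors, the bad data
really was written to the backing files; if the pool scrubs clean there,
the corruption only appears on the way to or from the guest.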
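It might also be worth double-checking what geometry the guest actually
sees for the sectorsize=512/4096 disks.  If I am not mistaken, the
physical sector size should show up as the stripe size in GEOM, so
diskinfo(8) in the guest ought to report sectorsize 512 and stripesize
4096 for those providers:

$ diskinfo -v /dev/vtbd1    # check the sectorsize and stripesize lines
$ diskinfo -v /dev/vtbd2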
So, I suspect guest memory corruption caused by bhyve.  Perhaps the
problem is indeed specific to AMD-V.

-- 
Andriy Gapon