Date: Mon, 30 Dec 2019 19:06:11 +0100
From: Willem Jan Withagen <wjw@digiware.nl>
To: virtualization@FreeBSD.org
Subject: Adding a different type of blockstore to Bhyve
Message-ID: <6e5508d0-4a41-8442-3807-8b9e22bba933@digiware.nl>

Hi,

One of the ways to run a backing blockstore with KVM/Qemu is through the
Ceph Rados Block Device (RBD):
    https://github.com/qemu/qemu/blame/master/block/rbd.c
This makes it possible to use an RBD image as a boot image or as another
block device, and the virtual machine using this image can migrate to
another Dom0 host.

I've been working on Ceph for quite some time, and one of the ways to
offer a block device on FreeBSD is with rbd-ggate. This works through
geom-gate and gives a /dev/ggate# device that is mapped to an image in a
rados pool.

I am not involved in migration for Bhyve, but I would like to integrate
RBD into Bhyve as an alternative backing store. Something like:

    bhyve -s 1,virtio-blk,rbd:poolname/imagename[@snapshotname] \
        [:option1=value1[:option2=value2...]]

So I started browsing the bhyve code and ended up in block_if.{ch}. But
the code there is rather strongly targeted towards local filesystem
storage.

I also ran into net_backends.{ch}, and I guess it would be a nicer
solution to create a block_backends.{ch} as well, for interfacing to more
than just one blockstore provider, and then load the RBD provider in the
chain of blockstore providers. That way it would even be possible to make
that code dl-loadable in case the LGPL Ceph code is not directly
importable into the usr.sbin tree (which I suspect is the case).

The alternative is to start using the /dev/ggate# devices, but then we
probably lose the option of live migration. And performance takes a
serious hit: a block write/read would go from the VM kernel to the bhyve
process in userspace. Then it would go to /dev/ggate# and again end up in
the kernel, only to have geom-gate send it back to userspace, where
rbd-ggate sends it to the cluster. Just typing out this data flow takes a
lot of steps, which shows that this might not be the best architecture.

So the questions are:

1) Is the abstraction of block_backends.{ch} the way to go?
1.1) And would the extra indirection there be acceptable?
     (For network devices it seems to be no problem.)
2) Does anybody already have such a framework for blockdevs?
   (Otherwise I'll try to morph net_backends.{ch}.)
3) Other suggestions I need to consider?

Thanx,
--WjW
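
The provider-chain idea and the proposed rbd: option syntax could be sketched roughly as below. This is only an illustration under assumptions: struct blk_backend, blk_backend_lookup, struct rbd_target and rbd_target_parse are hypothetical names, not existing bhyve or Ceph interfaces, and the open callbacks are stubs that never touch librbd or a real file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-provider table, in the spirit of net_backends.{ch}. */
struct blk_backend {
    const char *prefix;               /* scheme before ':', e.g. "rbd" */
    int (*open)(const char *target);  /* stub here; returns 0 on success */
};

static int file_open(const char *target) { (void)target; return 0; }
static int rbd_open(const char *target)  { (void)target; return 0; }

static const struct blk_backend file_backend = { "file", file_open };
static const struct blk_backend rbd_backend  = { "rbd",  rbd_open };
static const struct blk_backend *const backends[] = { &rbd_backend, &file_backend };

/*
 * Pick a provider by its "scheme:" prefix. A spec without a known prefix
 * is treated as a local path and handed to the file provider unchanged.
 */
static const struct blk_backend *
blk_backend_lookup(const char *spec, const char **target)
{
    assert(spec != NULL && target != NULL);
    for (size_t i = 0; i < sizeof(backends) / sizeof(backends[0]); i++) {
        size_t n = strlen(backends[i]->prefix);
        if (strncmp(spec, backends[i]->prefix, n) == 0 && spec[n] == ':') {
            *target = spec + n + 1;
            return backends[i];
        }
    }
    *target = spec;
    return &file_backend;
}

/* Parsed form of poolname/imagename[@snapshotname][:opt=val[:opt=val...]] */
struct rbd_target {
    char pool[64];
    char image[64];
    char snap[64];     /* empty string when no snapshot was given */
    const char *opts;  /* points into the input string, or NULL */
};

/* Copy the half-open range [b, e) into dst; fail if it does not fit. */
static int
copy_span(char *dst, size_t dstsz, const char *b, const char *e)
{
    size_t n = (size_t)(e - b);
    if (n >= dstsz)
        return -1;
    memcpy(dst, b, n);
    dst[n] = '\0';
    return 0;
}

/* Returns 0 on success, -1 on a malformed or oversized target. */
static int
rbd_target_parse(const char *target, struct rbd_target *out)
{
    const char *slash = strchr(target, '/');
    if (slash == NULL || slash == target)
        return -1;                              /* pool name is required */
    const char *colon = strchr(slash + 1, ':');
    const char *end = colon ? colon : target + strlen(target);
    const char *at = memchr(slash + 1, '@', (size_t)(end - (slash + 1)));
    const char *img_end = at ? at : end;
    if (img_end == slash + 1 || (at != NULL && at + 1 == end))
        return -1;                              /* empty image or snapshot */
    if (copy_span(out->pool, sizeof(out->pool), target, slash) != 0 ||
        copy_span(out->image, sizeof(out->image), slash + 1, img_end) != 0)
        return -1;
    out->snap[0] = '\0';
    if (at != NULL && copy_span(out->snap, sizeof(out->snap), at + 1, end) != 0)
        return -1;
    out->opts = colon ? colon + 1 : NULL;
    return 0;
}
```

With a table like this, adding a provider would mean adding one entry, and a dl-loadable provider could in principle register its entry at load time; how that would fit the real block_if code is left open here.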