Date: Tue, 7 Jan 2020 10:53:20 +0100
From: Willem Jan Withagen <wjw@digiware.nl>
To: virtualization@FreeBSD.org
Subject: Re: Adding a different type of blockstore to Bhyve
Message-ID: <13c3951d-dd0e-7666-a56d-41f6368465c7@digiware.nl>
In-Reply-To: <6e5508d0-4a41-8442-3807-8b9e22bba933@digiware.nl>
References: <6e5508d0-4a41-8442-3807-8b9e22bba933@digiware.nl>
On 30-12-2019 19:06, Willem Jan Withagen wrote:
> Hi,
>
> One of the ways to run a backing blockstore with KVM/Qemu is thru
> the Ceph Rados Block Device (RBD).
> https://github.com/qemu/qemu/blame/master/block/rbd.c
>
> This makes it possible to use an RBD image as a boot image or other
> block device, and the virtual machine using this image can migrate
> to another Dom0 host.
>
> I've been working on Ceph for quite some time, and one of the ways to
> offer a block device on FreeBSD is with rbd-ggate.
> This works thru geom-gate and will give a /dev/ggate# device that is
> mapped to an image in a rados pool.
>
> I'm not into migration for Bhyve, but I would like to integrate RBD
> into Bhyve as an alternative backing store....
>
> Something like:
> bhyve -s 1,virtio-blk,rbd:poolname/imagename[@snapshotname] \
> [:option1=value1[:option2=value2...]]
>
> So I started browsing the bhyve code, and ended up in block_if.{hc}.
> But the code there is rather strongly targeted towards local
> filesystem storage....
>
> I also ran into net_backends.{ch}, and I guess it would be a nicer
> solution to create a block_backends.{ch} as well for interfacing to
> more than just one blockstore provider.
> And then load the RBD provider in the chain of blockstore providers.
> That way it would even be possible to make that code dl-loadable in
> case the LGPL ceph code is not directly importable in the usr.sbin
> tree. (Which I suspect it is)
>
> The alternative is to start using the /dev/ggate# devices, but then
> we probably lose the option of live migration.
> And performance takes a serious hit:
> A block write/read would go from the vm kernel
> to the bhyve process in userspace.
> Then it would go to /dev/ggate# and again end up in the kernel,
> only to have geom-gate send it back to userspace,
> where rbd-ggate sends it to the cluster.
>
> Just typing out this data flow takes a lot of steps, showing that
> this might not be the best architecture.
>
> So the questions are:
> 1) Is the abstraction of block_backends.{ch} the way to go?
> 1.1) And would the extra indirection there be acceptable?
> (For network devices it seems no problem)
>
> 2) Does anybody already have such a framework for blockdevs?
> (Otherwise I'll try to morph the net_backends.{ch})
>
> 3) Other suggestions I need to consider?

Looking for reviewers of:
https://reviews.freebsd.org/D23010

In the days after New Year I made a first attempt to refactor the
block_if stuff into a generic backend: blockbe_
and an implementation of the local storage: lockblk_

I've submitted it to Phabricator, and I'm seeking reviews and
ultimately somebody who will commit this once all issues are worked out.

--WjW
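
For readers who want something concrete, the backend dispatch and the
rbd:poolname/imagename[@snapshotname][:option=value...] syntax discussed
above could be sketched roughly as below. This is only an illustration
under stated assumptions: blockbe_ops, blockbe_table, blockbe_lookup,
rbd_spec, rbd_parse and the stub open functions are hypothetical names,
not taken from D23010, the bhyve tree or librbd, and no real Ceph calls
are made.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical backend descriptor; not the structure used in D23010. */
struct blockbe_ops {
	const char *scheme;		/* text before ':', NULL = fallback */
	int (*open)(const char *spec);	/* returns 0 on success */
};

/* Stubs standing in for real providers; they do no I/O at all. */
static int file_open(const char *spec) { (void)spec; return (0); }
static int rbd_open(const char *spec) { (void)spec; return (0); }

static const struct blockbe_ops blockbe_table[] = {
	{ "rbd", rbd_open },
	{ NULL, file_open },		/* local file or device node */
};

/* Select a backend by "scheme:" prefix; *rest is the text after it. */
const struct blockbe_ops *
blockbe_lookup(const char *spec, const char **rest)
{
	size_t i, n;

	for (i = 0; i < sizeof(blockbe_table) / sizeof(blockbe_table[0]); i++) {
		const char *s = blockbe_table[i].scheme;

		if (s == NULL) {		/* fallback matches anything */
			*rest = spec;
			return (&blockbe_table[i]);
		}
		n = strlen(s);
		if (strncmp(spec, s, n) == 0 && spec[n] == ':') {
			*rest = spec + n + 1;
			return (&blockbe_table[i]);
		}
	}
	return (NULL);
}

/* Parsed form of "pool/image[@snap][:opt=val...]" (hypothetical). */
struct rbd_spec {
	char pool[64];
	char image[64];
	char snap[64];		/* empty string when no snapshot is given */
	const char *opts;	/* unparsed "opt=val:..." tail, or NULL */
};

/* Split the spec into its parts; returns 0 on success, -1 on error. */
int
rbd_parse(const char *s, struct rbd_spec *out)
{
	const char *slash, *tail, *at;
	size_t end, plen, ilen, slen;

	slash = strchr(s, '/');
	if (slash == NULL || slash == s)
		return (-1);
	tail = slash + 1;
	end = strcspn(tail, ":");	/* length of "image[@snap]" */
	at = memchr(tail, '@', end);
	plen = (size_t)(slash - s);
	ilen = (at != NULL) ? (size_t)(at - tail) : end;
	slen = (at != NULL) ? end - ilen - 1 : 0;
	if (ilen == 0 || plen >= sizeof(out->pool) ||
	    ilen >= sizeof(out->image) || slen >= sizeof(out->snap))
		return (-1);
	memcpy(out->pool, s, plen);
	out->pool[plen] = '\0';
	memcpy(out->image, tail, ilen);
	out->image[ilen] = '\0';
	memcpy(out->snap, (at != NULL) ? at + 1 : "", slen);
	out->snap[slen] = '\0';
	out->opts = (tail[end] == ':') ? tail + end + 1 : NULL;
	return (0);
}
```

With this sketch, blockbe_lookup("rbd:poolname/imagename@snapshotname", &rest)
would select the rbd entry and leave rest pointing at
"poolname/imagename@snapshotname", which rbd_parse() then splits into
pool, image and snapshot. A plain path such as /dev/ggate0 falls through
to the local-file entry. A table of function pointers like this is what
would make a dl-loadable provider possible, at the cost of the one extra
indirection asked about in question 1.1.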