FreeBSD Mail Archives

Date:      Mon, 30 Dec 2019 19:06:11 +0100
From:      Willem Jan Withagen <wjw@digiware.nl>
To:        virtualization@FreeBSD.org
Subject:   Adding a different type of blockstore to Bhyve
Message-ID:  <6e5508d0-4a41-8442-3807-8b9e22bba933@digiware.nl>


Hi,

One of the ways to run a backing blockstore with KVM/Qemu is through
the Ceph RADOS Block Device (RBD):
https://github.com/qemu/qemu/blame/master/block/rbd.c

This makes it possible to use an RBD image as a boot image or other
block device, and lets the virtual machine using that image migrate to
another Dom0 host.

I've been working on Ceph for quite some time, and one of the ways to
offer a block device on FreeBSD is with rbd-ggate.
This works through geom-gate and will give a /dev/ggate# device that is
mapped to an image in a rados pool.

I am not working on migration for Bhyve myself, but I would like to
integrate RBD into Bhyve as an alternative backing store.

Something like:
   bhyve -s 1,virtio-blk,rbd:poolname/imagename[@snapshotname] \
                          [:option1=value1[:option2=value2...]]
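To make the proposed syntax concrete, here is a minimal sketch of how the
part after "virtio-blk," could be split into its fields. The struct and
function names (rbd_spec, rbd_spec_parse) and the fixed buffer sizes are
hypothetical and exist only for illustration; nothing like this is in bhyve
today, and the option string is left unparsed because the option names are
not defined yet.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not part of bhyve or librbd. */
struct rbd_spec {
	char pool[64];
	char image[64];
	char snap[64];   /* empty string when no @snapshotname is given */
	char opts[128];  /* raw "option1=value1:option2=value2" tail, may be empty */
};

/* Copy src[0..len) into dst as a C string; fail if it does not fit. */
static int
copy_field(char *dst, size_t dstsz, const char *src, size_t len)
{
	if (len >= dstsz)
		return (-1);
	memcpy(dst, src, len);
	dst[len] = '\0';
	return (0);
}

/*
 * Parse "rbd:poolname/imagename[@snapshotname][:option1=value1[:...]]".
 * Returns 0 on success and -1 on malformed input.
 */
int
rbd_spec_parse(const char *arg, struct rbd_spec *out)
{
	const char *p, *slash, *colon, *at, *end, *img_end;

	memset(out, 0, sizeof(*out));
	if (strncmp(arg, "rbd:", 4) != 0)
		return (-1);
	p = arg + 4;

	/* The pool name runs up to the first '/' and must not be empty. */
	slash = strchr(p, '/');
	if (slash == NULL || slash == p)
		return (-1);
	if (copy_field(out->pool, sizeof(out->pool), p, (size_t)(slash - p)) != 0)
		return (-1);
	p = slash + 1;

	/* Options, if any, start at the first ':' after the pool separator. */
	colon = strchr(p, ':');
	end = (colon != NULL) ? colon : p + strlen(p);

	/* An optional '@' before the options separates image and snapshot. */
	at = memchr(p, '@', (size_t)(end - p));
	img_end = (at != NULL) ? at : end;
	if (img_end == p)
		return (-1);
	if (copy_field(out->image, sizeof(out->image), p, (size_t)(img_end - p)) != 0)
		return (-1);

	if (at != NULL) {
		if (at + 1 == end)
			return (-1);	/* '@' with no snapshot name */
		if (copy_field(out->snap, sizeof(out->snap), at + 1,
		    (size_t)(end - (at + 1))) != 0)
			return (-1);
	}
	if (colon != NULL &&
	    copy_field(out->opts, sizeof(out->opts), colon + 1,
	    strlen(colon + 1)) != 0)
		return (-1);
	return (0);
}
```

A string that does not start with "rbd:" is rejected, so an existing
file or device path would still fall through to the current code path.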

So I started browsing the bhyve code and ended up in block_if.{ch}.
But the code there is rather strongly targeted towards local
filesystem storage.

I also ran into net_backends.{ch}, and I guess it would be a nicer
solution to create a block_backends.{ch} as well for interfacing to
more than just one blockstore provider.
And then load the RBD provider in the chain of blockstore providers.
That way it would even be possible to make that code dl-loadable in
case the LGPL Ceph code is not directly importable into the usr.sbin
tree (which I suspect is the case).
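As a rough illustration of what such a block_backends.{ch} could look like,
here is a minimal sketch of a provider table with registration and lookup by
device-string prefix. Everything in it (struct block_backend, the callback
set, block_backend_register, block_backend_find, the table size) is an
assumption made up for this example; it is not the existing net_backends
interface and not current bhyve code.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical provider descriptor; all names are illustrative only. */
struct block_backend {
	const char *prefix;	/* e.g. "rbd:"; NULL marks the default provider */
	int	(*open)(const char *spec, void **handle);
	ssize_t	(*read)(void *handle, void *buf, size_t len, off_t off);
	ssize_t	(*write)(void *handle, const void *buf, size_t len, off_t off);
	int	(*flush)(void *handle);
	void	(*close)(void *handle);
};

#define	MAX_BLOCK_BACKENDS	8

static const struct block_backend *block_backends[MAX_BLOCK_BACKENDS];
static size_t n_block_backends;

/* Add a provider to the chain; returns -1 when the table is full. */
int
block_backend_register(const struct block_backend *be)
{
	if (be == NULL || n_block_backends == MAX_BLOCK_BACKENDS)
		return (-1);
	block_backends[n_block_backends++] = be;
	return (0);
}

/*
 * Pick the provider whose prefix matches the start of the device string.
 * When nothing matches, fall back to the provider registered without a
 * prefix (the local file/device case), or NULL if there is none.
 */
const struct block_backend *
block_backend_find(const char *spec)
{
	const struct block_backend *fallback = NULL;
	size_t i;

	for (i = 0; i < n_block_backends; i++) {
		const struct block_backend *be = block_backends[i];

		if (be->prefix == NULL)
			fallback = be;
		else if (strncmp(spec, be->prefix, strlen(be->prefix)) == 0)
			return (be);
	}
	return (fallback);
}
```

With a table like this, a dl-loadable RBD provider would only need to call
the register function from its load hook, and the extra cost per I/O would
be one indirect call through the selected provider.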

The alternative is to start using the /dev/ggate# devices but then
we probably lose the option of live migration.
And performance takes a serious hit:
     A block write/read would go from the vm kernel
     to the bhyve process in userspace.
     Then it would go to /dev/ggate# and again end up in the kernel
     only to have geom-gate send it back to userspace
     where rbd-ggate sends it to the cluster.

Just typing out this data flow takes a lot of steps, which suggests
that this might not be the best architecture.

So the questions are:
1)   Is the abstraction of block_backends.{ch} the way to go?
1.1) And would the extra indirection there be acceptable?
      (For network devices it seems no problem)

2)   Does anybody already have such a framework for blockdevs?
      (Otherwise I'll try to morph the net_backends.{ch})

3)   Other suggestions I need to consider?

Thanx,
--WjW

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?6e5508d0-4a41-8442-3807-8b9e22bba933>
