From: Willem Jan Withagen
To: virtualization@FreeBSD.org
Subject: Re: Adding a different type of blockstore to Bhyve
Date: Tue, 7 Jan 2020 10:53:20 +0100
Message-ID: <13c3951d-dd0e-7666-a56d-41f6368465c7@digiware.nl>
In-Reply-To: <6e5508d0-4a41-8442-3807-8b9e22bba933@digiware.nl>

On 30-12-2019 19:06, Willem Jan Withagen wrote:
> Hi,
>
> One of the ways to run a backing blockstore with KVM/QEMU is through
> the Ceph RADOS Block Device (RBD).
> https://github.com/qemu/qemu/blame/master/block/rbd.c
>
> This makes it possible to use an RBD image as a boot image or other
> block device, and the virtual machine using the image can migrate to
> another Dom0 host.
>
> I've been working on Ceph for quite some time, and one of the ways to
> offer a block device on FreeBSD is with rbd-ggate.
> This works through geom-gate and gives a /dev/ggate# device that is
> mapped to an image in a RADOS pool.
>
> I'm not into migration for Bhyve, but I would like to integrate RBD
> into Bhyve as an alternative backing store.
>
> Something like:
>   bhyve -s 1,virtio-blk,rbd:poolname/imagename[@snapshotname] \
>                          [:option1=value1[:option2=value2...]]
>
> So I started browsing the bhyve code and ended up in block_if.{hc}.
> But the code there is rather strongly targeted towards local
> filesystem storage.
>
> I also ran into net_backends.{ch}, and I guess it would be a nicer
> solution to create a block_backends.{ch} as well, for interfacing to
> more than just one blockstore provider,
> and then load the RBD provider in the chain of blockstore providers.
> That way it would even be possible to make that code dl-loadable in
> case the LGPL Ceph code is not directly importable into the usr.sbin
> tree. (Which I suspect it is)
>
> The alternative is to start using the /dev/ggate# devices, but then
> we probably lose the option of live migration.
> And performance takes a serious hit:
>     A block write/read would go from the vm kernel
>     to the bhyve process in userspace.
>     Then it would go to /dev/ggate# and again end up in the kernel,
>     only to have geom-gate send it back to userspace,
>     where rbd-ggate sends it to the cluster.
>
> Just typing out this data flow takes a lot of steps, which shows that
> this might not be the best architecture.
>
> So the questions are:
> 1)   Is the abstraction of block_backends.{ch} the way to go?
> 1.1) And would the extra indirection there be acceptable?
>      (For network devices it seems to be no problem)
>
> 2)   Does anybody already have such a framework for blockdevs?
>      (Otherwise I'll try to morph the net_backends.{ch})
>
> 3)   Other suggestions I need to consider?

Looking for reviewers of:
    https://reviews.freebsd.org/D23010

In the days after New Year I made a first attempt to refactor the
block_if code into a generic backend:
    blockbe_
and an implementation of the local storage:
    lockblk_

I've submitted it to Phabricator, and I'm looking for reviews and
ultimately somebody who will commit this once all issues are worked out.

--WjW
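
P.S. For readers who have not looked at net_backends.{ch}: the kind of
abstraction asked about in question 1 could be sketched roughly as
below. This is only an illustration and not the code in D23010. Every
name in it (struct blk_backend, blk_backend_lookup, the "rbd" and
"local" entries) is hypothetical, and the open functions are stubs that
touch neither librbd nor a real file. It only shows how a prefix such
as "rbd:" in the -s argument could select a provider from a table,
with local storage as the fallback.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-provider operations table (illustration only). */
struct blk_backend {
	const char *name;    /* provider name, for diagnostics */
	const char *scheme;  /* prefix before ':'; NULL marks the fallback */
	int (*open)(const char *spec);
};

/* Stub: a real local provider would open(2) the path here. */
static int
local_open(const char *spec)
{
	(void)spec;
	return (0);
}

/* Stub: only checks for the "poolname/imagename" shape. */
static int
rbd_open(const char *spec)
{
	return (strchr(spec, '/') != NULL ? 0 : -1);
}

static const struct blk_backend backends[] = {
	{ "rbd",   "rbd", rbd_open },
	{ "local", NULL,  local_open },  /* must stay last: matches anything */
};

/*
 * Pick a provider from the "scheme:" prefix of spec. On return *rest
 * points at the part of spec that the provider should interpret.
 */
const struct blk_backend *
blk_backend_lookup(const char *spec, const char **rest)
{
	size_t i, n;

	for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++) {
		const struct blk_backend *be = &backends[i];

		if (be->scheme == NULL) {
			*rest = spec;
			return (be);
		}
		n = strlen(be->scheme);
		if (strncmp(spec, be->scheme, n) == 0 && spec[n] == ':') {
			*rest = spec + n + 1;
			return (be);
		}
	}
	return (NULL);
}
```

With this table, "rbd:poolname/imagename" resolves to the rbd entry
with "poolname/imagename" left for the provider to parse, while a plain
path such as "/vm/disk.img" falls through to the local entry. The extra
cost per request is one call through a function pointer, which is the
indirection question 1.1 is about.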