From: Alexander Motin
Date: Mon, 06 Apr 2015 23:38:56 +0300
Message-ID: <5522EEE0.5010807@FreeBSD.org>
To: Julian Hsiao, freebsd-virtualization@freebsd.org
Subject: Re: Bhyve storage improvements (was: Several bhyve quirks)

Hi, Julian.

> I had some time to try it out today, but I'm still having issues:

I have just run an experiment similar to yours, making bhyve work on top of a GEOM device instead of the preferable "dev" mode of a ZVOL, and I did reproduce the problem. But the problem I see is not related to the block size. The block size is reported to the guest correctly as 4K, and as far as I can see it is used as such, at least in a FreeBSD guest.

The problem is in the way bhyve interoperates with block/GEOM devices. bhyve sends requests to the kernel with preadv()/pwritev() calls, specifying scatter/gather lists of buffer addresses provided by the guest. But the GEOM code cannot handle scatter/gather lists, only a sequential buffer, so a single request is split into several. The splitting follows the scatter/gather elements, and in the general case those elements may not be a multiple of the block size, which is fatal for GEOM and any block device.

I am not yet sure how to fix this problem. The most straightforward way is to copy the data at some point, collecting the elements of the scatter/gather list into something sequential to pass to GEOM, but that requires additional memory allocation, and the copying is not free. Maybe some cases could be optimized to work without copying, using some clever page mapping, but that seems far from trivial.

--
Alexander Motin
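
To illustrate the splitting problem and the copy-based workaround described above, here is a rough sketch in Python. It is only a model of the arithmetic, not bhyve or GEOM code: the function names, the example element sizes, and the 8K guest request are all assumptions made for illustration. The only value taken from this thread is the 4K block size.

```python
# Illustrative model only; every name here is hypothetical, not bhyve/GEOM code.
BLOCK_SIZE = 4096  # 4K block size, as reported to the guest in this thread


def split_per_element(iov_lengths):
    """Model the per-element split: one request per scatter/gather element."""
    return list(iov_lengths)


def misaligned(requests, block_size=BLOCK_SIZE):
    """Return the request lengths that are not a multiple of the block size."""
    return [n for n in requests if n % block_size != 0]


def coalesce(buffers):
    """Copy all scatter/gather elements into one sequential bounce buffer."""
    bounce = bytearray(sum(len(b) for b in buffers))
    offset = 0
    for b in buffers:
        bounce[offset:offset + len(b)] = b
        offset += len(b)
    return bounce


# Assumed example: an 8192-byte guest request whose buffer starts 3584 bytes
# into a page, so it arrives as three elements of 512, 4096 and 3584 bytes.
guest_iov = [bytes(512), bytes(4096), bytes(3584)]

per_element = split_per_element(len(b) for b in guest_iov)
single = coalesce(guest_iov)

print("per-element requests:", per_element)
print("misaligned after split:", misaligned(per_element))
print("misaligned after coalescing:", misaligned([len(single)]))
```

The total request is a multiple of the block size, but two of the three elements are not, so splitting by element produces requests a block device cannot accept. Coalescing avoids that at the price of one allocation and one copy per request, which is the cost mentioned above.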