Date:      Sat, 2 Dec 2017 23:53:02 -0500
From:      Allan Jude <allanjude@freebsd.org>
To:        "K. Macy" <kmacy@freebsd.org>
Cc:        "freebsd-virtualization@freebsd.org" <freebsd-virtualization@freebsd.org>
Subject:   Re: bhyve uses all available memory during IO-intensive operations
Message-ID:  <571ab0b4-ec6c-2bc4-438b-d3dce35cd775@freebsd.org>
In-Reply-To: <CAHM0Q_P9G8CND3XrfpNoVRoNws+MgR02jpJRY3BQpY93Eo5s0Q@mail.gmail.com>
References:  <F4E35CB9-30F9-4C63-B4CC-F8ADC9947E3C@ebureau.com> <CAHM0Q_MPNEBq=J9yJADhzA96nKvdgEiFESV-0Y9JB5mewfGspQ@mail.gmail.com> <59DFCE5F-029F-4585-B0BA-8FABC43357F2@ebureau.com> <11e6e55d-9802-c9fc-859c-37c026eaba2b@freebsd.org> <CAHM0Q_P9G8CND3XrfpNoVRoNws+MgR02jpJRY3BQpY93Eo5s0Q@mail.gmail.com>

On 2017-12-02 20:21, K. Macy wrote:
> On Sat, Dec 2, 2017 at 5:16 PM, Allan Jude <allanjude@freebsd.org> wrote:
>> On 12/02/2017 00:23, Dustin Wenz wrote:
>>> I have noticed significant storage amplification for my zvols; that could very well be the reason. I would like to know more about why it happens.
>>>
>>> Since the volblocksize is 512 bytes, I certainly expect extra cpu overhead (and maybe an extra 1k or so worth of checksums for each 128k block in the vm), but how do you get a 10X expansion in stored data?
>>>
>>> What is the recommended zvol block size for a FreeBSD/ZFS guest? Perhaps 4k, to match the most common mass storage sector size?
>>>
>>>     - .Dustin
>>>
>>>> On Dec 1, 2017, at 9:18 PM, K. Macy <kmacy@freebsd.org> wrote:
>>>>
>>>> One thing to watch out for with chyves if your virtual disk is more
>>>> than 20G is the fact that it uses 512 byte blocks for the zvols it
>>>> creates. I ended up using up 1.4TB only half filling up a 250G zvol.
>>>> Chyves is quick and easy, but it's not exactly production ready.
>>>>
>>>> -M
>>>>
>>>>
>>>>
>>>>> On Thu, Nov 30, 2017 at 3:15 PM, Dustin Wenz <dustinwenz@ebureau.com> wrote:
>>>>> I'm using chyves on FreeBSD 11.1 RELEASE to manage a few VMs (guest OS is also FreeBSD 11.1). Their sole purpose is to house some medium-sized Postgres databases (100-200GB). The host system has 64GB of real memory and 112GB of swap. I have configured each guest to only use 16GB of memory, yet while doing my initial database imports in the VMs, bhyve will quickly grow to use all available system memory and then be killed by the kernel:
>>>>>
>>>>>        kernel: swap_pager: I/O error - pageout failed; blkno 1735,size 4096, error 12
>>>>>        kernel: swap_pager: I/O error - pageout failed; blkno 1610,size 4096, error 12
>>>>>        kernel: swap_pager: I/O error - pageout failed; blkno 1763,size 4096, error 12
>>>>>        kernel: pid 41123 (bhyve), uid 0, was killed: out of swap space
>>>>>
>>>>> The OOM condition seems related to doing moderate IO within the VM, though nothing within the VM itself shows high memory usage. This is the chyves config for one of them:
>>>>>
>>>>>        bargs                      -A -H -P -S
>>>>>        bhyve_disk_type            virtio-blk
>>>>>        bhyve_net_type             virtio-net
>>>>>        bhyveload_flags
>>>>>        chyves_guest_version       0300
>>>>>        cpu                        4
>>>>>        creation                   Created on Mon Oct 23 16:17:04 CDT 2017 by chyves v0.2.0 2016/09/11 using __create()
>>>>>        loader                     bhyveload
>>>>>        net_ifaces                 tap51
>>>>>        os                         default
>>>>>        ram                        16G
>>>>>        rcboot                     0
>>>>>        revert_to_snapshot
>>>>>        revert_to_snapshot_method  off
>>>>>        serial                     nmdm51
>>>>>        template                   no
>>>>>        uuid                       8495a130-b837-11e7-b092-0025909a8b56
>>>>>
>>>>>
>>>>> I've also tried using different bhyve_disk_types, with no improvement. How is it that bhyve can use far more memory than I'm specifying?
>>>>>
>>>>>        - .Dustin
>>> _______________________________________________
>>> freebsd-virtualization@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
>>> To unsubscribe, send any mail to "freebsd-virtualization-unsubscribe@freebsd.org"
>>>
>>
>> Storage amplification usually has to do with ZFS RAID-Z padding. If your
>> ZVOL block size does not make sense with your disk sector size, and
>> RAID-Z level, you can get pretty silly numbers.
> 
> That's not what I'm talking about here. If your volblocksize is too
> small you end up using (vastly) more space for indirect blocks than
> data blocks.
> 
> -M
> 
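K. Macy's point about indirect blocks can be made concrete with a rough calculator (a sketch only: the 128-byte size of a ZFS block pointer, blkptr_t, is real, but this counts only one level of indirection and ignores the fact that indirect blocks compress, so treat it as an uncompressed ballpark, not an exact figure):

```python
BLKPTR_SIZE = 128  # bytes per ZFS block pointer (sizeof(blkptr_t))

def indirect_overhead(volsize_bytes, volblocksize):
    """Rough lower bound on indirect-block metadata for a zvol:
    one 128-byte block pointer per data block (single indirection
    level only, compression of indirect blocks ignored)."""
    nblocks = volsize_bytes // volblocksize
    return nblocks * BLKPTR_SIZE

# 250G zvol with 512-byte blocks: ~62 GiB of block pointers alone
print(indirect_overhead(250 << 30, 512) >> 30)   # 62
# the same zvol with 8k blocks: under 4 GiB of block pointers
print(indirect_overhead(250 << 30, 8192) >> 30)  # 3
```

Combined with the RAID-Z padding described below, metadata on a 512-byte-block zvol can plausibly dwarf the data itself, which is consistent with the 1.4TB-for-a-half-full-250G figure reported above.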

In addition, if you have, say, 4k sectors and RAID-Z2, every allocation of
4k or less requires 12k of disk space (one data sector plus two parity
sectors).

Allocations of 8k are also inflated in this case, since every allocation
must be a multiple of 1+p sectors, where p is the parity level. Allocating
8kb of data takes 2x 4k data sectors plus 2x 4k parity sectors = 4 sectors,
which is then rounded up to the next multiple of 3, giving 6 sectors.

That means 8kb of data takes 8kb for data + 8kb for parity + 8kb for
padding = 24kb of space.

If you were using RAID-Z1, it would have been 16kb: 2 data sectors + 1
parity sector = 3 sectors, rounded up to the next multiple of 2 (1+p), so
8kb data, 4kb parity, 4kb padding.

Or if you used a 16kb volblocksize on the zvol: 4 data sectors + 2 parity
sectors = 6 sectors, already a multiple of 3, so no padding is required.
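The arithmetic above can be wrapped in a small throwaway calculator (a sketch of the allocation rule only, assuming the whole allocation fits in a single stripe row with p parity sectors; it is not how ZFS lays out wide stripes in general):

```python
def raidz_asize(data_bytes, parity, sector=4096):
    """Rough RAID-Z allocated size for one small block: data sectors
    plus `parity` parity sectors, rounded up to a multiple of
    (parity + 1) so no unusable gap is left on the vdev."""
    data_sectors = -(-data_bytes // sector)   # ceiling division
    total = data_sectors + parity             # add parity sectors
    unit = parity + 1                         # allocation granularity
    total = -(-total // unit) * unit          # round up to a multiple of unit
    return total * sector

# RAID-Z2 with 4k sectors, matching the examples above:
print(raidz_asize(4096, parity=2) // 1024)    # 12 (1 data + 2 parity)
print(raidz_asize(8192, parity=2) // 1024)    # 24 (4 sectors padded to 6)
print(raidz_asize(16384, parity=2) // 1024)   # 24 (6 sectors, no padding)
```

Run against a few candidate volblocksize values, this makes it easy to see why a volblocksize that fills whole allocation units (e.g. created with something like `zfs create -V 250G -o volblocksize=16k pool/vmdisk`) wastes far less space on this pool layout than 512-byte blocks do.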

-- 
Allan Jude


