From: Paul Webster
Date: Sat, 2 Dec 2017 11:11:21 +0000
Subject: Re: bhyve uses all available memory during IO-intensive operations
To: "K. Macy"
Cc: Dustin Wenz, "freebsd-virtualization@freebsd.org"

Since I happened to be near an ext4 system at the time: apparently the
ext4 default block size is 4096 bytes.

sudo tune2fs -l /dev/sda
tune2fs 1.43.4 (31-Jan-2017)
Filesystem volume name:   xdock
Last mounted on:          /var/lib/docker
Filesystem UUID:          b1dd0790-970d-4596-9192-49c704337015
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              14655488
Block count:              58607766
Reserved block count:     2930388
Free blocks:              44314753
Free inodes:              13960548
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Thu Nov 9 10:32:16 2017
Last mount time:          Wed Nov 29 17:08:30 2017
Last write time:          Wed Nov 29 17:08:30 2017
Mount count:              21
Maximum mount count:      -1
Last checked:             Thu Nov 9 10:32:16 2017
Check interval:           0 ()
Lifetime writes:          147 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      e943c6b0-9b5c-402a-a2ca-5f7dd094712d
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0x04f644e2

On 2 December 2017 at 06:47, K. Macy wrote:

> On Fri, Dec 1, 2017 at 9:23 PM, Dustin Wenz wrote:
> > I have noticed significant storage amplification for my zvols; that could
> > very well be the reason. I would like to know more about why it happens.
> >
> > Since the volblocksize is 512 bytes, I certainly expect extra cpu overhead
> > (and maybe an extra 1k or so worth of checksums for each 128k block in the
> > vm), but how do you get a 10X expansion in stored data?
> >
> > What is the recommended zvol block size for a FreeBSD/ZFS guest? Perhaps 4k,
> > to match the most common mass storage sector size?
>
> I would err somewhat larger, the benefits of shallower indirect block
> chains will outweigh the cost of RMW I would guess. And I think it
> should be your guest file system block size. I don't know what ext4
> is, but ext2/3 was 16k by default IIRC.
>
> -M
>
> >
> > - .Dustin
> >
> > On Dec 1, 2017, at 9:18 PM, K. Macy wrote:
> >
> > One thing to watch out for with chyves if your virtual disk is more
> > than 20G is the fact that it uses 512 byte blocks for the zvols it
> > creates. I ended up using up 1.4TB only half filling up a 250G zvol.
> > Chyves is quick and easy, but it's not exactly production ready.
> >
> > -M
> >
> > On Thu, Nov 30, 2017 at 3:15 PM, Dustin Wenz wrote:
> >
> > I'm using chyves on FreeBSD 11.1 RELEASE to manage a few VMs (guest OS is
> > also FreeBSD 11.1). Their sole purpose is to house some medium-sized
> > Postgres databases (100-200GB). The host system has 64GB of real memory and
> > 112GB of swap.
> > I have configured each guest to only use 16GB of memory, yet while
> > doing my initial database imports in the VMs, bhyve will quickly grow
> > to use all available system memory and then be killed by the kernel:
> >
> >     kernel: swap_pager: I/O error - pageout failed; blkno 1735,size 4096, error 12
> >     kernel: swap_pager: I/O error - pageout failed; blkno 1610,size 4096, error 12
> >     kernel: swap_pager: I/O error - pageout failed; blkno 1763,size 4096, error 12
> >     kernel: pid 41123 (bhyve), uid 0, was killed: out of swap space
> >
> > The OOM condition seems related to doing moderate IO within the VM, though
> > nothing within the VM itself shows high memory usage. This is the chyves
> > config for one of them:
> >
> >     bargs                      -A -H -P -S
> >     bhyve_disk_type            virtio-blk
> >     bhyve_net_type             virtio-net
> >     bhyveload_flags
> >     chyves_guest_version       0300
> >     cpu                        4
> >     creation                   Created on Mon Oct 23 16:17:04 CDT 2017 by chyves v0.2.0 2016/09/11 using __create()
> >     loader                     bhyveload
> >     net_ifaces                 tap51
> >     os                         default
> >     ram                        16G
> >     rcboot                     0
> >     revert_to_snapshot
> >     revert_to_snapshot_method  off
> >     serial                     nmdm51
> >     template                   no
> >     uuid                       8495a130-b837-11e7-b092-0025909a8b56
> >
> > I've also tried using different bhyve_disk_types, with no improvement. How
> > is it that bhyve can use far more memory than I'm specifying?
> >
> > - .Dustin
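
As a rough check on the figures quoted in this thread, the sketch below works out the amplification implied by "1.4TB used for a half-filled 250G zvol" and compares it with a deliberately simple model in which every volume block is rounded up to a whole device sector. The 4096-byte sector size is an assumption (the thread never states the pool's sector size or layout), the function names are made up for illustration, and the model ignores parity, compression, and metadata, so treat it as a back-of-the-envelope estimate and not an explanation of what ZFS actually did here.

```python
import math

GB = 10**9
TB = 10**12


def observed_amplification(used_bytes: float, logical_bytes: float) -> float:
    """Ratio of space consumed on the pool to data written by the guest."""
    return used_bytes / logical_bytes


def allocated_per_block(volblocksize: int, sector_size: int) -> int:
    """Bytes allocated for one volume block if allocations are rounded up
    to whole sectors. Simplified model: no parity, compression or metadata."""
    return math.ceil(volblocksize / sector_size) * sector_size


# Figures quoted in the thread: 1.4TB used by a half-filled 250G zvol.
ratio = observed_amplification(1.4 * TB, 0.5 * 250 * GB)
print(f"observed amplification: {ratio:.1f}x")

# ASSUMPTION: 4096-byte sectors; not stated anywhere in the thread.
ASSUMED_SECTOR_SIZE = 4096
for volblocksize in (512, 4096, 16384):
    alloc = allocated_per_block(volblocksize, ASSUMED_SECTOR_SIZE)
    print(f"volblocksize {volblocksize:>5}: "
          f"{alloc / volblocksize:.1f}x from sector rounding alone")
```

Under that assumption a 512-byte volblocksize would account for 8x on its own, which is in the same range as the roughly 11x observed, while 4k or larger blocks would see no rounding penalty in this model.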