From: Konstantin Belousov
Date: Tue, 11 Apr 2017 17:30:03 +0300
To: Flavius Anton
Cc: freebsd-hackers@freebsd.org
Subject: Re: On COW memory mapping in d_mmap_single
Message-ID: <20170411143003.GT1788@kib.kiev.ua>

On Tue, Apr 11, 2017 at 04:55:00PM +0300, Flavius Anton wrote:
> >On Tue, Apr 11, 2017 at 04:00:21PM +0300, Konstantin Belousov wrote:
> >>On Tue, Apr 11, 2017 at 03:37:26PM +0300, Flavius Anton wrote:
> >> Hi everyone,
> >>
> >> I'll start by giving some context, so you can better understand the
> >> problem I'm trying to solve. I've been working for a while on
> >> bhyve trying to implement save/restore [1]. We've currently managed to
> >> get it working for VMs using a ramdisk and no devices, so just vCPU
> >> and memory states are saved and restored so far.
> >>
> >> Last week I started looking into network devices, specifically
> >> virtio-net devices. The problem was that when I issue a checkpoint
> >> operation, the guest virtio driver stops working. After digging for a
> >> while, I figured out the problem is marking VM memory as COW. If I
> >> don't do this, the driver continues with no problem after
> >> checkpointing.
> >>
> >> Each VM has an associated vmspace and a /dev/vmm/VM_NAME device. When
> >> the user space does a mmap on the /dev device, we would like to mark
> >> VM memory as COW, so that the VM can continue touching pages while the
> >> user space is writing the 'frozen', COW-marked memory to persistent
> >> storage. We do this by iterating through all vm_entries from the VM's
> >> vmspace, finding the entry that maps the object holding VM memory,
> >> and then roughly just setting MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY on
> >> that entry. You can see the code here [2].
> >
> >This is a very strange operation, to put it mildly. First, do other vCPUs
> >operate while you do your 'COW'? If yes, you are guaranteed to get
> >an inconsistent snapshot. If not, then you do not need 'COW'.
>
> Yes, all vCPUs are locked before calling mmap().
> I agree that we don't
> need 'COW', as long as we keep all vCPUs locked while we copy the
> entire VM memory. But this might take a while; imagine a VM with 32GB
> or more of RAM. This will take maybe minutes to write to disk, so we
> don't actually want the VM to be frozen for so long. That's the
> reason we'd like to map the memory COW and then unlock vCPUs.
>
> >Moreover, what kinds of VM objects are mapped into the vmspace? FreeBSD VM
> >does not support shadowing of device objects (which means, inserting
> >shadow objects into the device object chain breaks VM invariants). One
> >of the main reasons why it did not need to be supported is that a shadow
> >copy cannot see changes which are performed on the shadowed pages,
> >supposedly done by the device. If vmm mmaps some devices into guest vmspace,
> >the devices would kind of 'freeze' from the guest PoV.
>
> It's an OBJT_DEFAULT. It's not a device object, it's the memory object
> given to the guest to use as physical memory.

Perhaps add asserts that you only shadow default/swap/vnode objects.
Then you will see if the issue is what I noted above, or not.

> >Next, how do you undo the damage done by your 'COW'?
>
> This is one thing that we've thought about, but we don't have a
> solution for now. I agree it is very important, though. I figured that
> it might be possible to 'unmark' the memory object as COW with some
> additional tricks.

You might consider using vm_object_collapse().
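
To illustrate the assert suggestion above, here is a rough userspace mock, not
kernel code. Everything prefixed with mock_ or MOCK_ is a hypothetical stand-in
invented for this sketch (the enum only mirrors the names of the kernel object
types, and the flag values are illustrative, not the ones from vm_map.h). It
only shows the shape of the check: refuse to mark an entry COW unless the
backing object is of default, swap or vnode type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel object types; not the real objtype_t. */
enum mock_objtype {
    MOCK_OBJT_DEFAULT,
    MOCK_OBJT_SWAP,
    MOCK_OBJT_VNODE,
    MOCK_OBJT_DEVICE,
    MOCK_OBJT_PHYS
};

/* Illustrative flag bits named after MAP_ENTRY_COW / MAP_ENTRY_NEEDS_COPY. */
#define MOCK_ENTRY_COW        0x1u
#define MOCK_ENTRY_NEEDS_COPY 0x2u

/* Hypothetical, heavily reduced model of a VM object. */
struct mock_object {
    enum mock_objtype type;
};

/* Return true only for the object types named in the suggestion. */
static bool
shadowing_supported(const struct mock_object *obj)
{
    switch (obj->type) {
    case MOCK_OBJT_DEFAULT:
    case MOCK_OBJT_SWAP:
    case MOCK_OBJT_VNODE:
        return (true);
    default:
        return (false);
    }
}

/* Set the COW bits on an entry's flags, asserting the object type first. */
static void
mark_cow_checked(const struct mock_object *obj, unsigned *eflags)
{
    assert(shadowing_supported(obj));
    *eflags |= MOCK_ENTRY_COW | MOCK_ENTRY_NEEDS_COPY;
}
```

In the kernel the check would presumably be a KASSERT on the type of the object
behind each map entry, placed right before the entry flags are changed. If it
fires, the problem is a device object being shadowed; if it never fires, the
virtio breakage has some other cause.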