From: Flavius Anton
Date: Tue, 11 Apr 2017 15:37:26 +0300
Subject: On COW memory mapping in d_mmap_single
To: freebsd-hackers@freebsd.org

Hi everyone,

I'll start with some context, so you can better understand the problem I'm trying to solve. I've been working for a while on bhyve, trying to implement save/restore [1]. We currently have it working for VMs that use a ramdisk and no devices, so only vCPU and memory state are saved and restored so far.

Last week I started looking into network devices, specifically virtio-net devices. The problem is that when I issue a checkpoint operation, the guest virtio driver stops working. After digging for a while, I figured out that the cause is marking VM memory as COW. If I don't do this, the driver continues without problems after checkpointing.

Each VM has an associated vmspace and a /dev/vmm/VM_NAME device. When user space does an mmap on the /dev device, we would like to mark VM memory as COW, so that the VM can continue touching pages while user space writes the frozen, COW-marked memory to persistent storage. We do this by iterating through all vm_entries in the VM's vmspace, finding the entry that maps the object holding VM memory, and then, roughly, just setting MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY on that entry.
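To make the semantics I'm after concrete, here is a minimal userspace sketch. It is only an analogy and not the bhyve code: it assumes a Unix system and uses fork() on a private anonymous mapping, which makes the kernel mark the pages copy-on-write, so the child (standing in for the checkpointing process) keeps seeing the old contents while the parent (standing in for the running VM) continues to write. The function name snapshot_while_writing and the pipe-based handshake are made up for illustration.

```python
import mmap
import os

PAGE = mmap.PAGESIZE


def snapshot_while_writing():
    """Fork a reader that sees a frozen copy while the parent keeps writing."""
    # Stand-in for guest memory: a private anonymous mapping owned by the parent.
    mem = mmap.mmap(-1, PAGE, flags=mmap.MAP_PRIVATE)
    mem[:5] = b"old-A"

    go_r, go_w = os.pipe()    # parent -> child: "page has been dirtied, read now"
    out_r, out_w = os.pipe()  # child -> parent: bytes the snapshot contains

    pid = os.fork()           # after this, the pages are shared copy-on-write
    if pid == 0:
        # Child: plays the checkpointer that wants a consistent snapshot.
        try:
            os.close(go_w)
            os.close(out_r)
            os.read(go_r, 1)              # wait until the parent has written
            os.write(out_w, mem[:5])      # report what the snapshot holds
        finally:
            os._exit(0)

    # Parent: plays the running VM and keeps modifying its memory.
    os.close(go_r)
    os.close(out_w)
    mem[:5] = b"new-A"        # this write faults and copies the page
    os.write(go_w, b"x")
    frozen = os.read(out_r, 5)
    os.waitpid(pid, 0)
    live = bytes(mem[:5])
    mem.close()
    return frozen, live


if __name__ == "__main__":
    print(snapshot_while_writing())
```

The difference in my case is that the snapshot consumer does not come from a fork(); it maps the VM's memory object through d_mmap_single, so the COW marking has to be set up by hand on the existing map entry.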
The actual code is here [2]. I'm not sure whether the above is sufficient for our purpose.

In other words, how would you do this? You have a vm_object that is referenced via a vm_entry by process A (the user space). Somebody else, say process B, does an mmap() on your device, and you'd like to freeze that object so that process B sees a consistent snapshot of it, while process A can continue reading from and writing to it.

I've also read through Design Elements of the FreeBSD VM System [3], but I am sure I still have some misunderstandings.

Thank you very much for bearing with me and going through this wall of text.

--
Flavius

[1] https://github.com/flaviusanton/freebsd/tree/bhyve-save-restore
[2] https://github.com/flaviusanton/freebsd/blob/bhyve-save-restore/sys/amd64/vmm/vmm_dev.c#L862
[3] https://www.freebsd.org/doc/en/articles/vm-design/index.html