Date: Tue, 25 Jan 2005 21:45:43 +0000
From: Steven Smith
To: Matthew Dillon
Cc: ctodd@chrismiller.com, Christian Laursen, João Carlos Mendes Luís,
    hackers@freebsd.org, sos22@srcf.ucam.org, Dominic Marks
Subject: Re: Resuming from a crashdump
Message-ID: <20050125214543.GA1113@archibold>
In-Reply-To: <200501251948.j0PJmpYG048845@apollo.backplane.com>
References: <86pszu639o.fsf@borg.borderworlds.dk>
    <86brbe6052.fsf@borg.borderworlds.dk>
    <200501242240.j0OMeIXP043763@apollo.backplane.com>
    <41F59242.7090900@jonny.eng.br>
    <200501251948.j0PJmpYG048845@apollo.backplane.com>
List-Id: Technical Discussions relating to FreeBSD

> You basically would either have to make all device drivers support a new
> hibernation/restore API (because it is not really possible to restore
> a device driver based on a dump),

How much overlap is there likely to be between this and the sorts of
things you need in order to resume from power management modes?

> Also, if the machine has a lot of memory it could take longer to save
> and restore than to reboot from scratch.  A typical laptop HD is
> ~30 MB/sec.  If your laptop has 512MB then it would take 16 seconds
> to go into hibernation mode, and 16 seconds to come out of, plus BIOS
> and loader overhead.

*shrug* If the image you're saving is just sitting at a login prompt,
it probably doesn't buy you much, but once you've got a couple of dozen
xterms open it could easily take more than 30 seconds to restore all of
the state by hand.

Also, have you ever looked at the live migration stuff Xen uses?  The
aim here is to move a running operating system from one machine to
another with minimal downtime.  Essentially, you just start copying
pages across willy-nilly, keeping track of pages which get dirtied.
After every page has been copied, you go back over the list of dirty
pages and migrate just those, and so on, until you stop making any
progress.  At that point, you stop the guest operating system, copy
everything that's left in one big go, and start it going on the new
machine.

If you just send pages to disk rather than to another machine on the
network, then you should be able to suspend-to-disk an entire operating
system with minimal user-perceived downtime.  One possibility here would
be to live suspend the machine every five minutes or so, and guarantee
that the user never loses more than five minutes of work.

> I think it would probably be more realistic to pursue a process
> save/restore rather than a kernel save/restore.  The overhead is going
> to be the disk I/O anyway and that seems to be about the same either
> way (maybe less for a process restore), plus you can at least
> demand-load the process restore.

The problem with a process checkpoint is that it's then rather
difficult to get all of the inter-process stuff right.  If you
checkpoint an entire OS, that comes for free.

Steven Smith.

-- 
'Double-entry bookkeeping ....simple to adapt to modern computer
methods by using positive or negative electric charges to signal
whether an account should be debited or credited.'
        -- Accounting Theory and Practice, Glautier M.W.E
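
For anyone who wants to see the pre-copy loop described in the message
spelled out, here is a rough sketch as a toy simulation.  It is not
Xen's code.  The function names, the guest model (a fixed set of "hot"
pages and one write per ten pages copied) and the 4 KB page size are
all assumptions made up for illustration; the only figure taken from
the thread is the 512MB of memory.

```python
def make_guest(hot_pages=64, copied_per_write=10):
    """Hypothetical guest model: while n pages are being copied, the
    running guest performs n // copied_per_write + 1 writes, all of them
    landing in a small working set of `hot_pages` pages."""
    def dirtied_while_copying(n):
        writes = n // copied_per_write + 1
        return set(range(min(hot_pages, writes)))
    return dirtied_while_copying


def precopy(num_pages, dirtied_while_copying, max_rounds=30):
    """Simulate pre-copy of `num_pages` pages.

    Returns (pages copied in each live round, pages copied while paused).
    """
    to_copy = set(range(num_pages))      # the first round copies every page
    rounds = []
    while to_copy and len(rounds) < max_rounds:
        rounds.append(len(to_copy))
        dirty = dirtied_while_copying(len(to_copy))
        progress = len(dirty) < len(to_copy)
        to_copy = dirty                  # next round re-copies only dirty pages
        if not progress:
            break                        # dirty set stopped shrinking
    # The guest would be paused here and the remainder copied in one go.
    return rounds, len(to_copy)


# 512MB of 4 KB pages (page size is an assumption) = 131072 pages.
rounds, paused = precopy(512 * 1024 // 4, make_guest())
print(rounds, paused)   # [131072, 64, 7, 1] 1
```

In this toy model only a single page is left to copy while the guest is
stopped, against 131072 pages for a plain stop-and-save.  A real guest
dirties memory far less predictably, so the number of rounds and the
size of the final copy depend entirely on the workload.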