Date: Mon, 24 Jan 2005 14:40:18 -0800 (PST)
From: Matthew Dillon
Message-Id: <200501242240.j0OMeIXP043763@apollo.backplane.com>
To: ctodd@chrismiller.com
References: <86pszu639o.fsf@borg.borderworlds.dk> <86brbe6052.fsf@borg.borderworlds.dk>
cc: hackers@freebsd.org
cc: Christian Laursen
cc: Dominic Marks
Subject: Re: Resuming from a crashdump

:Well booting the kernel generally takes little time, but if all the
:processes could be restored this would be a step in the right direction.
:As John said, restoring the state of some programs will have to rely on
:the program, but perhaps this could lead to an API of some sort that would
:make this less painful on the program author. Eventually most programs
:would support this over time.
:
:So what would it take to get the system to boot the kernel, then rebuild
:the processes from VM?
:
:Chris

I think this is doable but not universal.
A kernel core dump can't be used for that sort of thing (it overwrites
swap, and swap might contain portions of the user processes). The kernel
would have to write out a special save-to-disk file containing the VM
image, file handles, signal state, register set, and so forth for each
process in the system. Basically it would need to take DragonFly's
checkpointing code (as a basis), extend it suitably, and use it to dump
each process.

Additional state would also have to be saved: bound UDP sockets and
sockets in a LISTEN state would have to be saved and restored. This is
doable.

But the following is far more difficult:

    * tty associations - restore
    * tty state - restore
    * job control and process group state - restore
    * open pipes - restore
    * open fifos - restore
    * open socketpairs - restore
    * established connections - throw away

And this is very difficult:

    * X windows state, established connections to X from applications,
      and so forth.

However, it might be possible to quiesce X out of its video mode, remap
the framebuffer, and then switch back into it. But it would still be a
nasty problem.

Also, it could take just as long to restore the mess as it would just to
reboot normally and restart your applications. After all, the system is
likely to be disk-bound either way.

-Matt
Matthew Dillon
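
For illustration only, the classification above can be written down as a
small lookup table. This is a rough sketch in Python, not DragonFly's
checkpointing code and not a kernel interface: the resource names, POLICY,
and plan_restore() are all hypothetical, and only the grouping and the
restore/throw-away decisions come from the message. X window state is left
out because the message does not settle how it would be handled.

```python
from enum import Enum


class Difficulty(Enum):
    DOABLE = "doable"
    HARD = "far more difficult"


class Action(Enum):
    RESTORE = "restore"
    DISCARD = "throw away"


# Hypothetical resource names; grouping and actions follow the message.
POLICY = {
    "vm_image": (Difficulty.DOABLE, Action.RESTORE),
    "file_handles": (Difficulty.DOABLE, Action.RESTORE),
    "signal_state": (Difficulty.DOABLE, Action.RESTORE),
    "register_set": (Difficulty.DOABLE, Action.RESTORE),
    "bound_udp_socket": (Difficulty.DOABLE, Action.RESTORE),
    "listening_socket": (Difficulty.DOABLE, Action.RESTORE),
    "tty_association": (Difficulty.HARD, Action.RESTORE),
    "tty_state": (Difficulty.HARD, Action.RESTORE),
    "job_control_state": (Difficulty.HARD, Action.RESTORE),
    "open_pipe": (Difficulty.HARD, Action.RESTORE),
    "open_fifo": (Difficulty.HARD, Action.RESTORE),
    "open_socketpair": (Difficulty.HARD, Action.RESTORE),
    "established_connection": (Difficulty.HARD, Action.DISCARD),
}


def plan_restore(resources):
    """Split one process's resources into (to_restore, to_discard).

    Each returned entry is a (name, Difficulty) pair, in input order.
    """
    to_restore, to_discard = [], []
    for name in resources:
        if name not in POLICY:
            raise ValueError(f"no restore policy for resource: {name}")
        difficulty, action = POLICY[name]
        target = to_restore if action is Action.RESTORE else to_discard
        target.append((name, difficulty))
    return to_restore, to_discard


if __name__ == "__main__":
    # A made-up process holding a pipe and an established TCP connection.
    keep, drop = plan_restore(
        ["vm_image", "register_set", "open_pipe", "established_connection"]
    )
    for name, difficulty in keep:
        print(f"restore     {name} ({difficulty.value})")
    for name, difficulty in drop:
        print(f"throw away  {name} ({difficulty.value})")
```

A table like this only records the decision per resource type; the actual
work of saving and rebuilding each item is where the difficulty described
above lies.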