Date: Tue, 24 Oct 1995 22:00:38 -0700
From: "Russell L. Carter" <rcarter@geli.com>
To: Julian Elischer <julian@ref.tfs.com>
Cc: terry@lambert.org (Terry Lambert), bugs@ns1.win.net, hackers@FreeBSD.ORG
Subject: Re: process migration
Message-ID: <199510250500.WAA04762@geli.clusternet>
In-Reply-To: Your message of "Tue, 24 Oct 1995 21:34:47 PDT." <199510250434.VAA16988@ref.tfs.com>

> > There's also other problems:
> >
> > 1) File as swap store.  The executable file is acting as its own
> >    swap store; this means you must reopen the file (which means
> >    you need its name) and reestablish the flags on the vnode to
> >    prevent writes to it.
> write the entire process space including non resident pages..
> (implies that shared programs become static)
> >
> > 2) Memory overcommit.  There very well may not be enough swap
> >    to checkpoint the program.
> put it out to a file....... If overcommitted ignore it. Too bad.
> >
> > 3) Shared libraries.  The shared library mappings must be
> >    restored, probably separately.
> static.. quite possibly this might be used in a specialist environment
> (such as what Russell is working on) where shared libs might not be
> required in any case

Righto.  Cray Research machines have been checkpointing fine for 10 years.
Of course, they only swap and don't page (or didn't use to, I haven't
played with the SPARC stuff).  Everything is statically linked.  Primitive
model, works fine with the bulk of *their* workload.  Would work fine with
my model too, as long as it just applied to user apps.

A great deal of effort is expended to protect users from themselves,
but if they need checkpointing, the users often are very savvy about
getting themselves on the boat.  That includes app developers too.

Note: this is for jobs that run a minimum of several days, sometimes weeks.

Regards,
Russell