From owner-freebsd-hackers Wed Mar 4 12:58:33 1998
In-Reply-To: <199803041843.TAA01389@yedi.iaf.nl>
Date: Wed, 04 Mar 1998 12:58:32 -0800 (PST)
Reply-To: shimon@simon-shapiro.org
Organization: The Simon Shapiro Foundation
From: Simon Shapiro
To: Wilko Bulte
Subject: Re: SCSI Bus redundancy...
Cc: julian@whistle.com, hackers@FreeBSD.ORG
Sender: owner-freebsd-hackers@FreeBSD.ORG

On 04-Mar-98 Wilko Bulte wrote:

...

> Anxiously awaiting. I just missed an opportunity today to obtain a
> Mylex DAC960 3 channel RAIDcard. Bah.

Last I touched these, they were where DPT was 5 years prior, only buggier.
I was at Intel at the time, working on a ``big'' benchmark, and could get
zilch support. I fared a lot better calling the DPT hotline anonymously
and saying ``I have this 1991 vintage card a friend gave me, and it
does...''

Part of a product is its producer and support. Maybe Mylex is much better
at it today.

...

> A couple of years ago while working at Philips Info Systems we had a
> SysV2 derivative that could do powerfail/restart (as we called it).
> It used some battery backed up RAM, and it was not a PC (M68K cpu).
> Having never worked on that kernel I don't know how they did it.
> But it worked pretty well.

The details fail me, and we may be talking about two different things:

A device driver monitors the power-fail line (typically, on VME, it is an
NMI). The driver's interrupt service routine pushes the stack into memory,
sets a bit, and halts. When you boot, you FIRST look at that bit. If it is
ON, you do NOT run the memory test :-), you simply pop the stack and CONT
(or whatever).

That driver leaked into the SVR4 source tree. I used it in another project
on a 486 port, but we did not use a BIOS. Yup, we built a PC that could
not boot DOS, only Unix.

>> Memory SNAP: If you write it into a DPT controller, and the controller
>> has enough cache to hold it, it is pretty fast. I can sustain about 2us
>> per transaction overhead and about 120MB/Sec. This gives us about a
>> second or two. The new DPT's can retain the cache until power returns.
>> Even a small UPS (with power alarms) will last long enough.
>
> But how do you checkpoint things? So, where did the processor leave
> off?

The DPT gets transactions from the host. It processes them in an
autonomous manner. If the entire transaction is OK, an ACK is sent to the
host. If not, not. If Power-Fail is detected, the DPT simply halts until
it sees a reset from the host. Once the reset arrives, it checks the
disks. If they are all there, it can choose to flush the caches.

On the host, once you detect a power-fail, you write all that you want to
the DPT. The DPT takes the WRITE requests and ACKs (it acts as a
write-back cache, its normal modus operandi).

The only fly in this cup is: What if there is more main memory than cache
on the DPT (which is normally the case)?

What we do here is a callback to an emergency shutdown routine that calls
sync() in the kernel, and then calls boot(). It assumes the UPS can
sustain the system this long, but that is very doable. 1GB worth of
buffers will take (at 6 MB/sec - slow RAID-5) just under three minutes to
flush. Most systems are much faster than that.

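[Editor's sketch, not code from any driver or from the DPT firmware: the
flush-time estimate can be checked with a few lines of Python. The function
names and the UPS holdup values (180 and 120 seconds) are made up for
illustration; only the 1GB and 6 MB/sec figures come from the message.]

```python
def flush_seconds(dirty_mb: float, write_mb_per_sec: float) -> float:
    """Seconds needed to write dirty_mb of buffers at a sustained rate."""
    if write_mb_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return dirty_mb / write_mb_per_sec


def can_flush_before_power_loss(dirty_mb: float,
                                write_mb_per_sec: float,
                                ups_holdup_sec: float) -> bool:
    """Hypothetical check: does the UPS outlast the one-shot flush?"""
    return flush_seconds(dirty_mb, write_mb_per_sec) <= ups_holdup_sec


# Figures from the message: 1 GB of buffers on a slow (6 MB/sec) RAID-5.
slow_raid5 = flush_seconds(1024, 6)
print(f"flush takes {slow_raid5:.0f} s ({slow_raid5 / 60:.1f} min)")

# Assumed UPS holdup times, for illustration only.
print(can_flush_before_power_loss(1024, 6, ups_holdup_sec=180))
print(can_flush_before_power_loss(1024, 6, ups_holdup_sec=120))
```

[With the message's figures the flush comes to roughly 171 seconds, so a UPS
that holds for three minutes would cover it and one that holds for two
would not. A faster array shortens the flush proportionally.]
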
So, the answer is: There is exactly one checkpoint, and it is a one-shot.
Once we detect power failure, we assume we have reserve power to flush
everything and shut down. This does not protect you from disk bay power
failures, but these are almost always on N+1 power systems and hooked up
to separate UPSs.

Having the kernel actually checkpoint itself, with any better resolution
or intelligence, would require changing too many things. I am trying to
implement the system monitoring drivers in a general-purpose,
hardware-independent manner. How successful that will be I do not know
yet.

----------

Sincerely Yours,

Simon Shapiro
Shimon@Simon-Shapiro.ORG                      Voice: 503.799.2313