Date: Wed, 4 Mar 1998 22:54:04 +0100 (MET)
From: Wilko Bulte <wilko@yedi.iaf.nl>
To: shimon@simon-shapiro.org
Cc: julian@whistle.com, hackers@FreeBSD.ORG
Subject: Re: SCSI Bus redundancy...
Message-ID: <199803042154.WAA04141@yedi.iaf.nl>
In-Reply-To: <XFMail.980304125832.shimon@simon-shapiro.org> from Simon Shapiro at "Mar 4, 98 12:58:32 pm"

As Simon Shapiro wrote...
>
> On 04-Mar-98 Wilko Bulte wrote:
>
> ...
>
> > Anxiously awaiting. I just missed an opportunity today to obtain a
> > Mylex DAC960 3 channel RAIDcard. Bah.
>
> Last I touched these, they were where DPT was 5 years prior, only buggier.
> I was at Intel at the time, working on a ``big'' benchmark and could get
> zilch support. I fared a lot better calling anonymously into the DPT
> hotline, saying ``I have this 1991 vintage card a friend gave me, and it
> does...''
>
> Part of a product is its producer and support. Maybe Mylex is much better
> at it today.

I never talked to Mylex directly. And I'd only get one if I get it (nearly)
free, like a $10-15 pricepoint. No real use for it here, but maybe fun to
play with. Like the FDDI network here ;-)

> >> enough cache to hold it, it is pretty fast. I can sustain about 2us per
> >> transaction overhead and about 120MB/Sec. This gives us about a second
> >> or two. The new DPTs can retain the cache until power returns.
> >> Even a small UPS (with power alarms) will last long enough.
> >
> > But how do you checkpoint things? So, where did the processor leave
> > off?
>
> The DPT gets transactions from the host. It processes them in an
> autonomous manner. If the entire transaction is OK, an ACK is sent to the
> host. If not, not. If Power-Fail is detected, the DPT simply halts until
> it sees a reset from the host. Once the reset arrives, it checks the
> disks. If they are all there, it can choose to flush the caches.
>
> On the host, once you detect a power-fail, you write all that you want to
> the DPT. The DPT takes the WRITE requests and ACKs (it acts as a
> write-back cache, normal modus operandi). The only fly in this cup is:
> what if there is more main memory than cache on the DPT (which is normally
> the case)? What we do here is a callback to an emergency shutdown routine
> that calls sync() in the kernel, and then calls boot(). It assumes the UPS
> can sustain the system this long, but that is very doable. 1GB worth of
> buffers will take (at 6 MB/sec - slow RAID-5) just over two minutes to
> flush. Most systems are much faster than that.

Agreed. Sounds ok.

> So, the answer is: there is exactly one checkpoint, and it is a one-shot.
> Once we detect power failure, we assume we have reserve power to flush
> everything and shut down.
>
> This does not protect you from disk bay power failures, but these are
> almost always on N+1 power systems and hooked up to separate UPSs.

An extra power supply is money well spent. We ship all our standalone
arrays with at least N+1, optionally 2N power. 2N gives you two separate
power entry points to the power grid. Now we only need to educate people
to use two different power branches (phases? what's the right English
term?)

> To have the kernel actually checkpoint itself, with any better resolution
> or intelligence, we will have to change too many things. I am trying to make

OK, that was my original question. Had a bad feeling about exactly what you
mention here.

 _     ______________________________________________________________________
 |   / o / /  _  Bulte   email: wilko @ yedi.iaf.nl http://www.tcja.nl/~wilko
 |/|/ / / /( (_) Arnhem, The Netherlands - Do, or do not. There is no 'try'
--------------- Support your local daemons: run [Free,Net,Open]BSD Unix --

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
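
For anyone wanting to redo the flush-time estimate quoted above with their own numbers, here is a minimal Python sketch. It only divides the amount of dirty buffer data by the sustained write rate and compares the result with the UPS runtime. The function names, the decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) and the 300-second UPS runtime are assumptions made for illustration; none of this comes from the DPT driver or the FreeBSD kernel.

```python
MB = 10**6  # assumed decimal megabyte
GB = 10**9  # assumed decimal gigabyte


def flush_seconds(dirty_bytes, rate_bytes_per_sec):
    """Seconds needed to write out dirty_bytes at a sustained write rate."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return dirty_bytes / rate_bytes_per_sec


def ups_can_cover(dirty_bytes, rate_bytes_per_sec, ups_runtime_sec,
                  shutdown_overhead_sec=0.0):
    """True if the flush plus any extra shutdown time fits in the UPS runtime."""
    needed = flush_seconds(dirty_bytes, rate_bytes_per_sec) + shutdown_overhead_sec
    return needed <= ups_runtime_sec


if __name__ == "__main__":
    # Figures from the thread: 1 GB of buffers, 6 MB/sec (slow RAID-5).
    t = flush_seconds(1 * GB, 6 * MB)
    print(f"{t:.0f} s (~{t / 60:.1f} min)")   # prints "167 s (~2.8 min)"
    # Hypothetical UPS that holds the system up for 300 seconds.
    print(ups_can_cover(1 * GB, 6 * MB, 300))  # prints "True"
```

With the figures from the thread this comes out at about 167 seconds, so nearer three minutes than two, which a small UPS should still be able to cover.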