Date: Wed, 04 Mar 1998 11:33:57 -0800 (PST)
From: Simon Shapiro <shimon@simon-shapiro.org>
To: sbabkin@dcn.att.com
Cc: wilko@yedi.iaf.nl, tlambert@primenet.com, jdn@acp.qiv.com, blkirk@float.eli.net, hackers@FreeBSD.ORG, grog@lemis.com, karl@mcs.net
Subject: RE: SCSI Bus redundancy...
Message-ID: <XFMail.980304113357.shimon@simon-shapiro.org>
In-Reply-To: <C50B6FBA632FD111AF0F0000C0AD71EE4132D3@dcn71.dcn.att.com>
On 04-Mar-98 sbabkin@dcn.att.com wrote:

...

>> I wrote a white paper at Oracle some years ago, claiming that
>> databases over a certain size simply cannot be backed up. I became
>> very UN-popular very quickly. In your moderate setup, you already
>> see the proof of correctness.
>>
> IMHO they CAN be backed up, as long as you have enough spare
> equipment. At my previous work, in a bank where we were paranoid
> about backup and downtime, I think I found a scalable way of doing
> so. We used it on a relatively small database (~15G) but I can't
> see why it cannot be scaled. First, forget about exports. Copy the
> database files and archived logs. In addition to the production
> instance, have two more instances. One gets the archived logs
> copied and rolled forward immediately. The other gets the archived
> logs copied immediately, but rolled forward only after they have
> aged. Copy this third instance to tape from time to time. Copy
> archived logs to tape as fast as they are produced.

Yes, this scheme works, but you are not backing up the database, nor
is it scalable. Operating on a database (from a backup point of view)
makes arbitrary changes to the files. If you back them up, you will
have an inconsistent view of the data.

Problem number 2: If your system's storage I/O is utilized at higher
than 50%, you cannot dump the files at all.

> If the production instance crashes, use the second one. If someone
> removed a table, and that happened more recently than the age of
> the third instance, start this instance and get the table from it.
> If the removal was noticed too late, there will be a big PITA
> restoring from tapes.

What you describe here is application-level mirroring. It works after
a fashion, but if the two databases go out of sync, you have no way
of proving which side is correct. Also, it is not a deterministic
system; you cannot really commit the master until the slave has
committed. This gets nasty in a hurry. One database with one mirror
may work. Twenty of them?

> Do an offline (better, but with downtime) or an online backup when
> you reset the logs. This can be done fast if the I/O subsystem has
> enough throughput to copy all the disks of the database to backup
> disks in parallel, and if the disks can be remapped between
> machines easily. For 4G disks this will take no more than an hour.

There are databases which cannot go offline. Banks have the unique
position where they hold the customer's money behind a locked door
:-) An ISP's RADIUS database cannot shut down. A telephone company
authentication server cannot shut down. A web server should not shut
down. A mail server can shut down. A DNS server cannot shut down.
You may disagree with some of these classifications, but some of
these systems cannot be shut down, and actually cannot get out of
sync either.

...

> Nope. Databases must have dedicated filesystems. And as long as
> there are no files created or removed in these filesystems, and no
> blocks added to or removed from any files in them (in other words,
> no change of metadata, which is normal for databases), there is no
> chance that you will lose your database. I know that not everyone
> follows this rule (it looks like everyone in AT&T does not) but
> that is their personal problem and not a problem of Unix.

I was hoping you would say that :-) You are talking theory; I am
talking practice. I have demonstrated, many times, cases where you
boot a system, mount everything, crash it, and upon reboot the
filesystem is severely corrupt. Besides, a living database will
change things on disk.

There are no Unix semantics for pre-allocating blocks to a file.
Some of you may remember the old Oracle ccf utility. It did exactly
that.
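For those who never saw ccf, the idea is brute force: create the
file and write every block of it up front, so the blocks are
assigned before the database ever runs. A minimal sketch of the
technique (my reconstruction, not the actual ccf source; the
function name and the 8K block size are made up for the example):

  #include <sys/types.h>
  #include <fcntl.h>
  #include <string.h>
  #include <unistd.h>

  #define BLKSZ 8192              /* assumed block size, for illustration */

  /*
   * Pre-allocate nblocks * BLKSZ bytes by actually writing zeroed
   * blocks -- the only portable way to reserve blocks in Unix.
   */
  int
  preallocate(const char *path, off_t nblocks)
  {
          char buf[BLKSZ];
          off_t i;
          int fd;

          memset(buf, 0, sizeof(buf));
          if ((fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600)) < 0)
                  return (-1);
          for (i = 0; i < nblocks; i++) {
                  if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
                          (void)close(fd);
                          return (-1);
                  }
          }
          if (fsync(fd) < 0) {    /* push the new metadata to disk now */
                  (void)close(fd);
                  return (-1);
          }
          return (close(fd));
  }

Crude, but once the file exists at full size, the database never
changes the filesystem metadata again, which is the only reason the
dedicated-filesystem rule above holds at all.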
Without it, you may add a block to file A, which shares a superblock
sector with file B, have the system crash three days later, and then
fsck will decide that file A belongs in lost+found, or, less
commonly, rearrange it a bit. If you never saw this, you simply did
not look long enough.

I totally agree that most of this is a filesystem problem, not a
Unix problem. I am working on such a filesystem right now. The
problem I have is that the Unix semantics for creat(2), open(2),
etc. are wrong for such a filesystem. We can only estimate the
degree of noise generated, and guess at the outcome, if I suggested
that these system calls need a new definition, or that new ones are
needed. I'll save that for another day :-)

Please do not misunderstand me; I like Unix, I love FreeBSD, but
perfect for all occasions neither one is.

Simon

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message