Date: Sun, 29 Dec 2002 23:09:13 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Brad Knowles <brad.knowles@skynet.be>
Cc: Patrick Cable II <freebsd@slaudiovis.org>, chat@freebsd.org
Subject: Re: Backup Solutions
Message-ID: <3E0FF119.7792A270@mindspring.com>
References: <3E0DC536.8010001@slaudiovis.org> <3E0EBC49.86AD7E28@mindspring.com> <a05200f09ba3573361365@[10.0.1.5]>
Brad Knowles wrote:
> At 1:11 AM -0800 2002/12/29, Terry Lambert wrote:
>
> >  I expect that the correct thing to do is to have a replica and
> >  a non-volatile backup mechanism, in combination.
>
> Sounds good.  But what are good tools to achieve these goals?
>
> Myself, I would be interested in extending this question to also
> cover PC/Windows & Macintosh (MacOS X) clients, in addition to the
> FreeBSD server.  So, in addition to backing up the server itself, you
> also need software to back the clients up to the server, which can
> then be rolled into the server "data" to be backed up.

The problem with Windows and Macintosh is that the software doesn't
provide transaction triggers.

Most places where you would want to do this sort of thing are replica
servers for databases for small businesses; large ones already have
Veritas with snapshots or Oracle or Sybase or some other "real"
database.

For Windows and Macintosh, you are most likely to be using some
Microsoft based solution, like Access, or Access with a Microsoft SQL
back end.  Most often, this is MSDE, which is a cut-down SQL Server
that comes free with the developer's software, and has a distribution
license, which saves you from the license fees and the per-user
license requirements of MS SQL Server.

In the common case, a well-written application will close the
database between transactions, so that implied atomicity guarantees
happen.  But most of these applications are written by non-CS people
who are writing code to get something working, and never think about
the consequences of big customers, etc.

As an example, one of my dad's businesses has a number of various
purpose-built applications that have grown up in order to implement
his business rules and internal practices.  One of them is a
client/server application that uses MS SQL Server, another is mostly
Access based (but thinking of moving the data store over to MS SQL),
and another is the MSDE "free" SQL server (it's a document imaging
system).

None of them can really work properly for a backup that doesn't go
out of its way to deal with open files, so there is a one day latency
in data recovery, except for tiny windows that would make the
database(s) unavailable for however long the backup or replica
creation took.

The MS SQL one could be handled, though... it's possible to make a
replicating proxy.  But then corrupt data would still be replicated
all over, as would improper deletions, since there is not a deleted
record marker and a pending index, to delay actual deletion until a
purge operation takes place.

Access can be backed by MS SQL, and you can replace MSDE with MS SQL,
meaning you can proxy all of them, but then you are paying on the
order of $10,000, which is a little steep for a 25 person office.

In either case, the offline backup is still needed for the reasons
stated previously.

That business uses tape to do a rotating offsite backup, with an
incremental and full archival dump schedule.  These require that
people exit the applications, so it can be some work walking around
the office after hours, for the person doing it, if anyone has left
their machine on and accessing a record in any of them.

> Is there an Amanda PC/Windows client?  Or an Amanda MacOS X
> client?

There is one in beta right now.  It's available from:

	http://sourceforge.net/projects/amanda-win32/

It doesn't seem to have been updated since last June.  8-(.

You probably are not going to find it useful, due to the "open file
backup" problem.

The normal way to handle "open file backup" on Windows is to install
an IFSMgr hook to hook calls into the IFS manager.  You can then use
the existing open instance for the file in question in order to back
it up.

This is usually not very satisfying for database files, since the
transactional representation of atomicity and/or idempotence to the
application are done at the database application level, and can't be
guaranteed at the IFSMgr level (there is no nesting information
pushed by the application to the IFSMgr, so you can't wait for a 1->0
nesting level for transactions in progress to complete, before doing
your backup).

The result is "corrupted" database files (they aren't really corrupt,
they're more like there was a crash in the middle of however many
updates were in progress at the time).  Most database software will
not do a check on the data for you automatically, and you may not be
able to trigger crash-recovery behaviour following a restore, without
special knowledge.

The most common method is to export the FS as a share, and then use
Amanda with SAMBA (client) in order to back up the data; this has the
same problems.  See the online book at:

	http://www.backupcentral.com/amanda.html

specifically:

	http://www.backupcentral.com/amanda-13.html

> What about the handling of tape swapping, archiving, and
> other things normally done with stackers and libraries?

You use stackers and libraries.  8-).

> >  I also suggest that you avoid the "active file can't be backed up"
> >  problem, by choosing the correct software (and no, "snapshots" are
> >  not good enough, because they don't trap the right state for the
> >  implied metadata, among other deficiencies).
>
> Good point.  What are good tools to avoid this problem, at least
> with regards to FreeBSD?

There aren't any, per se.  You have to have people write the code the
right way, instead.

For Access, the most common method of making code "backup safe" is to
close it between operations, to mark the ends of transactions.  The
database software has the same problem opening the file as you would,
if the database software has the file open (barring the IFSMgr shim
approach).

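The "close it between operations" pattern can be sketched roughly as
follows.  This is only an illustration in Python, not Access code;
the file name orders.dat and the helpers run_transaction and demo are
hypothetical names made up for the example.  The assumption is that
each transaction opens the data file, writes, syncs, and closes
before returning, so a backup that runs between calls sees only whole
transactions in that one file.

```python
import os
import tempfile


def run_transaction(path, records):
    """Append one transaction's records, sync them, and close the file."""
    with open(path, "a", encoding="utf-8") as f:
        for rec in records:
            f.write(rec + "\n")
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before releasing the file
    # The file is closed here; nothing is held open between transactions.


def demo():
    # Hypothetical data file in a scratch directory.
    path = os.path.join(tempfile.mkdtemp(), "orders.dat")
    run_transaction(path, ["order 1", "order 1 line A"])
    # A backup tool could safely read `path` at this point.
    run_transaction(path, ["order 2"])
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()
```

The close marks the end of a transaction for a single file only; it
says nothing about consistency across several files.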
Even so, this only works to protect the integrity of data or of
metadata, but not of implied metadata.  For example, if you have a
separate index and record file, and there is no guarantee on the
order of operation on the two of them, even if the application were
to sync the data out to guarantee idempotence against reuse of the
data by the application, that doesn't make the backup operation into
an atomic snapshot.  Even if the backup software opened the next file
before closing the last, you can't guarantee that the snapshot that
you will get is accurate.

Most professional database software comes with software that does its
accesses the same as the database software, and sometimes even does
its dumps by connecting to the database, locking records, and dumping
them out (MySQL does it that way, but a record that contains a
non-indexed field which is an index for another database, can still
have inconsistent state: that's an implied metadata relationship).

If you have professional database software, then you will back up the
database contents with the vendor supplied software, and not treat
the database files as if they were files.

In most cases, you are talking about spending money for commercial
software, since the professional database backup software is often a
separate product add-on.  That's the case for all MS products using
FreeBSD systems as a file server to store database files on remotely
accessed volumes.

If you're talking about FreeBSD databases on FreeBSD, then most of
them have ways to dump the database contents atomically, for backup
purposes.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message
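As a rough illustration of backing up "by connecting to the database"
rather than copying its files, here is a minimal Python sketch.  It
uses SQLite from the Python standard library purely as a stand-in; it
is not one of the products discussed in the message, and the table
names and the helpers dump_via_connection and demo are invented for
the example.  The dump is read through the database engine, so the
backup never touches the on-disk files directly.

```python
import sqlite3


def dump_via_connection(conn):
    """Return the database contents as SQL text, read through the engine."""
    return "\n".join(conn.iterdump())


def demo():
    # Hypothetical source database with two related tables.
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    src.execute(
        "CREATE TABLE invoice "
        "(id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    with src:  # commit both inserts as one transaction
        src.execute("INSERT INTO customer VALUES (1, 'Acme')")
        src.execute("INSERT INTO invoice VALUES (10, 1, 99.5)")

    dump = dump_via_connection(src)

    # Restore the dump into a fresh database and check the relationship.
    restored = sqlite3.connect(":memory:")
    restored.executescript(dump)
    return restored.execute(
        "SELECT c.name, i.total FROM invoice i "
        "JOIN customer c ON c.id = i.customer_id"
    ).fetchall()
```

A dump like this covers one database at a time.  A field that refers
to a record in some other database, the "implied metadata" case
described in the message, is not protected by it.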